Description
Filecoin -- an exabyte-scale IPFS system -- presented by @jbenet at IPFS Thing 2022 -- IPFS Implementations -- https://2022.ipfs-thing.io
I'm going to talk about Filecoin, the largest kind of IPFS deployment. Right after me you'll hear about Lotus, one of the implementations of Filecoin, from Aayush. I'm going to cover three parts: a quick intro, with use cases and scale; the Filecoin architecture, the broader system architecture, so you get a sense of how all the different components operate; and then some sets of problems around retrieval, interop, indexers, and computation.
I've given a lot of talks about this, but in summary: Filecoin is a crypto-powered storage network. It's a blockchain-coordinated storage market, so think of using a blockchain to advertise storage providers, advertise retrieval providers, advertise clients, and so on, and to be able to coordinate operation.
The network verifies storage using zero-knowledge proofs. It's the largest SNARK system to date, I think; I don't think anything has surpassed it yet. It uses proof of replication and proof of spacetime, some deep cryptographic constructions (not quite primitives; they're built on other primitives) that we ended up having to create and then use.
These are the stats of the network: there are about 17 exabytes of storage capacity and about 4,000 storage providers, which is about 400 organizations, and lots of projects and so on. To get a sense of the capacity: most of the facilities are large-scale, cloud-style data centers.
So you can think of a lot of data centers in cities all around the world with very large racks of capacity, and that capacity right now is actual arranged bits that are being proven at all times, so that we know, every 24 hours, that all of that capacity, and all the data that's stored (I'll show the data in a moment), has been proven.
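As a rough illustration of that proving cadence: Filecoin's WindowPoSt divides the 24-hour proving period into 48 half-hour deadline windows, and each partition of sectors must be proven inside its window. The sketch below is a hypothetical helper, not Lotus code, just to show the arithmetic:

```go
// Illustrative sketch of the WindowPoSt cadence: a 24-hour proving
// period split into 48 deadline windows of 30 minutes each.
// The deadlineIndex helper is hypothetical, not a Lotus API.
package main

import (
	"fmt"
	"time"
)

const (
	ProvingPeriod = 24 * time.Hour
	NumDeadlines  = 48
	DeadlineSpan  = ProvingPeriod / NumDeadlines // 30 minutes
)

// deadlineIndex returns which proving window is open at time t,
// relative to the start of the current proving period.
func deadlineIndex(periodStart, t time.Time) int {
	elapsed := t.Sub(periodStart) % ProvingPeriod
	return int(elapsed / DeadlineSpan)
}

func main() {
	start := time.Now().Truncate(ProvingPeriod)
	fmt.Printf("current deadline window: %d of %d\n",
		deadlineIndex(start, time.Now()), NumDeadlines)
}
```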
So we know that capacity is there. By the way, the blockchain and all of that is in IPLD, using CBOR for this.
It's a totally IPFS-native blockchain system.
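To make "the chain is IPLD" concrete, here is a minimal sketch (my illustration, not Lotus code) that decodes a DAG-CBOR block into a generic IPLD node with go-ipld-prime; the byte slice is a stand-in for real block bytes from the chain store:

```go
// Decode a DAG-CBOR block into a generic IPLD node.
// The raw bytes here are a placeholder ({"data": 1} in CBOR);
// real block bytes would come from the chain store.
package main

import (
	"bytes"
	"fmt"

	"github.com/ipld/go-ipld-prime/codec/dagcbor"
	"github.com/ipld/go-ipld-prime/node/basicnode"
)

func main() {
	raw := []byte{0xa1, 0x64, 0x64, 0x61, 0x74, 0x61, 0x01}

	nb := basicnode.Prototype.Any.NewBuilder()
	if err := dagcbor.Decode(nb, bytes.NewReader(raw)); err != nil {
		panic(err)
	}
	n := nb.Build()

	v, _ := n.LookupByString("data")
	i, _ := v.AsInt()
	fmt.Println("data =", i) // data = 1
}
```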
This is the capacity, this is the capacity growth, and this is the data onboarding. It's onboarding on the order of 0.5 to 1 petabyte a day, and that's a lot of data. So we went from Kubo to Lotus, where Kubo was, you know, at the beginning, built for dealing with megabytes and gigabytes, making its way to terabytes (like, you finally made it), and then Lotus is like: bam, petabytes, exabytes.
Now, of course, it's not petabytes or exabytes in data distribution; that's capacity, and it ends up dealing with all of the storage outside. So one extremely big difference between all the Filecoin implementations and Kubo and other IPFS implementations is that the data size is so big that you have to do a lot of stuff outside of any kind of DAG manipulation, or outside of, like, one nice DAG store or something like that.
There are a lot of native web3 use cases: things like consumer storage, video and audio, NFTs, web3.storage, and so on. All of those have tens of thousands of users (developers, mostly) generating millions to tens of millions of objects; I think it's getting to on the order of 100 million. But all of that stuff is tiny.
Now, Filecoin still has to operate really well for all the web3 use cases, because that's where a lot of the web3 applications are. So it has to do both: do really well at being the underlying datastore for those web3 use cases, and start pulling in large-scale web2 things. Where it's headed is enabling large-scale web3 applications. This is what needs to happen in order for web3 to cross the chasm.
A
We
need
to
be
able
to
build
things
like
all
the
applications
on
the
left
using
web
through
primitives,
so
things
like
ipfs
things
like
blockchains
and
whatnot.
It's
a
long
road
to
get
here.
The
big
kind
of
like
next
big
blocker
is
things
like
data
pipelines
and
consensus,
scalability
and
so
on,
and
there's
a
lot
of
kind
of
direction
into
computation.
That
falcon
is
taking
so
things
like
pluck
one
out
of
the
fvm,
which
is
a
wasm
based
virtual
machine.
The
the
upgrade
just
went
live
earlier
this
week.
A
Sorry
last
week
I
think
last
week
it's
blur,
and
that
is
going
to
enable
a
bunch
of
different
runtimes
to
be
added,
on
top,
so
think
of
being
able
to
add
the
evm
or
algorith
ses,
or
things
like
that
swings
out
and
whatnot
and
a
bunch
of
other
other
systems
or
target
wasm
directly
through
some
some
other
kind
of
wasm
native
runtime
from
there.
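A toy illustration of why a WASM-based VM can host many runtimes: anything that compiles to WASM runs. This sketch uses wasmtime-go as a stand-in host (the FVM's actual runtime and APIs differ; nothing here should be read as the FVM interface) to instantiate and call a tiny module:

```go
// Instantiate and call a trivial WASM module with wasmtime-go.
// This is a generic WASM host, used here only to illustrate the idea
// of a WASM-based VM; it is not the FVM.
package main

import (
	"fmt"

	"github.com/bytecodealliance/wasmtime-go"
)

func main() {
	// A tiny module exporting add(i32, i32) -> i32.
	wat := `(module
	  (func (export "add") (param i32 i32) (result i32)
	    local.get 0
	    local.get 1
	    i32.add))`

	engine := wasmtime.NewEngine()
	wasm, err := wasmtime.Wat2Wasm(wat)
	if err != nil {
		panic(err)
	}
	module, err := wasmtime.NewModule(engine, wasm)
	if err != nil {
		panic(err)
	}
	store := wasmtime.NewStore(engine)
	instance, err := wasmtime.NewInstance(store, module, nil)
	if err != nil {
		panic(err)
	}

	add := instance.GetFunc(store, "add")
	sum, err := add.Call(store, 2, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println("2 + 3 =", sum) // 2 + 3 = 5
}
```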
One reason that there are going to be many computational networks is that cryptographic primitives are very different, and so they yield different rationales, different economic structures for getting verifiability, and very different performance profiles. So it's unlikely that we'll see one single computational network; we'll probably see many for a while, and it'll take a while for these to synthesize. There's a bunch of next-gen scalability
that needs to happen. This is what the ConsensusLab group is working on, and the goal is to get to billions or trillions of transactions per second through things like applying hierarchical consensus and so on. But we really want to get to blockchains that have very fast finality, things like millisecond finality within a data center. That means you have a blockchain that can operate within a region and then checkpoint up. Great.
So let's talk about IPFS. Filecoin is an IPFS system. At the end of the day, the entire blockchain is IPLD data, all of the sectors are IPLD data, the data within the sectors is IPLD data, all of this stuff is IPLD, and the transfer protocols are IPFS-oriented protocols. But Lotus started in a different direction.
The DHT is not going to scale for petabytes or hundreds of petabytes of stuff, so we really need much more scalable content routers. That's what the whole session tomorrow is about: very scalable content routing systems. You basically need O(1) network accesses to be able to do this with very large indices, and you need decentralization, which is tricky. But basically you don't want to do something like a DHT layout, because then you end up with a bunch of hops, and you end up advertising petabytes of stuff into the DHT.
Filecoin has a bunch of different consensus nodes and storage provider nodes: Lotus, Venus, Forest, and Fuhon. This comes from blockchain client diversity requirements, where you want separate, independent codebases to make sure that if you get into some problem with one of the nodes, the network can potentially carry on. I found this table; I don't know if it's up to date, but it's a table that implies where things are, and you'll hear about Lotus in a moment.
There are other programs and other nodes. Think of Boost as a layer that gets added to things like Lotus and other nodes, which can do storage and retrieval for a storage provider. Think of it as kind of an interop layer, plus a bunch of added tools and systems, to be able to mediate the client and storage provider deal-making: like, "oh, I want to retrieve this thing and I want to pay you", or "hey,
I want you to take this data and store it." Once you start moving around large amounts of data, like tens of gigabytes to terabytes, you have to deal with all kinds of scheduling problems and so on, especially when you have some computation that you're going to run in the background, like sealing. That's where things like Boost come in. There's another implementation called Estuary, which we'll also hear about later today, I think. Think of that as an IPFS implementation tuned for medium-scale data onboarding into Filecoin.
So this is like 10 gigabytes to a petabyte; it can't handle 10 petabytes plus. At that point you want to move things outside of the wire; you just don't want to use the internet. But 10 gigabytes to a petabyte is the sweet spot for Estuary, and think of Estuary as an intermediate node between clients and storage providers.
Filecoin nodes are IPFS nodes. So think of this diagram, where the libp2p network is a very large network, IPFS is on top of that, and Filecoin is a subset of the IPFS network.
It's IPLD and UnixFS all the way down: all the blockchain data is IPLD data and so on. Files today are imported like regular POSIX files, imported as UnixFS; maybe they should be WNFS. The transports today are Bitswap and GraphSync, and the network is libp2p, HTTP, or offline.
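A small sketch of what "IPLD all the way down" means in practice: every blob of bytes gets a self-describing content address. This hypothetical example hashes raw bytes into a CIDv1 with go-cid and go-multihash, the same kind of identifier Filecoin deals, sectors, and chain objects are keyed by:

```go
// Compute a CIDv1 for a raw byte blob: sha2-256 multihash wrapped
// with the "raw" codec. Any IPLD block is addressed this way.
package main

import (
	"fmt"

	"github.com/ipfs/go-cid"
	"github.com/multiformats/go-multihash"
)

func main() {
	data := []byte("hello filecoin")

	mh, err := multihash.Sum(data, multihash.SHA2_256, -1)
	if err != nil {
		panic(err)
	}
	c := cid.NewCidV1(cid.Raw, mh)
	fmt.Println(c) // base32 CIDv1, e.g. bafkrei...
}
```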
So there's a lot of offline movement of data, because at these scales you just can't use the internet. One important thing: a lot of people always talk about Filecoin and IPFS interop, but that's a misnomer.
Think of it as Lotus and Kubo interop. Part of what led to this was that people were calling Lotus "Filecoin" and people were calling go-ipfs "IPFS", and so we have to undo this conceptual damage. I'm just gonna let that stay up for a moment; just burn it into your mind.
So that's a good one. Filecoin was designed to have all these different components that are meant to compose when you want to create a node. So think of a consensus node as having a set of libraries: it has a libp2p node, a local repository, some local notion of time, and a bunch of facilities for the blockchain.
Think of a storage mining node as having all of that, plus the ability to deal with files and the ability to do the storage mining components. Think of a storage provider as all of that, plus the ability to make deals and participate in the storage market. And think of a retrieval provider node as not having to maintain the blockchain state at all: it's just another node and a client, and doesn't have to check in with consensus.
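Here's an illustrative sketch of that composition (hypothetical types, not Lotus's actual ones), where each node type embeds the previous one and adds capabilities:

```go
// Hypothetical shapes for the node types described above.
// Each type is the previous one plus extra capabilities.
package main

// Placeholder capability types standing in for real subsystems.
type Libp2pHost struct{} // peer-to-peer networking
type Repo struct{}       // local repository
type Clock struct{}      // local notion of time
type ChainStore struct{} // blockchain facilities

type ConsensusNode struct {
	Host  Libp2pHost
	Repo  Repo
	Clock Clock
	Chain ChainStore
}

type FileStore struct{} // deal with files
type Sealer struct{}    // storage mining (sealing/proving) components

type StorageMiningNode struct {
	ConsensusNode // all of the above, plus:
	Files  FileStore
	Sealer Sealer
}

type DealMarket struct{} // make deals, participate in the storage market

type StorageProviderNode struct {
	StorageMiningNode // all of the above, plus:
	Market DealMarket
}

// A retrieval provider carries no chain state at all: it's just
// another node serving clients, with no consensus check-ins.
type RetrievalProviderNode struct {
	Host  Libp2pHost
	Files FileStore
}

func main() {}
```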
That's in theory. So that's the theory of the protocols, and the protocols allow that kind of thing. In practice, the implementations are a lot murkier and the libraries are not as easily decoupled and so on, because when you're making a thing, you encounter all kinds of constraints that are different from the theoretical constraints. So it's not as easy as plug-and-play to build these things; they end up being totally different codebases. So, some quick notes on retrieval, interop, and indexers.
I think this is a useless slide now, so I'll make a couple of comments and then I'll hop off. On retrieval: the whole storage flow is working pretty well now, but the retrieval flow is not working very well yet. This is why we made an indexer service that can index all of the stuff that the storage providers have, ingest it into one place, and then make it accessible.
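As a rough sketch of what querying that service looks like over HTTP (the cid.contact endpoint and response shape here are my assumptions based on the public IPNI indexer, not something stated in the talk; check its docs before relying on this):

```go
// Look up which providers advertised a CID via the network indexer.
// Endpoint and response format are assumptions about the public
// cid.contact (IPNI) service.
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	c := "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi" // example CID

	resp, err := http.Get("https://cid.contact/cid/" + c)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The JSON body lists provider records (peer IDs, addresses,
	// transport metadata) for providers that advertised this CID.
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```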
So that's live now, and it's getting connected to the IPFS gateway; you'll probably hear about that tomorrow. And then we want retrieval provider networks to be able to use that indexing information and pull data from storage providers. Those retrieval provider networks are being tested right now; they're getting built at the moment. There are some testnets live that work and are trying to serve the gateway and so on, but they're not good enough yet.
You have to make that retrieval flow work well. So today you can relatively easily write your data into Filecoin, depending on how big it is: if it's big, it's going to be harder; if it's small, it's very easy using many of the on-ramps. But then retrieving it again depends. If you use an on-ramp that has really nice caching for you and solves it for you, great; things like web3.storage and NFT.storage and so on all work really well.
So that's what retrieval networks and retrieval markets are aiming for: on the order of 100,000 to 10 million nodes, and we're thinking of being able to deal with 10^18 objects. That's web scale; maybe 10^15 is good enough, but 10^18 is safe territory. And these will likely end up being tiered.
The L2 might have a bunch of content that was positioned there ahead of time, predicting that the data is going to be requested in a region; and then, if all of those fail, the L1 might just go straight to the L3, which is the SP. So think of standard system-style caching, but per region, and think of these as being deployed in a locale so that you minimize the latency; you don't want to be crossing continents.
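An illustrative sketch of that tiered lookup (all names here hypothetical): try the nearest cache first and fall back toward the storage provider:

```go
// Tiered retrieval: consult L1 (edge), then L2 (regional), then the
// storage provider (L3). Purely illustrative; not a real client.
package main

import (
	"errors"
	"fmt"
)

type tier struct {
	name  string
	fetch func(cid string) ([]byte, error)
}

// retrieve walks the tiers in order and returns the first hit.
func retrieve(cid string, tiers []tier) ([]byte, error) {
	for _, t := range tiers {
		if data, err := t.fetch(cid); err == nil {
			fmt.Println("served from", t.name)
			return data, nil
		}
	}
	return nil, errors.New("not retrievable from any tier")
}

func main() {
	miss := func(string) ([]byte, error) { return nil, errors.New("miss") }
	hit := func(string) ([]byte, error) { return []byte("block bytes"), nil }

	tiers := []tier{
		{"L1 edge cache", miss},
		{"L2 regional cache", miss},
		{"L3 storage provider", hit},
	}
	if _, err := retrieve("bafy...", tiers); err != nil {
		panic(err)
	}
}
```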
Think of it as a web-scale data repository where you can dump a bunch of data in and then run a bunch of data pipelines and computation, to then build other kinds of applications. So all this stuff can take large archives, like all the crypto trees from all the applications, and just be this cold storage of all the stuff; and then whatever you want to
pin, you get close to the end users, and you keep the local hot caches for everything closer to wherever the person is. Cool. That's it.