From YouTube: The Rise of Elastic IPFS - @alanshaw - Connecting IPFS
Description
The Rise of Elastic IPFS - presented by @alanshaw at IPFS þing 2022 - Connecting IPFS - https://2022.ipfs-thing.io
A: Hi, I'm Alan. This is the talk about the rise of Elastic IPFS. So, as you've just learned a little bit, Elastic IPFS is a new open-source IPFS implementation that runs in the cloud. It separates read and write pipelines to allow it to scale massively.
A: If you're interested in the Elastic IPFS architecture and you're watching this video in the future, then you should look at Francisco's talk that he just gave. And also, like I said, there's a deep dive into the provider subsystem later today in the content routing performance track with me and Paulo.
A: This talk, though, is the story of how we got our initial implementation out of the door and into production. So let's go. Let's first talk a little bit about what this all kind of hinges on.
A: So what we have is this tool, which is continuously collecting data on the uploads that both nft.storage and web3.storage receive. The gist of it is: we pick a CID that we know we're storing, we pick a peer that we know is storing that data, we do some checks on that information, and then we graph it on this nice Grafana panel.

A: These are the headline stats. At the top left there's DHT provider records. This is the percentage of time where we ask the DHT whether there is a provider record saying that this CID is being provided by this peer, and you can see that at this period of time we were not doing so well, at about 50% availability.
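That provider-record probe is easy to reproduce against any IPFS node. Below is a minimal sketch of the DHT side of the check, assuming Node 18+ (for global fetch) and a local Kubo daemon exposing its RPC API on the default port; the CID and peer ID are hypothetical placeholders, and this is not the actual checkup code.

```js
// check-provider-record.mjs: does the DHT hold a provider record for this
// CID that points at the peer we expect? (Sketch only.)
const cid = 'bafybeiexamplecid'              // hypothetical: a CID we believe we store
const expectedPeer = '12D3KooWExamplePeer'   // hypothetical: the peer meant to provide it

const res = await fetch(
  `http://127.0.0.1:5001/api/v0/dht/findprovs?arg=${cid}`,
  { method: 'POST' }
)

// The RPC streams newline-delimited JSON query events; Type 4 ("Provider")
// events carry the peers that hold provider records for the CID.
let found = false
for (const line of (await res.text()).split('\n')) {
  if (!line.trim()) continue
  const event = JSON.parse(line)
  if (event.Type === 4) {
    found ||= (event.Responses ?? []).some(p => p.ID === expectedPeer)
  }
}
console.log(found ? 'DHT provider record found' : 'no provider record for that peer')
```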
A
Top
top
right
here
is
bit
swap
availability.
So
what
we
do
is
we
make
a
p2p
connection
to
the
peer
that
we
know
is
meant
to
be
storing
that
cid
and
we
ask
it
using
a
bitswap.
Have
message:
do
you
have
this
cid
of,
and
so
it's
meant
to
say
yes,
doesn't
always
happen,
that's
bad!
That's!
But
anyway,
that's
that
that's
that
panel
we
also
chart
like
checks
per
second
but
connection
errors.
So
what
can
happen
is
when
we're
trying
to
connect
to
that
pier.
A
We
might
experience
a
connection
error
because
it
is
very
busy
or
it's
broken
or
down,
and
that's
no
good
and
that's
often
the
reason
why
this
is
not
100
connection.
Error
is
also
bad
at
this
period
of
time,
and-
and
so
this
bottom
right
panel
is
bit
swap
round
trip
time.
A: That's the actual time it takes to send that HAVE message and receive a response to it. So those are the kind of headline stats.
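The Bitswap side of the check needs a real libp2p connection rather than a plain HTTP call, so here is only the rough shape of it. `connectToPeer` and `sendWantHave` are hypothetical stand-ins for the libp2p dial and the Bitswap WANT-HAVE/HAVE exchange that checkup performs; this is a sketch, not checkup's implementation.

```js
// Probe one (cid, peer) pair for availability and round-trip time.
async function checkBitswap (cid, peerMultiaddr) {
  let conn
  try {
    conn = await connectToPeer(peerMultiaddr)   // hypothetical: libp2p dial
  } catch (err) {
    // Feeds the "connection errors" panel: peer busy, broken, or down.
    return { connectionError: err.message }
  }
  const start = Date.now()
  // Hypothetical: send a WANT-HAVE entry for the CID and wait for the
  // matching HAVE / DONT_HAVE response on the Bitswap stream.
  const response = await sendWantHave(conn, cid)
  return {
    have: response.have,        // feeds "Bitswap availability"
    rttMs: Date.now() - start   // feeds "Bitswap round trip time"
  }
}
```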
A: You can get a real overview of all of the peers that we're running in all of our clusters for nft.storage, and from there you can drill down into per-peer metrics. And so this is rainbow mode. Rainbow mode is not meant to be a thing: it indicates that our peers are acting very erratically, and that is not something you want from your production infrastructure.
A: So this becomes more useful when you actually filter by a particular peer. I've just selected random ones here; they're not all the same. This top left one was having a really bad time on connections, and maybe it got restarted, and then you can see it's doing a lot better.
A: This one: we were finding provider records for this particular peer for every CID that we checked. Well, not every CID, but most of them, and then it started to have a bit of a bad time. Bitswap for this one was doing okay, then had a really bad time, and then maybe it got restarted again. You kind of get the idea; this is drilling down into peers, as you can see from here.
A: This is a really good indication of when there's a peer that's currently in trouble, that's struggling in some way, because the checks are telling us that bad things are going on. Anyway, you get the idea: all of those metrics that you see in that Grafana are specific to the data that we're storing.
A: So these are CIDs that we know have been uploaded to web3.storage and nft.storage, so they're specific to us. But this all hinges on the ipfs-check tool, which is a generic, public, open-source API that is available for anyone to use, and anyone can run it themselves.
A: We actually run our own one as well, and so you can just go and put in your CID and your peer and check whether good things are happening, and it will show you the results afterwards. It's made by Adin, because he's a wizard, and we lean on it really heavily for those stats, so thank you! Okay, so anyway: how we got to production. Checkup gave us the tools to determine how Elastic IPFS was performing.
A: Basically, as you learned in the previous talk, Elastic IPFS works off of S3 buckets. The CAR files that people upload go straight into S3 buckets, and it reads the blocks directly out of those CAR files.
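To make that concrete, here is a minimal sketch of reading one block straight out of a CAR file in S3 with the AWS SDK, assuming an index entry that records which CAR a block lives in and at what byte offset and length; the bucket, key, and entry shape are hypothetical, not Elastic IPFS's actual schema.

```js
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'

const s3 = new S3Client({ region: 'us-east-1' }) // hypothetical region

// entry = { bucket, key, offset, length }: a hypothetical index record
// locating one block inside a CAR file.
async function readBlock (entry) {
  const res = await s3.send(new GetObjectCommand({
    Bucket: entry.bucket,
    Key: entry.key,
    // Fetch only this block's bytes, not the whole CAR file.
    Range: `bytes=${entry.offset}-${entry.offset + entry.length - 1}`
  }))
  return Buffer.from(await res.Body.transformToByteArray())
}
```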
A: In our dot-storage APIs at the moment, we already write to S3 buckets, and we've been doing that forever for disaster recovery, just in case our Cluster decided to blow up: we've still got some kind of extra backup of all of the data that we could restore from if anything happened. But that turned out to be really good, because it meant that we could get Elastic IPFS up and running, ingest all our existing data and any new data that was coming in, without putting Elastic IPFS on the critical path for either of those products. So, yeah.
A: That turned out to be really good. Anyway, these are the things that happened, and the graphs that show the resolution of those things. First of all, when the implementation was done, when Paulo and the team finished building it, we did this kind of sanity step zero, an "is this thing a goer?" check. This was a really naive one-to-one connection: just try and transfer something over it.
A: So it's not really typical of IPFS, because potentially you'll be able to get stuff over Bitswap from multiple peers, but this is just: is it actually going to be usable? And just to note, we do expect Elastic IPFS to be a little bit slower than go-ipfs, because we're trading off network I/O, where we're fetching stuff from S3 buckets and the indexes from DynamoDB, against disk I/O, like SSD disk.
A: You can read from that really, really fast. So anyway, this is a speed test from before we did any optimizations; it was just fresh out of the door, and we found it was, you know, reasonable. It was usable, and so we were like: okay, right, let's continue. And so there are optimizations that we've done and that are still to come, which I'll talk about a little bit later.
A: We noticed that it was all the way up here, around six milliseconds, which is a little bit slower than regular go-ipfs (Kubo). We managed to almost halve that round-trip time, and it's now a tiny bit slower than go-ipfs on a good day, but still consistently better than a lot of the peers in our clusters at the moment. So what did we do to optimize it?
A: Yeah, that's the only one. I actually don't know what happened, so, thanks. All right, anyway.
A: So the next thing that happened to us was that the indexer nodes came online, and this was amazing to watch, because I was camping in a field at the time, just sat on my mobile on Grafana, refreshing it. Within a few days they had read all of the advertisements that we'd generated, and effectively they'd indexed the majority of all data ever uploaded to web3.storage and nft.storage. That's terabytes of data; that's more than 1.5 billion CIDs in total.
A: And once it got up there, it basically stayed up there ever since, which is incredible. I don't know, maybe we can go back to when you looked at rainbow mode; this is the same sort of thing that was happening. This is the same graph, but it was like that everywhere, whereas we're now consistently up in the high 90s to 100 percent.
A: The cool thing about this is that once the indexer nodes had indexed all of this data, it meant that those provider records were essentially available on the DHT, so people started discovering Elastic IPFS, and that put it under load, which is rad. But then we saw connection errors. We started seeing these in the graphs, and we were like: oh no, what's happened?
A
This
is
where
we
fixed
that,
but
we
we
in
node
there's
this
massive
foot
gun
if
you've
got
an
event
emitter
and
you
omit
an
event
called
error
and
don't
listen
for
that
event.
Then
it
just
takes
down
the
whole
process
and
turns
out
we'd
missed
that
in
one
of
the
connections
that
we
were
making,
and
that
was
what
was
causing
that.
So
this
is
where
we
fixed
it.
I
only
started
seeing
it
when
we
started
getting
loads
of
traffic
and
then,
after
that
that
was
fixed.
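The foot gun is easy to demonstrate: in Node, an 'error' event with no listener attached becomes an uncaught exception, and attaching a handler is the whole fix. A minimal illustration:

```js
import { EventEmitter } from 'node:events'

const conn = new EventEmitter()

// Without this listener, conn.emit('error', ...) below throws an uncaught
// exception and takes down the entire process.
conn.on('error', err => {
  console.error('connection error:', err.message)
})

// With the listener attached, the error is logged instead of being fatal.
conn.emit('error', new Error('peer reset the connection'))
```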
A: The thing we did realize was that we weren't currently graphing in checkup the time between a user uploading a CAR file and that data being available on the IPFS network, and by that I mean that people can connect to an IPFS peer and transfer it via Bitswap. That's important to us because, as soon as people can transfer it via Bitswap, it's available on the gateways, essentially, and a lot of our read traffic is through gateways.
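A sketch of that measurement, reusing the hypothetical `checkBitswap` helper from the earlier sketch: record when the CAR upload completed, poll the peer until Bitswap reports the root CID, and graph the difference.

```js
// Hypothetical sketch of the "time to availability" metric, in milliseconds.
async function timeToAvailable (rootCid, peerMultiaddr, uploadedAt) {
  for (;;) {
    const { have } = await checkBitswap(rootCid, peerMultiaddr)
    if (have) return Date.now() - uploadedAt
    await new Promise(resolve => setTimeout(resolve, 500)) // arbitrary poll interval
  }
}
```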
A
So
this
is
an
important
metric
for
us
to
us
to
be
tracking,
and
so
we
did
some
changes,
and
now
we
can
graph
this
that,
like
this,
is
currently
at,
like
the
actual
time
to
index,
a
car
varies
based
on
like
the
size
of
a
car,
but
also
the
number
of
blocks.
The
number
of
blocks
in
the
block
size
as
well
can
change
so
it
like
it's
diff
different
this.
This
is
our
old.
This
is
the
old
value.
A: We made some changes recently to DynamoDB, which we already talked about, where we now bulk-write, which I think basically halved this, or something like that. So essentially, this means that CAR files that get uploaded are available on the gateways in, you know, less than a second-ish, depending on the size and the number of blocks in the CAR.
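With AWS SDK v3 the bulk write looks roughly like the sketch below; BatchWrite accepts at most 25 items per request, so the entries are chunked. The table name and item shape are hypothetical, not Elastic IPFS's real schema.

```js
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { DynamoDBDocumentClient, BatchWriteCommand } from '@aws-sdk/lib-dynamodb'

const db = DynamoDBDocumentClient.from(new DynamoDBClient({}))

// entries: hypothetical index records, e.g. { multihash, carKey, offset, length }
async function writeIndexEntries (entries) {
  for (let i = 0; i < entries.length; i += 25) { // BatchWrite limit: 25 items
    await db.send(new BatchWriteCommand({
      RequestItems: {
        // 'blocks' is a hypothetical table name.
        blocks: entries.slice(i, i + 25).map(item => ({
          PutRequest: { Item: item }
        }))
      }
    }))
    // Production code would also retry any UnprocessedItems in the response.
  }
}
```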
A: Cool. So yeah, network availability: we did that. Then this was the second kind of hit: we realized that there were still some connection errors, and they were causing these little bumps in our Bitswap no-responses, so things were trying to connect and not getting responses. The reason, we found, was that we actually depend on a native dependency called sodium-native.
A: It's used by noise to do the connection encryption, and what was happening was that, under load, that native dependency was somehow triggering a race condition in Node 16 and just pulling the whole process down. So the whole Kubernetes container had to be restarted, essentially, which is not good when you need a Bitswap connection that stays open for a long period of time to send stuff. So this is where we fixed that.
A: It turns out, and this is the first time that's ever happened to me, that the internet said it had been fixed in Node 17.
A
So
we
we
took
a
punt
on
that
and
upgraded
to
node
18,
because
that's
the
next
long-term
support
version
in
in
our
bit
swap
peers
and
it
fixed
it,
and
I
can't
believe
it
worked.
I
can't
believe
it
worked,
but
it
did-
and
I
was
so
happy.
That's
me
happy
so
yeah.
So
then
then,
like
we
didn't
see
that
error
ever
again,
so
which
is
right
and
so
that
at
this
point
we
were
feeling
pretty
confident
in
elastic
ipfs.
A
We
had
consistently
good
metrics
on
the
checkups
for
for
kind
of
a
few
weeks
and
we
decided
to
do
like
a
soft
deploy
in
in
cloudflare
workers.
What
you
can
do
is
you
can
you
can
essentially
keep
the
worker
alive
and
running
without
blocking
on
sending
a
response
to
the
user?
So
previously
what
we
were
doing
is
we
were
uploading
cars
to
cluster.
A
We
were
also
uploading
them
to
s3,
at
the
same
time,
waiting
for
both
both
of
those
tasks
to
finish
and
then
responding
responding
to
the
to
the
user,
and
so
we
switched
that
round
so
that
we
uploaded
to
s3.
We
responded
to
the
user
and
in
the
background
we
still
upload
to
cluster.
So
it's
also
also
in
cluster,
and
this
happened.
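In a module-style Worker, that pattern hinges on `ctx.waitUntil`, which keeps the Worker alive past the response. A minimal sketch, where `uploadToS3` and `uploadToCluster` are hypothetical helpers rather than the real dot-storage code:

```js
export default {
  async fetch (request, env, ctx) {
    const car = await request.arrayBuffer()

    // Critical path: the S3 write that Elastic IPFS reads and indexes from.
    await uploadToS3(env, car)                 // hypothetical helper

    // Background: keep Cluster populated without blocking the response.
    ctx.waitUntil(uploadToCluster(env, car))   // hypothetical helper

    return new Response('upload accepted', { status: 202 })
  }
}
```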
A: We saw a massive reduction in the amount of time it takes to upload stuff, and a huge reduction in variance as well, which is incredible. Everyone was very happy and got excited, and yeah, it meant that there was just a whole lot less fire to deal with, and we had a very good time. So the only thing left to do now is to take Cluster out of the picture completely for uploads.
A
We
still
use
it
for
pinning
service
apis,
so
people
still
gonna,
pin
cids
to
us
and
we
need
cluster
to
actually
go
and
fetch
stuff
from
the
network.
So
cluster
can
basically
do
what
it
does
best
like
it.
A
It
is
for
pinning
things
and
that's
what
it
does:
pinning
data
finding
and
fetching
data
from
the
ipfs
network
and
storing
it
and
so
uploads
actual
the
car
file
uploads
can
can
can
go
straight
in
straight
into
s3,
be
indexed
by
elastic
ipfs
and
be
available
like
that,
and
so
as
we
need
to
just
do
the
hard
deploy
and
and
then
also
the
optimization.
So
these
are
some
of
the
optimizations
that
we've
done
and
are
thinking
of
we've
already
done.
A: We've already added an LRU cache to the Bitswap peers, so any blocks that have been seen very recently can be served without going to DynamoDB for the index information, or going to S3 to get the block from the CAR file.
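A minimal sketch of that cache: JavaScript Maps iterate in insertion order, which is enough for a simple LRU. `fetchBlock` is a hypothetical stand-in for the DynamoDB-index-plus-S3 path shown earlier.

```js
const MAX_ENTRIES = 1000 // hypothetical capacity
const cache = new Map()

async function getBlock (cidStr) {
  if (cache.has(cidStr)) {
    const block = cache.get(cidStr)
    cache.delete(cidStr)     // re-insert to mark as most recently used
    cache.set(cidStr, block)
    return block
  }
  const block = await fetchBlock(cidStr) // hypothetical: DynamoDB index + S3 range read
  cache.set(cidStr, block)
  if (cache.size > MAX_ENTRIES) {
    // Evict the least recently used entry (the oldest insertion).
    cache.delete(cache.keys().next().value)
  }
  return block
}
```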
A: We also need to take advantage of data proximity. Like I said, we always put the data in the backup bucket, and I think midway through Elastic IPFS being built,
A: we decided that we'd just use that bucket, rather than have a separate bucket that we upload to. But the Elastic IPFS stuff is all deployed in the west, and the bucket is in the east, so whenever the peers need to serve data, they need to get it from the other side of America. That's time, but it's also money for us, so yeah, we will fix that pretty soon. We can also have multi-region peers.
A
Currently,
all
of
our
bit
swap
peers
are
in
the
same
region,
so
we
could
put
them
in
multiple,
multiple
region,
regions
and
also
have
people
connect
to
them
in
a
place,
that's
closer
to
them
than
the
other
side
of
the
world,
which
would
be
rad
yeah
by
byron.
So
this
is
request,
optimization
yeah.
A: If we're being asked for multiple CIDs and they're in the same CAR file, rather than making separate requests for each block, what we could do is make one request that covers that whole range. Maybe we get some junk in the middle, but that might be faster than making separate requests for each block.
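A sketch of that coalescing: compute one byte range covering all the wanted blocks in a CAR, issue a single ranged GET, and slice the blocks back out, tolerating the junk in between. The index entries and client are the same hypothetical shapes as in the earlier sketch.

```js
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'

const s3 = new S3Client({ region: 'us-east-1' }) // hypothetical region

// entries: hypothetical index records for blocks in the SAME CAR file,
// each { bucket, key, offset, length }.
async function readBlocks (entries) {
  const start = Math.min(...entries.map(e => e.offset))
  const end = Math.max(...entries.map(e => e.offset + e.length)) // exclusive

  // One request covering the whole span instead of one request per block.
  const res = await s3.send(new GetObjectCommand({
    Bucket: entries[0].bucket,
    Key: entries[0].key,
    Range: `bytes=${start}-${end - 1}`
  }))
  const bytes = Buffer.from(await res.Body.transformToByteArray())

  // Slice each block back out of the single response.
  return entries.map(e =>
    bytes.subarray(e.offset - start, e.offset - start + e.length)
  )
}
```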
A: And then we could also take advantage of data locality. So if you've uploaded a CAR file with a DAG in it, and then someone starts to Bitswap it, it's likely they're going to want the other blocks in that same CAR file, because it's the same DAG. So instead of serving each individual block, we could just preload that whole CAR into the cache, so that as the Bitswap session progresses, as they ask for the root and then ask for more, we've already got that stuff to serve to them straight away.
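A sketch of that preloading idea, walking the CAR's blocks with @ipld/car; `cache` is the LRU from the earlier sketch, and the bucket and key are hypothetical.

```js
import { CarBlockIterator } from '@ipld/car'
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'

const s3 = new S3Client({ region: 'us-east-1' }) // hypothetical region

// On the first want hitting a CAR, warm the cache with every block in that
// CAR so the rest of the Bitswap session is served from memory.
async function preloadCar (bucket, key, cache) {
  const res = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }))
  // In Node, res.Body is a readable stream (async iterable) of the CAR bytes.
  for await (const { cid, bytes } of await CarBlockIterator.fromIterable(res.Body)) {
    cache.set(cid.toString(), bytes)
  }
}
```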
A: So we might be able to do something there. And also, yeah, we'd love to maybe switch to R2, or use R2 from Cloudflare, because of the free egress; S3 costs money for egress. And that's all of the optimizations, and that's the end of my talk. You've missed it!
B: So those graphs that you showed still won't have the second point, right?
B: So when you're scanning your CAR file, you're not skipping blocks anymore, you're just batching: reading each block and writing them in bulk, right?
C: One range query, in order to just get the first one for that particular multihash. One of the cool things that we've talked about is that we should just generate CARv2 indexes for everything, ever, and then once we see that a bunch of CIDs are in a couple of different CARs, we should just grab those CAR indexes, and I bet they're all in one, actually, and then you just do one big range request.
B: The check monitoring tool that you used for evaluating when you were ready to switch to Elastic provider: I know it's using ipfs-check under the hood, but the packaging of that into a "hey, I want to monitor this set of CIDs" for an infrastructure operator, is that something that's open source?
A: Yeah, it's called checkup. It's in a GitHub repo in the web3-storage org, and you just set environment variables, like the Cluster API URL that you want to use. We've changed it, obviously, to also include Elastic provider, so it is specific to our setup in that we have Cluster and Elastic provider, but other than that, anyone can use it. It's open source and available.