From YouTube: Bitswap and IPFS real-time metrics analysis - Leo Balduf
Description
This talk was given at IPFS Camp 2022 in Lisbon, Portugal.
My name is Leo, and I will be talking about real-time metrics, mostly about Bitswap and some other IPFS basics.
So, as we heard earlier, all content retrieval in IPFS starts with Bitswap, so it's quite nice that I'm first. So why do I want to do this, or why do we want to do this?
We have a bunch of fancy Grafana dashboards which display real-time information about the network. You probably can't read this, but it is real-time.
Maybe you can read the link at the bottom, which is grafana.monitoring.ipfs.trudi.group, where you can go and look at this stuff in real time. All right.
We currently have a project with PL which has two parts. The first part is about this: collecting metrics, analyzing them, and visualizing them, mostly in real time wherever possible. That's this talk. The other part is about tracking content throughout its lifetime, which is not this talk, so just to set your expectations.
I will start a bit at the beginning: I will introduce some basics about IPFS, libp2p, Bitswap and whatnot, so we have a little bit of background to work on. Then I'm going to talk about real-time analysis, mostly of Bitswap and other IPFS stuff, and we're going to have a lot of time for Q&A at the end. All right, the basics.
We already heard that IPFS is built on top of libp2p, and libp2p does so-called stream multiplexing.
Then there is a NAT hole punching protocol, which I won't talk about today, and then there's Bitswap, which I will talk about. On this layer already, so with just the identify protocol, we can ask a bunch of interesting questions: for example, how many nodes support some protocol. I know that the new hole punching approach is being rolled out, and an interesting question would be how many nodes support it at the moment, and of course, how does this change over time?
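To make that concrete: identify results land in the local peerstore, so this kind of question is a simple lookup. Here is a minimal sketch in Go, assuming a plain go-libp2p host; exact peerstore signatures vary between go-libp2p versions, so treat this as illustrative rather than the talk's actual tooling.

```go
// A minimal sketch: count how many connected peers advertise a given
// protocol, using the data the identify protocol put into the peerstore.
package main

import (
	"fmt"
	"log"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/protocol"
)

// countSupport tallies the connected peers that advertise proto.
func countSupport(h host.Host, proto protocol.ID) int {
	n := 0
	for _, p := range h.Network().Peers() {
		protos, err := h.Peerstore().GetProtocols(p)
		if err != nil {
			continue // peer not identified yet
		}
		for _, pr := range protos {
			if pr == proto {
				n++
				break
			}
		}
	}
	return n
}

func main() {
	h, err := libp2p.New()
	if err != nil {
		log.Fatal(err)
	}
	defer h.Close()
	// "/libp2p/dcutr" is the protocol ID of the new hole punching approach.
	fmt.Println("peers supporting DCUtR:", countSupport(h, "/libp2p/dcutr"))
}
```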
Those are a few of the things we can ask at this layer, but we can also look deeper into the protocols themselves and think about whether we can use these protocols to extract metrics that are interesting to us.
So I mentioned earlier that there's the Kademlia protocol, which does the DHT stuff. There are a bunch of DHT crawlers using this protocol to find DHT servers on the network, infer the graph of DHT servers, and thereby basically the network core.
So we can do that. There's the identify protocol, which I mentioned earlier, which is about agent versions, supported stream protocols, transport protocols and all that, so we can use that to extract metrics about the network. Then there is Bitswap, which I deal with a lot. Bitswap is used, as we heard earlier, to initiate requests: every request is broadcast on Bitswap first, so we can collect these requests and learn about data requests on the network.
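As a sketch of what collecting these requests can look like: recent go-bitswap versions expose a tracer hook (the bitswap.WithTracer option) that sees every incoming and outgoing Bitswap message. Import paths and signatures vary between go-bitswap and Kubo versions, so this is an illustration of the idea, not the actual plugin code.

```go
// An illustrative sketch, assuming go-bitswap's tracer hook: log every
// WANT entry of every incoming Bitswap message.
package wanttracer

import (
	"fmt"
	"time"

	bsmsg "github.com/ipfs/go-bitswap/message"
	"github.com/libp2p/go-libp2p/core/peer"
)

// WantLogger implements go-bitswap's tracer.Tracer interface; an instance
// would be passed via bitswap.WithTracer(...) when constructing the exchange.
type WantLogger struct{}

func (WantLogger) MessageReceived(p peer.ID, msg bsmsg.BitSwapMessage) {
	for _, e := range msg.Wantlist() {
		// Timestamp, origin peer, request type (want-have vs. want-block),
		// and CID: the same fields as the traces shown later in this talk.
		fmt.Printf("%s %s %v %s\n",
			time.Now().Format(time.RFC3339Nano), p, e.WantType, e.Cid)
	}
}

func (WantLogger) MessageSent(peer.ID, bsmsg.BitSwapMessage) {}
```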
So, do we do this? Yes, of course. We are running a distributed setup to collect these metrics. We run multiple large monitoring nodes with unlimited connection capacity. That means: a normal IPFS node has some 600 to 900 connections at any time; we run nodes that have something like 20,000 connections at any time, because they have unlimited connection capacity.
We run unmodified Kubo, so we run unmodified software, with a plugin that we developed to extract these metrics for us. Then we have a client that connects to the plugin, collects these metrics in real time, and analyzes them. Finally, we display it on Grafana. The entire setup is a distributed system. It's a bit complicated: it's set up with Ansible and needs a bunch of servers which are connected through a VPN, but the end goal is to have this deployable by anyone.
As I said, the monitors are passive. This is easier to run, and in the end it also leads to a uniform sample of peers that each monitor is connected to. We analyzed this, and it's actually true, which is quite nice: each monitor gets a uniform sample of all peers on the network, so we can run statistics on those.
You need a custom client; it works, but it's not great, and for some things we had to reinvent the wheel, for example for backpressure and things like that. So on the agenda is to move this to another system that does the pub/sub for us, but it needs to deal with many thousands of requests per second, and I still have to figure that out.
All right, I will go into Bitswap in most detail in a second, but let's first maybe talk about this identify protocol. I mentioned earlier that every IPFS node runs this and exchanges information about the stream protocols that are supported, the agent version, the public key, and stuff like that. Any running IPFS node, any running Kubo node, keeps track of this and knows it locally, and our plugin exports this and makes it available for real-time analysis. Stuff that we get is, for example: for all the peers that we're connected to, what protocols do they speak?
What can they do? What agent versions are they running? What transport protocols do they support? We can also see the number of DHT servers, so the network core, as well as the number of DHT clients, and surprisingly, we are connected to DHT clients much more often than to DHT servers.
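For illustration, the agent version is just another peerstore lookup; a small sketch in the same spirit as before, assuming current go-libp2p, which stores the identify result under the "AgentVersion" key.

```go
// A small sketch: tally the agent versions of all connected peers, reading
// the value the identify protocol stored in the peerstore.
package identmetrics

import "github.com/libp2p/go-libp2p/core/host"

func agentVersions(h host.Host) map[string]int {
	counts := map[string]int{}
	for _, p := range h.Network().Peers() {
		v, err := h.Peerstore().Get(p, "AgentVersion") // set by identify
		if err != nil {
			continue // nothing recorded for this peer yet
		}
		if s, ok := v.(string); ok {
			counts[s]++ // e.g. "kubo/0.16.0/" or "go-ipfs/0.12.2/"
		}
	}
	return counts
}
```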
So what we can do with this passive monitoring setup is look at the fringe of the network, whereas DHT crawling can look at the core of the network. For example, another thing we can do is estimate the size of the network using multiple vantage points.
You can also do this for more than two monitors, and we model this as a modified coupon collector's problem, which is not so simple anymore, and I'm quite glad that someone else did that and not me. We have it in our paper. It works, yeah. In practice this is not completely real time: we just sample our monitors in intervals, at the same time, and then basically compute this.
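The paper has the full estimator; as a toy stand-in to convey the idea, the two-monitor case is essentially a capture-recapture (Lincoln-Petersen) estimate, where the overlap between the two peer samples plays the role of recaptures. Names and numbers below are made up for illustration; this is not the estimator from the paper.

```go
// A toy illustration: estimate total network size from two uniform peer
// samples via the Lincoln-Petersen estimate N ~= n1 * n2 / overlap.
package main

import "fmt"

// estimateNetworkSize takes the peer sets seen by two monitors sampled at
// the same instant; the overlap plays the role of "recaptures".
func estimateNetworkSize(n1, n2 map[string]bool) float64 {
	overlap := 0
	for p := range n1 {
		if n2[p] {
			overlap++
		}
	}
	if overlap == 0 {
		return 0 // no overlap, no estimate
	}
	return float64(len(n1)) * float64(len(n2)) / float64(overlap)
}

func main() {
	a := map[string]bool{"p1": true, "p2": true, "p3": true}
	b := map[string]bool{"p2": true, "p3": true, "p4": true}
	// 3 * 3 / 2 = 4.5 estimated peers in total
	fmt.Printf("estimated network size: %.1f peers\n", estimateNetworkSize(a, b))
}
```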
We can also do geolocation, I showed this on the first slide or so. We can geolocate peers: for the size estimates, for example, we could see where the peers are located. But we can also do this for requests, and we are currently doing this for all the requests we get. We get, I don't know, a few thousand up to ten thousand or so requests per second, and we geolocate the origin of those.
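Conceptually this is an IP-to-location lookup per request origin. A minimal sketch, assuming a local MaxMind GeoLite2 country database and the oschwald/maxminddb-golang reader; the database file name is an assumption, and so is whether the talk's setup uses this exact library.

```go
// A minimal sketch: map a request's origin IP to a country code using a
// local MaxMind database.
package main

import (
	"fmt"
	"log"
	"net"

	"github.com/oschwald/maxminddb-golang"
)

func main() {
	db, err := maxminddb.Open("GeoLite2-Country.mmdb") // hypothetical path
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var rec struct {
		Country struct {
			ISOCode string `maxminddb:"iso_code"`
		} `maxminddb:"country"`
	}
	ip := net.ParseIP("203.0.113.7") // example address from TEST-NET-3
	if err := db.Lookup(ip, &rec); err != nil {
		log.Fatal(err)
	}
	fmt.Println("origin country:", rec.Country.ISOCode)
}
```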
We then pick one of them and get the content from them. That's how Bitswap works; every node does this as the first step when they request data, so every request for data goes through Bitswap. Now, we are not P1 anymore, we're P3: we're not requesting data, we're only listening to other people's data requests. We are basically eavesdropping, if you want, sniffing the network for science. You will notice that we don't reply with a HAVE message, because we don't have any content.
That's simple. We also get a bunch, a lot, of CIDs. Of course, we get many millions of unique CIDs every day, and we can analyze those. We can look at the codec that is listed in every CID as a proxy to estimate the usage of the network.
I will show results for that in a second. We can derive content popularity distributions, which we did, with interesting results (read our paper), and we can, for example, also download the content, sampling it, to estimate what is being requested, for example the MIME types.
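The codec sits right in the CID prefix, so extracting it is cheap. A minimal sketch with the go-cid and go-multicodec libraries; the example CID is the well-known Kubo readme directory, which is dag-pb.

```go
// A minimal sketch: decode a CID and read the multicodec of the content it
// addresses, the field used here as a proxy for network usage.
package main

import (
	"fmt"
	"log"

	"github.com/ipfs/go-cid"
	"github.com/multiformats/go-multicodec"
)

func main() {
	c, err := cid.Decode("QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG")
	if err != nil {
		log.Fatal(err)
	}
	codec := multicodec.Code(c.Prefix().Codec)
	fmt.Println("codec:", codec) // dag-pb, i.e. UnixFS-style data
}
```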
Looking at the request rates: we get some many thousand requests or so per second per monitor, which is quite nice, but it generates a lot of data which we have to store somewhere, which is not so nice. This is one of the reasons why we ultimately want to do all of this in real time.
Janus usually introduces me with "I want to be the Google of IPFS and store all the data", but I really don't. This is many, many terabytes of data that we're storing already with the traces, and it's too much. We don't want to store it. It's terrible.
We want a live view of the network, still having all of those metrics extractable from the traces, but in real time. For that we're running multiple monitors. It would be nice to have one unified request stream of the entire network, but, as I said, we have to deal with multiple monitors and concurrent requests arriving on those monitors. And, I put it here: oh God, the horrors. These are, maybe readable, traces from two monitors, and we have timestamps, origin peer, request type and CID.
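To make the merging problem concrete, here's an illustrative sketch with made-up types, not the project's actual pipeline: merge the traces of two monitors into one time-ordered stream, and flag entries for the same origin peer and CID that appear again within a short window, i.e. the same broadcast request seen by both monitors.

```go
// An illustrative sketch: merge two monitors' trace streams and flag
// near-duplicate entries.
package traces

import (
	"fmt"
	"sort"
	"time"
)

type Trace struct {
	TS     time.Time
	Origin string // origin peer ID
	Type   string // e.g. WANT_HAVE or WANT_BLOCK
	CID    string
}

func mergeAndFlag(a, b []Trace, window time.Duration) []Trace {
	merged := append(append([]Trace{}, a...), b...)
	sort.Slice(merged, func(i, j int) bool { return merged[i].TS.Before(merged[j].TS) })

	lastSeen := map[string]time.Time{} // key: origin|CID
	for _, t := range merged {
		key := t.Origin + "|" + t.CID
		if prev, ok := lastSeen[key]; ok && t.TS.Sub(prev) < window {
			fmt.Println("duplicate:", t.TS.Format(time.RFC3339Nano), t.Origin, t.Type, t.CID)
		}
		lastSeen[key] = t.TS
	}
	return merged
}
```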
All of these metrics, we're feeding them into Prometheus, and we experience a curse of, I would call it, cardinality, but it's usually referred to as dimensionality. Every counter that we feed into Prometheus, we have to annotate with a bunch of labels, and all of those have some cardinality: for example, which monitor received the message. A bunch of possibilities.
Are they duplicates? Are they matched between the monitors? Origin country: absolutely terrible, there are hundreds of countries. Origin group: is this a gateway, is it a DHT server, is it whatever? Entry types, multicodecs. So we have all of these labels on every time series, and this is a product, not a sum, which ends up with a giant cardinality in these time series.
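In Prometheus terms: each distinct combination of label values becomes its own time series, so the series count is the product of the per-label cardinalities. A minimal sketch with the official Prometheus Go client; the metric name, label names, and the numbers in the comment are made up for illustration.

```go
// A minimal sketch of the cardinality problem: one counter, seven labels,
// and the series count multiplies across all of them.
package main

import "github.com/prometheus/client_golang/prometheus"

var bitswapMessages = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "bitswap_messages_total",
		Help: "Bitswap messages received, by monitor, origin, and type.",
	},
	// Worst case: |monitors| x 2 x 2 x ~200 countries x |groups| x
	// |entry types| x |codecs| series from this one counter. With, say,
	// 5 x 2 x 2 x 200 x 4 x 3 x 50 that is already 2.4 million
	// potential time series.
	[]string{"monitor", "duplicate", "matched", "origin_country", "origin_group", "entry_type", "codec"},
)

func main() {
	prometheus.MustRegister(bitswapMessages)
	bitswapMessages.WithLabelValues(
		"monitor-1", "false", "true", "DE", "gateway", "want_have", "dag-pb",
	).Inc()
}
```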