From YouTube: Content Routing Open Performance Problems - @aschmahmann - Content Routing 1: Performance
Description
Content Routing Open Performance Problems - presented by @aschmahmann at IPFS þing 2022 - Content Routing 1: Performance - https://2022.ipfs-thing.io
A
All right, everybody. So hi, my name is Adin. I work within PL on IPFS stewardship. This talk is called open problems and alignment, but it's sort of misnamed, in the sense that I'm a person, I have an opinion; you are also people, you have other opinions. So this is not trying to get alignment as much as trying to raise points that we can then discuss and align on later, right? The talk does not give alignment; it gives discussion. It lets us get alignment.
A
So a couple points that I wanted to flag, some of which Juan has already talked about, which means I get to speed through them. There are different types of requirements in terms of what it is a client will need, right? Okay, if I need it to be under half a second or something, this is fine; everything we do already does this. Even a couple hundred milliseconds, we can do this. If I need like 20 milliseconds, I need to be closer to the end user, right? If we need this, which means, you know,
A
speed of light only does so much, so you've got to figure something out.
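A minimal sketch of the speed-of-light bound, assuming signals in optical fiber travel at roughly 200,000 km/s (about two thirds of the speed of light in vacuum) and that the entire latency budget is spent on propagation. The function name and the budgets are illustrative; real paths are slower because of routing detours and processing time.

```python
# Rough physical bound on how far away a server can be for a round-trip budget.
# Assumption: propagation speed in fiber is about 200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000


def max_distance_km(rtt_budget_ms: float) -> float:
    """Upper bound on one-way distance if the whole budget goes to propagation."""
    one_way_seconds = (rtt_budget_ms / 1000) / 2
    return FIBER_SPEED_KM_PER_S * one_way_seconds


for budget_ms in (500, 200, 20):
    print(f"{budget_ms} ms round trip: at most {max_distance_km(budget_ms):,.0f} km away")
```

Under these assumptions a 20 millisecond budget limits the server to a couple of thousand kilometers from the client, while half a second comfortably spans the planet.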
Two other problems that I wanted to point out a little bit are incentivization and spam. Incentivization means: who is storing and serving all of these records, and why? And spam is: how do I know which records I want? In particular, think of spam a little bit as the content routing problem, if I could extend on it.
A
Maybe a little bit is like: there is, on the other half of the content routing, a data transfer protocol request. And if I make the data transfer protocol request and I have a choice between five peers, and one of them is in Australia and the other is in Iceland, I'd better choose the one in Iceland, or I'm gonna be back in, you know, five-second land, right, waiting for the connection. Okay. So some basic napkin math, right, that we already went through: there's lots of petabytes of data.
A
You can divide them up. Even if you take the IPFS public DHT with 40,000 nodes and you divide it up, you're still getting that everybody has to store many tens of terabytes of data, right? It's too much for that kind of system, and it's going to cost a bunch of money to run the system. Maybe you distribute it over enough people and it turns out it's small enough, but there are actual costs associated with running a system.
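The napkin math can be sketched as below. Every input except the 40,000-node figure is an assumption chosen for illustration: the 10^15 CID count borrows the number used for "hard mode" later in the talk, and the record size and replication factor are guesses, not measurements.

```python
# Napkin math: bytes of provider records each DHT node would have to hold.
# All values are illustrative assumptions unless noted.
NUM_CIDS = 10**15       # assumed number of advertised CIDs
RECORD_BYTES = 100      # assumed size of one provider record
REPLICATION = 20        # assumed number of nodes storing each record
NUM_NODES = 40_000      # DHT size quoted in the talk


def per_node_bytes(num_cids: int, record_bytes: int, replication: int, num_nodes: int) -> int:
    """Total record bytes, multiplied by replication, spread evenly over nodes."""
    return num_cids * record_bytes * replication // num_nodes


terabytes = per_node_bytes(NUM_CIDS, RECORD_BYTES, REPLICATION, NUM_NODES) / 10**12
print(f"about {terabytes:.0f} TB of records per node")
```

With these inputs each node ends up holding tens of terabytes, which is the order of magnitude the talk describes; changing any assumption by 10x moves the result by the same factor.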
A
There's lots of ways to do this. You could do public goods: you have lots of people, and they'll each donate a little bit. Okay, well, now with lots of people in lots of different places, maybe I start running into that whole "they're all over the Internet" problem again. You could switch it around.
A
You pay a DNS provider to make all the magic happen, and there are other ways to try and incentivize participation, right? In a sense, blockchains are in this business: how do I incentivize the existence of a shared good that lets us get this thing to go? Again, if we look at the IPFS public DHT, the issues, in my opinion, around some of the DHT stuff are not so much on the retrieval latency side, because you can actually get it down.
A
If you ignore some of that, even if it was only a single hop, that peer might be in Australia. Then you say it's okay, because you have replication factors that move them into your country, because your replication factor is high enough. Your problem is these 40, 160 terabytes per server, right? And who's running these servers? And if all the data is coming from pinning services and Filecoin SPs and retrieval markets, why is my desktop storing their data for free for them? They have the resources, they have
A
If I give you bad results, I waste your time, maybe make it impossible to find the content, and I end up censoring it. If I don't give you good results, it could be the same deal, right? I've DoS'd you, or maybe I've just made it more expensive for you, because, you know, that Google home page, there's only one guy who has it, and it's gonna be 100 bucks for the home page.
A
You know, that kind of thing. And as a consumer of this routing system that's doing the lying, what are you going to do about it? So you can say, you know, I've got this guy, I'm not asking him for more routing requests, he seems very fishy. Or you could say,
A
Oh, you know what, fine, I'll ask him, but I will also ask someone nice and reputable, like Steve, and that will probably even out. But this requires more work on behalf of the client, which they may or may not be willing to do. That may be something that they just cannot afford to make happen: they don't run long-lived processes that can accumulate their own reputation, and they can't afford to make multiple queries, because making multiple queries is expensive for them.
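A minimal sketch of the client-side bookkeeping this implies, assuming a long-lived client that can remember whether each router's answers led to a working provider. The class `RouterReputation`, its methods, and the router names are hypothetical; nothing here is an existing IPFS or libp2p API.

```python
from collections import defaultdict


class RouterReputation:
    """Hypothetical per-client scoreboard for content routers."""

    def __init__(self) -> None:
        self.good = defaultdict(int)
        self.bad = defaultdict(int)

    def record(self, router: str, provider_worked: bool) -> None:
        """Remember whether a provider returned by this router actually served data."""
        if provider_worked:
            self.good[router] += 1
        else:
            self.bad[router] += 1

    def score(self, router: str) -> float:
        """Smoothed success rate; a router with no history scores 0.5."""
        good, bad = self.good[router], self.bad[router]
        return (good + 1) / (good + bad + 2)

    def pick(self, routers: list[str], fanout: int) -> list[str]:
        """Choose the best-scoring routers; a larger fanout means more queries."""
        return sorted(routers, key=self.score, reverse=True)[:fanout]


rep = RouterReputation()
for _ in range(3):
    rep.record("steve", True)
    rep.record("fishy", False)
print(rep.pick(["fishy", "steve", "newcomer"], fanout=2))
```

The sketch also shows the cost the talk points out: the scores only exist if the process lives long enough to collect them, and `fanout` greater than one multiplies the number of routing queries a client has to pay for.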
A
You know, and even when we talk about performance: whose performance, and for what, right? There may be different types of this. So there's the extreme example: there's one side of this, which is like the Twitter thing that Juan mentioned, which is, I need to get, like,
A
you know, a kilobyte of data really fast, and if you make me wait two seconds for a kilobyte of data, I'm gonna strangle you, all right? And there's the other, which is: I am downloading 100 gigabytes of data, and if you make me wait 30 seconds for a routing request, I almost certainly don't care, if you give me enough peers that it will actually download faster, because I just want to get the data, and time to first byte is not what drives me.
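The two extremes can be compared with a small calculation. This is a sketch under stated assumptions: throughput is assumed to scale linearly with the number of peers, and the 10 MB/s per-peer rate and the peer counts are made-up illustrative values.

```python
def total_seconds(size_bytes: float, routing_s: float, peers: int, per_peer_bytes_per_s: float) -> float:
    """Routing delay plus transfer time, assuming throughput scales linearly with peers."""
    return routing_s + size_bytes / (peers * per_peer_bytes_per_s)


KB, GB = 10**3, 10**9
PER_PEER = 10 * 10**6  # assumed 10 MB/s from each peer

# One kilobyte behind a 2-second routing lookup: routing is almost the whole wait.
small_slow_routing = total_seconds(1 * KB, routing_s=2.0, peers=1, per_peer_bytes_per_s=PER_PEER)

# 100 gigabytes: a fast lookup that finds one peer vs. a 30-second lookup that finds ten.
big_fast_routing = total_seconds(100 * GB, routing_s=0.5, peers=1, per_peer_bytes_per_s=PER_PEER)
big_slow_routing = total_seconds(100 * GB, routing_s=30.0, peers=10, per_peer_bytes_per_s=PER_PEER)

print(f"1 KB, slow routing:            {small_slow_routing:.4f} s")
print(f"100 GB, fast routing, 1 peer:  {big_fast_routing:.1f} s")
print(f"100 GB, slow routing, 10 peers: {big_slow_routing:.1f} s")
```

Under these assumptions the slower lookup wins for the large download because the extra peers save far more time than the lookup costs, while for the small fetch nearly all of the wait is routing.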
A
How much do I care about the puts, right? We talked a lot about the gets. When I do a put, how long does it need to take to show up? When I do an update, whether the update is "I no longer have this content, please stop bothering me" or "I've changed my network address, I now live over here", how long of a propagation is acceptable, right?
A
All right, so spam. When people think about academia and spam things, they tend to think of Sybil attacks and attacking the network and the shape of the network and that kind of thing. This is not that. This is simply the fact that we, in the business of provider records, have this problem, which is that our CIDs help us verify the content, and that is cool, and we have public keys that can help us
A
you know, talk to other nodes securely, and that is fine. But when I go talk to them, how do I know they have the stuff I want, or that they're the right person at all? They advertise an IP address to me, but it's an IPv4 address and it could be from anyone on the planet, because there's no certificate validating that it's theirs. And even if your routing system is totally honest and just trying to do its best here, how can it help you solve these problems?
A
You don't get access to this, even though that's been a thing you could do in the Bitswap protocol since day one, because it's associated with peer IDs. But it's true: I could advertise data and be like, it is only for people whose names start with A. And now what, right?
A
Your routing... like, I as a client would like to make sure I do not go and ask someone for that data if they're not going to give it to me, because now I'm just wasting resources I may not have. And then, of course, there's this amplification attack against other people on the network, where I can say: yep,
A
my name is Adin, I have this IP address. Also, it's the address of Gus's home server; also try there. Yeah, it's totally not gonna DoS him. It's gonna be fun.
A
Okay, so, hard mode, right? What is one of the hard modes of this problem? I need everything in under 10 milliseconds. The provider records have to have more information than we have, certainly more than the IPFS public DHT has, which is like: guy exists,
A
here is peer ID, see elsewhere for addresses. But significantly more information, such that I as a client can figure out how to prioritize them. All the bad ones have been removed, bad being bad for me, the client who has asked for them, or at least so that I can self-filter them. And all of the valid ones should be available, and valid here is not being specifically defined, right? As Juan mentioned earlier, you don't have to store all the records everywhere.
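A sketch of what a richer provider record and client-side self-filtering might look like. Every field of `ProviderRecord` is hypothetical and chosen only to illustrate the idea of filtering and prioritizing; it is not the record format of the IPFS public DHT or of any existing indexer.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderRecord:
    # Hypothetical fields, assumed for illustration only.
    peer_id: str
    addresses: list[str] = field(default_factory=list)
    protocols: set[str] = field(default_factory=set)
    est_rtt_ms: float = float("inf")   # latency hint relative to this client
    serves_everyone: bool = True       # False if the data is access-restricted


def usable(records: list[ProviderRecord], my_protocols: set[str]) -> list[ProviderRecord]:
    """Drop records that are bad for this client, then sort the rest nearest first."""
    keep = [
        r for r in records
        if r.addresses and r.serves_everyone and (r.protocols & my_protocols)
    ]
    return sorted(keep, key=lambda r: r.est_rtt_ms)


records = [
    ProviderRecord("far", ["/ip4/203.0.113.7/tcp/4001"], {"bitswap"}, est_rtt_ms=300),
    ProviderRecord("near", ["/ip4/198.51.100.2/tcp/4001"], {"bitswap"}, est_rtt_ms=15),
    ProviderRecord("private", ["/ip4/192.0.2.9/tcp/4001"], {"bitswap"}, 5, serves_everyone=False),
    ProviderRecord("no-address", [], {"bitswap"}, est_rtt_ms=1),
]
print([r.peer_id for r in usable(records, {"bitswap"})])
```

Whether this filtering happens in the routing system or in the client is the open design question raised above; the sketch only shows that the client needs the extra fields to do it at all.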
A
I want advertising to be free, I want discovery to be free, I want this all to just work, so people can just show up and use it and they don't have to sign up with anything or get any tokens. And also, somehow, this free should not turn into a Web 2 free, where we violate your privacy. That makes all of the talks tomorrow impossible, because your incentivization scheme has focused on, you know, spying on people's data, right?
A
So you don't get to cheat like that, because we're trying to avoid some of the Web 2 things. And also it needs to be not just 10 to the 15 CIDs, but also they need to be, you know, retrievable from billions of devices. This is the hard mode thing. I don't think we need to do all of the hard mode things, and a bunch of the talks that you have heard, and some I think that you will hear, talk about areas that we can improve
A
that are not specifically solving all of the hard problems here, or even more than a couple of them, but are also focused on making our implementations able to solve the easy problems that are not these, but enable parts of this scheme to work. So, to try and condense where we were at as we were flying through this: there's a lot of hard problems here. It will likely not be one size fits all.
A
It would be nice if I could figure out how to glue them together and leverage the fact that content addressing, unlike location addressing, means I have many options, right? The fact that we have strength in self-certifiable data means that our routing system can be worse than the location-addressing routing system, right? This is our superpower that we get to use. And so we need reasonable ways for developers to work with these things. And then the high-level questions:
A
How much is it going to cost to keep the system running? Who's gonna foot the bill? Why are they gonna foot the bill? Speed: what's the model for your particular chunk of this routing ecosystem that you are trying to meet, right? It's not going to be the same for everyone, the same way that not all data storage is equal in terms of the retrieval latencies on it, right? There are reasons why there is hot storage and cold storage. And enabling faster retrieval for the clients, right?
A
If what they do is give me a list of records, and the guy I need to get it from is in Australia, I've lost again, right? I, as the client, need to be able to figure out how to get the data quickly, because otherwise, when I try and add my numbers up so that getting actual bytes on my machine is under half a second or 300 milliseconds, I'm gonna be like: great, content routing, 20 milliseconds, done. Then you're gonna be like:
A
Oh no, my data transfer, it's taking me 400 milliseconds. Except it's not going to be the data transfer protocol's fault. It's not going to be the content routing system's fault. It's going to be this piece in the middle that you dropped, which is that the data transfer system needs to know who to go ask.
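Adding the numbers up might look like the sketch below. The 20 millisecond routing time and the 300 millisecond budget come from the talk; the provider round-trip times and the assumption of two round trips before the first bytes arrive (roughly a handshake plus a request) are illustrative guesses.

```python
def time_to_bytes_ms(routing_ms: float, provider_rtt_ms: float, round_trips: int = 2) -> float:
    """Routing lookup plus a few round trips to the chosen provider."""
    return routing_ms + round_trips * provider_rtt_ms


ROUTING_MS = 20.0   # content routing time quoted in the talk
BUDGET_MS = 300.0   # end-to-end target quoted in the talk
providers_rtt_ms = {"nearby": 15.0, "other side of the planet": 200.0}  # assumed RTTs

for name, rtt in providers_rtt_ms.items():
    total = time_to_bytes_ms(ROUTING_MS, rtt)
    verdict = "within budget" if total <= BUDGET_MS else "over budget"
    print(f"{name}: {total:.0f} ms ({verdict})")
```

Both cases use the same 20 millisecond routing answer; only the choice of provider decides whether the budget is met, which is the missing piece in the middle.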
And that's all. Thank you.