From YouTube: Lunch + Fireside Chat with David Mazières and Juan Benet - Juan Benet, David Mazières
Description
David Mazières is a computer scientist, best known as creator of the Stellar Consensus Protocol and coauthor of Kademlia, the peer-to-peer distributed hash table used in IPFS. He will be joined by Juan Benet, inventor of IPFS and Filecoin, in a dynamic and wide-ranging conversation about challenges and opportunities in designing large-scale distributed systems.
A
David and many others have gone on to build other distributed systems and file systems and operating systems, and now consensus protocols like Stellar and so on. What I wanted to talk about today is a set of ideas in peer-to-peer and distributed systems and secure systems that are really good and promising but haven't quite taken hold yet. There are all these really great ideas and research results that make a ton of sense but, for whatever reason, get stuck in the R&D pipeline and don't make it all the way into production use. Sometimes there are good reasons for that: actually the idea wasn't as good, or it's just not the right time, or there's no good implementation, or it's just difficult to shift the world. Content addressing is one great example of this. It made a lot of sense from the beginning, but shifting the entire structure of the internet from location addressing to content addressing is quite difficult, so you have to chip away at the problem for a while. So maybe let's just start with: how do you think about R&D? How do you approach research ideas and producing work, and then how do you want them to get fleshed out and productionized in the world?
B
Okay, well, there are several questions there. How do I approach R&D? Well, I have the luxury, being at Stanford, of being able to work with super smart people and do whatever I want. So the question is: what should we do? It's kind of a blank slate, and I always come down to the same three questions when we have an idea, or a student approaches me with an idea. One: why would anyone want this? There's stuff that you think is good, but then you dig into it and realize you wouldn't ever use it in practice. Two: is there any chance that we can actually pull this off? I could come up with some amazing-sounding thing, but if it's actually impossible, our chance of success is zero, and then it's not worth it. And the third question is: okay, let's do the really half-assed thing and try to solve it with today's existing ideas. Can we get 90% of the way there? Why isn't the half-assed solution good enough? Once you answer those three questions, then you can identify an idea that's maybe worth putting time into.
A
Once you identify those ideas and put some time into them and maybe get some results, how do you then go about getting that thing adopted? I'm sure you can think of so many ideas that are just stuck in paper form but haven't made it. How do you arrange the pipeline to get some of those ideas built out faster?
B
I mean, particularly as an OS researcher, backwards compatibility is kind of the number one impediment to getting stuff out there. You can build a brilliant new operating system kernel, but if it doesn't run a web browser, you're not going to have a lot of traction on, say, people's desktops. So the answer is: try to slice the problem up in such a way that you don't have to solve everything. One way to do that, taking the example of operating systems, is to look at new, emerging areas. Maybe it's too late for the desktop, but you don't need backwards compatibility for whatever is running the embedded processor on your fridge, so you can use the new area.
B
But maybe there are these other areas too, so the other thing you can do is try to make the problem easier. My then student Adam Belay, who's now a professor at MIT, had this project called Dune, and the idea of Dune was: let's use the virtualization hardware in x86 servers not to implement a virtual machine abstraction, but to implement a Linux process abstraction. So basically you run a Linux process, but it's running in the kernel mode of the guest mode of the CPU, and if you use a VMCALL instruction you can actually get a Linux system call. Because the problem with operating systems is that you have a brilliant idea for, say, how to redo the network stack, so you implement it, and then to make it useful you have to implement a file system, and you have to implement a windowing system, or whatever. Here the idea is that you can do just the part that's interesting, like just redo the networking stack, say, and if you need a file system you can just pass that through using VMCALLs. So you can literally call printf in the middle of an interrupt handler, and there is a standard output connected to the standard output of that Linux process. That has been really useful. There have been a bunch of research projects that managed to use Dune, and it made their lives a lot simpler. Sometimes problems just require a huge effort, but I think often you can figure it out. The nice thing about working at the level of software and operating systems is that often you can figure out how to do it with less effort.
A
Yep. And when you think about the range of work that you've been doing, from peer-to-peer file systems to operating systems to consensus and so on, there's probably a set of ideas there that you thought were really great but just haven't made it through. Can you maybe mention two or three of those, and we can dig into them?
B
What are great ideas that have not made it out there? I don't know. Well, there are principles and then there are ideas. There are technologies that I wish were out there, and it's just unfortunate that they aren't. Just as an example, not that I've contributed in this area: password-authenticated key exchange. To me it's grotesque that we're typing passwords and sending them in clear text to servers, when we literally have better solutions than this, even with passwords. Of course a lot of people now are saying, oh, passwords don't work, get rid of them, but actually we could do passwords better. As for what's standing in the way: I had a student, Quinn Slack, who did a really pretty tasteful implementation of PAKE, password-authenticated key exchange, in the browser, and we went and talked to the people at Mozilla, and they said, well, the problem is that the people who are building websites want to have control over the failure path.
B
You know, the small advantage that they currently have, in exchange for a much, much bigger advantage. In terms of principles, though, I would say there are principles that guide my work that cut across a lot of areas. One is that I really deeply believe in egalitarian interfaces and APIs. One really unfortunate thing that you can do in a system design is have privileged and unprivileged programmers, users, or software with access to qualitatively different APIs, because then you get this thing like on Unix and Linux where, essentially, root is the garbage can for all that stuff. For example, there's protection built into Linux, called user IDs, and yet applications for the most part can't use user IDs to sandbox themselves, because it would be more of a pain to install them and it wouldn't just work: you'd have to create a new user ID and so on. So we do other things to sandbox software, and we're giving up the ability to use the hardware to do that. The principle would be: instead, try to make sure that privilege is a quantitative difference, not a qualitative difference. That's certainly something we've taken to heart in the work at Stellar.
B
What we're trying to do there is make sure that any random person who wants to innovate and, say, a big bank have access to the same APIs, even if they have different assets, so there's just more ability to innovate. And then I guess the second principle would be, I don't know, I guess you could call it the weight-and-balance principle.
B
If you're designing an airplane, you can't lard it up with too many features, because if it's too heavy it's not going to fly. Unfortunately, we've kind of lost that with software: you just pile in the libraries and whatever. And the other thing people do is just throw more engineering effort into something, and it's easier to write code than to delete code. You see it at, say, Stanford, where we're doing research in small teams: you have to make these design decisions.
B
You can't do everything all at once. I guess if I could add a third one: if you're designing systems, think about the graph of the number of hours of experience people have with your system, or as developers using your platform, versus the sort of power that they have using it. What you want is a high y-intercept but also a kind of constant slope. Some systems, like, say, C++ as a programming language: to really do stuff you need something like two years of experience. There's something wrong with your design if that's the shape of that graph.
A
Yeah. How does that apply to, say, software libraries? C++ makes sense: you have to learn an enormous amount. Python: you can get started really quickly and already leverage a lot of power. How do you see that applying in the design of software? Does that go into API design, or into feature-set design?
B
It goes into API design and it also goes into tooling. One of the things I spent a fair amount of time on at Stellar, which is the blockchain I'm involved with, is stuff that's not super research-worthy, like implementing a kind of assembly language for Stellar transactions. Because it looked like either you're really advanced and you're basically implementing to our Go APIs, or you're basically using a wallet, and there wasn't a gradual path in between. If you wanted to just write a script to generate a bunch of transactions, it was a pain to do that; you'd actually have to write a Go program. With my silly tool called the Stellar transaction compiler, you can suddenly write shell scripts, or just simple command lines, to generate transactions, to try to fill in that gap. So I guess: look at the gaps. People who have, say, 9 to 15 months of experience with your system: what can they do if they aren't experts? If there's a trough there where there's not a payoff, find ways to fill in that area, and you create a smoother on-ramp for people to use your system and become experts in it. You've got to get those endorphin rushes, right? "Oh, I did it, it works!" That's what motivates a lot of us developers.
A
Right. Let's dive into content addressing. There's actually a question that I've never quite been able to figure out: where did the idea of hash linking in file systems start? I've traced it back to SFSRO, and then probably before that, Fossil and Venti in Plan 9 at Bell Labs already had the beginnings of it. I think you interned there for a while, so I was wondering if you brought that idea there, or learned it there.
B
I think SFSRO might actually have predated Venti. But yeah, I remember it specifically. For my dissertation I'd done this file system called SFS, the self-certifying file system. The key idea, again following the principle of filling in the gaps and making the system more accessible, was to embed the public keys of servers in path names. Then certificate management would just be creating a bunch of symbolic links, because a certificate is really just a way to bind a human-readable name to a public key. Rather than learn how to fill out an X.509 certificate request or whatever, everybody who can do a little bit of shell programming understands what a symbolic link is.
B
So it's much easier to do that, and if you did it that way, a certificate authority just became a directory full of symbolic links assigning human-readable names to these other path names that contain public keys. And so we built that. Of course, for a certificate authority to be secure, you need to have the signing key be offline, so we had two dialects of the protocol.
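[Editor's note: the symlinks-as-certificates idea can be sketched in a few lines. This is an illustration, not SFS code; the `/sfs/host:fingerprint` layout and SHA-256 fingerprint below are stand-ins chosen for the example.]

```python
import hashlib
import os
import tempfile

# A self-certifying path embeds the server's public-key fingerprint in the
# name itself, so resolving the path also tells you which key to expect.
def self_certifying_path(hostname: str, pubkey: bytes) -> str:
    fingerprint = hashlib.sha256(pubkey).hexdigest()[:32]
    return f"/sfs/{hostname}:{fingerprint}"

# A "certificate authority" is then just a directory of symbolic links,
# each binding a human-readable name to a key-bearing path.
ca_dir = tempfile.mkdtemp(prefix="ca-")
target = self_certifying_path("files.example.org", b"demo-public-key-bytes")
os.symlink(target, os.path.join(ca_dir, "example-files"))

# Resolving the friendly name recovers the self-certifying path (and key).
assert os.readlink(os.path.join(ca_dir, "example-files")) == target
```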
B
A read-write dialect where the server is online, and then a read-only one where the stuff was signed. In the first version of the read-only protocol, we signed all the different pieces of the file system individually. Then I remember talking to my advisor, Frans, at the time, and he was talking about how we were going to redo this, and I said: no, no, no, I've got the way we're going to redo this, much simpler. There are only going to be two RPC calls. One: get me a digitally signed message that contains the hash of the root directory. Two: here's a hash value, get me the preimage. That's all you need server-side, so we could make the server side super simple. That's what led to SFSRO, which I think was in OSDI 2000, so I think it was a little bit before Venti.
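[Editor's note: the two-RPC read-only design can be sketched as a toy in-memory server. The keyed hash standing in for the signature, and the JSON root directory, are simplifications for illustration; the real system used public-key signatures and its own on-disk format.]

```python
import hashlib
import json

SECRET = b"demo-signing-key"  # stand-in for a real private signing key

def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class ReadOnlyServer:
    """The entire server side is two calls: signed root, and hash -> preimage."""
    def __init__(self, files: dict):
        self.store = {}                        # hash -> preimage
        root = {}
        for name, content in files.items():
            self.store[h(content)] = content   # leaf blocks
            root[name] = h(content)
        root_bytes = json.dumps(root, sort_keys=True).encode()
        self.store[h(root_bytes)] = root_bytes
        self.root_hash = h(root_bytes)

    def get_signed_root(self):                 # RPC 1
        return self.root_hash, h(SECRET + self.root_hash.encode())

    def get_preimage(self, digest: str):       # RPC 2
        return self.store[digest]

# Client: verify the root "signature", then walk hashes; every fetched
# block is self-verifying, so the server need not be trusted for integrity.
srv = ReadOnlyServer({"README": b"hello, world"})
root_hash, sig = srv.get_signed_root()
assert sig == h(SECRET + root_hash.encode())
root = json.loads(srv.get_preimage(root_hash))
data = srv.get_preimage(root["README"])
assert h(data) == root["README"] and data == b"hello, world"
```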
A
Yeah, and that's the core hash linking for data that IPFS and git and so many other things are based on. I think we're still on the quest of adding content addressing to the network. We're now at, what, tens to hundreds of millions of people benefiting from this, but not yet billions; we'll get there. But as you think about that set of ideas: hash linking is this extremely powerful primitive, and yet it's 2022 and so many systems out there are still linking to all these mushy, dynamic, not certified, not authenticated data structures. What has gone wrong here? Is it too hard? Is the idea too complex?
A
We sometimes talk about it as people having to go through their Merkle journey of understanding that when you hash-link everything, everything gets better: you can move everything into different spots, and you get all the security properties and distribution properties. And that idea space is, for whatever reason, not as prevalent as something equally complex, like encapsulation in protocols and the protocol stack in networks, or reliability in reliable transport. All these ideas seem equally complex, but for whatever reason hash linking isn't quite there.
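[Editor's note: for readers partway through that Merkle journey, the core of hash linking fits in a few lines. A generic sketch, not IPFS or git internals.]

```python
import hashlib
import json

store = {}  # content-addressed block store: hash -> bytes

def put(data: bytes) -> str:
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest

def put_node(links: dict) -> str:
    # A directory-like node whose children are referenced by hash.
    return put(json.dumps(links, sort_keys=True).encode())

def get(digest: str) -> bytes:
    data = store[digest]
    assert hashlib.sha256(data).hexdigest() == digest  # self-verifying fetch
    return data

leaf = put(b"some file contents")
root = put_node({"file.txt": leaf})
# Anyone holding only `root` can fetch blocks from any untrusted replica
# and still verify every byte; changing the leaf changes every hash above.
assert json.loads(get(root))["file.txt"] == leaf
```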
B
One is that it's actually providing security at the transport layer by encrypting your communications, and the second is that most of the implementations build in this X.509 certificate stuff. What you really would like to do is disaggregate these things and be able to authenticate both contents and endpoints in different ways. So a problem with the web is that you can't name a public key. You can do this key management in X.509 and get your HTTPS URL, and then you can get a web page, which is a bunch of HTML, but then you can't use what you get back to actually name another public key. So the key management can't be part of the system itself; it doesn't have this self-referential property.
B
But I'm saying: the systems a lot of people use might change, but way more people use the web, and the thing that is probably many people's first experience of the internet now, using a web browser, just doesn't have the flexibility to implement your own key management or content authentication.
A
As you think about blockchains and where they might go, not just the public large-scale cryptocurrency blockchains, but in general this idea of using consensus with hash linking and maintaining these logs: you could redo the CA system this way, you could redo DNS this way, you could have secure time this way. How do you see that sinking into the network stack?
B
In the end nodes. And, throwing in a fourth principle here, I view a lot of systems design as keeping in mind a more generalized version of that: you don't just have end nodes and the core of the network, you have routers that are closer to the user and that the user has more control over, and within an end node you have stuff that you could do in the kernel versus stuff you could do in a library versus stuff you can do in the application. The more you can move that functionality out, say towards being in a library, the better, I think. So the question is: should content addressing replace IP? I think probably not.
A
But HTTP, for example. Today you use HTTP to pull content: you ask for a file path and you get back some bytes. Maybe you should ask for a path, get back a hash, take the...
B
Hash,
and
so
your
question
is
how
should
that
sink
into
the
network
core
and
so
I
think
maybe
like
an
interesting
open
resource,
question
and
I
do
have
a
couple
of
students
at
Stanford
were
thinking
about
sort
of
thinking
about
this?
B
Is
that
a
lot
of
these
dhts
that
you
can
use
are
not
Byzantine
fault,
tolerant
and
building
like
a
fully
decentralized
Byzantine
fault,
tolerant,
distributed
hash
table
seems
like
a
very
hard
problem,
and
so
the
question
is:
if
we
could
make,
you
know,
ask
a
little
bit
more
of
the
core
Network.
A
So,
as
you
think
about
dhcs,
one
of
the
things
that
we've
run
into
is
just
the
speed
like
when
you
want
to
use.
So
in
epiphast
we
use
a
DHE
when
we
use
gadamia
to
do
the
discovery,
so
the
the
content,
routing
part
of
the
problem.
So
if
you
have
a
hash,
how
do
you
identify
what
nodes
in
the
network
have
the
content?
So
you
can
go
get
it
right.
We
call
this
providing
so
content.
A
Providing
we
create
these
provider
records,
we
put
them
in
the
DHC,
a
user
trying
to
look
it
up,
looks
at
the
DHT
gets
a
provided
record
can
go
and
find
them,
but
the
HDs
have
a
little
space
per
node
and
then
they
have
really
large.
You
know
round
trip
time
distance
to
all
of
the
nodes,
so
traversing
these
dhds
gets
really
expensive
when
you
want
to
do
something
like
a
page
load
right.
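[Editor's note: the providing flow described here can be sketched with the DHT abstracted to a dictionary. In a real Kademlia DHT, each put/get below is an iterative lookup costing multiple network round trips, which is exactly the expense being discussed.]

```python
import hashlib

dht = {}    # content-hash -> set of provider ids (the DHT's job)
peers = {}  # provider id -> {content-hash: bytes}  (the peers' job)

def cid(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def provide(provider: str, data: bytes) -> str:
    digest = cid(data)
    peers.setdefault(provider, {})[digest] = data
    dht.setdefault(digest, set()).add(provider)  # publish a provider record
    return digest

def fetch(digest: str) -> bytes:
    for provider in dht.get(digest, ()):         # content-routing step
        data = peers[provider].get(digest)
        if data is not None and cid(data) == digest:  # self-verifying
            return data
    raise KeyError(digest)

digest = provide("peer-1", b"a block of content")
assert fetch(digest) == b"a block of content"
```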
A
You want it to be incredibly fast; can you get close to O(1), or do you end up with DHTs that are a lot smaller? One route is this content-indexing pathway we've gone down, which is: instead of having, say, hundreds of thousands or millions of DHT nodes, have tens to hundreds, but make those nodes extremely large. You lose some of the security of the decentralization of those nodes, but you now have tons of records in a smaller set of machines. Which of these pathways do you think we're going to take?
A
Can we hack the massive-scale DHT systems with millions of nodes and somehow solve the speed-of-light problem, or do we really want to go for tens to hundreds to maybe thousands of nodes on interconnects in data centers, and make these massive routers there?
A
So it's speed of light when you have millions of nodes and you don't yet have a secure communication setup, a secure channel. You end up in this problem where, if you want to get the information from the next party, you have to establish a new secure connection to them, and that ends up costing a bunch of round-trip times.

B
Well, why does it have to be secure?
A
Because,
because
of
privacy
reasons,
you
don't
want
to
necessarily
just
tell
everybody
so
two
two
parts:
one
is
you
don't
want
to
flood
the
network,
so
you
want
to
do
iterative
discovery
which
is
slower.
If
you
do
recursive,
you
then
need
some
other
incentive
structure
in
the
DHT
to
make
sure
that
you
know
you
don't
flood
it
and
then
second,
you
want
privacy
in
the
queries,
and
you
want
to
be
careful
about
who
you
necessarily
convey.
What
information
that
you're
looking
for.
B
I,
don't
know
I
guess
I
would
I
would
push
back
on
that
assumption
a
little
bit
so,
first
of
all
like
it
depends
what
you're
using
right
you
could.
You
could
certainly
Implement
a
custom
protocol
that
would
that
would
require
fewer
round
trips.
B
Yeah, and the other thing is, suppose at the time you learn a node's ID you also learn its public key. You could potentially do a non-interactive Diffie-Hellman key exchange, so you can basically send the thing encrypted. You lack forward secrecy there, but given the level of casual threat we're talking about, this is only going to work against a sort of casual attacker who's eavesdropping on the Wi-Fi anyway, so you have to find the right balance. Anyway, the point being: already we're talking about stuff other than the speed of light.
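[Editor's note: the non-interactive idea can be sketched as follows. If a node's routing-table entry already carries its long-term public key, a querier can derive a shared key and encrypt its request in the very first packet, with zero extra round trips, at the cost of forward secrecy. The small prime group below is a toy for illustration only, not a secure choice; real systems would use something like X25519.]

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime; toy group, NOT for real use
G = 3

def keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

# The DHT node publishes (node_id, node_pub) once, e.g. in routing tables.
node_priv, node_pub = keypair()

# The querier derives a shared secret from the published key alone...
q_priv, q_pub = keypair()
shared_querier = pow(node_pub, q_priv, P)

# ...and the node derives the same secret from q_pub in the first packet,
# with no handshake round trips.
shared_node = pow(q_pub, node_priv, P)
assert shared_querier == shared_node

# Hash the shared secret down to a symmetric key for encrypting the query.
sym_key = hashlib.sha256(str(shared_querier).encode()).digest()
```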
A
But
but
the
way
this
connects
is
that
you
have
these
problems
that
cause
that
add
another.
You
know
round
trip
and
when
you
have
other
round
trips
and
you're,
trying
to
Traverse
a
very
large,
you
end
up
end
up
in
the
sequential
problem
where
you
have
to
like
find
out
a
bunch
of
information
and
in
order
to
find
out
the.
B
Information
I
understand,
but
but
what
I'm
saying
is
that
we're
it's
important
to
say
that
these
round
trips
aren't
like
this
experience?
The
the
performance
bottleneck
isn't
the
speed
of
light.
It's
other
things.
It's
like
node
failure
right.
Why
is
node
failure?
What's
you
know?
What's
probably
like
the
number
one
concept
of
node
failure?
Well,
like
would
a
nightmare
not
traversal
is
to
implement
right.
B
So
if
you
were
to
sort
of
in
terms
of
what
could
you
ask
in
the
network,
it's
like
well,
you
know,
start
rolling
out
IPv6
and
you
know
have
some
range
of
ports
that,
like
applications,
can
use
that
are
basically
not
not
firewalled
so
like
these
are
things
that,
like
you,
could
ask
from
from
the
network
and
and
I
think
that
that
you
could
do
okay,
I
mean
I,
I,
I
I.
A
You could look at the grapevine of the internet and put these wherever there's high connectivity in the pathways. If you look at the tree structure and you put content routers in the vertices that have lots of downstream parties, you can then increase the amount of content there, because storage is getting super cheap.
A
We
these
things
are
going
to
carry
terabytes
now
and
then
it'll
be
tens
of
terabytes
within
not
too
long
a
server
rack
is
now
a
petabyte.
So,
like
you
can
put
a
petabyte
of
Records
into
you,
know
a
your
basement
or
a
pair
device
records
into
this
building
and
then
suddenly
have
this
massive
content.
Router.
B
Then it's going to be a big deal. So there's an argument to be made that, okay, maybe if Keccak, you know, SHA-3, is the hash to end all hashes, then we're happy to just commit to that and do this, and maybe the interface is simple enough that we feel we have the right thing, and we're ready to bake it in stone.
B
Baking in unupgradeable protocols, though, I'm a little wary of that, so I'd want to make sure that we could have competing versions of the protocol at end nodes if necessary.
A
No, I mean, instead of committing to Keccak, you just add a byte ahead of time telling you which hash function you're going to use.
A
We already have that in IPFS: we use SHA-256, we use BLAKE2, and we use a bunch of other hash functions, and it's already all integrated, so you can navigate through the pathways, and then you just disable hashes as they break or get weaker.
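[Editor's note: the byte-prefix idea, multihash in IPFS terms, can be sketched like this. The codes below are illustrative placeholders, not the actual multihash table.]

```python
import hashlib

# Self-describing digests: a prefix byte names the hash function, so the
# format can outlive any single hash algorithm.
ALGORITHMS = {0x12: hashlib.sha256, 0x13: hashlib.sha512, 0xB2: hashlib.blake2b}
DISABLED = set()  # algorithms get switched off here as they weaken

def digest(data: bytes, code: int = 0x12) -> bytes:
    d = ALGORITHMS[code](data).digest()
    return bytes([code, len(d)]) + d          # <fn-code><length><digest>

def verify(data: bytes, mh: bytes) -> bool:
    code, length, d = mh[0], mh[1], mh[2:]
    if code in DISABLED or len(d) != length:
        return False
    return ALGORITHMS[code](data).digest() == d

mh = digest(b"hello", code=0xB2)              # pick BLAKE2b for this object
assert verify(b"hello", mh)
DISABLED.add(0xB2)                            # later: the algorithm "breaks"
assert not verify(b"hello", mh)               # old digests stop validating
```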
B
But
so
maybe
we're
at
a
point
where,
where,
if
you
could
like
basically
commit
to
your
software
and
say
like
I'm,
comfortable,
never
upgrading
this
again,
modular
minor
bug
fixes,
like
you'd,
be
happy
doing
that.
B
And
so
then
you
just
have
to
make
the
argument
that
show
that
it's,
like
the
the
benefit
you
get
from
pushing
into
the
network
is
worth
the
lack
of
upgradability
and
the
lack
of
competition
from
other.
Potentially,
let's.
A
Let's talk about privacy for a moment. As you think about preserving privacy on reader and writer paths: writer privacy is hard, and reader privacy is way harder. When you think about how to make content publishing and content viewing properly private, of all the approaches that you've seen tried, which do you think is the most promising?
B
I'm not sure, because there are a bunch of different approaches, and none of them is ideal, and they don't really compose. Private information retrieval, for example, is really cool, but it's really expensive on the server side. Another thing you could do is split trust and say, well, these two organizations would need to collude to figure out what I'm looking up; but that's not a great thing either. And then you can do sort of half-assed things where you have at least plausible deniability because you're fetching something random, but statistically people are still going to figure out what you're doing. So I don't know, because none of these is really...
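[Editor's note: the split-trust variant mentioned here is classically realized as two-server PIR. A minimal XOR-based sketch, with a made-up database of equal-length records: each server alone sees a uniformly random subset of indices, and only by colluding can the two learn which record was wanted. Note that each server still touches a large fraction of the database, which is the server-side cost discussed above.]

```python
import secrets

DB = [b"rec-0...", b"rec-1...", b"rec-2...", b"rec-3..."]  # equal-length records

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def server_answer(indices: set) -> bytes:
    # Each server XORs together the records at the requested indices.
    out = bytes(len(DB[0]))
    for i in indices:
        out = xor_bytes(out, DB[i])
    return out

def query(wanted: int) -> bytes:
    mask = {i for i in range(len(DB)) if secrets.randbits(1)}  # random subset
    q1 = mask                 # sent to server 1
    q2 = mask ^ {wanted}      # sent to server 2: same subset, wanted flipped
    # XOR of the two answers cancels everything except DB[wanted].
    return xor_bytes(server_answer(q1), server_answer(q2))

assert query(2) == DB[2]
```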
A
I mean, the servers are pretty fast now; we're talking about using zero knowledge, and in five or ten years we'll be using fully homomorphic encryption, so it's...
B
Super private, but for truly private information retrieval, it basically only works if the server touches every single piece of data, because if it doesn't touch every piece of data, then it knows that you didn't fetch a particular record. So if you want perfect privacy, there's no way to do it sublinearly.
A
No, no, I mean the lookups that you might be doing. A human does a certain amount of lookups, and sure, that might increase somewhat...
A
You can probably add some kind of structure to this to break down the problem, right? You could do some bucketing first, and then private information retrieval within buckets.
B
So
maybe,
and
then
you
have
to
look,
but
so
that's
my
point
right
that
there's
all
these
trade-offs
and
you
know
another
thing
you
could
do
like
first,
you
also
have
to
think
about
censorship
resistance.
So
one
of
the
things
that
we
did
I
had
a
student
at
NYU,
Mark
Waldman,
who
did
he
built
a
system
called
tangler,
and
the
idea
was
that
all
of
the
data
that
was
stored
was
kind
of
information
theoretically
unrelated
to
the
content.
A
A 3x overhead is not terrible; that's pretty good. So, one thing: we're at lunch now, and David has kindly agreed to do office hours with folks here who are doing distributed-systems work, working on content routing or DHTs or CDNs or hash linking and so on.
A
So
we'll
right
after
lunch,
we'll
go
to
the
terrorists
or
some
somewhere
up
there
and
if
you're
interested
in
like
talking
about
some
of
the
problems
that
you're
working
on
some
of
the
tech
that
you're
building
and
get
feedback
from
David
we'll
we'll
do
that
we'll
kind
of
just
to
keep
it
to
like
five
five
to
ten
minutes
and
then
so
that
other
people
get
get
that
and
yeah
after
lunch.
I'll
take
maybe
two
questions
from
the
audience
and
then
we'll
we'll
stop
for
lunch.
C
Hey, good afternoon. I landed in the middle of the talk about private data retrieval, which I'm very interested in. I imagine you could mix data, right? You could mix data from various sources, so that one file has information from five different sources mixed together, and then you can request a bit from one server, from another server, from another server, and the specific combination of data that you retrieve from the various servers could be reintegrated into the exact data you need, and that way none of the servers...
B
That was the Tangler idea, and there are kind of two things there. One is the fact that all the data is tangled up, so when you request blocks, those blocks could be used to reconstruct multiple documents: it doesn't say what you're doing, and you couldn't censor a document without causing collateral damage to other documents. The second part of it is that you need a way of shuffling the data around such that if a server cheats on even one data block, it can get kicked out of the system, and such that nobody is permanently responsible for any block, so that even if someone is being bad, the data will eventually move to other places.
A
Yeah, that depends on the system design and what guarantees you want to preserve, and sure, you're going to end up with many, many nodes involved in the protocol.
B
Can
check
out
our
tangler
paper
and
it's
from
like
1997
or
something.
But
you
know
from
by.
A
The
way
just
tons
of
amazingly
good
ideas
are
in
the
literature
like
just
have
not
been
implemented
for
some
reason
or
have
been,
but
like
got
stuck
somewhere
in
the
r
d
Pipeline
and
you
know
waiting
to
be
to
be
distributed.
All
right
final
question
just
raise
your
hands.
D
You
talked
earlier
a
bit
about
the
like
sort
of
combining
Byzantine
fault
tolerance
with
distributed
hash
table.
Could
you
talk
about
like
I
guess
the
security
model
that
like
or
distributed
hash
tables,
have
right
now
and
how
Byzantine
fault
tolerance
like
comes
into
play.
B
Yeah,
it
was
basically
not
good,
so
there's
we
don't
really
have
a
good
solution
to
this
problem
right.
So
what
you
know,
people
can
do
things
with
sort
of
admission
control
to
try
to
prevent
civil
attacks,
but
it's
fairly
unsatisfying-
and
you
know
my
then
student
Michael
Friedman
who's
a
professor
Princeton.
Now
he
built
this
thing
called
the
choral
content,
distribution
Network
and
the
goal
is
always
to
like
open
this
up
to
a
bunch
of
you
know
and
let
anyone
participate
we
never
ended
up.
B
Doing
that
and,
and
in
part,
is
because
we
couldn't
trust
the
the
nodes
right.
Someone
could
mess
it
up
and
well,
it
was
it
was
that
and
the
fact
that
you
couldn't
authenticate
content
in
a
in
HTTP
and
HTML
because,
like
what
you
really
want
to
do,
is
you
know
fetch
content
from
this
caching
layer
and
not
have
to
trust
the
caching
layer
for
Content
integrity
and
just
because
again,
the
browsers
aren't
designed
in
the
way
that
I
wish
they'd
been
designed.
You
can't
do
that
very
easily.
Yep.
A
Well,
thank
you
very
much
David,
thank
you
for
being
here
with
us
and
sharing
the
knowledge
so.
B
What
1pm
Terrace.
B
The
Terrace,
okay,
yeah,
maybe
see
some
of
you,
then
thanks.
So
much
thanks.