From YouTube: HAMTs & Other IPLD Data Structures - Rod Vagg
Description
How does IPLD represent complex data? How does this affect data transfer and what kinds of data you can request?
Distributed data structures: we call these persistent and immutable data structures, and they're not new. The term has been around for a long time, and you'll find these data structures in the functional programming world, in particular in the standard libraries of Clojure, Haskell and Scala. They all contain persistent and immutable data structures, and they map really well to the content-addressing world, because we do immutability and we like persistence.

So a lot of these data structures are old and we've sort of pulled them into the new world of content addressing. The problem is that the algorithms involve a lot of trade-offs, and when you come into the content-addressing world you've got a whole new suite of things to think about, not just what happens on the local computer.

So let's see what we're talking about with persistent and immutable data structures by building a really simple one that I'm calling the super large array. This would work in practice; I don't know if you would use it, but you could build it. We're building an array, something where we can store a potentially infinite number of entries, distributed on IPFS.

We're going to start off by saying we can store five things in a block on IPFS, so we're setting a limit, and we're going to fill it up with five elements. Now what happens when we want to add more things to our array, because we're going to be able to expand this thing, right? Well, we stick more things into another block and then we create a structure to reference both of these blocks. So we've created a DAG.

We have a root that includes a link to the first block and the second block, and then we have our array elements in order. So we end up with three blocks, and so three CIDs. The two leaf nodes contain the leaf data, whatever it is we're storing; perhaps they're just numbers, perhaps they're CIDs, but distinctly the leaf nodes are where our data is, and then we've got the intermediate node, the root, to contain them both.

Now we can fill this up and keep on adding blocks. We've got 25 elements in our array, but what happens at this point? Our root block now contains five entries and we're at our limit for block length, for block size.

So what do we do if we want to put another element into this array? Well, we add another level to this structure. We're going to give each of these levels a height: height 1 is where our values are, height 2 is the one just above that that holds them together, and as we start to need to collect more than one of these height-2 nodes we need a height 3, where we start to create more nodes as well.

So potentially this thing could keep on going up and up and up, and in that way we form a DAG that can contain an arbitrary number of elements. But you can see some of the problems we get into when we're getting large DAGs: we have these very tall and very wide structures, and we get into all the sorts of data transfer problems that we've been talking about here as well.
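
As a rough illustration of how a fixed-width tree like this addresses its elements, here is a minimal sketch in Python, not the actual IPLD codec: it maps an element index to the child offsets you would follow at each level. The width of five and the helper names are just taken from, or invented for, the example above.

```python
# Minimal sketch (not the real IPLD codec): with a fixed width of five, an
# element's index determines the child offset to follow at every level.
WIDTH = 5

def height_for_length(length: int) -> int:
    """Smallest number of levels whose capacity (WIDTH ** height) covers `length`."""
    height = 1
    while WIDTH ** height < length:
        height += 1
    return height

def path_for_index(index: int, height: int) -> list[int]:
    """Child offsets to follow from the root down to the leaf slot."""
    return [(index // WIDTH ** level) % WIDTH for level in range(height - 1, -1, -1)]

# 25 elements fit in two levels; the 26th element (index 25) forces a third level.
print(height_for_length(26))   # 3
print(path_for_index(25, 3))   # [1, 0, 0]
```

Here `height` counts levels, which lines up with the talk's numbering where the leaves sit at height 1 and the root of a three-level tree sits at height 3.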
What should the width of these things be? I set five just because it fits on the screen really nicely, but how do we come up with a good value for width? We could approximate the byte size that we're aiming for, and we've got various limitations through our systems; Bitswap, for example, is not so good with blocks in the one-to-two-megabyte range. So we could aim for, say, one-megabyte blocks, but that depends on what we're storing in these things.

Maybe we're just storing CIDs, and if we are, then we could take the approximate size of a CID and use that to figure out what the width of these blocks should be. It's a bit of a challenge, because even CIDs are flexible in their size: you can get CIDs that are 64 or 128 bytes long, even identity CIDs, which just break everything anyway.
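
To make the width question concrete, here is a back-of-the-envelope sketch of the kind of calculation you might do; the target block size and per-link byte count are illustrative assumptions, not measured figures.

```python
# Back-of-the-envelope width calculation; both constants are illustrative
# assumptions, not measured values.
TARGET_BLOCK_BYTES = 1_000_000   # aiming for roughly one-megabyte blocks
APPROX_LINK_BYTES = 40           # rough size of a CIDv1 (SHA-256 multihash) plus encoding overhead

approx_width = TARGET_BLOCK_BYTES // APPROX_LINK_BYTES
print(approx_width)              # 25000 links per node, ignoring per-entry overhead
```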
The most interesting part of this is the cost of mutation. The bigger we make these things, the more complicated mutation becomes, because we don't just want a static array. Often we'll want an array of data that we mutate over time: we want to add things, we want to change elements. This is the nature of real-world data. We love immutable data, but we like to be able to mutate these things too, and create new copies of them.

So let's look at the implications of mutation. In this picture I've made up a CID for each of these blocks, so each block has a unique CID, and we're going to do a couple of mutation operations. Firstly, we're going to append another element to the end. What happens when we mutate by adding an element on the right-hand side here? We get a new CID for that block, because it's changed.

But then, thanks to Merkle DAGs, the linking block has to get a new CID, because the new CID of that child now has to be stored in the intermediate node, and then the root gets a new CID. So every mutation does this bubbling up to the root, and we've created essentially three new blocks in the process of just adding one element to the end. So if you want to keep all the historical data, you've got all those original blocks, plus three more blocks now to contain this.

Now, what happens if you just want to edit one of the elements? Let's say you just want to change one of the middle ones and give it a new value. You get the same thing happening: that creates a new CID for that block, then you need to include that in the linking intermediate node, and then you need a new root. So each time, you get a new root.
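
A minimal sketch of that path-copy effect: editing one leaf forces every block on the path from the leaf up to the root to be re-encoded, so one edit creates roughly as many new blocks as the tree is tall. The helper names and the truncated hash standing in for a CID are illustrative assumptions.

```python
# Sketch of the path-copy cost: one leaf edit re-encodes every block from the
# leaf up to the root. `fake_cid` is a stand-in for real CID creation.
import hashlib

def fake_cid(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()[:8]

def edit_leaf(path_blocks: list[bytes], new_leaf: bytes) -> list[str]:
    """path_blocks runs from leaf to root; returns the new 'CIDs' created."""
    child_cid = fake_cid(new_leaf)
    new_cids = [child_cid]
    for parent in path_blocks[1:]:
        # Re-encode each parent with its child's new CID spliced in (simplified).
        child_cid = fake_cid(parent + child_cid.encode())
        new_cids.append(child_cid)
    return new_cids

# A three-level path (leaf, intermediate, root) means three new blocks per edit.
print(len(edit_leaf([b"leaf", b"intermediate", b"root"], b"edited leaf")))  # 3
```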
So this mutation has real costs, and they're flow-on costs with implications for all parts of the stack. These costs cascade: as you can see, just small changes in parts of the DAG cascade all the way up. If we make our blocks too large, say we were aiming for one-megabyte blocks with these arrays, then even changing one little element means re-encoding those big blocks.

We could make the blocks small, so that mutating them doesn't seem so costly, and you get more potential deduplication across the DAG. But when you make blocks small, you're going to get much taller DAGs, so you end up having to mutate a lot of blocks.

So there's a sweet spot here somewhere, but it really is application-specific. It depends on what you're doing with the data: whether you're transferring it, whether it's going over IPFS, whether you're transferring it around the world.

So the moral of this story is that these kinds of costs really are application-specific and you should benchmark to figure them out. And a data transfer slide, because this is definitely a data transfer talk: mutation costs are very similar to the data transfer costs. The taller the trees, as we've heard from a few previous talks, the more round-trip costs you get. Say you just want one element out of this array; you need to search your DAG to find it.

If it's a long, stringy DAG, then you're going to have to do this round-trip thing just to find the element you want. But if the blocks are really thick, really heavy, then you might have fewer round trips, but you're going to be transferring a lot more data just to find the one element you want, because each of those nodes is going to be really thick. There's also an algorithm involved in finding what you want, so sometimes with these structures it's difficult to prefetch.
You can't predict where you're going to go until you get the next thing, and that makes this whole data transfer process really tricky with our DAGs. Anyway, let's get into the really interesting bit, which is HAMTs, which we talk about a lot. If you haven't heard of a HAMT... how many people have heard of a HAMT? Okay, so you've heard of it, but I want to talk about how they actually work, because this is kind of fun.

So a HAMT is a hash array mapped trie. It's a key-value store; it's just a map, like an associative array that you would have in a normal programming language, and it's a mashup of a hash table and a prefix tree. We put these two concepts together and we get this; it's like a hash table tree.

Now, it's actually a common algorithm that's used within standard libraries; you'll even find it in Java standard libraries for their collections, because it's such an efficient way of making associative arrays. The nice thing about HAMTs for our purposes is that they use hashing to create randomness, which leads to a denseness property, and denseness is something we really want out of our DAGs where we can get it.

Basically, the fewer blocks, the better, most of the time, and the fewer hops from the root to where you want to go, the better it is, for data transfer and storage and all sorts of things, and for this mutation cost as well. When you get these cascading costs, the fewer things you have to mutate, usually the better, within the constraints of size and all that sort of stuff. So block size does become a secondary consideration when you're talking about denseness.

Let's start off by having a look at what a hash table is, and then I'm going to show you how we can take a hash table and a prefix tree and get a HAMT; it's actually simpler than it sounds. So, a hash table: remember this one for your next technical interview, because this will definitely come up.
A hash table is given a list of key-value pairs, and we want to distribute them into an array of indexed buckets, zero to n, for some number n.

You set the size of this thing, and then you select the bucket by taking the hash of a key and running some little algorithm over that to get the index of a bucket. So you use the hash of the key to get some randomness, and you use that random value to find an index, which tells you which bucket to put the value in. The buckets will then contain an array of elements.

Any number of elements could end up in these buckets, but the hash provides randomness in the distribution, so that the buckets even out over time. Two keys can resolve to the same bucket, because you can see how we're sort of truncating the hash here a little bit; we're using it to narrow things down.

So you can often have two keys in the same bucket, and we call that a collision; they're said to collide. Let's run a really simple hash table across seven keys and values. We've got keys here, which are some names, and values, which are just individual integers, and we're going to store them in a hash table of also seven buckets. Potentially we could add more keys here, but we're just going to do seven into seven and see what happens.

Then we're going to run a little index function, and what I'm going to do is really simple: I'm going to take the first byte of the SHA hash output and run a mod 7 over it, just to find which bucket these go into. I'm using seven because I could also mask off the first three bits, which would be a mask of seven, and just use three bits; that's a really interesting thing that I'll let you think about, why we might want to do that.

But for now it's just a mod 7, and that gives us a number between 0 and 6, and then we sort them into the buckets they belong in. You can see that we end up filling only three buckets with these seven names, which doesn't sound great, but given enough of these keys you would find the distribution roughly even across all of these buckets, and you'd fill them up evenly.
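
Here is a small runnable sketch of that scheme: the bucket index is the first byte of a hash of the key, mod 7. I'm assuming SHA-256 as the hash, and most of the names below are made up; only William, Ann and Gary come from the talk.

```python
# Toy hash table in the style described above: bucket index is the first byte
# of SHA-256(key) mod 7.
import hashlib

NUM_BUCKETS = 7
buckets: list[list[tuple[str, int]]] = [[] for _ in range(NUM_BUCKETS)]

def bucket_index(key: str) -> int:
    return hashlib.sha256(key.encode()).digest()[0] % NUM_BUCKETS

def put(key: str, value: int) -> None:
    buckets[bucket_index(key)].append((key, value))

def get(key: str):
    # Jump straight to the bucket, then scan it linearly for the key.
    for stored_key, value in buckets[bucket_index(key)]:
        if stored_key == key:
            return value
    return None

for i, name in enumerate(["William", "Ann", "Olivia", "Noah", "Emma", "Liam", "Ava"]):
    put(name, i)

print(get("William"))  # 0, the value stored alongside the key
print(get("Gary"))     # None: the bucket scan finds no matching key
```

Looking up Gary lands in whichever bucket his hash points at, but the scan of that bucket finds no matching key, so the lookup returns nothing.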
To look up in a hash table, you do the same thing. You take a key, you run the index off the hash, then you jump to that bucket and you do a search through it; that could be a sorted search or just an iteration through the bucket to find the value that you want. We store both the key and the value in those entries so that you can actually check the key.

So, for instance, if I want to look up William, I take the key William, I run the hash over it, and I find out, using the hash and that little function, that it's in bucket one. So I go and look in bucket one, there's an array of three items, I look through that array, I find William, and then I've got a value. Great. What happens if I want to find Gary? Well, I run a hash of Gary, the index function gives me one, I go to bucket one, there are three items, and Gary's not in there. So Gary does not exist in our hash table.

Okay, hash tables for IPLD: are they any good for IPLD? Well, it's this bucket-size problem that becomes a problem for IPLD. If these buckets can grow infinitely, then we're going to get blocks that are too big.
We could encode a bucket into a block, but if we keep on adding to it, the block just gets out of control, and it's not very useful if you've just got one block with all this data in it. When we keep on mutating these things and adding things to our blocks, the block sizes get bigger, the costs associated with that mutation keep on going up, and that goes up exponentially. So we don't actually want very large blocks for our data.

We also want to stop users from being able to force collisions. So to solve these issues, we're going to turn the hash table into a tree, and that's what a HAMT does. A HAMT takes a hash table and repeats the pattern, by taking buckets and turning them into their own little hash tables, and repeating the index operation as we iterate through a hash.

So, instead of an array of buckets, what we can do is have an array of either entries or links to a new table. As an in-memory data structure in some programming language, that link would just be a pointer to another struct, but in IPLD it's a CID, a link to a whole new block. We run the same index function, but we have a way of deriving a different index at each level of this tree, depending on which level you're on, over the same hash.
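
A minimal sketch of that per-level indexing, assuming SHA-256 and a whole byte of the digest per level; real HAMTs typically consume a configurable number of bits per level rather than a full byte.

```python
# Per-level indexing over the same digest: level 0 uses byte 0, level 1 uses
# byte 1, and so on (a simplification of real HAMT bit-slicing).
import hashlib

NUM_BUCKETS = 7

def index_at_level(key: str, level: int) -> int:
    digest = hashlib.sha256(key.encode()).digest()
    return digest[level] % NUM_BUCKETS

print(index_at_level("William", 0))  # bucket to use in the root block
print(index_at_level("William", 1))  # bucket to use in a second-level block
```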
So, for example, if the original block uses the first byte, like we did, then maybe the second layer could use the second byte to derive the location. We're going to try that with this hash table; we're going to turn it into a tree. First of all, we'll notice that we have arrays of entries here, and we just want to have one entry per slot. One of them is fine: the first one has only one element in it.

So we'll leave that one alone, but two of them have three elements, so we're going to insert links there and run this same procedure across those two. We'll run the hash again, or we could just reuse the same hash, but our index function now jumps up to the second byte.

We take the second byte of the hash and we get a totally new index for our elements, and then we can put them in new buckets for this second layer, and you'll see they fall nicely into completely separate buckets. So now we have a two-layer tree where each entry falls into a separate bucket at the end. There are no arrays of elements here, there's no searching: you've just got to look up where something is and jump to the next thing until you find it.

This is what our IPLD blocks will look like. They're simply blocks with an array, and the array slots either hold an entry or they hold a CID to another block.
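
Purely as an illustration of that block shape, a node might be pictured like this; the slot layout, field names and placeholder CID are made up for the sketch, not taken from the actual IPLD HAMT spec.

```python
# Illustrative shape of one node: an array whose slots are empty, hold an
# inline key/value entry, or hold a link to a child block.
node = [
    None,                          # empty slot
    {"link": "bafy...child"},      # CID pointing at another block
    {"entry": ["Ann", 3]},         # key/value stored inline
    None,
    {"entry": ["William", 0]},
]
```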
So, looking up: we run the hash of Ann, and we find that in the first block the bucket we want to look in is bucket number four. We go there and we find a CID. So the entry is not there, but it tells us where to go: the second block, the second layer. We run that index function again, but over the second byte, and we find a new index, which is five. We look in five, and there it is. Great.

Now, if we did Gary: the first step finds bucket number one, we go in there, find a CID, jump to that second CID, and it tells us to look in bucket number two, and it's not there; Gary doesn't match William. So Gary doesn't exist in our hash table.
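
Here is a tiny runnable sketch of that two-level walk, with a hand-built structure standing in for the blocks; the keys, values and use of SHA-256 are assumptions for illustration, not the data from the slides.

```python
# Two-level lookup walk over a hand-built structure standing in for the blocks.
import hashlib

NUM_BUCKETS = 7

def index_at_level(key: str, level: int) -> int:
    return hashlib.sha256(key.encode()).digest()[level] % NUM_BUCKETS

# The root links to a child node in the slot William hashes to at level 0;
# the child holds William's entry in the slot he hashes to at level 1.
child = {index_at_level("William", 1): ("William", 0)}
root = {index_at_level("William", 0): {"link": child}}

def get(key: str):
    node, level = root, 0
    while True:
        slot = node.get(index_at_level(key, level))
        if slot is None:
            return None                               # empty slot: key absent
        if isinstance(slot, dict):
            node, level = slot["link"], level + 1     # follow the link, use the next hash byte
            continue
        stored_key, value = slot
        return value if stored_key == key else None   # reached an entry: keys must match

print(get("William"))  # 0
print(get("Gary"))     # None
```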
Okay, have we solved our problems? Well, we've controlled the size of the blocks, because each one holds at most n entries or links, so these things can't grow infinitely. They're bounded, which is great. Overflows: when you hit an overflow, it causes the creation of new blocks. And we've also increased security by using more of the hash.

You might be able to force collisions by making keys that have a particular first byte that you want, but it's very difficult to keep doing that as you proceed through a hash. You could keep on forcing it, but look how difficult it is to hash for Bitcoin and get those leading zeros; it's very difficult to brute-force hashes that will collide at all the places you want.

So, characteristics. Denseness: hash digests give us randomness, which is great, because it gives us this dense distribution of non-random keys. It doesn't really matter if you're storing keys that look like each other; the hash function gives us this random distribution. We also don't have a strict height-one base layer where all the entries are stored, like our super large array; they don't all exist at the bottom.
They're actually distributed all the way through the thing, and I'll show you a picture of what that might look like. We don't strictly say that there's a bottom level where they all exist, but the more you fill it up, the more the entries tend to get pushed down. The linear scan is also eliminated, so we don't need to search through buckets to find these things; the lookup costs are really all about jumping blocks and running the index function.

So that's lookups. Deletion: when we delete an element, we have to reverse what we've done, and what we do is a collapse, to turn this back into the form it would have if those elements had never been there. So if we only have one element left in a node, then we collapse that element back up into the parent, and we end up with the form of the HAMT that it would have had if the deleted element was never there.
I'll tell you why that's important in a second. We can also bring back buckets, and we actually do this in practice when we use HAMTs: we bring back buckets with a maximum length. So what we say is that there's a maximum bucket length, say three.

It's a little bit like having these breaks between talks: you allow for some flexing when people go too long or too short. It helps balance the mutation size against the block size and the traversal-depth trade-offs. So buckets tend to be something we use in practice.
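
A rough sketch of that rule, under the same illustrative assumptions as before (SHA-256, one digest byte per level): entries pile up in a bucket until a maximum length is exceeded, then the bucket is replaced by a link to a child node one level deeper.

```python
# Bucket-overflow sketch: a bucket grows until it exceeds MAX_BUCKET, then it
# is replaced by a link to a child node and its entries move one level down.
import hashlib

NUM_BUCKETS = 7
MAX_BUCKET = 3

def index_at_level(key: str, level: int) -> int:
    return hashlib.sha256(key.encode()).digest()[level] % NUM_BUCKETS

def insert(node: dict, key: str, value, level: int = 0) -> None:
    idx = index_at_level(key, level)
    slot = node.setdefault(idx, [])
    if isinstance(slot, dict):                 # the slot is already a link to a child node
        insert(slot["link"], key, value, level + 1)
        return
    slot.append((key, value))
    if len(slot) > MAX_BUCKET:                 # overflow: push the whole bucket down a level
        child: dict = {}
        for k, v in slot:
            insert(child, k, v, level + 1)
        node[idx] = {"link": child}

root: dict = {}
for i, name in enumerate(["William", "Ann", "Olivia", "Noah", "Emma", "Liam", "Ava"]):
    insert(root, name, i)
```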
So, canonical forms; this is back to the idea of why we do that collapse on deletion. What we typically want in IPLD is that, for any set of data, we can run the algorithm and end up with the same root, the same CID for our DAG, regardless of the order in which we create this thing.

So, for a key-value set: regardless of what order we add our keys, we want to end up with the same root CID if it contains the same data. This deduplication allows us to have consensus when we're talking about the same data set. With HAMTs we do that by sorting the buckets, if we have them, so that their entries aren't just in there in insertion order.
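
A minimal sketch of one such canonicalisation rule, with JSON standing in for the real IPLD codec: sort bucket entries by key before encoding, so the same set of entries always produces the same bytes regardless of insertion order.

```python
# Canonicalisation sketch: order bucket entries by key, not by insertion order,
# so the same set of entries always encodes to the same bytes.
import json

def canonical_bucket(entries: list[tuple[str, int]]) -> bytes:
    ordered = sorted(entries, key=lambda kv: kv[0])
    return json.dumps(ordered).encode()

a = canonical_bucket([("Will", 2), ("Ann", 1)])
b = canonical_bucket([("Ann", 1), ("Will", 2)])
print(a == b)  # True: insertion order no longer affects the encoded block
```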
Just for a mental picture of what HAMTs end up looking like: this is the kind of thing we would see in practice with a HAMT, where the little orange blocks are values that we might find. You'll see that the values are distributed throughout the DAG, but they tend to be pushed down towards the bottom, and the upper nodes tend to contain mostly links.

But, unlike the super large array, you can't just look along the bottom layer and find all your entries; they're actually distributed all the way through. So, just evaluating HAMTs: they're dense, which means they can be fast for lookups. We don't have to traverse all the way down to the bottom to find what we want; we can often just hop straight to it, and because we pack the values in really nicely with this randomness, we get this denseness. And they're flexible; they've got lots of knobs.
We need a cryptographic hash function. If we don't use a cryptographic hash function, then we allow people to brute-force collisions. That may not matter in some situations where you don't accept user input, but typically for IPLD we want cryptographic hash functions, to get this randomness and avoid these collision problems; HAMTs are all about randomness. They can't do key ordering, which is a bit of a bummer. You can iterate over a HAMT, but the order in which you iterate will always be random.

So you can't do much with a HAMT without running the algorithm. You can transfer it, but it's very difficult to make it meaningful without having the algorithm to run over it.
I might skip that one. For transfer or traversal, the denseness we get is great: when we're transferring this between two parties, we don't have these long stringy things that we have to go back and forth over. So denseness is great, but again, because we need the algorithm, it's very difficult to do any prefetching.

Any smart prefetching, anyway; it's basically impossible to predict where you're going to go in a HAMT to find the value that you want. I'm going to skip schemas. Just quickly, other data structures, just to compare this to some other ones: we have this other data structure we're using in Filecoin. It works okay in Filecoin, but it's not very awesome.
If we were to replace a data structure in Filecoin, it would be the AMT. It's a specialized data structure that is used to store array-like data; it's a little bit like the super large array, but it's a mashup between that and a HAMT. The reason we do that is because in Filecoin, where we use it, we actually want a sparse array: we don't just want to pack up from zero onwards, we want to insert indexes in a sparse way.

The reason it's not awesome is that it fails our denseness criteria, because what we end up with is the AMT doing this kind of thing where the root, at the top left here, connects down to nodes that span a long way out to the far right. The larger the indexes you're storing, the further out you push, so you can imagine that transferring this kind of data gets pretty annoying.
Lastly, ordered collections. This is something Mo mentioned: ordered collections are the next great Holy Grail. With ordered collections we can build databases. I'd love to be able to come and do a talk about all the things we can do with ordered collections, but we don't have solid ordered collections implemented yet.

If we can build these things, then we can build all sorts of powerful things on top of them, but they're a little bit hard, because if you take away the randomness, then you have all sorts of problems with density and structure. The key insight that we're working on, including in Prolly trees, and there are a couple of other approaches as well, is that we can use hash digests as a source of randomness to generate probabilities, rather than strict branching like we do with a HAMT: we can run batches of keys through the hash and generate probabilities of branching.
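
As a very rough sketch of that idea, and only as an assumption about how such schemes tend to work rather than a description of any particular Prolly tree implementation: a key's hash can decide, with some target probability, whether a node boundary falls at that key.

```python
# Hash-derived branching probability: a boundary falls at a key with
# probability roughly 1/target_fanout, determined entirely by the key's hash,
# so chunking is deterministic for a given key set. The threshold is illustrative.
import hashlib

def is_boundary(key: str, target_fanout: int = 16) -> bool:
    return hashlib.sha256(key.encode()).digest()[0] < 256 // target_fanout

keys = ["key-%d" % i for i in range(32)]
print([k for k in keys if is_boundary(k)])  # the keys that happen to start a new node
```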
A talk on Prolly trees would probably go into more detail on this. Anyway, that's it from me. Hopefully that wasn't too long.