►
Description
Banyan is a tree data structure that provides an indexed, compressed and encrypted append only log.
It is used in a distributed event sourcing system at
A
So
tactics,
what
was
tactics
doing
or
what
is
tactics
doing?
The
company
still
exists,
but
I'm
I've
moved
to
another
company,
so
what
they
were
doing
is
what
they're
doing
is
machine
and
human
Communication
in
Factory.
So
basically
you
have
a
bunch
of
nodes.
Node
can
be
a
person
or
a
machine
or
a
server
in
the
server
rack,
and
they
all
want
to
talk
to
each
other.
A
It's
a
distributed
event
sourcing
system,
so
each
node
produces
an
event
log
and
you
replicate
these
event
logs,
so
the
other
nodes,
and
then
every
node
has
some
kind
of
snap.
Some
kind
of
state
which
you
can
then
analyze
and
draw
conclusions
from
basically
and
The
crucial
part
is
that
this
system
should
be
partition
tolerant.
So
if
one
node
is
not
available,
things
should
just
carry
on
and
that
actually
work
quite
well.
A
So
our
knees
attack
takes
one
need
that
we
had
was.
We
want
event,
storage
for
distributed
systems,
quite
clear.
Yes,
so
we
wanted
to
be
petition
tolerant.
A
We
wanted
to
be
local
first,
so
basically,
every
device
should
always
be
able
to
work,
even
even
if
you
put
it
in
a
faraday
cage
and
it
cannot
communicate
with
the
outside
in
any
way,
it
should
should
still
at
least
be
able
to
interact
with
the
local
state.
I
think
that's
very
important.
We
have
seen
many
projects
today.
That
mentioned
this
local
first
principles
and
I
think
this
is
the
way
to
go.
A
So
we
wanted
to
deploy
this
first
in
small
private
swarms,
but
then
eventually
also
on
the
global
ipfs.
So
in
addition
to
all
the
other
things,
you
also
needed
some
kind
of
encryption,
because
you
don't
want
your
machine
data
on
the
public
iqfs
in
a
public
way.
People
just
don't
like
that
for
some
reason
so
requirements
further.
It
have
to
be
content
addressed.
You
want
an
append,
Only
log,
so
we
want
to
be
able
to
append
quickly
to
the
end.
We
want
it
indexed
by
offset.
A
So
it
should
be
able
to
say
give
me
message,
number
two
five
six
quickly
and
we
also,
in
addition,
wanted
to
have
a
way
to
query
all
the
events
in
a
log
by
tax,
which
is
the
strings
basically,
and
it
should
be
compressed
because
machine
data
is
very
very
it's
a
lot
of
data
sometimes.
But
it's
also
very
repetitive.
Imagine
a
temperature
sensor.
It
shows
like
one
temperature,
then
one
second
later
tells
you.
The
temperature
is
still
the
same,
and
if
you
have
a
few
billion
of
those,
then
you
really
need
compression.
A
Otherwise
it's
just
no
fun
at
all
and
it
has
to
be
encrypted
as
I
just
mentioned,
so
append
only
lock.
We
need
faster
pen,
we
need
address
by
stable
index,
and
we
also
want
to
be
able
to
remove
data
at
some
point,
because
we
don't
want
the
devices
to
run
out
of
space.
But
when
we
do
that
we
don't
want
the
indexes
to
change,
so
you
can
say,
remove
all
events
from
zero
to
one
million,
but
then
the
event
number
one
billion
and
one
will
still
have
that
offset
it-
will
not
move
to
the
front.
A
That's
an
operation,
that's
just
not
supported,
and
so
this
is
the
tricky
part.
We
also
wanted
to
have
some
kind
of
querying.
So
we
want
to
say
give
me
all
events
that
match
a
certain
pattern,
whatever
whatever
it
is.
In
our
case
it
was
tax
and
time
and
stuff
like
that,
but
the
the
data
structure
itself
is
generic
on
what
you
want
to
carry
on.
So
you
can
Define
your
own
types.
What
you
want,
what
you
want
to
be
your
data?
A
What
you
want
to
be
your
keys,
which
is
the
indexed
part
and
what
you
want
to
be
your
summary,
so
how
you
want
your
keys
to
be
summarized
up
the
tree
basically
and
yeah,
so
this
is
our.
This
were
our
needs
again.
We
also
need
time
ranges,
there's
another
index
and
tax
and
you
can
do
a
Boolean
combination
of
all
the
above,
so
I
mean
the
typical
querying
thing
has
encryption.
So
since
eventually
you
want
to
put
private
data
on
public
ipfs,
we
need
the
encryption,
but,
as
I
mentioned
before,
we
also
need
compression.
A
So
what
happens?
If
we
try
to
compress
encrypted
data?
Well,
you
can
do
it,
but
it
doesn't
work.
Just
the
compression
will
be
not
very
good
or
your
encryption
algorithm
is
very
shitty.
A
One
of
the
two
things
will
be
true
so
for
encryption
we
just
use
the
fast
stream
Cipher
and
we
have
a
two
different
secrets
for
the
keys
and
for
the
values.
So
you
might
say,
the
values
are
private,
but
the
keys
are
so
you
can
can
tell
people
they
can
query,
but
they
need
an
additional
key
to
get
to
the
actual
values
it's
useful
in
some
cases
and
so
now
to
the
to
the
compression.
So
obviously
we
have
to
compress
compress
before
we
encrypt
otherwise
yeah.
A
It
doesn't
work
and
then
this
this
is
something
we
use
Z
standard
from
Facebook,
which
is
very,
very
good,
general
purpose,
compression
algorithm
and
yeah.
This
is
one
thing
you
really
want
to
do.
If
you
have
data
that
looks
like
that,
XML
version
equals
blah
blah
blah
super
webos.
Then
you
want
to
have
a
lot
of
stuff
to
put
into
your
blog,
and
if
you
have
binary
data,
your
compression
is
not
going
to
be
able
to
do
much.
A
So
you
really
want
to
determine
your
block
size,
not
on
the
size
of
the
uncompressed
data,
but
on
the
size
of
the
compressed
data,
and
that
means
you
have
to
have
a
tight
integration
with
the
compression
algorithm.
So
you
cannot
just
say
here:
hey
compress
this,
but
you
have
to
feed
the
chunks
and
then
see
when
it's
full,
basically
so
that
that
this
was
neat,
so
the
data
structure
I
came
up
with
is
called
Banyan.
A
So
there's
a
banyan
tree,
it's
a
very
interesting
looking
tree
and
it
has
well,
it
can
get
multiple
roots
and
at
some
point,
The
Roots
will
be
indistinguishable
from
the
original
root
so
and
that
I
found
that
to
be
a
nice
nice
coincidence.
A
So
what's
Banyan
Banyan
is
rust,
crate.
It
is
licensed,
MRT
and
Apache,
so
the
usual
rust
license.
Basically,
so
you
can
use
it
no
problem,
the
the
higher
level
parts
of
arctics
are
proprietary,
but
Banyan
and
IP
has
embed,
are
public.
A
It's
basically
it's
a
persistent
sequence
of
key
value
pairs
where
key
and
value
are
typed.
You
have
faster
access
by
index
and
you
have
custom
queries
by
summary.
We
heard
this
before
a
few
times
and
in
case
of
arctics
there's
certain
stuff
in
the
keys,
but
you
can
Define
your
own
keys
and
you
can
delete
ranges
so
now,
so,
basically,
there's
a
value
type.
A
So
but
one
thing
that
is
not
opaque,
it
is
seaboar,
but
if
you
have
a
link,
so
you
have
a
sit
in
your
in
your
blob,
it
will
be
scraped,
so
the
data
the
links
need
to
be
put
outside
the
compressed
block,
otherwise
ipfs
won't
find
the
links.
I
mean
IPS,
cannot
find
links
and
compress
and
encrypted
data.
A
So
you
have
a
key
type.
This
can
be
summarized
and
again,
links
will
be
scraped
and
you
have
a
summary
type.
The
summary
type
can
be
the
same
as
a
key
type,
but
you
can
also
have
some
other
summary
yeah.
Like
imagine.
Your
key
is
a
string
and
then
your
summary
might
want
to
be
a
set
of
strings
or
whatever
and
yeah.
You
must
be
able
to
compute
a
summary
from
keys
or
you
must
be
able
to
compute
a
summary
from
summaries.
A
A
For
example,
this
doesn't
fit
on
the
slide
anymore,
but
imagine
you
have
a
tree
and
every
single
hash
in
the
tree
is
going
to
use
the
same
hash
algorithm
at
the
same,
all
the
all
the
other
parameters,
and
then
in
due
to
efficiency,
you
just
stored
32
bytes
of
hash
as
your
link
and
then
somewhere
all
the
way
up.
You
know,
or
these
all
these
hashes
are
going
to
be
shot
256
whatever.
A
So
this
is
how
this
is
done
in
Rust,
there's
a
there's,
a
trade
which
this
is
a
common
design
term
pattern
in
Rust,
where
you
have
a
trade
which
just
allows
you
to
set
a
bunch
of
types
and
yeah.
There's
all
these
types
that
I
just
mentioned,
and
now,
let's
look
how
the
teeth
tree
structure
looks
like
so
gold
is
to
be
chunky.
A
What
I
mean
by
chunky
is
that
you
want
to
pack
as
much
as
possible
into
a
single
node,
obviously,
within
the
limits
of
what
bitstrap
can
handle,
you
want
to
pack
as
much
as
possible
in
the
node
and
so
to
do
to
allow
that
nodes
are
always
sequences.
A
node
never
contains
a
single
value.
It
contains
a
sequence
of
values,
a
note
as
in
a
node
that
ends
up
in
on
ipfs
and
so
level.
No
zero
nodes
are
sequences
of
values,
level,
one
nodes
are
sequences
of
keys
and
level.
A
So
that
means
that
the
data
that
you
need
often
because
you
need
to
you-
need
it
for
indexing
or
for
querying-
is
further
up
the
tree
than
the
data
that
you
need.
Rarely
like
the
like
the
values
that,
if
you
need
the
values,
have
to
go
all
the
way
down,
but
if
you
just
need
to
figure
out
which
values
you
need,
you
can
do
it
further
up
in
this.
In
the
tree,
so
this
is
how
the
tree
looks:
I
just
drew
it
on
the
Whiteboard.
So
basically
you
have
values.
A
Those
are
just
sequences
of
blobs
of
silver
blobs.
Then
you
have
keys.
Each
key
contain
contains
multiple
links
to
leaves
and
for
each
Leafs
it
has
the
the
keys.
So
it's
a
actually
it's
a
sequence
of
sequences.
If
you,
if
you
look
at
it
like
that
and
then
corresponding
to
each
sequence,
it
has
a
link
and
then
level.
A
Two
and
above
you
have
summaries,
which
are
again
sequences
of
summaries,
and
you
have
a
sequence
of
sequences
summaries
and
for
each
summary
sequence,
there's
a
link
to
the
to
the
left
below
and
now
the
block
format
is
seen
from.
Ipfs
is
pretty
simple,
because
there's
just
not
that
much
that
you
can
do
because
most
of
the
thing
has
to
be
in
the
encrypted
part.
So
the
block
format
is
a
number
which
is
just
an
offset
in
the
Stream,
then
this
Vector
of
the
seats.
Basically,
this
is
the
extracted
Links
of
the
content.
A
You
can
you
can
not
have
your
cake
and
needed
to
if
you
want
the
links
to
be
completely
hidden,
you
cannot
extract
them,
but
then
ipvs
won't
be
able
to
do
anything
with
it.
So
we
have
to
extract
them.
If
you
don't
want
them
extract
it,
just
don't
mark
them
with
tech42,
very
simple
and
then
obviously
you
have
to
compressed
and
encrypted
data.
A
A
So
why
not
taxi
boy
see
obvious
question
Yeah,
so
basically
that
that's
keyboard
to
me
feels
a
little
bit
like
binary,
Json
and
I.
Don't
want
Json
I,
don't
want
text
Json,
but
I
also
don't
want
binary.
Json
I
want
to
be
able
to
use
arbitrary
things
as
dictionary
keys
and
not
just
strings
whenever
I
have
to
stringify
something
to
put
it
in
a
map.
I
die
a
little
inside
kind
of
and
I
find
daxibo
very,
limiting
compared
to
straight
sibo.
I.
A
Think
there's
nothing
wrong
with
straight
sieber,
just
use
it
it's
it's
fine
and
there's
a
lot
of
effort
done
in
in
taxibo
about
canonicalization
and
I.
Don't
care
for
that
at
all,
because
there's
just
one
device
writing
this
stuff.
So
the
canonical
stuff
is
whatever
the
device
that
drives
the
event
has
written,
so
there's
not
going
to
be
any
collisions
or
like
matching
or
whatever
so,
but
nevertheless,
you
can
of
course
put
daxibo
data
in
Banyan.
It's
no
problem
at
all.
Daxibo
is
valid
seabor.
A
It's
also
valid
sibo
with
attack
42,
so
it
will
just
work
yeah
and,
but
you
don't
have
to
you-
can
also
use
c-bar
which
exceeds
what
is
possible
in
a
dark,
sibo
and
yeah.
That's
the
end
of
my
talk,
the
rest.
There's
an
example:
it's
just
a
big
main
function,
which
does
three
different
things.
First
thing
it
does.
It
just
uses
money
and
it's
just
a
sequence
of
events.
So
what
do
you
get
compared
to
like
the
typical
thing
to
do
a
linked
list?
A
You
get
insanely
good
compression
of
your
data
because
it's
it's
packing
multiple
events
into
one
block
and
then
using
the
standard
compression
to
to
get
rid
of
the
redundancy.
So
you
will
get
some
very
good
compression
compared
to
if
you
do.
The
typical,
like
attack,
links,
data
structure,
thing
and
you
also
get
Lightning
Fast
indexed
Access,
but
obviously
you
don't
get
any
queries
because
well,
you
haven't
defined
anything
that
you
want
to
query
on
then.
A
The
next
thing
is
an
example
where
you
use
Banyan,
like
tactics
does
so
you
have
a
predefined
index
types
which
is
tax
and
times
and
so
on,
and
then
there's
also
an
example
which
allows
you
to
use
Bunny
with
a
custom
index
type.
It's
a
very
simple.
Basically,
your
value
is
just
the
scalar
value
and
a
integer
or
whatever,
and
your
summary
type
is
just
a
range
like
minimum
maximum.
But
this
is
the
smallest
possible
example.