From YouTube: Linux2ipfs - Joroppo // IPFS Implementations Workshop
Description
In this May 2022 IPFS implementations workshop, we heard a number of first-hand accounts from builders & thinkers working on and with IPFS, taking stock of the current implementation story, and imagining what’s possible through shared effort on a protocol with such broad applicability.
Hi everyone, I'm Jorropo. I work at Protocol Labs as a developer advocate for IPFS, and I will tell you the story of something I've been building for IPFS, which is linux2ipfs.
The main part of the story is not actually linux2ipfs itself; it's mainly how it came to happen and what lessons you can take if you want to make your own custom project.
So the first problem I have is: I'm a Linux user.
That means that, basically, when someone downloads the data, they reshare it to other people. Right now, the way the Debian and Ubuntu repositories work is that they have a lot of mirrors, which are web servers around the world. That's very expensive, so obviously the developers do not pay for most of them; most of them are run by universities that have a few gigabits of bandwidth free and host one. IPFS would allow, instead of having lots of scattered mirrors, one global network that everyone can use, which is very fast in total.
So mirrors are basically just web servers with the files, and the first step for me was getting the actual data for the packages, to be able to create my own repository. The first thing I tried was wget because, well, I have web servers, and wget downloads from web servers. It didn't work very well, mainly because I couldn't find an easy way to do incremental updates.
The repo itself is very big, but it's not getting updated completely every day; most of the time, only a few gigabytes are added per day, and ideally I want to download only the new gigabytes, as otherwise I would be wasting my bandwidth. So I didn't go the wget route. Instead, I went with rsync, which is a very well-known tool for doing that, and that's my command, that's my script.
It downloads Debian, Ubuntu and Termux, and that's basically it. Now I have the data, and I have an issue, which is that it is very big. That depends on your definition of very big, but I basically tried... oh yes, sorry, my setup.
For context, this is a NAS which I built out of recycled hardware. Only the HDDs are new; everything else is very old hardware, and that means that when I tried adding the data to IPFS, it was very slow.
It took more than a week, and it wasn't getting faster; every time, it took more than a week, and the repos update more often than that, so I would always be catching up and serving very old data. If you assume you want to download security patches, for example, having them a week late can be devastating. So that was not a good solution.
In my explanation I will give you some tips about writing fast software. The first one, the most important one, is: measure. Really.
You might think you know why your software is slow, and I can tell you from experience: no, you don't. It's incredible how many times just profiling showed me that my assumptions about what was slow and what was fast were wrong. That's the most important rule: find your bottleneck by benchmarking and profiling.
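As a generic illustration of that advice (not from the talk), Go ships a sampling profiler in its standard library; a minimal CPU-profiling sketch, where `hotLoop` is a made-up stand-in for whatever code you suspect is slow:

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// hotLoop is hypothetical busy work standing in for the code under test.
func hotLoop() uint64 {
	var acc uint64 = 1
	for i := 0; i < 5_000_000; i++ {
		acc = acc*6364136223846793005 + 1442695040888963407 // LCG step
	}
	return acc
}

func main() {
	f, err := os.Create("cpu.pprof")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	pprof.StartCPUProfile(f) // sample where CPU time actually goes
	defer pprof.StopCPUProfile()
	fmt.Println(hotLoop() != 0)
}
```

Inspect the result with `go tool pprof cpu.pprof`; the flame graph often contradicts your intuition about where the time went.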
There are many tools you can use, but basically the conclusion I came to was that IPFS was not playing well with my IO. The main issue, I believe, was my disks, which are hard drives, and the problem with hard drives is that they have poor random access rates. I'm not actually sure this was the issue, but that's what I tried to fix.
The first rule is: only do the work you need to do.
That's the first rule for going fast, and the way I think IPFS fails at this, for my precise case, is that it's a very complete thing: you can use it to build a lot of applications. The main issue I had with the way I was building is that it's kind of built around a daemon model: you have a single service that communicates over an API, which is often HTTP, and that had lots of overhead and prevented some optimizations I wanted to do.
So to do that, I created linux2ipfs, which is a custom tool I've made. It's really small, and the main requirement was that it had to be easy to experiment with: it's one single main file, about a thousand lines of code, and it only has one job, which is taking the files on the hard drive and uploading them to an IPFS node, which right now is Estuary, for convenience reasons.
For example, linux2ipfs doesn't know how to do Bitswap, it doesn't know libp2p, it doesn't know networking, but I don't really care: I want it to talk to a node, and that it does really well.
So, the way it works: the main part of it is the recursive function. That's really the hard bit of the code, and it's mainly what you would think; you would code it something like this.
I give it a path on my hard drive and it will scan it, see what it is, and depending on the type it's going to do a different thing. So basically, with the small example on the bottom right, let's assume it's adding this directory: it would first start here at the top and create an empty directory block. Then it will recurse, and when the recursion at this step is finished,
we enter foo, we recurse into foo, we enter bar, we recurse into bar. Then we take care of baz and hi, which are the files, so we write them to the output CAR file, and once baz is finished we can go to hi. Then, when hi is finished, we go back up the stack, so bar, foo and the top, and we have finished and added everything. I don't want to spend much time on this, because it's really the canonical way of implementing it.
The second rule of going fast is cheating. First, one issue where IPFS was not going fast: as I told you, the Linux repos update by a few gigabytes per day. However, the way go-ipfs handles this is that it hashes everything again, and if it finds the same hash, it's not going to re-store it. The issue is that this means I have to hash everything, and I have to remind you, this is old hardware with a terrible CPU.
I don't want to do that. So the way I get around this is by cheating a bit. Most file systems report modification times, and that allows me to compare a file's modification time with the last time I updated it; if it's older, I know the content hasn't changed, so I just reuse the CID I got the previous time. I have an old.json file which stores all my old CIDs, and I use that to skip adding files that have not changed.
That's also good for bandwidth: because I don't have to re-upload the whole repo to Estuary every time, I just upload the new files.
The third rule of going fast is using your resources to their full potential.
It's not really that they go fast, but they have very low latency, while hard drives have terrible latency. The issue I was running into is that most of the time I was waiting for the disk to seek to a certain place: the way a hard drive works is that it has a head that seeks around the disk, and most of the time was spent doing this. The way you can fix this is with reflinking: instead of copying things back and forth between my files and my destination, I reflink the data. That's a feature of newer Linux kernels and of btrfs,
the file system I use, which allows me to create a copy-on-write copy. That means the data is not actually copied when the copy is created; there is only a tiny block on the disk which says that this file, the new destination, points to the data of the old file. If modifying the shared data just modified the block on the disk, it would modify it for both files, which I don't want, because then the data would be corrupted. So what they do instead is copy-on-write:
if someone writes to that shared piece of data, the file system creates a copy of it and writes to the copy instead. That's where most of the speed comes from: because I skip copying the data, I only have to read it, and I only write a few megabytes of metadata for the terabytes of data I have.
The second thing is that with IPFS, I could not find an easy way to send the data to Estuary while I was traversing it, so I had to first do the ipfs add, which takes weeks, and then upload to Estuary, which could potentially also take weeks. So what I did is what the small representation on the left shows.
We have the traversing module, and basically it's going to send the data in 32-gigabyte chunks: we create a 32-gigabyte file and send it to the sending module, and they both work in parallel. The traversing module doesn't have to be completely finished; it can send smaller chunks, and that allows the two to work in parallel. That's also very nice because the total time is basically that of whichever of them is slowest, since the other one just runs in the background.
So right there it's also gaining a lot of speed. The issue I had doing that is that I had to split my DAG into multiple blocks, and that didn't just work: Estuary wants full DAGs, and full DAGs means that if I send a directory, I also have to send all the files and the subdirectories, and all the sub-files in the subdirectories, and so on all the way down the DAG.
So the main difference is that when I send the DAG to Estuary, I'm not giving it the true CID; I'm only giving it raw CIDs. A raw CID is just some bytes: it doesn't have any encoding, the data is treated as-is. The way it works is that I send the actual bytes, the same serialized bytes, but with the raw CID, so the hash matches. Estuary is getting the fake root, which splits things into smaller blocks of 32 gigabytes.
But if someone wants to use the real root, which is far bigger, it still works, because when a person downloading the real files asks Estuary for them, it actually doesn't care that the codec is wrong; all it checks is that the hash matches. That way I can get around splitting my files into smaller chunks. And that's not only about speed: it's also because Estuary has a limit of 32 gigabytes, due to the Filecoin sector limits.
This way I'm also getting around the Filecoin size limit, and I can split the data across multiple Filecoin sectors to store it.
The last thing I did to greatly improve performance is parallel chunking. Instead of having the traversal do one block at a time, I now spawn a lot of different chunkers, and that's on one single file. It means that instead of working on one two-megabyte block of the file at a time, I can work on 32 times 2 megabytes, and that's very good at generating big queue depths.
Basically, since I'm now asking the kernel for a lot more data, roughly 32 times more, the kernel has much more leeway in which data it can give me first, and it is able to optimize the seek pattern of the hard drive far more efficiently. That gives me better performance, because I am getting more performance out of the hard drives by asking more of them. So now I have an architecture kind of like this:
I have the traversal source that creates chunking jobs, and the chunkers and the traversal send the blocks to the sending module, and all of this works in parallel, which helps going faster.
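The traversal-to-sender overlap can be sketched with goroutines and a channel. This is a toy version with the upload stubbed out and tiny segments (32 GiB in the talk); the point is that the two stages run concurrently, so total time is roughly the slower stage, not the sum:

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a stand-in for one chunk of CAR data ready to upload.
type segment struct {
	id   int
	data []byte
}

// runPipeline streams n segments from the traversal stage to the
// sending stage through a channel; both run at the same time.
func runPipeline(n int) int {
	segments := make(chan segment, 4) // small buffer decouples the stages
	var wg sync.WaitGroup
	wg.Add(1)
	sent := 0
	go func() { // sending module: uploads while the traversal still runs
		defer wg.Done()
		for s := range segments {
			_ = s.data // here: POST the segment to the pinning service
			sent++
		}
	}()
	for i := 0; i < n; i++ { // traversal module: emit segments as they fill
		segments <- segment{id: i, data: []byte("car bytes")}
	}
	close(segments)
	wg.Wait()
	return sent
}

func main() {
	fmt.Println("segments uploaded:", runPipeline(10))
}
```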
And the final result of all of this work: before, on go-ipfs, an add would take one week, while adding the same data with linux2ipfs took one hour and 30 minutes. That's for a full flush, where I have to add all the data again; if I do an incremental update,
I can improve on that number even more. Basically, that's very good. For reference, this test was adding the Debian repository, which is 1.8 terabytes.
So I think that's pretty good. One neat thing is that this is actually faster than the write speed of my disks; my disks can only do... I don't remember the actual number, sorry. But because I'm using reflinking and I don't have to do the copy, linux2ipfs creates a fake copy, a copy-on-write copy, and since it's not actually moving the blocks on the disk, it's able to go faster than the actual write speed, because it's not writing.
So again, the first rule: not doing work. Since I'm not doing the copies, it's faster. As for future plans, I want to make it actually usable. It doesn't currently work for hosting Linux data, mainly because I have an issue with symlinks in the gateway: when apt, the Debian package manager, tries to fetch one, it doesn't really understand it. Basically, the package manager and the IPFS gateway have different ideas of what should happen with the same symlink; they don't agree, and it doesn't work. I also want to make it faster.
There are multiple points that could be improved, and all of them follow the zeroth rule of making code faster, measure: these are things I measured that are actually slowing me down, my current bottlenecks. Starting is slow, so I could accelerate the traversal. I could start parallel uploads, because the pinning service is very slow; in my case that's the main bottleneck. And BLAKE2, because if the pinning service were fast enough, then my CPU would be the slow part.
All of these could improve it, plus a better UX and removing bugs, because it's buggy; I don't actually recommend you use it. The main point of this is not to tell you "hey, go use linux2ipfs, it's great." It's not: it's buggy code that works for me because I wrote it, but I'm not really sure anyone should use it. The main point is that this took me about five days of work to make, doing the same optimizations, in Go.
Since linux2ipfs has very few features, I can spend more time polishing them in the same amount of time, and all of this is possible thanks to the docs and specs. Really, the hardest part is understanding what you want to do; once you kind of know what an IPLD block is and why you want to do X and Y, reading the spec and implementing it is really easy, I think. Once you get past that first understanding hurdle, actually writing your own implementation is surprisingly easy.