From YouTube: Pinset orchestration with IPFS Cluster - Hector Sanjuan
Description
IPFS Cluster makes distributing a pinset across a scalable set of Kubo peers easy. In this talk, we will explore the basic features, setup and monitoring for home and production deployments.
The first thing to understand is that cluster peers are sidecars to Kubo peers and, as cluster peers, they're fully independent entities. There is one cluster peer per Kubo peer and they're usually collocated. Cluster peers have their own identity, they have their own configuration, and they communicate with each other using a private libp2p network.
All communication with Kubo is done through Kubo's HTTP RPC API. The IPFS Cluster software comes with two main binaries. Unlike Kubo, where the ipfs command is both the server and the client, in cluster there is an application that runs the daemon and an application that runs the client, and the client application uses the REST API exposed by the cluster daemon to talk to it and perform operations on it. The daemon is run with ipfs-cluster-service, and the client is ipfs-cluster-ctl.
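As a rough sketch of what that looks like in practice (commands as I remember them from recent releases; exact flags and defaults may differ between versions):

```sh
# Initialize and start the cluster daemon (a sidecar to a running Kubo daemon).
ipfs-cluster-service init
ipfs-cluster-service daemon

# In another shell, talk to the daemon through its REST API.
ipfs-cluster-ctl id        # show this cluster peer's identity
ipfs-cluster-ctl peers ls  # list the peers in the cluster
```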
I mentioned cluster peers form a private network, and they use that private network to communicate with each other, basically using libp2p pubsub and an internal RPC API that they share.
Cluster peers are tracking and modifying what we call the cluster pinset. The cluster pinset contains all the pins that the cluster should be tracking, along with their pin options. It is a key-value store, a big database with all the pins and their options, and this database is replicated to all the peers that are participating in the IPFS Cluster.
So in Kubo, a pin is just a CID and a pinning mode that can be either recursive or direct. In cluster it is more. The pin in cluster includes custom metadata that the user can provide, desired replication factors, creation date, expiration date (so that pins can be removed or unpinned from IPFS at a given date), origins information, and other options.
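A minimal sketch of pinning with several of those options set (flag names are how I remember them in recent ipfs-cluster-ctl releases, so check pin add --help for your version; <cid> is a placeholder):

```sh
# Pin a CID with custom options: a name, replication factors,
# an expiration, and user-provided metadata.
ipfs-cluster-ctl pin add \
  --name "my-dataset" \
  --replication-min 2 --replication-max 3 \
  --expire-in 720h \
  --metadata source=ingest-job-42 \
  <cid>
```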
Each individual cluster peer, of course, is tracking all these pins, and they can complete this static information that is stored in the cluster pinset with dynamic, up-to-date information that they extract from their own runtime state and from Kubo.
That is: the state or status of the pin, whether the pin is pinning, or queued to be pinned, or has errored while pinning, or has successfully completed pinning; the addresses of the allocated peers, in this case the IPFS peers; the timestamp of the last status change; etc., etc.
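Both views are available from the client; roughly like this (illustrative only, and output formats vary by version):

```sh
# Show the pin as stored in the shared pinset, including its pin options.
ipfs-cluster-ctl pin ls <cid>

# Show the live status of that pin on every peer: pinned, pinning,
# queued or error, plus allocations and last-change timestamps.
ipfs-cluster-ctl status <cid>
```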
When you send a new pin to the cluster, so when you ask the cluster to pin a CID, a complex process is triggered which decides, based on the pin options I showed you and based on the state of the peers in the cluster, where to allocate the pin: that is, which peers in the cluster should be the ones asking Kubo to pin the item.
Every cluster peer can optionally expose three different APIs, potentially all together, so you can enable and disable them at will. The REST API is the native API. It has full feature parity with what cluster has to offer and is the most performant one, because it's built exactly to fit how cluster works. There's a second API called the IPFS Proxy API. The proxy API mimics Kubo's RPC API, except for a couple of methods like pin add or pin rm.
Instead of doing what a single ipfs daemon would do, it interprets those requests as a cluster pin, a cluster pin add, or a cluster pin removal, thus turning some specific calls to IPFS into cluster-wide operations. This is designed so that you can essentially drop a cluster peer where before there was a single ipfs daemon and not notice a difference, because the full RPC API offered by Kubo is kept unchanged.
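In practice that means you can point the regular ipfs CLI at the proxy instead of at Kubo. Assuming the proxy listens on port 9095 (the default as far as I recall; check your configuration):

```sh
# This looks like a normal Kubo pin, but the proxy intercepts it
# and turns it into a cluster-wide pin operation.
ipfs --api /ip4/127.0.0.1/tcp/9095 pin add <cid>

# Non-intercepted methods are forwarded to the underlying Kubo daemon.
ipfs --api /ip4/127.0.0.1/tcp/9095 id
```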
The third one is the Pinning Services API. This is experimental because it's very new, but it offers compatibility with essentially anything that supports the Pinning Services API, Kubo itself included, and it's very good for pinning and unpinning, among other things. One very easy thing this allows you to do is to have a local Kubo daemon on your machine with a pinning service configured as a remote backend for that Kubo daemon, where the service is actually a cluster running somewhere else.
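Kubo's remote pinning commands can talk to such an endpoint. A sketch, assuming the cluster's Pinning Services API is enabled and reachable locally; the port and the (empty) key are placeholders that depend on your setup:

```sh
# Register the cluster's pinning-service endpoint with Kubo.
# "mycluster" is just a local label for the service.
ipfs pin remote service add mycluster http://127.0.0.1:9097 ""

# Pin through the remote service: Kubo asks the cluster to pin the CID.
ipfs pin remote add --service=mycluster <cid>
```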
Unfortunately, list-operation flags like pagination and filtering are not well supported, because pagination is very difficult to do in the way cluster stores the state.

Let's open the second part of our talk and discuss a bit about scale. The performance and size of a cluster can be related to different dimensions, and these dimensions may matter more or less to cluster operators.
It depends on what you're doing with the cluster, what applications you have implemented on top of it, and what type of pins you're putting in: whether they're big or small, whether you have many of them or not so many, etc., etc. But the dimensions that matter here are how well the Kubo daemons perform when they try to pin something, and how fast the cluster peers can commit new pins to the cluster.
So, diving into each of these: a cluster with many small peers might speed up the overall rate of pinning, because you get more IPFS, or Kubo, daemons pinning at the same time, so you can pin more. But at the same time this translates into more information flowing inside the cluster, since all of the peers need to replicate the pinset, which may make the cluster's ingestion throughput smaller, depending on the available bandwidth and so on.
The type of disk that you use, the layout, whether you're using LVM volumes or some RAID configuration, SSDs or spinning disks, the Kubo datastore configuration, the available memory on the machine, and particularly the IPFS configuration, including internal bitswap settings, etc., etc., are very important if you want to make Kubo pin more and faster. This is not related to Cluster at all; this is about configuring Kubo in such a way that when cluster tells it to pin, Kubo pins faster.
You can always scale Kubo machines vertically: you can always add more RAM and CPU and tune the configuration accordingly, and we operate Kubo nodes with 20 million pins in them, with 50 terabytes of data in them. But they are expensive and they need a huge amount of RAM; we're using something like 192 gigabytes of RAM for these nodes. Instead, it's more sustainable to use smaller machines for Kubo, because Kubo not only needs to pin, it also needs to provide that content to the network.
In terms of how many pins you can ingest into a cluster: the minimal number of pin requests that a single cluster peer can track, replicate and sync to a network of 25 peers is about 250 pins per second. This is the base, the lowest number that you can get by just doing things naively: sending requests to the API of a single node.
Even so, this number means about one million pins ingested every hour (250 per second over 3,600 seconds is roughly 900,000), and as I said, a cluster is made of multiple peers: you can write to different peers at the same time, you can parallelize your requests to the API, etc., so these numbers can only go up. I'm giving you the very base, the very lowest figure.
This works because cluster peers use Conflict-free Replicated Data Types (CRDTs) to sync their state, and we've seen that we can batch pin requests and actually get huge performance gains by doing batching and only sending batched updates to the rest of the cluster every few seconds, or when the batches get big enough, etc., etc. It's a matter of configuration, of finding the right balance between the usage that you give to the cluster and the performance that you want to get out of it.
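If my memory of the service.json layout is right, the batching knobs live under the crdt consensus section; a sketch with made-up values, worth double-checking against the configuration reference:

```json
{
  "consensus": {
    "crdt": {
      "batching": {
        "max_batch_size": 500,
        "max_batch_age": "5s"
      }
    }
  }
}
```

The idea is that a batch is committed and broadcast when it collects max_batch_size pins or becomes max_batch_age old, whichever happens first; leaving both at their zero values keeps batching disabled.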
Therefore, the size of the pinset will be heavily influenced by the machine performance, by the Badger datastore backend that cluster uses, by how it's configured and how it optimizes the available RAM usage. Our biggest cluster has 100 million pins in it. It is made of 25 peers, and each of those peers is taking its portion of the 100 million pins, making sure that it's actually pinned, retrying those pins that are not getting through, etc.
In fact, the Badger datastore that stores those pins, and that provides the backend for the CRDT structures that are used to coordinate the syncing of the pinset, does not have 100 million keys in it but 300 to 350 million keys. But in general, the size of this datastore is only about 0.3 percent of the actual size of the content stored on IPFS.
So even if this grows big (I think this datastore is now, all together, about three terabytes in that cluster, across all the peers), that's only a small portion of the 940 terabytes that the cluster is storing, which is where the 0.3 percent figure comes from. So yeah, these figures that I'm giving you are not made up.
They come from actual cluster setups, which means that in your own deployments you can expect, at the very least, this type of performance, in these ballparks, and if your deployments have lighter requirements than this, you can expect very reliable operation with cluster. IPFS Cluster has received lots of improvements in the last year to get where it is. But of course, we always need to note that the heavy lifting in these machines is done by IPFS.
So IPFS, Kubo in this case, is the real resource hog on any of these machines, in the sense that the work of retrieving the content, the work of writing it to disk, the work of announcing and providing the content, and of maintaining the bitswap sessions to all the other peers, are all things that fall on IPFS's shoulders. And again, it will depend on usage: depending on how many people are requesting that content and how many things you are pinning at the same time, the machines will behave differently.
So, a good property of this CRDT synchronization used by cluster peers is that cluster peers are always fully operative regardless of the state of the other peers. This is very different from other distributed key-value stores, like Raft-based ones, where replica and peerset management is a special operation. In IPFS Cluster, any new peer just needs to be given the multiaddress of some other peer in the cluster, and as soon as it connects to it, it will discover the cluster and be fully operative. You don't need to do anything else.
You don't need to do a special operation, you don't need to run a cluster command to add the new peer; this all happens automatically. So it's actually very easy to add new peers to the cluster, in the sense that you just start the peer and off you go: you can use that peer from that moment on to write and store new content.
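Bootstrapping a new peer looks roughly like this (the --bootstrap flag is how I remember it; the multiaddress below is a placeholder, with 9096 being, I believe, the default cluster swarm port):

```sh
# On the new machine: create an identity and configuration...
ipfs-cluster-service init

# ...then start the daemon pointing it at any existing cluster peer.
# It discovers the rest of the cluster and syncs the pinset by itself.
ipfs-cluster-service daemon --bootstrap \
  /ip4/192.0.2.10/tcp/9096/p2p/<existing-peer-id>
```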
Obviously, you can send more pins to a cluster than it can actually pin at once. That means that these pins will be queued, because we don't send all the pins to IPFS directly as they come. There's a configurable number of concurrent pins that you can set, so that you're never going to be pinning more than, for example, 10 things at a time. Everything that cannot be pinning at that moment goes into a pinning queue, and cluster peers use those pinning queues to signal when a peer is overwhelmed by pins.
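Those knobs sit, as far as I recall, in the stateless pin tracker section of the service configuration; key names and defaults are worth verifying, and the queue bound here is just chosen to match the roughly 25,000 cap mentioned later:

```json
{
  "pin_tracker": {
    "stateless": {
      "concurrent_pins": 10,
      "max_pin_queue_size": 25000
    }
  }
}
```

concurrent_pins caps how many pin requests are sent to Kubo simultaneously, while max_pin_queue_size bounds how many pins may wait in the queue behind them.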
So, for example, when pinning, the cluster can ensure that your pins are replicated in different regions, so that a pin is replicated in, say, three different regions, and within those regions it will choose the peers that have the lowest pinning queues: the peers that are not overwhelmed by the things that are pinning, in the sense of falling behind and not managing to pin everything that they should be pinning.
And of those that are not overwhelmed, it will choose the ones that have the most free space. This ensures that the cluster capacity is used in a balanced fashion. So normally what you see is that everything converges in storage usage: the storage used in all the peers will tend to be the same, they will end up filling up at the same level, and you will get your pins distributed in your cluster in a very balanced fashion; and if not, it usually equalizes over time.
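This selection behavior is driven by the balanced allocator, configured, if I remember the shape of it correctly, with something like the following; "region" is a made-up tag name that each peer would advertise via the tags informer, and the pinqueue metric assumes the pin-queue informer is enabled:

```json
{
  "allocator": {
    "balanced": {
      "allocate_by": ["tag:region", "pinqueue", "freespace"]
    }
  }
}
```

Read left to right: partition candidate peers by region tag, prefer the ones with shorter pinning queues, and among those pick the ones with the most free space.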
I think the graph here shows a pinning queue. You see it doesn't go very high, because it's configured not to go over 25,000 or so, and the moment it gets over that, cluster will stop sending pins to that specific peer, so that the peer can actually work through its queue, and the queue just doesn't grow indefinitely.
You can scrape the rates of pinning, you can scrape how many things are queued, and this, together with the metrics that the Kubo daemons export themselves, can give you very good insight into the state of the cluster: whether the cluster can pin everything that you send to it, whether the cluster is having a lot of errors when pinning and things are not getting through, how long it is taking to pin something, which peers are faster, how well synced the peers are in terms of the cluster pinset, etc., etc.
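Cluster can expose these metrics in Prometheus format when enabled in the observations section of the configuration; a sketch, where the endpoint address is the default I remember:

```json
{
  "observations": {
    "metrics": {
      "enabled": true,
      "prometheus_endpoint": "/ip4/127.0.0.1/tcp/8888"
    }
  }
}
```

Point your Prometheus scraper at that endpoint and combine it with Kubo's own metrics to get the full picture described above.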
That's it, that's all I have to say, but I don't want to leave without pointing you to the documentation. The documentation goes much more in depth than I can do here.
It explains, for example, exactly what the configuration options are in Kubo and in cluster when you're deploying at scale, which settings you have to touch and which settings you want to adjust, and I hope you can go home with a better idea of how IPFS Cluster operates.
There are some features in IPFS Cluster that I haven't talked about, particularly collaborative clusters that use IPFS Cluster followers. I'm happy, if you reach out to me, to discuss them later; we can talk about it. And coming up there's a presentation about the IPFS operator, which is how we're making cluster deployments really painless and automatically optimized by doing them on Kubernetes, having your full cluster essentially come up from nothing very easily.
Also, I told you that the pinset synchronization layer in cluster peers is powered by Merkle-CRDTs, and I'll be diving into how they work in 40 minutes in the IPFS 201: App Design Patterns and Developer Tools track. So if you want to come and see me there, I will be very glad to have you as well. Thank you very much.