Description
Download Presentation:
https://www.snia.org/sites/default/files/SDC/2017/presentations/Containers/Kerns_Daniel_Advancing_Clustered_Storage_Architecture_with_Kubernetes.pdf.pdf
Abstract:
Internally, storage systems are actually complex distributed systems, and implementers spend significant resources managing the cluster. Now that we have maturing container orchestration systems, much of that work can be delegated to the orchestrator, allowing storage developers to concentrate on storage. In this presentation, we will talk about our experiences building Rook.io on top of Kubernetes and demonstrate how container orchestration changes perspectives on storage cluster implementation.
I'm Dan Kerns. I run the Rook project, and I also work for Quantum. What I want to do today is talk to you about how, by leveraging an environment like Kubernetes, you can orchestrate very complicated storage systems and save yourself a lot of work, and consequently spend more time building storage systems than building distributed computing systems. I'm going to give you a demo of Rook running on Kubernetes, and I'm going to talk about some of the architecture we use inside of Rook to support Ceph on Kubernetes.
OK — so, I think everybody understands (I'm going to fall off the stage) — I think everybody understands that distributed storage systems are really complicated distributed systems. There's all kinds of stuff going on: there can be thousands of instances of different process types, and there can be thousands of nodes. They're actually very complicated systems, and storage developers spend a lot of time working on the lifecycle of the various processes and daemons that support the storage system. It's a common cause of failure.
It's not just about the data structures that represent the storage; it's about keeping the system alive and healthy, and that's where Kubernetes comes in and makes an enormous difference. In a Ceph system there are probably ten different services — maybe more than ten — and thousands of instances of those services in the storage cluster. You have to keep track of them all, you have to manage their lifecycle, and you have to do whatever you need to do to keep them all healthy. I don't know how much people know about Kubernetes.
Kubernetes is one of several container orchestration systems. The goals are to drive up utilization in cluster computing systems and to move to a more declarative model, rather than a procedural model, for managing the various applications that run in the system. It has all kinds of features — there's lots of goodness — and it seems to be picking up steam in the marketplace. At Quantum we're firm believers in Kubernetes, and so we looked at how we could use it.
The thing that really jumps out about Kubernetes is the declarative nature of what you do. You generally don't write code to tell Kubernetes what to do; you basically write a spec: this is what should happen. "I should have three instances of this application running in my cluster," and Kubernetes goes and makes sure that constraint holds at all times. There will always be three instances, so if one of them dies for whatever reason, a new instance is created.
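As a sketch of what such a declarative spec looks like — a hypothetical Deployment, not one from the talk (the name and image are made up for illustration):

```yaml
# Minimal Deployment: declare the desired state (3 replicas) and
# Kubernetes keeps that constraint true, replacing any replica that dies.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # hypothetical name
spec:
  replicas: 3               # "I should have three instances running"
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo-app
        image: nginx:1.25   # any stateless image works for this sketch
```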
Likewise, you can say: I want enough instances that the response time of my application is such-and-such, and Kubernetes will do the scale-up and scale-down of that application for you. There's also a bunch of support for deploying applications to nodes. If you've ever built a system like this, deployment seems like the simplest thing in the world — except that it's not, right? There's actually a lot of trouble that goes into deploying applications in distributed systems. Kubernetes actually makes that simple, and likewise updates.
When it comes time to update an application, you generally don't want to take your service offline; you generally want some sort of online update plan, but of course those are hard to build. Again, Kubernetes has platform support for doing those kinds of online, or rolling, updates — there are several different strategies. And of course, in order to do all this, it's monitoring the health of the system, so there's a great deal of information available about the health of the various application instances running in the system.
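Those rolling-update strategies are declared on the workload spec itself; a hedged sketch of the relevant fragment (the numbers here are illustrative, not recommendations):

```yaml
# Fragment of a Deployment spec: roll pods over gradually instead of
# taking the whole service offline during an update.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod down at a time during the update
      maxSurge: 1         # at most one extra pod created during the update
```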
I sort of mentioned the declarative thing. It supports lots of different service models — I talked about n instances and load-balanced instances, but there are all kinds of different service models — and there's this concept of a thing called a pod, which is a collection of services that work together. So you can put a bunch of services in a pod — say, microservices — and they will be scheduled together. It has an extensible control plane, and this is one of the biggest things.
That's a big benefit for us. In the current version of Kubernetes these things are called CRDs; you define CRDs and can interact with the control plane through them. And it provides complete application lifecycle support, which is a great deal for a storage cluster, because a storage cluster has to run forever.
Now, let's look at storage in Kubernetes. If each one of these vertical things is a node in Kubernetes, you may have some number of applications running on these nodes, in whatever layout is necessary for that application.
What that means in Kubernetes-speak is that if this application is connected to some storage out here, and that application happens to be rescheduled onto this other node, the volume attach points will be moved to that other node seamlessly, and the application can continue running. So if it dies here and is scheduled there, or if it's rescheduled multiple times, Kubernetes will handle that transition for you, and that's actually pretty cool. In Rook — Rook is an open-source project that we sponsor at Quantum, and we have lots of contributors.
A
You
should
go
to
github
too,
to
look
to
check
it
out.
Our
idea
is
gee
what
if
we
actually
brought
the
storage
system
into
the
cluster?
What
if
we
ran
the
storage
processes
inside
the
kubernetes
cluster?
This
gives
you
a
lot
of
flexibility.
First,
it
means
you
don't
need
to
have
that
external,
a
separate
storage
box
that
you
bought
from
somebody
else.
A
You
can
use
disks
attached
to
the
various
nodes
in
the
cluster,
but
it
also
means
that
the
applications
that
make
up
the
storage
system
can
benefit
from
all
this
goodness
that
I've
been
talking
about
kubernetes
I.
Think
that's
one
of
the
the
major
benefits,
so
we
orchestrate
south
in
ruk
ruk
is
an
exclusively
set.
Will
do
other
storage
backends
over
time.
That
stuff
is
interesting
for
tons
and
tons
of
reasons
and
there's
been
quite
a
number
of
Ceph
Doc's
at
this
conference.
All of those apply here. We have a pod called the Rook operator, and you can think of it as a system operator or something like that — you could think of it as the operator pattern, if you're familiar with that. The idea is to manage the storage cluster, and so there's a bunch of code in the Rook operator that manages the lifetime of the cluster.
So you can build a Kubernetes cluster where the applications and the storage all run together in the cluster — and that has certain benefits — or, if you're building a standalone storage device, you can just not have any applications running except the storage applications. It gives you quite a bit of flexibility in how you end up deploying the storage. So our architecture looks a little bit like this: the same picture as before, except the Rook processes are running throughout the cluster.
What we get from Kubernetes, on the other hand, is support for all the things in this box, and some of those things are pretty hairy — like the upgrade and the deployment I mentioned, but also things like security and the scheduling of these processes. We're using Kubernetes security for controlling access to the cluster in this case, and so we're hiding the Ceph level: Ceph has its own security system, and we're hiding that, because a Kubernetes user already has to deal with Kubernetes security — why would they want to deal with the Ceph stuff too?
We have the Rook agent, which is a Rook component that runs on every node in the cluster and handles persistent volume mounting, and the Rook API, which presents a RESTful API if you're not using the Kubernetes CRD extensions. I'm going to go into each of these in turn and talk about how they're managed a little bit differently in the Rook cluster.
So first off: the operator, the Ceph manager, and the Rook API are pretty much single-instance — you generally run one instance of each of these.
A
You
may
run
two
instance
or
two
or
more
for
high
availability,
but
it's
a
it's
a
master/slave
relationship.
It's
a
you'll
fail
over
to
the
other
node.
If
you
need
to
so
they're
very
easy
to
manage
in
the
SEF
environment.
As
a
cluster
comes
up,
we
bring
up
the
rook
operator
and
then
we
support
multiple
stuff
clusters
on
a
kubernetes
cluster.
So
then,
you'll
specifically
after
you
bring
up
the
operator,
you'll
specifically
say
I'd
like
to
create
a
soft
cluster
named.
The mons control the Ceph cluster in a very, very real way. There have to be at least two running at all times; in our environment we only support three, because we never saw a need for more than three, but you can have any odd number of three or more, and the monitors maintain a quorum.
Each mon maintains a copy of the state of the cluster, and there's a quorum election underneath that has to run, so it's really important that we run exactly three of these and that their names are really well known. And one thing that happens in Kubernetes is that pods can get rescheduled over time: a node fails, or a pod dies or crashes.
What we actually chose to do is this: if mon three dies and gets rescheduled on another node, we're using Kubernetes service IP addresses, so that the IP address remains consistent regardless of where in the cluster the mon comes up. That turned out to be really, really productive for us. Everything in a Kubernetes cluster has to be able to fail, and that includes the mons, and so we spent a great deal of time here understanding failure in the cluster and understanding how we recover from failure in the cluster.
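The stable-IP trick for the mons can be sketched with a plain Kubernetes Service: clients address the Service's cluster IP, which stays fixed no matter which node the mon pod lands on. The names, labels, and port below are illustrative, not Rook's actual manifests:

```yaml
# One Service per mon: the ClusterIP survives pod rescheduling,
# so the mon's address stays consistent across failures.
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mon3   # hypothetical name for "mon three"
spec:
  selector:
    mon: mon3            # matches a label assumed to be on that mon's pod
  ports:
  - name: msgr
    port: 6790           # illustrative Ceph monitor port
```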
It's important — it's really surprising to people who have used Ceph before that there are no ceph.conf files here. They do exist under the covers, but nobody using Rook ever sees these files; they're generated on the fly by Rook and put in the right places in the pod, so the Ceph daemons can run unmodified.
They take a long time to start up, and so there are parameters you can use to control how often they're restarted, what the transition point is, and so on. We also have the opportunity to do SSL termination at the Kubernetes load balancers if we want, instead of terminating farther inside the Ceph system, and that actually simplifies a bunch of the deployment and configuration of Ceph.
So what do we get from Kubernetes? Again, this lifecycle support is a really big deal — if you've built a distributed system, processes just die, right, and you don't want to wait to find out — and we get this lifecycle support at the node, pod, and process level. We have automatic failover of services: Kubernetes remaps IP addresses and makes failures look fairly invisible. We get support from Kubernetes directly for upgrades and rolling upgrades.
It has support for secrets built in — basically, think of them as encrypted things that you don't want other people to see. We use Kubernetes secrets to store the Ceph secrets: the Ceph keys for authentication, encryption, and that sort of stuff are all stored in Kubernetes secrets. So again, we're leveraging the Kubernetes authentication and access control system to manage the Ceph secrets; in the demo you won't see any of them. And it has an extensible API.
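A Kubernetes Secret holding a Ceph key might look like this sketch — the name, the data key, and the base64 payload are all made-up placeholders:

```yaml
# Opaque Secret: authorized pods can mount it, but access is controlled
# by Kubernetes RBAC rather than by Ceph itself.
apiVersion: v1
kind: Secret
metadata:
  name: rook-ceph-admin-key   # hypothetical name
type: Opaque
data:
  keyring: UExBQ0VIT0xERVI=   # base64-encoded Ceph keyring (placeholder value)
```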
It just runs watch over and over in that window, and it makes it easy to see what's going on in the cluster. So again, we can see there are three nodes running; currently there are no pods being scheduled in the cluster, no persistent volume claims, and no services. So I go and create the Rook operator — and I'll show you my cheat sheet here; I definitely have to cut and paste Kubernetes commands — and the Rook operator has been created, so this YAML file has essentially been installed in Kubernetes.
At that point, this Rook operator pod over here has been fired up, and that's all that's happened, right? It's just a process running somewhere, and it's not really doing anything — we haven't created a storage cluster yet, but that is what the Rook operator does. So we go over here and we create a storage cluster, and I'll show you what that looks like.
Basically, any Kubernetes selector here could be used to identify the nodes or the devices. In particular, I'm going to use the BlueStore storage backend with a couple of parameters, and that's the whole cluster spec.
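A cluster spec along those lines can be sketched as below; the field names follow early Rook releases from around this era and may differ from the exact file shown in the talk or from current Rook versions:

```yaml
# Custom resource handed to the Rook operator: "create a Ceph cluster,
# use every node and device you find, and format OSDs with BlueStore."
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook                  # the cluster's name
  namespace: rook
spec:
  dataDirHostPath: /var/lib/rook
  storage:
    useAllNodes: true         # any node selector could go here instead
    useAllDevices: true
    storeConfig:
      storeType: bluestore    # the BlueStore backend mentioned above
```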
Now, in the background here, we've installed Ceph — right, an entire Ceph installation — and the monitors have just come up. There are three mons running on the three different nodes; they have an anti-affinity, so they never run on the same node.
In a couple of seconds here, the OSDs and the Ceph manager will come up, if we're lucky... there it is. So there are three OSD pods running, one on each of the three nodes, and each of those OSD pods is just running one OSD, because I only have one drive on these VMs; if there had been ten drives on the VM, they'd be running ten OSDs. So now an entire Ceph cluster is up, and I'm going to show you how it interoperates with Kubernetes.
What's really cool now, because it's Kubernetes, is that I can go fire up applications and use the Ceph storage that's been created. This is just a standard WordPress demo application, and I'll show you the file in a minute. But in the background here you can see that a WordPress and a MySQL app have come online; the MySQL app is connecting to that replicated pool we created, and MySQL is starting up. So let's have a look at that.
This is just very traditional Kubernetes YAML: we're going to create a service called WordPress that's going to have an endpoint; we're going to create a persistent volume claim called mysql-pv-claim and map that to the Rook storage cluster; we're going to fire up a MySQL instance — that's what this is — telling it to use that persistent volume claim; and then we're going to move forward and fire up a WordPress instance. And so, if we're lucky, we can actually come over and find out what port it's running on.
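The persistent volume claim in that WordPress demo looks roughly like this; the StorageClass name is whatever the Rook block pool was registered as, and `rook-block` here is an assumption drawn from that era's examples rather than from the talk itself:

```yaml
# The claim MySQL mounts for its data directory. Kubernetes binds it to
# a Rook-provisioned volume via the named StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: rook-block  # assumed name of the Rook-backed StorageClass
  accessModes:
  - ReadWriteOnce               # one node mounts it read-write at a time
  resources:
    requests:
      storage: 20Gi             # illustrative size
```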
The MySQL and WordPress descriptions are really very vanilla. Now, if you're building a storage appliance — if you're not interested in building a hyper-converged system, if you're just building a storage appliance — realize that running a storage system is a lot more than just running Ceph, right? You have a bunch of other applications running on the cluster; in fact, you have databases and other sorts of metadata services running in the cluster, too, to run the support applications for the storage cluster.
A
And
the
cool
thing
you
can
do
once
this
finally
starts
up
is
you
can
tell
that
my
sequel
instance
that
you
know
gee
I
want
you
to
reschedule
another
note
if
we
can
just
kill
it
or
something
like
that,
and
it
will
automatically
kubernetes
will
pick
another
note
for
it
to
run
on
reconnect
the
persistent
volume
claim
to
that
instance.
It'll
still
have
all
of
its
data,
and
the
application
WordPress
in
this
case
will
continue
working.
A
I
literally,
can
never
in
a
bra
cute
milkman.
What
does
it
say
here.