From YouTube: Ceph Days NYC: State of the Cephalopod
Description
Presented by: Neha Ojha & Josh Durgin
In this talk, we'll provide an update on the state of the Ceph upstream project, recent development efforts, current priorities, and community initiatives. We will share details of features released across components in the latest Ceph release, Quincy, and explain how this release is different from previous Ceph releases. The talk will also provide a sneak peek into features being planned for the next Ceph release, Reef.
So today I wanted to talk a bit about the state of the Ceph project: what's been going on recently in the latest releases, what we're planning for the future, and some of what's happening overall. It's been several big years of growth as a community.
In general, we've changed to a new model for our technical governance. Instead of having a single person in charge, as we've had for most of the life of the project, we've moved to a council model, as Mike mentioned, with shared leadership among the leads of the different components of Ceph. The leads elect the council every couple of years, and the council is responsible for making sure everything happens and for taking care of the community and the Foundation as well.
A few major focus areas for us in general have been quality, performance and scalability, and usability. We've been doing lots more testing at large scale, trying to detect things before users hit them, and improving our processes, like creating RC releases that users can test out at large scale before the final release hits the ground.
We've also been focusing much more on performance testing and performance improvements in Ceph. A big part of that is the Crimson project, which we'll talk a bit more about later, but it also means trying to run at much larger scales, on more varied hardware, on a more continuous basis, and with more realistic workloads.
Finally, we're also looking to learn more about how Ceph is used and to get more continuous feedback from the community. Part of this is the telemetry effort, which we'll talk more about later. This is all opt-in data reporting that you can share anonymously: how big a cluster is, how many OSDs it has, these sorts of information. Right now we have somewhere north of 800 petabytes of Ceph clusters reporting, so there's a massive number of users out there, and you can inspect some of the data that's out there in the public dashboards, which we'll talk more about later. For the developers there are some private dashboards where we can go and see, for example, if we just did a release, how many crashes are happening, which clusters it's affecting, whether it's only a one-off incident or something widespread, that sort of thing.
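For reference, enabling and inspecting telemetry is a couple of commands. A minimal sketch, assuming the `ceph` CLI and an admin keyring are available on the host; the report itself is printed as JSON, so you can review exactly what would be shared before opting in:

```python
import json
import subprocess

def ceph(*args):
    """Run a ceph CLI command and return its stdout as text."""
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout

# Preview the report that would be sent; the output is JSON.
report = json.loads(ceph("telemetry", "show"))
print("top-level sections:", list(report.keys()))

# Opt in; the license flag acknowledges the data-sharing terms.
ceph("telemetry", "on", "--license", "sharing-1-0")
```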
So in terms of releases, we're midway between Quincy and Reef. Originally our plan was to release Reef in March, but due to some lab outages this year we've pushed that back to June this time.
As before, we're continuing with the yearly release process: doing a major release each year and deprecating releases after three years, so once Reef is out, Pacific will become deprecated.
We're also trying to do more scale testing in our upstream lab, which is not as large: we can't run thousands of OSDs and use them at full capacity, but we can run thousands of OSDs with a much more limited amount of resources and space. So we're developing some methods to actually test out the things you see at scale without needing the full scale of data.
A
Yeah
finally,
there's
also
more
focus
on
performance
as
well
and
the
performance
it
can.
It
can
be
a
very
interesting
thing
to
to
test.
There
are
many
different
aspects
of
this
one
of
them.
A
lot
of
the
things
you
can
test
for
performance
actually
end
up
showing
up
even
at
smaller
scale,
so
we
have
a
smaller
much
smaller
scale,
a
set
of
high
performance
nodes
in
our
RF
stream
and
CPL
lab,
and
we're
always
improving
and
and
looking
for
more
help
with
the
bunny
tests.
Currently I think we have, what is it, something like 16 or...
Right, so we've got a few different clusters with a mixture of Intel and AMD processors and a mixture of fast NVMe devices from different manufacturers, and we're continuing to try to improve that testing and maybe even make it into more of a continuous process rather than a kind of one-off manual test effort.
Within the RADOS layer, which is the base layer for everything in Ceph, there's been a lot of focus on stability and reliability. One of the major things there is quality of service: being able to maintain the performance of the different clients while different things are happening inside the cluster, like lots of scrubbing going on to check for data integrity, or lots of recovery from a failure.
A
If
you
want
to
keep
your
clients
happy
and
keep
the
latency
and
throughput
going
as
you
expect
so
in
Quincy-
and
this
was
made
in
the
default,
we
only
have
a
new
scheduler
called
m-clock
which
implements
this
quality
of
service
for
background
operations.
We
made
further
environments
to
it,
they're
coming
up
in
brief
and
right
now.
It's
only
for
background
operations
compared
to
clients,
but
the
in
the
future
we're
going
to
extend
it
to
be
supporting
different
classes
of
clients,
so
different
clients
can
have
different
reservations
for
amount
of
iops.
A
They
want
that
sort
of
thing.
A
Let's
try
to
make
this
to
the
easy
to
use,
because
stuff
has
many
many
options
for
tuning
these
kinds
of
things
today.
So
with
with
the
m
clock
scheduler,
instead
of
setting
tens
of
options
to
figure
out
how
fast
you
want
recovery
to
go,
you
just
choose
a
single
profile.
If
you
want
recovery
to
go
faster,
do
you
want
clients
to
go
faster,
or
do
you
want
something
in
the
middle?
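As a concrete illustration, switching profiles is a single config option. A minimal sketch, assuming the `ceph` CLI is available; the profile names are the ones mClock ships with (balanced, high_client_ops, high_recovery_ops, custom):

```python
import subprocess

def ceph(*args):
    """Run a ceph CLI command and return its stdout."""
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout

# Check which mClock profile the OSDs are currently using.
print(ceph("config", "get", "osd", "osd_mclock_profile").strip())

# Favor recovery/backfill over client traffic during a large rebuild...
ceph("config", "set", "osd", "osd_mclock_profile", "high_recovery_ops")

# ...or favor client I/O, or go back to the balanced middle ground.
ceph("config", "set", "osd", "osd_mclock_profile", "balanced")
```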
It's always possible to go deeper and configure things at a much lower level than that, but we're trying to keep things simple to manage and maintain. So if you don't need to, you don't have to carry around all those nitty-gritty details of thousands of configuration options.
This also applies to some of the other improvements we've made in RADOS, in terms of how we report health errors and how we report slow operations. We're trying to make it easier for folks to diagnose performance and stability issues in the cluster sooner, and you'll see more improvements there in the future as distributed tracing gets more embedded in the different protocols of Ceph. Reef will be the first release where that's present and easily deployable.
Also coming up in Reef, there are lots more improvements to BlueStore for performance purposes, and also to address different kinds of performance problems that can crop up, like fragmentation. This has been a major cause of issues in BlueStore in the past, and there are some efforts there on improving that.
In Reef, this also includes balancing across primaries, because when Ceph is doing reads and writes, reads always go to the primary OSD. So if you have a skew in how many OSDs are primary for a certain number of objects, you're going to see hot spots. Now the balancer will take that into account as well, so you get more even performance.
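To see the kind of skew this addresses, you can count how many placement groups each OSD is the acting primary for. A rough sketch, with the caveat that the exact JSON layout of `ceph pg dump` varies a bit between releases:

```python
import collections
import json
import subprocess

# Dump PG stats as JSON; depending on the Ceph version the PG list is either
# the top-level array or nested under a "pg_stats" key.
raw = json.loads(subprocess.run(
    ["ceph", "pg", "dump", "pgs_brief", "--format", "json"],
    check=True, capture_output=True, text=True).stdout)
pgs = raw.get("pg_stats", raw) if isinstance(raw, dict) else raw

# Count how many PGs each OSD is the acting primary for; a heavy skew here
# means some OSDs are serving a disproportionate share of reads.
primaries = collections.Counter(pg["acting_primary"] for pg in pgs)
for osd, count in primaries.most_common():
    print(f"osd.{osd}: primary for {count} PGs")
```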
One of the major focus areas for Ceph and performance is the Crimson project, which is a re-implementation of the OSD in a new shared-nothing architecture, meaning it's designed to avoid any kind of cross-CPU or cross-thread communication, so data can go at maximum speed from the wire to the disk.
A
This
is
in
the
experimental
stage
right
now,
where
it
supports
the
basic
operations
of
RBD,
but
it's
still
undergoing
stabilization
and
has
a
long
way
to
go
in
terms
of
optimizations
before
being
fully
production,
ready
and
being
able
to
be
used
for
all
workloads.
In Reef we have some initial support for multiple reactors and for snapshots, but we're continuing to build out the test suite there and to improve the functionality, to make it fully usable for RBD in the next release, probably. A big aspect of that as well is SeaStore, which is a new object store backend in the OSD designed specifically for high-performance devices like NVMe disks and other sorts of very fast media.
So, as I mentioned earlier, we have this telemetry system, which is all opt-in reporting, and we're trying to make it easier to use and easier to update, so we've introduced a number of changes around how we manage the data there. I think Yaarit is going to talk about this in more detail later, so I'm going to skip through it pretty fast, but essentially we're gathering metadata about clusters, crashes that happen, and also information about drives that fail, so we can feed that back into models that predict failures before they occur.
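The crash reports telemetry sends are the same ones the cluster's crash module collects locally, so you can review them yourself before anything is shared. A small sketch, assuming the `ceph` CLI is available:

```python
import json
import subprocess

def ceph_json(*args):
    """Run a ceph CLI command and parse its JSON output."""
    out = subprocess.run(["ceph", *args, "--format", "json"],
                         check=True, capture_output=True, text=True).stdout
    return json.loads(out)

# List crashes the crash module has collected but not yet archived.
for crash in ceph_json("crash", "ls-new"):
    print(crash["crash_id"], crash.get("entity_name", ""))

# Full details (stack trace, daemon version) for one report:
# ceph_json("crash", "info", "<crash_id>")
```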
A
So
in
Quincy,
it's
much
easier
to
manage
things
like
expanding
the
cluster,
adding
new
host,
removing
them,
and
it's
gained
support
for
a
number
of
different
protocols
like
monitoring
protocols
and
adding
proxies
to
be
able
to
make
things
h.
Reef is expanding that support across different functionality, in RBD and RGW especially, and trying to make things easier to monitor and easier to diagnose by integrating centralized logging as well, so you can have all your logs go to a standard Elasticsearch-based platform to inspect and sift through.
cephadm is the current way to deploy Ceph clusters, and it's designed to be as easy to use as possible. It deploys everything via containers and systemd, but manages it all through the Ceph orchestration commands, which are common across cephadm and Rook, the deployment method on top of Kubernetes. cephadm itself has gained support for a number of different things in Quincy, and in general we're trying to make it easier to use and easier to deploy in the flexible ways that people would like.
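For context, a typical cephadm workflow looks like the sketch below; the hostnames, IP addresses and placement counts are placeholders, and the same `ceph orch` commands apply whether cephadm or Rook is the orchestrator backend:

```python
import subprocess

def run(*cmd):
    """Run a command on the admin host and echo it for clarity."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Bootstrap a one-node cluster (the IP address is a placeholder).
run("cephadm", "bootstrap", "--mon-ip", "10.0.0.1")

# Everything after bootstrap goes through the orchestrator interface.
run("ceph", "orch", "host", "add", "node2", "10.0.0.2")        # grow the cluster
run("ceph", "orch", "apply", "osd", "--all-available-devices") # deploy OSDs
run("ceph", "orch", "apply", "mds", "cephfs", "--placement=2") # add MDS daemons
```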
So in Reef, we're also making it simpler to upgrade more piecemeal, so that you don't have to upgrade the whole thing at once; that was a little bit more difficult in the past. We're making it simpler to set up multi-site deployments for the RADOS Gateway and for mirroring with RBD, and we've also added support for automatically rotating the authentication keys for daemons, which keeps your clusters more secure.
Rook is the main way to deploy Ceph on top of Kubernetes, and Rook is always catching up with and keeping up with new features in Ceph and new features in Kubernetes. It now supports much easier ways to troubleshoot a Ceph cluster; I believe there's even a new project, a little bit separate from Rook, to gather some of these troubleshooting commands together in one place, to make it simpler to diagnose and inspect a cluster deployed in a Kubernetes environment.
Currently, work is ongoing around the NVMe-oF gateway, which Jonas is going to talk about after me, I believe. In the future I think that will be a large and important protocol for Ceph, because it's becoming a very widely supported standard and a very easy way to connect to all kinds of operating systems and hosts.
A
There's
also
some
performance
improvements
within
RBD
at
the
RVD
itself,
support
for
a
caching
Daemon
that
does
right
back,
contributed
by
Intel
and
designed
to
take
advantage
of
very
fast
local
storage
if
you
want
to
have
rights
going
through
a
single
discount
locally
before
being
pushed
back
to
the
main
cluster
and
finally,
we're
focusing
a
lot
in
RBD
on
the
multi-site
capabilities
there
asynchronous
mirroring,
namely
between
multiple
clusters.
Interesting research for the future is ongoing from a group at Northeastern University into a log-structured format for RBD, which has the potential to drastically improve performance for random I/O in particular, but it has some restrictions on what workloads it applies to. I think it's a very interesting effort for the future.
In the RADOS Gateway, the S3 and object storage interface for Ceph, there are a number of ongoing efforts. A lot of these are around supporting AI and analytics types of workloads, like S3 Select, being able to run SQL queries against your objects. This is a particularly interesting project because it's able to run standalone as well, so you can easily develop your queries against a local file and then run them against your cluster.
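As an illustration of what an S3 Select call looks like from a client, here is a sketch using boto3 against an RGW endpoint; the endpoint, credentials, bucket, object and column names are all placeholders:

```python
import boto3

# Point boto3 at the RGW S3 endpoint (all values below are placeholders).
s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8000",
                  aws_access_key_id="ACCESS", aws_secret_access_key="SECRET")

# Run a SQL query server-side against a CSV object.
resp = s3.select_object_content(
    Bucket="analytics",
    Key="trips.csv",
    ExpressionType="SQL",
    Expression="SELECT s.passenger_count, s.fare FROM S3Object s "
               "WHERE CAST(s.fare AS FLOAT) > 100",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"CSV": {}},
)

# The result comes back as an event stream of record chunks.
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode(), end="")
```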
Another big aspect for the RADOS Gateway is multi-site support. It has supported different kinds of multi-site replication for a long time now, but there are some performance issues that we're looking to address in Quincy and Reef, namely around parallelizing more of the synchronization work among multiple gateways and balancing that load across all the gateways.
There's also ongoing work towards being able to debug and understand what's going on, namely with Jaeger and OpenTelemetry tracing, which is now quite easily deployable via cephadm, and I believe via Rook pretty soon.
This will let you debug where operations are getting stuck or where performance is slowing down, basically at each step of the way. So if something is being synced from site A to site B, you can see: is it getting stuck, is it slowing down at site A or at site B, or which part of it is going wrong, if you have a problem.
There's also a lot of work going into the RADOS Gateway's Zipper backend, which is a kind of pluggable backend that lets you put different things behind RGW for different purposes. If you wanted to run it in a very constrained environment, for example, you could back it with a simple file or a simple database.
Within CephFS, there's a lot of work on the cephfs-top utility, again for introspecting what's going on within your system from a performance perspective, so you can very easily see what clients are doing, and if there's an issue, go in and look more deeply at it.
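Getting cephfs-top running takes a couple of steps, since it reads per-client metrics from the manager's stats module. A small sketch, assuming the `ceph` CLI and the cephfs-top package are installed:

```python
import subprocess

def ceph(*args):
    """Run a ceph CLI command."""
    subprocess.run(["ceph", *args], check=True)

# cephfs-top reads per-client metrics from the mgr "stats" module.
ceph("mgr", "module", "enable", "stats")

# Create the client credentials cephfs-top uses by default.
ceph("auth", "get-or-create", "client.fstop",
     "mon", "allow r", "mds", "allow r", "osd", "allow r", "mgr", "allow r")

# Then launch the curses UI to watch per-client IOPS, latency and caps.
subprocess.run(["cephfs-top"])
```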
There's also work towards cloning and snapshotting support, similar to what we have for the block device, where you're able to snapshot your files and easily make copy-on-write clones of them.
The last thing I want to touch on is testing and quality in general; there are a few different aspects to this. One new effort that we're working to focus more on this year is the developer experience. If it's very difficult to run the tests for a system, it makes it much harder to add new tests and keep them maintained.
So if we can create a way to run these tests locally, instead of having to wait hours for builds and push them through a queue for days, if you can run them on your own laptop or your own machine, it becomes much easier for people to contribute and to maintain those tests.
In addition, an easier local developer environment can make it much easier and faster to develop, if we can base it on incremental builds instead of always building everything from scratch. That's something you're going to see more improvements on coming up.
As I mentioned earlier, there was a bit of a large outage in our lab that slowed down our release process this year, so we're making some improvements to our lab infrastructure to make it more resilient, and we're also looking at how we can run these kinds of tests in other environments in case there's a problem with one.
Another aspect of that is engaging with other organizations that have hardware they'd like to help the community with, and enabling them to contribute by running tests themselves, on RC candidates for example, or maybe by providing some backup hardware in case our Sepia lab has an issue and we need to keep things going to be able to merge code and get releases out the door. And finally, as I mentioned earlier, we're looking into measuring performance on a more continuous basis and monitoring it via a CI system.
Yeah, so the question was, for the QoS, will you be able to make a fine-grained decision for a particular image? And the answer is yes: the idea is that you'd be able to apply a policy for how many IOPS this image gets, or you could make it on a per-pool basis, or at several other granularities, but the underlying support would be there even at the image level.
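Separately from the OSD-side mClock work described here, librbd already offers client-side rate limiting per image or per pool today; a small sketch with placeholder pool and image names:

```python
import subprocess

def rbd(*args):
    """Run an rbd CLI command."""
    subprocess.run(["rbd", *args], check=True)

# Cap a single image at 500 IOPS (librbd throttles on the client side).
rbd("config", "image", "set", "mypool/myimage", "rbd_qos_iops_limit", "500")

# Or set a default limit for every image in a pool.
rbd("config", "pool", "set", "mypool", "rbd_qos_iops_limit", "1000")
```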