From YouTube: Keynote: What's Planned for Ceph Octopus - Sage Weil, Co-Creator, Chief Architect Ceph
Description
Keynote: What's Planned for Ceph Octopus - Sage Weil, Co-Creator, Chief Architect & Ceph Project Leader, Red Hat
About Sage Weil
Red Hat
Ceph Project Leader
Madison, WI, USA
Sage helped build the initial prototype of Ceph at the University of California, Santa Cruz as part of his graduate thesis. Since then he has led the open source project with the goal of bringing a reliable, robust, scalable, and high performance storage system to the free software community.
Thanks everyone, welcome back to day 2. I hope everyone is having a great time. I know I've been really enjoying this conference and I'm looking forward to talking to more of you today. So I thought we would take a few minutes and talk a little bit about what is coming in the next Ceph release, Octopus. Yesterday I talked about the priorities that we set a year ago and what we did for Nautilus, so I wanted to give everyone a glimpse of the priorities we're thinking about for Octopus.
This is a lovely picture of an octopus that I found through Google Images. I mentioned yesterday that we used to think of the Ceph priorities in terms of components, the ones we picked out a year ago, but upon reflection, and after a discussion on Saturday with some developers, we revised that thinking to group things into five categories, or five themes. It's somewhat all-inclusive, in the sense that this is everything we feel like we should be doing.
I'm not sure I would describe them as priorities; I'd rather call them themes, because all of these are important and we really need to be looking at all of them. But I'm going to go through them and give you a glimpse of some of the things that we're talking about for Octopus. These aren't necessarily guarantees that they're going to be in the next release, because we have to revise our planning as we go, but this is really what we're thinking about.
I'm going to start with usability: making Ceph easier to use, easier to manage, easier to consume, and easier to operate at scale. I think the biggest piece in this category is the orchestrator API that I mentioned before. We really want to improve the cluster's ability to reach out and talk to either Rook or some sort of bare-metal orchestration tool, currently based on SSH. That's the goal; how far exactly we get remains to be seen, because then it goes on to NFS daemons and iSCSI gateways (those are actually in pretty good shape already), but we want to have some basic set of functionality all in place. Part of the goal here is to take the venerable, historical ceph-deploy tool and capture that same functionality. This paves the way to having a standard set of processes and documentation that are consistent for all users, whether you kick things off with Rook or use that bare-metal approach. We can have consistent documentation for replacing disks and so on.
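
The orchestrator work described above lives behind a common API in the Ceph manager. As a minimal sketch only, with invented class and method names (this is not the actual ceph-mgr orchestrator interface), the idea is that Rook, an SSH-based deployer, or anything else can sit behind the same small set of operations, which is what makes one consistent set of documented procedures possible:

    # Hypothetical sketch of an orchestrator backend abstraction, loosely
    # inspired by the ceph-mgr orchestrator idea in the talk. Names are invented.
    from abc import ABC, abstractmethod
    from typing import List

    class OrchestratorBackend(ABC):
        """Operations Ceph would ask any orchestrator (Rook, SSH, ...) to do."""

        @abstractmethod
        def list_hosts(self) -> List[str]: ...

        @abstractmethod
        def create_osd(self, host: str, device: str) -> None: ...

        @abstractmethod
        def restart_daemon(self, daemon: str) -> None: ...

    class SSHBackend(OrchestratorBackend):
        """Bare-metal backend: run commands over SSH on each host."""

        def list_hosts(self) -> List[str]:
            return ["node1", "node2", "node3"]      # would come from an inventory

        def create_osd(self, host: str, device: str) -> None:
            print(f"ssh {host} ceph-volume lvm create --data {device}")

        def restart_daemon(self, daemon: str) -> None:
            print(f"ssh ... systemctl restart ceph-{daemon}")

    def replace_disk(backend: OrchestratorBackend, host: str, device: str) -> None:
        # The same documented procedure works regardless of which backend is used.
        backend.create_osd(host, device)

    replace_disk(SSHBackend(), "node2", "/dev/sdb")

The point is that a procedure like "replace a disk" is documented once against the abstraction, no matter which backend carries it out.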
So I'm really looking forward to that. One of the things this unblocks is the possibility of finally automating upgrades. Every time we have a major release, there's something like a 15-item checklist: upgrade the daemons in this order, double check that this health state is set this way, run this command, and so on. The set of steps varies a little bit between each release, but we'd like to be able to automate this more. The challenge here is figuring out the division of responsibilities between the orchestration tool that is probably going to kick this process off and how much Ceph can manage on its own. So we're discussing how we can do that, so that the manager module can handle all of those internal Ceph dependencies and careful gating and sequencing and so forth, but still leverage the orchestration API to go through the automated OSD restarts and make sure your PGs re-peer and all of that, so the upgrade can proceed in a nice fashion. So this is coming and we're excited about it.
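
To make the gating idea concrete, here is a hedged sketch, not Octopus code, of what automating that checklist might look like: walk the daemon types in the usual documented order and only move on once the cluster reports itself healthy again. The "ceph health" and "ceph versions" commands used here exist today; restart_daemon() is a placeholder for whatever the orchestrator backend would actually do.

    # Hedged sketch of rolling-upgrade gating, not actual Octopus code.
    import json
    import subprocess
    import time

    UPGRADE_ORDER = ["mon", "mgr", "osd", "mds", "rgw"]   # usual documented order

    def cluster_healthy() -> bool:
        out = subprocess.run(["ceph", "health", "-f", "json"],
                             capture_output=True, text=True, check=True).stdout
        return json.loads(out).get("status") == "HEALTH_OK"

    def restart_daemon(name: str) -> None:
        print(f"restarting {name} on the new version ...")   # placeholder

    def rolling_upgrade(daemons_by_type: dict) -> None:
        for dtype in UPGRADE_ORDER:
            for daemon in daemons_by_type.get(dtype, []):
                restart_daemon(daemon)
                # Gate: wait for PGs to re-peer and the cluster to settle
                # before touching the next daemon.
                while not cluster_healthy():
                    time.sleep(10)
        # "ceph versions" can then confirm every daemon runs the new release.
        print(subprocess.run(["ceph", "versions"],
                             capture_output=True, text=True).stdout)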
One of the other items that is going to be worked on for Octopus in the usability category is taking the thing we did for rbd top, which lets you identify the top I/O users, and doing the same thing for CephFS. It's a small thing, but very useful for actual operators.
In the quality category, I think the biggest effort here is around the telemetry and crash reporting that I mentioned yesterday. That's new in Nautilus, so Nautilus now has the ability to phone this information home, once we can convince users that it's in their best interest to turn it on. But right now it's just getting dumped into a database, so we have to build all the backend tools that let us introspect, analyze, and identify trends, so that if we push out a Ceph release and people start seeing a particular Ceph crash, and those reports start trickling back into the database, we can actually notice and preemptively go figure out what's going on and fix the bug as quickly as possible. We also want to build the back-end infrastructure tools so that developers can browse through those data sets and identify whether a particular crash is happening on specific versions, or maybe it started at this point and stopped at that point, whatever the correlating factors are. We also want to make sure we're doing everything we can to get users to turn on the telemetry, and if users don't want to do it, figure out why. So we have to make sure that's an ongoing conversation and that we're very careful about not phoning home data that we shouldn't.
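
For reference, the pieces being described are the telemetry and crash modules that shipped in Nautilus. A small illustrative wrapper (the "ceph telemetry" and "ceph crash ls" commands are real; the script itself is just a sketch) that an operator could use to preview exactly what would be sent before opting in:

    # Illustrative wrapper around the Nautilus telemetry and crash modules.
    import subprocess

    def ceph(*args: str) -> str:
        return subprocess.run(["ceph", *args], capture_output=True,
                              text=True, check=True).stdout

    print(ceph("telemetry", "show"))   # preview the exact report before enabling
    ceph("telemetry", "on")            # opt in to phoning the report home
    print(ceph("crash", "ls"))         # crash reports collected by the crash module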
There's an effort that was kicked off a couple of months ago called DocUBetter, which is focused simply on making the Ceph documentation better. It's a team that meets every other week, a couple of different times over the course of the month, and they discuss all of the documentation infrastructure: the tooling on the back end that automatically generates the documentation on the website, and also where the content gaps are. We think the Ceph documentation is one of the key areas where we can invest to help grow the user community and help onboard developers and so on.
So if you know anybody who's interested, or you're interested in participating in that effort, definitely let us know. We're also continually looking at our automated test suite to make sure we're doing everything we can to ensure that Ceph is high quality. Individual component teams are meeting regularly to review what the test coverage is and to brainstorm ways we can expand it and write new tests for parts of the code that currently aren't being sufficiently tested.
We need to reinvest in that effort and make sure we're running those tests regularly. We've also discussed in the past, but haven't fully implemented, a test suite that does downgrade testing, so that within a major release, Nautilus for example, if you install a point release and it causes a problem, you'll be able to downgrade back to a previous stable version within the same series.
Performance. One of the main focus areas here is RADOS QoS. We've had a QoS infrastructure design that's been partially implemented in RADOS for quite a while now, and in fact there was a great talk yesterday from the folks from ZTE about their efforts around this and some of their pending changes. That's all great, but largely this effort has been blocked because it depends on carefully managing the queue depth in BlueStore. All the QoS prioritization happens at a higher level, and once you commit to doing that work, if you're feeding your prioritized commands into a deep queue with high latency, then it's pretty ineffective. The trick we're trying to solve is figuring out how to manage that queue depth in an automated, self-tuning fashion, so that it will automatically adapt to a slow hard disk or a very fast NVMe, given different workloads, different I/O sizes, and so on. So we're going to put some real effort in this cycle into addressing that problem, because that's going to unblock being able to deliver this as a general solution.
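
As an illustration of the self-tuning idea, and not of any actual BlueStore mechanism, a tiny controller that shrinks the allowed queue depth when measured device latency rises and probes upward when there is headroom might look like this (all numbers are arbitrary):

    # Illustrative sketch only: one way a backend could self-tune its queue
    # depth. This is not BlueStore code; the thresholds are made up.
    class QueueDepthTuner:
        def __init__(self, target_latency_ms: float = 5.0,
                     min_depth: int = 4, max_depth: int = 256):
            self.target = target_latency_ms
            self.min_depth = min_depth
            self.max_depth = max_depth
            self.depth = min_depth

        def observe(self, measured_latency_ms: float) -> int:
            """Feed in a latency sample; returns the new allowed queue depth."""
            if measured_latency_ms > self.target:
                # Too much queueing in the device: back off multiplicatively so
                # the higher-level QoS scheduler regains control over ordering.
                self.depth = max(self.min_depth, self.depth // 2)
            else:
                # Latency headroom: probe upward additively to keep the device busy.
                self.depth = min(self.max_depth, self.depth + 1)
            return self.depth

    tuner = QueueDepthTuner()
    for sample in [2.0, 2.5, 3.0, 9.0, 4.0]:      # pretend latency samples in ms
        print(tuner.observe(sample))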
There's also ongoing work on BlueStore in general to improve performance. There are two efforts here: one is around sharding the RocksDB that's used for metadata internal to BlueStore, so that the effects of compaction are less impactful and the space utilization is more efficient; the other is looking at a RocksDB fork, currently a fork called TRocksDB, that essentially separates the key portion of the data and the value portion of the data into different I/O streams to improve compaction behavior. Initial testing has shown that the combination of these two changes has had a significant impact on BlueStore performance, so we're excited about both of them.
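
To illustrate the key/value separation concept (a sketch of the general idea only, not of TRocksDB or RocksDB internals): if large values live in an append-only stream and the sorted structure holds only keys and small pointers, compaction has far less data to rewrite.

    # Conceptual sketch of key/value separation into different streams.
    # Not TRocksDB or RocksDB code.
    class SeparatedStore:
        def __init__(self):
            self.value_log = []   # append-only stream of (key, value) records
            self.index = {}       # "sorted" key structure holds only small pointers

        def put(self, key: str, value: bytes) -> None:
            self.value_log.append((key, value))          # large data: sequential append
            self.index[key] = len(self.value_log) - 1    # small data: key -> offset

        def get(self, key: str) -> bytes:
            return self.value_log[self.index[key]][1]

        def compact_index(self) -> int:
            # Compaction only rewrites the small index entries, not the values,
            # which is where the reduced write amplification comes from.
            return sum(len(k) for k in self.index)       # bytes rewritten (keys only)

    s = SeparatedStore()
    s.put("object_1", b"x" * 4096)
    print(s.get("object_1")[:4], s.compact_index())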
In the CephFS space, there's an investment in making create and unlink operations function asynchronously. In CephFS workloads, the latency tends to be dominated by the fact that you have a round trip to the MDS for each create or unlink, so being able to do those asynchronously can unblock things like untar and rm -rf and make them go much faster. It's complicated, but the team is working through it and we're very excited about making this sort of leap forward in the protocol.
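
A rough back-of-the-envelope example of why those round trips dominate (the numbers are made up purely for illustration):

    # Back-of-the-envelope illustration (made-up numbers): per-file MDS round
    # trips dominate metadata-heavy workloads like untar, and overlapping
    # (asynchronous) creates removes most of that wait.
    files = 100_000          # files in the archive
    rtt_ms = 1.0             # one MDS round trip per create, in milliseconds

    synchronous = files * rtt_ms / 1000            # every create waits for the MDS
    asynchronous = files * rtt_ms / 1000 / 64      # e.g. ~64 creates kept in flight

    print(f"synchronous creates:  ~{synchronous:.0f} s of pure round-trip latency")
    print(f"64-way overlapped:    ~{asynchronous:.0f} s")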
And finally, there were several sessions yesterday about the Crimson effort, and that effort is continuing. The focus initially is getting an end-to-end implementation so that we can test the full I/O path, observe how well it's doing and how it behaves, and validate a lot of our initial assumptions about how the software should be designed and put together. Based on that, we can figure out what the next steps are.
To remind everyone: RGW has multi-site federation and replication that's managed at a site granularity, but there are a couple of key things we're doing to revise the way RGW and its multi-site support are structured, to do the next iteration, v3, of this. The first is bucket-granularity control of those multi-zone, multi-site federated replication relationships. One is support for a sort of pass-through storage, so that when you put into a bucket, it's actually written through to S3 or to Azure or something like that; that is, you can use RGW as a protocol translator to give you a consistent API endpoint across different topologies and different clouds. One is bi-directional replication of a bucket to an external object store, so you can have an RGW bucket and an S3 bucket that are mirrored, active-active, writable, and replicating in both directions. And the final capability is having individual objects within a bucket be able to tier out to external storage, expanding on the current lifecycle policy that allows you to tier within a Ceph cluster, so you can also tier out to something like Glacier or its moral equivalent. And finally: we have multi-site capabilities in Ceph today in RGW and in RBD's async mirroring capability.
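
For context on the protocol-translator point: RGW already speaks the S3 API, so the same client code can target an RGW endpoint or a public cloud just by changing the endpoint URL. A minimal sketch using boto3, with a placeholder endpoint and credentials:

    # The same S3 client code works against RGW or AWS S3, which is what makes
    # the pass-through / protocol-translator idea attractive. The endpoint URL
    # and credentials below are placeholders, not real values.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://rgw.example.com:8080",   # point at RGW instead of AWS
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    s3.create_bucket(Bucket="demo")
    s3.put_object(Bucket="demo", Key="hello.txt", Body=b"hello from rgw")
    print(s3.get_object(Bucket="demo", Key="hello.txt")["Body"].read())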
We need to finally take a look at what we're going to do in the CephFS space. Initially this is probably going to take the form of a snapshot-and-periodic-sync type of capability for disaster recovery, but we're currently brainstorming ideas about how we could do more online, bi-directional replication within the filesystem and how to address those use cases.
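
A hedged sketch of what a snapshot-and-periodic-sync approach could look like from the outside (CephFS snapshots really are taken by creating a directory under .snap; the destination, paths, and interval here are placeholders, and real tooling would need incremental transfer, retention, and error handling):

    # Hedged sketch of the "snapshot and periodic sync" disaster-recovery idea.
    import os
    import subprocess
    import time
    from datetime import datetime, timezone

    SRC = "/mnt/cephfs/projects"            # locally mounted CephFS directory
    DEST = "backup-site:/srv/dr/projects"   # placeholder remote rsync target

    def sync_once() -> None:
        snap = datetime.now(timezone.utc).strftime("dr-%Y%m%dT%H%M%S")
        os.mkdir(f"{SRC}/.snap/{snap}")                  # take a CephFS snapshot
        subprocess.run(["rsync", "-a", "--delete",
                        f"{SRC}/.snap/{snap}/", DEST], check=True)

    while True:
        sync_once()
        time.sleep(3600)   # "periodic": once an hour in this sketch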
The last piece, the ecosystem space, is probably an unsurprising story: we're continuing to invest in our integrations with, and relationships with, Kubernetes and Rook. We want to make Ceph the obvious storage choice for container infrastructure. OpenStack, of course, is a huge ecosystem that we're already well integrated with and heavily invested in, and we want to continue to make those users happy. Analytics is an emerging use case we're looking at that's seeing a lot of traction: data lakes, big data, AI/ML analytics backed by RGW. And we'll keep our eyes out for new ecosystems that are growth opportunities and places where Ceph can really shine.
I mentioned yesterday that we're thinking about changing our release cadence from 9 months to 12 months. We actually tweeted out a poll; the results are maybe not decisive, but leaning towards the 12-month cadence. So this is going to be an ongoing discussion point for the community. If you have an opinion here, let us know; you'll see some threads on the list and so forth. But the net of it would be that instead of upgrading every 18 months, you could upgrade every 24 months, every two years, at the limit.
I just want to say a few words about how you can get involved in these efforts. Ceph is an open community, open source project; the more people that help us, the more we can do. We use all the usual free and open source tools and processes. We have the ceph-devel and ceph-users mailing lists; you can go to the website and sign up for those if you're not on them already. We're on IRC all the time, so you can talk to us there.
One of the easiest ways to get involved as a developer is to go on GitHub, look at pull requests, and help review code. One of the hardest things for the core development team to do is ensure they're setting aside time to review new pull requests, but it's also one of the most important things we should be doing in order to bring new developers into the community.
So it's a good place to focus. And of course, just opening tickets, opening bugs, and commenting on existing bugs if you're seeing them in your environment is extremely helpful, so we know how to prioritize issues and what we should be fixing. The documentation, as I mentioned, is a priority and a focus, to make it as good and as helpful as possible and to make it easy for users to on-ramp.
There's now a new link in the upper right of any documentation page that goes directly to GitHub to let you open a pull request to edit the documentation. So if you see inaccuracies or typos or whatever, it's super easy to make those changes and propose them, and I encourage you to do that; it's a good way to contribute. We have lots and lots of meetings. We use video chat because the Ceph development team is distributed all over the world, and there's a public community calendar.
We're going to send this link, this URL, out somewhere else so that you don't need to copy it down, but there's a public calendar that has all of our stand-up meetings and all of our weekly meetings on various topics. All these meetings are open. Some of them are focused towards users and are very easy for people to join and discuss things.
Others are the daily stand-ups for developers, so developers can join and ask about the pull requests they've opened, ask about bugs, and so on, but you're welcome to drop in on any of them and talk to people. On YouTube, we have a Ceph channel that has a ton of video content. All of the talks here at Cephalocon are being recorded and will go on this channel; all the talks from last year's Cephalocon in Beijing are there, along with our weekly meetings.
Most of our weekly meetings are recorded and available there. We also have code walkthroughs on lots of different Ceph components, and several of those walkthroughs are targeted specifically at new contributors: how to get your development environment set up, how to write your first patch, how the Ceph code is organized and how to approach it. So I definitely encourage you to look at this as a good resource.
And finally, we're in the process of revising the getting-started and getting-involved pages on the Ceph website so that they have links to all of these different resources, and we'll make sure those are easy to find. Lastly, Cephalocon is only once a year, but we have Ceph Days all over the world in various geographies, so if you're trying to connect with your local Ceph community, this is a great way. The next one is going to be in the Netherlands, and there's one planned at CERN in September.
There's going to be one in London and one in Poland. And if you haven't had a Ceph Day in your area and you'd like to organize one, the hardest part is usually finding a venue that can hold 100-ish people; then just talk to us and we'll help you set it up. It's not actually that difficult, and we would love to continue this successful program in new areas. So thank you very much.