From YouTube: Ceph Developer Summit Quincy: RADOS Follow-up
A: All right, I guess we'll give people a couple of minutes to trickle in. While we're waiting, welcome, everyone, to the follow-up session for CDS RADOS, which happened a couple of weeks ago.
A: So in the interest of time, what we've done is that we've already created some Trello cards for the items that are in the etherpad, and there are also some items that have been there since Pacific which just spilled over to Quincy.
A: Does that sound good to everybody? Any other ideas?
A: Cool, perfect. Okay, I think I'm just going to start. First of all, I've shared the etherpad in the chat and also the backlog Trello. I can share my screen so it's easier to follow what I'm talking about.
A: Cool, so yeah. This is the RADOS section, the first item and the second item. The first one was about dashboard and RADOS, and there are detailed things that we want to do there, but I don't think there are any particular RADOS Trello cards required.
A: Those that are required will be made by the dashboard team, and we'll figure out what needs to be done there, so I'll skip that. Next, the crash telemetry panel stuff: lots of improvements are mentioned in the etherpad, nothing in particular to add. I mean, we have one card here about mapping device errors and OSD crashes back to devices. That's a RADOS card that's been there, but I don't think there's anything specific that we want to create there either.
A: So moving on to the next item, which is about BlueStore split cache improvements. We did not go over this in detail in the last CDM session because of the availability of some of the BlueStore folks, but there's a detailed document about how we want to make these improvements, and we'll probably go over it in a performance call or something. For now we have added it here; the link to the etherpad is already there.
A: We have a bunch of improvements that we have mentioned here already, things like "improve progress module scalability", which is just about the progress module. Similarly, for insights we have a separate card, and the entire etherpad that we have is linked here.
A: In general, I feel we have cards that capture some of the improvements that are mentioned in this, but I guess the idea is that some of the cards that are already there are more like short-term things that we are aiming to get done in Quincy. As we start marking things as done, we will be able to make more cards that we can probably target for Quincy, or maybe even later. That was the general approach for this.
A: The other thing, which I think is going to come up next, is about the common manager pool. There is a pull request from Patrick that already addresses this.
A: The idea is to be able to use the common SQLite database. So currently the plan is to be able to migrate the device health metrics pool.
A: In future, when insights starts using a RADOS pool to store its data, we can also use the same pool, .mgr, which is what we decided to name it. So yeah, go ahead, Sage.
B: I was gonna say we could just rename that pool and reuse it instead of creating a new one.
A: All right, next there is this item about autoscaler improvements. Here also, work has already started happening. There is a PR from Junior that already came in to start off pool creation with one PG if the autoscale mode is on, but I guess we also have items here in general for the autoscaler.
A: Like one is this: we decided that we will have separate autoscaler profiles, and users can just select which profile they want based on what kind of workloads they are running or what the state of the cluster is. So this card addresses that bit, and yeah, this is the main thing. This is what we attempted to address in Pacific; there are some rough edges, so it's still marked for Quincy.
A: Good, okay. So the next item here is about cluster log messages going through Paxos, which is bad. Here I think there are a couple of things that we discussed.
A: One was about controlling the trimming rate in the monitors in a more adaptive fashion, versus just statically choosing a value, which is what we were doing earlier. That PR has already merged.
A: So that addresses one part of the problem, but I guess the other main issue was that we decided that we do not want to store all these slow request messages in the cluster log, and instead probably just direct them to the manager log and not store everything, keeping only, you know, some n number of updates. So this addresses that problem.
A: It will be both OSD and manager cluster logging, which will now go into the manager log, and we just limit the number of messages going in.
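(The limiting being described is essentially a bounded buffer of recent entries. A minimal illustrative sketch of that idea follows; the class and cap name are hypothetical and this is not the actual Ceph LogMonitor/mgr code.)

```cpp
#include <deque>
#include <string>

// Purely illustrative: keep only the most recent N cluster log entries,
// dropping the oldest once the cap is exceeded. Not actual Ceph code.
class RecentClusterLog {
  std::deque<std::string> entries;
  size_t max_entries;  // hypothetical cap on retained entries
public:
  explicit RecentClusterLog(size_t max) : max_entries(max) {}

  void add(const std::string& entry) {
    entries.push_back(entry);
    while (entries.size() > max_entries) {
      entries.pop_front();  // discard the oldest entry
    }
  }

  const std::deque<std::string>& recent() const { return entries; }
};
```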
A: Those are basically the three main things that we can address in Quincy. There are also things about storing data in a dumb way, which we should fix; that will need a refactor of the log monitor.
A: It is under RADOS; it's hidden somewhere. So this one was a good idea, this needs to be...
A: The next one on the list is structured config files, and we've already merged a PR from Kefu that implements this, which is awesome. So hopefully we can get the rest of the items he has here as part of Quincy. We've also put it in here: auto-generated docs.
A: That brings us to this last item here, automated auth key rotation. We had a detailed discussion about this. Unfortunately, from the source where this requirement came in, there is a lack of clarity on what is actually needed, so we hope to get more clarity from them, maybe in a month or so. If we do get clarity on that, it would be a nice-to-have feature in Quincy, but I don't have any more details at the moment. We have a card for it, though.
A: So the first item here is the second part of the PG removal optimization. We've got a PR for it, and Igor is already working on it, so hopefully we can get that done.
A: Then we've got a few items related to QoS. This is something we did not discuss in particular at CDS, but we have a lot of clarity around what we want to do and what is done. There is snap trimming, scrub and PG deletion that we want to address next, as some of the background activities that mClock will also prioritize. If we get these done, I think this covers everything, and we won't require any of that manual sleep stuff that we have for all of these background operations. Snap trimming I don't think we've started work on, but PG deletion and scrub we have ongoing work on.
D: Yeah, so we have identified the PG deletion and the scrub tuning parts, for which we've come up with test cases using CBT itself. So once we test using that, we can go ahead and fine-tune as we find things.
A: Perfect. And I would like to also mention that the aim with Quincy is to default to the mClock scheduler. I already have a PR out for that. There are some testing issues that came up, so we might need to adjust some tests and such, but if everything goes well, we will go ahead with that change. So Quincy will default to the mClock scheduler versus WPQ, which is the default at the moment.
A: Then, apart from the background activity stuff, we also have automating the baseline measurements. Currently the mClock scheduler requires us to feed in some data or some information about cluster performance or throughput, and that's a manual step at this moment. We have some default values, but those may not hold for different kinds of clusters.
A: So the idea is to be able to automate this step, so that users can just run something, or this can also be done as a background activity by the manager, and we automatically figure out what those values should be instead of having a manual step involved. That, I believe, is going to be cool, and I think that's the only thing holding us back from making everything work out of the box at the moment.
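(To make concrete what "feeding in throughput information" means here: the scheduler needs a per-OSD capacity estimate, which the chosen profile then carves up between client and background work. The sketch below is purely illustrative; the struct, the fractions and the idea of deriving them directly from a benchmark number are assumptions for this example, not Ceph's actual options or code.)

```cpp
#include <cstdio>

// Hypothetical illustration of turning a measured per-OSD IOPS baseline into
// mClock-style allocations. Not the actual Ceph implementation.
struct MclockAllocations {
  double client_reservation;
  double background_reservation;
  double client_limit;
};

MclockAllocations allocate_from_baseline(double measured_max_iops) {
  // Example split loosely modelled on a "prioritize client I/O" style profile:
  // most capacity reserved for clients, a smaller slice for background work.
  MclockAllocations a;
  a.client_reservation     = measured_max_iops * 0.60;
  a.background_reservation = measured_max_iops * 0.20;
  a.client_limit           = measured_max_iops;  // clients may burst to full capacity
  return a;
}

int main() {
  // In the automated scheme discussed above, this baseline would come from a
  // benchmark run by the OSD or manager rather than being entered by hand.
  double measured_max_iops = 3000.0;  // hypothetical measurement
  MclockAllocations a = allocate_from_baseline(measured_max_iops);
  std::printf("client res %.0f, background res %.0f, client limit %.0f IOPS\n",
              a.client_reservation, a.background_reservation, a.client_limit);
  return 0;
}
```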
A: And yeah, the last item about QoS: all of this that we talked about earlier was the background activity stuff, but we also want to do QoS client versus client, and that is going to be kind of the final piece to make sure we have complete QoS in Ceph. Anything else we want to talk about, Sridhar?
D: Yeah, I think that covers the high-level stuff that we want to do.
B: Do we need something that covers sort of the user-facing part of this, like how users either set QoS policies or how they can monitor them?
A: Yeah, I guess we could have that. I don't think there is much integration with the dashboard, but in terms of how users can use it, at the moment we have detailed documentation, both developer documentation and user-facing documentation, from Sridhar, one part of which has merged.
A: The other part is still under review, so we have the docs side of it, but it'd be really nice to have it as part of the dashboard, and that's one thing I discussed with the dashboard team at CDS. Currently, mClock just has some profiles, and the default profile is going to prioritize client I/O, but it would be nice to give users the ability to change that profile from the dashboard itself.
A: So the next item is just a cleanup item, but I think it's again useful. The idea is that we have a whole bunch of asserts in the code which are multi-condition asserts. From a debugging point of view, it's very difficult to understand, it's not obvious, let's just say, which condition of the assert was hit. So there is some extra work that developers need to do to be able to identify which assert was hit or what the bug is about.
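(The cleanup being described is easiest to see side by side. The following is a minimal sketch with made-up conditions, using the standard assert macro so it is self-contained; it is not taken from the Ceph tree.)

```cpp
#include <cassert>

// Illustrative only; the variable names are invented for this example.
void check_state(int num_objects, int num_replicas, bool peered) {
  // Before: a multi-condition assert. If it fires, the failure only tells you
  // the whole expression was false, not which of the three conditions failed.
  assert(num_objects >= 0 && num_replicas > 0 && peered);

  // After: one condition per assert, so the failing line pinpoints the culprit.
  assert(num_objects >= 0);
  assert(num_replicas > 0);
  assert(peered);
}

int main() {
  check_state(10, 3, true);
  return 0;
}
```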
A: I guess it's just... yeah, go ahead.

C: I wonder if it's worth considering integrating the data that we have from the crashes, now that we know that we do have I/O errors from devices. I haven't looked into that too deeply, but maybe we can use this information as well.
B: It might depend on where we show it. I'm not sure if we've put much thought into that. My first guess would be that the device ls screen that lists your devices should show something like an error count or error history, I don't know, and it could show crashes too if we wanted to do that, but crashes don't map as cleanly to a device as they do to a daemon. So that may or may not make sense.
B: But I think, yeah, having I/O errors by device, and also by pool, like if we can link them to a PG and not just a device, that might also be helpful.
A: I remember one of the initial ideas was that we already have this metric, num_shards_repaired. Just exposing this at the manager level will be useful to begin with.
A: Okay, you should see that somewhere; you should probably see this, it could be there. Yeah, I'll take a look to see what these values are, and whether these values are even reflecting any of the repairs.
A: There are a whole lot of other items, some marked "small", which would be nice for somebody who wants to get introduced to RADOS, or even Ceph, and start picking things up, but we don't really have a target in mind for them. So I'm not going to go through all of those items; I'm going to move on to BlueStore, monitor and manager.
A: So for BlueStore we discussed the split cache improvements. There are a couple of others. One is about making asserts unique per return value: currently, when we assert in some places in BlueStore, we just assert, but it's not clear why we asserted, as in, the return value is not clear, whether it was out of space or some sort of EIO. Making them explicit again helps with debuggability in general. This is also marked "small" for BlueStore.
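(A minimal sketch of what "unique per return value" means in practice; the function and the error codes chosen are illustrative, not actual BlueStore code.)

```cpp
#include <cassert>
#include <cerrno>

// Illustrative only; not actual BlueStore code.
void handle_write_result(int r) {
  // Before: a single catch-all assert. When it fires you know r was nonzero,
  // but the crash alone does not tell you whether it was ENOSPC, EIO, or
  // something else entirely.
  // assert(r == 0);

  // After: distinguish the interesting error codes explicitly, so each failure
  // mode trips its own assert (and therefore its own line in the backtrace).
  assert(r != -ENOSPC);  // ran out of space on the device
  assert(r != -EIO);     // underlying I/O error reported by the device
  assert(r == 0);        // any other unexpected failure
}

int main() {
  handle_write_result(0);
  return 0;
}
```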
A: And then the final item for BlueStore; this has been there for a bit. I think there were several other improvements that we were making in Pacific, because of which the cache age binning PR from Mark had to wait. But at this point I believe all of that stuff has already merged, yep.
F: Okay, we can do this now, I think, based on the stuff that Adam wrapped up and based on some of the other stuff that we've merged; it should be able to go in now.
A: Yeah, and then one of the main blockers was the RocksDB column family sharding. That's, you know, gone in with Pacific, so it's pretty stable, and we can start working on this again.
A: Then there's this item about the severity of degradedness in ceph health. Danny already started working on this at some point, but I think he got distracted by some of the QoS work he was focusing on.
A: I guess let's go to the RFE. The idea is to be able to display PGs; the example here is for EC, k plus m, minus one and so on for all quantities, in ceph health.
A: Yeah, essentially: at what point do you, you know, make PGs inactive? I think it's just about knowing how many of them are in what state, the breakup of how bad it is or how much time a PG would need to recover, and making that more obvious to the user. Currently it's just the overall degradedness of the cluster that we show, so that's where this comes from.
A: Cool. And this item actually spilled over from Pacific; I'm not sure what the status of this one is, again.
A: All right then, we are already running over, so I'm just going to go through the manager ones real quick. We talked, okay, there's a separate card for this, perfect, that you added, Sage. Okay, so this one, can we say it's merged now, or not?
B: We'd have to check. I actually thought we might have already included this, but I'm not sure. This is basically just to include which multi-site RGW features are being used in telemetry. We should just refresh.
C: Yeah, I think we collect just how many, like we count them, but not more than that. And it goes together with the other card: we just need to find a good way of collecting new data from telemetry.
A: All right, then the next card here is about capturing exceptions and crashes for the manager, similar to what we do for C++ crashes. I think this is going to be another one that doesn't need any more discussion here. Next, we've got this item, which has also spilled over from Pacific; the implementation was not that trivial. The idea is to be able to use the actual OSD utilization for balancing PGs with the upmap balancer, versus what we use currently, which is the number of PGs per OSD. There's a lot of discussion we've done around this, and we all think it's a good idea; getting it in is what's left.
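(The contrast being drawn is between balancing on PG counts, which assumes every PG holds about the same amount of data, and balancing on how full each OSD actually is. The sketch below is only an illustration of that idea; the struct and the data are invented, and this is not the upmap balancer's actual algorithm.)

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Illustrative sketch: pick migration direction from measured utilization
// rather than PG counts. Not actual Ceph balancer code.
struct OsdInfo {
  int id;
  double bytes_used;
  double bytes_total;
  double utilization() const { return bytes_used / bytes_total; }
};

int main() {
  std::vector<OsdInfo> osds = {
      {0, 600e9, 1000e9}, {1, 420e9, 1000e9}, {2, 510e9, 1000e9}};

  // A utilization-based balancer would move PGs from the fullest OSD toward
  // the emptiest one, even if their PG counts were already equal.
  auto [emptiest, fullest] = std::minmax_element(
      osds.begin(), osds.end(),
      [](const OsdInfo& a, const OsdInfo& b) {
        return a.utilization() < b.utilization();
      });
  std::printf("move data from osd.%d (%.0f%% full) toward osd.%d (%.0f%% full)\n",
              fullest->id, fullest->utilization() * 100,
              emptiest->id, emptiest->utilization() * 100);
  return 0;
}
```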
A: This is another thing we all agree on, yeah. The next one is also something that fits in as a short-term improvement to manager scalability; it also fits in with some of the trimming work we did on making trimming more dynamic. This manager stats period is again a static value, and the idea is to be able to change it dynamically based on what the ingest rate is for the manager. Brad has been working on this.
A: And then there's this broader card about manager performance and scalability; we talked about the telemetry shared data page. I see you added it.
A: That's all. There is one more small card about displaying detailed output for enabled modules, but yeah, I think that's another beginner card.
A: I'll get to this; there's a pull request for it.
B: I would put this in the whole bucket of discussion about how we need to look at the Objecter implementation itself, as well as the messenger and reactor frameworks. I'm not sure, this came up a little bit during CDS, but it's not clear... it's likely that there's more work that needs to be done for the Objecter itself to actually run in the reactor framework. Right now it uses the Seastar function calls, but it still uses that same threaded architecture.
E: Okay, yeah.
A: Okay, that's noted, yeah. I think that's it; anything else that I'm missing?
A: Cool, thanks everyone for joining, and have a great day. See you later, thanks!