From YouTube: Ceph Performance Meeting 2022-09-22
Description
Join us weekly for the Ceph Performance meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contribute/
What is Ceph: https://ceph.io/en/discover/
A: Ronan, the deep scrub thing that you mentioned in standup: is that discussion happening on the mailing list, or was that private emails?
A: I have to apologize this morning. I was supposed to be in a two-hour meeting that got canceled at the last minute, but I've been busy with some actually exciting things that we'll talk about today with Adam, so I didn't go through PRs at all and I don't have any updates on that. For the last couple of months I've been kind of slacking on going through PRs, so I apologize for that.
A: No, I haven't updated the list, so I'll try to be better about that next week. Hopefully the trade-off is worth it: we've got some exciting things to talk about today, I think. Anyway, it looks like we've got people from Cornell, so this is good. All right, since I don't have any PRs to talk about, I'm going to dive right in.
A: We had a really good meeting last week where we talked about all of these issues surrounding snapshots: ideas to try to make them faster in the OSD, and, specifically for RBD mirror, ideas to maybe make RBD mirror faster without relying on snapshots. As a quick recap of last week, all of us had different ideas about things we could try to make this better.
A: And this week we started working on them.
A: I ended up working on defragmenting objects prior to snapshot, with the hope that I could periodically reduce the number of extents and the number of shared blobs by doing a copy, and that ended up being kind of interesting. I'm going to talk about it before we get to Adam's stuff, because Adam's approach ends up being so amazingly good that if I presented mine afterwards it would just seem really unimpressive. So I'll start out, and then Adam, I'll hand it over to you after that.
A: Okay, I'm going to share my screen here. If you can see this, the gist of it is what happened when I started implementing the defragmentation. This is a really stupid and simple implementation: if we see that the number of extents has exceeded some threshold, then we read the object and rewrite it out, basically, in a defragmented way.
A: That's all it's doing. The DT value that you see in this graph is the threshold, and DJ is jitter. Basically, I added some random, Monte Carlo-like aspect to this, so that we don't get a stampede of object rewrites all at once; hopefully it spreads them out a little bit more. You can see in this first graph that it is effectively reducing the CPU usage.
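As a rough illustration of the threshold-plus-jitter decision described here (a sketch only: the names `threshold` and `jitter` mirror the DT and DJ labels on the graph, and the real BlueStore code differs in detail):

```python
import random

def should_defragment(num_extents, threshold, jitter, rng=random.random):
    """Return True if the object should be read back and rewritten
    contiguously (defragmented) before the next snapshot.

    `jitter` in [0, 1) randomizes the effective threshold, a Monte
    Carlo-style spread, so that many objects crossing the threshold at
    the same time do not all get rewritten at once.
    """
    if num_extents <= threshold:
        return False
    # Scale the threshold up by a random fraction of `jitter`, so only
    # some of the eligible objects are rewritten on any given pass.
    effective = threshold * (1.0 + jitter * rng())
    return num_extents > effective
```

With jitter at zero every object past the threshold is rewritten immediately; raising it trades a slightly higher extent count for a smoother rewrite load.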
A: The baseline in this test is maybe around one core, and we're dropping from about two and a half cores down to closer to one and a half cores, so it definitely helps the CPU. One of the big concerns with an approach like this is that when we copy the object and sever it from the snapshots, we use more space on the disk; with a very low threshold, it's quite a bit more.
A: In these tests we have a five-gigabyte volume and up to five snapshots; once we hit steady state it's always five. So if you were just copying whole objects, it'd be 30 gigabytes of disk usage.
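The arithmetic behind that worst case, assuming every snapshot severed and fully copied the volume:

```python
# Head image plus five snapshots at steady state, each a full copy
# of the five-gigabyte volume.
volume_gb = 5
snapshots = 5
worst_case_gb = volume_gb * (1 + snapshots)
print(worst_case_gb)  # 30
```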
A: There's a little bit of fluctuation as we go along here, but it tends to even out over time, and we're a little bit higher. It's not too bad, though, so that's not really a problem; maybe a little bit, but it's not a major space amplification. The bigger concern is write amplification: when we copy objects, we end up invoking a pretty big workload on the disks.
A: When we add jitter in, we can smooth that out somewhat. On this yellow line you can see that when we take a snapshot, at least temporarily, we're using about 100 megabytes per second of disk IO to copy objects. Not ideal, but if we were able to further smooth that out over time, we could probably drop it down by making it more of a constant background workload. There's some possibility there, so this does help.
A: If we look at the client-level IOPS, both read and write, you can see that the blue is our default case, and in the red, yellow, and green cases we don't see as many stalls. This is on NVMe; on HDD I suspect all of this looks a little bit different, but at least as far as the code goes, this does appear to be helping. There's some value to it.
A: I think if we want to continue pursuing something like this, it'd be better to transition it to some kind of background process where we very slowly go over and look at onodes that are heavily fragmented and see if we can make them better. That might still provide some advantage even with Adam's code, but it would need to be a very slow process, I think, much slower than even this 0.8 case I've got here.
A: We'd want it to have as minimal a disk impact as possible. So that's what I did over the last week.
B: The idea was that instead of having each shared blob track its own allocations together with its other users, which is what requires converting a standard blob into a shared blob, we would just try it: how well would it work if we had, for each object, one tracker that tracks the data for everything?
B: One tracker for all the allocations of that object and of all the clones and snapshots of that object. So in that sense, after we start making snapshots of an object, we just have a class of objects joined by a common tracker.
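A minimal sketch of that common-tracker idea, purely for illustration (the class and method names here are hypothetical; the actual change is C++ inside BlueStore and differs in detail):

```python
from collections import defaultdict

class FamilyTracker:
    """One tracker joining an object and all clones/snapshots made from
    it, replacing per-blob shared-blob reference counting."""

    def __init__(self):
        # extent key (e.g. disk offset, length) -> set of member names
        self.refs = defaultdict(set)

    def reference(self, member, extent):
        """Record that `member` (head or a clone) uses `extent`."""
        self.refs[extent].add(member)

    def clone(self, src, dst):
        """A new clone shares every extent of its source; no per-blob
        conversion to a 'shared blob' is needed."""
        for members in self.refs.values():
            if src in members:
                members.add(dst)

    def release(self, member, extent):
        """Drop a reference; returns True when the extent's last
        reference is gone and the space can be freed on disk."""
        members = self.refs.get(extent)
        if members is None:
            return False
        members.discard(member)
        if not members:
            del self.refs[extent]
            return True
        return False
```

The point of the design is that all bookkeeping for an object family lives in one place, so cloning and releasing become operations on a single structure instead of touching many shared blobs.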
B: The expectation was that it would be somewhat good and would require some further improvements, but the concept was that we could move the load from shared blobs to some CPU-intensive actions that would be contained in just one place, which we could then improve. But it turned out even better than expected.
B: In the meantime, my branch seems to work. The only problem now is that fsck is basically asserting; the logic is still okay, but the tests are getting wild because they do find unexpected results. So I cannot do full, proper, safe object tests, but from what I can ascertain, the snapshot logic, tracking properly and keeping different data in different snapshots, is working properly. So now there's only the matter of Mark presenting the comparable results with his setup.
A: So the CPU usage is really, really low, much lower than it was in my PR.
A: In fact, it's kind of amazing. I don't think either Adam or I could believe just how good these results are. It appears that basically all of the usage we're seeing in the default case and in main is primarily due to shared blob reads, which is pretty crazy. There's no space amplification whatsoever, and as expected for write amplification, we don't see any additional write load on the disk.
A: Interestingly, though, that doesn't appear to be correlated with the client write drops. Maybe a little bit: we kind of see something here, at roughly the 8000-second mark, that could be correlated with the block device write drops we saw. But we don't see it in the red case with Adam's shared tracker, where there was a drop in the block device write throughput yet no correlated client write drop.
A: So it's still a little bit of a mystery what's going on there, but the good news is that, similar to my case, Adam's case seems to be doing a very good job of reducing client write IO throughput drops. I will have to see what it looks like in the HDD case, but I don't think it's going to make anything worse.
A: I suspect it will only make things better, so it'll be very interesting to see what Paul's tests show. On the read side, Adam and I were puzzling over this this morning: for some reason, in the default case, we're exceeding the fio cap on read IOPS. Actually, we do that on the write side too. It should be capped at 500, and yet we see it bounce above that, and the same in the read case with Adam's change.
A: It still goes above the cap, but it's bounded at some lower limit, which neither of us quite understands. Maybe it's some artifact of how fio does its capping, but in any event, again we see that in the default case we drop below the cap more often than with Adam's changes. So it all looks very, very good.
A: I'm still amazed by how much of an improvement Adam's changes are making here, and they also make some of these data structures simpler by getting rid of shared blobs, which I'm just incredibly excited about. So yeah, that was my read of some of the data that we got. Adam, does that sound right to you as well?
B: Exactly the same. I can confirm that the quality of the results was so extremely good that for some time we really thought there might be some error. Hence you can see two sets of results: one is the original branch, and then there is a small fix, but that didn't actually change much. So for now I'm fairly positive that it might be a real output, and that we can have such performance on a long-running cluster.
A: All right, that's all I've got, I think. Adam, anything else?
B: No, I can just promise that I will continue. The first step is to bring fsck in sync with this code so I can properly run all the tests, and then to have some roadmap to turn it into a compatible extension of the current format, because in today's state it's a bit of a hacky implementation; with a little effort, though, it can just extend how we organize our objects.
B: I think the easiest solution will be to keep both ways of handling shared blobs for some time, possibly still using the old, complicated approach when we have compressed objects. I'm not even sure now that that will ultimately be the case, but for now I would love to preserve that compatibility.
A: Cool. So, any questions?
A: All right, well then, I'll open it up.
C: Hey Mark, not so much a discussion, but I just wanted to let you know that there's a PR for rgw's HTTP/3 frontend now. It's not working yet, but the prototype is in progress.
A: What kind of difference do you think it will make?
A: How are we doing now in terms of CPU usage in rgw generally?
C
Cpu
usage
generally
I,
don't
know
that
I
have
a
handle,
but
I
mean
we've.
We
did
a
lot
of
work
on
the
the
Beast
front
end
and
it's
CPU
usage.
A
I,
don't
know
if
you
can
still
hear
me
or
close
the
other
browser
window.
Oh
good,
okay
and
I.
Don't
have
an
echo
anymore,
which
is
fantastic,
I
know
a
while
back.
A
You
know
we
had
those
really
big
CPU
usage
numbers.
That
I
think
you
guys
improved
quite
a
bit,
but
that
seems
like
it
might
be
the
yeah
every
every
time
I
hear
feedback
from
kind
of
people
trying
to
do
cloud
deployments
and
things
they.
A
They
always
want
the
CPU
Stitch
of
our
demons
to
be
lower
and
usually
the
simplest
decide,
I
hear
about
it,
a
lot
but
I
suspect,
especially
if,
if
they
want
to
scale
out
rgw,
you
know
anything
we
can
do
there
to
to
reduce
overhead
CPU
overhead
is
probably
a
win.
A
But
anyway
sounds
sounds
like
it.
It
has
potential
at
some
point,
I'd
like
to
get
back
to
actually
doing
a
profile
on
rgw
and
just
see
kind
of
where
we're
at
these
days
under,
like
a
really
heavy
ball
object.
Workload.
A
All
right,
I,
don't
hear
anything
anymore,
I,
don't
know
if
that
means
there's
no
talking
or
if
it's
just
I
can't
hear
anything,
but.
A: Okay, well, if no one has anything else, then I suppose we can wrap up. Adam, I just want to say again how impressed I am with the work that you did. It's really, really impressive how much this helps. So, congratulations. I don't know if you hit a home run or won the lottery or...
A: I was going to say, I don't feel like the lottery captures it, though, because it was really a result of your hard work, not just random chance. So, good job. It looks like your instinct was right.
A
I'm
I'm
very
excited
to
to
get
your
change
to
to
Paul,
to
try
out,
because
I
I
suspect
it's
going
to
be
a
big
win.
Hopefully,
hopefully,
I
I'm
still
a
little
nervous.
What
the
just
the
general
process
of
snapshotting
and
fragmentation
might
look
like
on
hard
drives
with
this
workload,
but,
to
some
extent
we're
going
to
hit
you
know
the
limits
of
what
you
can.
You
can
do
right,
but
what
we
can
solve
ourselves
here
with
this
approach.
A: Well, thank you all for coming. We'll talk again next week. Have a great week, everybody.