From YouTube: Ceph Performance Meeting 2022-03-31
Description
Join us weekly for the Ceph Performance meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contribute/
What is Ceph: https://ceph.io/en/discover/
A: Not a whole lot going on with pull requests this week, which is not super surprising since everyone's desperately trying to get fixes into Quincy, and fix the fixes in Quincy, so not too much there. Casey, I did see there was this RGW multisite resharding PR. It looks like they actually merged it into another branch, from what I saw. Is that going into master eventually?
A: Oh okay, so yeah, I guess I'm sort of documenting something here that will get merged into master later, but it doesn't matter. Very cool. Okay, two updated PRs. One, this one from Gabi, changes some code in the no-column-B (NCB) code. Gabi, do we still need this? Is it still relevant after the discussion from last week?
A: Okay, cool. And then the other one is this tracer work that's been kind of ongoing for a while. I think there's some more testing; when I was glancing at it, it looked like they were saying that maybe it was a little more chaotic than it had been previously, but I confess I didn't look super closely at it. In any event, there's more testing going on there, more work to make sure it looks okay, and that was all I saw. Oh, go ahead — hey, Mark.
D: Hey, I have some updates on that PR. At the beginning, I ran an s3 test on that PR, and we don't suffer from any performance degradation. Then I ran the rados bench tool, and we can see like a two percent degradation in performance. Also, sometimes I'm getting unstable results with the rados bench tool on that PR when I'm trying to compare that PR to master.
A: Oh, no, okay. Well, good luck with diagnosing it! How long does the unstable period last — is it just a blip, or is it consistent over time?

D: No, wait — it's consistent. I mean, sometimes it goes around 200 above or below what I posted in the PR, but I think we can see that, even with those unstable results, at any time we're getting like two percent less performance than master.
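A quick way to sanity-check whether a roughly two percent delta is real, given the unstable runs described above, is to compare each branch's mean throughput against its run-to-run spread. A minimal sketch — the throughput numbers below are made up for illustration, not real bench data:

```python
import statistics

def summarize(label, runs_mbps):
    """Mean and run-to-run relative spread for repeated bench runs."""
    mean = statistics.mean(runs_mbps)
    rel_spread = statistics.stdev(runs_mbps) / mean * 100
    print(f"{label}: mean={mean:.1f} MB/s, spread={rel_spread:.2f}%")
    return mean

# Hypothetical repeated `rados bench` throughputs (MB/s).
master = [1012, 998, 1005, 1021, 994]
pr     = [ 991, 979,  985,  999, 972]

m = summarize("master", master)
p = summarize("pr", pr)
print(f"delta: {(p - m) / m * 100:.2f}%")
```

If the delta is consistently larger than the spread of either series, the regression is likely real despite the noise; if the spread swamps it, more runs are needed.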
A: You could try looking at it with perf and look at the cache hit/miss statistics you can get out of it, and see if maybe you're causing worse cache behavior on the CPU when the PR is applied. It might be a low-impact thing to look at, or you could get a full call graph with perf record.
A: Yeah, it's possible you might not have the resolution to be able to see it. With a call graph, if it's only like two percent, it's pretty tough to spot. You could try — maybe not, but maybe you'd see it. If you saw that your cache hit rate was lower or something, you know, those are still good statistics to have.
A: All right, well, good luck! I hope you can find it. Were there any — yeah, no problem — were there any other PRs that people had that they want to talk about this week?
A: All right, if not, then the big topic I have for today is that we've been trying to track down a write regression in Quincy that we talked about a little bit last week.
A: I was fairly convinced, incorrectly, that it was caused by the no-column-B code, and it's not, but it's possible that no-column-B was somehow triggering this a little bit. The past week, Adam and I have both been running a bunch of tests trying to narrow it down. Adam's not seeing this, but I'm fairly consistently seeing it on Mako, which is our AMD nodes that have Samsung drives in them. I was able to do a bisect and get down to about 10 commits, and in those 10 commits there was a change that we made to the AVL allocator.

That change is in pull request 41615 — I'll link it in the chat window here. This morning I went back and took the Quincy release commit that we're working from right now, reverted that PR against that commit, and I'm testing that branch now. I don't have a lot of results for it yet.
A: There may be something else that we need to track down, possibly, but so far, at least on these nodes, this seems to be maybe the smoking gun. So I tried to read through that PR a little bit, and there was a lot of discussion. Igor and Adam, I think you both looked at it fairly closely — do either of you remember, or have a sense for, how that first-fit strategy was actually changing in the PR, or if it was? It seemed like there was maybe some disagreement regarding some of the things you were originally thinking, Igor.
F: Well, I don't remember much from that PR, but I clearly remember it was about making the AVL allocator work faster. It was just limiting the search so as not to overburden the CPU. So, if anything, from a CPU point of view we should now be faster. The only thing that was never actually tested is what new allocation patterns may appear that previously were not anticipated. I mean, neither the previous patterns nor the new ones were analyzed, so it was both okay in the sense that they were both not tested. So, yeah.
G: And so, yes — Pacific, with that, to see if...
A: Yes, that's on the docket, but I'm trying to get through these other tests first before I go back and test that one. That probably will be next, though.
H: But just to mention how long it could take to perform a single allocation — it was like 60–70 milliseconds per single allocation. The drop was pretty dramatic.
A: These new restrictions — am I thinking about this right that, basically, before, we didn't do anything, we just had no limit, as if these were both zero, but when this code was added, we now impose those limits on how long we do first fit? Is that correct?
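My reading of the change as described here, sketched in a few lines (this is illustrative pseudocode, not the actual BlueStore allocator): first fit walks the free extents from the front, and the new config imposes a cap on how many extents it will probe before giving up, with zero meaning unbounded — which would match the "before" behavior:

```python
def first_fit_capped(free_extents, want, max_probes=0):
    """Return the index of the first extent >= want, scanning from the front.

    free_extents: address-ordered list of (offset, length).
    max_probes=0 means an unbounded scan (the old behavior); a positive
    value bounds the scan, after which the caller would fall back to a
    best-fit style search instead.
    """
    limit = len(free_extents) if max_probes == 0 else min(max_probes, len(free_extents))
    for i in range(limit):
        if free_extents[i][1] >= want:
            return i
    return None  # not found within the cap: caller falls back

free = [(0, 4096), (8192, 8192), (32768, 65536), (131072, 1 << 20)]
print(first_fit_capped(free, 16384))                 # unbounded: index 2
print(first_fit_capped(free, 16384, max_probes=2))   # capped: None -> fall back
```

The trade-off being discussed: the cap bounds worst-case CPU per allocation, but requests that fall back land somewhere other than the lowest suitable address, which can change the on-disk layout over time.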
H: So it would be very interesting to assess how our stack behaves: what would be the difference in performance between a single-chunk write versus tons of smaller writes, and how our I/O stack handles that?
A: Well, I haven't dropped it down yet. I've just been doing the 60-OSD configuration, because I started consistently hitting it with this other set of tests that I have — that's what I've been running. But now that we know what it is, and we have a target, I can start trying with, like, a single OSD.
F: Yeah, that would be cool, although I'm not hopeful, because my intuition is that the most important part of the degradation we're getting is that we actually change the shape of our objects — files, in fio terms — on the disk when we modify them. If we just have files under fio, our overwrites will just overwrite in place and not put data somewhere else, leaving the unmodified data in the old place.
H: There was maybe a similar performance degradation reported upstream recently, and yeah, it was like some additional fragmentation happened to his OSDs. They were all flash — or maybe some drives handled it badly.
A: Yeah, it's possible, and unfortunately for us these are probably not very uncommon drives. These are Samsung PM983s. They're fairly reasonably priced — not super low, but not super high write endurance — so, you know, reasonably priced.
H: ...that, synthetically, using fio, and hence we will be able to assess that performance degradation, if any, versus different drives.
A: Yeah, agreed. Now that it seems like I'm coalescing around this PR — assuming that continues to be the case — my thought is that I at least have a target, so I can try to shrink the test setup to hit it, right? Before, the space was too big to figure out exactly what it was, but now I think we're narrowing in, so it should be easier to attempt to reproduce this, maybe with fio.
A: Igor or Adam, do you know what the nuance is between first-fit versus best-fit mode? How does the behavior change in those two different modes?
F: I mean, that's the strange thing: best fit just tries to search throughout all the chunks to find free space that's exactly what you requested. But that mode, while very good for memory, usually tends to force your writes to go to very random places on the device. First-fit mode just tries not to do that, and to go as close as possible with a reasonably sized element.

So they're totally different modes of work.
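Igor's distinction can be shown with a toy free list (hypothetical extents; the real AVL allocator keeps these in trees, not lists): best fit picks the extent whose size matches the request most exactly, wherever it lives on the device, while first fit takes the lowest-addressed extent that is large enough, keeping writes clustered:

```python
def first_fit(free, want):
    """Lowest-addressed extent that is big enough."""
    for off, length in sorted(free):
        if length >= want:
            return off
    return None

def best_fit(free, want):
    """Extent whose size most exactly matches the request."""
    candidates = [(length, off) for off, length in free if length >= want]
    return min(candidates)[1] if candidates else None

# Hypothetical free extents: (offset, length).
free = [(0, 1 << 20), (900 << 20, 64 << 10), (100 << 20, 256 << 10)]

print(hex(first_fit(free, 64 << 10)))  # low on the device, near other data
print(hex(best_fit(free, 64 << 10)))   # the exact-size chunk, far away
```

For a 64 KiB request, first fit carves it out of the 1 MiB extent at offset 0, while best fit jumps to the exactly-64-KiB hole at 900 MiB — which is the "very random places" behavior described above.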
F: Given results from two different runs, we should be able to at least compare in bulk what we are getting from the allocator, because at the very least the number of chunks coming out of the allocator should be an indicator.
H: And again, I'd like to look at the latencies registered now.
H: Well, it might be an interesting experiment to try to switch allocators at this point, but this requires an OSD restart, and you'd want to make sure that the restart itself does not affect the picture. So if an OSD restart preserves the performance drop, then you can try to switch the allocator, just to make sure that's exactly the case.
A: Sure. There are a lot of things to test here, so I'm not sure what order I'm going to do it in yet, but I can see if that might be something we could try doing. It'll be a little tricky with the way CBT works, but possible.
A: One thing I did want to bring up — Igor, Adam — if you look at the allocator test tab (that's the second-from-last one, right before Adam's tab in that spreadsheet), in those tests, looking at Pacific and Quincy — and this isn't with the revert or anything — it looks like the AVL allocator actually didn't do that much differently. It was a little different, but not that much different between Pacific and Quincy. But the hybrid allocator looked much, much worse.
H: But, on the other hand, the performance drop might be caused by the switch to the bitmap allocator, or by a sort of duplicate allocation attempt: you need a four-megabyte chunk, you first go to the AVL allocator to get the required space, and you get no space since there are no large-enough chunks, and then the hybrid allocator falls back to the bitmap allocator.
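The fallback path just described can be sketched like this (the names and the stand-in allocators are illustrative, not the actual Ceph classes):

```python
def hybrid_allocate(avl_alloc, bitmap_alloc, want):
    """Try the AVL allocator first; fall back to the bitmap allocator
    when no single large-enough extent exists, in which case the space
    may come back as many smaller chunks."""
    extents = avl_alloc(want)
    if extents:
        return ("avl", extents)
    return ("bitmap", bitmap_alloc(want))

# Stand-ins: the AVL side has no contiguous 4 MiB extent, while the
# bitmap side satisfies the request out of scattered 64 KiB fragments.
avl = lambda want: []
bitmap = lambda want: [(i * (128 << 10), 64 << 10)
                       for i in range(want // (64 << 10))]

src, chunks = hybrid_allocate(avl, bitmap, 4 << 20)
print(src, len(chunks))  # the 4 MiB request becomes 64 separate 64 KiB chunks
```

This is the scenario where a single large write turns into many small extents on disk — both a CPU cost (two searches per allocation) and a fragmentation cost.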
H: So again — well, by the way, it would be interesting to get a free dump, a free-chunk dump, from one of the degraded OSDs and see what the free-chunk layout is in this state. I mean, we could even run some allocator simulation against this dump and see what the latencies for allocation are.
A: Is that something that we could see even if we don't have a performance drop? Would we still be able to see the fragmentation if we were to go digging and look at that?
H: Well, potentially we can see that sort of fragmentation. We have a couple of very simple assessments to learn how fragmented the state is, but honestly they are pretty straightforward and trivial.
H: In other words, we don't have good enough tooling to learn the fragmentation. So it's definitely doable once you have this free dump, but there's no tooling for that.
F: Well, I think there's a problem with assessing quality from allocator fragmentation, because I do not see a relation between how fragmented our free space is and how many fragments the allocator is providing to objects that are just being written. Certainly the info of how many allocations...
A: So I'll definitely see if I can get data out of that. Actually, I'm running right now with Quincy — well, this is the revert, so I don't know... I guess it's sort of useful, because it will tell us the case where it's not problematic. How do I look at the histogram?
F: Well, I'll send you the command offline.

A: Yes, please — include me on that one. I was about to ask the same.
F: I guess we have two investigation paths. One is that the allocator is somehow consuming more CPU now, and that is affecting performance. The second investigation path is that the quality of the data provided by the allocator makes it more difficult for other parts of the software stack to efficiently use it to write to the disk.
H: So it's pretty similar to dumping performance counters, but you should use the histogram keyword — "perf histogram dump" instead of "perf dump", something like that.
A: Okay, I got something. I can narrow this down to allocate-from.
A: We didn't backport that to Quincy, though — which version could you see... oh, this isn't that old, okay. Or this is old, okay, so...
H: So actually, my PR is built on this. There's a visualization tool — it was just checked, yeah.
L: Yeah, I've never used it successfully before, so I can't say, but it might be something that has to be figured out later. I just know it exists.
A: I think it's trying to use stuff in the PATH which, on the system you're running on, isn't the case, because this stuff is in /usr/local/bin. Does this give me the ability to change the commands' path?
H: Oh, okay, yeah, but it's not merged yet. What's good about this visualization tool is that it should be available on the machine, so it might be available for you right now once you make...
A: Sure, okay. Well, since we don't have that right now, would anyone care to double-check my numbers? I'm looking at the paste on line 158. It looks to me like we've got like 23 million entries in, I believe, the 4-to-8k range, based on this.
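For double-checking numbers like this, the 2D histogram from the admin socket can be reduced in a few lines. The sketch below runs against a made-up miniature payload; the real dump's axis names and bucket ranges should be read from its metadata rather than assumed, as they are here:

```python
import json

# Made-up miniature of a 2D perf-histogram payload: values[i][j] counts
# operations whose requested size falls in row bucket i and whose
# resulting size falls in column bucket j.
dump = json.loads("""
{"values": [[10, 0, 0],
            [0, 23000000, 0],
            [0, 2, 5]]}
""")

vals = dump["values"]
row_totals = [sum(row) for row in vals]                 # entries per request-size bucket
diagonal = sum(vals[i][i] for i in range(len(vals)))    # request size == result size
total = sum(row_totals)

print("per-row totals:", row_totals)
print(f"diagonal share: {diagonal / total:.4%}")
```

A diagonal share near 100% would mean requests were almost always satisfied at exactly the requested size — the pattern Igor is asked about just below.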
H: Well, I believe it doesn't make much sense to analyze a single dump. It's better to compare good numbers versus bad ones — that would be much more helpful.
F: But Igor, I have a question here. I see values only on the diagonal of this array. Does that mean that, basically, we were always given the size that we requested from the allocator?
A: But I wonder if what this is really telling us is that we were fragmenting really quickly because of this other PR, and so we dropped down into this really low performance state.
H: And so it might be the drive.
F: We can have totally different software in the flash translation layer. I could even imagine a drive that very dynamically relocates even single sectors, so that whatever order I write in, I basically always write into a contiguous region, while another drive doesn't do that. I mean, that's just speculation, but it could be implemented that way.
A: Yeah. Unfortunately, that means this PR may be an optimization in your case and a major regression in my case on Mako.
F: I will do that. Unfortunately, my last version was Pacific, so I have to recreate everything for Quincy again.
A: All right, well, I think that's it for now, then. Does anyone have anything else they want to talk about this week before we leave?
A: All right, well then, thank you for coming, and have a great week. Talk to you later.