From YouTube: Ceph Performance Meeting 2023-03-09
Description
Join us weekly for the Ceph Performance meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contrib...
What is Ceph: https://ceph.io/en/discover/
A: Hey guys, I'm sorry I'm late. I got into talking about defragmentation with Josh in the core standup and lost track of time. Hopefully we can maybe continue that conversation here. All right, so obviously the core standup is still going on; hopefully we will get some of those folks soon. Oh, hey Josh! You came.
A: That's good. All right, well, let's start this thing then. Oh, I gotta go find my web browser tab for the performance meeting, give me one second. All right, so I confess again that I was not great this morning and didn't make it all the way through the PRs, but I got through at least a decent number of them.
A: This is basically just for the Crimson performance tests, to be able to collect a whole bunch of background system-level information. I guess in the QA suite we didn't collect all of that data before, so this PR should fix that. I approved it; I didn't test it. It looked fine just from a casual glance, though, so I figure that if there are issues we'll figure it out after this is applied. Not super worried about it.
A: Okay, and the next PR was also a Crimson PR. This is to add fine-grained caching. This is from one of the developers at Intel. They also conveniently provided a benchmark in the PR, which I am always very happy to see, and in the four-megabyte prefill case they saw a really huge increase in performance, which honestly actually surprises me just a little bit.
A: Oh, I'm sorry, this is a 4K random read performance test after doing a four-megabyte prefill; that's what they mean here. So again, it actually surprises me a little bit. I didn't see anything quite as bad the last time I looked. Oh, this is SeaStore, though. Anyway, sorry, I'm getting lost in this. It looks like a good improvement, so I think once Sam has a chance to look at that, hopefully we'll merge it pretty quickly. But definitely, definitely good things.
A: Those were the only two new PRs that I saw. I didn't make it fully through the PR list, but I didn't see anything obvious that had closed that was new, anyway.
A: I did see a couple of updated PRs. There have been some additional reviews on the QAT batch PR; that's from Intel, I think. There is Igor's PR, or sorry, Cory's PR, for setting the RocksDB iterator bounds for collection listing; I think that has a couple of new fixes in place.
A: The other PR that I did see that was updated, though, was Igor's PR for not resetting the prefetched buffer while doing multi-chunk, I think it's probably supposed to say "reads" at the end there, but basically the BlueFS prefetching behavior. I think the hope is that this might help us with RocksDB when it's trying to do prefetches, and if it does, there's a chance that we might be able to re-enable, er, disable bluefs_buffered_io again.
A: If this works well, we'll see, but generally speaking, that was the big reason that we had to revert the switch to direct IO: we're not reading from the block cache in RocksDB properly, or RocksDB isn't... there's some issue there, and we're entirely relying on the page cache, the Linux kernel page cache. And if we can somehow improve things at the BlueFS layer, then maybe we don't need to do that anymore.
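For reference, a minimal standalone sketch (not Ceph code; the file path is made up) of what the buffered versus direct IO distinction means at the syscall level: a plain read can be absorbed by the kernel page cache on repeat access, while an O_DIRECT read bypasses it and needs aligned buffers.

```cpp
// Illustration only: buffered vs. O_DIRECT reads on Linux.
#include <fcntl.h>
#include <unistd.h>
#include <cstdlib>

int main() {
  const char* path = "/tmp/testfile";   // hypothetical file, not a Ceph path
  const size_t block = 4096;

  // Buffered read: the kernel page cache can serve repeated reads of the
  // same block without touching the device again.
  int fd = open(path, O_RDONLY);
  if (fd >= 0) {
    char buf[block];
    pread(fd, buf, block, 0);
    close(fd);
  }

  // Direct read: bypasses the page cache entirely; buffer, offset, and
  // length must be block-aligned, and every repeat read hits the device.
  int dfd = open(path, O_RDONLY | O_DIRECT);
  if (dfd >= 0) {
    void* aligned = nullptr;
    if (posix_memalign(&aligned, block, block) == 0) {
      pread(dfd, aligned, block, 0);
      free(aligned);
    }
    close(dfd);
  }
  return 0;
}
```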
C: This PR fixed that specific issue. BlueFS was already using the prefetch buffer, and lately I discovered that it could result in RAM usage growth while RocksDB is performing bulk data reads from the DB, for instance while iterating over every record. It doesn't close all the SST files along the path, and if we do not trim their buffers as well, we could get gigabytes of RAM occupied. So the latest update improved the trimming as well.
A: Nice. Remind me: can we still use the readahead at the BlueFS layer if we're using bluefs_buffered_io = false? Will that still work?
A: Okay, so I mean, the behavior it seems like we see is that when we set bluefs_buffered_io = false, we're not properly reading from the RocksDB block cache for whatever reason, and somehow we're not doing any kind of prefetch, and so we end up just re-reading the same blocks over and over again when we do these kinds of listing steps, where we re-list or re-evaluate some kind of iteration. And we also are not doing any kind of prefetching very well.
A: So we basically just end up relying on the page cache in the kernel to save us, but there are probably multiple ways that we could avoid this at the BlueFS layer, I suspect, while still doing direct IO against the kernel.
C: It actually does, well, at least for some sequential scans. It reads the SST file using a prefetch call, which attempts to grab a larger chunk, and then it goes through that block using small reads.
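As a rough sketch of the prefetch behaviour just described (assumed semantics, not the actual BlueFS code): one large pread fills a chunk buffer, and the small sequential reads that follow are served out of that buffer until they fall outside it.

```cpp
// Sketch of a prefetch buffer serving small sequential reads.
#include <unistd.h>
#include <cstdint>
#include <cstring>
#include <vector>

class PrefetchReader {
 public:
  explicit PrefetchReader(int fd, size_t chunk_size = 512 * 1024)
    : fd_(fd), chunk_(chunk_size) {}

  // Read `len` bytes at `off`; refill the prefetch buffer with one large
  // pread whenever the request falls outside the cached chunk.
  ssize_t read(uint64_t off, size_t len, char* out) {
    if (off < buf_off_ || off + len > buf_off_ + buf_len_) {
      buf_off_ = off;
      ssize_t r = ::pread(fd_, chunk_.data(), chunk_.size(), off);
      if (r < 0) return r;
      buf_len_ = static_cast<size_t>(r);
      if (len > buf_len_) len = buf_len_;
    }
    std::memcpy(out, chunk_.data() + (off - buf_off_), len);
    return static_cast<ssize_t>(len);
  }

 private:
  int fd_;
  std::vector<char> chunk_;   // the prefetched chunk
  uint64_t buf_off_ = 0;      // file offset the chunk starts at
  size_t buf_len_ = 0;        // valid bytes in the chunk
};
```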
A: Do we have this pattern where we will iterate to a point and then we might do, like, a deletion, and then we do, like, a ring scan: start at a certain offset again and reiterate until we hit another point where we do a deletion, and we reiterate over the same range over and over again. And when we have bluefs_buffered_io on, we do the reads from the page cache, but when we don't, with direct IO, we end up doing that over and over again, and we do small reads against the disk.
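To make that pattern concrete, here is a hypothetical sketch against the RocksDB C++ API; the key prefix, deletion condition, and pass count are made up for illustration, but it shows why the same key range (and therefore the same SST blocks) gets read repeatedly.

```cpp
// Hypothetical illustration of the "scan, delete, rescan" pattern.
#include <rocksdb/db.h>
#include <memory>
#include <string>

void rescan_pattern(rocksdb::DB* db) {
  rocksdb::ReadOptions ro;
  std::string start = "prefix/0000";            // made-up start key

  for (int pass = 0; pass < 100; ++pass) {      // repeated passes over the range
    std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(ro));
    for (it->Seek(start); it->Valid(); it->Next()) {
      // Placeholder condition standing in for "found an entry to remove".
      if (it->value().empty()) {
        db->Delete(rocksdb::WriteOptions(), it->key());
        // Restart roughly where we left off; the next pass walks the same
        // range again and re-reads the same blocks.
        start = it->key().ToString();
        break;
      }
    }
    // With buffered IO the kernel page cache absorbs these re-reads; with
    // direct IO they show up as repeated small reads against the disk.
  }
}
```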
A: But we don't really want it, right? Like, it's slower. In other parts of the OSD it's faster to use direct IO, and we don't really want to mix them. So I guess what I was wondering is: can we do direct IO against the kernel, but still have some kind of prefetch buffer that we keep ourselves at the BlueFS layer?
C: The issue with RocksDB is that it attempts to read from the same locations multiple times, and...
A: Yeah, and you'd expect that if you did a reread of the same region, it should be read from the block cache, but for whatever reason it doesn't seem to. I don't know if that's because it's just failing, or because maybe we're triggering the eviction of the block cache. I do sometimes see that RocksDB, like, evicts the entire block cache, maybe due to SST files becoming invalid. But it's definitely not working the way I expected it to.
C: Benchmarking... well, collecting some latencies and numbers on operation latency. I didn't see any overhead here.
A: But if we showcase a really big example, maybe we could.
A: I think the bigger issue, though, is that we have to make sure that our iteration performance remains good one way or another. It'd be nice if we had some guarantee that we could do it in our code, rather than relying either on RocksDB or on the Linux page cache, right? Like, I don't like being reliant on either of those things.
C: Asynchronous or maybe parallel access to data. So right now it looks like RocksDB, well, at least for large scans, performs reading in a sequential manner, and maybe if we do prefetch more smartly, at least at our level, and trigger the disk reads beforehand...
C: This might provide some benefit, because right now, as far as I can see for these sequential scans, we underperform from a disk bandwidth point of view, like way slower than what the disk is capable of, and...
C: ...that's for a SATA drive; I am afraid that for NVMe drives it might be even worse, and...
C: ...if we read the data in parallel, we could benefit from that.
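A minimal sketch of that idea, using plain pread and std::async rather than anything Ceph-specific: while the caller is consuming the current chunk, the read for the next chunk is already in flight, so a sequential scan can keep the disk busier.

```cpp
// Sketch of overlapping consumption of one chunk with the read of the next.
#include <unistd.h>
#include <cstdint>
#include <future>
#include <vector>

std::vector<char> read_chunk(int fd, uint64_t off, size_t len) {
  std::vector<char> buf(len);
  ssize_t r = ::pread(fd, buf.data(), len, off);
  buf.resize(r > 0 ? static_cast<size_t>(r) : 0);
  return buf;
}

void sequential_scan(int fd, uint64_t file_size, size_t chunk = 512 * 1024) {
  auto next = std::async(std::launch::async, read_chunk, fd, 0, chunk);
  for (uint64_t off = 0; off < file_size; off += chunk) {
    std::vector<char> cur = next.get();           // chunk for this iteration
    if (off + chunk < file_size)                  // kick off the next read early
      next = std::async(std::launch::async, read_chunk, fd, off + chunk, chunk);
    // ... consume `cur` with small in-memory reads here ...
    if (cur.empty()) break;
  }
}
```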
A: This was, I think, testing that other people at Red Hat had done.
A: I think I had done some as well, but the stuff that's in the PR is from other people on that team, I think.
A: I think that case was using hard drives, probably with an NVMe DB/WAL, yeah, yeah.
A: But anyway, take a look. We can think about it more, but I think it's a good thing for us to figure out. If we can figure out a way to go back to bluefs_buffered_io disabled, I think it's a good idea, at least to try.
A: Yeah, I think we just need to figure out how to deal with this super irritating corner case that actually ends up being kind of a big corner case, but I very much would like to see if we can move back to disabling it.
B: Something I actually started to think about, I don't think I wrote any code for it, though, is: technically we don't have to prefetch at the BlueFS level, we just need to cache, right? Because if the issue is that RocksDB is basically asking for the same data over and over again, if you just give it that data from memory, you're good. What I wanted to do, though (and I've had zero time to do this for years now), was figure out why it's asking for the same data over and over again when it should be caching that internally.
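A minimal sketch of the caching idea described above (assumed semantics, not real BlueFS code): keep a map of recently read extents keyed by offset and length, so a repeat request for the same extent is answered from memory instead of going back to the disk.

```cpp
// Sketch of a simple extent cache for repeated reads of the same ranges.
#include <unistd.h>
#include <cstdint>
#include <cstring>
#include <map>
#include <utility>
#include <vector>

class ExtentCache {
 public:
  explicit ExtentCache(int fd) : fd_(fd) {}

  ssize_t read(uint64_t off, size_t len, char* out) {
    auto key = std::make_pair(off, len);
    auto it = cache_.find(key);
    if (it == cache_.end()) {
      std::vector<char> buf(len);
      ssize_t r = ::pread(fd_, buf.data(), len, off);
      if (r < 0) return r;
      buf.resize(static_cast<size_t>(r));
      it = cache_.emplace(key, std::move(buf)).first;
    }
    // Repeated reads of the same extent never touch the disk again.
    std::memcpy(out, it->second.data(), it->second.size());
    return static_cast<ssize_t>(it->second.size());
  }

  void trim() { cache_.clear(); }   // a real version would need eviction

 private:
  int fd_;
  std::map<std::pair<uint64_t, size_t>, std::vector<char>> cache_;
};
```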
A: I did, like, a huge walkthrough of the RocksDB code, and it's changed dramatically in, like, the last three years; it's very different for Nautilus versus Pacific. So that is probably going to take multiple independent code walkthroughs to figure out what each version of RocksDB is actually doing. Yeah.
C: Sorry, just my comment: I disagree about caching rather than prefetching, since prefetching is still helpful for long scans, and RocksDB behaves in a way where it issues a call to the file system saying, please find a larger block to retrieve, and then scans it through using small reads, and if we do...
C: These secondary reads are not repetitive, they just...
A: Taking care of this one use case where we see the awful behavior, I mean, like, everything I was seeing with bluestore, sorry, bluefs_buffered_io off looked better, except in this one case. So if we take care of that, I think we can probably reevaluate this, like, for the third time.
A: But we've mostly landed with it off, yeah. That was where I was at, and then people got burned, and I was like, oh. I think Dan van der Ster wrote another PR to disable it, and... or rather to turn it back on, turn bluefs_buffered_io back on again, and then we just kind of went, yeah, yeah, I think we need to do that, because too many people are hitting this awful case. But I really wanna get back to being able to switch it back to direct IO, if we can.
A: All right, I don't think that... ah, yes, okay. Those were the updated PRs. Did I miss anything, guys? I didn't make it quite through the list today, so was there anything I missed?
A: All right, if not, then I don't actually have anything listed for today for discussion topics. Does anyone want to jump into the fray?
A: If not, I do have one thing I could actually bring up, I think. Were you the author of the PR for CBT to disable the check for an existing results directory?
A: One second, I can pull it up.
A: Yeah, sorry, it was Nissan's PR. Someone said that Nissan was gonna come to the meeting to talk about it. I couldn't... it's confusing; I'm just terrible with names, and I couldn't remember who did it, so I apologize. But I guess I did want to talk about this, possibly.
The goal with that particular line of code that's there is to prevent the re-running of tests that were already run, and that particular check has saved me many, many times in the past from overwriting existing test data when you've run it against an existing directory.
A: So it sounds like, for some reason, in the teuthology tests that use CBT, it's triggering this code, causing, I guess, a results structure to be created in the second loop of a benchmark, and it will skip that. But I don't see this in any of the testing I've done in, like, the year or two that this code has been there.
A: I'm just not sure I want to get rid of that check completely, so anyway, that's the status of that, I guess.
A: Well, I'll try to follow up with Nissan and see if we can just hash it out and figure out what's going on here, but that's, I guess, the status of this thing. I'm hoping we can just figure out a more nuanced fix than completely disabling the existing directory check.
A: Well, that's all I have, so unless anyone else has anything, maybe we wrap up a little early today.
A: All right, I don't see any sign that anyone's got anything, so enjoy the rest of your day, guys, and I'll see you next week. Have a good one.