From YouTube: Ceph Performance Meeting 2023-08-17
Description
Join us weekly for the Ceph Performance meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contribute
What is Ceph: https://ceph.io/en/discover/
A: Okay, so let's see. The first updated PR that I saw is "promote object when reading newly written object at read-only cache." I have not looked closely at this one, and I don't think it's actually gotten a review yet from anyone; it's just assigned to core right now. This is really small; it's just in the PrimaryLogPG code.
A: Well, this is a small change, so one of us should review it at some point. There's no real description other than this little one. Anyway, there's that PR, and then there's also an updated PR, though not much in terms of updates: this is Igor's PR from a while back about not resetting the prefetched buffer.
A: This is the BlueFS prefetching thing. It was marked stale; Igor marked it unstale, and he's hoping that Adam can do a review on this one. I believe that if this works, it might improve our whole situation around needing to use buffered I/O for BlueFS. I don't know if Igor ever tested that or not, but one of our big issues right now is with how prefetching works for reads that we would expect to be served from the RocksDB block cache but are not. So anyway, if we can get that one under a microscope a little bit more and see how much it is helping, that would be good.
A: All right. For this week I don't have a whole lot to talk about; the PR that I'm working on to improve erasure coding has not seen any updates.
A: I've been kind of holding down the fort at Clyso while Dan has been on vacation, so I've been involved in a lot of other random stuff, but I'm hoping to get back to it; I wanted to this week, maybe next week. One of the things I do want to do, once we're satisfied with that PR, is go back and revisit the idea of a data cache for EC shards on the primary. That would allow us to avoid doing a read-modify-write over the network.
A: Something we should also be looking at closely, and probably more important than having a local buffer cache inside the OSD for erasure coding, is actually caching the remote shards. We'll see, but I think that's also something we should be looking at closely. Other than that, I don't really have a whole lot, so I'll open it up. Gabby, I know you've got some interesting stuff you've been working on, if you have any interest in talking about that.
B: Yeah, so I've got this thing about snaptrim, my trimming PR. The fix itself is very small, but the testing for it is taking forever to complete. It's supposed to be working fine, and so...
B: Let's hope for better sound. So first, the problem I still don't know how to explain: my tests generate 100,000 objects, create a snap, and then overwrite those hundred thousand objects, which creates a hundred thousand clones and then two hundred thousand snap objects. So all together three hundred thousand, and when we trim we end up with three hundred thousand tombstones. The first trim takes some time, and then it seems to keep growing more and more; my explanation for this is the accumulation of tombstones. Then eventually things go back to normal. So we start with, I think, something like 40 minutes of trimming, and after a few iterations we go up to 37 minutes for the same amount of work, but probably with a lot of tombstones created.
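A quick tally of the workload as Gabby describes it (this is just an illustration of the numbers quoted above, not the actual test code):

```python
# Accounting for the snaptrim test workload described above.
objects = 100_000

clones = objects            # overwriting each snapshotted object creates a clone
snap_objects = 2 * objects  # two snap entries per object, per the discussion

to_trim = clones + snap_objects
# Trimming all of these leaves 300,000 tombstones behind.
assert to_trim == 300_000
```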
B: What I didn't keep a record of and didn't pay attention to, and maybe should next time, is that some of the tests were executed one after another, meaning I finish one test, record the numbers, and immediately start the next test, while some tests had a very long gap between them; like, I ran some tests in the night and then continued in the morning. So I'm speculating that...
B: Maybe there is auto-compaction happening, but it only happens when we have enough idle cycles, and that happens when I take breaks between tests. When the tests are executed one after another, we are always very busy, because when we trim the system is about 200 percent busy, maybe 300, so maybe compaction is not very effective at that point.
B: So that's that. With my code, the times are much better; it mostly takes something like five to eight minutes, and even for the spikes, I think the worst spike was like 15 minutes. So the worst spikes were only a little bit more than the best case in the base code, and I had far fewer spikes.
B: I think that could be attributed to the fact that in the previous code we spend a disproportionate amount of work on the last entries: for the first twelve and a half percent of the objects we pay one call for every trim, for the next twelve and a half percent we pay two calls, and so on, until in the end we pay eight calls for the last eighth, the last 12.5 percent, of the system.
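That cost pattern (the i-th eighth of the objects paying i calls each) implies the tail dominates the runtime; a quick model of the arithmetic, assuming the 100,000-object workload from the tests:

```python
# Model of the trim-cost pattern described above: the i-th eighth of
# the objects pays i calls per trim (1 call for the first 12.5%,
# 2 for the next 12.5%, ..., 8 for the last 12.5%).
n_objects = 100_000
eighth = n_objects // 8

calls_per_eighth = [i * eighth for i in range(1, 9)]
total_calls = sum(calls_per_eighth)
assert total_calls == int(4.5 * n_objects)  # 4.5 calls per object on average

# The last 12.5% of the objects accounts for ~22% of all the calls,
# which is why the end of each trim pass is so much slower.
last_eighth_share = calls_per_eighth[-1] / total_calls
assert round(last_eighth_share, 3) == 0.222
```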
B: And another thing... no, actually, sorry, that was with my code. With my changes, the tests were executed one after another in quick succession, so it seems they are not affected by taking breaks or not taking breaks. So that's it.
A: So, Gabby, I'm still super curious whether or not tweaking the compaction-on-deletion settings to be more aggressive and trigger more compactions sooner would help or not.
A: All it does, basically, is that when you're iterating through keys, by default right now, I think, if you see 8,000 tombstones over a 16,000-key window, then it will trigger a compaction.
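The trigger described here resembles RocksDB's compaction-on-deletion table-property collector. A minimal sketch of the sliding-window behavior, in Python rather than Ceph's C++; the 8,000/16,000 numbers are the defaults quoted in the discussion, and `needs_compaction` is a hypothetical model of the mechanism, not a real API:

```python
from collections import deque

def needs_compaction(entries, window_size=16_000, deletion_trigger=8_000):
    """Sliding-window deletion counter modeled on the behavior described
    above: if `deletion_trigger` tombstones appear within any window of
    `window_size` consecutive keys, a compaction is triggered.
    `entries` is an iterable of booleans: True = tombstone, False = live key."""
    window = deque(maxlen=window_size)
    deletions = 0
    for is_delete in entries:
        if len(window) == window_size and window[0]:
            deletions -= 1  # oldest entry is about to slide out of the window
        window.append(is_delete)
        if is_delete:
            deletions += 1
        if deletions >= deletion_trigger:
            return True
    return False

# 300,000 back-to-back tombstones trip the trigger almost immediately,
# while the same tombstones diluted one-in-three among live keys never
# reach 8,000 within any 16,000-key window.
assert needs_compaction([True] * 300_000)
assert not needs_compaction([i % 3 == 0 for i in range(300_000)])
```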
A: Do you know, when you're doing your trimming and creating tombstones, is there iteration happening concurrently, or does it all happen in like one go?
B: What do you mean? I didn't get the question.
A: You said that you're creating like 300,000 tombstones, right?
B: Yes.
A: Is that all happening in like one loop, or is there other stuff happening in between?
A
Do
you
iterate
to
find
those
the
next
two
objects
in
between
yeah
exactly
exactly
so.
What
should
happen,
then
right
is
that
you,
you
do
all
this
iteration
as
you're
creating
tombstones,
and
it's
only
until
you
get
to
the
point
where
you
have
like
16
000
tombstones
with
the
current
settings
where
you,
then
you
trigger
the
compaction,
but
you've
done
a
lot
of
iteration
up
until
that
point
over
tombstones
to
get
there.
So
it
might
be
that
by
shrinking
those
those
settings
down
to
trigger
compactions
faster,
maybe
it's
better.
B: I don't think that's the case here, because what happens is, as I mentioned, it goes a bit like... I don't know the word for this in English... the first eighth of the system is paying one access, then the second eighth is paying two accesses, and so on.
B: If we were compacting every 16,000, that should be happening a long time before.
B: No, no, no, you can really see when it's taking a long time. It's usually in the last twelve and a half percent; that's just where we spend a disproportionate amount of time, because you'll see at the beginning that it takes very few minutes to finish half of the system, and then we just slow down.
A: You could try changing it the other way: make it so it takes longer to trigger compaction with those settings. I was kind of wondering, though, if it's possible... well, you can find out. You can look at the logs; there's a tool I've got that you can use, and it'll tell you about the compaction events that are happening. But I was kind of wondering if it's possible that, because this is an asynchronous compaction, there's something preventing RocksDB from doing it right away.
A: Yes, well, good. I think it will actually be a really good improvement. We definitely see that this class of problems is one that hits users a lot, I mean even to the point of heartbeat timeouts.
B: Even with the fix before your fix... I remember testing this before, I don't remember exactly what it was, but it was sometime in Quincy, and we just saw the times keep growing; after a few iterations we got to 100... sorry, to one hour, and then it just kept on growing.
B: So what customers have in the field now: they would see this thing just growing and growing and getting crazy, and all the time it's at 200 percent CPU, and if they have more snaps being trimmed, then this just keeps accumulating, which means the system never has any time to do anything else. So I think your change is probably the most critical, because it gives the system some break.
B: You need to get to like five minutes. I think with my code it's now on average seven minutes for 100,000 objects, which, again, is a very large amount of objects; I don't think it's normal to see that amount every 15 minutes. But at least it means that even if they generate 100,000 objects every 15 minutes, we're still going to be able to complete them in half the time, and then the system could do something else.
A: Well, thank you for coming. Have a great week, everybody, and see you next week. Thanks, bye.