From YouTube: Ceph Performance Meeting 2021-08-12
A: Oh, lots of stuff for Pacific right now. But I had a question that I wanted to bring up when I saw you: I made a PR a little while back for RGW that shards the object cache, and I was wondering if I could convince one of you guys to take a look at it. It's just kind of sitting there.
B: So you're seeing wins from this? It eliminates lock contention?
A: I think so. It's been a while since I looked at it, but I think it did. Did I put anything in the comments?
A: I can go back and look. I'm trying to remember what I was looking at at that point, but I probably have data somewhere.
A: When I was going through and looking at this, I was actually going back through a bunch of documentation and tests from people comparing shared locks versus exclusive locks, and the gist seemed to be that shared locks, generally speaking, aren't even wins in the cases where you'd think they'd be wins. But I don't know how much it actually matters.
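A minimal sketch of the intuition behind that observation, assuming the PR shards the cache (all names below are illustrative, not from the PR): even a shared (reader) lock acquisition performs an atomic read-modify-write on the mutex's internal state, so every reader still bounces the same cache line between cores, which is why sharding with plain mutexes often wins instead:

    #include <array>
    #include <cstddef>
    #include <mutex>

    // Illustrative sharded cache: one plain (exclusive) mutex per shard.
    struct ShardedCache {
      static constexpr std::size_t kShards = 16;  // hypothetical shard count
      struct Shard {
        std::mutex lock;                          // held only for short sections
        // ... per-shard map of cached entries would live here ...
      };
      std::array<Shard, kShards> shards;

      // Independent shards keep unrelated lookups off the same cache line,
      // which usually beats one global std::shared_mutex held briefly.
      Shard& shard_for(std::size_t key_hash) {
        return shards[key_hash % kShards];
      }
    };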
A: All right. Well, we've got people from core starting to show up here. So, let's see, we've got five new PRs that I saw this week. There's a couple from Igor and Adam. First, incremental update mode for BlueFS logging.
D: It's basically adding a new op for the BlueFS log: in the cases where we just modify a file's contents or size, or just add one extent, we serialize only a reduced amount of information. We add the new extent if one is present, update the modification time, and update the size. That's basically all. There are two pull requests because Igor and I are currently discussing which solution is nicer; one will go in and the other will die. That's how it will be.
D: The reason is to reduce the speed of BlueFS log growth. That's it.
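A hypothetical sketch of the delta op being described (the struct and field names are illustrative, not necessarily what either pull request uses):

    #include <cstdint>
    #include <vector>

    struct Extent { uint64_t offset; uint32_t length; };

    // Today, updating a file re-encodes the whole fnode, including its full
    // extent list, into the BlueFS log. The incremental mode instead logs a
    // small delta carrying only the changed fields:
    struct op_file_update_inc {          // hypothetical op name
      uint64_t ino;                      // which file is being updated
      uint64_t new_size;                 // updated file size
      uint64_t new_mtime;                // updated modification time
      std::vector<Extent> added_extents; // usually zero or one new extent
    };

    // On replay, the delta is applied on top of the previously decoded fnode,
    // so frequent small appends stop re-writing the whole extent list and the
    // log grows far more slowly.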
A: All right, cool. We've got a PR from Kefu that reverts the change that made it so we were always building debug builds from source. So that's good. I think that probably confused some people on testing, possibly myself included, in some tests over the last couple of weeks. Mark Kogan discovered that, so kudos to him. That should basically fix that.
A: Then we've got a couple of additional PRs that are aimed at changing the way the deferred I/O path works, which we had previously changed between Nautilus and Pacific. The gist of this is that right now, when running on hard drives, we see a really huge amount of write amplification in RocksDB, because we are deferring even very large writes with the way that master currently works.
A: Basically, any time the prefer-deferred size is the same as the blob size, we defer basically all I/O. We'll talk about that more in a little bit, but these are both different PRs that are aimed at improving that. Okay, updated.
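A greatly simplified sketch of the deferral decision being described (illustrative, not the exact BlueStore code): large writes are split into chunks of at most bluestore_max_blob_size, and each chunk is deferred when it fits within bluestore_prefer_deferred_size, so setting the two equal funnels everything through the RocksDB WAL:

    #include <cassert>
    #include <cstdint>

    // Decide whether a write chunk goes through the deferred (WAL) path.
    bool defer_chunk(uint64_t chunk_len,
                     uint64_t prefer_deferred_size,  // e.g. 64k on HDD
                     uint64_t max_blob_size) {       // e.g. 64k
      // Writes are first split so that no chunk exceeds the max blob size.
      assert(chunk_len <= max_blob_size);
      // Therefore, if prefer_deferred_size == max_blob_size, every chunk
      // qualifies: even a multi-megabyte client write is written twice,
      // once into the RocksDB WAL and once to its final location. That is
      // the write amplification described above.
      return chunk_len <= prefer_deferred_size;
    }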
A: Let's see, this MDS lock switching to a fair mutex: a review has been requested from Patrick, who previously reviewed it. RGW tracing: Casey, I think you're still reviewing that, right? Yeah, okay, cool.
A: Radek's optimization for carriage handling in bufferlist failed QA; Kefu would run that through tests, so I guess that just needs another look. Adam, it looks like you updated your BlueFS fine-grained locking PR.
D: That's correct. This week I was doing that delayed self-review, finding some missing parts in the logic and some inconsistencies. It's still not to be merged yet, and it hasn't been seriously reviewed by anyone else, but it will be soon.
A: Sure, okay. And then it looks like there's also further review being done on this OSD compression bypass after RGW compression PR.
A: And let's see, no movement on a couple of things in here that I need to get back to. The memstore refactoring cleanup could probably just be merged now; I can switch it from do-not-merge to merge. I did want to maybe do a couple more things on that, so we'll see. What else?
A: Adam, for the tcmalloc change, your PR: what do you think we should do? Do you want to keep trying to go that route, or...?
A: I think you had a really good idea with it. If we can set it inside Ceph, rather than relying on outside tools setting it, I mean, that's really nice.
A: That's what I was just going to say, because he's the one that worked on the other one for Ceph, right, like the PR to set it there. I mean, would it be worthwhile or useful for you if we could just set it directly in the OSD itself, with something like a config option?
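A hedged sketch of what setting it directly in the OSD could look like, assuming the knob in question is tcmalloc's thread cache limit (the function and the config option driving it are hypothetical; the gperftools property name is real):

    #include <gperftools/malloc_extension.h>
    #include <cstddef>

    // Apply the same limit that the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
    // environment variable controls, but programmatically at OSD startup,
    // so a ceph.conf option could drive it instead of systemd unit files.
    void apply_tcmalloc_cache_limit(std::size_t max_total_thread_cache_bytes) {
      MallocExtension::instance()->SetNumericProperty(
          "tcmalloc.max_total_thread_cache_bytes",
          max_total_thread_cache_bytes);
    }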
A: Okay, it sounds like we've got the seal of approval, and I think it's a good idea.
A: Fantastic, that's great! When did that get in?
A: Yeah, absolutely. There's another regression in master somewhere that I need to track down. It kind of surfaced with all this deferred-write testing and I haven't uncovered it yet, but hopefully we will soon. And then I'm excited to see, once we get that fixed, where your PR ends up landing in terms of making the write path faster for small writes. Very exciting.
H: All right, there was another PR that actually was merged last week after a year: the on-wire compression one, which I see was somehow missed. On the pad it appears as if line 1407 was closed, but that's a different number; the number of this one is actually 36517. That's what eventually got merged.
A: I guess I just missed that one. All right, well, yeah, that's great!
A: So if we don't have any other PRs, the only big discussion topic I had for today is that we're trying to figure out, for the deferred I/O fixes in BlueStore, what approach we want to take for master and what approach we want to take for Pacific.
A: Internally, over the last day we've been talking about the different options we have available, and I think for Pacific we're kind of leaning towards a minimal patch set that doesn't change things too much. We're throwing around the idea of fix 5 from my commit series, which is just a one-liner, and then possibly Adam's PR, which I believe is 42721, or maybe just a portion of that. Let's see where that is.
A: Here it is. Okay, so in the testing that we did...
A: Multiple of these maybe have overlapping effects. They all make the amount of write amplification, and the writes going into the write-ahead log and memtables in RocksDB, much lower than they are without them. Arguably they may not be quite as good as we could get it, but they'll certainly have a big effect.
A: Adam, I wanted to ask you: you mentioned wanting to get your PR into Pacific. I guess, what was the thought you had for wanting to backport it?
D: Actually, for Pacific I would even go further and take Igor's PR that had multiple modifications.
D: I thought that for master and for Pacific we would take this, and for releases further back, if still needed, we would cut it down to just your fix 5 and that simple commit you mentioned. That way we don't make too many changes in backport releases.
A: It might be. So you guys both feel like the patch sets from both of them together are reasonable for a backport at this point?
A: Igor, I don't remember why. I think it was part of the work that you had done on the deferred path earlier, but why did we switch from a 512k to a 64k max blob size?
A: Neha and Josh, do you guys have any strong opinions about backporting larger or smaller fixes for this to Pacific, versus what we're going to do in master?
F: My point of view here is: given what we have, if the backport does not completely address the problem we saw, should we do a point release with it or not? Or should we hold off on backporting it, just merge it to master, wait for all the follow-on fixes that Igor was talking about, and then backport the entire patch set? And the other question is: are users affected by this currently? The answer probably is yes. So can we do something in the interim which is minimal and just, you know, works around the issue until we get the full fix backported? I'm open to ideas; these are the options I'm looking at.
F: Yeah, yeah, I get your point. So what you are trying to say is that the current patch set already improves the situation, and any additional changes will be a bonus that comes in later, so there's no harm in backporting the entire patch set.
A: I mean, it's even possible, right, that we do see better performance overall.
E: Well, actually, what I don't like about our current benchmarks is that we perform them against SSDs.
A: Igor, if we ignore the benchmark numbers, the performance, and just focus on the behavior of writes hitting the memtables and write-ahead log, and then the behavior of compaction, do you think that's reasonable?
E: Well, from what I can see in your spreadsheet, the numbers are pretty good with both Adam's and my changes, so we fixed this write amplification we had before.
E: 42. Sorry, 42. And which column are you referencing?
E: Okay, that's expected: the smaller the prefer-deferred size you have, the less traffic you get into the database.
E: Well, maybe we should revise something else, I mean, make applying these deferred settings faster. But you see that even when you have 100 compaction events for, say, a 16k prefer-deferred size, it's larger than the value in column B, which is for a prefer-deferred size of zero. Which... yes.
E: Yeah, so if we try to fix... if we try to avoid...
E: So right now we try to reduce the traffic itself, but we can't fight the write-ahead log spillover.
A: Yeah, and it's not clear that making the deferral happen faster is good, right, because you could potentially introduce more seeks that way.
A: Yeah, yeah, and that's why I don't want to focus on the performance, because the performance is kind of misleading. It's more that there's some trade-off between not having too much write amplification in the database, and too much work being done in the database, versus trying to avoid seeks on the disk.
E: Yeah, well, just one additional observation. Before, we were talking about compaction events, which is higher for, well, for my PR and for...
A: It's not actually a good thing, though. It looks good, but what's happening is that because we're deferring very large I/Os, we fill up the buffers for the memtable. We fill the memtable very quickly, so you have a huge amount of data being compacted in RocksDB, and because there are very few items being compacted, just a couple of very big ones, each compaction happens quickly, but we do many of them.
A: If you look at the next line, line 44, you'll see that it's actually an inverse relationship to the total compaction time. Oftentimes here, like in column F, the individual compaction time is very small because we're not compacting very many items, but we're compacting lots of data, and so the total compaction time ends up being very high, because we do lots of compactions.
A: Okay, yeah, I agree with you on that one. So, okay, Igor and Adam, do you think we should backport both of your PRs then? You're comfortable with backporting that into Pacific?
D: Actually, if you backport Igor's PR, then my work is already there.
A: Oh, good. Okay, and then is there any point in doing fix 5 from my patch series if we do that?
F: Yeah, I think so. And I get where Adam is coming from, and I think it's fair to say that it's better than the existing Pacific behavior.
A: Yep, agreed, agreed. I think we should still go back and look at the max blob size and try to understand whether it made sense to switch it or not, but it's not necessary; this will get most of it back.
A: My hope is that maybe we can convince the Red Hat storage working group to retry their tests and make sure, yeah, that it doesn't hurt them. But then, yeah.
F: Yeah, I've already started talking to them about this, but I'm just curious: is there a particular workload profile? Obviously they only test with RGW, but in terms of object sizes and stuff, do you have any preference, or should they just do the same as they did earlier? They have two kinds of profiles, small objects versus regular-size objects.
A: Igor, there were some RGW tests that were done internally by one of the storage groups within Red Hat, and it was just kind of happenstance, or luck, that we were reviewing their OSD logs and saw that there was a ton of compaction happening. That's why I then went back and started looking at it on one of our upstream test clusters, to see if it was in Pacific or not.
E: Yeah, because again, from both my experiments and, well, common sense, it looks like these additional short writes appeared on overwrites, on partial writes.
E: Okay, but maybe I missed something. And, well, that's for all the writes, more specifically for unaligned writes. If you are aligned with this max blob size, then they shouldn't go down this path at all.
A: All right, cool. So, Neha, did the tests on Igor's PR pass? Are they done?
A: Okay, that sounds good. And then you're talking to the workload folks?
F: Yep, yep. So, yeah, once the PR is merged, maybe we can just create the Pacific backport as well and start getting some testing on that. Hopefully we can. I guess what I'm understanding is that there are no issues with getting it into the next Pacific point release as well, right?
F: Yeah, I don't think there should be any conflicts as such, like too many conflicts. But Igor, Adam: plus one to releasing it upstream?
E: Just a couple of related items. First of all, we might want to create a ticket to additionally investigate this performance difference between a 256k and a 64k max blob size, just not to miss that. And secondly, we should be careful with comparing master and Pacific now against master and Pacific from four or five months ago, since Gabi's PR is merged and we can get different performance for master.
A: There's one other thing, Igor, that I wanted to bring up, that I did notice. I have not investigated it yet, but with your PR...
G: Mark, just a small correction: the default is that we do that; we skip column family B.
E: My point about being careful when comparing is that the performance for Pacific must make sense.
A: Yep. So, Igor, there was one thing I wanted to ask you about with your PR, the one for the deferred-write change.
A: I saw that, and I know I'm talking about performance again here, which is a little suspect, but on line 27 of that spreadsheet, column T, the 4k write throughput improved dramatically, and I was just a little concerned looking at that. I haven't investigated, but I was a little concerned whether or not, in your PR, small writes were actually being deferred, or if it was acting like the 0k case, where we're not deferring at all.
E: Well, first of all, I'd like to say these are not small writes in our terminology. It's still big writes; everything less than 4k goes through the small-write logic. As for why it became better, well, they should be deferred anyway, and...
E: So, well, actually, if you take a look at columns C, D, or E, performance is exactly the same. And honestly, I don't think that 4k writes...
E: ...should have different logic; they shouldn't depend on the max blob size or the prefer-deferred size. So no matter whether you have a 64k prefer-deferred size or 4k, common sense tells me that it should work the same. I presume we just had some misbehavior with larger max blob sizes, which I fixed.
D: I'm just wondering if I should somehow raise the priority of this. I have a plan to make a modification that will allow testing HDD performance on SSDs, meaning that you could run some tests for a long time on SSDs, just filling the cluster and simulating long operation, fragmentation and stuff, and then just switch to HDD to actually get, for a short time, like 10 or 15 minutes, the performance of the solution.
A: So you just artificially insert latency to simulate seeks?
D: No. The intention is to have a setup where you have an HDD actually mirroring an SSD. For the time when you're filling the cluster, you just operate on the SSD, but when you're in testing mode, you force the HDD to actually do the operations, read, write, seek, everything as it should, but drop the data and feed the data from the SSD anyway.
A: Okay, okay. The only problem we have is that the only machines with hard drives that I know we can test on are the sentas.
D: Yes, but is that a problem? I mean, I'd have to upgrade them from CentOS 7 to something newer, because it's problematic to compile anything there.
F: If I'm getting the idea right, Adam, you are kind of trying to simulate the same thing that most of our users have, right? Their...
D: No, not at all. I'm just talking about simulating a spinner, but for the sake of compressing a week or two weeks of operating on an HDD, because fragmenting the data takes a lot of time.
D: That period would be compressed to, let's say, one hour using an SSD, and when your cluster is properly fragmented and has done some actual work, then in test mode reads will be done from the actual HDD. All the necessary sectors will be read or written, but the data from them will be dropped, because there is no real data there, and the data will be provided from the companion SSD that was used before. In that way, I intend to run long-duration tests in an hour or something like that.
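A hedged sketch of the simulator concept being described (all type and function names are hypothetical): writes land on both devices so the SSD holds the data of record, while in test mode reads are also issued to the HDD so its seek and rotational costs are actually paid, with the bytes it returns discarded:

    #include <cstdint>
    #include <vector>

    struct BlockDev {
      virtual void read(uint64_t off, uint64_t len, std::vector<char>* out) = 0;
      virtual void write(uint64_t off, const std::vector<char>& data) = 0;
      virtual ~BlockDev() = default;
    };

    // Hypothetical mirrored device: fast fill on SSD, realistic timing on HDD.
    struct MirroredSimDev : BlockDev {
      BlockDev* hdd;          // supplies realistic mechanical latency
      BlockDev* ssd;          // supplies the actual data, quickly
      bool test_mode = false; // off while pre-filling/fragmenting the cluster

      void write(uint64_t off, const std::vector<char>& d) override {
        if (test_mode) hdd->write(off, d); // pay the spinner's write cost
        ssd->write(off, d);                // data of record lives on the SSD
      }
      void read(uint64_t off, uint64_t len, std::vector<char>* out) override {
        if (test_mode) {
          std::vector<char> scratch;
          hdd->read(off, len, &scratch);   // real seek + rotational delay;
        }                                  // result intentionally dropped
        ssd->read(off, len, out);          // serve the correct bytes
      }
    };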
F: That's something we don't get coverage of, right: long-running tests in an already degraded, fragmented state of the cluster.
D: This stems from a question Igor asked me on a PR some time ago, whether I had actually tested on a spinner, and I realized that testing it would take like a month or so. Instead, I came up with this solution, but I still haven't finished the simulator.
E: Well, I don't know if this would work, but maybe it's worth additional thought: in this pre-fill stage, you might avoid writing data to the disk and instead fill just the metadata.
E: The only drawback I can see for now is that we still need some writes for service data, so not every data write could be bypassed in this special mode.
A: Yeah, it might not be the same thing that you're thinking of, but it might be something we could use to take a machine that only has SSD drives and then make pretend hard drives on them.