From YouTube: Ceph Performance Meeting 2021-04-22
Description
A
All right, well, I suppose we'll get people from core hopefully fairly soon. It's been a while since we've had one of these meetings, so welcome back, folks. Let's start with pull requests. There was not a whole lot of stuff over the last couple of weeks that was new or closed, probably due to all the work that went into the release and then people recovering from that to some extent. I think things were fairly busy with CDM as well. I'm sorry, CDS.
Anyway, of the ones I do have here, there are two new PRs, both related to onode pinning and trimming. One is from Igor, who's here, and one is from Adam. I was hoping to talk about this today, but Adam is apparently out for the week, so I don't know how much we'll be able to actually discuss.
A
B
I think that both approaches are viable. Well, maybe the one that is mine is more straightforward and less error-prone, but, well, my opinion is definitely biased at this point, so I'd like someone else to...
A
So yeah, maybe, Josh, you and me, we could look at it, and then we can discuss at the next performance meeting or something.
C
A
All right, well, we just covered one whole discussion topic for today, so we can move on to updated PRs now. Let's see: RGW compression, that got more reviews.
A
Gabby, we can't hear you.
E
Sorry, I was on mute. I kept talking and nobody was reacting, which was very frustrating. So, sorry, I put you as a reviewer, but I mainly need Igor and Adam to review this fix. I also put Josh and you because it's performance, so you might want to take a look.
A
Yeah, I might. If I have time, I might try to do just a couple of quick performance tests, a secondary run in addition to the ones that you've done, just to verify that we see it twice.
E
Come again, Igor? I can't hear you. Have you run a store test?
B
E
If you could, please send me the name of the test, and I would also add it. I also need your help again with the stuff we talked about yesterday, because I changed the code and the numbers I'm seeing are completely bogus. It reports the number of objects as a few hundred, while I know I've got many, many more.
B
F
B
E
A good approximation? Sorry, I don't know what I'm looking for. Brett asked me to print some kind of a progress bar or some kind of progress indication, so I need to know approximately how many objects I've got.
E
B
E
B
Yeah, I mean that if this call to get an onode count estimation doesn't work, you might try an alternative, which is to use estimate size. Then you read every onode and you get the blob size for each onode, and hence you can track the progress using the already-read size versus the total size.
E
I'm not sure I follow, but maybe we can take it offline. Okay.
B
Okay, so estimate size returns, say, one megabyte for all onodes. Then you read onodes one by one, and on each read you get, say, 32 bytes. So you know that you should read one megabyte divided by... oh well, sorry. So you just account for every onode: 32 bytes for the first onode, the next 32 bytes, or whatever number you get, for the second onode, and in total this should add up to that one megabyte.
But you need some more precise progress based on the actual numbers. So you have one megabyte total, and you read your onodes: for the first onode, 32 bytes; for the next onode, 33 bytes. So you just sum the already-processed amount of data, and this amount should finally get to this megabyte, and hence you can get this progress.
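The byte-accounting idea described here can be sketched roughly as follows. This is only an illustration in Python, not BlueStore code: the class name `ByteProgress`, its methods, and the sample sizes are all hypothetical, and the estimated total stands in for whatever the size-estimation call returns.

```python
class ByteProgress:
    """Track progress as bytes processed over an estimated total (hypothetical helper)."""

    def __init__(self, estimated_total_bytes: int) -> None:
        if estimated_total_bytes <= 0:
            raise ValueError("estimated total must be positive")
        self.estimated_total_bytes = estimated_total_bytes
        self.processed_bytes = 0

    def account(self, nbytes: int) -> float:
        """Add the size of one record that was just read; return the fraction done."""
        self.processed_bytes += nbytes
        return self.fraction()

    def fraction(self) -> float:
        # The total is only an estimate, so clamp the result at 1.0
        # in case the records add up to more than was estimated.
        return min(1.0, self.processed_bytes / self.estimated_total_bytes)


# Illustrative numbers only: an estimated 1 MiB in total,
# then two records of 32 and 33 bytes read one after another.
progress = ByteProgress(estimated_total_bytes=1024 * 1024)
for record_size in (32, 33):
    progress.account(record_size)
print(f"{progress.processed_bytes} bytes read, {progress.fraction():.4%} done")
```

Because the denominator is an estimate, the reported fraction is approximate, which is probably acceptable for a progress indication.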
E
D
And as far as the status object count goes, it comes from the PG map, and you can also get it. The PG map has information about all the PGs and all the OSDs, and the OSDs also individually have a count of their objects, degraded objects, etc. The particular function that we use to update that metric is called update_calc_stats, and that happens on a periodic basis in the OSD code.
So if you just look at mon/PGMap.cc, that will tell you the cumulative stats, and if you want the OSD-specific ones, you can look at calc stats in the osd directory.
E
Can you please send me the link to this function? Yes, okay.
B
But in this case you would need to call some higher levels from BlueStore, so it's going to result in calls from the low levels upwards. Currently we have the opposite direction: higher levels call lower levels, like BlueStore.
Not sure if that looks pretty nice.
D
Yeah, I'm not sure either, but maybe you can start looking at it and see if that works for you, or look at how we accumulate. Even for those, you will have a lower-level interface from where we are getting those numbers, maybe.
E
Yeah, so I'm going to give it a try. If it's working, then I'm going to use it. Otherwise, I'm just going to print a message every 5-10 seconds, just so people see that I didn't disappear. Because we tried to do the math: it seems that on my box I've got four terabytes of storage and I'm using 4K IO with a lot of small objects, so I've got 3.5 million objects, 20-something million shards, and 500 million extents inside the objects.
So I should not run my code for 15 minutes without printing anything. Every few seconds I should just show what I've done. Even just printing the progress into the log gives us 90% of what we need. Having a very nice progress bar would be a very nice thing, but it's not essential.
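The fallback of logging a message every few seconds could look something like the sketch below. It is a minimal Python illustration, not the actual recovery code: `ThrottledReporter`, its parameters, and the message text are hypothetical, and the 5-second interval is just the lower end of the range mentioned.

```python
import time


class ThrottledReporter:
    """Emit a progress message at most once per interval (hypothetical helper)."""

    def __init__(self, interval_seconds=5.0, clock=time.monotonic, emit=print):
        self.interval_seconds = interval_seconds
        self.clock = clock          # injectable so the timing can be tested
        self.emit = emit            # injectable so messages can go to a log
        self.last_report = clock()
        self.count = 0

    def tick(self, n=1):
        """Record n processed items; report if the interval has elapsed."""
        self.count += n
        now = self.clock()
        if now - self.last_report >= self.interval_seconds:
            self.emit(f"processed {self.count} records so far")
            self.last_report = now
            return True
        return False


# Usage: call tick() once per record inside the long-running loop.
reporter = ThrottledReporter(interval_seconds=5.0)
for _ in range(1000):
    reporter.tick()
```

Checking a monotonic clock on every record is cheap, and it keeps the output rate independent of how fast the iteration runs, which matters when the same loop may cover a few thousand or hundreds of millions of records.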
A
Another thing you could do, Gabby, is add a performance counter for it.
Oh, in the admin socket interface, you could set it up so that you can query for the current status by just maintaining a counter.
B
Okay, I'm not sure it would work, since the OSD is probably inaccessible in this state.
C
A
Is that definitely the case? I was thinking it'd be really nice if you had the ability in the GUI to see that the OSD was in a state like that, rather than just assuming it's totally offline.
E
B
Sorry, Igor, what were you saying? Also waiting for Gabby? Oh, okay.
E
B
Disconnected. Just a question, so, well...
B
E
B
That hour involves all the data reading and all this stuff to recover the allocation map. But what about the timing for just iterating all the records which you would read afterwards?
E
B
E
But you know what, I'm going to measure this on my system.
B
E
Now, my system is a worst-case scenario, because I'm building a lot of objects with very small allocations. If you have big objects with big writes, then the iteration is going to be much faster.
B
Yeah, again, just to understand how long it takes compared to the total recovery process.
A
So, next on the list: ceph-volume. I think that's more of the same, kind of still ongoing.
A
And there are some more updates to the D3N cache changes for RGW, which have also been ongoing for a while now. Okay, that was it for updated.
Under no movement: I haven't done much with the onode map change and better documentation for BlueFS buffered IO. It's really just a dumb, easy PR, but Ronan has expressed interest in working on the ghobject key hash to better populate an unordered map or some other kind of hash-like data structure, rather than switching it over to a tree, so he wanted to take a stab at that. So I'm not doing anything with this right now, and maybe most of this just goes away and turns into just the documentation for BlueFS buffered IO, but we'll see.
The omap bench test: the only thing that I think we're trying to decide on before we merge this is whether or not there should be any kind of mechanism to let the user of the test set parameters. It's kind of irritating that, I guess, gtest doesn't really have a good way to do this. Adam had recommended maybe letting you set environment variables before the test is run. It seems a little, I don't know, not ideal. I'd much rather be able to just pass command-line parameters. But anyway, if anyone has any opinions on that, that's kind of the only thing that we have to decide on, I think, before we merge the first version of this.
A
I think the CephFS team is planning on working on the subtree map PR. Patrick, actually, I think he's going to work on that.
The only other thing I see in here is that I think we can now, or I suppose I can now, get the age binning implemented, since we've got all the column family stuff and other related things done. So that PR, the kind of final piece of the age binning, can go in. Adam had a couple of concerns about the way that we're doing that, and whether it might be better to rely on heuristics like cache hit rates and miss rates, and that's a fair argument. So we'll see where that eventually goes, but we can at least start working on it again. Anyway, I think that's basically it for PRs.
A
Okay, we talked about Gabby's allocation stuff. Crimson: I don't have much, but I did give a presentation to some folks at IBM and others. Well, it was a couple of weeks ago now. I've linked the slides in the chat window for anyone that wants to look at those. I think a number of people have already seen all this, but for anyone that hasn't, I made it public.
So, related to that: now, with AlienStore, after PRs from Radek and others, we are more efficient for both small random reads and small random writes than the classic OSD. That's really exciting! So yay, everybody, that's really good news.
We are still slower, though. We're more efficient, but we can't use as many cores. Right now we're topping out at about two to three cores with Crimson, with all of the bottleneck concentrated in the reactor thread. So we are still slower, but when we can move to multi-reactor setups, however we end up doing that...
To kind of try to test how we might do, I did try multi-OSD configurations on a single device, and Crimson is actually slower with four Crimson OSDs talking to the same device versus one, even with 1x replication, you know, just trying to get as much performance out as possible. I don't know why yet. I'm still looking at it, but maybe I'll have more to say about it next week. And that's basically it for Crimson.
A
I guess we also kind of already talked about the strategy stuff. Since Adam's not here, Josh and I will look it over. Maybe next week, if Adam's back, we can discuss.
That's all I've got. Anyone have anything they'd like to bring up this week?
C
B
C
Bergen, is that how you say your name? Thank you for bringing up backfill and recovery reservations on the mailing list. Do you have any more questions around that area, or...?
G
No, I just want to confirm you can hear me. Is my mic okay? Right, yeah. Yes.
This is the first time I've used BlueJeans, so I just figured I should double-check that. Yeah, I just wanted to confirm my understanding from a bunch of research that I had been doing, probably last year sometime. I think you answered all my questions adequately. Thank you. Yeah.
A
All right then. Well, if there's nothing else, guys, let's wrap it up.