From YouTube: Ceph Performance Meeting 2021-03-11
A
All right, let's see. I'm going to pull up the Etherpad and we can get moving. I figure people might slowly trickle in, but it looks like we do have the core people now. So, okay, let's see: I've got two new PRs this week that I saw. One is from me and is trying to make a really quick and dirty omap benchmarking implementation in our objectstore test suite. I imagine there's going to be room for improvement here for sure, but this is just kind of a first-pass attempt at it. I'll talk about this a little more later, but it's definitely showcasing some interesting things.
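As a rough illustration of what a "quick and dirty" omap benchmark amounts to (this is my own sketch, not the PR's code; the function names, sizes, and the dict standing in for ObjectStore are all invented here), it just times a put phase and an iterate phase:

```python
# Quick-and-dirty omap-style benchmark sketch: time N puts, then one
# full iteration. A plain dict stands in for the object store here.
import time

def bench_omap(store_put, store_iterate, n_keys=10_000,
               key_len=64, val_len=256):
    keys = [f"key-{i}".ljust(key_len, "x").encode() for i in range(n_keys)]
    val = b"v" * val_len

    t0 = time.perf_counter()
    for k in keys:
        store_put(k, val)          # omap set-keys stand-in
    put_secs = time.perf_counter() - t0

    t0 = time.perf_counter()
    count = sum(1 for _ in store_iterate())  # omap iteration stand-in
    iter_secs = time.perf_counter() - t0
    return put_secs, iter_secs, count

db = {}
put_secs, iter_secs, count = bench_omap(db.__setitem__, db.items)
```

The real benchmark drives an actual ObjectStore backend instead of a dict, which is exactly where the interesting differences show up.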
A
The next new PR is from Gabi, and this is his PR to remove allocations from RocksDB. Josh and Igor have both reviewed it. It looks like there are maybe a couple of little things that could be worked on or improved, but they're both on top of it and working to help make it ready, not that it wasn't already good, but ready for being merged.
A
Okay, I did not see any closed PRs this week, but I just realized I don't think I looked through the list of PRs that were submitted and closed this week, so it's possible I missed something. Please feel free to speak up if you know of something that closed that I missed here.
Updated this week:

A
We have a PR from (I hope I say it right) Shuhan that is kind of like two separate PRs in one: scattering AlienStore threads across separate CPU cores, but it also has a separate thing kind of bolted on as well. Kefu and I talked about that this morning, and I think we just need to split it into two separate PRs. The first part, actually distributing threads across cores, is pretty straightforward. That looks great; I think that's just fine.
A
We should merge that. The other piece is more complicated and maybe isn't quite what we want, so more discussion needs to happen on that part.
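For context on the straightforward part, the general mechanism for distributing threads across cores looks something like this (a Linux-only Python sketch of my own; the actual PR does this in the AlienStore C++ code, not like this):

```python
# Linux-only sketch: restrict each worker thread to its own CPU core.
import os
import threading

def pin_to_core(core: int) -> None:
    # pid 0 means "the calling thread" for sched_setaffinity on Linux
    os.sched_setaffinity(0, {core})

def worker(core: int, results: dict) -> None:
    pin_to_core(core)
    results[core] = os.sched_getaffinity(0)  # record the resulting mask

ncores = os.cpu_count() or 1
results = {}
threads = [threading.Thread(target=worker, args=(i, results))
           for i in range(min(2, ncores))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The point of doing this in a store backend is to keep worker threads from migrating and competing for the cores the reactor threads are using.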
A
Let's see, what else? Oh, actually, I am wrong; I had this in the wrong spot. Adam's PR to distinguish between buffered and direct IO in BlueFS: that was a bug fix more than anything. We weren't reporting the correct state to RocksDB. That's been fixed and was merged by Kefu.
A
And then there's the ceph-volume PR to retrieve device data concurrently; David Galloway just commented on that.
A
I don't know what exactly the holdup still is on this one, but in any event it is still being discussed and worked on. Let's see. So, no movement on this BlueFS buffered IO issue, but just before the meeting last week, or really just before it, Adam and I had done a bunch of work going through the RocksDB code for about three hours to understand the route we take when doing iteration, all the way back to BlueFS and reading data for RocksDB.
A
Otherwise I don't think there was a whole lot else going on that I saw. Sam has a whole bunch of stuff happening with SeaStore, but I don't know if it's really performance-related yet, so I didn't include it in this list, but there's definitely a ton of work going on there right now.
A
For omap performance, there's this PR that I've got for doing the omap benchmark in the objectstore test suite; I'll paste it in the chat window.
A
Here we are. So, I don't really trust this yet, to be honest. The FileStore results look really, really good, and I don't totally know why. It's possible that FileStore just uses so much less memory that RocksDB can utilize the page cache effectively even given a small container memory limit, and it's still just doing really well. But even in other tests it's doing surprisingly well.
A
So I need to understand this, but the gist of it is that in BlueStore we're seeing really, really slow iteration over omap, and seemingly slow omap put and get performance as well.
A
There are multiple code paths involved depending on whether you're doing direct or buffered IO, so there are a lot of different things that could impact what's going on. Given what I'm seeing in my tests, which I don't totally trust, I think we're seeing a really obvious performance hit when the block cache in RocksDB is being heavily contended, when other things like onodes, or maybe allocation data or other stuff, are cycling through the block cache.
A
My guess as to what's going on is that that's basically forcing the SST files that were being read for iteration out of the cache, and we're going back, doing prefetch, and it's all very slow reading from disk. We might even be iterating over a long time period where, in between, other stuff is forcing those SST files out of the cache and we're actually reloading them. I don't know yet, but it sure seems like that's what's going on, which is really nasty.
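As a toy model of that hypothesis (entirely my own illustration, not Ceph or RocksDB code), an LRU cache whose capacity is below the combined working set of iterator SST blocks plus competing metadata misses on every access, while a bigger cache only pays for the first pass:

```python
# Toy LRU block cache: count "disk reads" (misses) while an iterator's
# SST blocks compete with metadata blocks cycling through the cache.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)      # mark most recently used
            return self.store[key]
        self.misses += 1                     # "read from disk"
        self.store[key] = f"block-{key}"
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict least recently used
        return self.store[key]

def iterate_passes(cache, sst_blocks, competing_blocks, passes=3):
    for _ in range(passes):
        for b in sst_blocks:        # the iterator reading SST blocks
            cache.get(("sst", b))
        for b in competing_blocks:  # onode/allocation data cycling through
            cache.get(("meta", b))
    return cache.misses

small = LRUCache(capacity=8)    # stand-in for a small block cache
big = LRUCache(capacity=64)     # stand-in for a big block cache
misses_small = iterate_passes(small, range(10), range(10))
misses_big = iterate_passes(big, range(10), range(10))
```

With the cyclic access pattern, the small cache misses on every single access (60 misses over 3 passes of 20 blocks), while the big cache only misses on the first pass (20), which would show up as roughly the kind of disparity described below.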
A
If that's the case, in that PR, though, if you look at the last set of tests from like an hour ago, you can see there's this huge disparity in BlueStore direct IO iteration performance when we have a smaller osd_memory_target, and thus a smaller RocksDB block cache, versus a bigger memory target and thus a bigger block cache.
A
It's
like
a
10x
difference,
so
that
is
kind
of
what
I'm
going
to
try
to
focus
on
is
understanding
exactly
what
it
is.
Doesn't
look
to
me
like
right.
Look
to
me
right
now
like
this
is
a
bug
exactly
in
rocks
tv,
it's
not
like
when
we're
doing,
pre-fetching
or
or
reads
for
iteration
that
we're
not
putting
stuff
into
the
blog
cache.
I've
seen
some
evidence
with
wall
cloud
tracing
that
we
are,
in
fact
loading
data
into
the
block
cache.
A
So I don't think that's what's going on; it's that we're just thrashing it really badly. This may just be an indication that BlueStore is trying to use RocksDB for too much. We already kind of suspected that anyway, so that's what we've been working on. If people are interested, we can go through a walkthrough of the RocksDB code, but it's pretty esoteric; I don't know if folks want that or not. That's kind of it. High-level questions, thoughts?
A
Well, I keep going back and forth on this, because that's kind of what I originally thought, and I saw some very ad hoc claims on various forums that it did. And I am seeing with BlueStore a change in the performance of iteration with buffered IO when the cgroup limit is increased, even if the osd_memory_target is left the same.
A
So, yeah, anyone that's interested in this stuff, please look at the benchmark code. I don't really trust it; it's basically just taking the existing test and throwing counters around it.
A
Sam, I don't mean to pick on you, but do you remember, with FileStore, since you worked on it so much for so long: would you expect really fast omap performance with it? Does it make sense to you, given the...
A
So, yeah, that's kind of where we're at with this. I think I included a couple of wall clock profiles of an early version of the benchmark in the PR; there's definitely some crazy stuff going on in here. Oh, I forgot: I still need to run these tests while telling TCMalloc that it can have a bigger cache, because that's not implicit when running the objectstore test individually.
B
I thought it was interesting that the hashing was showing up in your profiles for BlueStore, just calculating the hash for which cache shard to access, it looks like.
A
I do kind of remember seeing that too. Let's see... yeah, there it is.
A
And their version of the block cache doesn't use our Jenkins hash. We tooled that in when we made our own version of it; we replaced their hash with the Ceph one. I don't actually know: is our Jenkins hash any slower than any of the other implementations, like, what's it called, the twister? I'm not sure. Okay, shoot, sorry. Yeah, anyway, go ahead.
B
I was just surprised to see that it showed up so high in the profile like that.
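For reference, the kind of per-lookup work being discussed looks roughly like this (a sketch using the well-known Jenkins "one-at-a-time" hash, which is a Bob Jenkins function related to, but not identical to, Ceph's rjenkins hash; `NUM_SHARDS` is a made-up shard count):

```python
# Jenkins "one-at-a-time" hash (32-bit) and a toy shard lookup.
def jenkins_one_at_a_time(data: bytes) -> int:
    h = 0
    for byte in data:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h

NUM_SHARDS = 16  # hypothetical shard count for illustration

def cache_shard(key: bytes) -> int:
    # every cache lookup pays the hash cost before touching its shard,
    # which is how hashing can climb up a wall-clock profile
    return jenkins_one_at_a_time(key) % NUM_SHARDS
```

Since the hash runs on every single block cache access, even a cheap hash can become visible in a profile when the cache is hammered.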
A
So also in here, there's time being taken in the red-black tree, the standard map.
B
With the TCMalloc increased central heap size, too, I'd be curious to see what the profile looks like after that.
A
Also, the current version of the benchmark works really differently, right? Those traces were from doing a single object with like a million keys on it. The new version lets you specify a number of objects to create with a smaller number of keys per object, so the trace might look a lot different.
A
But the good news is that we can still do that, right? With the new version of the benchmark you just specify one object in the collection and then a ton of keys on it, so we can still do those kinds of tests.
B
I don't think it makes much of a difference, multiple collections versus one collection. Okay, there's also something I'm missing about serialization there, but you're going directly through the ObjectStore interface.
B
What kind of DB size are you testing with so far?
A
I have no idea, literally; I just started throwing objects and keys into things. I mean, I know what the test does: this is like a hundred thousand objects with 100 keys per object, the key length is 64 bytes, and the value length is 256 bytes, so you can roughly figure out what RocksDB would see prior to any space amplification.
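Back of the envelope for that configuration (raw key/value bytes only, ignoring RocksDB metadata and any amplification):

```python
# Raw key/value footprint of the test configuration just described.
objects = 100_000
keys_per_object = 100
key_len = 64     # bytes
value_len = 256  # bytes

total_pairs = objects * keys_per_object          # 10 million pairs
raw_bytes = total_pairs * (key_len + value_len)  # 3.2 GB raw
raw_gib = raw_bytes / 2**30                      # roughly 3 GiB
```

Roughly 3 GiB of raw key/value data is comfortably larger than a small osd_memory_target, which is consistent with cache effects showing up.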
A
Yeah, it was enough that I actually started to see cache effects. That was what I was really going for: I wanted increasing the size of the osd_memory_target, and thus the block cache, to actually make a difference for BlueStore. That was really all I was focusing on.
B
Yeah, I guess for the FileStore case you could even run the system in a low-memory mode, where you don't let the kernel use more than the data set size, to force it to go to disk sometimes, if you wanted to avoid the page cache effect entirely.
B
There's a kernel parameter you can set when you boot to restrict the visible memory to a smaller amount than what you actually have physically available.
B
Then you don't have to worry about how the cgroup is playing with the page cache, exactly.
A
One other thing I wanted to ask here: Igor, you looked at collection prefetch and collection caching, I think, a while back, and I think the thought was that prefetch was going to be good enough. Do you remember?
A
Well, so this is not really being tested in the benchmark right now, right, like doing a collection list itself. But given the other things I'm seeing in this benchmark, I wonder if it's going to be very slow in BlueStore.
D
I was pretty lucky with repetitive PG listing without additional caching; once I started to reuse the saved position, I can't see the slowdown.
D
And also, this was all compared on the same node, on the same data set.
D
Well, it was something like this: with buffered IO, with bluefs_buffered_io set to false, it was taking around eight seconds; when I set it to true, it dropped to one or two seconds; and then upgrading to Octopus 15.2.9, which has this new removal fix, dropped it to less than a second.
D
And, well, definitely that was the case when bulk removal had happened before; for regular usage the difference is not that large, but that was the case when it was degraded after bulk removal.
A
Okay, and that was not caused by the RocksDB tombstone issue that we saw also, right?
D
But again, it's not that efficient, since it depends on your access pattern. If you use iterators and save the previous position, it works pretty well anyway, without compaction and things like that; otherwise it's a search from the first entry, which might be inefficient.
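A toy illustration of that point (my own, with a sorted key list standing in for the database; real RocksDB iterators behave analogously when deletion tombstones pile up): resuming from a saved position steps over at most one dead entry per batch, while restarting every listing from the beginning re-scans all previously deleted entries each time:

```python
# Toy model: drain a sorted key space batch-by-batch, either restarting
# every listing from the beginning or resuming from the saved position.
# "steps" counts iterator moves, including moves over deleted entries.
import bisect

class ToyKV:
    def __init__(self, keys):
        self.keys = sorted(keys)
        self.deleted = set()   # tombstones the iterator must skip
        self.steps = 0

    def next_live(self, start, count):
        out = []
        i = bisect.bisect_left(self.keys, start)  # Seek(start)
        while i < len(self.keys) and len(out) < count:
            self.steps += 1
            if self.keys[i] not in self.deleted:
                out.append(self.keys[i])
            i += 1
        return out

def drain(kv, batch, resume):
    pos = ""
    while True:
        got = kv.next_live(pos, batch)
        if not got:
            return kv.steps
        kv.deleted.update(got)           # "remove" this batch
        pos = got[-1] if resume else ""  # save position, or restart

keys = [f"k{i:04d}" for i in range(100)]
steps_restart = drain(ToyKV(keys), 10, resume=False)
steps_resume = drain(ToyKV(keys), 10, resume=True)
```

For 100 keys in batches of 10, restarting costs 650 iterator steps against 110 when resuming, and the gap grows quadratically with the number of entries removed.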
D
So
well
what's
happening
when
you
are
removing
pg
on
each
collection
listing
to
retrieve
text
items
to
to
remove
it
has
to
to
run
through
a
longer
list
of
removed
entries.
I
mean
the
original
behavior.
It
always
started
from
from
the
beginning,
so
on
the
first
iteration
it
retrieves.
The
first
entries
then
remove
them
on
the
next
one.
A
We've
got
a
couple
of
people,
or
at
least
one
person
right
now,
working
on
trying
to
take
all
of
the
different
customer
cases
that
we
think
may
be
related
to
flow
map,
performance
and
solo,
iteration
and
and
kind
of
get
them
together,
so
that
we
can.
We
have
that
it
seems
like
it
does.
Whatever
this
this
is.
It's
maybe
affecting
many
different
use.
Cases,
though.
D
Yeah,
just
just
one
one
thing
to
mention
that
actually
this
patch
for
removal
doesn't
drop
roxdp
in
in
in
any
case,
so
one
might
have
some
different
scenarios
when,
when
there
are
a
bulk
of
bunch
of
multiple
removals
in
in
roxbb,
and
currently
we
are
still-
we
might
still
suffer
from
from
from
this
yeah.
Well,
one
of
the
keys
is
snap
mapping
records.
D
Agreed. Currently we have just this one scenario fixed. My second PR, which, unfortunately, I don't have enough time to proceed with work on...
D
It does some stuff, but, well, it allows fixing bulk omap removals by using range deletes and subsequent compaction. Yeah, again, that's probably not a 100 percent solution.
D
Yeah, and actually compaction is good in any case, whether you applied this or not; applying compaction to a totally degraded RocksDB was a workaround for a while.
D
Still, an open question is how to handle bulk removals of unsorted records.
A
So,
in
the
the
test
case,
I've
got
here
in
this
this
pr,
I
linked
there's
no
removals
happening
until
the
very
end,
so
we're
not
even
touching
any
of
that
that's
going
to
just
be
even
worse.
I
think
this
is
just
touching
the
cache
and
how
effectively
we
can.
We
can
iterate
your
memory
constraint
scenarios.
A
Damn
not
to
bother
you
again,
but
if
I
remember
right
file
store,
does
something
kind
of
strange
with
omap
when
it's
is
submitting
to
roxdb
or
previously
leveldb,
it's
not
as
straightforward
as
this
bluestore.
But
do
you
do
you
remember
how
that
works
or
is
there?
Is
there
something?
Is
it
easy
to
sorry?
What
was
that.
A
Yeah, it's probably not using a significant amount of block cache, because I don't think FileStore changes the default, so it's whatever we set the default block cache size to; I don't know, it's like 500 megabytes or something. But it doesn't really matter, because it's all buffered IO going to whatever file system it's running on.
A
So
whatever
page
cache
is
left,
if
assuming
the
c
group
is
actually
limiting
it,
then
if
file
store's
not
using
a
whole
lot
of
memory,
you
might
have
a
couple
of
gigabytes
available
for
rocks
to
be
paid
to
use
in
the
page
cache
for
for
this
and
that's
good
enough,
and
it
just
works
all
right.
Well,
this
is
all
I've
got
guys
any
any
other
thoughts
or
questions
or
comments
all
right.
Well,
oh
sorry,
were
you
saying
something.
A
Okay,
I'll
keep
working
on
this,
and
let
you
guys
know
how
it
goes.
Anyone
have
anything
that
they
want
to
bring
up
before
we
wrap
the
meeting
up.
All
right
well,
then,
have
a
great
week,
everyone
and
see
you
again
next
week,
thanks
guys
thanks.