From YouTube: Ceph Performance Meeting 2020-09-10

A: Hey, sorry about that, I had Firefox issues for some reason. It's not happy at the moment. So anyway.

A: Josh, it looks like you were starting. What's up?

A: Oh okay, I'll be frank: I don't have a whole lot here. I only got about halfway through reviewing pull requests, so not a whole lot. Josh, it looks like you merged your auth monitor PR. Cool, that's good! There are a couple of other updated ones. It sounds like maybe this osd_async_recovery_min_cost one; the thought now is that we might leave that at the default.

D: Last week we found out that some of the tests that were being done didn't have the right CRUSH rules, because of which the results may not be reliable. So we need to wait on that one until further results are ready.

A: Cool. Let's see, and then I think Kefu is still looking at the PR for optimizing BlueStore.

A: Okay, but Kefu reopened it. Okay, so yeah, it looks like there hasn't been any real movement on this since last week; it was just closed and reopened. All right, yeah. Otherwise I'm not seeing a whole lot. Were there any PRs this week that anyone had that should be on this list?

A: All right, well then, let's move on. Josh, I hear you were asking Igor: did he have a status update for us?

E: Yeah, sounds good. Well, first of all, the introduction: we've got multiple complaints about pretty slow pool removal in Ceph. It might take several hours to reclaim the space after pool removal, or sometimes, I've heard, even days.

E: So this involves collection listing, which is retrieving up to 30 entries per placement group (30 onodes, well, their names). After that it attempts to remove each entry from the snap mapper, which I'd prefer to leave aside for now; it's not a trivial operation as well, but for my experiments I didn't have any snapshots, so it was a trivial one. Then, in the third step, the OSD invokes onode removal from BlueStore, which in fact is a bunch of other sub-operations: removing all the onode's omap, then reclaiming all the space on the main device occupied by this onode, which means updating the BlueStore allocator, and then removing all the metadata records for this specific onode. Once this iteration over 30 entries is completed, the task sleeps for some time, determined by the OSD parameter osd_delete_sleep, which differs depending on the setup.
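
In rough pseudocode, the loop being described looks something like the sketch below; every name in it is an invented stand-in rather than an actual Ceph symbol:

```cpp
#include <chrono>
#include <string>
#include <thread>
#include <vector>

// Illustrative stand-ins; none of these are actual Ceph symbols.
struct ObjectName { std::string name; };
bool collection_list(unsigned max, std::vector<ObjectName>* out); // step 1: list up to `max` onode names in the PG
void snap_mapper_remove(const ObjectName&);        // step 2: drop the snap mapper entry
void bluestore_remove_omap(const ObjectName&);     // step 3a: drop the onode's omap
void bluestore_release_extents(const ObjectName&); // step 3b: update the allocator, reclaim space on the main device
void bluestore_remove_metadata(const ObjectName&); // step 3c: drop the onode's metadata records

void remove_pg_objects(double osd_delete_sleep_sec) {
  const unsigned batch = 30;  // entries fetched per listing pass
  std::vector<ObjectName> names;
  while (collection_list(batch, &names)) {
    for (const auto& o : names) {
      snap_mapper_remove(o);
      bluestore_remove_omap(o);
      bluestore_release_extents(o);
      bluestore_remove_metadata(o);
    }
    // throttle between 30-entry batches, per osd_delete_sleep
    std::this_thread::sleep_for(
        std::chrono::duration<double>(osd_delete_sleep_sec));
  }
}
```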

E: Okay, so well, actually there are a bunch of issues with this removal procedure.

E: Well then, the third issue...

E: But in fact, since we are talking primarily about database operations, these two setups should already be the same, in my opinion. So a hybrid setup, from the database point of view, is no different from an all-flash one, and potentially we might have some hidden issues on a full-flash setup, where pool removal might cause a significant performance drop, since the default parameter is set to zero seconds.

E: And the fifth issue is about using more advanced techniques to remove multiple records from RocksDB. Actually, it has a DeleteRange function, which might...

E: Well, which is not ideal, to be honest. On one side it allows removing multiple sorted records in a single shot; from another...

E: But maybe we should reconsider this and try to enable it for all this pool removal stuff. I'll show how we can update this removal procedure to benefit from the DeleteRange function on one side, while on the other side we wouldn't produce that many tombstones, which this function creates in the database, since actually we'll need just three tombstones per placement group to be removed.
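
For reference, RocksDB's DeleteRange writes a single range tombstone covering a half-open key range instead of one point tombstone per key. A minimal sketch, with illustrative key bounds rather than Ceph's real key layout:

```cpp
#include <cassert>
#include <string>
#include <rocksdb/db.h>

// Remove every key in [prefix, prefix + "\xff") with a single range
// tombstone instead of one point tombstone per key. The bounds are
// illustrative only, not Ceph's real key layout.
void drop_range(rocksdb::DB* db, const std::string& prefix) {
  std::string end = prefix + "\xff";
  rocksdb::Status s = db->DeleteRange(rocksdb::WriteOptions(),
                                      db->DefaultColumnFamily(),
                                      prefix, end);
  assert(s.ok());
}
```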

E: And the last thing about this DeleteRange operation, which I discovered, not mentioned here, is that at some point we introduced...

E: Before using this function, we introduced a record counting procedure to estimate whether we want to apply this function or not, which actually iterates over a bunch of records until the threshold is reached; if we have enough records, then we proceed with DeleteRange, otherwise we delete using the regular function. And it looks like this procedure might introduce additional overhead as well, which is not that small, as I've seen, and again we might get rid of this stuff.

E: I mean record counting, for the new pool removal stuff, and maybe after that DeleteRange wouldn't be this...

E: Well, in the next slide I'm going to present the ideas of how we can redesign this removal procedure. So, any questions so far before proceeding to the next slide?

E: So here is the new design which I'm proposing. I have a working PoC for that, and I'll show some numbers a bit later if we have enough time. But again, let me present this procedure, for now leaving aside the snap mapper stuff. What's the main issue with slow pool removal from the user perspective?

E: And once we reclaim all the space, we can proceed with removing all the records from the database. Here there are two options.

E: Actually, the first one is to leave things as is, that is, iterate over every onode again and remove them as we currently do: removing their omaps, again listing over them, and then removing all the onode metadata. But another option might be to benefit from the DeleteRange function.

E: And in fact we can issue just DeleteRange operations on the placement group. The first one is to delete all the regular onodes for each PG, which are sorted, and hence we can apply this DeleteRange. The second one is to delete temporary onodes, which have a bit different naming and hence form a different range. The third one is to remove all the omaps belonging to objects in this placement group. Potentially this is possible, but our current omap naming format doesn't provide this ability; actually, we need to change the naming scheme to include the placement group ID into the naming, and once we do that we might...
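
A sketch of that final step, assuming hypothetical key-prefix helpers (onode_prefix, temp_onode_prefix and omap_prefix are invented here, not the actual BlueStore key layout) and, for the omap case, the PG-aware naming change just described:

```cpp
#include <string>
#include <rocksdb/db.h>

// Hypothetical key-prefix helpers, not the actual BlueStore key layout.
std::string onode_prefix(const std::string& pgid);      // regular onodes of the PG
std::string temp_onode_prefix(const std::string& pgid); // temporary onodes (separate range)
std::string omap_prefix(const std::string& pgid);       // omaps; only valid with PG-aware naming

// After space reclamation, drop everything the PG left in RocksDB with
// just three range tombstones instead of per-object deletes.
void finish_pg_removal(rocksdb::DB* db, const std::string& pgid) {
  rocksdb::WriteOptions wo;
  auto del = [&](const std::string& prefix) {
    std::string end = prefix + "\xff";  // illustrative upper bound
    db->DeleteRange(wo, db->DefaultColumnFamily(), prefix, end);
  };
  del(onode_prefix(pgid));       // 1: regular onodes, sorted per PG
  del(temp_onode_prefix(pgid));  // 2: temporary onodes
  del(omap_prefix(pgid));        // 3: omaps (requires the naming change)
}
```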

B: About reorganizing the omaps, all right, all right. Didn't we change them to be organized on a per-pool basis?

E: Maybe, but it requires some additional modifications in the code, since currently it operates on placement groups. That's the first issue. And the second issue: potentially we might benefit from this new removal procedure when we need to remove just a single PG, when it's rebalanced to a different host or something, and if we use pool IDs, this wouldn't benefit from it.

E: Okay, so well, the next thing is the benchmark. So far I didn't have enough time to arrange it properly, sorry about that, but here's what I have so far: some numbers benchmarking the original delete versus some tuning of this delete sleep.
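
For context, the throttle being tuned in these runs is the osd_delete_sleep option family; assuming the current option names, the settings could be adjusted along these lines:

```
# Throttle between removal batches, in seconds; 0 disables the sleep.
ceph config set osd osd_delete_sleep 0
# Device-class variants exist as well, e.g. for HDD-backed OSDs:
ceph config set osd osd_delete_sleep_hdd 2
```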

E: And here we can... well, originally the initial performance, the bandwidth, was around 7.5 megabytes per second, and on the second run, with pool removal running in parallel, we are getting around 6.8.

E: And the removal time is even shorter. But in fact one hidden issue with these numbers is that they show average bandwidth.

E: As I'll try to show a bit later, actual bandwidth varies depending on whether removal is completed or not. So on average we are getting pretty good numbers for delete sleep 0, but in fact we have pretty low bandwidth while removal is running, which makes it a bit questionable whether we want to use it.

E: An incomplete, partial fix introduces space reclaiming before the removal, but the removal itself is still using the same seek-then-delete procedure. And here you can see, in this reclaim time row, that we are able to complete space reclamation much faster compared to the original pool removal, which is good for users, since they get their space back much faster.

E: Then the next column, which is not very interesting: it's a fully updated removal procedure, which performs space reclamation and then proceeds with the final delete ranges on this placement group. But since it has a two-second sleep it's still slow, because reclamation is still operating on these 30-entry portions and then sleeping for two seconds. So it's here just for reference.

E: More interesting is the next column, which is H. You can see it's some degree faster than our original removal for the same sleep period. We can also see that space reclamation takes 166 seconds and the total pool removal takes 169 seconds, so we need just three seconds to perform all the completion using this DeleteRange.

E: And the next column is the same, with the sleep period set to zero.

E: So, well, it looks like from the read performance perspective we are not affected much by this DeleteRange. One questionable thing is row 8, which is fsck running after the pool removal, and here I can see that it takes longer with all these changes.

E: Well, what else I have is some diagrams of how this second write run goes over time, in parallel with removal. Here the blue line refers to the original delete with the two-second period; you can see it's pretty stable on one side, but it's a bit lower all the time. And again, we don't have the space totally freed over this period, so actually the completion of pool removal is somewhere outside this diagram.

E: And the green line here is again a regular delete, coupled with the pure space reclamation.

E: And here is another set of diagrams, which again includes the original delete (blue lines) and then two new runs with all this new pool removal stuff, where the red line is with the sleep period for reclaim set to zero.

B: Thank you, yeah, that's really interesting. I think a lot of the ideas you talked about in the slides made sense in terms of the way to restructure things.

E: Yeah, yeah, yeah. Indeed, I observed some...

E: Actually, I haven't managed to remove it completely; I managed to reduce this drop, but it's still visible.

B: Yeah, I mean the restructuring of the deletes makes sense in general to me. I think the part that may be more questionable is the remove range and the impact it's having.

E: We know all these bulk removals, but, well, potentially it looks much better: just run one operation and it's complete. But unfortunately RocksDB doesn't perform it ideally, some...

B: Yeah, I agree that those changes definitely make it more likely to work, but I guess we need to do more testing and see.

A: The big thing was the number of... my brain is blanking... basically deleted keys, tombstones, there we go: the number of tombstones that really, as they accumulated, caused excessive overhead.

D: So Mark, remind me: the part about record counting only got added later in the rm range picture, right? The initial implementation did not have any record counting.

D: Yeah, so we just added it because we knew that we had some performance results that showed that without record counting, rm range was not doing so well. So we wanted to introduce record counting to introduce a threshold, right?

A: Yeah, so if you're deleting a huge number of keys, the thought was that at that point maybe it's better. But I don't know that we really looked at what that value really is; we just set it arbitrarily high and then said, well, for now let's avoid the really bad behavior, and then kind of punted on figuring out what it should really be set to. So right now I think we effectively just don't use it, because the value's so high.

D: Yeah, I remember the part where we just increased the threshold to a very high number so that it's not enabled by default. But I'm just trying to think: what Igor is proposing is that if we remove the record counting altogether, just for the pool deletion case, would we see the same performance impact that you had seen when you ran the tests on just the base implementation of rm range, without the record counting?

A: But I think that if we just leave it to its own devices, hoping that we don't end up with a lot of tombstones left lying around, we may end up in the situation we were in before, possibly.

D: Yeah, but it does sound like even if you leave the rm range part out, the other proposals that he made in one of the slides can be made incrementally, right? Like the delete sleep thing that he proposed, and also the collection listing. So I mean, I guess there's a lot of things that we can start working on immediately.

A: Definitely, yeah. The rm range thing is just, you know, one small piece, and there are almost certainly ways that we can work around it. I mean, it might even be good enough just to trigger compaction as soon as it's done.
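
A minimal sketch of that idea using RocksDB's manual compaction API; compacting the whole keyspace here for simplicity, though a real change would presumably restrict it to the removed range:

```cpp
#include <cassert>
#include <rocksdb/db.h>

// Trigger a manual compaction once pool removal completes, so accumulated
// tombstones are physically dropped instead of waiting for background
// compaction. nullptr bounds compact the whole keyspace.
void compact_after_removal(rocksdb::DB* db) {
  rocksdb::Status s =
      db->CompactRange(rocksdb::CompactRangeOptions(), nullptr, nullptr);
  assert(s.ok());
}
```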

E: Yeah, makes sense, but honestly I have never tried this manual compaction before. Well, I don't understand how it behaves from a performance point of view, how applicable it is in parallel with the regular load on this store, but yeah, I will try.

A: Yeah, I don't know that any of us do. This problem isn't just one that we face: if you go out and look at the RocksDB archives, like the mailing list or whatever (well, you can't really look at the Facebook one, but there's a Google Group thing that you can look at), there are other people that have hit issues like this before. It's not super...

G: I don't really have a question, rather a request. Igor, is it possible to share the slides and the results?

A: Thanks. So we pretty much used up the whole time, guys, which is good; it's always good to have things to discuss like this. Should we schedule next week to continue the onode discussion and move the paper off another week, or do people want to talk about the paper?

G: I read the paper, and personally I don't feel like it's that urgent. I mean, I think we can continue the discussion of the onode next week and postpone the discussion of the paper, but...

A: I'm really curious to see what you uncover as you continue to work on this. All right, well then, see you guys next week. Have a great week, everyone.