From YouTube: Ceph Performance Meeting 2022-03-17
Description
Join us weekly for the Ceph Performance meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contribute/
What is Ceph: https://ceph.io/en/discover/
A: Good morning, folks. Starting a little late; the core meeting is just wrapping up now. So, unfortunately, we're not gonna have Adam or Gabby today; they're both out on PTO. So we won't end up talking about their proposal today, but I do have some stuff I want to talk about with CyanStore. Before that, though, let's see, the core team is probably gonna arrive here in moments, I'm guessing, so I think I will just get this going right now. So, PRs.
A: I haven't done this in a while. I've been on PTO, so I haven't really looked at it closely, but with all the work that's going into Quincy, there hasn't been a ton of new stuff. In fact, I didn't really see anything that was really new over the last couple of weeks coming in. Feel free to add it if there's something that I didn't see, but there were a couple of different closed PRs.
A: So I'll start on those. There was a PR from Matt Benjamin on the RGW side that improved performance when you had different logging that you were looking at. That PR got closed without being merged; I'm not totally sure why. Casey, are you here? No, it doesn't look like it.

B: I'm here.
A: Well then, moving on. Josh Solomon's PR for improving performance in some rare cases in the balancer finally merged. We do not have Josh today, and I don't see Laura; I think she did quite a bit of the review on that PR. I don't remember exactly which cases this actually helped, but they both have been doing a lot of work on the balancer, trying to make it better and improve it.
A: So my guess is that that's not the only PR we'll eventually see in this area, but for now it's a general improvement; look at it if you're interested. There's a PR for using a thread-local pointer to save the shard. I don't remember much about this.
A: Laura looked at it, and Kefu actually looked at this PR. Is Ronen here? Ronen, do you remember much about this? This is which...

C: One second. I didn't think it was anything drastic or important, but...
A: Yeah, that was my impression too, and there were no performance testing results or anything, but it theoretically is better. So, okay, moving on then: testing classic with the performance tag, from Chris. Let's see, I have to get my window back open. Do we have Chris today?
A: I don't think so. So that got closed, but I do believe he is still working on trying to take some of the work that Kefu and Radek and others did with running Jenkins performance tests for Crimson, and having those actually run tests against our classic code as well. So that particular PR maybe was closed, but I think he is still working on the general idea.
A: So, not sure what happened there, why we closed that one, but in any event, I think that work remains ongoing. And the last one I had for closed PRs: Igor's work on speeding up pool removal by introducing collection list prefetch. That has been around for a very, very long time, and that was closed by Laura.

E: The new PR is for PG removal.
A: All right, so, updated PRs. I only had three in my list. One is this one for setting tracing compiled in by default.
A: This is on the RGW side... the RBD side, and I think Casey's also looking at this. The thought originally was that this had fairly low overhead, and I think there were plans to merge it. It went through some testing, and then, for whatever reason... I don't know if it didn't pass that, but there's been more updates and more discussion on it. So that has not merged yet and, I think, is still actively being looked at.
A: Do we have anyone from that group that was looking at it here? Casey, you're here; have you looked at that recently? This is the tracepoint stuff, right?
F: Yeah, I haven't heard any recent results performance-wise. You know that they switched to some tracer sender that batches things instead of sending everything synchronously, and that helped quite a bit. But I don't know what the latest numbers are.
A: All right, cool. Well, it'll be exciting to see what updated numbers they have after some of that work. Okay, next PR: this is just a doc PR for rewriting the hardware docs. I was, I think, tangentially involved in some of this, sort of, but I've not been super good about reviewing it. Dan from CERN has been kind of reviewing it, I think, so hopefully that's to people's satisfaction, primarily the CPU section, you know, what our upstream hardware recommendations are.
A: A lot of that documentation was quite out of date. Okay, and the last one: an MDS PR for skipping inode-with-caps iteration for empty directories. This is from Patrick. It looks like there were some bug fixes and more discussion going on there.
A: Sure, okay! Well, I think that active discussion is still ongoing, so that's it for that. Lots of stuff in the no-movement category, but I think the only one here that maybe I would want to follow up on immediately is that we do want to get the tcmalloc thread cache size moved into a Ceph configuration option, so I'll try to follow up with Adam on that one, for containers.
A: This makes it a lot nicer to be able to change that setting on the fly, rather than needing to change, basically, the Ceph deployment in the container itself. So yeah, that's probably something that we should try to get in sooner rather than later, I think. All right, so there were some comments... oh, before I move on: anything I missed from anyone that they'd like to talk about, any PRs?
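For reference, the tcmalloc thread cache limit is already runtime-settable through the gperftools MallocExtension API, which is presumably what an on-the-fly Ceph config option would call under the hood; today the knob is usually baked into the deployment via the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable. A minimal sketch, assuming the process is linked against tcmalloc (e.g. built with -ltcmalloc):

```cpp
// Sketch: adjust tcmalloc's aggregate thread-cache limit at runtime.
// Assumes the gperftools allocator is actually loaded into the process.
#include <gperftools/malloc_extension.h>
#include <cstddef>
#include <iostream>

int main() {
  const char* prop = "tcmalloc.max_total_thread_cache_bytes";
  size_t cur = 0;
  if (MallocExtension::instance()->GetNumericProperty(prop, &cur)) {
    std::cout << "current limit: " << cur << " bytes\n";
  }
  // Raise the limit to 256 MiB on the fly; no daemon restart or
  // container redeploy needed, which is the point being made above.
  if (!MallocExtension::instance()->SetNumericProperty(prop, 256u << 20)) {
    std::cerr << "allocator does not support this property\n";
  }
  return 0;
}
```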
A: Okay. So, Kenneth from SoftIron had posted in the chat window that there has been some ongoing work looking at Nautilus-specific performance regressions. I did quite a bit of work on that before I went on PTO, and they've been doing quite a bit of work on looking at that as well. So Kenneth had mentioned here that they've been doing a lot of performance testing on their side and noticed a few things.
G: A little noisy where I'm at, but I'll unmute.

A: You're good, yeah, go ahead.

G: Most of our hardware is ARM; we're actually using the AMD Opteron A1100, you know, the Seattle chipset, on a lot of our storage nodes, especially the density boxes. So that's a little unique versus what a lot of folks are doing. Our higher-performance boxes, the NVMe nodes specifically, are, you know, AMD EPYC, a little more normal, I guess, for what most other folks do.
G: So, a couple of folks on our team have more info than me; I wish they would have joined, but they're busy, so you're kind of getting secondhand or maybe thirdhand information here. But they showed me a graph this morning: there were apparently some assembly changes on ARM for ISA-L; they backported that to Pacific, recompiled, and noticed a pretty substantial performance increase. And also, apparently, our GitLab CI, where we're doing our Ceph builds, wasn't setting the CMake build type to release-with-debug-info (RelWithDebInfo), and fixing that made a pretty dramatic difference.
G: Especially at the larger block sizes, that made a huge difference. We're pretty close now; there's still a gap, like, Pacific is still, you know, demonstrably slower than Nautilus, especially on ARM. We haven't quite figured it out yet, but we're much closer now, at least.
A: So then we ultimately end up seeing a situation where, you know, Pacific looks worse than Nautilus, when it looked better when we initially tested it. That seems to be an unfortunate trend, which I guess means that we probably need to be doing a lot more performance testing on backports, which isn't going to be fun, but maybe is a requirement going forward.
G: Yeah, we're thinking about testing one more thing: since we have such tiny caches on our ARM chipsets, we're thinking about building with, what is it, there's a CMake option like MinSizeRel or something like that, and seeing if that makes a difference. Hold on.
G: I'll post to the mailing list, yeah. I was gonna give you an update when I had something more useful, I suppose, rather than anecdote, because anecdote isn't very helpful. But we're going to try changing some of the compile-time options on ARM, since the cache sizes are so tiny on those chipsets, and see if that makes a difference.
G: Sure. Also, what's interesting is, at least again on ARM, and I don't know why this would be different, but we see about a 10% performance drop running the daemons in containers, running them in Docker versus not, which I wasn't expecting to see either. And it's pretty consistent: the same compile-time options, the same linker options, and the same Ceph configurations; just moving them into containers on ARM, for us, we see about a 10% performance drop.
G: Right. I was wondering about that; I haven't looked deeply into it, and I'll admit my understanding is probably elementary-school level, but I was wondering if there was some sort of, like, socket or networking communication that was the bottleneck.
G: So it was interesting to see, and that definitely drives my decision on whether we're going Docker or not, or Podman, or what have you, right?
A: Yeah, yeah, and I think that's the struggle I go through too, because a lot of the performance tests that I run are on bare metal, specifically because I really don't want other random bottlenecks polluting, you know, the info we get out of this. But then it means that we're not necessarily running in the same way that a number of users are. So yeah, I understand your reasoning; especially if you're making an appliance, just running bare metal is easier. Yes, yes, yeah.
G: That's all I have today. I just wanted to join and say hi, give you an update. I joined last week, but you were on PTO and...
A: Yeah, well, I owe everybody more work on that too, because I ended up having a ton of PTO I had to use, so I kind of, you know, stopped working on it. But now it's time to go back and try to wrap up some of that. Also, for Quincy, we're gonna try to, you know, at least be able to showcase it. A lot of Quincy is gonna look better than Pacific, especially on the write path, due to the work that Gabby did, but it's also going to hide some of our sins from the backports, I think, that we saw. So there's probably more work to do there, even if the numbers look better.

G: Got it, got it. Well, I have to run in nine minutes for another call, but I'm glad you got around to, you know, my comments in the chat. Again, I appreciate everything you're doing.
A: Long term, if there's time to do it, and it's always down at the bottom of the pile, but it'd be really nice to make an automated system for doing performance bisects: just have it, you know, walk through doing benchmarks, and, like, downloading new versions of stuff, and just go through the whole process in an automated way.
G: We have a benchmarking tool that we use that I'm trying to integrate as part of our CI/CD pipeline, for when we do new Ceph builds, to have some sort of baseline that we track. I'm hoping to release some of that to the community this year, for what it's worth.

A: That would be great.

G: Yeah, it speaks native RADOS, rather than going through some other sort of abstraction, so I think that'll probably be valuable; we'll see if the community likes it.
A: I've got tests that I wrote that live in the ceph_test_objectstore world that are really useful for looking at, like, omap and kind of the behavior that we see there. But we probably want that to exist as something that you can test against an existing cluster, not just as, you know, a standalone test.
A: Cool. All right, okay... oh hey, Gabby, you're here!
A: Gabby, I see that you're out in the real world somewhere, you know, probably enjoying fresh air.
A: Yeah, no worries. Since you were on PTO and Adam was on PTO, I figured we would maybe wait to discuss your proposal. If you want to, it's fine; whatever you prefer to do.
A: Maybe in the meantime, while Gabby's figuring out his mic issues, I'll give just a brief update on Crimson and CyanStore. So Josh and I talked earlier this week, Josh Durgin and I, about trying to showcase kind of the upper half of the Crimson stack a little bit better than what we've been able to do in the past. So right now, CyanStore doesn't really showcase it super well; it's not bad, but it's not really, you know, as fast as maybe it could be. So earlier this week I went back and re-ran some tests, specifically on CyanStore and Crimson, just to see kind of where we're at right now, and the results were kind of interesting. Let me quick grab some of the numbers that I gathered; I'll throw these in the chat window here.
A: It didn't copy properly... there we go. So this is just 64K random reads and 4K random reads, and 64K random writes and 4K random writes, and the best results out of this were definitely the 4K random results. That's what we've seen in the past; we're getting about 66,000 IOPS out of that, which is higher than we've done, I think, on classic OSDs ever. And, you know, of course this is in memory, so it's not completely reasonable, but it's not bad.
A: What was interesting is that in both the 64K random read case and in the 4K random read case, we saw send message as the primary consumer of wall-clock time. In the 64K random read case, it was like 99% where we're stuck in send message; I don't know why. Those numbers will probably improve dramatically when we figure out whatever it is that we're doing that's making send message consume huge amounts of time. In the 4K random read case, it's like 25%, so there's still a really big advantage.
A: If we can figure out why that part of the stack is taking so much time, we're probably gonna see some pretty big advantages there. On the write path side, what I'm seeing is that we are spending a significant amount of time in bufferlist substr_of. That code is basically splitting up bufferlists; there's a couple of while loops in there where we're just kind of iterating and creating new buffer pointers, if I remember right. So here, I'll link the line of code in CyanStore.
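As a rough illustration of the pattern being described, here is a simplified model of a substr_of-style walk. This is not the actual ceph::bufferlist code, just a sketch of why it costs: it loops over every overlapping segment and constructs a new reference-counted sub-pointer for each.

```cpp
// Simplified stand-ins for ceph::buffer::ptr / ceph::bufferlist, showing
// the two-while-loop shape discussed above; not the real implementation.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

struct Ptr {
  std::shared_ptr<char[]> raw;  // refcount is bumped on every copy
  size_t off = 0, len = 0;
};

struct BufferList {
  std::list<Ptr> segs;

  // Make *this refer to bytes [off, off+len) of `other` without copying
  // the data, only the segment descriptors.
  void substr_of(const BufferList& other, size_t off, size_t len) {
    segs.clear();
    auto it = other.segs.begin();
    // Loop 1: skip whole segments until `off` falls inside one.
    while (it != other.segs.end() && off >= it->len) {
      off -= it->len;
      ++it;
    }
    // Loop 2: emit one new sub-pointer per overlapping segment. Each
    // iteration allocates a list node and touches the shared refcount,
    // which is where a wall-clock profile would show the time going.
    while (it != other.segs.end() && len > 0) {
      size_t take = std::min(len, it->len - off);
      segs.push_back(Ptr{it->raw, it->off + off, take});
      len -= take;
      off = 0;
      ++it;
    }
    assert(len == 0 && "substr range ran past the end of the source");
  }
};
```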
A: This is basically where we're doing that. I'm not totally sure yet what the right approach is to improve this; it was about 14% of the overall wall-clock time being spent in that portion of the code. Otherwise, we're kind of spread out over a bunch of different stuff: peering code, just memory allocation code, freeing xattrs and omap, interestingly enough, this being RBD; hobject_t comparisons; just a whole slew of random different things. So the big one was the substr_of in bufferlist, figuring out why there's so much overhead there. But the good news is that it looks like there are some interesting things to explore, both on the read path and on the write path. I think we can maybe do better; I'm not sure how much better, but I think we can make it more efficient.
A: So I've been talking to Adam a little bit about this problem, and Radek as well, and then also about whether or not we can figure out how to shorten the path in the upper half of the stack, meaning from the network buffer down into the object store. That's kind of the direction I want to go as I do this investigation: to see if there are ways that we can improve and shorten the path there.
A: I don't know if it will work or not, but that's kind of the approach I'm taking right now. Anyway, that's about it for that. Any questions on that, or thoughts or comments?
A: All right, well then, if not... Gabby, I guess, is not gonna have luck with his mic right now, so maybe we'll wait till next week to talk about his proposal. That's all I have. Would anyone else like to bring up any topics, or have anything that they would like to talk about with the group?
I: I have a topic related to an issue that we saw on a Pacific cluster in production that seems to be related to RocksDB performance and bucket listings for RGW, and it sounded like this was the right platform for talking about that.
A: Go ahead, yeah.

I: So let me just give you a little context about the scenario that we encountered. We were on a Pacific 16.2.7 cluster in production. We had a customer that was using a Veeam client and doing backups, writing about 50 megabytes per second constantly for a week or two, and at the time we saw issues, they had about 20 million objects in the bucket. And then their bucket listings started timing out, after we fixed an issue with bi list... it was a...
I: It was a backport that we patched in; it was kind of a known issue. So the Veeam client was doing, like, one bucket listing per second on this bucket, and when we looked into it...
I: So on the relevant OSDs, we saw they were using one core at 100% CPU. This bucket index pool was deployed on NVMe, and those weren't, like, touched at all; they were barely doing anything. And we were seeing a lot of extra space consumption on the bucket index pool as well when we were doing a df.
A: What we've seen in the past is that this is almost always tombstones: there are all these tombstone entries for deletes that you end up iterating over, and it makes the iteration extremely slow. And then, when you compact, it gets rid of all that junk that's in there; it actually, you know, reduces the working set that you're iterating over, and then things are fast again.
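A minimal, self-contained way to reproduce the effect being described (the path and key counts are made up for illustration): write a pile of keys, delete most of them, time a full scan, then CompactRange() and scan again. The first scan has to step over every tombstone; after compaction they are gone.

```cpp
#include <rocksdb/db.h>
#include <chrono>
#include <iostream>
#include <memory>
#include <string>

int main() {
  rocksdb::Options opts;
  opts.create_if_missing = true;
  rocksdb::DB* raw = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(opts, "/tmp/tombstone_demo", &raw);
  if (!s.ok()) { std::cerr << s.ToString() << "\n"; return 1; }
  std::unique_ptr<rocksdb::DB> db(raw);

  for (int i = 0; i < 1000000; ++i)
    db->Put(rocksdb::WriteOptions(), "key" + std::to_string(i), "v");
  for (int i = 0; i < 999000; ++i)   // delete almost everything
    db->Delete(rocksdb::WriteOptions(), "key" + std::to_string(i));

  auto scan = [&] {
    auto t0 = std::chrono::steady_clock::now();
    std::unique_ptr<rocksdb::Iterator> it(
        db->NewIterator(rocksdb::ReadOptions()));
    long n = 0;
    for (it->SeekToFirst(); it->Valid(); it->Next()) ++n;
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - t0).count();
    std::cout << n << " live keys scanned in " << ms << " ms\n";
  };

  scan();  // slow: the iterator steps over ~999k tombstones
  db->CompactRange(rocksdb::CompactRangeOptions(), nullptr, nullptr);
  scan();  // fast: compaction dropped the tombstones
  return 0;
}
```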
A: It was a really, really big problem when we were trying to do, like, the... shoot... like bulk deletes, delete range. When we were trying to implement that, this came up as a huge, huge issue.
A: I think we've talked about it in this meeting before, and Igor had mentioned the possibility of, like, constant background compaction. Right now it's triggered when you end up with, like, a ton of writes coming in, right: the memtables grow big enough, and that triggers flushing and compaction. But what we really need is the ability, when there's, like, a delete workload coming in and you have all these tombstones...
I: Yeah, well, it is interesting. We see in the logs that compactions are happening, but apparently they're not doing the compaction to the extent that it does when we shut down the OSD and do it manually, and I don't know why that is. But we did see, like, in the RocksDB documentation, that having iterators open causes compaction to not be able to delete a lot of the files, because the iterator is holding on to stuff, essentially. So the idea was, maybe, like, we need mutual exclusion when the compaction happens, to make sure, at least at some cadence...
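One detail behind that: a long-lived RocksDB iterator holds an implicit snapshot, and compaction cannot drop tombstones that a live snapshot might still need. A hedged sketch of one way a long listing loop could let compaction make progress, using Iterator::Refresh() to release the old snapshot between passes (the pass size and resume logic here are made up):

```cpp
#include <rocksdb/db.h>
#include <memory>
#include <string>

// Scan the whole DB in passes of ~1000 keys, refreshing the iterator
// between passes so its snapshot does not pin tombstones indefinitely.
void scan_in_passes(rocksdb::DB* db) {
  std::unique_ptr<rocksdb::Iterator> it(
      db->NewIterator(rocksdb::ReadOptions()));
  std::string resume;
  for (;;) {
    if (resume.empty()) it->SeekToFirst();
    else it->Seek(resume);      // re-reads the resume key once; fine here
    int n = 0;
    while (it->Valid() && n < 1000) {
      resume = it->key().ToString();
      // ... process the entry ...
      it->Next();
      ++n;
    }
    if (!it->Valid()) break;    // reached the end of the keyspace
    // Drop the old snapshot so background compaction can reclaim
    // tombstones, instead of holding one iterator open the whole time.
    rocksdb::Status s = it->Refresh();
    if (!s.ok()) break;
  }
}
```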
A: I think it'll be really important to make sure that the compaction that you're seeing is actually on the same range that you're reading, right? Because you might be seeing compactions, but is it actually compacting the things that you think you're compacting, or are they compacting, like, a different portion of the database?
I: Yeah, I don't know for sure the answer to that, so we'll have to look in more detail and verify that. That's a good point.
A: So yeah, I guess my take on this is: first, we should just make sure that it's actually compacting the ranges that you're holding iterators open for; that'd be the first thing to figure out. But the next thing to maybe figure out would be... sorry, Igor, do you remember, we were talking about this before? I think we were talking about trying to introduce some kind of background compaction after, like, a certain amount... go ahead.
E: So, as far as I understand, offline compaction helped for a while, and it looks like some background compaction is happening online, but it doesn't help. Well, you might want to try one more background compaction, to see if it helps again. And then the question is what prevents online compaction from...
E: And I didn't perform a root-cause investigation, but it looked like active iterators might impact that badly.
A: Some of our worst-case scenarios are where we, like, have an iterator open for a while, and then we end up, like, going back and re-iterating over the same range, and then just, like, going a little farther than we did previously, and we do this over and over and over again. But you can imagine that in between those passes, especially if we're doing work in between, like we're deleting at the end and then we re-iterate, it seems like there should be the ability in between those to be able to do a compaction.
E: But the question is who is responsible for triggering compaction properly, at exactly the moment when you are between the iterators, yeah? If the issue is with open iterators, then it might be inefficient as well.
I: So that's, like, 30 seconds where you have iterators open, and then the client is making calls once every second in our case, so we're never getting ahead, because they're all taking forever. Now you're almost certain to always have iterators open. So if you're doing compactions in the background randomly, there's a really, really good chance that when you try to do it, there's going to be an iterator open, and the problem keeps getting worse and worse, because the queue keeps getting more full and the iterations keep taking longer.
A: Igor, I think Cory is right, right? Like, if you aren't compacting after a lot of deletes... You start out by doing all this iteration and deleting: iterate and then delete, iterate and then delete. But then you're never compacting after the deletes, because there's no write workload, and then it's making the iteration take longer, and, like, it exactly would be a feedback loop, right?
E: And, well, maybe one more question, about data removal. So what is causing removal in your cluster? Are you moving PGs, maybe removing some pools, or is that a regular use case from RGW?
E
So
previously,
the
major
issue
with
bulk
removals
was
due
to
pull
removal
of
pg
eg
moving,
but
honestly,
I've
never
seen
a
user
remove
user
data
removal
to
cause
something
like
that.
I'm
curious!
What
what
kind
of
removals
do
you
have.
I: Yeah, and I don't know the specific answer of what they look like at the lowest level, but it is a Veeam client doing backups, and, like I said, they're doing, like, 50 megabytes per second, so they're doing a lot of writes constantly. And this is a versioned bucket, so I don't know if it's part of the versioning stuff that's kept in omap that is constantly...
I: Supposedly, yes. I mean, supposedly our architects and sales team have talked to them about that, and they think they need that. So...
B: Well, it's really very expensive, and, yeah, it doesn't allow our implementation to scale as effectively as we would hope. If they generate an update in there, at least this could eventually spread out, but the versions of the same object end up on the same bucket index shard, and that's probably not changing anytime soon. That's a concern.
B
But
it
then,
if
they
don't
and
if
they
don't
prove
prune
old
versions,
then,
which
was
that
which
was
then
becomes
their
response,
becomes
someone's
responsibility.
If,
if
this,
but
this
can
be
managed
by
a
policy
and
a
vm-
doesn't
already
do
it,
it
can
be
folded
in
somehow
if
it
could
tolerate
deletion
of
older
of
old
versions
using
the
lifecycle
method
do
so
on
some
more
appropriate
schedule.
That
would
that
could
that
they
could
burn
down
a
lot
of
that.
A: Matt, can you think of a scenario using versioned buckets here where we would see a lot of deletes coming in at the same time, but not a lot of writes?
A: Yeah, yeah. But right now, I think the reality of all this is that we're so reliant on writes to trigger compaction, and we don't do it on... I'm sorry, on deletes, rather, that when we see tons of deletes coming in at once, it can just basically completely destroy iteration performance.
I: Well, we actually have some meetings with the Veeam guys tomorrow; our company is a partner with Veeam and we have some close relationships. So we're actually going to talk about that with them tomorrow, try to understand their client behavior better and whether it can be improved from their side, because we're also unsure why they're doing so many bucket listings.
B: The first kind of implementation of... I forgot what it was actually called... server inventory, which materializes listings, via lifecycle, into objects that are in CSV or Parquet format for us, or it could be ORC up in AWS, yeah.

I: We also have certification by Veeam, but I don't...
B
I've
never
talked
to
veeam
developers
that
I
knew
of
it
would
if
they,
if
they
would
be,
if
they're
prepared
to
use
server
inventory
already,
then
I
probably
have
the
solution
to
this,
because
one
of
the
solutions
is
offline.
A
So
I
I
just
took
a
look
and
there's
in
rocks
db,
there's
what's
called
ttl
compaction,
which
is
basically
just
you
know,
after
a
certain
amount
of
time
and
there's
not
been
a
compaction,
do
it
and
then
it
looks
like
in
the
last
two
years,
there's
maybe
a
way
in
rocks
db
to
compact
on
deletion.
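For concreteness, both knobs exist as column family options in recent RocksDB; a sketch with illustrative (not recommended) values:

```cpp
#include <rocksdb/options.h>

rocksdb::Options make_time_compacting_options() {
  rocksdb::Options opts;
  opts.create_if_missing = true;
  // TTL: SST files whose data is older than this become candidates for
  // compaction even if the usual write-driven triggers never fire.
  opts.ttl = 30 * 60;                          // seconds
  // Periodic compaction: files older than this are re-run through
  // compaction (and any compaction filter) on a schedule.
  opts.periodic_compaction_seconds = 60 * 60;  // seconds
  return opts;
}
```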
A: Yeah, I was just trying to see if I could find the way to set that; I don't remember.
A: Oh, Cory, I see that you pasted another... yeah, performance degradation with deletes. That's exactly what we're talking about.
A: Yeah, it's the exact same thing, I think. So yeah, the terms "TTL compaction" and "periodic compaction" are probably worth looking for. Oh yep, and they reference, right at the end of that... February 20th, 2021. So it's, like, a month old. I didn't find many references surrounding NewCompactOnDeletionCollectorFactory; that's the new thing that they're talking about for compacting on deletion.
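That factory lives in RocksDB's utilities; a sketch of wiring it up (the window and trigger values here are made up): each SST file tracks at write time whether any sliding window of N consecutive entries contained at least D deletes, and files that do get marked for compaction on their own, which targets exactly the delete-heavy, write-free case discussed above.

```cpp
#include <rocksdb/options.h>
#include <rocksdb/utilities/table_properties_collectors.h>

rocksdb::Options make_delete_aware_options() {
  rocksdb::Options opts;
  opts.create_if_missing = true;
  // Mark a file as needing compaction if any window of 128 consecutive
  // entries contains 32 or more deletion entries.
  opts.table_properties_collector_factories.emplace_back(
      rocksdb::NewCompactOnDeletionCollectorFactory(
          /*sliding_window_size=*/128,
          /*deletion_trigger=*/32));
  return opts;
}
```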
A: So, Cory, I guess maybe the next step would be: let's see if we can figure out if there's some way to tell RocksDB to do some periodic compactions, or use this fancy compact-on-deletion thing. That might be maybe the next step, to just see if that can improve the situation for you. And if we can't make that work, then we might need to adapt it and get something into the code that does this ourselves.
A: Upgrading the RocksDB version is a little frightening. We've done it in the past; granted, this was an intermediate version between the releases we had upgraded to, and it actually introduced a regression that caused data corruption, and we ended up having to put in our own patch that, like, reverted the database version, basically, because it wasn't backwards compatible. That doesn't typically happen with the released versions; that was kind of a no-no on our part, that we upgraded to an intermediate release.
A
So
so
you
know
lesson
learned
there,
but
we've
always
just
since
then
been
a
little
bit
gun
shy
about.
You
know
pulling
in
a
new
rock
cb
version
into
like
a
back
port.
I
I
think
we've
done
it
before,
but
we
really
like
testing
master
before
we.
We
do
that,
just
to
make
sure
there's
nothing
crazy
coming
in.
So
if
we
did
that
we'd,
probably
first
backport
to
like
you
know,
whatever
version
is
being
used
in
quincy
and
backward
that
the
pacific
rather
than
the
newest.
A: You know, it especially tends to crop up if you're doing a lot of iteration, for whatever reason, and it gets mitigated if you're doing writes at the same time, because then you're triggering compaction regularly. So it's really the use case where, if you've got lots of deletes and lots of iteration and no writes, that's when we tend to see this kind of thing.
A
Cool
yeah,
hopefully
we
can,
we
can
get
it
worked
out
for
you
quickly
and
you
know
just
get
this
issue
generally
taken
care
of
it's
big
camp
authority
on
our
side.
So
all
right
anything
else
from
anybody
before
we
wrap
up.
A
All
right
well,
then,
thank
you,
everybody
for
coming
and
have
a
great
week
and
next
week
we'll
we'll
talk
about
gabby
and
adam's
ideas
for
pg.
That's
it
have
a
great
week.
Everyone
bye
thanks
mike.