From YouTube: 2017-JUN-08 :: Ceph Performance Weekly
Description
Weekly collaboration call of all community members working on Ceph performance.
http://ceph.com/performance
For full notes and video recording archive visit:
http://pad.ceph.com/p/performance_weekly
All right — we had CDM yesterday; talked about a bunch of usability and ops stuff, and the [inaudible] cache came up, so [inaudible] — like, there's a pull request, not much new. Let's see, there's something about the scrub job priority; I haven't looked at that one yet. Looks like Greg read it. And Mark is [inaudible] — tweaking the throttle cost for hard disks, yeah.
For the other ones, let's see — oh yeah, those, yeah — the one that I think is proposed, [inaudible]. I'll check if that merged — the thing that [inaudible] did, that he was going to clean up on the side.
So there's that. There's a new command for the monitor that will just show you the most recent entries in the cluster log, and there's a related map of recent entries that's used to prevent duplicates when daemons resubmit their stuff.
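The exact mechanism isn't spelled out on the call; as a rough sketch of the dedup idea (class and method names here are hypothetical, not Ceph's actual types), a monitor could remember the highest sequence number it has accepted per daemon and drop anything resubmitted at or below that watermark:

```python
class ClusterLogDedup:
    """Toy model: drop duplicate cluster-log entries when daemons resubmit."""

    def __init__(self):
        self.last_seen = {}   # daemon id -> highest seq already accepted
        self.entries = []     # the cluster log itself

    def submit(self, daemon, seq, message):
        # A resubmission replays old (daemon, seq) pairs; anything at or
        # below the recorded watermark has already been logged.
        if seq <= self.last_seen.get(daemon, -1):
            return False
        self.last_seen[daemon] = seq
        self.entries.append((daemon, seq, message))
        return True

    def last(self, n):
        # Analogous to a "show me the most recent entries" command.
        return self.entries[-n:]
```

Purely illustrative of the dedup bookkeeping, not of how the monitor actually stores its log.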
Let's see — the one about avoiding overlapping [inaudible] on reshard — that merged, reviewed.
Did you read that? The dynamic resharding merged, yeah — that's, like, one of the biggest, most important new things to come to RGW in a long time. Basically, you don't have to worry about bucket index shards anymore: if a bucket gets big, it's going to split them, and if they get small again, it's going to merge them. So, one less thing for users of RGW to think about.
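For reference, dynamic resharding is driven by a couple of RGW options; the names below match my recollection of the Luminous-era settings but should be checked against the RGW docs for your release:

```ini
# Hypothetical ceph.conf fragment -- verify option names against the docs.
[global]
rgw dynamic resharding = true       ; let RGW split bucket indexes on its own
rgw max objs per shard = 100000     ; target object count per index shard
```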
Let's see — I think the one about submitting transactions in batches... I can actually close this one, since he had a different approach. [Inaudible] — yes, closed, yeah, a different approach.
Looking at the status — the statfs thing merged this morning; I just forgot to merge it after the rebase happened. In my memory that was merged, yep. And the unshare-blobs one was merged, but there's still a bug in it. So that's pretty good — I don't think there's anything major that's outstanding on the performance front.
Okay, all right, sounds good. Any other ones?
There's the op commit/applied thing — I think we just need to figure out what path to take there. I don't think it's super critical; it's not going to go into Luminous, I don't think. But yeah, Josh, I think I just asked for your opinion on the direction there, and I can't remember what it was. [Inaudible.] Yeah, I think we can wait on that one.
Maybe we should get Mark some NVDIMMs so you can play with it. Sure — okay, no complaints there. Yeah, I think all the other stuff is not super important; [inaudible], I guess, but yep — that's it from the core front.
So, if we don't have cache misses for the onodes, we can do pretty good — [inaudible] those can do about 38,000 write IOPS, maybe even better, depending on where FileStore's numbers are, which is convenient. But once we hit cache misses for the onodes and extents and everything else that we've got, it slows down pretty dramatically — and this is on the order of, like, maybe 10,000 write IOPS instead of 38,000.
Good stuff. Right — the default cache size that we set for BlueStore is one gigabyte, and I sort of just picked that because it feels like it should be within the bounds of what people have deployed on hardware. But technically our recommendations — which should be taken with a grain of salt — were one gigabyte of RAM per terabyte of disk, which means if somebody has, like, a six-terabyte disk or a four-terabyte disk, it should be four or six gigabytes per OSD, or more, anyway.
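The one-gigabyte-per-terabyte rule of thumb above turns into trivial arithmetic; a throwaway sketch, where the only input taken from the call is the recommendation itself:

```python
def bluestore_cache_budget(disk_tb, gb_per_tb=1.0):
    """Rule-of-thumb cache budget for one OSD, in gigabytes.

    disk_tb:   size of the OSD's data disk in terabytes
    gb_per_tb: the recommended ratio (1 GB of cache per TB of disk)
    """
    return disk_tb * gb_per_tb

# A 4 TB disk -> 4 GB of cache; a 6 TB disk -> 6 GB.
```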
It just seems like there are two ways to approach it. One is to figure out what sort of a reasonable amount of RAM is that we can expect users with current and future hardware to devote to OSDs, and we base it on that. The other is to try to figure out what the curve looks like, figure out where the knee is on the curve, and then...
Yeah, we can — I mean, it would be worthwhile looking at how big the encoded onode sizes are on average versus the unencoded ones, and then also the compressed encoded ones. If those are [much smaller], a compressed cache — whether or not we gain anything over just [inaudible]... You know, everything is going to be a trade-off, but maybe one or both of those are worth it to save on space.
It might be that that's better than the onode cache — sort of a store-side [cache]. You know, maybe keeping most of this encoded is better, or maybe even keeping most of it encoded and compressed is better — which is the way RocksDB works: you get two caches, one is going to be uncompressed and one is the compressed cache.
You know, maybe what we should actually do is — right now you have, like, three cache settings. One of them is the bluestore cache size, which is the memory that BlueStore is managing; one of them is the rocksdb cache size, which is a separate option; and then there's another one that is, I think, embedded in the rocksdb option string, which is super awkward. So maybe what we actually want is to make BlueStore control it.
Do we have any kind of — do we keep track of how big the encoded form of the metadata is relative to the in-memory size of it? We don't track that anymore? Okay.
So, do some runs where we fix the bluestore cache size and then vary the rocksdb cache size, and figure out [which is better] — or, even better, and hopefully what we really want: maybe I can do this change first, where we configure it in terms of a ratio. With a specific memory budget, is that memory better spent in the rocksdb cache or [the bluestore cache]? Just do some A/B tests and see which does better. Yeah, we can do that.
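The ratio idea floated here — one memory budget, split between the rocksdb block cache and the rest of the BlueStore cache — can be sketched like this (purely illustrative; the function and parameter names are not real Ceph options):

```python
def split_cache_budget(total_bytes, kv_ratio):
    """Split one memory budget between the kv (rocksdb) cache and the
    rest of the BlueStore cache, given the fraction handed to rocksdb."""
    assert 0.0 <= kv_ratio <= 1.0
    kv = int(total_bytes * kv_ratio)
    return kv, total_bytes - kv

# The A/B test from the call: hold the total fixed, sweep the ratio,
# and benchmark which split performs better.
for ratio in (0.25, 0.5, 0.75):
    kv_bytes, meta_bytes = split_cache_budget(1 << 30, ratio)
```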
These [numbers] — for us, we would like to use more memory to get more performance. On our servers we pretty much have 100 to 180 gigabytes of memory for each server, so we can consume up to 10 gigabytes for each OSD, because we have 12 OSDs for each server — and there we could consume even more than that per OSD, but I'd rather we get good performance out of it. Yep.
The only other knob — well, the only other main thing that the OSD — okay: there are only a handful of pools of memory that the OSD uses that are tracked. One is the PG log — that's the bulk of it — and we do keep them in a mempool, so we could base that on memory usage. The other one is the throttle on the amount of memory we allow to be read off the wire before we process it. Yeah, but that's kind of a fixed thing, and it's just... it's not [a big deal].
I think you can decrease it to reduce memory, but never increase it. It just means that if you have OSDs down, the amount of time they can be down is shorter before they fall back [to backfill]. But it's a rare event, and it's difficult to balance the cost of memory overhead versus time down before you have to do a more expensive recovery operation — it's super difficult to make a judgment anyway. So I'm not too concerned about it.
You know, it feels to me almost like that's one of those things where, yeah, if you have little memory you want it to be [small], but if you have lots of memory maybe you very slowly crank it up. But it's not even a decision the user needs to make, though, right? I mean, if they have lots of memory, you can do that — [inaudible] is what we did: have them set it to something. Yeah.
[Inaudible] — let's see, is Josh still on the call? Skipping ahead to that: replacing the bluestore cache size with, like, an OSD memory [target] — or basing it on an OSD memory [target].
Could be doable — I think, long term, it would be great to have a single tunable for OSD memory in general, but like [you] said, there's a lot that needs to be in place before that works well. Yeah, okay. [Inaudible.]
That's a good question. I think it's technically possible, although it's a little bit awkward, because there are all kinds of possibilities — like feedback loops and noise and whatever — it's not a very precise thing, but you could sort of try. What I'm worried about is whether that's a good idea. Has anybody come across any input or opinions on that? The kernel can sort of do it, because it owns the memory system, and it can reclaim any memory it needs to if it needs it for any other purpose.
So it just uses all memory for cache by default. But for userspace processes to do that — that's dangerous, because the kernel might not react quickly enough, and it's going to start to swap them out instead of just doing a reclaim. That's sort of the problem, which reminds me — I seem to remember... I don't know: did anybody have any thoughts on that?
It would be nice to just have them magically use as much memory as they need and trim it as necessary, but there are a lot of things that can go wrong. Like, even if [the daemons] realize there's memory pressure and they try to free a bunch of memory, if their heap is fragmented then they won't actually be able to release that memory back to the kernel. So...
You know, maybe — I actually don't agree with that. It seems like, a lot of times in hyper-converged scenarios, people have no idea how much memory applications are going to use, like at peak, right? And so this is kind of a problem for us now, especially with having static memory settings rather than just using page cache.
E
But
wouldn't
we
put
this
idea
out
there
that
stuff
posties
use
more
memory
when
you're
doing
recovery?
It's
not
because
it's
hard
it's
because
we
deliberately
make
the
PG
log
longer
so
that
we
can
tolerate
a
longer
post
ease
being
down
for
longer.
So
it
was
written
that
way
as
a
feature,
but
it's
sort
of
mostly
I
think
people
interpret
it
as
a
bug
and
something
that
makes
it
difficult
to
do.
Capacity
planning
I,
wonder
if
we
should
just
stop
doing
that.
I'll just say it seems like [users are] really just interested in, sort of, the [size] and the IOPS the node performs — and how long do you want it to last, how many seconds? I mean, the default is, what is it, 3,000 [entries] per PG, and I think it can use ten times that — 30,000 — which for a hard drive is, what, five minutes? So, yeah.
Well, okay, so here's a question — well, a bunch of questions. One is: we could change the PG log trimming so that, instead of having a fixed length of log per PG, we have a minimum that we keep, just for things like dup-op detection, but then we instead do the trimming based on whatever the oldest is across all PGs — so, a mostly idle pool next to a very active pool.
...from the PG log entries — in which case we could make the PG log almost empty in the case where we're not degraded, yeah, and reclaim a bunch of memory that way. But it does mean that the memory usage will balloon when you have degraded PGs. So that makes me think that we should configure the amount of memory that we devote to that in terms of megabytes — so that there's a setting that says the recovery overhead for logs is 400 megabytes, or whatever it is.
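A toy model of the trimming scheme discussed here — keep a small per-PG minimum for dup-op detection, but trim everything else against a single memory budget (all names and numbers are illustrative, not Ceph's actual logic):

```python
def trim_pg_logs(pg_logs, min_entries, budget_entries):
    """pg_logs: dict of pg id -> number of log entries.

    Trim the longest logs first until the total fits the budget,
    but never below the per-PG minimum kept for dup detection.
    Returns the (mutated) pg_logs dict."""
    total = sum(pg_logs.values())
    while total > budget_entries:
        pg = max(pg_logs, key=pg_logs.get)   # longest log right now
        if pg_logs[pg] <= min_entries:
            break  # every PG is at the floor; the budget has to balloon
        pg_logs[pg] -= 1
        total -= 1
    return pg_logs
```

Note how a mostly idle pool keeps only its minimum, while an active, degraded pool is allowed to balloon — which is exactly the behavior traded off in the discussion above.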
It stays at a fixed size, but there's a delta between the actual RSS and what the bluestore cache size was — I'm not sure that everything else in between there is [accounted for]... Well, okay — anyway, that gives me a couple of things. One is the bluestore/rocksdb cache ratio; one is borrowing memory from the PG log and configuring it in terms of RAM. Does that seem like a reasonable approach to you guys? Question, Greg?
[Inaudible], Sage — I came to the same [conclusion] at the same time. I'll try to see if I can look at how RocksDB's compressed cache and uncompressed cache and the onode cache all kind of battle each other or not — whether they work with each other — and we'll say, okay, we do this. All right. But buried in all this, I've got this little LMDB proposal — right, discussed at CDM yesterday — but we'll have to go over it.
If anyone is interested in trying to resurrect that old PR for LMDB — see if I can make it work and maybe give it another try. I never really got the impression that we did a real evaluation of why it was slow for writes, and I kind of need to see how bad it really is. It may be that it's totally worthless, but we'll see. The only other thing I had on here was the writeback throttle settings — this has come up again, I figured.
D
We
should
probably
ash
it
a
little
but
I
hate
this
discussion
because
I
feel
like
there's.
It's
a
just
a
giant
spectrum
of
it
where,
yes,
we
can,
we
can
make
it
we
can
make
it
have
more
things
that
are
shorter.
Oh,
we
can
fewer
sins
that
are
longer
no
advantages
and
disadvantages
to
both
and
we
go
back
and
forth
on
the
spectrum
mode.
Well,
maybe,
if
I
do
this,
or
maybe
starter,
did
that
and
it
always
just
kind
of
sucks-
that's
kind
of
the
conclusion
that
comes
in
Jim
yeah.
I figure it kind of doesn't matter so much where we end up here, because you're sort of on a spectrum between batch throughput versus, like, you know, tail latencies and burstiness — versus having more consistent performance. So it's not like there's a wrong answer. Yeah, but these numbers just feel huge for hard disks — like, the hard limit of 5,000 [IOs] in the journal seems like an order of magnitude [more than] you'd prefer for a disk like this.
Every time that we've revisited making them smaller, it seems like what happens is: we test it out and it seems to work, and then we back off, because we're like, okay, you know, it slows us down. Maybe — and maybe there's something unreasonable that we're doing that's the reason why it slows us down — but we're like, okay, we can't take the 10% performance hit, or whatever it is.
Yes — both, and no — several times over the last, like, six years. I mean, it comes up maybe every year or two: let's look at our writeback throttle settings again. And for whatever reason, setting them really, really high seems to be the thing that helps — not just kind of high, but kind of having no [throttle at all]. Oh yeah — I know, higher wasn't even effective, right? Yes.
Yeah, I think we can do that. So — well, I'd say ping [inaudible] and see what [inaudible] says, so we get some sense of the pain. But I think, even if we don't make a dramatic change there, we might as well just do the work and split them out, and then we can adjust them — easily adjust up and down — going forward.
I mean, we could make FileStore more conservative now, since we have — you know, there are some tunables that we could just... we could just make it slower and be like, yeah, well, it's just plain slow. So don't look behind the curtain — it's slow, and that's it. So true, that's true.
Adding on to that, from what I've seen recently: when onodes are in cache, BlueStore is pretty [good] — it can be faster now for random writes, because we did the kv sync thread splitting; that's where I saw a big benefit. But when you're out of cache — if you're significantly out of cache — then we're probably slower.
One quick thing — the question that I have is: I benchmarked before this at the very same [commit] level, and I have some interesting results; I'm going to send them out now. Just because I have it — we have completed the benchmarks. One of the things that is missing is that I want to change the std::list to a std::vector, just to see what [happens]. It's not that straightforward — it's not just changing the declaration in the code — so I was wondering if someone...
I think that was the proposal we talked about before, but it might just be that with a straight-up vector it's simple and easy — that would also work, too. The nice thing about the container types — his container types — was that they were drop-in replacements for the STL ones, so they would still support things like swap and splice and everything; they would fall back to [heap allocation] if they [outgrew] the inline [storage], but N was always, like, three, so it [didn't] matter.
E
Okay,
there's
still
a
full
request:
that's
dnm
from
with
his
original
prototype
for
those
types
and
it
meets,
needs
love
before
it
can
sort
of
be
resurrected,
but
I'm
still
hopeful.
I
still
think
that
that's
probably
a
good
path
forward,
because
once
we
have
those
types
in
place,
we
could
sprinkle
them
all
over
the
good
base
in
places
where
we
expect
the
containers
to
be
small
and
avoid
any
allocation,
because
everywhere
we're
doing
like
sets
of
those
T's.
E
E
That's it — anybody else?