From YouTube: 2017-FEB-15 :: Ceph Performance Weekly
Description
Weekly collaboration call of all community members working on Ceph performance.
http://ceph.com/performance
For full notes and video recording archive visit:
http://pad.ceph.com/p/performance_weekly
A
So, what's going on this week: Sage decided to dig in and rip part of the fast dispatch code apart. From talking with him, I don't think we were able to rip out quite as much as we wanted to in all cases, but he was able to get a lot of code out of the normal path, when you're not dealing with an interesting cluster.
A
Hopefully we'll see some gains there. Previously there had been an attempt to remove a lot of the locking in there, and unfortunately we weren't actually seeing any gains from it; in particular cases, actually, a regression. So hopefully this attempt will be better, but we'll see. Also, some really exciting stuff here: a couple of patches from Igor. He is working on reducing the memory usage for blobs in BlueStore when doing partial overwrites, and I asked if anyone had time to take a look at the PRs.
A
We also talked a little bit about Igor's patch, and Sage is looking into various things like the performance issues they found. I talked to him; he was dealing with random hardware issues on his test setup. I think he'll get those results, though: even though he was handling that earlier this week, he'll still look at it later this week. And then I've been doing quite a bit of RGW testing and running into some interesting things.
A
I do have some results; I think maybe I'll hold those for next week. Right now I'm running into issues with different clients when there's really high concurrency against one RGW instance: certain client processes are slowing down or stalling their rates while others aren't, even with a really high RGW thread count and also with higher throttles for the objecter inflight ops and bytes options.
A
Things were pretty reasonable with replication, but with erasure coding for the data pool we're seeing some scaling limitations. My feeling is that we end up more bound by latency than we are in the replication case, and then once we try to increase the intensity there are always other problems showing up. So anyway, those are the big items.
A
That's all I have. It looks like a couple of people have added things in here for this week that they want to talk about, so maybe start with the first item here: write performance with Kraken on RBDs.
C
So, let me just share my screen.
C
Right. So, just in case you're watching and wondering what this is about, I'll cover a very brief history. We were seeing, during the day, very, very high spikes in latency, and it was a bit confusing as to what was going on. So I looked into it, and it appeared it was the snapshots: when they're being trimmed after they are removed, you see very big latency spikes, and literally all I/O across the cluster just drops to zero. In some cases I've had OSDs die on me and even get corrupted.
C
So this is causing quite a bit of a problem, and we wanted to look a bit further into it. Now, in previous versions you were able to use the sleep after the snap trimming, but since Jewel that's moved into the main I/O thread, so it actually just blocks I/O as well. You can see that in the green line.
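The sleep being discussed is an OSD config option; a minimal sketch of setting it (option name as commonly used in Jewel-era releases, so treat the exact name and value as an assumption):

```ini
# Hedged sketch: throttle snapshot trimming by sleeping between trim ops.
# In older releases this ran in a dedicated thread; once trimming moved
# into the main op thread, the same sleep stalls client I/O as well.
[osd]
osd_snap_trim_sleep = 0.1   ; seconds to sleep between snap-trim transactions
```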
C
That line was from when I tried to set the sleep on the snap trimming, and although it stops things for a time, it actually makes the I/O problems even worse. So, to test the patch Sam created, I managed to recreate it essentially on my test box with 7.2k disks: I made a snapshot, generated about 50k dirty objects, and then during the removal I ran a 4k random read fio test.
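The reproduction described above can be approximated with an fio job along these lines. The pool and image names and the run length here are our own placeholders, not the exact setup from the call:

```ini
; Hypothetical fio job: 4k random reads against an RBD image while the
; cluster trims a removed snapshot with ~50k dirty objects.
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=testimg
direct=1

[randread-4k]
rw=randread
bs=4k
iodepth=16
runtime=600
time_based=1
```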
C
So Sam made this branch; it gives you a configurable trim limit. He limited it to two PGs trimming per OSD, and you also now get a status in the ceph status display, which is really handy, so you can actually see how far along in the trimming process you are. So it's better, but very, very spiky. Just look at the fact that there are still several seconds where I/O is pretty much zero.
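The per-OSD limit in Sam's branch corresponds to an OSD option; a sketch of tuning it (the exact option name in that branch is our assumption):

```ini
# Hedged sketch: cap how many PGs may trim snapshots concurrently per OSD.
[osd]
osd_max_trimming_pgs = 2
```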
C
So then I tried to play with the writeback throttle, because it looked like, looking at iostat, that when these periods of zero I/O happen there's a big burst of writes across all the disks, because it's flushing out the buffers in the kernel. I'm guessing what's happening is that these operations, even though they've been throttled to one or two at a time, are getting immediately acked by the journal, then building up, and then when they get flushed down to the disk...
C
...it saturates the disk with writes and it can't perform any reads. So I tried dropping the writeback throttle from 500 and 5,000 down to 250 and 500, and that had quite a dramatic effect: it removed those drops to zero, and the remaining latency spikes are a little bit smoother.
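The values mentioned (dropping from 500/5,000 to 250/500) map onto the filestore writeback-throttle options; a sketch, assuming an XFS-backed filestore:

```ini
# Hedged sketch: tighten the filestore writeback throttle so journaled
# writes are flushed to the data disk sooner, in smaller bursts.
[osd]
filestore_wbthrottle_xfs_ios_start_flusher = 250   ; was 500
filestore_wbthrottle_xfs_ios_hard_limit    = 500   ; was 5000
```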
Then I played with the snap trim cost parameters, but I think this problem is happening below the queuing, so it all comes back to that same level again, the writeback throttle. And then, really, I'm out of ideas.
D
My supposition is that, while we're only allowing two PGs to trim, the trim is hitting the journal and then the page cache and then being considered complete, but the work hasn't actually been done yet, right? So it's building up. And that's why the sleep parameter worked in the past: it artificially rate-limited the number of requests to something that the disk could actually do. Okay.
D
We could, at the queue level, introduce something that does essentially the same thing the sleep parameter did, but without blocking the thread. That's pretty unsatisfying, though, because you'd be able to produce the same effect by sending the same queue depth of writes to the cluster normally. Yes.
D
...which is what the writeback throttle was meant to fix. Because it did, yeah, I think.
C
So yeah, I have been doing some other testing, with backfilling and other stuff, and I'm getting the feeling now that the writeback throttle, like you say, has been tuned for getting the maximum write performance. But actually, in most cases, read latency is probably more critical.
D
Oh, it is. That part was a deliberate trade-off, and that's why there's a knob. So if you wanted to play with that and report back at the performance meeting, which you can now, that would be good.
D
That's not the problem. The problem is that by the time the filestore considers the I/O done, almost no work has actually happened. That's fundamentally the problem, and so we can't even block it across the syncs. We have to make some random guess as to how long it will take the filesystem to actually get around to doing the work, and that's little better than guesswork. Okay.
D
If you're working on this, the correct base is master: the backoff throttle allows you to set a curve, if you look at the documentation.
D
Cool. Thanks for looking at this, by the way; I know there are other people who see this problem. The original sleep solution was really unsatisfying because, while it kind of worked for the original case, there was other stuff in that thread, too, that got blocked when those sleeps were happening. So it was never a complete solution.
A
All right. It looks like the Nokia folks have some test results they'd like to share. Are you guys here? Yes?
E
Yes. Sorry I couldn't attend last week, but you had asked us to write up some information about how we gather the test results. We are basically using our own I/O stack; this presentation's file name and link are on the pad. Our tool is used for gathering all the I/O stats, and then all the post-processing is done with Excel. Some time ago I had doubts about the latency and performance impact of the selected protection scheme, the erasure coding.
E
We compared erasure-coded 4+1 and 3+2 pools against 3x replication. The write performance is more or less similar, and on writes this is probably dominated by the write amplification. We also ran the same test on SSD disks, and performance was comparable. What we see from our real-life workloads is a latency penalty on writes, but we see a similar effect on reads. In this histogram I put in some numbers, just for comparing the latencies that we get.
E
And this second chart is bandwidth, comparing erasure-coded 4+1 and 3+1 pools against replication, using SSD disks. Clearly we get more or less double the bandwidth using a single SSD disk, judging from the raw data rates. But then, after several considerations, we...
E
...finally moved to the new release, Kraken, and we have a big showstopper here with memory: we're running out of memory. We passed this test previously, but with this new version we run out of memory. We have five nodes with this rig. Then we started looking into the changes between the previous version and this version, and we finally found a change in the configuration of the cache: in the previous version the cache was around 500 megabytes, and now it's five gigabytes.
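The cache being discussed is the BlueStore cache; a sketch of pinning it back to the old roughly 500 MB value. The option name here is our reading of the Kraken-era code, so verify it against your release:

```ini
# Hedged sketch: cap the BlueStore cache at ~512 MB.
# Per the later discussion, the value is divided across shards internally,
# so it is meant to be the total per OSD, not a per-shard figure.
[osd]
bluestore_cache_size = 536870912
```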
E
Multiply that out and you get about five gigabytes per OSD. That's my guess; I don't know why, but it's a fact. If you look at the results we're getting here after running fourteen hours of the test, you can see the growth: we were using six to seven gigabytes of RAM per disk.
E
Eventually the OOM killer kicked in and killed a process, so the whole cluster, and our test, stopped, because it decided to kill one OSD daemon. That's the problem we're showing in this presentation: probably the cache is computed in a different way now, and at least for us it is, on its own, using much more memory.
B
So I just checked the code, and it is dividing that number by the shard count, so it's supposed to be the total, not the per-shard value. If you're using that much, my guess is that it's a coincidence and there's some other reason why it's using more memory than it's supposed to. There's an admin socket command, ceph daemon osd.N dump_mempools, for whatever N, and that will tell you how much...
B
You have to set it in the conf before the OSD starts, but if you set that option to true, get it back into that state, and then do the dump, you'll get a detailed breakdown of where all that memory is being used, at least hopefully by most of the system, and that will hopefully help us figure out what's going on. My guess is that there's actually a subtle bug: there's a buffer, probably, that we're not rebuilding, and so it's effectively leaking a bunch of buffer memory. We'll see what it shows, but we should check.
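The admin-socket command and the debug switch mentioned above look roughly like this. The osd.0 target is a placeholder, and the mempool_debug option name is our reading of the discussion, so verify both against your release:

```
# Per-pool memory accounting for a running OSD:
ceph daemon osd.0 dump_mempools

# For the detailed breakdown, enable debug accounting in ceph.conf
# *before* the OSD starts, then dump again:
#   [osd]
#   mempool_debug = true
```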
E
Anyway, that test ran for fourteen hours. We then re-ran the same test, with the only change being lowering the cache size to something like 100 megabytes, to prevent the OOM kills. What's more, with the current parameters we were able to get results now that all the RocksDB and data is going to the same partition on the disks, and the result is better than what we got in the previous version.
E
My only concern now is that, by comparison, at least against the previous version of Ceph, we are seeing lots of reads on the data partition on BlueStore. I mean a lot of reads, like 2,000 reads, on the data partition, and I don't find any reason for that, and I wonder whether that is where the lower write performance comes from: read activity on the data partition.
B
Yeah, I guess that's my question, because usually, in the past at least, I think in general there should be a bunch of reads coming from the RocksDB database partition. Basically, after RocksDB does a compaction it invalidates all of its cached data and then has to fault it all in again. So every time it compacts an SST file it then has to read it back in, which is really irritating, but those reads should be coming from wherever RocksDB is storing its data, so usually the DB partition.
B
That's everything, I guess, on the list. I can give a quick update on the fast dispatch stuff. The short report is that the pull request that does all the preliminary work is basically ready to merge. The one that actually changes the fast dispatch code is the last piece, so once the other stuff is merged I need to do testing on that, but I think it's pretty close. I did go back and look at the last bit that I was worried about.
B
After all the review changes, that was the get-map-reservation and release-map calls, because those used to be called heavily during fast dispatch. Now they're only used in the legacy client handling, and I'm not so concerned about that, but it turns out they're also used every time you send a message: the OSD has to get a reference to a map and then release it. So, regardless of fast dispatch, that's just a hot path.
B
Yeah, because we have all these threads running that are sending messages to other OSDs, and we need to make sure they don't send messages to an OSD that we've just marked down, so we have to flush all of those threads out before we start marking things shut down.
B
So I think it's still worth optimizing that. I went back and looked at how it works, and I think it makes sense. It basically amounts to incrementing and decrementing an atomic in the fast path, which is pretty good. We could probably tune it a little bit better with RCU, but I would rather take that step first, because RCU sets up all sorts of weird threading requirements, like registering all threads and so on. I think.
B
They could, but it doesn't matter that much. One of the things the fast dispatch change does is alter the waiting-for-map handling so that it blocks only... oh, it did that before, actually, never mind. Basically it only slows down the one client: if it sends something and the OSD doesn't have the map, only that client's requests get blocked, and everybody else can still continue.
D
The map trimming part is a separate piece of this problem. What's different is that anything we do that causes a map change, even an unimportant one, causes an unnecessarily large hiccup in client I/O, because clients can get the map before the OSDs do.
B
Now would be the time to do it. It actually wouldn't be that hard to add a little bit of state in the Objecter to track the last primary sent to, or whatever. In fact, I had a patch that did that for the backoff stuff and then dropped it because I didn't need it. So now would be a good time to take a stab at it.
B
You know what the client could do? It could have a time bound, like five seconds, and keep in memory those few maps up to five seconds old, and if it's sending any requests and it has older maps, it could just repeat the calculation on those too. That way it can calculate a lower bound in just the cases where it is likely to matter.
B
Looking at this OSD node: the BlueStore cache is right at one gig, so that's actually as it should be, at least for the metadata, but it's got three gigs on the data side. I bet that there's a buffer-referencing bug; I bet it's one of these.
B
There's a whole class of bugs here: there's a bunch of code for reference-counting buffers in Ceph that can sometimes be too clever for its own good. What's probably happening is that some object in the BlueStore metadata is referencing a few bytes out of a larger memory allocation, and that larger allocation can't be freed because we're still using a small piece of it, so the usage is still effectively bounded by the size of the BlueStore cache...
B
...but it's like 10x that or something, which is why you see that reducing the BlueStore cache size brings it down. So we just have to figure out which buffer that is and go fix it. I'll see if I can reproduce this, it's on my list, and look at what's going on with BlueStore's memory utilization. My guess is that if you were using replication it wouldn't happen, because the I/O pattern is slightly different.
B
On the caching in librbd side, I'm not sure; that is a question for Jason. I think there are other things being prioritized for Luminous, like the multi-site mirroring agent and making that robust, but I don't know the timeline for that feature. Okay, yeah. It was originally something that the Intel engineers were going to work on, and they aren't, and Jason was going to do some of it, but he has other stuff. So my guess is that there just isn't a person on it.