From YouTube: 2016-SEP-21 :: Ceph Performance Weekly
Description
Weekly collaboration call of all community members working on Ceph performance.
For full notes and video recording archive visit:
http://pad.ceph.com/p/performance_weekly
A: Okay, Sage's basement is flooding, so he's trying to take care of that. So he's not going to make it today, and I sympathize, living in the Midwest; it's not fun. All right!
Let me find my window here where I've got the etherpad and we can go on. All right, so: world of pull requests this week. We've got a couple of new things that look interesting.
Paige has another pull request here for reducing the amount of data we write when we dirty metadata. I haven't looked at that real closely yet, but hopefully that will help us in our quest for writing less data to RocksDB. I have not followed David's patch much here, for reducing deep scrub impact on normal operations, but I imagine that there will be a lot of folks that will be very interested in that, so it might be worth one's time to check out.
One thing I will mention is that the FIO engine for ObjectStore got merged, so that's great. A lot of folks are starting to use that for doing BlueStore testing and other ObjectStore testing, so that's exciting. I have to rather embarrassingly admit that I have not looked at it myself yet, and I probably should, but it's good that it has been merged.
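
[Editor's note: for anyone wanting to try the newly merged engine, here is a rough sketch of a job file, modeled loosely on the sample shipped in the ceph tree under src/test/fio; the library path, conf file, and values are illustrative, not exact.]

    [global]
    # the engine is built as an external fio plugin; point fio at the
    # library by full path or via LD_LIBRARY_PATH
    ioengine=external:/usr/local/lib/libfio_ceph_objectstore.so
    conf=ceph-bluestore.conf      # ceph.conf selecting the ObjectStore backend
    directory=/mnt/fio-bluestore  # used for the object store's data
    rw=randwrite
    iodepth=16
    size=256m

    [bluestore]
    bs=4k
    nr_files=64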
A: And beyond that, I guess we've just got a whole bunch of stuff that doesn't have much movement yet. I think Sage is hoping to soon get the fast denc code, the new encode/decode stuff, merged. That seems like it's passing at least our kind of initial functional tests and performance tests here, so it needs to go through the more extensive suites in teuthology, but I think after that it'll probably get merged.
So that's very good, yeah. All right, so the other stuff that I've got for this week: we were seeing about a fifty percent regression in random read performance. At first I thought it was something that we had done ourselves, or maybe the tuning or something, but after going through and bisecting, it turns out it was when we merged making the async messenger the default. And so I've been running through some tests.
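
[Editor's note: the bisection mentioned above follows the standard git workflow; this is a generic sketch with placeholder revisions and a hypothetical test script, not the actual ones used.]

    # mark a regressed and a known-good build, then let git binary-search
    # the history; the script must exit non-zero when the random read
    # regression reproduces
    git bisect start
    git bisect bad HEAD
    git bisect good <last-known-good-commit>
    git bisect run ./check_randread_perf.sh   # hypothetical script
    git bisect reset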
A: With that, increasing the thread count in async messenger seems to help, but it's not enough to kind of get us back up to simple messenger performance. And also, if you increase the thread counts too high, it actually makes the OSDs and the ceph commands segfault. The GDB backtrace is a little bit strange; I'm trying to remember exactly where it was, but it wasn't something that was immediately, obviously wrong.
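
[Editor's note: the thread count being raised here is presumably the async messenger's worker pool; a minimal ceph.conf sketch, assuming the ms_type and ms_async_op_threads options of this era (the default pool was small, on the order of 3 threads); the value shown is illustrative.]

    [global]
    ms_type = async           # async messenger, now the default
    # more worker threads helped throughput, but very high values
    # triggered the segfaults described above, so raise with care
    ms_async_op_threads = 5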
A: So I'll probably need to just sit down with it at some point. But yeah, in general I'm trying to see if there's anything else that can be done here easily to kind of get around this; otherwise it'll probably take a lot deeper investigation. From one perspective at least, that's good, I guess.
In other news: with BlueStore, we are seeing a lot better performance than we had back in the Jewel timeframe. All the work that's been done on encode/decode, trying to reduce the amount of data that's getting shoved into RocksDB, has paid off for random writes. So that's good news.
Really, the big remaining thing beyond just the async messenger stuff is sequential read; we've known for a while that's been problematic, and I think that's something that we're going to need to focus on again. There are other things here that we can probably improve on, just kind of general performance tweaks in various places. I'm still kind of interested in how much we're seeing blocking slowdowns in the bitmap allocator, since that's, at least in my traces, been showing up more lately,
now that we've kind of resolved other things. But overall, I think on the write side we're actually doing really well now; we're typically faster than FileStore, or as fast at least. So I think sequential read is probably the next big thing that we need to figure out what we're going to do about. So that's kind of it, yeah. I guess that's all I've got this week.
B: A question about async messenger, if you don't mind. You said that it's still a little slower than simple messenger, but doesn't it use less resources than simple messenger, and are there situations where that would become a significant factor?
While you're doing that: the reason I'm asking is because my group did some testing of, like, hyperconverged Ceph/OpenStack storage, and the Ceph OSDs use up a ton of memory, you know, and they open a lot of sockets and so forth, and I was wondering if async messenger might lower the overhead.
A: Yeah, that's a good question. So, one of the things that I did notice when I was doing this is that adding more threads to async messenger seemed to help. So, take that for what you will; it might just be that we're not keeping them busy, or that we're waiting on them, I'm not totally sure. It doesn't help enough to get us up to the performance of simple messenger, but that is one effect I did notice. I've only kind of started looking.
So right now, do you see this throughput, megabytes per second, graph, with BlueStore versus FileStore?

B: Yep.

A: So if you look at the yellow and the green line: that's from master in late July versus this wip branch plus some other patches I had added. That yellow versus the green line there, really what that's showing you is actually async messenger versus not async messenger. Even though there's all this other stuff also going on, really, that's the performance difference there. And you can kind of see that at those middle IO sizes, from like 64 or 128 KB up to, well, really up to kind of like the large IO sizes, the green line is higher. That's async messenger.
You'd expect it to use fewer resources and fewer threads in a large system, because simple messenger creates a bunch of threads for every connection. But what simple messenger does get to do is it has one thread that just sits there and reads its socket and shoves data into the OSD, whereas the async messenger has to, like, switch through and do epoll and stuff to try and watch a whole bunch of sockets, and then pass them off to a worker thread.
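
[Editor's note: a toy Python sketch of the two threading models just described, not Ceph code: simple messenger dedicates a blocking reader thread to each connection, while async messenger multiplexes many sockets over a small pool of epoll-driven event loops and hands work off to worker threads.]

    import selectors
    import socket
    import threading

    def handle(data: bytes) -> None:
        pass  # placeholder for dispatching a decoded message

    # Simple-messenger style: one dedicated reader thread per connection,
    # each blocking in recv() on its own socket.
    def simple_style(listener: socket.socket) -> None:
        def reader(conn: socket.socket) -> None:
            while data := conn.recv(4096):  # blocks until data arrives
                handle(data)
        while True:
            conn, _ = listener.accept()
            threading.Thread(target=reader, args=(conn,), daemon=True).start()

    # Async-messenger style: one loop (of a small fixed pool) watches many
    # sockets with epoll (via selectors) and dispatches whatever is ready.
    def async_style(listener: socket.socket) -> None:
        sel = selectors.DefaultSelector()
        listener.setblocking(False)
        sel.register(listener, selectors.EVENT_READ)
        while True:
            for key, _ in sel.select():  # one syscall polls all sockets
                if key.fileobj is listener:
                    conn, _ = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                elif data := key.fileobj.recv(4096):
                    handle(data)  # real code passes this to a worker thread
                else:
                    sel.unregister(key.fileobj)
                    key.fileobj.close()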
G: Is this a scenario where we've got, like, pretty fast OSDs and not that many of them?

A: Right. So in this case, this is kind of testing how quickly a small number of OSDs can take in a fair amount. I mean, not a ton, but, you know, it's like a queue depth of basically 64 incoming random read IOs in flight at a time, I guess, per OSD roughly. Is that right?
H: [inaudible]

A: I'll see if I can do that here.
But so, anyway, Ben, to answer your question about performance: it's definitely not always slower, you know, at least in this particular case. At larger IO sizes it actually looks like it's probably faster, kind of, for these middle ones; and then, for whatever reason, at 64K and below, that's where it seems to do worse. But it's not necessarily... I guess the reason why it seems to do worse is because, for whatever reason, simple messenger seems to spike back up in performance.
It's interesting that we see this kind of difference in pattern here, and there were some differences like this when we changed TCMalloc settings too. If I remember right, when we went up to 128 megabytes of thread cache with TCMalloc, we actually saw a degradation in these middle IO sizes. I don't know if that tells us that with simple messenger the allocator is just having a lot of trouble, and that's why we saw that, and we're not seeing that with async messenger, which is overall just kind of a little slower. But it is kind of interesting behavior, so there's probably more of a story going on here than we know right now.
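
[Editor's note: the TCMalloc thread cache is tuned for Ceph through an environment variable; a sketch assuming the sysconfig file used by the packaging of this era, and reading the "128" above as a roughly 128 MB cache.]

    # e.g. /etc/sysconfig/ceph (exact path varies by distro)
    # 128 MB thread cache, up from TCMalloc's much smaller default
    TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728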
D: Yeah, just to briefly mention that I've been running cbt via teuthology, and we're going to start adding some more data points for cbt to collect regarding, like, the different intervals for recovery; for example, how long the different phases of recovery are, that sort of thing. So it'd kind of be interesting hearing folks give us thoughts about what other things would be interesting to add in that sort of framework.
A: Cool, yeah. One of the things that cbt doesn't do as well is, like, the thrashing tests; it kind of has this canned...
D: Yeah, I think there are probably some things we could add to it, things beyond just simply taking down an OSD: things like how various kinds of other background operations behave, like scrubbing, or increasing PG count, or increasing your replication count, that kind of thing. Yeah.
B: I'm pretty close to having that done, I think; I've been finishing that up, and I'm hoping to get something for you to look at today and see what you think. Basically, for people that aren't aware of what was going on there, I was doing some work to try to make sort of the monitoring more configurable, so you could pick and choose which monitoring tools you wanted to run.
A: So the reason I brought it up was because I was thinking that, with Josh's stuff that he's working on here, it might be useful if he could tie into your work. Basically, if we were going to start playing around with, like, [inaudible] or other things, then we could use that new framework you've got.
B: So, I mean, I think you can get the basic idea of it from looking at the... I think there's something in the cbt pull request area or something; there's a pull request visible
that's there, and it gives you the basic flavor of it. And so, if you want to add a new monitoring tool, you just write a small class that inherits from a cbt monitoring base class, and it's pretty simple to write, you know. And then you basically just have to edit your YAML to reference it and it will pull it in. Is that what you were talking about?
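
[Editor's note: a minimal sketch of the plugin pattern Ben describes; the base class, method names, and YAML key below are hypothetical stand-ins for whatever his actual cbt pull request defines.]

    class Monitoring(object):
        """Hypothetical cbt monitoring base class."""
        def __init__(self, directory):
            self.directory = directory  # where output files get written

        def start(self):
            raise NotImplementedError

        def stop(self):
            raise NotImplementedError

    class MyTool(Monitoring):
        """A new monitoring tool: inherit and fill in start/stop."""
        def start(self):
            pass  # launch the tool, logging under self.directory

        def stop(self):
            pass  # shut the tool down and flush its output

The YAML that references it might then look something like this (key name is a guess):

    monitoring:
      mytool: {}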
A: Yeah, yeah, I figured that. So, like, right now, at least when you run a recovery test in cbt, there's some hard-coded thing that basically just, I think, runs, like, 'ceph health' or something over and over again and dumps out the lines into a flat file somewhere. But presumably we'd move that and any other new things over into, like, your framework, where it's just a module that you load that, you know, grabs health information or whatever, your degraded object count.
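
[Editor's note: a sketch of what that hard-coded 'ceph health' polling could become as a module in such a framework, continuing the hypothetical base class above; the interval and file name are illustrative.]

    import subprocess
    import threading
    import time

    class CephHealth(Monitoring):  # hypothetical base class from the sketch above
        """Poll 'ceph health' periodically, appending each report to a flat file."""
        def start(self):
            self.stop_event = threading.Event()
            self.thread = threading.Thread(target=self._poll)
            self.thread.start()

        def _poll(self):
            with open('%s/ceph_health.log' % self.directory, 'a') as log:
                while not self.stop_event.is_set():
                    out = subprocess.check_output(['ceph', 'health'])
                    log.write(out.decode())
                    time.sleep(5)

        def stop(self):
            self.stop_event.set()
            self.thread.join()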
B: Yeah, so, yeah. Sorry it's taking me a little longer than I had hoped, because I had to juggle a bunch of things, but I posted the pull request, which is sort of not totally up to date but gives you the basic idea of it, in the chat window for anybody that's interested. And what I've been doing since then is I've pulled some of the common code that's in all these different benchmarks into the...
A: Yeah, well, cool. Yeah, thank you, Ben, for working on that one, and thank you, Josh, for the stuff that you're looking at, because they're both going to be really, really nice. All right, anything else, guys?