From YouTube: Ceph Performance Meeting 2018-10-18
C
Actually, it has to come inside. The first one minimizes the allocation needed for STL container initialization, but only by four bytes. The second one just makes some optimizations easier for the compiler, especially on a std::list, where empty() can be answered from a very convenient part of the internal data of the std::list rather than through the size() method.
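(For context, a minimal sketch of the std::list distinction being discussed; the function names are made up for illustration. On pre-C++11 libstdc++, std::list::size() could be O(n), while empty() only inspects internal pointers, which is also easier for the compiler to reason about.)

```cpp
#include <list>

// Hypothetical guard functions standing in for a queue check in OSD code.
bool has_work_via_size(const std::list<int>& q) {
  // size() may traverse the whole list on pre-C++11 libstdc++, and even
  // when O(1) it forces the implementation to maintain a counter.
  return q.size() > 0;
}

bool has_work_via_empty(const std::list<int>& q) {
  // empty() just compares internal node pointers: a single cheap branch
  // that the compiler can often fold away.
  return !q.empty();
}
```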
C
Well, after reworking it I can see that the branching is gone; the STL container initialization is removed entirely. But I'm not entirely sure whether we want to introduce such a change: it's a huge number of changes only because of one single branch. Let me provide a link to the...
B
This came up in the RBD sprint testing that folks wanted to do, where, you know, one of the things that they had done to improve performance on NVMe was just kind of, you know, blanket-disabling debugging, more or less, and there was interest in why that, in addition to a couple of other things, made a big difference in some of the tests. And most likely the only thing in there that might have really mattered was probably debug ms = 1. So it sounds like everything else is kind of...
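(As background, a rough sketch of the gating pattern Ceph-style debug logging follows; this is a simplified stand-in, not Ceph's actual dout macro. The point is that a disabled level costs roughly one branch, while an enabled one pays for formatting and output on every message.)

```cpp
#include <iostream>
#include <sstream>

struct Conf {
  int debug_ms = 0;  // what "debug ms = 0" would set in this sketch
};

// Simplified dout-style macro: the level check is cheap; everything
// inside the branch (stream formatting, the write itself) is the part
// that hurts on a fast device when the level is enabled.
#define SKETCH_DOUT(conf, lvl, expr)                \
  do {                                              \
    if ((conf).debug_ms >= (lvl)) {                 \
      std::ostringstream oss_;                      \
      oss_ << expr;                                 \
      std::cerr << oss_.str() << '\n';              \
    }                                               \
  } while (0)

int main() {
  Conf conf;
  conf.debug_ms = 1;  // the setting under discussion
  SKETCH_DOUT(conf, 1, "got message type=" << 42 << " len=" << 512);
}
```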
D
I mean, okay, well, I don't really have any idea what the relative trade-offs are. It's just, like, a lot of time, and I don't know how common this is versus other things. But when something goes wrong and you're like, "why isn't this thing happening?", then not being able to see the messages as they come in and out is bad, because, you know, we need to see what messages happened. Now, it could be that there's a more efficient way to dump them.
D
It could be that we can, you know, do a partial decode, like we do with... I think the OSD op does a partial decode, so it's kind of a truncated one, but just enough to see that it, like, appeared, and maybe we can do more of that with the others. But, you know, that's, like, the instruction-passing system for that, so we can't just say no, you don't get to see what it does anymore when it's in the daemons. Like, we need... I need... let's see that somehow.
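(A sketch of the partial-decode idea with made-up types; Ceph's real wire formats and decoders differ. The point is to decode only a small fixed prefix so something useful can be logged without paying for the full payload.)

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size wire header, for illustration only.
struct WireHeader {
  uint16_t type;         // message type
  uint32_t payload_len;  // full payload length, left undecoded
};

// Partial decode: enough to log "message of type X, N bytes, appeared"
// while deferring (or skipping) the expensive payload decode.
bool decode_header_only(const std::vector<uint8_t>& buf, WireHeader& out) {
  if (buf.size() < sizeof(WireHeader))
    return false;
  std::memcpy(&out, buf.data(), sizeof(WireHeader));
  return true;
}
```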
B
Like you said, though, maybe quantifying or rating the relative importance of all that info that it gives you... maybe we could, at least, if we really care about everything that's there, then at least we can say all of this really is important, you know, make the argument that it's worth the, you know, performance trade-off on a really fast device.
D
Like, why did that one matter as compared to all of the other debug output that gets generated a lot? Is it just that, like, basically, that's the only thing that does anything, because we don't dump a whole lot in the default state, but it's not the only thing that happens? Or is it that, for some reason, the way it's being dumped is pretty expensive, and we can look at the way it's being done?
D
But yeah, I mean, messages are the way you do it: like, you pass in any instruction. So you can't... like, you have to know what the instruction was, or else you're just stuck trying to reverse-engineer it from what you see later in the logs. And if it crashed on an instruction before you see the instruction, then you're just like, "oh yeah, a message came in and something crashed", or maybe you don't even know a message came in and something crashed.
D
I mean, you can talk to other people and see if they say I'm crazy; I'm not the one who investigates these anymore, but that's definitely my experience with this stuff, and especially in the MDS. Actually, now that I think about it... yeah, I don't think we want to get rid of that.
C
See, I'm just asking because, well, it might be that... well, I think full logging is perfectly fine if you target, well, 100K IOPS from a device, but I'm a bit afraid about our transition to very, very fast devices, where, well, the paradigm is changing, and that makes me think that investing so many resources in debugging wouldn't be our way to go.
B
Can we... are there any specific, simple things we can do to reduce the amount of work for this specific log message? Let's just bring it back to that, like, you know, without whole new logging frameworks. Is there any extra information in there that we don't need to process, and are there really simple, specific things that we can do, just...?
B
So let's do that, because it's something that's visible higher up to people in the community and elsewhere, so I think it would be well-received if we could, in some fashion, make debugging cheaper. If this one really is the biggest impact, then I think it'd be a good thing to do.
C
The first step... well, I guess the last one, the very last one, would be to reiterate our approach to logging in the Seastar world. LTTng has very serious restrictions: first of all, your recording macros can take only C types, and can take only 10 parameters. So it's definitely not enough to binary-record all the stuff we have, even when capturing a single ghobject instance. This means we would need to undertake a serious rework of the idea of logging in the new world, in the Seastar world.
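(For reference, a minimal LTTng-UST tracepoint definition illustrating those restrictions; the provider and event names are made up, and the usual header boilerplate with the TRACEPOINT_PROVIDER / TRACEPOINT_INCLUDE guards is elided. Every field must be a plain C type, and the argument count per event is tightly capped.)

```cpp
#include <lttng/tracepoint.h>

// Hypothetical provider "ceph_sketch" and event "osd_op"; only plain C
// types are allowed in TP_ARGS, and the number of parameters is capped,
// so a rich C++ object cannot be recorded directly.
TRACEPOINT_EVENT(
    ceph_sketch,
    osd_op,
    TP_ARGS(uint64_t, seq,
            const char*, oid,
            uint32_t, len),
    TP_FIELDS(
        ctf_integer(uint64_t, seq, seq)
        ctf_string(oid, oid)
        ctf_integer(uint32_t, len, len)
    )
)
```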
C
Well, there were already some ideas to employ our current encoding stuff to generate a binary stream for LTTng. However, this would impose some dedicated log reader; well, even more complexity. Sage was pretty fine with just making much better selectivity with LTTng. However, better selectivity, the ability to turn each single log message on separately from the others, won't resolve the issue with things like the three (if I recall correctly) places we have in the messenger.
C
Well, we need to make a decision on the very basic, very, very first one: whether we are interested in recording a cluster that behaves, we think, behaves correctly. Well, pre-logging, basically. Yes, "pre-logging" is quite good wording here, I guess. Or do we just want to run without any logs by default, and then start enabling them if we need to do some investigation?
B
I will say that at the supercomputing institute that I used to work at, that is the way that we ran: we disabled all logging and debugging and anything else with Lustre. This is kind of, you know, before stuff was really stable, and that is probably not really representative of most users, because we were just chasing the kind of highest performance we could get. But, you know, there at least is a class of users out there that want to work that way.
B
I don't know, it's fine; it's probably the right thing to do short term, just to make sure that it's bigger. But it would be nice if we could have everything kind of work the way that we're trying to make the OSD work now, where you just tell it kind of a size that you want it to fit in (potentially, I guess, because we're going to run inside a container someday) and just have it kind of conform to that.
B
In crypto, so that's good; hopefully we'll hear more from him soon. And then this last one, how to test omap: Doug wrote something, but I think it's very, very focused on deletes. I, personally, am kind of thinking about going in and trying to expand rados bench to support mixed workloads. So let it do a combination of writes, reads and deletes, and kind of just give it the ratios that you want to do. I don't think it would be too terribly difficult to do that.
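(A minimal sketch of the ratio-driven op mixing being proposed, purely illustrative rather than rados bench's actual internals:)

```cpp
#include <random>

enum class Op { Write, Read, Delete };

// Pick the next operation according to user-supplied ratios, e.g.
// 70% writes, 20% reads, 10% deletes.
Op next_op(std::mt19937& rng, double write_ratio, double read_ratio) {
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  double x = dist(rng);
  if (x < write_ratio) return Op::Write;
  if (x < write_ratio + read_ratio) return Op::Read;
  return Op::Delete;
}
```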
B
So that's kind of what I'm thinking, but I'll see what Doug says, if he wants to get this into this other tool that he wrote. It's really old; it's been there for like a year, so I don't know what the holdup is, why it never merged, or if anyone cares if it merges or not. But anyway, that's that one, and there's a couple in the no-movement category. There's a couple of newer things here: I still need to fix my cache bidding thing.
B
Yeah, if you see it, great. That's unfortunate.
B
All right, let's talk about this outside of the meeting. Say, hey, I'm not... I don't have the attention span at the moment to be able to look at this real well, but it would be really nice to be able to keep using the mempool infrastructure of the BlueStore cache, and something like that, for the RocksDB cache. So it'd be worth it if we can make it better, I guess.
B
Okay, the only other thing I had is that I'm looking at... there's a lot of desire right now to understand omap performance, and since I'm already doing a lot of stuff with RocksDB anyway, I thought, wow, I can take a look at it. So the gist of it is that we do a lot of work for omap writes. On my dev box it's about the same number if you're doing, like, 128-byte or 512-byte or even, like, probably 1K omap writes; you know, anything...
B
...in this region seems to perform, or behave, the same. We have a lot of CPU overhead, a lot. It doesn't necessarily look exactly like it, but each tp_osd_tp thread is spending a fair amount of time, like 20% active, and like a third of that is in PrimaryLogPG just doing PG work, lots of work. We also see some time spent in, like, just creating and destroying and dealing with transactions and other stuff. I've got a profile here that I can probably include somehow, but yeah.
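(For anyone wanting to reproduce that kind of load, a small librados sketch issuing 512-byte omap writes; the pool name, object name, and key layout here are arbitrary choices, not from the meeting.)

```cpp
#include <map>
#include <string>
#include <rados/librados.hpp>

int main() {
  librados::Rados cluster;
  cluster.init("admin");            // connect as client.admin
  cluster.conf_read_file(nullptr);  // read the default ceph.conf
  cluster.connect();

  librados::IoCtx io;
  cluster.ioctx_create("rbd", io);  // arbitrary pool choice

  // One 512-byte omap value per operation, matching the sizes discussed.
  for (int i = 0; i < 1000; ++i) {
    librados::bufferlist bl;
    bl.append(std::string(512, 'x'));
    std::map<std::string, librados::bufferlist> kv;
    kv["key." + std::to_string(i)] = bl;

    librados::ObjectWriteOperation op;
    op.omap_set(kv);
    io.operate("omap-test-obj", &op);
  }
  cluster.shutdown();
  return 0;
}
```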
B
The gist of it is that we've got a lot of tp_osd_tp threads, I think sixteen in the trace I just grabbed, by default, and each one of those is doing, you know, at least some amount of work, and, you know, all told it's a lot. FileStore is actually worse: there's some extra translation stuff in there that I don't even know about. I think that was showing up in the trace that I looked at; I took one for FileStore.
B
Also, oddly, there was actually some PG lock contention, which I wasn't exactly expecting to see. But yeah, FileStore was, I think, getting like one point five to two thousand writes per second, using five cores, to do 512-byte omap writes. So, in any event, it's a lot; it's working very, very hard. And it's not entirely just RocksDB. I mean, RocksDB is also working hard, doing compactions and moving stuff around, but...
B
One of the things that I've kind of wanted to do for a while has been to just implement a really stupid PG class, one that doesn't do recovery at all. I mean, in the past I've just, like, removed the log operation from PrimaryLogPG and a couple of other random things: you know, make sure there's no dup-op detection happening, make sure that PG info isn't being written out, this kind of stuff. And it made a big difference.
B
I mean, I was testing pet store at the time, which is just, you know, basically a really fast implementation of the memstore, or was when I was playing with it, and when you take out all that stuff it, like, halves the CPU usage. And I imagine for here it might be a bigger effect, just based on what we're seeing in this trace. But I tend to wonder... I guess my personal suspicion is that log-based recovery on really, really fast devices doesn't really make sense.
A
It's not showing... it is, yeah, okay. So it's just the deferred writes. So I think I was slightly on the wrong track originally, with those counters showing different things for new writes and small writes, but I've been looking, and the difference to me seems to be mainly between random and sequential. So you can see that's random writes to hard disk, with RocksDB on SSD, and that's sequential, so you can see there's quite a difference. Interestingly, the minimum latency time is actually quite close, and this is just like a pure SSD.
A
You can see again that the minimum is very, very close, and the average a little bit higher. I suspect it's sort of bouncing around some sort of queue limit or something during the deferred writes, which is why, although the average is lower, the minimum is in the same area. So I'm continuing to look at that, and I posted this on the mailing list.
A
But then at the bottom there's sort of what I was, or still am, seeing on a slightly different cluster, but with hard drives. So there's a potential there, I guess: in theory, at low queue depths, you should be expecting to get similar latency to an SSD, and that's, I guess, what you should be able to get. So I'm going to keep looking at that and see.
A
Stuff like cleaning up snapshots and doing fstrims: with FileStore you tend to get this massive disk utilization, and this cluster has got about 200 disks, but even so, doing the fstrim would just sort of pin the disks at quite a high utilization for, you know, half an hour or so. With BlueStore, this is it on the same cluster.
A
You didn't even notice it was going on. So I'm guessing that's some of the enhancements in the way that RocksDB is being used to manage the metadata, rather than LevelDB sitting on the spinning disk; that's been a massive improvement. I've also just got some graphs showing the resource usage before and after going from FileStore to BlueStore: overall disk utilization seems to be slightly lower.
A
It's sort of a worst-case scenario, where you want to try and insert IOs into the middle of other ones. But, hey, so I know I brought this up about six months ago, when I upgraded the cluster to Luminous: that scrubbing was having quite an impact on latency where it wasn't in Jewel, and that was because, apparently, an issue had been found where scrubs weren't taking PG locks.
A
So a PG is waiting on a scrub to complete, which could be, on a spinning disk, 10 or 20 milliseconds or whatever; and when you look at that deferred-write stuff, it has an impact there. Because, I guess, if you were looking at a traditional storage array, you've got the caching layer right at the front of the stack.
A
So it doesn't really matter what the latency of the backend disks is: you're always hitting that sort of very fast memory. Whereas when you're looking at this, and you start getting 20 milliseconds chewed up waiting for a scrub to complete, that does have an impact on stuff which is expecting to be able to do low-latency work.
A
That's useful. Did you ever look at EC reads? ...So, yeah, I think that's the only case where it's, like, basically slower: certainly for a big amount of data it's faster, but for anything smaller, probably up to about maybe 512K to a meg, it's slower, because you end up having to wait for all the OSDs.
B
Yeah, the little tests that I did, that I've done previously on EC with BlueStore, actually, where large IOs were faster than doing 3x replication: it was, you know, more approaching like 2x than 3x, typically, for large writes. You know, for really small writes it gets slower, you know, the smaller you go. But for reads, you know, having to read from every single OSD participating definitely made it slower than replication, where you're just reading off the primary.
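(A toy latency model of that trade-off, with made-up numbers; an EC read must wait for all k shards, so a seek is paid on every participating disk, while a replicated read seeks once on the primary.)

```cpp
#include <cstdio>
#include <initializer_list>

int main() {
  const int k = 4;                     // assumed EC data shards
  const double seek_ms = 8.0;          // assumed HDD positioning cost
  const double xfer_ms_per_mib = 5.0;  // assumed transfer cost per MiB

  for (double mib : {0.004, 0.5, 1.0, 4.0}) {
    // Replication: one seek, then stream the whole object.
    double replica = seek_ms + mib * xfer_ms_per_mib;
    // EC: each shard reads mib/k; take ~1.5x a single seek as a crude
    // stand-in for the tail latency of the slowest of k disks.
    double ec = 1.5 * seek_ms + (mib / k) * xfer_ms_per_mib;
    std::printf("%6.3f MiB  replica %5.1f ms  EC(k=4) %5.1f ms\n",
                mib, replica, ec);
  }
}
```

(With these assumed numbers the crossover lands around half a meg to a meg, roughly matching the sizes mentioned above.)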
A
Yeah, no, definitely. I mean, I think certainly smaller IOs really, really suffer; larger IOs, certainly on spinning disk, aren't too bad, because the time it takes to read, you know, 1 meg off four OSDs versus 4 meg off a single one sort of makes up for it. But certainly at the lower, smaller IOs I've seen an issue. Yeah, and I think... so, I think I've located, or understood, what needs to be done to force all new objects...
A
...into the cache tier, which is quite interesting, because coupled with the lack of double writes now with BlueStore, plus the EC overwrites, it means you're looking at about a four or five times reduction in the amount of SSD wear. So that doesn't have quite as much impact as it would have done twelve months ago, say. So I did try and compile a test version of that, but I'm having issues with package dependencies going all funny when compiling 12.2 for xenial, so I'm still trying to look at that.
A
It comes in bursts, which is why I was trying to sort of do this. So there's a period, maybe sort of 10 or 20 minutes, where it gets a very long string of small IOs, and then once that's over it doesn't really matter anymore. So the idea was to probably increase the size of the cache tier to take this into account, and just sort of try and absorb these bursts and then flush them to disk at a slow rate.
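(A quick sanity check of that sizing idea, with all numbers assumed from the rough figures mentioned: a 20-minute burst at 50 MB/s of incoming sync writes needs on the order of 60 GB of cache-tier headroom.)

```cpp
#include <cstdio>

int main() {
  // Assumed burst profile: 10-20 minutes of small sync writes arriving
  // at "tens of megs a second"; illustrative numbers only.
  const double burst_minutes = 20.0;
  const double ingest_mb_per_s = 50.0;
  const double headroom_gb =
      burst_minutes * 60.0 * ingest_mb_per_s / 1000.0;
  std::printf("cache-tier headroom needed: ~%.0f GB\n", headroom_gb);
}
```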
B
Yeah, but with that kind of workload you'd almost want to anticipate that you've got these coming, and, like, try to bring the cache to, like, an 80%-full state, or whatever you need, I guess, so that you can burst them in and then slowly, you know, get rid of stuff later. Yeah, that's interesting, certainly in workloads where you have, like, a lot of stuff constantly coming in.
A
Yeah, I think so, you know, because we've got quite a lot of spinning disk behind this. So in terms of, you know, doing like buffered 4-meg writes, I can get performance, but the actual incoming stream, because it's all sync writes and quite reasonably small, might only be tens of megs a second. So it's more that this is sort of acting as a coalescing-buffer type thing, which is what I'm trying to use it for: absorbing the bursts.