From YouTube: 2018-Apr-19 :: Ceph Performance Weekly
Description
Weekly collaboration call of all community members working on Ceph performance.
http://ceph.com/performance
A
All right, so welcome to the performance meeting for April 19th. Right now we're basically in a freeze for Mimic, so there's not too much activity on the PRs at the moment. [name unclear] suggested that [unclear]. A lot of us had some things to discuss with respect to the BlueStore allocator strategies.
C
Well, [unclear] understand what's happening with the allocator in production at the moment, and we probably need to bring in some additional stats collection, performance counter collection, or things like that. I started with just one parameter, fragmentation, which is mirrored to the performance counters. The corresponding PR was submitted last week.
C
Today I submitted another pull request with a histogram of allocator request and result sizes, which can probably help us as well. Also, I know that we are able to check the memory usage of the allocator from mempools. That might be interesting as well [unclear], and we should probably consider some other parameters that we might need. Adam, do you have anything to suggest?
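For readers unfamiliar with this kind of counter, a request/result size histogram can be sketched roughly as below. This is only an illustration under stated assumptions: the power-of-two bucketing and the names `bucket_for`, `SizeHistogram`, and `record_allocation` are hypothetical and are not taken from the pull request mentioned in the call.

```python
from collections import Counter


def bucket_for(size_bytes: int) -> int:
    """Return the smallest power of two >= size_bytes (hypothetical bucketing)."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    return 1 << (size_bytes - 1).bit_length()


class SizeHistogram:
    """Counts how many sizes fell into each power-of-two bucket."""

    def __init__(self):
        self.counts = Counter()

    def record(self, size_bytes: int) -> None:
        self.counts[bucket_for(size_bytes)] += 1

    def snapshot(self) -> dict:
        return dict(sorted(self.counts.items()))


# One histogram for what callers asked for, one for the extents they got back.
requests = SizeHistogram()
results = SizeHistogram()


def record_allocation(requested: int, extents: list) -> None:
    """Record one request and the lengths of the extents that satisfied it."""
    requests.record(requested)
    for length in extents:
        results.record(length)


# Illustrative data only: a 1 MiB request satisfied by three smaller extents
# shows up as one large request but several smaller results.
record_allocation(65536, [65536])
record_allocation(1048576, [262144, 262144, 524288])
record_allocation(4096, [4096])
```

Comparing the two snapshots shows how often large requests are being served from smaller pieces, which is one symptom of fragmentation.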
D
Well, as for what you said, I can only agree. My confidence that what you think is correct was even amplified by my experience today, because I've actually fixed one problem in StupidAllocator. I mean, I fixed it so that StupidAllocator is now able to coalesce neighboring regions together, and in effect I got a larger number of larger chunks. But it still turned out that this scheme left me without a very large unallocated part, which caused BlueFS to assert.
D
[unclear] some snapshot of the sizes with which our allocator is used in production. I mean not only the sizes, not just the histogram, but actually being able to replay both allocations and deallocations, so we can [unclear] and devise the best strategy to avoid [unclear] allocations. Then we will know at which level we have to enter some secure mode, just to protect our last very large areas for some emergency. Right now we can just guess, and it seems to me not always [unclear].
C
Well, unfortunately, only these recent PRs from my side are intended for that, so I doubt it's possible to collect anything but free list information from existing clusters. Moreover, if we need these stats as soon as possible, then at least the second pull request should probably be merged into Mimic right now. [unclear]
C
Yes, but first of all, I observed the lack of these contiguous extents in some artificial scenarios where the BlueFS partition isn't that large. That's the first comment here. The second one is that we currently have some hint stuff in the allocator that actually [unclear]. So instead of reusing extents from the start of the volume, it enumerates the whole disk, and hence it brings some issues. At least it makes a lack of contiguous extents much more probable.
C
So that's actually an open question: whether it's actually possible to hit that in production when running for a long time. What the probability of that is, I don't know. In a development setup I could reproduce that scenario, but maybe in real life it's not that important, because the chunk was pretty small and the probability of running out of this stuff isn't that high. But I believe we should try to avoid any probability of this case in production, because sooner or later it will fire somewhere for sure.
D
Well, I can create a test which will cause this fatal fragmentation for StupidAllocator with as little as 1% of the disk full. It just requires you to allocate and deallocate, I mean allocate small and big chunks interchangeably, but free the big chunks. And because of the hint mechanism that Igor was talking about, we always try to allocate the next chunk from the last offset, so we will just nicely chop our disk into smaller fragments.
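The scenario described here can be reproduced with a toy model. The sketch below is not Ceph's StupidAllocator: `ToyAllocator`, `run`, the unit sizes, and the 10,000-unit device are all assumptions made for illustration. It interleaves long-lived small allocations with big allocations that are freed right away, once searching from the hint (the end of the previous allocation) and once always searching from offset 0.

```python
class ToyAllocator:
    """Toy extent allocator for illustration; NOT Ceph's StupidAllocator."""

    def __init__(self, size, use_hint=True):
        self.free = [(0, size)]   # sorted list of (offset, length) free extents
        self.hint = 0             # end of the most recent allocation
        self.use_hint = use_hint

    def allocate(self, length):
        # With the hint, search from the hint first, then wrap to offset 0.
        starts = (self.hint, 0) if self.use_hint else (0,)
        for start_from in starts:
            for off, ln in self.free:
                begin = max(off, start_from)
                if begin + length <= off + ln:
                    self._carve(off, ln, begin, length)
                    self.hint = begin + length
                    return begin
        return None  # no contiguous extent is large enough

    def _carve(self, off, ln, begin, length):
        # Remove [begin, begin + length) from the free extent (off, ln).
        self.free.remove((off, ln))
        if begin > off:
            self.free.append((off, begin - off))
        end = begin + length
        if end < off + ln:
            self.free.append((end, off + ln - end))
        self.free.sort()

    def release(self, off, length):
        # Return an extent and coalesce it with adjacent free neighbors.
        merged = []
        for o, l in sorted(self.free + [(off, length)]):
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + l)
            else:
                merged.append((o, l))
        self.free = merged


def run(use_hint, size=10_000, small=1, big=99, rounds=100):
    """Interleave long-lived small and short-lived big allocations."""
    alloc = ToyAllocator(size, use_hint)
    for _ in range(rounds):
        alloc.allocate(small)        # stays allocated
        off = alloc.allocate(big)    # freed immediately
        alloc.release(off, big)
    used = size - sum(l for _, l in alloc.free)
    largest = max(l for _, l in alloc.free)
    return used, largest


with_hint = run(use_hint=True)      # (units used, largest free extent)
without_hint = run(use_hint=False)
```

In this model both runs end with 100 of 10,000 units in use (1%), but with the hint the small allocations land 100 units apart and the largest free extent is 99 units, while without the hint they pack at the start of the device and leave one 9,900-unit extent.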
D
But I agree with Igor that we can avoid this by more aggressively reusing existing smaller chunks instead of following the hint so blindly. I mean, for example, maybe we should look a bit before the hint as well, or at least introduce some strategy that takes into account both the distance from the hint point and the fact that we have to split some other higher-order chunk into smaller ones. Maybe that. But of course that's lacking the actual usage stats for the allocator; I'm just imagining things here.
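One way to picture the strategy being imagined here is a cost function that weighs the distance from the hint against how much of a larger chunk would have to be split off. The sketch below is purely hypothetical: `choose_extent`, the weights, and the linear cost are assumptions for illustration, not anything proposed in code during the call.

```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int]  # (offset, length)


def choose_extent(free: List[Extent], hint: int, length: int,
                  distance_weight: float = 1.0,
                  split_weight: float = 1.0) -> Optional[Extent]:
    """Pick the free extent with the lowest combined cost (hypothetical).

    cost = distance_weight * |offset - hint| + split_weight * leftover,
    where leftover is what would remain after carving `length` from the
    extent. An exact-size extent therefore carries no split cost.
    """
    best, best_cost = None, None
    for off, ln in free:
        if ln < length:
            continue  # too small to satisfy the request
        cost = distance_weight * abs(off - hint) + split_weight * (ln - length)
        if best_cost is None or cost < best_cost:
            best, best_cost = (off, ln), cost
    return best


# A small exact-fit extent sits before the hint; a large extent sits at it.
free_extents = [(10, 4), (100, 1000)]
balanced = choose_extent(free_extents, hint=100, length=4)
hint_only = choose_extent(free_extents, hint=100, length=4, split_weight=0.0)
```

With both terms enabled the exact-fit extent before the hint wins; with the split term switched off the choice degenerates to following the hint and splitting the large extent.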
C
Well, actually, as far as I remember, StupidAllocator tracks the hint by itself. There is an option to pass an external one from BlueStore, but it's not used, and if it's not passed, then StupidAllocator tracks its own hint. And Sage said that's important for rotational drives, to provide contiguous blocks [unclear].
C
But just to mention: actually, we use this sparse allocation for all data, and that's acceptable for us at the moment, but we require contiguous space for BlueFS. I'm not sure that it's that important. As for me, the reason we are not moving in that direction is more a matter of refactoring complexity than of performance.
D
I would be much happier if we had fragmentation as a problem of performance, not of the actual availability of the system. Then in any case we would at most have just a slowdown, not any fatal problems. So I agree with Igor that the fix of not requiring contiguous chunks for BlueFS is just vital for us now.
A
Yes, it seems worth prototyping and seeing how much of a difference that makes. Hopefully it would be able to avoid all the crashes, and hopefully not slow things down too much. Maybe if something slows down by like 50 percent or something, that would be perhaps almost as catastrophic, or even worse than crashing, because it would end up slowing down the rest of the cluster as well.
A
Yeah, that seems pretty reasonable. At that point, you can't run into this same kind of [unclear] problem without being out of extents [unclear] anyway, right? Yeah.
B
At the moment there is a pull request waiting with the OpTracker rework. It has been tested on Incerta, and also verified and profiled directly there. It looks like we have pretty significant contention on the spin lock in OpTracker. However, the key part, I believe, is that we got a report that the problem doesn't pop up on a vstart deployment.
B
However, I would also add that in the case of vstart the messenger is a bottleneck, which causes less pressure on the tp_osd_tp threads and thus also less pressure on the spin lock we have in OpTracker, which by default synchronizes 16 tp_osd_tp threads plus three messenger threads. And I've got a question.
E
[unclear] performance measurements for the impact, though. To justify the complexity of the change, we need to run it with cephx, because people basically don't run without cephx in real scenarios. If we have something that's 5% faster with cephx turned off and zero percent faster with cephx on, I'm not sure that that's going to be worthwhile. [unclear]
B
[unclear] I was googling a bit and found some papers, basically a reference architecture for NVMe testing, and it seems to me that in the attached config cephx was disabled. Also, CBT disables cephx. So it's a matter of what tools should be used for performance validation. What is the reference point?
A
Yeah, and so I think it's a good point that for real-world uses cephx is most likely going to be on. In terms of finding a given bottleneck within the OSD, it's helpful to test with it off, but that doesn't really tell you whether the changes make sense to an end user.
B
[unclear] messenger worker threads on a high-end cluster, I mean a full SSD configuration with pretty powerful SSDs [unclear] at the moment. I'm absolutely sure that by default we have only three of them. It might be that even with cephx enabled, but with plenty of async messenger workers, we would be able to saturate the spin lock in OpTracker.