From YouTube: 2018-Mar-15 :: Ceph Performance Weekly
Description
Weekly collaboration call of all community members working on Ceph performance.
http://ceph.com/performance
For full notes and video recording archive visit:
http://pad.ceph.com/p/performance_weekly
B: Okay, let's see — there's a patch that changes the hashing behavior for several different data types. This is great; it looks like we were using a totally weird hash there. Robbie, thanks.

Okay, yep, so that's good. He managed to track down why a user was seeing very high CPU utilization — turns out it was just hash collisions, which is great. I mean to review and test that; I'll probably get back to it once it's tested. And there's a second one, also more hash collisions with...
B: Yeah, it works in master, and this should fix it in Luminous too, at least. Okay, let's see — there's an improvement on the OSD for the batch listing that merged, and the LMDB experiment: closed out that old pull request. So that's all good.
B: It needs to get retested; I think those are other things that are already fixed — let's just retest it, Mike. This needs QA, so that's probably fine. That local read thing, the small backend optimization, didn't work; I haven't really looked at it yet, but it's kind of a low-priority thing, so I'm not too worried about it.
B: Don't know. All right, okay — yeah, I guess that's mostly it. There's some other stuff here, but all of this stuff is pretty old.
B: I could just update Adam's implementation, or Jesse could if he wants to — I don't know. Jesse seems like he was working on his own one, but I don't know how it's going to be any different. Set up a wrapper mutex that aliases to one — either that or std::mutex — and then just start updating all the uppercase Mutex users to use the std mutex instead. There are going to be a couple that are going to be annoying because they use...
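A minimal sketch of that alias idea, assuming a compile-time switch between std::mutex and a debug-checking wrapper (the names and the wrapper are hypothetical, not the actual Ceph code):

    // Hypothetical sketch: choose the mutex implementation at compile time.
    #include <atomic>
    #include <cassert>
    #include <mutex>
    #include <thread>

    namespace ceph_sketch {
    #ifdef NDEBUG
    using mutex = std::mutex;               // release builds: zero-cost alias
    #else
    class debug_mutex {                     // debug builds: catch self-deadlock
      std::mutex m;
      std::atomic<std::thread::id> owner{};
     public:
      void lock() {
        assert(owner.load() != std::this_thread::get_id());
        m.lock();
        owner.store(std::this_thread::get_id());
      }
      void unlock() {
        owner.store(std::thread::id{});
        m.unlock();
      }
    };
    using mutex = debug_mutex;
    #endif
    }  // namespace ceph_sketch

    // Call sites would then change from the uppercase Mutex to:
    //   ceph_sketch::mutex lock;
    //   std::lock_guard<ceph_sketch::mutex> g(lock);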
C: Basically, glibc offers several kinds of mutexes. Some of them are non-POSIX — of course, they are marked as non-portable — but you can still get access to them: things like the adaptive mutex, and an especially interesting one is the eliding mutex. The eliding mutex is built on top of Intel's Transactional Synchronization Extensions — TSX, the transactional memory extension. The idea is that if you have two threads accessing the same critical section, and their memory operations don't actually conflict, both can proceed without serializing on the lock.
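For reference, a small sketch of how one of those non-portable kinds is requested through the pthread attribute API (PTHREAD_MUTEX_ADAPTIVE_NP is the glibc adaptive type; error handling trimmed):

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE           // PTHREAD_MUTEX_ADAPTIVE_NP is a GNU extension
    #endif
    #include <pthread.h>

    // Sketch: request glibc's adaptive (spin-then-block) mutex kind.
    int make_adaptive_mutex(pthread_mutex_t *m) {
      pthread_mutexattr_t attr;
      pthread_mutexattr_init(&attr);
      pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
      int r = pthread_mutex_init(m, &attr);
      pthread_mutexattr_destroy(&attr);
      return r;  // 0 on success
    }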
C: About the complexity and its consequences: the TSX extension popped up in Haswell; however, a hardware bug was discovered, and it was disabled in a microcode update. The microcode update brought other problems, so a workaround in glibc was made, and a lot of distros are not enabling HLE for default mutexes still, at least the modern-day ones. The support for elision is available in glibc because it has two levels of control: one is something like opt-in elision per mutex; the second one enables elision by default.
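To make the elision idea concrete, a minimal sketch against the TSX/RTM intrinsics — an illustration only, not the glibc implementation; it assumes a TSX-capable CPU and compiling with -mrtm:

    // Sketch: speculative critical section via Intel RTM intrinsics.
    // If the transaction aborts (e.g., a real memory conflict), fall back
    // to taking the lock for real -- roughly what lock elision does.
    #include <immintrin.h>
    #include <atomic>

    std::atomic<int> fallback_lock{0};

    inline void locked_increment(long& counter) {
      unsigned status = _xbegin();
      if (status == _XBEGIN_STARTED) {
        if (fallback_lock.load() != 0) _xabort(0xff);  // lock held: abort
        ++counter;     // speculative write, committed atomically by _xend
        _xend();
        return;
      }
      // Transaction aborted: take the fallback lock the slow way.
      while (fallback_lock.exchange(1) != 0) { /* spin */ }
      ++counter;
      fallback_lock.store(0);
    }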
C: There's a lot of unnecessary stuff. First of all, it usually takes two cache lines; it makes writes; and, moreover, it makes a lot of conditional branching that is packed very tightly together, which could affect branch prediction — I mean, having those extra conditional jumps in the same fetch, the same memory block that the front end is working on.
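A quick way to check the footprint part of that claim locally (the sizes in the comment are what x86-64 Linux/glibc typically reports; other platforms differ):

    // Print the size and alignment of the common mutex types.
    // On x86-64 Linux/glibc, pthread_mutex_t is typically 40 bytes, so an
    // unaligned instance can straddle two 64-byte cache lines.
    #include <cstdio>
    #include <mutex>
    #include <pthread.h>

    int main() {
      std::printf("pthread_mutex_t: size=%zu align=%zu\n",
                  sizeof(pthread_mutex_t), alignof(pthread_mutex_t));
      std::printf("std::mutex:      size=%zu align=%zu\n",
                  sizeof(std::mutex), alignof(std::mutex));
    }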
C: Well, I think I have a branch. It's turning out that all the options we are passing to a Mutex constructor are solely compile-time. The goal is to avoid as many modifications as possible, but that means it translates into staying with the big-M Mutex in most cases, I'm afraid. Yep, okay.
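A sketch of what compile-time-only options enable — moving them into template parameters so the untaken branches vanish (hypothetical names, not the branch under discussion):

    // Sketch: since the Mutex options (e.g., whether debug tracking is
    // wanted) are known at compile time, they can become template
    // parameters and the unused code paths disappear entirely.
    #include <cstdio>
    #include <mutex>
    #include <string>

    template <bool Debug>
    class basic_mutex {
      std::mutex m;
      std::string name;  // only meaningful in the Debug instantiation
     public:
      explicit basic_mutex(std::string n = {}) : name(std::move(n)) {}
      void lock() {
        if constexpr (Debug) std::printf("locking %s\n", name.c_str());
        m.lock();
      }
      void unlock() {
        m.unlock();
        if constexpr (Debug) std::printf("unlocked %s\n", name.c_str());
      }
    };

    using fast_mutex  = basic_mutex<false>;  // release builds
    using debug_mutex = basic_mutex<true>;   // debugging builds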
E: ...that's taking over the network and serving everything. I think it's less clear exactly how we want to divide up the cores among logical OSDs, and the structures within the OSD process — whether there would actually be multiple logical OSD structures and multiple messengers, or if you want to do one shared stack for everything.
E: So if we did have separate, essentially, OSDs, obviously separated onto different cores, or different subsets of cores, such that we could preserve, like, NUMA locality within one OSD — hopefully — then maybe you have perhaps one core per OSD, so you use it as the messenger core for that OSD, or for some subset if there are more cores. Right, one core for every disk.
B: So I think it's actually the other way around from where we want to... So it seems like, if we have multiple hardware — so I'm assuming we're talking about DPDK here — if you have a separate, I think it's what they call a virtual network function, you know, for each OSD, then you could have them on different cores.
A: It's this kind of question of, like, where ultimately the layer is at which you're transferring data to other cores, right? What happens locally on the core and what happens distributed; what happens at the DPDK level versus what happens at, like, the messenger level. And, I don't know — do we have a clear understanding yet of kind of what...?
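For illustration, a generic sketch of that "hand data to the owning core" layer — plain C++ threads and queues standing in for whatever framework ends up doing this:

    // Sketch: route each incoming item to the core that owns it, so all
    // per-OSD state stays core-local and needs no locking there.
    #include <condition_variable>
    #include <deque>
    #include <mutex>
    #include <vector>

    struct CoreQueue {
      std::mutex m;                 // the cross-core boundary in this sketch
      std::condition_variable cv;
      std::deque<int> items;        // stand-in for messages
      void push(int v) {
        { std::lock_guard<std::mutex> g(m); items.push_back(v); }
        cv.notify_one();
      }
      int pop() {
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [&] { return !items.empty(); });
        int v = items.front(); items.pop_front();
        return v;
      }
    };

    // Dispatch: whoever receives a packet decides which core owns it.
    inline void dispatch(std::vector<CoreQueue>& cores, int osd_id, int msg) {
      cores[osd_id % cores.size()].push(msg);  // shard by OSD id
    }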
B: Are we going to throw all this AsyncMessenger stuff up in the air, with the Seastar refactor being a version of AsyncMessenger or whatever? Because we probably want both of those things, right? Yeah, and I'm kind of guessing that the Seastar one is going to be a port — it's going to be, like, copy the directory and then change it. I don't know; it's not going to live in the same tree or whatever, right? Maybe — I don't know. Yeah.
B: And the end point is that we have a Seastar messenger that implements messenger v2 only, AsyncMessenger does both messenger v1 and messenger v2, and SimpleMessenger does only messenger v1. I think that's probably a fine end point, because most of the world will be on AsyncMessenger as they make the transition, and then once they do make the transition to messenger v2 for their whole cluster, they can start using the Seastar one.
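Sketched as types, the end state described above (illustrative names, not the actual Ceph classes):

    // Sketch of the proposed end state: which messenger speaks which
    // protocol version.
    enum class Protocol { v1, v2 };

    struct Messenger {
      virtual bool supports(Protocol p) const = 0;
      virtual ~Messenger() = default;
    };

    struct SimpleMessengerSketch : Messenger {    // legacy: v1 only
      bool supports(Protocol p) const override { return p == Protocol::v1; }
    };
    struct AsyncMessengerSketch : Messenger {     // transition: v1 and v2
      bool supports(Protocol) const override { return true; }
    };
    struct SeastarMessengerSketch : Messenger {   // future: v2 only
      bool supports(Protocol p) const override { return p == Protocol::v2; }
    };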
B: I guess that won't actually work, because we still need to support messenger v1 for clients. Yeah, so I'd probably have to do both. I take it back — we'll need both in the async msgr regardless, just so we can get all the new critical features.
B: It's not that significant a rewrite, but I don't know. I think the part that's fuzzy is how we're going to internally structure the code when we have multiple OSDs sharing the same messenger, or at least the same core. Are they going to have different messenger implementations that have some, like, back-end thing that they both share, or are they literally going to point at the same messenger, or what? I'm not really sure exactly how that's going to work.
B: Yeah, there are, like, a bunch of different ways we could go at it. It could be that when you run ceph-osd, like, on the command line, you tell it all the OSDs it's going to be, and it just does it all. Or it could be that you have, like, an OSD runner process that you start, and then you, like, tell it, you know: start up, instantiate both of these OSDs — and it instantiates the OSD, or, like, shuts it down.
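A toy sketch of that second option — a runner process that instantiates and tears down OSD instances on request; everything here is hypothetical:

    // Sketch: one process hosting several OSD instances, created and
    // destroyed at runtime on request (e.g., from a control socket).
    #include <cstdio>
    #include <map>
    #include <memory>

    struct OsdInstance {                  // stand-in for a logical OSD
      int id;
      explicit OsdInstance(int i) : id(i) { std::printf("osd.%d up\n", id); }
      ~OsdInstance()                      { std::printf("osd.%d down\n", id); }
    };

    class OsdRunner {
      std::map<int, std::unique_ptr<OsdInstance>> osds;
     public:
      void start(int id) { osds.emplace(id, std::make_unique<OsdInstance>(id)); }
      void stop(int id)  { osds.erase(id); }
    };

    int main() {
      OsdRunner runner;
      runner.start(0);   // "instantiate osd.0"
      runner.start(3);   // "instantiate osd.3"
      runner.stop(0);    // shut osd.0 down; osd.3 keeps running
    }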
B: So you have OSDs coming and going within the same process, and then ceph-osd might actually just be a thing that communicates with the background process to, like, instantiate the OSDs that you asked about. Yeah, and if you want it more transparent — oh yeah — but then it's like, if you have a systemd unit file for each OSD today, do they sit there and just talk to the shared process? I don't know; there are, like, four ways you could do it.
B: ...instantiate it — and so we could eventually get to the point where you have the process running multiple OSDs and, like, hardware fails and gets replaced, and the other OSDs do stay running. Yep. There are, like, ten other things that have to happen to get there — at least ten other things have to happen to get there. I guess it's probably a good place to aim, but it might be that for the initial thing we just start them all up at once.
B: But yeah, I think the first part I'm worried about is just how those OSDs will have their messenger-facing interfaces constructed when they're eventually sharing. I think mostly the only thing that's sort of, like, per-entity state tied to the messenger is the my_addr stuff.
B: And there are, like, a bunch of other little things too, like having multiple addresses for the same endpoint — so you could have, like, an IPv4 and an IPv6 address, and you would just bind it to two ports, and you could connect to either one of those. That's a less ambitious piece that would still be useful; that might get us part of the way there. And there's one other one.
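A minimal sketch of that less ambitious piece — one logical endpoint listening on both an IPv4 and an IPv6 socket (standard POSIX calls, error handling trimmed):

    // Sketch: bind the same service on IPv4 and IPv6 so clients can
    // connect over either -- two listening sockets, one logical endpoint.
    #include <arpa/inet.h>
    #include <cstdint>
    #include <netinet/in.h>
    #include <sys/socket.h>

    int listen_v4(uint16_t port) {
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      sockaddr_in a{};
      a.sin_family = AF_INET;
      a.sin_addr.s_addr = htonl(INADDR_ANY);
      a.sin_port = htons(port);
      bind(fd, reinterpret_cast<sockaddr*>(&a), sizeof(a));
      listen(fd, 128);
      return fd;
    }

    int listen_v6(uint16_t port) {
      int fd = socket(AF_INET6, SOCK_STREAM, 0);
      int on = 1;
      // Keep this socket v6-only so the separate v4 bind above succeeds.
      setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
      sockaddr_in6 a{};
      a.sin6_family = AF_INET6;
      a.sin6_addr = in6addr_any;
      a.sin6_port = htons(port);
      bind(fd, reinterpret_cast<sockaddr*>(&a), sizeof(a));
      listen(fd, 128);
      return fd;
    }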
A: What does, like, the ScyllaDB kind of architecture look like in regards to DPDK and all of this? Do they have anything that they've worked through that didn't work well, or did work well? I'm not sure exactly. That's a good question.
B: Well, stepping back a minute, a useful midpoint is: we could use DPDK to grab the entire NIC and have multiple OSDs in the same process, but they would still be running independent messengers on different ports, right — still using messenger v1 — and that still captures, like, most of our goals, right? The performance ones, just not the messenger v2 ones. So I don't think we're actually blocked on the new messenger.
F: I don't think that's possible, because you just — like, you get interrupts on CPUs. Or maybe they do polling, but, like — so maybe they can poll, like, pre-registered or, like, DMA'd memory that the NIC has been writing to directly, and they just skip over stuff that isn't theirs. But they're still going to have to see it on some level, at least.
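Roughly what that polling model looks like, sketched generically (not the DPDK API): a ring of DMA-written descriptors that each consumer scans, skipping entries addressed to someone else:

    // Sketch: consumers poll a shared RX ring that the "NIC" DMA-writes,
    // and each consumer only claims packets addressed to it.
    #include <array>
    #include <atomic>
    #include <cstdint>

    struct RxDescriptor {
      std::atomic<uint32_t> ready{0};  // set by the NIC when DMA completes
      uint16_t dest_port = 0;          // which consumer this packet is for
      // ... payload pointer, length, etc.
    };

    constexpr size_t kRingSize = 256;
    using RxRing = std::array<RxDescriptor, kRingSize>;

    // Poll loop for one consumer: it still "sees" every ready slot, but
    // only consumes the ones that are its own.
    inline void poll_once(RxRing& ring, uint16_t my_port) {
      for (auto& d : ring) {
        if (d.ready.load(std::memory_order_acquire) == 0) continue; // empty
        if (d.dest_port != my_port) continue;   // not ours: skip over it
        // ... process the packet, then hand the slot back to the NIC:
        d.ready.store(0, std::memory_order_release);
      }
    }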
E: Yes — this could be a library depending on that, whichever is faster, sitting on the NIC hardware, where...
A: We had some evidence, and concern, that indexes and filters might be pushed out of cache, especially in, like, RGW cases where there's a lot of key-value pairs in the database and a lot of key-value data per object. So what this does — there's actually a separate PR for RocksDB that exposes high-priority pool information from the LRU cache. RocksDB does not have this capability for high-priority pools with the clock cache, so that's unfortunate.
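For context, a sketch of the knob family involved — reserving a high-priority slice of the LRU block cache and steering index/filter blocks into it, using RocksDB's public options from roughly that era (check your version's headers):

    // Sketch: give index/filter blocks a high-priority slice of the
    // LRU block cache so data reads are less likely to evict them.
    #include <rocksdb/cache.h>
    #include <rocksdb/options.h>
    #include <rocksdb/table.h>

    rocksdb::Options make_options() {
      // 1 GiB cache; the last argument reserves 50% as the high-pri pool.
      auto cache = rocksdb::NewLRUCache(1UL << 30, /*num_shard_bits=*/4,
                                        /*strict_capacity_limit=*/false,
                                        /*high_pri_pool_ratio=*/0.5);
      rocksdb::BlockBasedTableOptions table;
      table.block_cache = cache;
      table.cache_index_and_filter_blocks = true;
      table.cache_index_and_filter_blocks_with_high_priority = true;

      rocksdb::Options opts;
      opts.table_factory.reset(rocksdb::NewBlockBasedTableFactory(table));
      return opts;
    }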
A: It means that right now this only kind of works well with the LRU cache implementation, but maybe we can improve that. Having said that, the idea here is that we prioritize the indexes and filters in a high-priority pool. First, we try to always give that memory, and when there are drastic changes in the usage — say, all the indexes and filters get flushed out of cache —
A: — then we very slowly shrink that pool, so that if they come back in quickly, we don't have to, like, reallocate it really fast. Right now this happens every five seconds, so it's pretty low-overhead; there's not a whole lot of change. It may be that we can speed that up, or that we want to slow it down, but kind of the goal here is low impact. It's just, you know, very slowly looking at how to rebalance these caches.
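The shape of that slow-shrink loop, sketched with hypothetical numbers and helper names (the real tuning lives in the PR under discussion):

    // Sketch: every few seconds, nudge the high-priority pool toward its
    // observed usage instead of resizing it all at once.
    #include <algorithm>
    #include <cstddef>

    struct CacheStats { size_t high_pri_used; size_t high_pri_reserved; };

    // Shrink the reservation by at most 5% per tick, never below usage,
    // so a burst of evicted index/filter blocks can come right back.
    inline size_t next_reservation(const CacheStats& s) {
      size_t step = s.high_pri_reserved / 20;  // 5% of the current size
      return std::max(s.high_pri_used, s.high_pri_reserved - step);
    }

    // Caller's loop (every 5 seconds in the current patch):
    //   while (true) {
    //     stats = query_cache();                       // hypothetical helper
    //     set_high_pri_reservation(next_reservation(stats));
    //     sleep_for_5_seconds();
    //   }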
A: It may actually be that focusing on giving the low-priority block cache more memory is important in that case. But part of this ties into this kind of weird behavior in RocksDB where, during compaction, if the amount of data in the low-priority pool exceeds the soft cap that you set, then all of the indexes and filters get flushed out of the high-priority pool — and I don't know why that is. I kind of generally asked about this on the RocksDB Facebook dev group and didn't get an answer back, so I'm not sure people even really realize it's happening, since no one had any ability to even look at what was happening in the high-priority pool before. So there may be something there where trying to optimize around this case doesn't make sense, because it's just broken behavior — but we'll find out, I guess. So I guess what it comes down to now is: I'm doing a lot of testing, trying to look at, okay...
A: Ultimately, after the onode cache and the low-priority KV cache, then we have data that can potentially be cached as well for buffered reads in BlueStore. So that's kind of the last in order right now — the priority, at least currently, according to this thing. So I have a bunch of test data, but I haven't really organized it yet; I'm still collecting more stuff. Hopefully next week I should have a nice set of crazy, dense graphs that will show some of these behaviors. But that's basically it.
A: I have not actually tried doing something like setting it really fast and then looking at profiles. Right now, at, like, five seconds, it doesn't appear to be particularly much of an overhead at that resolution, I guess. But I think at some point, once maybe the kind of overall behavior has been worked out, then we should look at: okay, if we start doing this really often, how bad is it? Yeah — the truth is, I'm not totally sure.
A: Josh, do you remember — do people ask about that stuff very often? I mean, are people tweaking the size of those buffers or caches on the client side?
A: One of the things over the years that I've noticed is that people have a habit of just, like, finding some random tunings that somebody's made and then, you know, copy-pasting them in. So they'll have, like, 32 — you know — OSD threads, and things bumped randomly in various crazy ways, and it doesn't really make any sense; they just, like, bumped everything way up. Yeah, that's kind of why I want a...
C: A question related to our mutex abstraction, I mean — and the DPDK... sorry, in the Seastar OSD we have mutexes in many places, including also the shared common base. What do we want to do with that? Maybe it's a good time to try to resolve this problem, as we are going to rework the mutexes anyway.
E: There are, I guess, a couple things there. One is that, eventually, when everything is in the Seastar framework, we won't need mutexes — like, I'd expect that the mutexes there are kind of Seastar-style mutexes, which are not doing any atomic operations, but are basically just serving as booleans for lock/unlock within one core.
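A sketch of that idea: within a single reactor core there is no preemption between tasks, so the "mutex" degenerates to a flag plus a queue of waiting continuations — generic C++, not the actual Seastar API:

    // Sketch: a cooperative, single-core "mutex". No atomics needed,
    // because all tasks on one core run to completion without preemption;
    // the lock is just a boolean plus a wait-queue of continuations.
    #include <deque>
    #include <functional>

    class core_local_mutex {
      bool locked = false;
      std::deque<std::function<void()>> waiters;
     public:
      // Run fn while holding the lock; queue it if the lock is busy.
      void with_lock(std::function<void()> fn) {
        if (locked) { waiters.push_back(std::move(fn)); return; }
        locked = true;
        fn();
        unlock();
      }
     private:
      void unlock() {
        while (!waiters.empty()) {
          auto next = std::move(waiters.front());
          waiters.pop_front();
          next();          // still holding the lock: run the next waiter
        }
        locked = false;
      }
    };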