From YouTube: 2019-04-11 :: Ceph Performance meeting
Well, for now, let's get started, though. Okay, new PRs: this week there is a new PR from aaron85 that changes how the op queue works, and that makes me super nervous. But it sounds like he thinks that we can get better lock contention behavior with these changes, and then also maybe better performance, so I'm hoping that he can give us a lot of performance data on it before we merge. But we'll see; right now I think there's more work that needs to be done.
Then we've got a librbd one; I think Jason is reviewing that, and it needs QA. If I remember right, he thought it looked okay. There's more Crimson work, so performance improvements there; those are coming in pretty regularly every week. And then there's this one I'm super excited about, in BlueStore.
This is a new back-end I/O engine using the io_uring stuff that Jens Axboe merged into the kernel. This is all really new, but basically the idea is replacing libaio, and I believe it eliminates some of the blocking behavior that libaio can see in io_submit. It just generally seems like it's working much better, and some of the performance improvements are substantial, assuming that this carries over; the tests were done on a RAM disk, I think, and using some non-standard settings, like smaller blob sizes and a smaller min_alloc size.
But if that carries over to, you know, NVMe, this could be pretty substantial. And the big thing is that the performance improvements were in areas where I've seen kind of strange dips for us in the past: 128K I/Os, for whatever reason, have previously seemed to perform worse in some cases than smaller I/Os, and here he's seeing a big improvement there. So this could be really nice, could be substantial; I'm really excited about it.
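
(For reference, a minimal sketch of the submission model in question, not the PR's actual code: with libaio, io_submit() can block in the kernel when the device queue fills up, while io_uring passes submissions and completions through shared rings via liburing. The file path here is just a placeholder.)

    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <sys/uio.h>

    int main() {
        struct io_uring ring;
        io_uring_queue_init(8, &ring, 0);            /* ring with 8 submission slots */

        int fd = open("/tmp/data", O_RDONLY);        /* placeholder test file */
        char buf[4096];
        struct iovec iov = { buf, sizeof(buf) };

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_readv(sqe, fd, &iov, 1, 0);    /* queue a 4K read at offset 0 */
        io_uring_submit(&ring);                      /* hand the SQE to the kernel */

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);              /* reap the completion */
        printf("read %d bytes\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        return 0;
    }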
Let's see, what else closed this week? Some Crimson stuff for debugging. No performance improvements really there yet, but that kind of further lays the framework for some other things, and then also for adding the memory target work to the other daemons. So that's just kind of in the works, framework for that.
I wanted to mention that on the mailing list there was a user who was seeing big heap memory usage in their OSDs. They set the memory target to 4 gigabytes, but the heap was actually about 6, so, you know, RSS usage was high.
That apparently was all memory that was unmapped but that the kernel had yet to reclaim. I had mentioned to them that they might want to disable transparent huge pages, but it sounds like as soon as they introduced memory pressure, the kernel did end up reclaiming it. So this is kind of one of those cases where we can't control the kernel; it will or won't reclaim memory as it sees fit, and in this case I guess it just didn't do so until there was memory pressure. So, yeah.
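
(A minimal sketch of the mechanism at play, assuming tcmalloc, which Ceph links by default; this is the same gperftools call that sits behind Ceph's heap release command, not code from the meeting:)

    #include <gperftools/malloc_extension.h>

    int main() {
        /* ... allocate and free a lot of memory here ... */

        /* Ask tcmalloc to madvise() its free pages back to the kernel.
           Even then, RSS may not drop until the kernel actually reclaims
           the unmapped pages, e.g. under memory pressure. */
        MallocExtension::instance()->ReleaseFreeMemory();
        return 0;
    }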
But as soon as you start missing, then you'll start double populating, and then it gets bad. So my hope is that we can disable the block cache for the column families that store BlueStore onodes and then avoid that kind of double caching entirely. That's what I'm working on right now. I don't know if it will work or not, but I'm hopeful; I think that will be a big performance improvement when we don't have enough memory to cache everything.
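
(A rough sketch of what that could look like with RocksDB's C++ API, not the actual patch; the idea is per-column-family table options with the block cache turned off:)

    #include <memory>
    #include <rocksdb/options.h>
    #include <rocksdb/table.h>

    /* Build ColumnFamilyOptions whose reads never populate the RocksDB
       block cache, so the only in-memory copy of the data is the decoded
       one (e.g. BlueStore's onode cache). */
    rocksdb::ColumnFamilyOptions make_uncached_cf_options() {
        rocksdb::BlockBasedTableOptions table_opts;
        table_opts.no_block_cache = true;
        rocksdb::ColumnFamilyOptions cf_opts;
        cf_opts.table_factory.reset(
            rocksdb::NewBlockBasedTableFactory(table_opts));
        return cf_opts;
    }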
So that's about it for me. I don't see Kefu or Radek today, so maybe we won't get a Seastar update. Other than that, it's progressing rapidly, and performance for reads is already faster than the stock stuff, or, well, stuff with BlueStore. It's not doing as much yet, but the results are really encouraging, so I think it's going to be good. And then Adam, not Adam Emerson, the other Adam, he's still working on sharding, I'm not sure.
Well, it'll try to read it from RocksDB, and that read will then potentially try to read it from the block cache, and if it's not in the block cache, it will actually go through and find whatever SST file it's in and then read it from there. And then, when that read happens, it will populate RocksDB's block cache with the block that that read came from, and it will also then populate BlueStore's, or no, BlueStore will populate the onode cache.
No, we've discovered that you really kind of want those reads to happen from the BlueStore onode cache. When you're on really fast devices like NVMe, it's substantially faster than going through the whole decode process to fetch it from RocksDB itself, and even fetching it from the block cache in RocksDB isn't really that fast; it's okay, but it's not great.
So I think what we really want is: we probably want to be fetching OMAP data from the block cache, like caching it there, at least right now, and then certainly caching indexes and filters in RocksDB. If we have to do a read from disk, we don't want to be, you know, scanning at every level and all the different SST files for where that key lives.
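
(For reference, a minimal sketch of the RocksDB knobs for keeping index and filter blocks in the block cache; these are generic RocksDB options, not necessarily what BlueStore ships with:)

    #include <memory>
    #include <rocksdb/cache.h>
    #include <rocksdb/options.h>
    #include <rocksdb/table.h>

    rocksdb::Options make_options_with_cached_filters() {
        rocksdb::BlockBasedTableOptions table_opts;
        table_opts.block_cache = rocksdb::NewLRUCache(512 << 20);  /* 512 MiB, illustrative */

        /* Keep index and filter blocks in the block cache so a point
           lookup doesn't have to re-read them from every SST level. */
        table_opts.cache_index_and_filter_blocks = true;
        table_opts.pin_l0_filter_and_index_blocks_in_cache = true;

        rocksdb::Options opts;
        opts.table_factory.reset(
            rocksdb::NewBlockBasedTableFactory(table_opts));
        return opts;
    }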
We kind of want that too, to not pollute the block cache and instead just cache those in the onode cache. It can happen regardless of what the I/O size is; it's just, you know, if there's an onode cache miss, then you're going to end up with the onode read into both, or, sorry, the onode read into the BlueStore onode cache, and the block that the onode exists in, or the RocksDB representation of it, that block will get read into the block cache. Does that make sense? Yeah.
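
(To make that double-caching path concrete, a pseudocode-style sketch of the lookup order described above; the names and types here are illustrative stand-ins, not Ceph's actual code:)

    #include <string>
    #include <unordered_map>

    /* Illustrative stand-ins for the two caches described above. */
    std::unordered_map<std::string, std::string> onode_cache;  /* decoded onodes */
    std::unordered_map<std::string, std::string> block_cache;  /* raw RocksDB blocks */

    std::string read_block_from_sst(const std::string& key);   /* disk read (stub) */
    std::string decode_onode(const std::string& raw);          /* decode step (stub) */

    std::string get_onode(const std::string& key) {
        if (auto it = onode_cache.find(key); it != onode_cache.end())
            return it->second;                    /* onode cache hit: no decode needed */
        std::string raw;
        if (auto it = block_cache.find(key); it != block_cache.end()) {
            raw = it->second;                     /* RocksDB block cache hit */
        } else {
            raw = read_block_from_sst(key);       /* miss: read the SST block... */
            block_cache[key] = raw;               /* ...populating the block cache */
        }
        /* Decode, then populate the onode cache too: the same data now
           lives in both caches, which is the double caching described. */
        std::string onode = decode_onode(raw);
        onode_cache[key] = onode;
        return onode;
    }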
That's not a definite answer, but it kind of looks that way, and I think that both because of that and because we're not double caching, we're going to see a lot more onodes able to be cached with the cache-balancing behavior. I think overall it's just going to behave a lot better, so we'll see, but that's kind of the hope right now.
Okay, that'll be interesting; that gives me ideas. Some of the stuff I do here: I work on our alliances as the SES guy, the staff guy, and I've got this large cluster with a bunch of SSDs and some P4800s that I'm doing testing against. So I'm looking at the tunings that make sense and then trying to understand the odd behaviors I see sometimes. That's why this is important to me. Yeah.
So my guess is that CPU usage, if you're just using the P4800s, is going to kind of dominate, since they're so fast, and that would be the bigger limitation you'd run into. If you're using them as, like, the DB/WAL and have another device for the block storage, then kind of the question would be whether or not, you know, letting the P4800 service reads, it'll become a question of whether you're better off optimizing for lower CPU usage or optimizing to reduce reads and writes on the P4800. And that, I have no idea; hopefully we'll be able to test that soon. We don't have any in-house right now, but we're getting some, so that'll be a really interesting question.