From YouTube: CDS G/H (Day 2) - RBD: Database Performance
Description
https://wiki.ceph.com/Planning/CDS/CDS_Giant_and_Hammer_(Jun_2014)
25 June 2014
Ceph Developer Summit G/H
Day 2
RBD: Database Performance
A
Looks like we have jyg, one of the other people. Mt wong was here a bit ago, but he dropped.
B
A
Yeah, well, it looks like we're ready to move to the database performance one. So, Luke, are you able to hear us all right?
C
Okay, so by the way, we are waiting for Luke to come; he is just coming from another meeting. Okay, yes, he's in now.
B
Thanks, guys, for giving us a chance to talk about our idea. The blueprint that we put up is basically to look at whether we can get databases and whatnot to run on RBD specifically. I think a lot of us will be aware that a lot of discussion has been put up on the email lists within the last couple of days and weeks. One of the things we hope to achieve out of this blueprint is maybe a set of documents or best practices, where different kinds of databases can have a different set of documents on what to do and what not to do, given the kind of workload, benchmark, and so forth.
B
So generally, that's what we hope to achieve with this particular blueprint. And obviously one of the hidden agendas, the ultimate goal, is to see how we can leverage Ceph's capability to run a database on a really, really wide area network. Rather than maybe using multi-site to do replication, can we just do it in one single cluster covering the whole wide area network, for example?
B
So that's the goal. I know it's quite this, but that's the fight of why we put up this particular blueprint.
E
Cool. So you said that you're talking a little bit about testing over wide area networks as well as LANs. I think it might be good to just get a kind of baseline for a lot of different workloads, to see where things are at.
B
B
Yes, basically, that's what we would like to look at. But I think the first step we are trying to do right now is just to understand how databases behave, for example in a virtual environment with RBD underneath it.
B
So we recently did some benchmarks with Postgres in our little testbed, and we found that the performance is not really what we were expecting. Obviously, we don't have the luxury of having SSDs and whatnot; we are just doing it on SATA drives for our OSDs. So what we really hope moving forward is: are there any other people within this community who are willing to share experiences and best practices to look at? Not just Postgres, which we are using, but MySQL, NoSQL, and other workloads. Obviously, once we have some of these things tied down, then maybe we can move forward towards some use case for wide area networks as well. So I think our first step is at least to figure out what works, what doesn't work, and what needs to be done.
A
Yeah, I think the database performance is going to be interesting because, in my limited understanding of what databases are usually doing, they usually have a journal file where they have a stream of small writes that they're fsyncing, or using direct I/O to lay down. With that high rate of small writes and the general striping strategy, those are going to pile up on a single object, which could potentially be a problem, at least for high-performance databases. This is one of the reasons why we added the fancy striping, so you can shard that across a lot of objects. On the other hand, you don't want to do that for the full image, because it will tend to amplify your writes when they span stripe boundaries.
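To make the striping trade-off concrete, here is a minimal Python sketch of how a byte offset in an image could map to an object under striping. The parameter names (stripe_unit, stripe_count, object_size) follow Ceph's striping terminology, but the functions and the example sizes are illustrative assumptions, not code taken from librbd.

```python
def map_offset(offset, stripe_unit, stripe_count, object_size):
    """Map an image byte offset to (object number, offset within object).

    Illustrative only: a RAID-0 style layout where stripe units are
    dealt round-robin across `stripe_count` objects.
    """
    assert object_size % stripe_unit == 0
    stripes_per_object = object_size // stripe_unit
    block = offset // stripe_unit              # which stripe unit overall
    stripe = block // stripe_count             # which row of stripe units
    stripe_pos = block % stripe_count          # which object within the set
    object_set = stripe // stripes_per_object  # which group of objects
    object_no = object_set * stripe_count + stripe_pos
    offset_in_object = (stripe % stripes_per_object) * stripe_unit + offset % stripe_unit
    return object_no, offset_in_object


def objects_touched(start, length, step, **layout):
    """Set of objects hit by sequential writes of `step` bytes."""
    return {map_offset(off, **layout)[0] for off in range(start, start + length, step)}


MiB = 1024 * 1024
# Simple layout: one stripe unit per object, so a 1 MiB journal region
# written in 4 KiB pieces lands entirely on one 4 MiB object.
simple = dict(stripe_unit=4 * MiB, stripe_count=1, object_size=4 * MiB)
# "Fancy" layout (example values): 64 KiB units dealt across 16 objects.
fancy = dict(stripe_unit=64 * 1024, stripe_count=16, object_size=4 * MiB)

print(len(objects_touched(0, MiB, 4096, **simple)))  # 1
print(len(objects_touched(0, MiB, 4096, **fancy)))   # 16
```

With the fancy layout the same journal region is spread over 16 objects, but a single write that crosses a 64 KiB boundary now has to be split across two objects, which is the amplification mentioned above.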
A
I assume that big databases like Oracle and so forth will let you specify a separate device for the database journal, apart from all the other data. At least that's my assumption; I actually don't know if that's true, but I assume that they do that. I don't know that Postgres and MySQL do that at all, though, since they're sort of designed to run on traditional file systems.
A
Right. I think for mid-size and small databases it's not going to make a big difference in any case. Having the journal pound on one object for a while and then on the next object for a while is not going to be too limiting, especially since it's pure write, so you're not doing any reads.
A
F
B
Probably good. So with that in mind, have Inktank or any of the other guys out there done some similar work that may be helpful for some of us here to work on, and maybe come up with some work packages or whatnot? Not packages as such, but more like items that we can work on in sequence, so that we can feed back to the community and say: okay, hi guys, we have done something like this, and we think this is a good set of best practices that you can start off with. And maybe we can get more discussion rolling later on.
E
Deck in the chat here is saying that they've been running Postgres infrastructure on RBD and they're happy to help in trying to improve that, and also that Postgres actually can put the journal on a separate mount point.
D
We did some tests on running MongoDB on RBD, but had not so much luck. Performance was really bad, especially since MongoDB can be configured to run multiple shards on different VMs, with replica sets replicating. So you run into problems like having multiple VMs writing to the same pool, so with replica level three, for example, you get the data written nine times or something like that, and that's not really performant.
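The "nine times" figure can be reproduced with a one-line calculation. This is only a sketch: it assumes a three-member MongoDB replica set whose members each live on an RBD image in a pool with three RADOS replicas, and it ignores any additional journal writes.

```python
def physical_copies(app_replicas, pool_size):
    """Copies written to disk for one logical write when an
    application-level replica set sits on top of a replicated pool."""
    return app_replicas * pool_size


# Assumed example: 3 database replicas on a pool with 3 RADOS replicas.
print(physical_copies(app_replicas=3, pool_size=3))  # 9
```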
D
So you need to split these over different pools and make sure the pools are not hitting the same hardware. And for testing, we ran the database directly on real hardware, traced it with the blktrace tools, and replayed it on the VM to see what the difference really is between real hardware and RBD.
A
I wonder if the way to focus in on what the specific issues are, and how best to address them, is to get people to focus on one particular database and then do a series of experiments to narrow down exactly what we can do to improve it, whether it's the latency on the write-ahead log, for example.
A
So one thought off the top of my head would be that if Postgres lets you put the write-ahead log on a different device, try putting just the write-ahead log on RBD and see what the effect is, or put the database on RBD but put the write-ahead log on a local disk, and figure out whether it's the log that's latency sensitive, which is what I'm guessing, or whether it's the background activity on the rest of the file system that's the problem, or mostly at least. And then we can figure out whether, you know, maybe the solution is that when you want good database performance, you have two pools in RADOS, where one of them is backed by disk.
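One rough way to compare the two placements before involving a real database is to time small synchronous writes on each candidate file system. The following is a minimal sketch under stated assumptions: the 8 KiB write size and the write count are arbitrary example values, and the function name is made up for illustration. It approximates a write-ahead log pattern (append, then fsync) and is not a substitute for a proper database benchmark.

```python
import os
import statistics
import tempfile
import time


def fsync_write_latencies(directory, count=200, size=8192):
    """Append `count` blocks of `size` bytes to a scratch file in
    `directory`, fsyncing after each one, and return the per-write
    latencies in seconds."""
    payload = b"\0" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force data to stable storage, like a log commit
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Hypothetical usage: run once against a directory on an RBD-backed
    # file system and once against a local disk, then compare.
    lat = fsync_write_latencies(tempfile.gettempdir(), count=50)
    print(f"median {statistics.median(lat) * 1e3:.3f} ms, "
          f"max {max(lat) * 1e3:.3f} ms")
```

If the median on the RBD-backed directory is far above the local disk figure, that would support the guess that log latency is the dominant factor.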
A
F
E
A
This is sort of a side discussion, but while we're talking about Postgres: there was a long discussion at the Linux Storage and Filesystem Summit about three months ago about Postgres performance in general on just Linux, where they described in detail what the issues were. If I'm remembering correctly, it basically comes down to this: they want fast synchronous writes to the journal, and then they have all their other table files that they're writing out asynchronously in the background, but at some point they need them to be durable, and the only way they can do that is via fsync. They have this problem where they have to tell the kernel to sync, and that shoves a whole bunch of dirty pages down as I/Os and then slows down their write-ahead log, even though they don't actually care when the data becomes durable; they just need to know that it is durable. So I think Postgres struggles in general just because the kernel isn't exposing the right type of interfaces for it.
A
B
B
Yeah, our network seems to have had some problems the last couple of days. Speaking of tools, are there any suggested tools that we can use or agree on when we start doing more tests or benchmarks?
B
I mean, when we're running benchmarks, sometimes we really don't know what kind of tools we can look at to dig further. We have the benchmark tools which we are using right now, but we don't really know what other tools we can use for tracing, for example, to figure out what is being done at what level. We're just doing pure benchmarks.
B
A
Yeah, so maybe we can just make a list in the Etherpad. I mean, the things to check are, like, is librbd caching enabled? That's rbd cache = true.
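As a quick way to check that setting on a client, here is a small Python sketch that reads a ceph.conf-style fragment and reports whether rbd cache is turned on. The [client] section placement and the example text are assumptions for illustration; a real ceph.conf may spell the option with underscores or contain syntax that Python's configparser does not accept, so treat this as a sketch only.

```python
import configparser

# Example configuration fragment; the section and value are illustrative.
EXAMPLE_CONF = """
[client]
rbd cache = true
"""


def rbd_cache_enabled(conf_text):
    """Return True if `rbd cache` is set to a true value in [client]."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean("client", "rbd cache", fallback=False)


print(rbd_cache_enabled(EXAMPLE_CONF))  # True
```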
F
I was going to say: are you looking for statistics, for example ratios of writes going into RBD versus writes coming out of RBD, that kind of thing, so you can see what the cache is doing? Or are you asking about insight into the RBD configuration?
B
E
Yeah, so inside the VM they could use tools like blktrace to see exactly where Postgres or MySQL or whatever database it is is writing to the block device.
E
That's probably going through another level, the file system, as well, but it will give you some idea. And there's another tool called seekwatcher, which you can use to visualize what kind of patterns are hitting the disk. That can be interesting if you see, say, the journal being written over and over again while there are random writes going on in other areas.
C
A
A
A
So the thing I'm most interested in is figuring out how to quantify how much of it is that and how much of it might be something else. Maybe it's the barriers: it's periodically doing a flush, and we have a whole bunch of dirty data, and that's the thing that's slowing it down. It's kind of hard to say. I wonder, actually, Sam or Josh, if doing a debug ms = 1 type trace at RBD might be good enough, too, just to get a sense of what the write sizes are and what the latencies are.
E
Yeah, that'd be pretty useful. Maybe there needs to be some post-processing to analyze that, but that's not too hard. It might also be useful to set debug rbd = 20 to just see when flushes happen and how often we're actually seeing them, if we're investigating the cache behavior.
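The post-processing step mentioned above could look roughly like the following Python sketch. It assumes the write records have already been extracted from the trace into (submit time, acknowledgement time, size) tuples; parsing the actual log lines is deliberately not shown, and the sample records are made up purely for illustration.

```python
import statistics
from collections import Counter


def summarize(writes):
    """Summarize (submit_time_s, ack_time_s, size_bytes) tuples into a
    latency overview and a histogram of write sizes."""
    latencies = sorted(ack - submit for submit, ack, _ in writes)
    sizes = Counter(size for _, _, size in writes)
    p99_index = min(len(latencies) - 1, int(0.99 * len(latencies)))
    return {
        "count": len(writes),
        "median_latency_s": statistics.median(latencies),
        "p99_latency_s": latencies[p99_index],
        "size_histogram": dict(sorted(sizes.items())),
    }


# Made-up sample records, not real measurements.
sample = [
    (0.000, 0.004, 8192),
    (0.010, 0.013, 8192),
    (0.020, 0.045, 4194304),
    (0.050, 0.053, 8192),
]
print(summarize(sample))
```

A histogram dominated by small writes with a long latency tail would point toward the journal pattern discussed earlier, while occasional large, slow writes would point toward flushes of dirty data.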
E
It will basically log each write, read, and other operation when they begin.
A
C
One more thing: how about other parameters? We're now planning things to experiment with, like RBD, and the hardware as well, in this case maybe SSDs, right? How about other parameters, like network parameters or kernel parameters, for example?
D
I'm not sure if that's really helping. If you already get the full performance of the network, I would wonder if it helps to tune the kernel network parameters there.
D
D
C
B
In our case, our OSDs are formatted with XFS. Is there anything that we can do along the lines of different mount options for those OSDs?
A
Yeah, I'm not sure that that's the dominating issue here. I imagine it's more in the RBD and RADOS behaviors, and just the general desire to make I/Os faster, obviously.
D
B
At this stage we have not set a goal for the expectation yet, but what we do hope is that we can get something that is close to the performance of running PostgreSQL on a local drive. That would be a good start for us, at least. If we can see something along that line, then we can move forward and say: okay guys, now we think we have something that is very close to running on a local drive, or at least a local SATA drive, nothing fancy. Then we can start to see what else we can do, and maybe do better than a local drive, things like that. So we are keeping our goal quite low at this moment in time.
D
A
I mean, in theory, there are some things that we can do, right, because we can reorder writes.
A
So in theory we can do okay, for writes at least; we can get some gain there and pay for it later on, or on reads. But yeah, there's definitely a handicap, because we're replicating and we're over the network, and we have a couple of things in between.
A
A
B
We are just using SATA drives for both the data and the journal, so we haven't looked at using SSDs yet. But I think one of the ideas later on is that maybe we can separate the journal and the data first, and once we have a bit of budget for the project, we hope to get some SSDs in as well.
D
Yeah, we ran some of our clusters with JBOD as well, and switched over from JBOD to RAID 0 with write-back caching enabled on the RAID controller. The RAID controller has, I guess, a gig of RAM or something like that, and that's improving the performance for us by at least a factor of four or something.
D
Okay, so if you don't have SSDs, maybe it helps to enable write-back on the RAID controller cache, if you have a cache on the controller.
A
So I think in the Etherpad we have a whole list of things to experiment with.
A
I think you should definitely follow up this discussion on the email list. Describe what the test was, and we can suggest what the next thing to try might be. Seeing what effect these different things have will lead us down the road of figuring out if it's the journal, or whatever, or if it's caching that helps.
B
Yeah, would you suggest we use Firefly for this? Because currently we are doing it on Emperor.
B
All right, cool. We'll follow up with some of the suggestions and post some of the results later on, on the mailing list.