Description
Presented by: Arthur Outhenin-Chalandre
Full schedule: https://pad.ceph.com/p/ceph-month-june-2021
A: So, what is RBD at CERN? It's four RBD clusters for OpenStack Cinder, about seven thousand volumes, which represents eight petabytes raw. With this project we want to allow users to replicate between different clusters, meaning between different data centers or different rooms in the same data center.
A: The objective is to provide a disaster recovery solution for Cinder volumes. We want to enable that on a subset of our images, and we want the users to be able to choose which images have this feature enabled, so this means a deep integration with OpenStack. We also don't want users who enable this feature to see a big performance impact on their workload, and we want to be able to replicate a massive amount of data.
A: So this means we need suitable replication performance in Ceph. How is RBD replication done in Ceph? It's handled by a daemon called rbd-mirror, which basically reads the state of RBD images from a source cluster and replays it on a target cluster. This is a really high-level overview that I will explain a bit more later. There are currently two operation modes supported, RBD journaling and RBD snapshots, and this talk is partly about comparing those two. But first, our test setup.
A: This is running Ceph Octopus. We have six test machines with 60 OSDs in a hybrid scenario, with HDD for the data and SSD for the DB. We also have 18 SSD OSDs for the RBD journal, so these are dedicated to the RBD journal. And we have a number of clients running fio with random write workloads. So first, RBD journaling.
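For illustration, a random-write workload of this kind can be generated with fio's rbd engine; the pool, image, and client names below are placeholders, not the actual test configuration from the talk:

```shell
# Hypothetical fio invocation approximating the test workload:
# random writes against an RBD image via librbd.
fio --name=rbd-randwrite \
    --ioengine=rbd --clientname=admin \
    --pool=volumes --rbdname=test-image \
    --rw=randwrite --bs=4k --iodepth=32 \
    --direct=1 --time_based --runtime=300
```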
A: In this mode the RBD client writes data to the RBD image as usual and also to the journal, so this is kind of a double-write scenario. We chose this mode first because it has full support in OpenStack: you can basically set up the replication, manage it with Cinder, and in case of disaster keep business continuity and fail over to your secondary site, provided you have a certain OpenStack cluster design and meet some requirements in OpenStack.
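As a rough sketch, journal-based mirroring is enabled per image along these lines (pool and image names are examples; a full setup also needs the peer clusters bootstrapped and an rbd-mirror daemon running on the target side):

```shell
# Example only: per-image journal-based mirroring (Octopus-era syntax).
rbd mirror pool enable volumes image          # put the pool in per-image mirroring mode
rbd feature enable volumes/vol1 journaling    # journaling feature is required for this mode
rbd mirror image enable volumes/vol1 journal  # start mirroring this image
```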
A: The first problem is the mirroring performance: the replays are quite slow with the default settings. We observed in our tests about 30 megabytes per second per image. This is per image, and one rbd-mirror daemon can manage multiple images, so it can scale a bit with the number of images. From a per-image point of view, though, there is a really high risk that the replicated image will lag behind and that the journal will not be trimmed on the source cluster.
A: Fortunately, there are some options for tuning this. In our tests we managed to increase the replay speed to about 40 or 50 megabytes per second per image. That's still not really sufficient, so this is a big problem for us in this mode.
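The tuning referred to here is in the area of the journal and rbd-mirror replay options; the option values below are purely illustrative, not the settings used in these tests:

```shell
# Illustrative tuning knobs for journal replay speed
# (values are examples, not the ones used in the tests above).
ceph config set global rbd_mirror_journal_max_fetch_bytes 33554432
ceph config set global rbd_journal_max_payload_bytes 8388608
```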
A: This is about a fifty percent performance impact, and that is really problematic for us, because we do not want users who have high-bandwidth workloads with big block sizes to suffer that much in terms of performance when they enable replication.
A: Fortunately for us there is another mode, RBD snapshot mirroring. In this mode you take snapshots at various times on your images, so this is not continuous replication like before; it's point-in-time replication based on snapshots. The rbd-mirror behavior is a bit like what you get if you create snapshots and run rbd export-diff and rbd import-diff from the rbd CLI.
A
So
that's
not
a
real
problem
and
the
replay
are
really
fast.
We
read
the
default
settings,
so
you
can
expect
at
least
in
our
test
cluster.
We
we
saw
200
megabytes
per
image,
and
this
is
really
great
because
our
cluster
is
not
able
to
do
more
than
like
400
megabytes
or
500
megabits
total.
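As a sketch, snapshot-based mirroring is enabled and scheduled roughly like this (image names and the one-minute interval are examples, not a recommendation):

```shell
# Example only: per-image snapshot-based mirroring with a schedule (Octopus).
rbd mirror image enable volumes/vol1 snapshot    # no journaling feature needed
rbd mirror snapshot schedule add --pool volumes --image vol1 1m
rbd mirror snapshot schedule ls --pool volumes --recursive
# A mirror snapshot can also be triggered manually:
rbd mirror image snapshot volumes/vol1
```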
A: So now it's time for the conclusion. We've seen in this presentation that journaling has some performance issues, so our next goal is to go with snapshot mirroring, and we will continue our tests further to report or fix, if possible, any bugs that we may find in RBD replication.
B: I have another quick one: what was the frequency of the snapshots you were taking? What SLA were you targeting?
A: In my initial testing I tested with the minimal rate possible, which is one minute, but we haven't settled on any definitive setting for that, because there is also the ongoing work on fsfreeze with the snapshots. So we will probably end up with a lower snapshot frequency than that.
C: Have you encountered any issues with the lack of proper support for fs freezing, and possibly application-level freezing, yet? Because in my experience this is definitely something that you want, but many people don't even realize that they don't have it. So I'm curious if you actually ran into any issues there.
A
No,
no,
I
I
we
like
we,
we
don't
have
some
beta
testing
or
things
like
that
for
users.
Yet
so
we
do
not.
We
do
not
encounter
this
sort
of
problem,
but
yeah.
A: No, we haven't had any issues with the missing fsfreeze support so far.
B: I have, I guess, one other question about the journal mode. I mean, it's expected that if your write rate is limited by the cluster, then because you're doing this double write you'd see half the performance. I'm just wondering if that would be the case in general: for example, with a much larger RBD cluster, the performance that a client sees is not necessarily going to be limited by the available cluster bandwidth.
A: I have some backup slides, maybe I can... okay, yeah. This is our Ceph cluster performance, and here we compare the HDD pool to the SSD pool. To my understanding, the really large difference between them at 4k helps with the performance impact, which is really small there, like 30 or 40 percent; the roughly 30 percent difference between HDD and SSD at bigger writes does not help that much with performance in this mode.
D: One parameter that might affect this is the journal object set. When you are writing to an image, you are writing to different objects located in different places, so as you have more OSDs, your writes are spread among more OSDs. But when you are writing to a journal, you are writing to just a small number of objects, called the journal object set, and only when they are filled do you move on to writing the next set.
A: Actually, we tried that, but the results were kind of similar, yeah.