Videos from Ceph Developer Summit: Infernalis (Day 1)
03 March 2015
https://wiki.ceph.com/Planning/CDS/Infernalis_(Mar_2015)
B: So this is a continuation of the one I did in Hammer, so I'll just start with a status update. There were five items that I need to complete to be able to support running LIO on multiple targets, being able to access them, and export rbd devices through them. Some of the stuff that's done, or that's set, is the configuration and distributing of the device state, and the management GUIs that the distros have. I'm also working with the libStorageMgmt people to modify their library and plugin and their tool to be able to create targets, so that basically our plugin would just call those pcs commands and people don't have to do it directly, and that's pretty simple.
B: The part we're working on that's more complicated is being able to create their concept of pools out of rbd devices, and that's just more details that we have to hammer out, because in order to do it generically across like a NetApp target and those other ones it's kind of weird. So those two items are pretty much done and set. The next one was persistent group reservations.
B: Originally I was looking at using just DLM and corosync to do the locking across the cluster and protecting this PGR metadata, which basically just holds the initiator information, the I_T nexus information, what type of reservation it is, and things like that. Since we last talked I have implemented kernel rbd locking, so it's basically what we have in librbd in user space right now, only it's based on the kernel libceph stuff and you can make those calls from the kernel.
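The PGR metadata described above (initiator information, I_T nexus, reservation type) could be sketched roughly as below. This is only an illustrative model of SCSI persistent-reservation state; all the names are hypothetical and are not the actual Ceph or LIO data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Registration:
    initiator: str   # e.g. the iSCSI IQN of the initiator
    it_nexus: str    # the I_T nexus this registration belongs to
    key: int         # the PR registration key

@dataclass
class PRInfo:
    generation: int = 0                       # bumped on every PR state change
    reservation_type: Optional[str] = None    # e.g. "WriteExclusive"
    holder: Optional[str] = None              # I_T nexus holding the reservation
    registrations: List[Registration] = field(default_factory=list)

    def register(self, reg: Registration) -> None:
        self.registrations.append(reg)
        self.generation += 1

    def reserve(self, it_nexus: str, res_type: str) -> bool:
        # Only a registered I_T nexus may take the reservation.
        if not any(r.it_nexus == it_nexus for r in self.registrations):
            return False
        self.holder, self.reservation_type = it_nexus, res_type
        return True
```

Each gateway node would hold a copy of this state, with the authoritative copy living in the rbd image (as discussed next).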
B: No, no. And the second part of that is being able to store the metadata that I need. I'm currently doing that by storing it in the rbd header, so I just added some new methods in that class, the cls_rbd.cc file, and I'm going to send a patch for that soon as an RFC, because I'm not sure about the new call. I called it something like set scsi pr info, because I wasn't sure...
B: ...if I should do it more generically, or just say forget it and store exactly this data structure. So I want any comments on that, to see how people want to handle it, and that one's almost done. And then the next item was compare and write support; that's needed by ESX for its atomic test and set command, which is needed for their VAAI feature. And for that one it's pretty much set.
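The atomic test and set command mentioned here is SCSI COMPARE AND WRITE: read a block, compare it against an expected buffer, and only if it matches write the new buffer, all as one atomic step. A toy in-memory illustration of those semantics, not the LIO or kernel implementation:

```python
import threading

class Device:
    """Toy block device illustrating COMPARE AND WRITE semantics."""

    def __init__(self, nblocks: int, block_size: int = 512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(nblocks)]
        self._lock = threading.Lock()  # stands in for the device's atomicity

    def compare_and_write(self, lba: int, expect: bytes, data: bytes) -> bool:
        """Atomically: if blocks[lba] == expect, write data. True on success."""
        assert len(expect) == len(data) == self.block_size
        with self._lock:
            if self.blocks[lba] != expect:
                return False        # miscompare: the initiator retries
            self.blocks[lba] = data
            return True
```

ESX uses this to update its on-disk heartbeat/lock records without taking a whole-LUN SCSI reservation, which is why VAAI support depends on it.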
B: I'm just waiting for some other upstream people that need similar functionality to lay down some of their infrastructure, so I don't have to do it.
B
Cuz
I
have
all
this
other
stuff
to
do,
but
I
I
get
done
with
my
stuff
first
to
know
late
on
that
infrastructure,
but
right
now
I'm,
just
hoping
that,
though,
needed
sooner
than
me
and
implement
it,
that's
just
like
adding
a
bunch
of
fields
and
breaking
up
these
fields
and
going
through
all
the
black
drivers
and
doing
it,
and
so
it's
like
a
upstream
chicken
I
guess
we're
playing,
and
so
the
big
one,
that's
oh,
the
big
one
that
needs
to
be
done
still
is
the
scuzzy
task
management
and
like
the
unit
attention
and
those
type
of
handling,
and
for
that
one
I've
been
I
haven't
worked
on
it
at
all.
B
I've,
mostly
there's
been
lots
of
bug
reports
and
the
boat
commands,
timing
out
and
the
air
handlers
failing
and
so
for
that
one
I've
just
been
looking
on
to
just
originally
we're
just
going
to
handle
for
what
happens
now
when
l
io
gets
an
award
from
the
initiator
and
abort
task
task
management?
Is
it
just
waits
in
hopes
that
the
underlying
device
completes
it
in
time
before,
like
the
initiator
test
management
award,
timeout
occurs,
and
for
that
one
I
want
to
do
something
a
little
bit
more
intelligent.
B
At
least
log
try
to
narrow
down
where
it's
hung
and
log
a
message,
because
it's
really
been
difficult
to
debug
that
on
the
list
and
it
seems
to
be
happening
a
lot
but
yeah
for
the
most
part,
I.
Don't
think
we
can
do
a
lot
to
really.
If
the
commands
like
on
the
device,
then
we
can't
really
enjoy
MIT,
and
so
at
least
log
a
message
if
it's
Jim,
somewhere
else
and
I
haven't
been
able
to
I,
still
need
to
look
into
more
how-to
yeah
I'm
jam
it
in
the
stuff
like
an
OST
layer.
B
So
when
the
initiator
sends
a
command,
it'll
normally
be
around
30
seconds
to
a
minute.
If
that
command
isn't
illegal
down
time,
it's
MJ
abort
task,
the
abort
asks
timeouts
is
anywhere
from
like
five
to
30
seconds
to
a
minute.
Again,
it
just
depends
on
the
operating
system
and
how
the
person
configured
it,
though
okay.
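The escalation being described can be sketched as a simple timeline: a command gets its own timeout, then ABORT TASK gets another window, then recovery escalates further (device reset, and for iSCSI eventually session logout/login). A toy sketch; the timeout values are only illustrative defaults, since, as noted, they are OS- and configuration-dependent:

```python
def escalate(elapsed: float,
             cmd_timeout: float = 30.0,
             abort_timeout: float = 30.0,
             reset_timeout: float = 30.0) -> str:
    """Return which initiator recovery step is active at `elapsed` seconds."""
    if elapsed < cmd_timeout:
        return "waiting"              # command still within its timeout
    if elapsed < cmd_timeout + abort_timeout:
        return "abort_task"           # ABORT TASK task management sent
    if elapsed < cmd_timeout + abort_timeout + reset_timeout:
        return "device_reset"         # try to kill everything on the device
    return "session_recovery"         # iSCSI: log out and log back in
```

The problem in the discussion is that LIO can only wait during the "abort_task" and "device_reset" windows if the command is stuck below it, which is why the later steps keep triggering.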
B
Yeah,
okay
and
so
for
the
big
one
for
that
is
still
handling
at
the
board
and
how
to
do
device
resets
and
for
device
reset.
You
know
you
want
to
kill
all
the
commands
on
the
bed
of
running
on
the
rbd
device
again
for
that
one
l
io
currently
it'll
just
wait
for
all
those
commands
to
complete,
and
so
there's
not
a
lot.
It
can
do
in
a
lot
of
paces
like
yeah
and.
B: If those fail, then it'll go to a higher-level error recovery; for iSCSI it'll log out the session, try to log back in, and wait for the commands to complete. So I need to look into some way to actually do something to unjam these commands, or something like that, to make progress. That's what we're seeing a lot of.
C
You
we
had
a
conversation
several
weeks
ago,
where
you
were.
You
were
basically
saying
that
the
OSD
failure,
detection
timeout,
is
like
20
seconds,
and
so
you
can
it's
not
that
uncommon
to
get
it
niƱo.
That
will
take
that
30
seconds
and
if
you
do,
if
you
do
trigger
that
timeout,
then
the
in
general,
you
would
normally
like
start
fencing
and
failing
doing
all
that
stuff
yeah
which,
if
overall
the
sub
cluster,
is
just
gone
away
for
a
minute,
then
is
sort
of
wasted.
C
Effort
I,
wonder
if
it
makes
sense
to
have
an
out-of-band
communication
between
the
between
the
OIO,
the
gateways
or
whatever.
Just
for
that
that
reason
right
yeah,
because
I
mean
if
both
the
gateway
is
actually
haven't,
failed
and
it's
actually
stuff
that's
going
slow
than
doing
a
failover
between
them
is
like
yeah.
C: Even if they are doing that failover, if the gateways can recognize that the gateways are actually fine, that they're still both alive and cooperating and fencing isn't necessary, then they can do sort of a lightweight handoff between them, yeah. So they can do a graceful lock break or whatever.
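The "graceful lock break" idea above can be sketched as: if out-of-band health checks say the peer gateway is alive, ask it to release the lock cooperatively instead of fencing it. All the names here are hypothetical, a sketch of the protocol shape rather than any real rbd locking API:

```python
class Gateway:
    """Hypothetical iSCSI gateway node holding (or not) the rbd lock."""

    def __init__(self, name: str):
        self.name = name
        self.alive = True        # as reported by out-of-band health checks
        self.holds_lock = False

    def release_lock(self) -> bool:
        # A cooperating, healthy peer releases the lock voluntarily.
        if self.holds_lock:
            self.holds_lock = False
            return True
        return False

def take_over(me: Gateway, peer: Gateway) -> str:
    """Acquire the lock, preferring a cooperative handoff over fencing."""
    if peer.alive and peer.release_lock():
        me.holds_lock = True
        return "graceful_handoff"
    # Peer is dead or unresponsive: break the lock forcibly (fence it).
    peer.holds_lock = False
    me.holds_lock = True
    return "forced_break"
```

The point of the discussion is that the forced path (fencing, blacklisting) is wasted effort when the cluster was merely slow, so the graceful path should be tried first whenever the peer is reachable.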
E
Amusing
pacemaker,
you
can
store
extra
metadata
within
pacemaker,
that's
more
than
just
it's
up
and
running,
so
you
could
potentially
have
your
your
resource
script
store
that
extra
information
act
on
it
that,
on
that
information,
as
part
of
its
fell
over
logic,
yeah.
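The Pacemaker suggestion boils down to: have the resource agent's monitor action publish richer health attributes than a binary up/down, and have the failover decision consult them. A toy model of that decision logic; the attribute names are hypothetical, and a real resource agent would be a shell script using Pacemaker's attribute tools rather than Python:

```python
def decide_failover(attrs: dict) -> bool:
    """Fail over only if the peer is actually unhealthy, not merely slow.

    `attrs` stands in for per-node attributes a monitor action might
    publish (hypothetical names, not real Pacemaker attributes).
    """
    if not attrs.get("service_running", False):
        return True                      # service is down: fail over
    if attrs.get("iscsi_layer_ok", True) is False:
        return True                      # target stack is broken: fail over
    return False                         # running and healthy: stay put
```

This is exactly the extra signal needed to distinguish "gateway dead" from "Ceph cluster briefly slow" in the earlier discussion.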
B
I
can
do
that
so
on
each
LOL
node
I
can
detect.
If,
like
the
other
icicles
gateways,
are
you
can
detect
like
if
the
service
is
running
and
in
like
the
monitor
call-outs,
you
can
do
various
checks
to
see
it
like
the
ice,
cozy
layer
or
layers
running
in
things
like
that,
and
you
could
even
do
checks
like
a
can.
You
do.
I
saw
that
there's
like
this
code
for
the
lip
stuff
and
stuff
so
like
from
you.
The
pacemaker,
userspace
monitor
could
act.
C: Okay, so in that case, as long as the kernel rbd is actually going through that whole protocol for lock break, then handling a reset in the standard way, where we just break the lock and take the lock, it'll cooperate and do it. Okay, so that actually might get us ninety percent of the way there.
B: For SCSI and ordering, for what we support now, it'll be okay if it gets reordered in that type of sequence; we rely on the upper layers to synchronize it correctly. There are different types of ordering that people could set at the SCSI level, but no one does that, that I know of, I think.
C: So if you think of one RADOS client and what an OSD has outstanding... there are a lot of different cases where we can conclusively know that we canceled it successfully. For example, if we haven't sent it anywhere yet because the host is already down, then obviously we just take it out of our queue and we never send it. Or if we do have it outstanding to an OSD, then maybe we can send it a cancellation or something, yeah. I don't know; anyway.
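The cancellation cases just described fall into a simple classification: an op that is still queued (never sent, e.g. because the OSD is down) can be conclusively canceled by dropping it from the queue; an op already in flight can at best be asked to cancel, with an uncertain outcome; a completed op is too late. A purely illustrative sketch, not the Objecter's real logic:

```python
def try_cancel(op: dict) -> str:
    """Classify how conclusively a client-side op can be canceled."""
    if op["state"] == "queued":
        op["state"] = "canceled"   # never sent: cancellation is conclusive
        return "canceled"
    if op["state"] == "in_flight":
        # Already on the wire to an OSD: we could send a cancellation,
        # but it may race with completion, so the result is uncertain.
        return "cancel_requested"
    return "too_late"              # already completed on the OSD
```

Only the first case lets the gateway tell the initiator an abort definitely succeeded, which is why abort handling above mostly reduces to waiting.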
C: Okay, sounds great, man. Okay, sounds good. And...
B: I had one question about distributing my PGR metadata. I just did that when I was adding that new cls rbd command, and it's really easy to set the data, but I wanted to be able to cache that data on each LIO node so I don't have to go back and read it every time someone does a command. And so, you know, people can do a watch on that object and be notified.
B
But
I
was
looking
for
a
way
to
like
before
I
was
using
coral
sink
and
you
could
run
like
a
command
on
all
the
nodes
and
be
notified
when
that
command
is
completed,
and
so
I
just
wanted
a
notification
that
everyone
has
updated
their
cash
so
like
unknown,
1
I'll
set
the
metadata
and
then
other
people
be
watching
and
they'll
be
notified
and
they'll
update
their
cash.
But
I
wanted
to
be
able
to
be
notified
that
they
have
to
do
their
cash.
So
I
can
tell
the
initiator
that
we're
all
set
across
the
cluster.
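The desired semantics are: one node sets the metadata, every watcher refreshes its cache and acks, and the updater only reports success to the initiator once all watchers have acked. Below is an in-memory simulation of that flow; it mimics the shape of RADOS watch/notify but does not use librados, and all the class names are illustrative:

```python
class SharedObject:
    """Stand-in for the RADOS object holding the PGR metadata."""

    def __init__(self):
        self.metadata = {}
        self.watchers = []

    def watch(self, node):
        self.watchers.append(node)

    def set_and_notify(self, key, value) -> bool:
        """Update metadata, notify all watchers, return True once all ack."""
        self.metadata[key] = value
        acks = [node.on_notify(self) for node in self.watchers]
        return all(acks)  # cluster-wide "all set" the speaker wants

class GatewayNode:
    """Stand-in for an LIO gateway caching the metadata locally."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def on_notify(self, obj) -> bool:
        self.cache = dict(obj.metadata)  # re-read and cache the metadata
        return True                       # ack back to the notifier
```

In real RADOS, notify does complete only after every watcher has acked (or timed out), which is what makes this pattern usable as the cluster-wide barrier being asked for.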
E: You can also even go above and beyond that if you want, if you had additional metadata somewhere that the other guys know they're supposed to be watching for. As far as the notification response goes, they can send whatever they want back and say, well, I'm fine, XYZ, that sort of thing.
C: Jason, is this, is this sort of what the lock leader or whatever would do? Right, so I mean, in the old pattern you would set it on the object and then you'd send a notify, and only when that completes would you know that everyone has seen it. But in the new world you actually just send a message to the leader and you let them do all that. Would this just fall in that category, or, I mean...
E
If
I
delete
you
know
in
the
cases
where
you
actually
looking
for
leader,
even
if
it
does
timeout,
I
say
well,
I
got
a
response
from
leader.
That's
all
I
really
cared
about
anyway.
So
I
just
got
gotta
clear
out
like
nor
the
the
timeout
error
code
that
comes
back
from
watch
notify.
I
say
I
got
my
response.
I
looked
was
looking
for
from
the
leader,
I'm
good
to
go
with
whatever
the
leader
said,
was
the
real
status
as
part
of
his
notify
ack.
D: In the new upstream stuff that Jason's written, it also has different commands that you could send when something happens; one of the commands could be, you know, "reread the persistent reservation."
D
I
did
and
then
send
me
back
explicit,
notify
response
not
just
working
out
if
I
act
but
I
suppose
I
get
different
notification
when
you
finish
reading
that,
because,
typically
you
want
that
whatever
you
do
before
you
send
back,
notify
active
some
kind
of
quick
operations,
you
don't
potentially
timeout
like
blocking
for
some
io
somewhere.
D
So
you
just
say:
okay,
I
plated
a
thing
something
internal
state
saying
I
need
to
update
the
cache
act
that
notify
and
then
later
once
you
have
to
actually
update
the
cache,
send
a
different
application
back
to
the
original
guy
Oh.
Everyone
watching
really,
but
and
all
this
stuff
is
all
in
user
space
right
now
and
that
Colonel
doesn't
even
have
the
newer
what
freshmen
for
wash
notify
yet
so
then
take
some
effort
to
actually
get
it
all
worked
up.
The
colonel
okay.
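The deferred-ack pattern suggested here can be sketched as: do only quick work before acking the notify (record that a refresh is needed), then perform the possibly-slow cache refresh afterwards and send a separate "cache updated" notification. Hypothetical names throughout; this is the shape of the pattern, not the librbd or cls API:

```python
class DeferredAckWatcher:
    """Watcher that acks a notify quickly and refreshes its cache later."""

    def __init__(self):
        self.needs_refresh = False
        self.cache = None
        self.sent = []   # messages sent back toward the notifier

    def on_notify(self, payload):
        # Quick path only: record the state change and ack immediately,
        # so the notify never blocks on (or times out waiting for) I/O.
        self.needs_refresh = True
        self.sent.append("ack")

    def refresh(self, read_metadata):
        # The slow part runs outside the notify callback.
        if self.needs_refresh:
            self.cache = read_metadata()
            self.needs_refresh = False
            # Explicit second notification: "my cache is now up to date".
            self.sent.append("cache_updated")
```

The notifier then waits for the `cache_updated` messages, rather than treating the initial acks as proof that every node has re-read the metadata.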