Description
Presented by Brad Hubbard
Every month the Ceph Developer Community meets to discuss one aspect of Ceph code, to spread knowledge of how it works and why it works that way.
This monthly meeting will occur on the last Tuesday of every month via our BlueJeans teleconferencing system. Each month we alternate meeting times to ensure that all time zones have the opportunity to participate.
http://tracker.ceph.com/projects/ceph/wiki/Code_Walkthroughs
B
A
The process is driven primarily, almost exclusively, by the primary OSD, and it's implemented as a Boost Statechart state machine. So it's a state machine implemented using the Boost Statechart library, and that state machine is embedded in each PG. So for every PG there will be a state machine.
A
So, just before we go any further, I'll just mention a few docs that are worth looking at. The first one is the one that's currently on the screen, which I hope you can all see, and that's talking about last_epoch_started. The next one I'd recommend reading is Sage's thesis, specifically section 6.3, where he covers a lot of the concepts involved.
A
The way that Statechart state machines work is that you have a series of states, and the transitions are implemented either as a transition, which can be called explicitly, or by issuing events within the state machine, which are interpreted by various states and can be interpreted by multiple states. But we'll go into that more as we go through the code. The last document is the peering document.
A
I'd recommend looking through that as well. I'll try and include these links in the description of the video somehow, or in a comment under the video. At the bottom of the peering document you'll see this graph, which is a representation of the state machine and the flows within the state machine.
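The idea of states, explicit transitions and events that several states can interpret can be sketched in a few lines. This is only a language-neutral illustration in Python, not Boost Statechart and not Ceph code; `Idle`, `Working`, `Machine` and the event names are made up for the example.

```python
class State:
    """Base class: a state reacts to events it knows and ignores the rest."""

    def react(self, machine, event):
        return None  # event not handled by this state


class Idle(State):
    def react(self, machine, event):
        if event == "start":
            return Working()  # event-driven transition
        return None


class Working(State):
    def react(self, machine, event):
        if event == "start":
            return self  # the same event is interpreted by more than one state
        if event == "finish":
            return Idle()
        return None


class Machine:
    def __init__(self, initial):
        self.state = initial

    def transit(self, new_state):
        """Explicit transition, called directly."""
        self.state = new_state

    def post_event(self, event):
        """Deliver an event to the current state and follow its reaction."""
        new_state = self.state.react(self, event)
        if new_state is not None:
            self.transit(new_state)


m = Machine(Idle())
m.post_event("start")   # Idle reacts and moves to Working
m.post_event("finish")  # Working reacts and moves back to Idle
m.transit(Working())    # explicit transition, no event involved
print(type(m.state).__name__)
```

The graph in the peering document plays the same role as reading the `react` methods here: each arrow is one event a state knows how to interpret.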
A
B
Yeah, that's better. Thanks, okay.
A
So that's just a quirk of this particular implementation. But what we also see here is that, in the definition, we're passing Initial as the second template argument, which indicates that that's the initial state. This is in PG.h, and the implementation of peering is almost exclusively in PG.h and PG.cc.
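In Boost Statechart the machine is declared roughly as `state_machine<Derived, InitialState>`, so the initial state is fixed at the point of definition. A rough Python analogue, with a hypothetical `StateMachine` class standing in for the library, might look like this:

```python
class Initial:
    name = "Initial"


class Reset:
    name = "Reset"


class StateMachine:
    """Hypothetical stand-in: the initial state class is handed over up front."""

    def __init__(self, initial_state_cls):
        self._initial_state_cls = initial_state_cls
        self.state = None  # nothing is entered until initiate() is called

    def initiate(self):
        # Starting the machine always enters the declared initial state.
        self.state = self._initial_state_cls()


machine = StateMachine(Initial)
machine.initiate()
print(machine.state.name)
```

Passing the class rather than an instance mirrors the template argument: the machine decides when the state object is actually constructed.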
A
In the case of handle_activate_map, we create an ActMap event, and in the case of handle_initialize, we create an Initialize event, and we tell the state machine to handle that event. In other words, we post that event to the state machine. If we go back to our graph, we can see that when we're in the Initial state, if we get an AdvMap, an ActMap or an Initialize, we then transition all the way over here to Reset, and we can see that in the code by...
A
So when we see that, we call should_restart_peering on ourselves, and if that returns true, we then transition to Reset. So regardless of whether we've been created, whether we're being loaded off the disk, or whether we're already in an active and clean state, we end up transitioning to Reset there.
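The pattern of a handler building an event, posting it, and the machine landing in Reset whatever state it started from can be sketched as follows. The method names echo the ones in the talk, but the bodies are invented: in particular, treating "the acting set changed" as the only reason to restart peering is a simplifying assumption, and the `AdvMap` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdvMap:
    """Hypothetical event: a new OSD map epoch arrived."""
    epoch: int
    acting: tuple


class PG:
    def __init__(self, acting):
        self.state = "Initial"
        self.acting = tuple(acting)
        self.events = []

    def should_restart_peering(self, new_acting):
        # Simplified assumption: restart only when the acting set changed.
        return tuple(new_acting) != self.acting

    def handle_advance_map(self, epoch, new_acting):
        # Build the event and post it to the state machine.
        self.post_event(AdvMap(epoch, tuple(new_acting)))

    def post_event(self, evt):
        self.events.append(evt)
        if isinstance(evt, AdvMap) and self.should_restart_peering(evt.acting):
            self.acting = evt.acting
            self.state = "Reset"  # same destination whatever state we were in


for starting_state in ("Initial", "Clean"):
    pg = PG(acting=[0, 1, 2])
    pg.state = starting_state
    pg.handle_advance_map(epoch=12, new_acting=[0, 3, 2])
    print(starting_state, "->", pg.state)
```

Both starting states end up in Reset, which is the point made above: a freshly created PG and an active and clean one take the same route back into peering.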
A
So if we look at the graph again, we're now entering this state here, and we can see the transition: we've gone from Initial, following this line because of an ActMap or an Initialize, to Reset. And then Reset has a line here for when it receives an AdvMap; that line, one of these here, actually goes from Reset to that Started box.
A
So we go into the constructor of the Start state, and then we make a decision, based on whether we're the primary or not, as to where we're going to transition. If we're primary, we transition to the Primary state. If we are not primary, we transition to the Stray state. And I think probably most of us know the Stray state is a state we go into where...
A
So this definition is slightly different to what we've seen before. We now have three template arguments. The first is the derived class itself, as noted before. The second is the context, which is the parent state; so we're in Primary, and the parent state, or the context state, is Started, which we've seen before. But then we have a default transition as well, so the default transition when we enter Primary is that we go straight into Peering.
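A small table-driven sketch can make the parent (context) and default inner state relationship concrete. The table below only contains the states named in the talk; the assumption that Started enters Start by default is inferred from the walkthrough rather than stated, and the helper names are hypothetical.

```python
# Each state names its parent (context) and, optionally, a default inner state.
STATES = {
    "Started": {"parent": None, "initial": "Start"},  # inferred from the talk
    "Start": {"parent": "Started", "initial": None},
    "Primary": {"parent": "Started", "initial": "Peering"},
    "Peering": {"parent": "Primary", "initial": None},
    "Stray": {"parent": "Started", "initial": None},
}


def enter(name):
    """Return the chain of states entered, following default inner states."""
    path = [name]
    while STATES[path[-1]]["initial"] is not None:
        path.append(STATES[path[-1]]["initial"])
    return path


def leave_start(is_primary):
    """Start decides where to go next based on the PG's role."""
    return enter("Primary" if is_primary else "Stray")


print(leave_start(True))
print(leave_start(False))
```

Entering Primary pulls in Peering automatically, whereas Stray has no inner state to descend into.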
A
Decide on an acting set, and this is the guts of the peering process, really. If that fails, we then need to decide whether we generate a NeedActingChange, in other words change our acting set, or whether we go to the Incomplete state. So if we look at that visually, we're now in GetLog, and we can see that if we generate IsIncomplete, we go to the Incomplete state.
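The branch described here can be written down as a tiny decision function. This is a sketch only: the talk does not say what distinguishes the two outcomes, so comparing a wanted acting set with the current one is an assumption, and `on_enter_get_log` and its parameters are hypothetical names.

```python
def on_enter_get_log(choose_acting_ok, want_acting, acting):
    """Return the event posted when GetLog is entered (None means carry on)."""
    if choose_acting_ok:
        return None  # an acting set was chosen, peering continues
    if want_acting != acting:
        # Assumption: a different acting set would work, so ask for it.
        return "NeedActingChange"
    return "IsIncomplete"  # nothing better to ask for: go to Incomplete


print(on_enter_get_log(False, [0, 3], [0, 1]))
print(on_enter_get_log(False, [0, 1], [0, 1]))
```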
A
B
A
So we check last_epoch_started here instead of history.last_epoch_started, since any peer with history.last_epoch_started set to a particular epoch must have at least that value in last_epoch_started, and may have a later value, indicating that it completed activation at that epoch but the PG as a whole did not go active, since history.last_epoch_started shows a lesser epoch. So...
A
The other place where that event will be posted is once we've received the log from the PG on another OSD that we've determined to be the authoritative PG. We'll request a log from that. Once we get it, we post a GotLog event, and then we react to that event down here. This is our reaction to the GotLog event in GetLog, and we transition, or rather we call proc_master_log. So in proc_master_log, we merge the authoritative log with our own, if that's necessary.
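As a very rough picture of what "merge the authoritative log with our own, if that's necessary" means, the sketch below appends the authoritative entries that the local log is missing. It assumes the local log is a strict prefix of the authoritative one and ignores divergent entries entirely, which the real code cannot do; the `(version, op)` tuples and the function name are invented for illustration.

```python
def merge_authoritative_log(own, auth):
    """Extend our log with the authoritative entries we are missing.

    Entries are (version, op) tuples in increasing version order.
    Assumption: `own` is a prefix of `auth`, so there is nothing to rewind.
    """
    last = own[-1][0] if own else 0
    return own + [entry for entry in auth if entry[0] > last]


own_log = [(1, "write a"), (2, "write b")]
auth_log = [(1, "write a"), (2, "write b"), (3, "write c")]
print(merge_authoritative_log(own_log, auth_log))
```

When the local log is already as long as the authoritative one, the function returns it unchanged, which is the "if that's necessary" case.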
A
C
A
This is why you've got to look at the header declaration for a state, as well as the definition in the source file, because these transitions can be defined as a reaction to an event, or as just a simple transition where we just say: okay, if we receive an Activate event, we transition to Active. So if we look at the Active state...
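The difference between a simple declared transition and a reaction with code behind it can be illustrated like this. It is a Python sketch rather than Statechart: the `transitions` table stands in for what is declared in the header, the `react_` method for what is defined in the source file, and the `interval_changed` field is a made-up example condition.

```python
class Peering:
    # Declared alongside the state: event name -> destination, no code needed.
    transitions = {"Activate": "Active"}

    def react_AdvMap(self, event):
        # Custom reaction: logic decides the outcome at run time.
        return "Reset" if event.get("interval_changed") else None


def dispatch(state, event_name, event=None):
    """Look for a simple transition first, then for a custom reaction."""
    if event_name in state.transitions:
        return state.transitions[event_name]
    handler = getattr(state, "react_" + event_name, None)
    return handler(event or {}) if handler else None


print(dispatch(Peering(), "Activate"))
print(dispatch(Peering(), "AdvMap", {"interval_changed": True}))
```

Someone reading only the `react_` methods would never find the Activate transition, which is why both places need checking.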
A
If we're the primary, we update our last_epoch_started. So at this point, when we update that last_epoch_started, we are acknowledging that we have an authoritative history of this PG, and so, if we go through the peering process again at some point in the future, we will be considered the best, or the most authoritative.
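One way to picture why a newer last_epoch_started makes a copy "the most authoritative" is a selection that simply prefers the highest value. This is a deliberate simplification for illustration: the real choice weighs more than this one field, and the `infos` layout and function name are hypothetical.

```python
def most_authoritative(infos):
    """Pick the OSD whose PG info records the newest last_epoch_started."""
    return max(infos, key=lambda osd: infos[osd]["last_epoch_started"])


infos = {
    0: {"last_epoch_started": 40},
    1: {"last_epoch_started": 42},  # went active most recently
    2: {"last_epoch_started": 38},
}
print(most_authoritative(infos))
```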
A
Go on a lot of housekeeping of weirdness, which is basically: if we find any values that are a little bit strange, we log them, or we create an error, but it's just an error. We don't change anything we're doing; it doesn't have any effect on what we're doing. We just log the fact that we've found some strangeness. How often we see that when it's not in a testing environment, I suspect, is not very often at all.
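The "log it and carry on" behaviour can be sketched with Python's standard `logging` module. What counts as strange is not spelled out in the talk, so treating negative counters as odd is purely an assumption, as are the function and field names.

```python
import logging

log = logging.getLogger("pg_sketch")


def check_stats(stats):
    """Log anything odd and hand the stats back untouched."""
    oddities = [key for key, value in stats.items() if value < 0]
    for key in oddities:
        # Report only: nothing is corrected and no control flow changes.
        log.error("unexpected negative value for %s: %d", key, stats[key])
    return stats, oddities


stats = {"num_objects": 10, "num_bytes": -5}
result, found = check_stats(stats)
print(found)
```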
A
B
A
Sets its info.history.last_epoch_started to last_epoch_started, so they should always match, but depending on OSDs flapping and the timing of things, they may not match. So if we have an info.last_epoch_started that is greater than the history.last_epoch_started, we know that that PG on that OSD went active, but it was not active long enough to receive an acknowledgment from the replica PGs in the acting set that they also went active and wrote down the last_epoch_started.
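The comparison between the two values is simple enough to show directly. The sketch below uses a hypothetical `PGInfo` record with just the two fields discussed; the field and function names are illustrative and not taken from the Ceph source.

```python
from dataclasses import dataclass


@dataclass
class PGInfo:
    last_epoch_started: int          # when this copy itself went active
    history_last_epoch_started: int  # when the PG as a whole was recorded active


def went_active_without_ack(info):
    """True if this copy activated but the acknowledgment was never recorded."""
    return info.last_epoch_started > info.history_last_epoch_started


print(went_active_without_ack(PGInfo(42, 42)))
print(went_active_without_ack(PGInfo(45, 42)))
```

The first case is the normal one where the values match; the second is the flapping case described above.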
A
C
A
We've transitioned here to Recovered, and then the next step is to transition to Clean. So once we do that, we're active and clean: we've peered, we're active and clean, we're up and running. Well, we were up and running as soon as we all went active, but that's basically the end of what I was going to talk about today. So we've gone a little bit over time, but not too far.
C
Tim here. I don't have any questions, but I wanted to say thanks for going through that. I've not looked at the guts of the OSD code in detail, so in some respects it's a little bit baffling for me, but at least I think now I've got somewhat of a feel for how this thing hangs together. So if I do have to go looking in this code, at least I think I know where to start now, which I didn't before. So that's helpful.