From YouTube: 2018-JUN-06 :: Ceph Developer Monthly
Description
Monthly developer meeting for the coordination of Ceph project development.
http://tracker.ceph.com/projects/ceph/wiki/Planning
F: My assumption is that they would operate absolutely independently, like they do now. They'd have their own mon clients and everything, except that they would be in the same process. That'll be the first step, but I haven't actually thought through how the messengers are going to get combined — I think that might be the... So, refreshing my memory: the reason why we need to do this is because they're going to share the same DPDK network card, right? That's the reason why I would want to put multiple OSDs in one process.
F: I would, I would. This seems like an orthogonal concern — we can proceed with everything else with just a single OSD in the process, and at some point we'll do the work to have multiple OSDs and figure out the right way to do it. But I'm not sure whether there are any dependencies between that and the rest of the Seastar stuff, or the other way around. Is that right?
E: [partly inaudible] ...and we will eliminate the group of contexts that was mentioned, and multicast, okay. Whether it's compiled in or configured out — we can remove all the logging in the hot path, because we don't want any logging in the Seastar path.
F: So one of the first things I was thinking of, in terms of sharing stuff across cores, was the OSD map cache — all the OSDMaps. The model I had in my head was like an RCU, where you would have a hash table — a lookup table or whatever — mapping epochs to pointers to the OSDMaps. The maps themselves would be entirely constant, and then whenever you refreshed, you'd just regenerate your hash table, and they would do an RCU.
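A minimal sketch (not actual Ceph code; names and container choices are illustrative) of the cache shape described here: an epoch-to-pointer lookup table over immutable OSDMaps, with a freshly built table published atomically on refresh so readers never take a lock or touch a shared refcount.

```cpp
#include <atomic>
#include <map>
#include <memory>

struct OSDMap;  // treated as immutable once inserted into the table

using MapTable = std::map<uint64_t /*epoch*/, const OSDMap*>;

std::atomic<const MapTable*> current_table{nullptr};

// Readers on any core: no locks, no refcount traffic, just a pointer load.
const OSDMap* get_map(uint64_t epoch) {
  const MapTable* t = current_table.load(std::memory_order_acquire);
  if (!t) return nullptr;
  auto it = t->find(epoch);
  return it == t->end() ? nullptr : it->second;
}

// Writer: build a new table off to the side and swap it in. The old table
// (and any maps dropped from it) must only be freed once no reader can still
// hold a pointer — the RCU-ish reclamation part discussed later in the call.
void publish(std::unique_ptr<MapTable> next) {
  current_table.store(next.release(), std::memory_order_release);
  // previous table is intentionally not freed here; real code defers that
}
```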
G: Yes. It seems like the first step that's done so far is turning everything into tracepoints, but not making them into ideal tracepoints. The second step is probably going through each and every dout line and either moving the non-constant pieces so that we can make them tracepoints that just have fixed-size data or the output of a constant table, or changing those log lines entirely.
C: It's working well. I replaced it in BlueStore — I've converted the dout lines into tracepoints — and I've used fio with the objectstore backend to benchmark it. I've seen something like a 10% throughput enhancement compared to dout at a debug level of 10, and for the on-disk size it goes from about a hundred megabytes to 40 megabytes for the run that I did.
C: It's like 60% converted, yeah. And I'm actually using the approach that is used in QEMU for deciding which tracing backend to use, because I'm familiar with it. Basically it's a script: you tell it which backend you want, and then it will generate C code that implements that backend. A few years ago I added LTTng support for it, so I basically just took that tool and put it into Ceph for now. I'll just open up a PR after this call and update the meeting notes with it, but I guess I'm curious — or maybe we should discuss this on ceph-devel — how do we get this moving? What's next?
F: I think sending it to ceph-devel sounds like the next step. I've got one quick question before we get totally sidetracked on this: this is basically taking the existing douts and channeling them — effectively an arbitrary string — through a single, basically free-form LTTng tracepoint, right? Like, it's still just an unstructured log.
F: Okay, so I guess what I was going to say was: it seems like for these tracepoints we should basically never use stringstream. If we have to stringify something, then probably it means that our tracepoint is not a well-structured tracepoint, or is not logging the right thing.
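For illustration, a fixed-field LTTng tracepoint of the kind being argued for here looks roughly like the following. The provider and event names are made up, and the usual TRACEPOINT_PROVIDER / include-guard boilerplate for lttng-ust is omitted; the point is that every argument is a fixed-size value rather than a stringified stream.

```cpp
#include <lttng/tracepoint.h>

TRACEPOINT_EVENT(
    ceph_bluestore,                 /* provider name (illustrative)  */
    write_done,                     /* event name (illustrative)     */
    TP_ARGS(uint64_t, offset,
            uint64_t, length,
            int,      result),
    TP_FIELDS(
        ctf_integer(uint64_t, offset, offset)   /* fixed-size fields only */
        ctf_integer(uint64_t, length, length)
        ctf_integer(int,      result, result)
    )
)

/* at the call site, instead of a dout stringstream:
 *   tracepoint(ceph_bluestore, write_done, off, len, r);
 */
```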
F: Yep. And if there's a way to have a tracepoint that's annotated such that you know it's a verbose, slow, debug-type tracepoint, then you can still have something that works kind of like dout, where you just spam the log with a huge structure because you're trying to figure out what the hell is going on when you're a developer. As long as that compiles away sanely — does that compile away, or is it evaluated every time and just never actually gets called?
E: It seems the OSDMaps to be used by [inaudible]... they're already used and referenced by [inaudible], and it will offer [inaudible] a message. There's get_map and add_map, and hopefully remove_map — they will be called [inaudible] — and it could update the map in the middle, and also the OSDMap caching policy.
F: Coming back to the OSDMap caching thing for a sec — yeah, so it feels to me like the question is how far to go. In the extreme high-performance case, we would want to avoid the ping-pong between CPU cores in order to look at those maps, and we would also want to avoid OSDMapRef, I think, because that's manipulating an atomic, which is actually synchronizing cache lines or whatever across cores.
F: So one way to do that would be an RCU-like thing, where for the OSDMaps you just end up with a raw pointer to a constant in-memory OSDMap, and then there's some other mechanism that makes sure the reference has gone away by the time that you retire — before you free — that memory.
F: You'd use an atomic update to the hash lookup table or something like that, and then when you want to trim something from the cache, you just have to make sure that there are no cores still holding a reference to that OSDMap. So you wait some period where you make sure that all the work on those cores has completed, such that they won't be referencing that map. I don't know exactly how that would work.
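One way to make that concrete is quiescent-state-based reclamation. Here is a rough, simplified sketch (assumed core count and types, not Ceph code): retired maps are parked with a generation tag, each core periodically reports that it has passed a point where it holds no map pointers, and a map is freed only once every core has reported a generation past its tag.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstdint>
#include <deque>
#include <memory>

constexpr int NUM_CORES = 8;                    // assumption for this sketch

struct OSDMap;                                  // opaque here

struct RetiredMap {
  uint64_t gen;                                 // generation at retirement
  std::unique_ptr<const OSDMap> map;
};

std::atomic<uint64_t> global_gen{1};
std::array<std::atomic<uint64_t>, NUM_CORES> core_gen{};  // last quiesced gen per core
std::deque<RetiredMap> retired;                 // owned by the reclaiming core

// Called by each core at a point where it holds no OSDMap pointers.
void quiescent(int core) {
  core_gen[core].store(global_gen.load(std::memory_order_acquire),
                       std::memory_order_release);
}

// Park a map that was just removed from the lookup table.
void retire(std::unique_ptr<const OSDMap> m) {
  retired.push_back({global_gen.fetch_add(1) + 1, std::move(m)});
}

// Free anything that every core has quiesced past.
void reap() {
  uint64_t min_gen = UINT64_MAX;
  for (auto& g : core_gen)
    min_gen = std::min(min_gen, g.load(std::memory_order_acquire));
  while (!retired.empty() && retired.front().gen <= min_gen)
    retired.pop_front();                        // unique_ptr frees the map
}
```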
F: Yeah, so maybe this warrants its own thread on ceph-devel to kick around a couple of options here and see which one makes sense — because you could notify, or you could basically just pass the message to each core; each core could have its own local cached hash table. I do want to do it, but all of them sound kind of the same, I don't know. Yeah, okay, okay.
F: We would just have all of that code sitting in master in the OSD, and it would just know that those code paths wouldn't get taken until you flip them on or make a config change or something like that, so that the code is sitting side by side, and then eventually we'll just delete all the old stuff. Instead of having heavily modified code in another branch, I'm just having a parallel implementation, because I'll probably be renaming functions and reimplementing stuff anyway. So we can start.
F: Yeah, I think the good news is that there aren't that many changes actually happening in the I/O path, and in Ceph these days most of the changes are in things like peering, which won't be touched for the most part. So hopefully this won't come up too much. Yeah, I think we should probably just start with what the actual pull requests are, and we can decide: if this can go into master, then great; if it can't, then we can figure out what else to do.
L: I've actually been looking at this for several years, but I started picking it up and doing the actual work on it about six months ago. The idea is to change NFS Ganesha — that's what I'm using — to work in an active-active configuration on top of a clustered file system like CephFS. This is the prototype for that, yep.
L: So in any case, I'll go briefly when talking about RADOS — you guys know what that is. This is a presentation I put together with the idea of maybe taking it onto the conference track. But in any case: Ganesha is a userland NFS server, and that's sort of the place where I've been doing this work.
L: We've had a CephFS FSAL for a while now, and we're actually shipping it in RHCS 3.0 — just a traditional sort of cluster setup using Pacemaker. But the key is that active-active configurations on CephFS actually work pretty well. CephFS is pretty good at mediating conflicts, and so we have things like opens, locks, delegations, layouts...
L: All that kind of stuff is pretty well handled at the file system layer, and so if you just stand up a bunch of independent Ganesha servers on top of CephFS, it mostly kind of just works. The catch is that there are problems when a node crashes — if a Ganesha node crashes, or something crashes in Ceph — then we have to do some special handling, and that doesn't exist yet. Now, Ganesha already had some clustering support too.
L: The servers get together and coordinate when they have to, which is typically only during recovery — like when a node is coming up and we need to allow NFS clients to reclaim. And the other good thing here is that Ganesha is pretty amenable to containerization in this configuration: there's no real need for local storage; everything goes into RADOS or into CephFS.
L: So we may need to do some takeover kind of stuff eventually to handle migration — say we want to move a client from one NFS Ganesha server to another in the cluster so that we can, for instance, decommission that Ganesha. We would need to be able to handle that; it's not implemented yet, but it could be done. To do this, we have to step back and look at what happens in a single-node configuration right after a restart.
L: After a restart, NFS servers are sort of a blank slate — they don't have any knowledge of what the clients held before. So typically we bring up an NFS server in a particular way to allow the clients to reclaim their state. They say: hey, I had this file open in this way, I had this lock on these files, etc. We call that period the grace period.
L: During that time, clients are not allowed to establish new state; they can only do reclaims. And to handle those reclaims, there are certain scenarios where, if you have reboots along with a network partition — say a client loses contact with an NFS server, the server reboots a couple of times and has forgotten all about that client — that client could come back in on a subsequent reboot and reclaim state that it really shouldn't have. The way we handle that is...
L: ...we track which clients are allowed to reclaim, to allow servers to prevent that sort of condition. The other catch — and the important bit — is that we have to atomically replace the old client database with the new one just prior to ending the grace period. So up until the point where we end the grace period, whatever database existed prior to that is authoritative; once we lift the grace period, the new database is the authoritative one.
L: A way to think about this — and it's a simplification that'll make sense later — is to consider each reboot a particular epoch. So when an NFS server crashes, it comes back up, it goes into the grace period, we allow recovery, and then we go to normal operations; then we have another grace period, then normal operations, another grace period, normal operations, and so on.
L: So now we have to think about what happens when we have multiple servers. In that case, we have to think about what the whole point of the grace period really is: to prevent conflicting state from being acquired during reclaim. We have to worry about conflicting state being handed out on other servers as well in the clustered setup, so at that point we really have to consider the grace period to be a cluster-wide property.
L: So we need to enforce a grace period until it's no longer needed by any node — and that means that if a node crashes while the grace period is in effect, it's allowed to rejoin it. That also means lifting grace is really a two-part process: we have to indicate that this node doesn't need a grace period anymore, and then we have to indicate when we stop enforcing it.
L: ...and we only fully lift it when no one needs it any longer, so essentially it's a cluster-wide property. And feel free to stop and ask questions if I'm not clear on this — I'm breezing through the slides. In any case, the simplest way I've found to represent all this is to keep a flag per NFS server: each server in the cluster gets a flag that says whether it needs a grace period.
L: When it crashes and comes back up, it says: hey, I need a grace period, I'm going to allow my clients to reclaim. And then we also keep a flag that says whether it's enforcing the grace period. So when the NFS server has finished all of its reclaim, but another node in the cluster is still allowing reclaim — still needs a grace period — then we keep the flag set to indicate that this node is still enforcing.
L: So we have to separate those. In a single-node NFS server we would conflate these two ideas, but in a cluster we have to consider them separately. The idea is that we want very simple logic that allows the NFS servers to make decisions about grace period enforcement based on the state of all the servers in the cluster, and we want that to be decentralized — we don't really want a single node figuring this out.
L: So, in any case, the way I've done this is I'm using a single object in RADOS as a database. I usually call it "grace" within RADOS, and essentially it just tracks a very small amount of data.
L: We have two uint64 values that represent the epochs. First is C, the current epoch, which is where new records should be stored — whatever the current reboot epoch is. Then we have an R value, which is the recovery epoch; during a grace period that will be nonzero, and it indicates which epoch we are allowing recovery from.
L: So if a server comes in after it's been out of communication for a while, it can look at that R value, and if it doesn't have a database for the R value represented there, it won't allow any recovery. Also in that database we track an omap which just holds the two flags — essentially I have a byte in there, and I'm using two bits out of that byte right now, so we have space for other flags later.
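As a summary of what's being described, the grace object's contents might be pictured like this (field and flag names here are illustrative, not the exact Ganesha encoding):

```cpp
#include <cstdint>
#include <map>
#include <string>

// Header stored in the object data: two 64-bit epochs.
struct GraceHeader {
  uint64_t cur;   // C: current epoch, where new client records are stored
  uint64_t rec;   // R: recovery epoch; nonzero while a grace period is active
};

// Per-node flag byte stored as an omap value, one key per NFS server.
enum NodeFlags : uint8_t {
  NODE_NEED_GRACE = 1 << 0,   // N: this node still needs to allow reclaim
  NODE_ENFORCING  = 1 << 1,   // E: this node is enforcing the grace period
  // remaining bits left free for future flags
};

using GraceOmap = std::map<std::string /*node id*/, uint8_t /*NodeFlags*/>;

// Lifting grace is two-part: a node clears its N bit once its clients are done
// reclaiming, and the cluster stops enforcing only once no node has N set
// (i.e. R can go back to zero).
```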
L: At the same time, we also have to allow for parallel per-server recovery databases. In a clustered setup, each server node is going to have a list of its own clients, and it needs to keep track of that list separately from all the others. The key is that when we switch epochs, we need an atomic switch on those databases as well, and that's tricky, because we can't do atomic...
L: ...you know, that level of atomicity in RADOS is only within a single object. So what I've done is embed the current epoch value in the name of the recovery database object in RADOS. They look like this: there's a "rec-" prefix, then the epoch string with the epoch value in it, and then the hostname after that. We may actually change that to something different besides the hostname in a future version, but for now that's what it uses.
L: So it's just an opaque node ID, effectively. Traditional Ganesha has had the ability to store recovery databases for singleton servers for a long time, and we use the same format in these recovery databases. When an epoch changes, we can also go through and clear out any of the ones that we know will never be used for recovery anymore — once we're well beyond the point where we would recover from that database, we can just delete it.
L: ...we don't want those clients coming back in. Sounds good? All right. So in any case, as part of that, we have a recovery backend that implements this for Ganesha now — it just got merged this past week — and there's a command-line tool that goes along with it. It lets an administrator manipulate this database, and it also allows you to do things like add nodes to the cluster.
L: For instance, if you want to grow and scale out a little bit — or be able to remove nodes too. Right now I don't have a way to migrate clients off of an existing node automatically, but that could be added; we'd need that in order to shrink the cluster. Removing a node is basically just deleting the omap key, effectively. That's what this tool allows us to do — add and remove nodes from the cluster.
L: So the way we have to think about this now is that each clustered NFS server has its own lifecycle, and this is a way to think about the states as we walk through them. The first one is the startup state. First we start from zero for the server, and essentially this node will either request a new grace period or join one.
L: ...that's already active. If there's one already active, it'll join it. Effectively, we just ensure that both the Need and Enforcing flags are set, and then we wait for all the other nodes in the cluster to set their enforcing flags. The reason we do that is that we need to ensure we don't kill off state held on the CephFS MDS by a previous instance of this cluster node prematurely.
L: So if a node crashes and comes back — I'll talk about that in a bit — we want the MDS to preserve its state, at least for a little while, until we are ready to start reclaim. We're actually adding some stuff to libcephfs for that; Zheng has some patches for it, and I've been testing them.
L: They work very well. So in any case, once all the other servers are enforcing, we load up the exports table, and then we kill off any state that was held by this particular NFS server before, because at that point we are safe to do so — we know no conflicting state can be acquired, and the MDS is clear to release that state so that we can allow clients to reclaim it.
L: At that point we transition to the recovery state. We start a grace period timer — we don't want to wait forever for these clients to reclaim; we just give them a certain amount of time, usually half a minute or a minute and a half, up to a couple of minutes. During that period the clients are allowed to reconnect and reclaim previous state.
L: ...if they're listed in that recovery DB, just like with a normal singleton NFS server. And that lasts until the grace period times out, or all known clients send a RECLAIM_COMPLETE. One of the reasons I'm focusing on NFSv4.1 for this is that prior versions of NFS did not send a RECLAIM_COMPLETE.
L: You'll get back NFS4ERR_GRACE, essentially, until... you know. And the other thing is that it's not necessarily the case that we are dead in the water during the grace period either. The only thing that's prevented is things that change state — stateful information. So you can still do reads and writes with existing file handles, or with files you already had open; you don't have to wait for those. It's just the state-changing operations.
L: At that point we can clear out the recovery database in memory, after the N flag is cleared, because we're never going to allow any more reclaim. Then we're sort of in limbo: we're not allowing any reclaim, but we're also not allowing new state to be acquired either.
L: So we're mostly just sending back errors at that point for any state-morphing operations that come in. Then once we transition to where R equals 0 — nobody needs the grace period any longer — we can transition to the next state, which is normal. This is where the server will spend the bulk of its time: no recoveries allowed, R is 0, and new state can be acquired.
L: Now we can do opens, closes, renews, locks, and all that kind of good stuff. This is really where the server spends almost all of its time. It either ends when the server shuts down — and we go to the shutdown state — or when another node starts a new grace period. Another node may crash, come back up, and say "hey, I need a grace period," and at that point we go to the re-enforcing state.
L: At this point another server in the cluster has said "hey, I need a grace period," but this particular node still knows what its clients are and doesn't need to do any reclaim itself. So what we do is just enforce the grace period without actually allowing any reclaim: we drain off any in-progress operations that would be disallowed during a grace period — that's the idea — and then we set our Enforcing flag and start enforcing it.
L: We also have to create a new recovery database, because we're changing epochs at that point — C actually changes when a grace period has been declared. We write all our records for the active clients into it; that's all done in a single transaction. At the end of that we go back into our enforcing state.
L: And then we can cycle back around through these states indefinitely. The last state is the shutdown state — this can technically be entered from any other state; we could hit the start state and still shut down. At that point we stop processing RPCs, and...
L: ...we usually will declare a new grace period, or join the existing one, and set our Need and Enforcing flags, because the presumption is that we are going to be coming back up. Then we complete the shutdown, and what we want to do is ensure that the state is left intact in the Ceph MDS. Right now Ganesha doesn't do this right — it's something I'll be working on soon — but when we are shutting down the server...
L: ...we kind of want to leak all of our state. It's kind of nasty, but it's the only way I can see to do this properly. We want to leave it intact so that nobody can acquire state that would conflict with it until we are starting back up again.
All right — at that point, now we have a demo. Let's see if I can get this to work; I may have to change... give me a second.
L: It doesn't seem to want to use my console windows. Yeah, I may not be able to do a demo over this thing. Sorry guys — in that case you'll have to take my word for it. It actually does work: you can run I/O against it, shut down another node in the cluster, see it go into the grace period, and then it comes back out again. So yeah, there's still a lot of work that needs to be done.
L: Yeah, that's the case even during a normal grace period — during the recovery phase, re-enforcing, all of that. Again, the grace period really only affects state-morphing operations. So you can't open files, but you can do I/O to files that you already held open. You can close files, technically — some servers allow this and some don't, but technically you are allowed to close files.
L: Well, you can't do that because you can't open files. Mostly it's to allow for delegations — that's part of it. Really, what it's for is that NFSv4 allows you to do share-level locking, share deny locks: you can open a file and say "I want to set a deny lock," and that will deny others from opening the file. This is a very Windows kind of thing, yeah. So with NFSv4, when they wrote the spec...
L: ...they were really trying to entice the Windows folks to come aboard — which didn't really work — but it means we have some semantics in there that allow for that sort of thing. Most servers will do share deny locking. I may actually make a proposal, in one of the coming versions, that share deny locking could be optional, because we can only [inaudible] — yeah.
L: It doesn't make it worse, no. Yeah, we still have to do that even if we just want to allow delegations, which we do. So in any case, we still have a lot of work to do on this thing. Right now, all of this relies on the MDS holding the state for us for an NFS server that's gone down. So if a Ganesha node crashes and comes back up, what we don't want...
L: ...is for the MDS to kick out all that state and start allowing other clients to acquire it. We have to ensure that it squats on top of that state for a while, until we can come in and give it an explicit "okay, you can release this now" — which we do after everybody is enforcing the grace period. That's not properly done right now; I've started looking at how we fix it, but it's not trivial.
L: Eventually we may want to redo the grace database. Right now the way I do this is with a read-modify-write operation: do a read, modify whatever the content is, and then try to do the write, and if something changed in between, we assert on that and go around again. That's not really that efficient, and we could do this with cls methods instead. In practice it probably doesn't matter a whole lot — most of the time we don't hit the database at all — so it's not really terrible, just a little bit of inefficiency.
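For illustration, the read-modify-write-with-assert pattern described here can be expressed with librados' version assertion roughly like this (the grace-blob helper is hypothetical, and the real Ganesha code differs):

```cpp
#include <rados/librados.hpp>
#include <string>

// Hypothetical helper that bumps C, sets R, updates flags, etc.
librados::bufferlist recompute_grace(const librados::bufferlist& old);

bool update_grace(librados::IoCtx& ioctx, const std::string& oid)
{
  for (int tries = 0; tries < 10; ++tries) {
    librados::bufferlist bl;
    int r = ioctx.read(oid, bl, 4096, 0);        // the grace object is tiny
    if (r < 0)
      return false;
    uint64_t ver = ioctx.get_last_version();     // object version we observed

    librados::bufferlist newbl = recompute_grace(bl);

    librados::ObjectWriteOperation op;
    op.assert_version(ver);                      // abort if someone wrote in between
    op.write_full(newbl);
    if (ioctx.operate(oid, &op) == 0)
      return true;                               // committed atomically
    // assertion failed: another node updated the object, go around again
  }
  return false;
}
```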
J: Can you hear me? Yeah, okay. So you really only need the enforcing state in NFS, or in Ganesha, if the MDSes aren't already doing it for you — which I think they are, unless they've crashed, right? Because the Ganesha server holds the capability it needs for a client to have the delegation or open or whatever.
L: Right, but in order to reacquire the state — so say a server crashes. We have a server that holds a bunch of state for clients; it crashes and comes back. In order for it to reacquire that state, we have to kill off the state that was held by the previous Ganesha instance, and that's a window of time when another server could come in, peek in there, and steal that state from us.
L: If, instead of just killing off the old session, it could just say "hey, give me back all the old state I held before," then you wouldn't have to put anybody else into the grace period, because that state would still be held by that particular Ganesha. So it's...
L: That's been started — some work was done on it before — but it's not trivial. I mean, we'd have to do locks; you could potentially be shoveling a whole lot of state over the wire to the reincarnated server. So there's a lot of work to be done there, and the way I'm planning it is that we may eventually allow that as a potential optimization, but for now what we're going to do is this approach, where we have these...
F: ...to make sure we don't block ourselves out of other options. And just to be clear: the thing we're doing here is that the new Ganesha server instance — which is sort of a fresh CephFS client — has new calls it can use to reassert, to reclaim, state that is in this limbo from the previously dead client.
L: ...works. So right now what we do is set an opaque value on a particular MDS session, and there's state tied to that session — opens, locks, whatever. Then when the server comes back, we declare: hey, we want to kill off that old session — we give it that opaque string, or blob, or whatever, and say, kill off any...
F: Okay, so the reclaim is really just killing off the old session, and then you just reopen things: if the client asserts that it had a file open, the new Ganesha instance just opens it, and it's allowed to do that because there's nothing conflicting — it's a cluster-wide grace period and nobody else is going to grab it. Does that make sense?

L: Exactly.
F: So, big picture: we have customers and users who say they need a REST API to manage the Ceph cluster. They don't want to use the CLI, they don't want to use the GUI ever, or they say something like: everything we can do in the GUI, we need to also be able to do through an API. We have a couple of different options. There's the old thing that just passes the CLI commands over a REST API, which is also present in the new restful module.
F: There's the new restful module, which implements a small set of things, but it's very minimal — minimal documentation, minimal functionality, pretty basic. And then the last option is that the dashboard itself internally has an API that it uses to talk to the manager in order to do everything that it does. The problem with using...
F: ...that is that it's an internal API, and it's always going to do stuff that a normal management REST API wouldn't want to do — like give you exactly the data you need to render some widget or something like that, which is just different — and we don't necessarily want all the versioning and documentation hurdles for something that's internal to the dashboard.
F: So I guess the questions are: one, what's your general view on all this, Lenz, and what requirements are you seeing from your customers? And are we ready to commit to having a separate REST API — and if so, who's invested in that — or are we not ready for that, or what?
N: Yeah, this is a recurring conversation. I hear you, and I can basically start by repeating what I've said before: I think we're not completely against, at some point, declaring the Ceph dashboard REST API as an official API. The thing I'm concerned about is that freezing it down at this point will significantly slow us down, since we are still not feature-complete — we're still adding more functionality to it.
N: Having said that, we actually do have that in mind when working on the backend. I just merged a pull request, for example, that gives you automated documentation of the REST API based on Swagger, which is a very popular tool for that. It replaces an initial implementation of what we called the browsable API, where you could connect your browser to the API endpoint and get a web page.
N: You will then see the documentation on a web page, so the REST API will be self-documented in a way. That also gives us a way of marking particular API parts as internal, or not stable yet — and the other way around: for things that have basically settled down and are stable...
N: ...we can declare that these can be used by external applications. But at this point the backend is still going through an evolution, and we really see the dashboard frontend itself as the primary consumer. As you said, there are of course some queries where you have to return a very specific JSON in order to populate widgets, versus other more operational tasks like creating an RBD or creating a pool or whatever.
F: Swagger sounds awesome. So you're saying they do have the ability to mark individual APIs as internal or dashboard-only or whatever, so they'd be obscured or labeled as such if somebody's browsing the official API? And the authentication mechanism is general enough that it could be... I don't really know how people normally consume these REST APIs in the first place, but it's consumable — it actually fits the set of requirements you would expect?
N: Currently it's just users managed locally in the mon config database, but a work in progress is adding — not directly, but using — OpenID Connect and single sign-on protocols that are widely established. That's going to be the next step going forward; we're kind of moving up the stack here, but that's the current state as it shipped in Mimic.
N: The people working on the dashboard will actually meet in early July, and we will discuss this topic further as well. Downstream representatives from both SUSE and Red Hat have an interest in that, so this is definitely one of the topics we're going to discuss further. But at a high level, those are our intentions and plans.

F: With that — awesome.
N: Well, at some point, of course, once we declare this official, we may have to incorporate build changes, and we need to think about things like API versioning and all the other baggage that comes along with it. But for the time being, this is still very much in flux and evolving, so we don't even bother stabilizing or versioning it at this point, because right now we develop both backend and frontend in parallel.
N: It would help to get a better feeling for what the most popular requirements and customer use cases are. A full-blown API can do so much — but maybe is it just about creating RBDs automatically? Is that the low-hanging fruit to start with? Some more business intelligence or customer insight would be helpful. Yeah, probably, I think.
F: Yeah, all right, thanks. Next up is the SMART prediction of device failures — I want to give an update on that. This is a project that's gone through several iterations now. It started as an Outreachy project with Yaarit, who's on the call, to add some built-in capabilities to monitor SMART statistics, predict device failures, and respond to them.
F: It's evolved somewhat since then. The way I'm thinking about it now is grouping it into three sub-problems. The first is metric collection, and there's a bunch of work that Yaarit did initially with smartmontools so that it will output a JSON dump, instead of the horrible thing you currently have to parse out of smartctl. And then there's also an OSD command that you can use.
F: You can do `ceph tell osd.<id> smart` and it will basically do a smartctl JSON dump on every device that the OSD is using and send that back to you. So it's an in-band, built-in way to collect SMART metrics that doesn't require deploying your own Prometheus thing or other agents or whatever.
F: There's a proof-of-concept module in the manager right now that does that — it goes and scrapes the OSDs and archives the metrics in RADOS. It doesn't actually do any prediction yet; let's call that middle part, the prediction part, a black box for now. And then the last part of it is...
F: ...the device identity: the vendor and the serial number and the other things that uniquely identify a device. There's a bunch of code in the manager that parses that and implements some commands that let you list devices — there's a new `ceph device ls` command that lists all the unique devices in the system.
F: You can query info on a device that tells you which daemons are consuming it, and you can list devices by host and by daemon, to see the dependencies between physical devices and the daemons that consume them. As part of that, there's the ability to tell the cluster to store the predicted lifetime of a device. So assuming that middle black box says this disk is going to fail in four days, you can issue a command...
F: ...that says: set the predicted failure of this device to four days from now, and the cluster will store it. So once this pull request is in, the last piece is the next part that Yaarit's looking at, which is the automation that does something about it. It'll look at all the devices and when their failures are predicted to happen, and it'll say: oh, this disk is supposed to fail in four days, I should do something about it — maybe that's raising a health warning.
F: So that's sort of the lay of the land, and the interesting bit is in that middle part, where you actually have to predict something. At the same time this whole effort of building something into Ceph was going on, I started getting aggressively contacted by a company called ProphetStor, who has a product called DiskProphet. It's an AI-driven, very smart, very accurate tool that will predict when your disk is going to fail. They have a product you deploy on-premise, and they have a SaaS service, and they're...
F: So once all these pieces are there, that middle gap could be filled by a module that queries the DiskProphet SaaS service — or the on-premise deployment or whatever — which would be pretty cool. And because it's broken into pieces, the idea is that the middle piece can also be implemented by something else. So if somebody has an open-source failure prediction model — code, a library, whatever it is — then you could...
F: ...use that in its place too. All the bits that teach the cluster how to respond to a predicted failure don't change, and probably the bits that are actually collecting the data don't change. It might also be that you don't want to use the piece that collects through the OSD — maybe Prometheus is already scraping these things into some time-series database. That's fine.
F: You could pull the data from there instead. But that last end of the pipeline — where you say, I know this device is going to fail in two weeks, and the cluster responds to it — remains the same. So that's the big picture, with a whole bunch of missing details. Any questions or comments on sort of what the...
O: So maybe this was considered and discarded and I just couldn't find it, but — if I ask an OSD for its metadata, it will already show me its current metadata, right? Is the OSD metadata the place that stores a transient, latest version of the SMART data, and then where do we store it?
F: There's the first part, where we have some tools that let you scrape the data — that's generic, and you can use them or not — and then we have the infrastructure to store the prediction and respond to it, which will be totally generic. Everything in between is sort of TBD. In the DiskProphet case...
F: ...their module is going to do all of that. It'll directly scrape the latest metrics and query their API, and on their server side they're going to be storing the history and running AI/machine-learning stuff over it. They'll just give you back a prediction, so Ceph doesn't need to do anything — it just needs to say: here's a device, here's the latest metric, when is it going to fail?
F: The goal for the community at large, though, should be that there's also a completely free and open implementation of that part of the pipeline. The thinking when we were speccing out the RADOS bit was that there should be a module that's in-tree — maybe not very high quality, but a disk failure prediction thing that exists.
F: ...what's in the tree right now is called DiskProphet — that's a commercial name, tied directly to a SaaS service, a commercial money-making service. What I suspect is going to happen is that that module is going to be completely genericized and non-vendor-specific, and the API it uses to query the SaaS service is going to be specified, so that somebody else could implement another service that implements the same API, and on the back end it'll use...
F: Anyway — but yeah, that sounds cool. I mean, the data is literally a timestamp; all I'm storing is a timestamp and the time that we recorded it. So it's not that much, and it's not something I would ever expect telemetry to send. Telemetry should be sending back anonymous aggregate data and not anything specific, so I don't think there's overlap. But yeah, the whole brave new world of GDPR.
F: I'm pretty excited about this feature in general, because — I don't have hard data on this, but your data durability risk is generally related to the probability of having a second failure, or correlated failures, in your system. If you can re-replicate data to create a new replica before the failure happens, I expect we're going to get, hopefully, an order of magnitude increase in overall reliability.
D: In general, contributing to open data is one of the purposes we have in mind, yes.
F: Yep. There was — I don't know if we socialized it very publicly, but after the Vault conference last year this whole conversation started about the idea of building a public data set of SMART failure data. Currently, if you want to get data to train these models, you have to use the Backblaze data set, or you have to sign an NDA with Google or somebody to get some big cloud provider's data set.
F: Let me find the pull request — you can go look at it. Here it is; this is the pull request they've published so far. It needs a bunch of work, because it predates the tracking of devices and the storing-the-prediction stuff, so it's doing a bunch of things that aren't quite right. But this is — yeah, how many lines is it? It's like 3,000 lines of code currently. All right.
F: Crash reports. Okay, so we talked about this — I think it was in a CDM, but I couldn't find it on the agenda of any of the past ones; maybe we were just talking about it in a stand-up. Here's the pad. The idea is that we would modify — and I just made a very specific proposal, purely as a strawman to drive discussion, so tell me if any of this is stupid.
F: The idea would be to modify the current segfault handler — which dumps recent log entries and a stack trace to the log file — to also generate a crash report directory somewhere under /var/lib. I'm thinking /var/lib/ceph/crash, and then some unique identifier for that particular crash, and it would be broken out into several files so it's easily parseable and understandable. One would be that same log dump that we put in the log — the last ten thousand lines of log or whatever it is; one would be the name of the daemon...
F: The thing is that right now, if your daemon crashes, it dumps something in the log file, systemd or whatever restarts it, and you never actually notice — there's no persistent record that that server crashed, unless you happen to go read the log file, which nobody ever does, or you have something monitoring your log file. So this makes a distinct record of that crash. And then we'd have some helper...
F: ...that would look in that directory at all the — we'll call them crash reports — and basically report them back up to the manager. It would ask the manager: I have a crash report with this UUID, have you heard about it already? If not, it would upload whatever information it has about it, and then it would mark it on the local node as one that's already been reported, and repeat that process every time. Probably that helper would just run on every daemon when it starts up.
F: The manager module would store those crash reports in the config-key store, probably — though the logs are going to be big, so maybe those have to go into a RADOS pool, or maybe they wouldn't be stored by the manager at all; I'm not really sure what makes sense. And then I think the real win would be if we take the stack trace portion of that and strip out the code offsets, so it's just the function names, which is still pretty sufficient to uniquely identify a particular crash.
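As a small illustration of that idea, stripping the per-build offsets and addresses out of symbolized backtrace frames — so the same crash signature matches across builds — could look roughly like this (not the actual implementation):

```cpp
#include <regex>
#include <string>
#include <vector>

std::vector<std::string> normalize_backtrace(const std::vector<std::string>& frames)
{
  // Typical backtrace_symbols() frame:
  //   /usr/bin/ceph-osd(_ZN3OSD9handle_opE...+0x1f2) [0x5600c8e21a32]
  // Drop the "+0x..." offset and the "[0x...]" address, keep the function name.
  static const std::regex strip_re(R"(\+0x[0-9a-f]+|\s*\[0x[0-9a-f]+\])");
  std::vector<std::string> out;
  out.reserve(frames.size());
  for (const auto& f : frames)
    out.push_back(std::regex_replace(f, strip_re, ""));
  return out;
}
```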
J: I think my only concern is that it would be nice to capture the same crash across different builds — if you have 20 different builds out in the wild, it would show up as 20 different crashes, because those code offsets would be slightly different.

F: Yeah, that was the reason. You can do both.
L: I'm a little leery of doing that much in a signal handler — I don't know if that was your intent, but you may want to have the signal handler kick off something else to do all the work. Doing stuff like that inside the signal handler, especially in a SIGSEGV — yeah, memory corruption and stuff. So that could be bad.
O: Whether doing it in the signal handler is valid is a fair question — we could also look at analyzing the core after we've been restarted — but I think it's probably easier this way, so it shouldn't be too bad, I think.
F: I don't know that we can get away from the signal handler writing out this information without capturing the core file — that's the thing, and I don't know how to... maybe it's possible, but I don't know how you'd automate it. I guess you could scrape gdb or something; I don't know how you automate analyzing a core file. Obviously it can be done, but...
O: We are actually running gdb, which is a major pain, because it then complains a whole lot about missing things. Anyway, the other thing might be looking at what Mozilla does — maybe that's interesting, because they get crash reports as well. Maybe just for comparison and inspiration, if they have any good ideas.
F: Yep. I guess my worry is that we start introducing a lot of dependencies on how the system is configured. Some systems will have that abrt thing installed system-wide, where all the core files get piped to it, and I don't know how you override that per systemd unit, or if you can. And if we do that, and we pipe it all to our special Ceph crash collector, do we then also feed it to the... I don't know, it gets all messy. Whereas the signal handler we have right now is already doing almost exactly the same thing — it's touching all the same data structures in memory — so this isn't making it any more fragile than it already is, and we haven't had problems yet. The worst case is that you crash trying to generate your crash report, which means you just don't generate a crash report. So it's no loss.
L: While you're writing out files from within a signal handler where you're dealing with potential memory corruption, who knows what you're going to scribble over on the file system. That would be my real concern there, I think — and you may already have that problem. So I don't know — is that exploitable, I guess?
O: Yeah, I've done something like that in a previous life with an HA stack. If you change root in the signal handler and set things up beforehand — or you open the files and everything ahead of time — it's not that bad. And especially if it's containerized or otherwise constrained, you just can't scribble over everything. So it hasn't usually been a concern for us in the past, at least. Okay.
F: A chroot is probably a good widget to help protect ourselves. I think the net-new here is that the current handler writes to a file descriptor that's already open, whereas this will create a directory and write several new files into it — but the actual data being written is roughly the same.
F: It kind of makes me think that it's not actually adding much value, so we might not bother doing it. We could leave the log in the local crash directory — and those get deleted after you have, say, 10 of them, or they're more than 10 days old, some retention policy — but then only store the stack traces in the manager, and only phone home the stack traces.
F: Assuming we phone all that information home, which I think would be nice. With the telemetry module it's a unique ID that's used only for the purposes of telemetry — it's not actually the cluster ID, but it is unique to the cluster, and it identifies all the telemetry from one cluster. And the telemetry user can choose whether or not they identify themselves; they can optionally say "I own this cluster," so we can go contact them or not.
F: If we wanted to get really fancy, we could have a feedback mechanism whereby telemetry could operate in a fully anonymous mode, but if a developer sees an interesting crash and wants more information, they could set a flag on that ID requesting more information; the next time telemetry dials in it'll pass that message back, and it will show up for the user.
H: So that might be something else to consider, if we don't want to do runtime analysis of that — and I haven't played around with it personally — another option for generating core dumps: apparently ptrace recently added an option to dump core for what you're tracing. So one possibility is to have a wrapper process that runs the Ceph daemon traced, but doesn't intercept...
H: ...the system calls; it's just waiting for an exit event or, in particular, a stop created by a fatal signal, and then does a core dump on that. That ptrace command lets you specify a file you want the core dump written to, so we'd still get our core dumps in the right format, and there shouldn't be much or any overhead, because we're not intercepting any real events from the process except when it stops. So that's another possibility for doing it, and it should be pretty simple.
H: Right, yeah. I think you could even run GDB with a certain set of commands to generate the things you want. And then there's also that issue I had written up, where we can have the OSD tell the kernel which memory segments not to dump, for security purposes and also just to keep the core dumps a reasonable size. We could use that in tandem with this, to limit what we dump and make it safer.
F: I'm not sure we know where cached user data is going to end up in memory very easily, but that would make sense if we were going to capture cores. This actually isn't proposing that we capture core files at all — it's just capturing the stack traces. It would just tell us what versions of Ceph are crashing, and where, and that's it.
F: CephFS geo-replication. I've been thinking a lot about this general topic, because I think it's one of the main missing features that is going to be important in the multi-cloud, federated world that's emerging — Kubernetes and deploying applications across multiple clouds and private clouds, and making it all behave. So I tried to lay out what the motivating use cases are, what some other implementations of this look like, and figure out what we should do. I should mention...
F: ...he's been doing CephFS stuff, so he's interested in working on this. So I wanted to lay out what the motivating use cases were and see what I'm missing, then look at other implementations, and then some possible paths we can take. Use cases: one is simple disaster recovery, where you have multiple clusters; if one of them blows up or the data center goes away, you want a usable backup copy at another site.
F: Part of that would probably be failing back — so when that data center comes back, you can resynchronize and continue without re-copying the whole data set. So disaster recovery is one. Active-passive replication would be another — I'm not actually sure how common this would be, but basically you would have multiple clusters...
F: Well, if you tell people that you have a new distributed file system, they start asking whether you can have a globally distributed file system with replicas across the world, maybe have data follow the sun and always be cached locally — all this crazy stuff that's basically impossible to do in a way that's fully consistent and performant. So this is sort of reaching towards that, I guess, and obviously consistency and conflicts are the big issue there. So — am I missing anything?
J: I mean, you're talking about bi-directional active-active, and — wow. This is complex and hard to explain to users, I'd just note. And usually when you have active-active, what you've actually got is something like AFS, I think, where the different sites take over a tree and maintain authority for that tree, and then if someone at a different site wants to access one of those files, they do the remote access and just pay the latency cost on it.
J: ...where you delegate it down through the servers to someone's desktop. I'm not sure that's a thing we're ever going to be good at, but if it is, that's the design — I mean, maybe we could be, since we already have a capability system — but I think that would be the way to do it if we wanted to, not trying to do some weird conflict resolution system. Yeah, no one's ever made those work — they try, and it's just as bad.
O: So our primary use case, the one we see clearly, would be failover — active-passive replication based on snapshots. As long as the target is in a consistent state, we can make that work. If we can optimize it so rsync actually finds changed files fast, that goes a long way towards meeting that goal, and then people could, if they wanted to go down that route, do some manual active-active load balancing where they use different directory trees.
F: I think in reality, for most users, conflicts don't happen. It's not that you have two people editing the same file in two geographic regions — maybe that happens once in a blue moon, but 99.9% of the time that isn't actually the case; it's that you have different applications operating on two completely different parts of the tree. So what I'm wondering is whether there's a more optimistic replication, with some basic conflict resolution, that is going to work.
D
K
F
F
F
K
F
F
So what I wanted to do is catalog a couple of the solutions that are out there right now, just as a point of comparison, because it feels like we haven't thought about this or seriously considered it. Solving the entire problem perfectly is hard, but it turns out that you can solve it imperfectly with easy solutions, and that actually will capture a lot of use cases, and there's nothing preventing us from doing it. So one point of comparison is the GlusterFS geo-replication.
F
This is basically, on the backend, rsync, but it's driven by a changelog. So they're generating a list of files to examine, and the rsync background thing just sits there and rsyncs those specific files. So it's loosely consistent, and it's only useful for DR because the copy on the other side isn't a consistent point in time, and it's one-directional, although I guess you can chain them too.
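A minimal sketch of that changelog-driven pattern, assuming a plain-text changelog of relative paths and an rsync-over-ssh target; the paths and hostnames are placeholders, and the real Gluster implementation differs in detail:

```python
import subprocess

SRC_ROOT = "/mnt/volume/"            # local volume root (placeholder)
DEST = "replica-site:/mnt/volume/"   # rsync-over-ssh target (placeholder)

def replay_changelog(changelog_path: str) -> None:
    """Hand the paths recorded in a changelog straight to rsync, so only the
    touched files get examined instead of crawling the whole tree."""
    with open(changelog_path) as f:
        changed = sorted({line.strip() for line in f if line.strip()})
    if not changed:
        return
    # --files-from reads the list of paths (relative to SRC_ROOT) from stdin.
    subprocess.run(
        ["rsync", "-a", "--files-from=-", SRC_ROOT, DEST],
        input="\n".join(changed).encode(),
        check=True,
    )
```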
F
You can chain them, like an A-to-B and a B-to-C, but yeah, so it's relatively simple to implement, assuming you have a changelog. One that's widely deployed commercially is the NetApp SnapMirror thing, which takes a snapshot every few minutes. Somebody recently mentioned that the shortest interval you can configure is five minutes, because the design doesn't really scale down to smaller timeframes, but you just periodically take a snapshot and then sync that snapshot out to the other side.
F
It basically works as long as your workload is behaving in a correct way, but if you are writing the same files in multiple sites and you get split brain, things go bad. And then the one other thing I wanted to mention is that I think the biggest example of distributed write replication is actually Dropbox, which isn't really at the file system level; it's at the file level, so you have totally disconnected operation.
F
When you reconnect, it just checks in a new version of the file, and the server side keeps multiple revisions of files as part of its existing archiving thing, so if there was a conflict you can just roll back to the other one, or the conflicting one is probably flagged in the history or something like that.
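A toy model of that behavior, not Dropbox's actual protocol: every check-in is kept as a revision, the newest one wins for normal reads, and a check-in that wasn't based on the current head is reported as a conflict instead of being lost.

```python
import time
from collections import defaultdict
from typing import Optional

# path -> list of (timestamp, origin, payload); every check-in is kept, and a
# normal read just sees the newest revision.
history: dict = defaultdict(list)

def check_in(path: str, origin: str, payload: bytes,
             based_on: Optional[int]) -> bool:
    """Check in a new revision.  If the client wasn't based on the current
    head, keep its data anyway, let the newest check-in win, and report the
    conflict so it can be flagged in the file's history."""
    revisions = history[path]
    conflict = based_on is not None and based_on != len(revisions) - 1
    revisions.append((time.time(), origin, payload))
    return conflict
```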
F
But that, to me, is sort of a proof point that conflicts are rare, although not that rare, and as long as you have a way to identify them and resolve them, the user experience for that is okay. It's not that you have to magically resolve a conflict between two programs modifying a database; it's that you have to resolve conflicts between humans editing a Word doc.
F
F
So, right, so there's the rsync piece, which is what (I forget the fellow's name) is working on now, I believe, which basically just makes rsync scan CephFS hierarchies more efficiently. I think there's actually a key piece of work here that we have to do, because the rstats don't propagate synchronously, and so I think, in order for this to actually be reliable, we have to have a way to force a flush of rstats up to the particular point of the tree where you're starting your sync.
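For context, the rstats mentioned here are exposed as virtual xattrs on CephFS directories (for example ceph.dir.rctime, the recursive change time). A minimal sketch of how a scanner might use them to prune unchanged subtrees, with the caveat from the discussion that rstats propagate lazily; the mount path is a placeholder:

```python
import os

def subtree_changed_since(path: str, last_sync: float) -> bool:
    """Read the CephFS recursive change time (an rstat exposed as a virtual
    xattr) and compare it to the last sync time.  Caveat from the discussion:
    rstats propagate lazily, so without a forced flush this can miss very
    recent changes."""
    raw = os.getxattr(path, "ceph.dir.rctime")   # e.g. b"1528300000.012345678"
    seconds = int(raw.split(b".")[0])
    return seconds > last_sync

def dirs_worth_syncing(root: str, last_sync: float):
    """Yield only the top-level subtrees whose rctime shows activity."""
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False) and \
                subtree_changed_since(entry.path, last_sync):
            yield entry.path
```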
H
J
J
F
Okay, because I think if we did that one piece in the MDS, then we could do the snapshot-driven, SnapMirror-type thing, where we just have, it could be like a bash script, something that just takes a snapshot, rsyncs it across, and then takes a snapshot at the remote site. I think we could have that solution, which is roughly equivalent to what NetApp has, and it should be pretty good with almost no deeper surgery or plumbing.
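A minimal sketch of that loop (the discussion suggests a bash script; the same steps are shown in Python here for consistency with the other sketches). CephFS snapshots are taken by creating a directory under the magic .snap directory; the mount points and remote host are placeholders:

```python
import os
import subprocess
import time

SRC = "/mnt/cephfs/data"        # local CephFS mount (placeholder)
DST_HOST = "backup-site"        # remote host (placeholder)
DST = "/mnt/cephfs/data"        # CephFS mount on the remote cluster (placeholder)

def sync_once(tag: str) -> None:
    # 1. CephFS snapshot: created by making a directory under .snap.
    snap = os.path.join(SRC, ".snap", tag)
    os.mkdir(snap)
    # 2. rsync the frozen snapshot contents to the remote site.
    subprocess.run(
        ["rsync", "-a", "--delete", snap + "/", f"{DST_HOST}:{DST}/"],
        check=True,
    )
    # 3. Matching snapshot on the remote side, so the replica also has a
    #    stable point to roll back to.
    subprocess.run(
        ["ssh", DST_HOST, "mkdir", os.path.join(DST, ".snap", tag)],
        check=True,
    )

if __name__ == "__main__":
    sync_once("sync-" + time.strftime("%Y%m%d-%H%M%S"))
```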
F
F
F
A
F
So one idea would be that you just allow writes in multiple locations, and you asynchronously replicate those writes to the other side, and then you would have some sort of automated conflict resolution, so something like last-writer-wins for files and first-rename-wins when renames conflict, something like that. In order to actually implement this, we'd have to have a changelog, I think, to stream those updates across.
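A toy sketch of the last-writer-wins rule for file contents, just to make the idea concrete; the update record and the tie-breaking by origin are assumptions, not a worked-out design:

```python
from dataclasses import dataclass

@dataclass
class Update:
    path: str
    mtime: float    # writer's timestamp
    origin: str     # site that produced the write
    data: bytes

# path -> update currently considered the winner
state = {}

def apply_replicated(update: Update) -> bool:
    """Last-writer-wins: a replicated write only replaces the current content
    if its timestamp is newer; ties are broken by origin so both sites
    converge on the same answer.  Losing writes would be surfaced as old
    revisions rather than silently dropped."""
    current = state.get(update.path)
    if current is None or (update.mtime, update.origin) > (current.mtime, current.origin):
        state[update.path] = update
        return True
    return False
```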
F
J
Yeah, I just, I get where the Dropbox thing appeals to you, but I think you need to keep in mind that they have a lot of stuff that we don't. Because it's a user-facing, sort of social thing, they keep basically every version of a file that ever gets created, and they keep them in an independent place, and the tool sort of transparently resolves conflicts by just having the most recent one win. But the important part is not that.
J
J
O
F
Yeah, I think that, in order for it to be feasible, we'd have to choose something so that if you didn't touch it, it wouldn't be broken, right; it's making it able to safely roll back. So I think Greg's right: having those full revisions is like the key thing that would make it work, yeah.
J
P
O
O
We do have that, you know: if you detect a conflict, create a snapshot.
F
I think the trick is that they would be divergent, right? So you'd have version A on both sides the same, and then you'd have, like, B and C that are different, and then you try to sync. Ideally you would then have both versions on both sides, and they would also have the same winner, but the ordering is wrong, because one side would have B then C and the other C then B.
F
It's weird. But if we did come up with a way to keep sort of a third dimension, a dimension where you have revisions of a file, sort of at the CephFS level, I think that... So there are two reasons why I like it. One, it would enable this bi-directional sync with versions, so that the user has an option to go revert, and we can invent some .version directory or whatever to manipulate those.
J
Yeah, and they're already running Nextcloud, right? So that's the app that runs and does the conflict detection and resolution for the user, and what they put in us is, like, the resolved ones or whatever, and yet they can still expose our stuff, like the snapshots, to the users as different versions. But yeah.
F
But it means that you only get those features through Nextcloud, whereas if you push them into the file system, then you can also leverage that same capability to do active-active across geographically distributed sites, which is cool, and also you can have, like, a regular POSIX file system that your legacy stuff is accessing, and also access those same data sets via Nextcloud and have all the same...
J
J
I mean, when we've talked about things adjacent to this in the past, it's been like, oh, we could have a log-structured file mode, or we could have snapshots be different objects instead of RADOS snapshots, and those all make building in these sorts of features a lot easier and more practical. But I don't really see it with this existing file system layout, just because, you know...
I
J
O
Just one question regarding the snapshots, and maybe just inserting that I like having it and surfacing it. One comment that we hear, and that some users aren't aware of, is: well, we can do snapshots with RBD, with RGW and with CephFS, right, which would be cool, but they then end up running workloads that access those at different points. So at some point we need to take a consistent snapshot of, you know, all the layers: not just the Manila file system, but the Manila file system at the same point as the workload's VMs.
J
O
Wait, so say we do the snapshots of the VMs, backed by RBD. They have snapshots, they can be mirrored, they can be replicated, and that's atomic; essentially they are consistent. But then the files that they're accessing on the file system come along and are consistent at a different point in time, yeah, something like that. So I...
J
J
You need the VMs to actually flush, like quiesce I/O and flush everything, and once you have the VM frozen, you know, and quiescing everything, you can just have a little utility that goes and takes the separate snapshots, while calling sync inside all of those mechanisms, right: take two snapshots instead of one, but they would be with flushed I/O.
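A minimal sketch of that choreography, assuming the guest's data sits on an RBD image plus a CephFS directory and that you can reach the guest over ssh to run fsfreeze; hostnames, image names and paths are placeholders, and in practice the qemu guest agent would usually drive the freeze instead:

```python
import os
import subprocess

def consistent_snapshot(vm_host: str, guest_mount: str,
                        rbd_image: str, cephfs_dir: str, tag: str) -> None:
    """Freeze the guest, then snapshot both layers at (roughly) the same point."""
    # Quiesce: flush and freeze the guest file system so the RBD image is clean.
    subprocess.run(["ssh", vm_host, "fsfreeze", "-f", guest_mount], check=True)
    try:
        # Snapshot of the VM's disk image.
        subprocess.run(["rbd", "snap", "create", f"{rbd_image}@{tag}"], check=True)
        # Snapshot of the CephFS directory the workload is also using.
        os.mkdir(os.path.join(cephfs_dir, ".snap", tag))
    finally:
        # Always thaw the guest, even if a snapshot step failed.
        subprocess.run(["ssh", vm_host, "fsfreeze", "-u", guest_mount], check=True)
```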
O
F
O
F
F
J
O
J
F
F
Okay, I think the really high-level message here is: can and should we be changing our perspective and our thinking around RGW, from an S3-compatible gateway for a RADOS cluster that happens to have multi-site capabilities, to instead being a gateway to a whole topology of storage, whether that's Ceph clusters or public cloud storage or whatever, where the gateway knows how to access your data and gives you an S3 interface, but behind it, maybe it's storing your objects in Ceph, maybe it's replicating them to multiple sites.
F
Maybe your objects are all over the place and it knows how to find them, maybe they're being encrypted and then stored in the public cloud, maybe it's a combination, maybe it's stored in one cloud and in the process of being migrated to another. But if we think about RGW as sort of a gateway that's just mediating...
F
K
There are two main paths that we'll need to take. One is finishing the whole cloud sync feature: currently what we have in Mimic is sync to cloud, and we want to have sync-from-cloud capability also, so that will give us the ability to, you know, look at a cloud as an entity that we can fully sync to and from, and, you know, that provides the whole data mobility functionality. And the second thing is...
K
Which is not necessarily related to that, but the second thing is the ability to have RGW tiers on the cloud, which means that objects could be either on RADOS or on, you know, the cloud provider in the backend. And these are two separate issues: whether you have a whole zone that is backed by the cloud, or a zone that is kind of a hybrid to operate.
K
Where data can stay in one, you know, one policy says it's gonna be in RADOS, another policy says, I mean, you know, it's in that cloud provider, and a third policy is gonna say something else. And one more thing: currently we look at the data at the zone level. We say everything in this zone is gonna replicate to another zone, all within the same zone group, very DR.
K
K
No, there is tiering work that we'd be happy for someone to take on. One more thing to note is, when looking at the gateway, the gateway for external data, there are multiple options on how to do it. One is, you know, the gateway can be a proxy for the external data, and another option is the gateway just redirects: it says, you know, this object is residing at this endpoint.
K
That is, to a URI; maybe it's gonna be a pre-authorized, pre-signed link, so that with the authenticated URL that we redirect to, the users will be able to, you know, get it directly from where it is at that point. So these are the main two options, and then there's a question: do we do it only for read operations, or do we also do it for puts? But then, you know, it really depends on where we are exactly in the stack.
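A small sketch of the redirect option using a pre-signed URL: boto3 is pointed at whatever endpoint actually holds the object, and the endpoint, credentials, bucket and key below are placeholders.

```python
import boto3

# Client pointed at wherever the object actually lives (cloud provider or
# another RGW endpoint); all values here are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example-cloud.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

def redirect_location(bucket: str, key: str, ttl_seconds: int = 300) -> str:
    """Return a time-limited pre-signed URL that the gateway could hand back
    in an HTTP redirect instead of proxying the object body itself."""
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=ttl_seconds,
    )

# The gateway would then answer the read with something like:
#   307 Temporary Redirect
#   Location: redirect_location("photos", "cat.jpg")
```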
F
Yeah, okay, so as far as next steps, possible next steps: there's the tiering piece, where an individual object would be a redirect, and it's actually stored in this external object provider, cloud or whatever, which, I guess, optionally would have encryption, which would be cool, so nobody can peek on the other end. Then there's the zone- or bucket-granularity replication. I guess, did you mention that? That's one thing sort of in a different direction.
K
No, the one thing about sync to cloud and sync from cloud: the way things work right now, when a zone syncs, it'll always lose data. So if you want to have sync to cloud and sync from cloud right now, the way it works, you need to have two different, completely separate zones, right: the data that keeps syncing from the cloud is in one...
K
F
It feels like there's a set of... I didn't get a chance to read this whole document here, but I think you probably covered a lot of it, yeah: identifying what the specific capabilities are, like, at the bucket layer, a mapping that says this entire bucket is stored over here, or at the per-object level, saying this object is stored over here, yeah, yeah. And then we can talk about how we do implicit capabilities, like migrating a bucket from cloud A to cloud B in a seamless way, so that access is uninterrupted, based on those pieces.
F
F
K
When a bucket is on the cloud, or, you know, at the remote endpoint, it's not necessarily a one-to-one mapping, because, you know, the provider has its own limitations, so we might have multiple RGW buckets residing in a single cloud provider bucket, and there is some name mutation happening and a key mutation happening; that's how it works now with the Mimic cloud sync. So there needs to be, you know, when you do a sync from cloud, you'll need to undo that.
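Purely to illustrate the idea (this is not the actual RGW cloud-sync mapping scheme): several RGW buckets can share one provider-side bucket by folding the source bucket name into the object key, and syncing back from the cloud means inverting that mutation.

```python
TARGET_BUCKET = "rgwx-archive"   # single provider-side bucket (placeholder)

def to_cloud(rgw_bucket: str, key: str) -> tuple:
    """Map an (rgw_bucket, key) pair onto the shared provider bucket by
    folding the source bucket name into the object key."""
    return TARGET_BUCKET, f"{rgw_bucket}/{key}"

def from_cloud(cloud_key: str) -> tuple:
    """Invert the mutation when syncing back from the cloud."""
    rgw_bucket, _, key = cloud_key.partition("/")
    return rgw_bucket, key
```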
P
K
P
K
The way the cloud sync works now, when you configure the sync, you can set up different profiles, and you can say this bucket will use this profile, and another bucket, or buckets whose names start with some prefix, will have a different profile, so it's pretty flexible. And then there is, there was, a way to also define how you map users in RGW to users in the backing cloud.
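A sketch of that per-bucket / per-prefix profile matching; the profile fields here are illustrative and not the actual RGW tier-config schema.

```python
# Profiles are illustrative only.
PROFILES = [
    {"name": "logs",    "match_bucket": "analytics-logs", "target": "cloud-a"},
    {"name": "backups", "match_prefix": "backup-",        "target": "cloud-b"},
    {"name": "default",                                   "target": "cloud-a"},
]

def profile_for(bucket: str) -> dict:
    """Pick the most specific profile: exact bucket match first, then a
    bucket-name prefix match, then the catch-all default."""
    for profile in PROFILES:
        if profile.get("match_bucket") == bucket:
            return profile
    for profile in PROFILES:
        if "match_prefix" in profile and bucket.startswith(profile["match_prefix"]):
            return profile
    return PROFILES[-1]
```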
K
Also, moreover, you don't even need to run against a single cloud: in theory you can have some buckets go to S3 and some buckets go to Azure, not that we support either at this point, or maybe two different AWS regions. So it's pretty flexible; you can mix and match.
I
K
Really? Not really; currently there's no need for it. You have a logical unit that defines how you take the zone and push it to the cloud, but within that logical unit, you know, some of the buckets can go to one provider and others can go to another provider, in theory. Because it's only in theory: currently only S3 is supported.
F
F
K
F
August, yeah, okay, all right; I imagine it will come up then, if not sooner. I guess my general thinking is that the more we can sort of map out the roadmap for these different features, the more we can get other folks interested, like our cloud friends who have been writing so much code recently, or whoever else, because it'd be nice to get the tiering stuff rolling, yeah.