From YouTube: CephFS Code Walkthrough: MDS Locker, Part 1
A
I think we should get started. Thanks, all, for joining this code walkthrough. I'll be walking through and discussing the MDS locker.
A
The way I want to do this is to break it up into a series, because the MDS locker is a pretty vast topic with a good amount of complexity, and it's hard to fit everything into a one-hour code walkthrough. So I'll probably split it into a series of three code walkthrough videos, covering the details of the MDS locker and the code itself in incremental fashion.
A
Part one, which is this one, is about understanding why and how we need locking in the MDS. We'll go through the different data structures involved and the way the MDS uses these structures and the locking structures. We'll walk through some of the file operations and see how locking is used in them, and probably walk through a bit of the locking code itself.
A
The next part will be the actual implementation of the locker, which is pretty complex because, in a sense, the MDS locker is a distributed lock, and you can imagine all kinds of complexities that can happen there. So part 2 will be the implementation of the core locker itself, and part 3 will be the different types of lock classes. Locking in the MDS is divided into different classes.
A
So now, the actual question: why do we need locking in the MDS, and what is locking in the MDS? We all know why we use locks. We use them to protect state or metadata, to get mutually exclusive access to a part of storage or a part of data. That's what a lock is, and the concept here is obviously the same.
A
We protect the state of metadata in the MDS in different data structures such as inodes, dentries and things like that. So why does the MDS need locks in the first place? We all know that the data managed by the MDS is pretty large. The metadata can be huge, so it's impractical to put everything in memory in one MDS and overload it; that would cause all kinds of scale issues. So in CephFS you can have multiple active MDSs, and the metadata load is shared across these MDSs. There's a concept of dynamic subtree partitioning, where a directory tree is divided into small subtrees.
A
This is done by recording the heat of each node in the directory tree. A subtree is divided when the heat for a particular node goes above a threshold value. When the subtree is divided, the node is changed from a single dir fragment to multiple dir fragments.
A
Each fragment is responsible for a part of the original directory, but there will be only one authoritative node for these fragments. That's called the auth MDS, the one that is authoritative for a particular directory node. After fragmentation, each MDS can bear the corresponding read and write requests for these nodes.
A
So if a file is very hot, or accessed by multiple clients very frequently, the MDS will generate multiple copies of that fragment and distribute them across different MDSs.
A
So, in a sense, you can have a directory split because it has a huge number of directory entries. A single fragment will break into multiple fragments, each of which is called a dirfrag, and then each of these fragments can potentially be replicated, depending upon the intensity of access and the heat. Other active MDSs will then have a copy of this particular directory fragment, so that they can service read requests.
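The heat-and-threshold idea described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not how the MDS balancer is implemented: the class names `DirFrag` and `Directory`, the `split_threshold` parameter, the use of a CRC32 name hash, and the rule of doubling the fragment count on every split are all hypothetical.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class DirFrag:
    """One fragment of a directory: a subset of its entries plus a heat counter."""
    entries: set = field(default_factory=set)
    heat: int = 0


class Directory:
    """Toy directory that splits into more fragments once one fragment gets too hot."""

    def __init__(self, split_threshold: int):
        self.split_threshold = split_threshold  # hypothetical tuning knob
        self.bits = 0                           # the directory has 2**bits fragments
        self.frags = [DirFrag()]

    def _frag_index(self, name: str) -> int:
        # A stable hash of the entry name picks the fragment.
        return zlib.crc32(name.encode()) % (1 << self.bits)

    def access(self, name: str) -> None:
        frag = self.frags[self._frag_index(name)]
        frag.entries.add(name)
        frag.heat += 1
        if frag.heat > self.split_threshold:
            self._split()

    def _split(self) -> None:
        # Double the fragment count and redistribute every entry; heat starts over.
        names = [n for f in self.frags for n in f.entries]
        self.bits += 1
        self.frags = [DirFrag() for _ in range(1 << self.bits)]
        for n in names:
            self.frags[self._frag_index(n)].entries.add(n)
```

With `split_threshold=3`, the fourth access to the single initial fragment pushes its heat over the threshold and the directory ends up with two fragments.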
A
Now, since there are multiple clients concurrently reading and writing files, the MDS defines different locking rules for different file metadata. Just to give an example, the uid of a file is rarely modified concurrently, so a shared read and an exclusive write can be guaranteed for this kind of metadata. On the other hand, things like the stats of a large directory may need to be updated by multiple clients at the same time. That's because this large directory is divided into multiple fragments, and different clients read and write to different shards (shards, in this sense, are the frags). These shards must ensure that they can share reads and also achieve simultaneous writes. So it's possible that one client is writing to one shard while another client is writing to a different one.
A
So the MDS needs to ensure that these shards can share reads and can also achieve simultaneous writes, and only when needed does the shards' data have to be aggregated to the authoritative node. All this managing of client requests for accessing different parts of the metadata requires different locking strategies and locking rules.
Think of it this way. We have all written some form of code that involves locking. Say you have a particular data structure that needs to be protected from concurrent access. Typically you have one lock that protects n entries: say you have ten fields in that data structure, and the lock protects these ten fields. When you update any of these ten fields, you need to grab the lock, update, and then unlock. That's typically how it's done. To optimize this, you have different locks covering different fields. You can have lock one that covers the first two fields, and then lock two for the next two fields. That's because maybe some of these fields are infrequently updated.
A
And maybe some of the fields are written rarely but read very frequently, so you might have a read-write kind of lock for these fields so that you can have shared reads, whereas for those that are updated frequently you will have a normal mutex. We've all done these kinds of optimizations, lock breaking and things like that. With the MDS the concept is more or less the same: for an inode or a dentry, there's not one lock that protects all the metadata held by these structures.
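The lock-breaking idea from the last few paragraphs, sketched in Python. This is a generic illustration, not MDS code: the class names, the field names, and the grouping of fields into `owner` and `stats` are all made up for the example.

```python
import threading


class CoarseRecord:
    """One lock guards every field, so all updates serialize on it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.fields = {"uid": 0, "gid": 0, "mtime": 0, "size": 0}

    def update(self, name, value):
        with self._lock:
            self.fields[name] = value


class SplitRecord:
    """Each group of fields has its own lock, so unrelated updates do not contend."""

    GROUPS = {"owner": ("uid", "gid"), "stats": ("mtime", "size")}

    def __init__(self):
        self.fields = {"uid": 0, "gid": 0, "mtime": 0, "size": 0}
        self._locks = {group: threading.Lock() for group in self.GROUPS}

    def lock_for(self, name):
        # Find the lock that covers the given field.
        for group, names in self.GROUPS.items():
            if name in names:
                return self._locks[group]
        raise KeyError(name)

    def update(self, name, value):
        with self.lock_for(name):
            self.fields[name] = value
```

In `SplitRecord`, a thread holding the `owner` lock does not block an update to `mtime`, which is the property the MDS relies on when it gives different parts of an inode their own locks.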
A
So what the MDS does is have different kinds of lock types. That is one thing. The other thing the MDS has is, as I said, lock classes: different kinds of lock classes that have different rules for locking. For example, we have something called SimpleLock, which is like a base class implementation and defines the base rules for distributed locking.
A
Then we have LocalLock. A local lock is only used within an MDS, because it's just used for, say, updating a version field or things like that. And then we have something called ScatterLock, which is the complex one, used for things like where the MDS can delegate some authority to another MDS, say a replica for a frag.
A
Okay, so we discussed lock types and lock classes, and then we have lock states. Lock classes and lock states are very much related. That's because the lock classes implement state machines, and the lock states are basically just the different states in those state machines. We have a lot of states, around 38, which increases the complexity, but we'll come to that later.
A
So don't worry about that now. We'll start from the beginning, which is the lock types. So let's go to...
B
A
So we have these different lock types that are used to lock different state in the metadata. The MDS has different locks covering different portions of the fields in the inodes and the dentries.
A
We have these defined here. What's interesting is that, to find out which metadata fields these lock types cover, you actually have to go into the code, so I have done that myself.
A
Some of these are very evident; some of them are not. All of these lock types (there are 13 or 14, I think 13) protect the version, so that is the base one. Then there's CEPH_LOCK_ISNAP, which is used to protect the snaps and the ctime.
A
These are for the versions, and as we will see, these are local locks. IPOLICY is for ctime, layout, quota, export pin and the recently added ephemeral pin types. The interesting ones are IFILE and INEST. These are the complex ones, because they are the ones that use the scatter lock mechanism.
A
IFILE is for ctime, mtime, atime and the dirstat, so the directory statistics information is protected by IFILE. INEST is for the recursive stats. The MDS has these because, for a particular node, there are a bunch of stats that are accumulated upward through the tree, and those are protected by INEST.
A
Most of these locks are simple locks. The version locks are just local locks, and a bunch of the others are just simple locks. The lock class is what I mentioned earlier, and IFILE and INEST use the complex scatter lock that we'll probably discuss later.
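As a reading aid, the lock types mentioned so far can be collected into one table. The sketch below is a Python summary of what the talk states, not a copy of any Ceph header: the short type names drop the `CEPH_LOCK_` prefix, the field labels such as `export_pin` are informal, lock types the talk has not yet described are left out, and ISNAP and IPOLICY are listed as simple locks only because the talk says "most" of the remaining types are.

```python
from enum import Enum


class LockClass(Enum):
    SIMPLE = "SimpleLock"
    LOCAL = "LocalLock"
    SCATTER = "ScatterLock"


# lock type -> (lock class, fields the talk says it protects)
LOCK_TABLE = {
    "IVERSION": (LockClass.LOCAL, ("version",)),
    "DVERSION": (LockClass.LOCAL, ("version",)),
    "ISNAP": (LockClass.SIMPLE, ("snaps", "ctime")),
    "IPOLICY": (LockClass.SIMPLE,
                ("ctime", "layout", "quota", "export_pin", "ephemeral_pins")),
    "IFILE": (LockClass.SCATTER, ("ctime", "mtime", "atime", "dirstat")),
    "INEST": (LockClass.SCATTER, ("rstat",)),
}


def locks_covering(field_name):
    """Return the lock types whose listed fields include field_name."""
    return sorted(t for t, (_, fields) in LOCK_TABLE.items() if field_name in fields)
```

A lookup such as `locks_covering("dirstat")` answers the question the speaker says otherwise requires reading the code: which lock do I need for this field?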
A
So this is just the lock implementation. The Locker class takes care of the implementation based on the lock class. We have a Locker class, that's the Locker.cc source, and it takes care of how to lock based on which lock class this particular lock is defined as.
A
As I discussed, this is the CInode structure, and there's no one lock that locks the entire metadata or protects all the state. We have the authlock, which protects uid and gid; the linklock, for the link count; and so on. The filelock is for ctime, mtime, atime and the dirstats; then there's the policylock; and the nestlock, the complex one, is for the recursive stats. So these are the locks.
A
Here, for the dentry, we only need two: the actual lock and the version lock. The version lock is for incrementing the CDentry version. The interesting ones are the CInode ones, because those are what is mostly used. The dentry ones are pretty easy; there's no scatter lock or anything like that. Basically, either we need an exclusive lock to increment the version, or we just use the other lock for that particular dentry.
A
Okay, so we've covered the lock types and the inode structures. There are other data structures that we should look at to understand the whole flow of the locks better, but we'll do that on demand. There are things like LockOpVec, which is a vector of locks that you fill in and then ask the MDS to acquire in a certain fashion. We'll do that on demand. That's probably the only other thing. Let me quickly check my notes.
A
What I'm doing with all this is the following. This is really old code; some of these sources were written in 2009 and have practically never been updated or changed. It's also the most undocumented and most complex part of the MDS. So let me quickly cover what I'm doing.
A
B
A
Basically, I start from the inode, go to the encode functions, and see what kind of fields each lock protects, so as to improve the documentation. And some of this code, especially the Locker class, is probably among the oldest code. It doesn't use the newer facilities that C++ provides, so it uses all that array indexing and things like that.
A
So there's scope for improvement there, to make it more readable. The other thing is that, once we come to the lock states, they are very poorly named. I probably can't show you now, but there are names like LOCK_LOCK and LOCK_SYNC, which do not make much sense.
A
Let's cover some file operations and see how the locker is actually used, and how these different lock types, the auth locks, link locks and policy locks, are used.
A
That will give you a feeling for it, so that if you're implementing a new file operation, for whatever reason, you would know what types of locks to take in order to implement that particular file operation. Suppose that file operation only touches xattrs; then you just need the IXATTR lock type.
A
Let's do something basic, mkdir probably. All of this is tied to path traversal, which is how these operations are implemented. As we'll see, you fill in a bunch of locks and then ask the MDS, or rather the Locker class, to actually acquire the locks. If the MDS can't acquire a lock, for whatever reason, it puts the request in a queue, and later, when it can actually lock, it wakes the request up, acquires the lock, and then your request is granted. The other thing about locks is that they are tied in with the whole capabilities thing.
A
For a lock request to be successful, if that particular lock request needs a particular capability, again you are put on a wait queue and the capabilities are revoked. Probably one of the other clients has that cap, so a cap revoke is sent. Once the revoke is done, you are woken up, you try to acquire that lock again, and then the request is granted. We'll see this in the state machine.
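The queue-and-retry flow just described can be modelled in a few lines of Python. This is a toy sketch, not the Locker implementation: `ToyLock`, its attributes, and the single `client_cap_holder` slot are hypothetical simplifications, and the real MDS decides what conflicts through its lock state machines.

```python
from collections import deque


class ToyLock:
    """Exclusive lock that parks requests it cannot grant and retries them later."""

    def __init__(self):
        self.holder = None
        self.client_cap_holder = None  # client currently holding a conflicting cap
        self.waiters = deque()
        self.log = []

    def try_acquire(self, request):
        if self.client_cap_holder is not None:
            # A client cap conflicts: ask for it back and park the request.
            self.log.append(f"revoke cap from {self.client_cap_holder}")
            self.waiters.append(request)
            return False
        if self.holder is not None:
            self.waiters.append(request)
            return False
        self.holder = request
        return True

    def cap_released(self):
        # The client has returned its cap: retry whoever was waiting.
        self.client_cap_holder = None
        self._wake()

    def release(self):
        self.holder = None
        self._wake()

    def _wake(self):
        # Retry parked requests in arrival order; failures re-queue themselves.
        pending, self.waiters = self.waiters, deque()
        for request in pending:
            self.try_acquire(request)
```

A request that arrives while a client holds a conflicting cap is queued, a revoke is recorded, and the request is granted only after `cap_released()` wakes it up and it retries.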
A
The state machines also tell you what kind of capabilities need to be there for this lock to be granted. Okay, so let's quickly do a very basic one, handle_client_mkdir.
A
This is the Server source, and this particular function is invoked when the client is trying to create a directory. We have these helper functions, such as rdlock_path_xlock_dentry. Let's take the example of a client doing an mkdir. It passes in a file path, say /a/b/c/d and then file0.
A
What the MDS does is take a read lock on each of these path components. So for the directories from a through d, a read lock is taken on each, and then for the dentry to be created there's an exclusive lock taken.
A
The reason is that the read lock allows parallel, concurrent reads, so if another client is looking up this particular path, that can go through because it's a read lock. But if another client is trying to modify one of these directory path components, that won't be granted, because somebody already has a read lock on it. So we have this rdlock_path_xlock_dentry, which is the one that actually goes into the locker. In rdlock_path_xlock_dentry, these first parts are all just some checks. The interesting part comes here.
A
We want to rdlock each of these path components, and the actual dentry to be created needs to be exclusively locked. We also lock the snapshots for each of the path components except the last one; we'll see that. If we look at something like a lookup, that doesn't take exclusive locks, because a lookup or getattr request is basically just reading a bunch of stuff. There's no write or update involved, so there's no need for a wrlock or an exclusive lock or things like that.
A
That is very simplistic, but here we need these read locks on the path components and then an exclusive lock on the dentry. All of this is tied in with the path traversal in the MDS. Path traversal is implemented in the MDCache, where the path is broken up into its components and resolved.
A
Depending on the flags, we assign these boolean fields that say what we need to lock. With rdlock_path_xlock_dentry, we need the auth, we need to rdlock the snap, rdlock the path, and xlock the dentry. The locker implementation breaks locking down into three things: rdlock, wrlock and xlock. rdlock is shared read, and wrlock is shared write. wrlock is special, and it's mostly used for the filelock and the nestlock. The filelock is responsible for protecting the statistical information in inode_t, which is the dirstat, and the nestlock is responsible for protecting the recursive statistics, the rstat in inode_t. All this is required because a directory can be divided into multiple shards, as we discussed, and each shard can have multiple copies. wrlock was introduced in order to allow the statistical information on the shards to be modified at the same time.
A
These are scatter locks, the complex ones, which we won't really discuss right now, but that's where wrlock comes into the picture. So we have rdlock, wrlock and xlock: rdlock is shared read, wrlock is shared write, and xlock is for when we need exclusive access, so there's no sharing there. It's like a mutex.
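The three request kinds can be pictured with a deliberately small Python model. It is a sketch under a strong assumption: the model refuses to mix kinds at all (readers with readers, writers with writers, one exclusive holder), whereas in the MDS what may be granted depends on the lock's current state in its state machine. `ModeLock` and its methods are hypothetical names.

```python
class ModeLock:
    """Toy lock with three request kinds: 'rd' (shared read), 'wr' (shared write), 'x' (exclusive)."""

    def __init__(self):
        self.mode = None  # kind currently held, or None when the lock is free
        self.count = 0    # number of holders in that mode

    def can_grant(self, kind):
        if self.mode is None:
            return True
        # Shared kinds only coexist with more holders of the same kind.
        return kind == self.mode and kind in ("rd", "wr")

    def acquire(self, kind):
        if kind not in ("rd", "wr", "x"):
            raise ValueError(kind)
        if not self.can_grant(kind):
            return False
        self.mode = kind
        self.count += 1
        return True

    def release(self):
        if self.count == 0:
            raise RuntimeError("lock is not held")
        self.count -= 1
        if self.count == 0:
            self.mode = None
```

Two `rd` holders or two `wr` holders can coexist in this model, while an `x` request is only granted when nobody else holds the lock, which mirrors the "like a mutex" description.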
A
Okay, come here. The interesting part starts a bit below, but before that, if you have asked to lock the snapshots, you go here. There are a bunch of checks on the directory, such as whether it has been deleted and things like that, in which case we return right there. So if you want to lock the snap, we call... okay, I should probably cover this a bit later.
A
The try_rdlock_snap_layout thing might confuse matters, so let's do the interesting part, where we actually walk the path components and try to take locks. This is the loop, the big one, that walks each of these components and tries to take a lock depending on which path component it's currently accessing.
A
So, a bunch of checks. The snap handling is not interesting here. We get the current directory. Say we are doing /a/b/c and then file0; we are walking each of the path components a, b, c.
A
We come to a first, and we get the current directory, the CDir. We have some checks on whether we are auth or not. The locking part starts here. We try to look up that particular path component. So if it's /a, the current dir is the CDir of root, and we try to look up a in it, by file name or directory name. Obviously you need a snap ID if you are traversing a snapshot, but just ignore that for now. If we are able to look it up, which means the dentry exists, we try to lock it.
A
Remember that rdlock_path_xlock_dentry says: I want to rdlock each of these path components and xlock the last directory entry. So, while rdlocking the path, we check whether we want to xlock the dentry and whether this is the last path component. Since it is not the last path component (we are just at the first one), we go to this part and add an rdlock on that particular lock. This is the directory entry's lock, which we saw in the CDentry source. One lock is used for the version; that's the local lock. The other one is used to lock everything else except the version. CDentry is simple because it doesn't have a bunch of locks protecting different fields in the structure.
A
It just has two locks: you need the version lock if you are incrementing the version, and for everything else you just use the other lock. So, as you can see, we walk through all the path components and add each of these as an rdlock. Before that, I think I moved a bit ahead. The MDS has this LockOpVec, which is defined, I think, in Locker.h.
B
B
A
Yeah, so LockOpVec is just a vector of different types of locks that you fill up and then ask the MDS to acquire in a particular fashion. We have different helpers: one for when you want an rdlock, one for xlock, and one for wrlock. When we do something like an add_rdlock, it basically just puts that particular lock into this vector, marking whether it's a read lock, a write lock or an exclusive lock. Everything that needs to grab locks defines a LockOpVec. You can see here: define a LockOpVec and then fill it in. So while traversing the path components, we fill this LockOpVec with the different types of locks that we want.
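The shape of such a lock vector can be sketched in Python. This is a model of the idea described in the talk (a list of locks, each tagged with the kind of access requested), not the C++ type itself: the `LockOp` record, the string lock names, and the constants are assumptions made for the illustration.

```python
from dataclasses import dataclass

RDLOCK, WRLOCK, XLOCK = "rdlock", "wrlock", "xlock"


@dataclass(frozen=True)
class LockOp:
    lock: str  # name of the lock, e.g. "dentry(a)"
    kind: str  # RDLOCK, WRLOCK or XLOCK


class LockOpVec(list):
    """A list of LockOp entries with one helper per kind of request."""

    def add_rdlock(self, lock):
        self.append(LockOp(lock, RDLOCK))

    def add_wrlock(self, lock):
        self.append(LockOp(lock, WRLOCK))

    def add_xlock(self, lock):
        self.append(LockOp(lock, XLOCK))
```

Nothing is locked while the vector is being filled; the helpers only record intent, and the actual acquisition happens later in one call.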
A
Hold on. Yeah, so for all the path components we added an rdlock on the dentries. Then, once we are at the last path component and we want to xlock the dentry (xlocking the dentry is basically done for operations where we want to update a particular dentry, like creating a file, creating a directory or symlinking), we add a bunch of locks on that last path component, like the filelock.
A
So why is this filelock required?
A
The filelock and the nestlock we add as shared write locks, the wrlocks, and these are on the parent directory. So if you have /a/b/c and then file0, the filelock and the nestlock are for the directory c.
A
Okay. This is done because the dirstat and the rstat of this particular inode, which is the parent inode, will be modified. There is this nest_info_t; I think it's in the inode_t structure. I can't really find it right now, but the rstat is protected by the nestlock and the dirstat by the filelock. You need that because the dirstat is nothing but the number of files and the number of directories under that particular inode, and the nestlock covers the recursive one.
A
So we add these locks because, once we create this directory, we are changing the number of files or the number of directories under the parent, so those will be modified, and we add a write lock on them. Since the stats are protected by the filelock and the nestlock, we add those. Then we add a read lock on the authlock of the parent of the dentry to be created, since while creating this dentry we need to access the parent's permissions, that is, the uid, gid and other things, for the normal permission checks. So we add an rdlock on those. Then we add an xlock on that final path component, the dentry itself. This is required because we'll be creating a new dentry and then filling in things like linkage_t, which is the actual persistent information for that particular dentry. We can see that too; I can see linkage_t here. Since we need to fill in this linkage_t, we take the exclusive lock on the dentry to be created, which is here.
A
These are the durable bits, so we take an exclusive lock on that. Then, once we have filled all these locks into the vector, we call into acquire_locks, which is the thing that actually goes through the state machine and sees whether you can lock those particular metadata fields, or, if they are already locked, you are put in a queue.
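Putting the walkthrough together, the set of lock requests gathered for a create-style operation can be sketched as a small Python function. It only restates what the talk lists (rdlocks on existing path components, wrlocks on the parent's filelock and nestlock, an rdlock on the parent's authlock, an xlock on the new dentry). The function name and the string labels are hypothetical, and the snapshot locks mentioned earlier are left out for brevity.

```python
def locks_for_create(path):
    """Return (lock, kind) pairs for creating the last component of `path`."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path needs at least one component")
    *dirs, leaf = parts
    parent = dirs[-1] if dirs else "/"
    ops = []
    for d in dirs:
        ops.append((f"dentry({d})", "rdlock"))     # existing components: shared read
    ops.append((f"filelock({parent})", "wrlock"))  # parent's dirstat will change
    ops.append((f"nestlock({parent})", "wrlock"))  # parent's rstat will change
    ops.append((f"authlock({parent})", "rdlock"))  # permission check on the parent
    ops.append((f"dentry({leaf})", "xlock"))       # the new dentry is exclusively locked
    return ops
```

For `/a/b/c/file0` this yields three dentry rdlocks, the three locks on the parent `c`, and one xlock on `file0`, which is the list that would then be handed to the acquire step.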
A
What happens in most cases is that you'll find this lookup to be successful, except for the last path component, because that's mostly what you are creating. In that case, you jump to this part where you actually create a null dentry: the lookup is a miss, so you instantiate a null dentry for it. And then again you do the same thing: if you want to exclusively lock the dentry, add the filelock and the nestlock for the parent inode, then add the authlock for the parent inode for permission checks, then the actual exclusive lock on the dentry itself, and then you call into acquire_locks. I can go into acquire_locks; I'll touch on it very briefly in this talk. But this is how a normal operation works. While implementing a particular file operation, you need to know which fields in the metadata of a dentry or an inode are going to be touched.
A
If we look at something like setxattr, that will just take a lock on IXATTR and call acquire_locks. A lookup, or any other getattr-style call, is basically just reading information, so it doesn't take any exclusive locks or write locks, just read locks. Once everything is read-locked, it just reads the different fields and returns.
A
So here we do an rdlock on the path. This is one of the other helpers, similar to rdlock_path_xlock_dentry, but it's just a basic rdlock on everything. It rdlocks the path, and then we do an xlock on the xattrlock itself. So this will just walk the path components. Let's quickly look at this helper. I think it just does an rdlock of the path, so there's no exclusive lock on the dentry: it rdlocks the path, rdlocks the snaps, and then does an MDCache path traversal. Depending on the flags you pass, the MDCache path traversal behaves differently. Here you're not passing the flag to xlock the dentry, so it just does a simple read lock on each of the path components. Back here, once we have rdlocked everything, we just add an xlock, because we are going to modify the xattr value.
A
So there's just one lock added here, an exclusive lock on the xattrlock. The xattrlock protects the version, ctime and the xattrs. Then we call into acquire_locks.
A
Let's see what else we can do. I think we could also do the client mknod handler. Since we are creating a new directory entry, we xlock the directory entry. Let's see if we can look at link.
A
Now there are these flags such as snap and snap2, since we are handling two different paths; even rename has that. Just like path and path2, we have snap and snap2: snap is the snap lock for one particular path, and the one with the 2 is for the other. So we have that notation. Let's see where we are doing it.
A
This is a bit tricky too, because now we have two paths to operate on and lock. Let's see if we can spot it; I think it's somewhere here. I'm not going into detail on this two-path helper that xlocks the destination dentry. It should be fairly straightforward once you understand the normal rdlock_path_xlock_dentry.
A
When we are doing a link, we are bumping up the link count; it's the hard link thing. So we add an exclusive lock on the linklock. The linklock protects the version and ctime. If you look at the encode functions in the CInode source, you will see that everything encodes the inode version, so that's by default. And then the linklock protects nlink, the number of links for a particular regular file. So we add that as an xlock and then call acquire_locks. Let's see if we can quickly do getattr.
A
Yeah, the path traversal is done with basically no extra flags, so everything is a read lock. We saw the setxattr one, right? This one mostly does read locks.
A
This probably needs some change. It's practically unreadable if you don't have all these cap notions in your head. But you can see that most of it is just doing read locks, rdlock after rdlock, because you are just doing a lookup or a getattr. There is no write or update involved, so you just need rdlocks, and then once that's done, the same acquire_locks.
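The operations walked through so far can be summarized as a small Python lookup table. This is a summary of the talk, not generated from the Server source: the operation names are informal, the lock labels are descriptive strings, and "requested inode locks" stands in for whichever inode locks a getattr needs, which the talk does not enumerate.

```python
# operation -> lock requests added on top of the rdlocks on the path components
EXTRA_LOCKS = {
    "mkdir": [("new dentry", "xlock"), ("parent filelock", "wrlock"),
              ("parent nestlock", "wrlock"), ("parent authlock", "rdlock")],
    "setxattr": [("xattrlock", "xlock")],
    "link": [("destination dentry", "xlock"), ("linklock", "xlock")],
    "getattr": [("requested inode locks", "rdlock")],
    "lookup": [],
}


def is_read_only(op):
    """True when the operation asks for nothing stronger than rdlocks."""
    return all(kind == "rdlock" for _, kind in EXTRA_LOCKS[op])
```

The split between read-only operations and the rest matches the talk's rule of thumb: work out which metadata fields the operation touches, then ask only for the locks that cover them.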
A
Let me see if I can take you through Locker.cc without confusing myself and everyone else. Yeah, I'll probably just defer it to the next talk. But there are a bunch of things that happen here. Once you give a vector of locks for the MDS to grab, it rearranges it, first of all for correctness. You can add locks in any order, and the Locker API, which is the acquire_locks interface, will rearrange it
A
in a particular order and optimize it, I guess. So it does some lock merging and things like that, basically to ensure that the locks are grabbed in the correct order and as efficiently as possible.
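The reordering and merging idea can be sketched in a few lines of Python. This is a toy model, not Ceph's code: `LockOp`, `normalize`, the sort key (object name, then lock name), and the rule that a stronger mode replaces a weaker one on the same lock are all assumptions made for illustration. The real rules live in acquire_locks and are left for part 2.

```python
from dataclasses import dataclass

# Simplifying assumption for this sketch: on the same lock, an xlock request
# replaces a wrlock request, which replaces an rdlock request.
STRENGTH = {"rd": 0, "wr": 1, "x": 2}

@dataclass(frozen=True)
class LockOp:
    obj: str   # hypothetical object name, e.g. "inode:2"
    lock: str  # lock name, e.g. "linklock"
    mode: str  # "rd", "wr" or "x"

def normalize(ops):
    """Merge duplicate requests and return them in one canonical order."""
    strongest = {}
    for op in ops:
        key = (op.obj, op.lock)
        cur = strongest.get(key)
        if cur is None or STRENGTH[op.mode] > STRENGTH[cur.mode]:
            strongest[key] = op
    # Sorting by (object, lock) means every request takes locks in the
    # same order, whatever order the caller added them in.
    return [strongest[key] for key in sorted(strongest)]
```

Because two requests that need the same locks always end up asking for them in the same order, neither can hold one lock while waiting on the other in reverse, which is the correctness point being made here.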
A
So yeah, I think we'll just cover this in the subsequent talks. But, not to frighten anyone, these are the different lock states. Once we go into things like the scatter lock... The simple lock is probably the easiest one, and we'll go through that in the next series.
A
The scatter locks are the most complex ones, and they use a bunch of these states. So the main difference between the simple lock and the scatter lock, in terms of definition, is that the simple lock says
A
The simple lock says anyone can actually read lock. Sorry.
A
Yeah, right. So the simple lock is a base class that handles all these distributed locks, and the scatter lock handles locking for more complex situations.
A
So let me see if I can define... Yeah, I think we'll just do this next time, in the next series. I need to go through a bunch of these to explain the actual differences between the simple lock and the scatter lock. So, for the sake of avoiding confusion and dragging this on too long, we'll do it next time. And this source was probably last touched somewhere in 2009 and never touched again.
A
So it's really ancient. And the locks.c source is a C source, not even a C++ source, so you need to duplicate a bunch of entries from CephFS and the header we just saw. We will probably make this much cleaner, so that it's more approachable for people trying to make sense of the whole locking thing. So these are the state machines that are used by the Locker class.
A
So you have state machines for the simple lock. Then you have state machines for the scatter lock, which is the actual distributed lock where we do shared reads and shared writes. And then the state machine for the file lock, which is extremely complex. And the local lock is probably the simplest one, because it doesn't involve anything. It's just basic: anyone can read lock, but
A
the auth MDS can take the write locks, because only the auth MDS can move the states around and increment the versions for that particular inode. So we'll do this probably in the next series. So I hope, going through the different file operations,
A
things have started to make sense. At least you know why all these locks are required and why different lock types are used. And if you get a chance to implement a new file operation, you know what locks to acquire: if you're modifying the xattrs you need to take one lock, and if you're modifying the uid or the gid you take a different lock.
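One way to picture "different fields, different locks" is a small lookup table. The sketch below is a toy Python model under stated assumptions: the talk only names the linklock for nlink and the parent's authlock for permission checks, so the lock names `authlock` and `xattrlock` for uid/gid/mode and xattrs are taken here as assumptions based on CInode's lock members, and `locks_for_update` is a hypothetical helper.

```python
# Which inode lock guards which field. The grouping is simplified for
# illustration; "authlock" and "xattrlock" are assumed names (see lead-in).
FIELD_TO_LOCK = {
    "nlink": "linklock",
    "uid": "authlock",
    "gid": "authlock",
    "mode": "authlock",
    "xattrs": "xattrlock",
}

def locks_for_update(fields):
    """Return the set of locks to xlock before modifying the given fields."""
    return {FIELD_TO_LOCK[field] for field in fields}
```

For example, `locks_for_update(["uid", "gid"])` needs only one lock, while touching both the link count and the xattrs needs two.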
A
Yeah, so the simple lock is the base class for lock typing and implementation, so everything is kind of a distributed lock. I'll quickly show that. If you see the simple lock thing, it says anyone can read lock, which means even the replicas can read lock.
A
But nobody can do a write lock or an exclusive lock. Does that make sense? So everything is a distributed lock. The thing is, the semantics are different: the simple lock has one set of semantics for read, write and exclusive, while the scatter lock will have different semantics for read, write and exclusive, and the file lock again will have different semantics for read, write and exclusive.
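The "same operations, different semantics" idea can be modeled as one permission table per lock class, indexed by lock state. The Python sketch below is only an illustration: the `sync` row mirrors what is read out above for the simple lock (anyone may read lock, nobody may write lock or exclusive lock), while the `auth_only` row, the state names, and `can_take` are hypothetical and not taken from locks.c.

```python
# Who may take a lock in a given state.
ANY, AUTH, NONE = "any", "auth", "none"

# One row per lock state. Only the "sync" row reflects the talk; the
# "auth_only" row is a made-up contrasting example.
SIMPLE_LOCK_STATES = {
    "sync":      {"rd": ANY,  "wr": NONE, "x": NONE},
    "auth_only": {"rd": AUTH, "wr": NONE, "x": AUTH},  # hypothetical
}

def can_take(table, state, mode, is_auth):
    """Return True if an MDS (auth or replica) may take `mode` in `state`."""
    who = table[state][mode]
    if who == ANY:
        return True
    if who == AUTH:
        return is_auth
    return False
```

A scatter lock or file lock would plug a different table into the same `can_take` check, which is roughly what "different semantics for read, write and exclusive" means here.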
A
A
Right, right. Yeah, that's because it's only used for bumping up the CInode and the dentry versions, so that isn't distributed. It's only used locally by that particular MDS.
A
Right, so everything... if you see handle_client_getattr.
C
A
Right, right, that's what we do. So these locks are grabbed and held for the scope of the request, and once the request is done, we drop all these locks. A minor point here is that the MDS has something called an early reply, where it can reply back even before it starts journaling.
A
In that case, what happens is, I think, the read locks are dropped. So after the early reply, the read locks are dropped, and then once the journal hits the disk, or the operation is journaled, we drop all the write and exclusive locks. So that's a small point, but the scope of the lock is the scope of the request itself.
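The lock lifetime described here can be sketched as a tiny Python model. It follows the talk's own hedged description (read locks dropped after the early reply, write and exclusive locks dropped once the journal hits the disk). The `Request` class and its method names are hypothetical and are not Ceph's request structures.

```python
class Request:
    """Toy model of the locks held by one MDS request."""

    def __init__(self, locks):
        # locks: iterable of (lock_name, mode), mode in {"rd", "wr", "x"}
        self.held = set(locks)

    def early_reply(self):
        # The client has been answered early: read locks can be released.
        self.held = {(name, mode) for (name, mode) in self.held if mode != "rd"}

    def journal_committed(self):
        # The journal has hit the disk: release write and exclusive locks too.
        self.held.clear()
```

For a request holding `("authlock", "rd")` and `("linklock", "x")`, only the xlock survives `early_reply()`, and nothing survives `journal_committed()`.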
C
A
Yeah, so that's because we have different fields that need to be either read or modified for an operation. Like we saw, when we are creating a new dentry, like a new directory, we need to read the permissions of the parent directory for permission checks, so that requires a read lock on the parent authlock.
A
Then, once we are creating a directory, you need to update the number of directories and the stats, that is, information like the dirstats and the rstats. So that requires a couple of other locks to be wrlocked, write locked, because that needs to be updated. And then you need the snaplocks to protect the snaps, so we rdlock the snaplocks. So we have different locks, because there is no one lock that guards everything, right?
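Put together, the lock set for the mkdir example looks roughly like the Python sketch below. It is a simplified illustration, not the actual request handler: the talk only says "a couple of other locks" for the stats, so mapping the dirstat to `filelock` and the rstat to `nestlock` is an assumption here, `mkdir_locks` is a hypothetical helper, and the xlock on the new dentry is left out.

```python
def mkdir_locks(parent):
    """Locks a simplified mkdir would request on the parent directory inode."""
    return [
        (parent, "authlock", "rd"),  # read permissions for the access check
        (parent, "filelock", "wr"),  # update the directory's dirstat (assumed)
        (parent, "nestlock", "wr"),  # update the recursive rstat (assumed)
        (parent, "snaplock", "rd"),  # keep snapshot state stable
    ]
```

Four different locks on one inode, two read and two write, for a single operation: that is the practical meaning of "no one lock guards everything".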
A
There are a bunch of locks that cover different fields of the metadata in that particular data structure. Some need to be read locked, some need to be write locked. And to make it more complex, it could be that some of these frags are handed over to replica MDSs, and they are generating read capabilities.
A
So to update the atime once a read operation is done on a particular frag, it all needs to be integrated, and then the auth MDS needs to update the atime. So you can imagine the complexity of that. That's why you need all these distributed locks and all these state machines, so that the locking is done correctly and as efficiently as possible.
B
Yeah, sort of the fundamental design of CephFS is that the metadata is sharded out under different locks, right? And it's really quite different from most other file systems.
B
A
And that introduces a lot of complexity, a whole lot of it, in all this locking, especially when we come to scatter locks, which are like the nest locks, the file lock and the nest
B
A
and file locks, and that becomes really complex, because you're sharding out all this metadata to replica MDSs. You're not only sharding, but you are actually splitting up a particular directory inode into different frags, and then you can make different copies of these frags and assign them to different MDSs. So that makes things interesting and complex.
A
That's all the questions, it seems. So thanks, guys, for attending this talk. We'll do a part two. I don't know when; the next slot is probably already taken, and probably the other one too. So we'll do a part two which covers the actual acquire_locks implementation. We'll go through the state machines, and we'll see how the simple lock works. So yeah, see you in August, guys. Thank you.