From YouTube: Ceph Code Walkthroughs: BlueStore SMR
A
Okay, so let me start by following the order in which I added the code. First, we have this zoned-device specialization for HM-SMR drives. It's pretty much a copy-paste of the kernel device that you already have here, and this one just adds a couple of SMR-specific things.
A
Yeah, this function just makes a couple of SMR-specific calls, like reporting the number of zones to get how many zones there are on the device and how many of them are conventional, and it sets these two variables: the zone size and the conventional region size. Every SMR disk has a conventional region at the beginning. I mean, logically those zones are mapped to the beginning of the drive, but physically they are at the center of the disk.
A
A
So this sets those parameters, and the rest of it is pretty much the same. Another important thing about this: we no longer need the libzbd library. We need to get rid of it, because the latest kernels have ioctls that you can use to get this information, so you do not need to link with libzbd. The rest of it is pretty much the same.
B
A
Yeah, there is no rush. We probably should, because the ioctls were added very recently, probably a couple of months ago. So if there are people who will be running this on older kernels, then yes.
A
Yeah, that's about it for the HM-SMR device. Most of the new stuff, I mean the SMR-related stuff, is in BlueStore. Recently I had to guard it all with #ifdef HAVE_LIBZBD, so I'll just go over all the parts of the code that have this #ifdef and explain them. We have a ZonedAllocator and a ZonedFreelistManager; I will go into them next.
A
We have new key prefixes for the zoned code, and I think I have explanations for them in the ZonedFreelistManager. We've also added this new thread for the cleaner.
A
And when we are opening a block device, we just check: if bdev is SMR, then we set the freelist type to zoned. So we have added a new freelist type, zoned.
A
And then, when we are opening the FreelistManager, we are doing this ugly hack here. If the device is SMR, then we piggyback the device parameters on top of min_alloc_size and pass it to fm->create(), and within fm->create() we extract the information that we piggybacked onto the alloc size. Let me quickly jump into this to see what we're packing there. So, for now, this is to avoid interface changes.
A
We piggyback the zone size in megabytes and the first sequential zone number onto min_alloc_size and pass it to the functions Allocator::create and FreelistManager::create. We just shift these into the higher bits of min_alloc_size, which are unlikely to be used, and then we extract this information in the allocator.
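The piggybacking described here could be sketched roughly as follows. This is an illustration, not the BlueStore code: the bit positions (32 and 48), the 16-bit field widths, and the names piggyback, extract, and ZonedParams are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: bits 0-31 hold min_alloc_size, bits 32-47 the zone
// size in MiB, bits 48-63 the first sequential zone number.
constexpr unsigned kZoneSizeShift = 32;
constexpr unsigned kFirstSeqZoneShift = 48;

// Pack the two device parameters into the unused high bits.
uint64_t piggyback(uint64_t min_alloc_size, uint64_t zone_size_mb,
                   uint64_t first_seq_zone) {
  assert(min_alloc_size < (1ull << kZoneSizeShift));
  assert(zone_size_mb < (1ull << 16));
  assert(first_seq_zone < (1ull << 16));
  return min_alloc_size | (zone_size_mb << kZoneSizeShift) |
         (first_seq_zone << kFirstSeqZoneShift);
}

struct ZonedParams {
  uint64_t min_alloc_size;
  uint64_t zone_size_mb;
  uint64_t first_seq_zone;
};

// Recover all three values on the receiving side.
ZonedParams extract(uint64_t packed) {
  return {packed & 0xffffffffull,
          (packed >> kZoneSizeShift) & 0xffffull,
          packed >> kFirstSeqZoneShift};
}
```

The receiver has to mask the high bits off before using the value as an allocation size, which is part of why the speaker calls this a hack.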
A
That's so it can actually use those parameters. Okay, now moving on. When we are creating the allocator, we first make sure that we're running with the right config settings. First, we make sure that the allocator type is zoned. I have had some kind of arbitrary restrictions, because we were first targeting getting the simplest common case working, and the other one is that min_alloc_size should be at least 64K.
A
Those were the two, so we're checking them here, and if those settings are not right, then we just error out. Then we call the same function to pass to the allocator the same parameters that we passed to the FreelistManager, and then here we create the allocator and pass that information in the alloc size.
A
Okay, so here we're initializing the allocator and the FreelistManager. Originally I had extended the Allocator interface to add SMR-specific calls, but then we discussed this with Igor, and he suggested that I should not pollute the interface and instead use this dynamic_cast to get the type of the allocator.
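The dynamic_cast approach mentioned here can be illustrated with a small sketch. The class and method names below (BitmapLikeAllocator, zoned_set_zone_size, init_if_zoned) are hypothetical stand-ins, not the real Ceph interfaces.

```cpp
#include <cassert>
#include <cstdint>

// Generic interface, left free of SMR-specific calls.
struct Allocator {
  virtual ~Allocator() = default;
  virtual uint64_t get_free() const = 0;
};

struct BitmapLikeAllocator : Allocator {
  uint64_t get_free() const override { return 0; }
};

// SMR-specific subclass with an extra, non-virtual setup call.
struct ZonedAllocator : Allocator {
  uint64_t zone_size = 0;
  uint64_t get_free() const override { return 0; }
  void zoned_set_zone_size(uint64_t s) { zone_size = s; }  // hypothetical
};

// Runs the zoned-only setup if, and only if, the allocator is zoned.
bool init_if_zoned(Allocator* a, uint64_t zone_size) {
  if (auto* z = dynamic_cast<ZonedAllocator*>(a)) {
    z->zoned_set_zone_size(zone_size);
    return true;
  }
  return false;
}
```

The trade-off is that the caller needs to know about the concrete type, but the base interface stays unchanged for every other allocator.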
A
We're initializing the allocator, so here we're getting the zone states from the database. This will become clear once we start looking at the allocator; I'll come back to it. And then we're also passing a lock and a condition variable.
A
These are also being used by the cleaner thread here. I'll come back to this again; let me just say that this is the initialization of the allocator and the FreelistManager.
A
A
Yeah, I don't recall adding this, but I think this was modified.
D
A
B
A
A
And now the cleaner can identify live objects within a zone by enumerating all keys that start with the zone number prefix. I talked about this a little bit in our last meeting, so that's the implementation of it. We maintain this zoned onode-to-offset map (we prefixed everything that's zone-related with "zoned"), so this is basically an onode-to-offset map. Let me first quickly jump to the places where we update it.
A
Okay, so this is a map from onode to a vector of object offsets. For new objects created in the transaction, we append the new offset to the vector. For overwritten objects, we append the negative of the previous on-disk offset followed by the new offset. And for truncated objects, we append the negative of the previous on-disk offset.
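The bookkeeping described here can be sketched as below. It is a simplified illustration: objects are keyed by a string name instead of an onode, and the type and method names are hypothetical. It also assumes object offsets are strictly positive, since offset 0 cannot be negated.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Per-transaction notes: object name -> signed offsets.
// A positive value is a new location, a negative value is a location
// that the object no longer occupies.
struct ZonedTxnNotes {
  std::map<std::string, std::vector<int64_t>> onode_to_offsets;

  void note_new(const std::string& oid, int64_t new_off) {
    onode_to_offsets[oid].push_back(new_off);
  }
  void note_updated(const std::string& oid, int64_t prev_off,
                    int64_t new_off) {
    onode_to_offsets[oid].push_back(-prev_off);  // old location is now dead
    onode_to_offsets[oid].push_back(new_off);    // new location is live
  }
  void note_truncated(const std::string& oid, int64_t prev_off) {
    onode_to_offsets[oid].push_back(-prev_off);
  }
};
```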
A
So, let's see where we update this.
A
Okay, so to update this we have these short functions: zoned_note_new_object, zoned_note_updated_object, and zoned_note_truncated_object. Within BlueStore we call these in specific places. I'll jump to where we are calling them, but before that, let's see how we process them. So if we go over...
A
A
And we also need to give it the offset so that it can get the actual zone number. It gets the zone number by just dividing the offset by the zone size, then it encodes that and returns a single string that has the zone key plus the object key. So this function returns the key, and the offset goes into the bufferlist as the value. If the offset is negative, it means the object was removed, so we just remove the key.
A
We pass the negative offset, and we make it positive when we compute the zone number here, so we just remove the key. Now let's look at the parts where we are actually updating these.
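The key construction and the handling of negative offsets described above could look like the following sketch. The key format (a fixed-width hex zone number followed by the object key) and the use of a std::map in place of the key-value transaction are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Zone number is the absolute offset divided by the zone size.
uint64_t zone_of(int64_t offset, uint64_t zone_size) {
  return static_cast<uint64_t>(std::llabs(offset)) / zone_size;
}

// Hypothetical key format: zone number as 16 hex digits, then the object
// key, so that all keys of one zone share a common prefix.
std::string zone_object_key(const std::string& object_key, int64_t offset,
                            uint64_t zone_size) {
  char prefix[17];
  std::snprintf(prefix, sizeof(prefix), "%016llx",
                static_cast<unsigned long long>(zone_of(offset, zone_size)));
  return std::string(prefix) + object_key;
}

// Apply the noted offsets of one object to a key-value map.
void apply_offsets(std::map<std::string, int64_t>& kv,
                   const std::string& object_key,
                   const std::vector<int64_t>& offsets, uint64_t zone_size) {
  for (int64_t off : offsets) {
    std::string key = zone_object_key(object_key, off, zone_size);
    if (off < 0) {
      kv.erase(key);   // negative: the object left this zone
    } else {
      kv[key] = off;   // positive: the object now lives here
    }
  }
}
```

With keys shaped like this, enumerating the live objects of a zone becomes a prefix scan over that zone's number.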
A
A
So if the device is SMR and the old extents are empty, it means it's a new object, because we're not overwriting.
A
If the old extents are not empty, then it means we're updating an object, so we call the updated-object function, and we get the old on-disk offset using this. Again, these are assumptions that are not true all the time.
A
This is for the very simple scenario where you have complete objects that map to complete extents on disk. So we call new here, we call updated here, and then we need to call the third one somewhere else, which is, I think, _do_truncate. Yeah, so deleting or truncating the object: both code paths go through this function, _do_truncate.
D
A
Yeah, so in _txc_finalize_kv, up to here, within a transaction, we have made note of all the new objects, all the deleted objects, and all the updated objects, and all of that information is within this map. So now we have all the newly created, deleted, and updated objects within this map, and then...
A
Here, in this function, we are processing them and making the necessary key-value changes to the transaction. That's where this happens, and after that, I think, all those updates go to the key-value store that keeps track of the metadata.
A
A
This pretty much follows the pattern that you have here for other threads. I tried to follow the same pattern, with one function that starts the thread and another function that stops it. So here we get the zones to clean, and if it's empty...
A
Oh, we have the protocol here for making sure that if we somehow crash in the middle of a cleaning process, we can continue when we resume. We have a protocol here that makes sure of that.
A
It stays consistent after a crash, and it resumes where it stopped. I have described the protocol in the PR, and it should also be here, but basically what we're doing is making a note when we start to clean.
D
B
D
B
Like, that's an atomic operation, or whatever it is should be atomic. And if you're partway through cleaning a zone and then you restart, presumably, if you chose that zone before, you'll be even more likely to choose it again, because there's less stuff in it now than there was before. But even if you didn't, maybe just because you have a different policy around cleaning... I'm not really sure. I guess I wasn't sure why.
A
A
I think probably because you do not have to... So yeah, first, let me...
A
Let me find the commit, or rather the PR, where I actually had it.
C
A
So: cleaning multiple zones is not atomic. Therefore, to support resuming cleaning after a crash, the cleaner thread first persists the list of zones to clean as the value of the cleaning-in-progress zones key. I mean, I think it's just to save the effort of going through...
A
Oh, that's because there will be an inconsistency if you do not. Let's say you chose five zones and you started to clean them. You update the persistent metadata only after you have cleaned all of the zones.
A
So let's say you clean the first zone and you still haven't updated the metadata for that zone in RocksDB. For each zone we're keeping the number of dead bytes in the zone and the write pointer, and we update them after we have cleaned all the zones in a batch.
A
So if we clean the zone but do not persist its metadata, I think what will happen is that after we resume from the crash, we will...
A
Since we haven't updated the metadata for the first zone that we cleaned, we will still use old, stale metadata to decide which zones to clean, and we will get that zone back again as a candidate for cleaning even though it has been cleaned. So what will happen? The cleaner will go through this zone, find that there is nothing to be cleaned because it's already been cleaned, and then move on to the next zone.
A
So it will probably just avoid some redundant work, I guess. But whether it will break consistency: I thought about this when I was designing this protocol, but I didn't make any notes, so I am not 100% sure.
B
I guess what I'm wondering is if we can make it so that... I mean, if you look at the state of the system, you can have an object that's going to have some number of extents. Let's just say an object has one extent that's in a zone that needs to be cleaned. So there's going to be some do_move or something similar that's basically going to take...
B
It's going to take that extent, read it in, and write it again somewhere else. But if we can make that transaction atomic, in the sense that it updates the allocation for the new zone, saying it has more used bytes, write pointer or whatever, and it also increases the dead bytes on the old zone...
A
A
B
A
Yeah, so if do_move is atomic, then we do not need to do this stuff that we're doing right now. But the issue there is batching a bunch of updates: with do_move we'll have to do a ton of small updates to RocksDB, whereas by first making a note and then doing all the updates at the end, we'll have more coarse-grained updates to RocksDB.
B
A
Actually, yeah. If do_move is atomic, again, it's a matter of whether you want to batch updates to RocksDB or do one every time you clean an object. We'll still have to make some of do_move transactional anyway, because it will need to update object metadata atomically. But there's also the related zone metadata, because we also update that once the object move is complete.
A
We also increase the number of dead bytes in the zone as well. Yes, yeah.
B
That's what we're saying, yep. We already have a transaction framework in the write path, and so every transaction should be atomic and leave everything in a fully consistent state. So I guess what we're saying is that if we just make sure that we update the allocation metadata in those, just like we do with writes, then...
A
Yeah, I mean, it still saves some RocksDB transactions, but...
B
D
A
Okay, so we were here: the cleaner start.
A
And then, when we resume, we call _zoned_clean_zone on each of the zones to clean, then reset all the zones, and then this marks the zones clean in the FreelistManager, which is in RocksDB. So this persists that the zones are clean.
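The crash-recovery protocol discussed above (persist the batch, clean the zones one by one, then clear the marker) can be sketched like this. A std::map stands in for RocksDB, and the key name, the text encoding, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using KV = std::map<std::string, std::string>;  // stand-in for the KV store
const std::string kInProgressKey = "cleaning_in_progress_zones";  // hypothetical

std::string encode(const std::vector<uint64_t>& zones) {
  std::ostringstream os;
  for (uint64_t z : zones) os << z << ' ';
  return os.str();
}

std::vector<uint64_t> decode(const std::string& s) {
  std::istringstream is(s);
  std::vector<uint64_t> zones;
  for (uint64_t z; is >> z;) zones.push_back(z);
  return zones;
}

// Step 1: persist the batch before touching any zone.
void begin_cleaning(KV& kv, const std::vector<uint64_t>& zones) {
  kv[kInProgressKey] = encode(zones);
}

// Steps 2 and 3: clean every zone in the batch, then drop the marker.
// Called both on the normal path and at startup. Cleaning an already clean
// zone must be harmless, because a resumed batch repeats it.
void finish_cleaning(KV& kv, const std::function<void(uint64_t)>& clean_zone) {
  auto it = kv.find(kInProgressKey);
  if (it == kv.end()) return;  // nothing was in flight
  for (uint64_t z : decode(it->second)) clean_zone(z);
  kv.erase(kInProgressKey);
}
```

If the process dies between the two calls, the marker is still in the store on restart, so calling finish_cleaning at startup picks the batch up again.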
A
B
Question here: is there any reason to clean multiple zones at once?
B
B
If you're going to have lots of extents that you need to move within a zone, each of those, or even chunks of them, is going to be a transaction; to keep it simple, let's just say each one is a transaction. There's not going to be value in reading in parallel from multiple zones, because it's a hard disk with seek latency, so you're really going to want to sequentially read the entire zone, or whatever the live bytes are in the zone. So we can...
A
So the zone selection process, which happens in the FreelistManager, gives the list of zones to clean. Are you asking why we're cleaning multiple zones at a time?
B
A
B
A
I mean, it's not cleaning two zones in parallel. It just asks for them once, and that's tunable. Those things we haven't actually... If we look at...
A
Oh, this one just reads it from the database. I think the place where we actually decide what zones to clean is in the ZonedAllocator. Yeah, here: find_zones_to_clean. Right now it's just set to one, though I've made a note here to make it tunable.
A
Okay, it just depends on what kind of cleaning you want to do. You may want to do stop-the-world cleaning, where you stop everything and no I/O is happening. For example, you know how we discussed having I/O redirected to the two other OSDs while the third one is not receiving any I/O, so all the cleaning is happening there. You may want to do that kind of cleaning.
A
So those are all policy things that are yet to be figured out. But sure, here, getting the zones to clean can always return just one.
A
Again, it's not reading from multiple zones. It's just getting the list of multiple zones at the beginning and then cleaning them one by one, sequentially. It's not doing parallel cleaning.
A
And this one is just the standard pattern for stopping a thread. And this is the cleaner thread, where it all happens in a loop: it just keeps getting zones to clean, and if there are no zones to clean, it goes to sleep. Otherwise, it first makes a note in the database.
A
These are the zones that it's going to work on, and then it just keeps cleaning zones one by one. This code is similar to the one that we saw here in the start function: that one is doing cleaning on recovery, and this one is doing cleaning on the common path. Got it? Okay.
C
A
Yeah, so this is _zoned_clean_zone, the function that will take a zone number and clean the objects in the zone by calling an atomic do_move on every object in the zone. So if do_move is atomic, then yes, we can let go of the recovery. I need to think more about this, but I think you guys already...
A
You computed it in your heads and said that doing fine-grained updates to RocksDB will be free. I'm not sure; maybe you're right. We'll figure it out. But yeah, if this do_move of an object needs to be atomic anyway...
A
A
You know, a complete working system without any optimizations. So now only two things are left. I think that's all the ZBD-specific code.
A
On zoned devices, the first goal is to support non-overwrite workloads, such as RGW, with large, aligned objects. Therefore, for user writes, _do_write_small should not trigger. OSDs, however, write and update a tiny amount of metadata, such as OSD maps and statistics. For those cases, we temporarily just pad them to min_alloc_size and write them to a new place on every update. So that's for handling the workloads that I was thinking of.
A
_do_write_small shouldn't be triggered, but there are still small metadata updates that are smaller than min_alloc_size, and here we're just padding them to min_alloc_size.
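Padding a small write up to min_alloc_size amounts to rounding its length up to the next multiple of the allocation unit. The sketch below assumes min_alloc_size is a power of two; the helper names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Round len up to the next multiple of min_alloc_size (a power of two).
uint64_t round_up(uint64_t len, uint64_t min_alloc_size) {
  assert(min_alloc_size != 0 &&
         (min_alloc_size & (min_alloc_size - 1)) == 0);
  return (len + min_alloc_size - 1) & ~(min_alloc_size - 1);
}

// Pad a small payload with zero bytes so it fills whole allocation units.
std::string pad_to_min_alloc(std::string data, uint64_t min_alloc_size) {
  data.resize(round_up(data.size(), min_alloc_size), '\0');
  return data;
}
```

With a 64K min_alloc_size, every tiny metadata update then occupies a full 64K at a fresh location, which is why the speaker describes the approach as temporary.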
B
A
B
B
A
Yeah, for that, I think we may need to revisit the metadata that we keep.
A
B
Well, I think that all the existing data structures should work okay. It's just a matter of, and Igor probably knows better, making sure that we take the right path through _do_write, _do_write_small, and _do_write_big so that we always allocate a new blob. The allocator is always going to give us a new write in the right place, and we don't take any of the wonky paths.
B
Sometimes we overwrite a block. Sometimes we journal the write that we intend...
D
B
And then write it back to do the deferred write. We don't want any of that stuff. We don't want to write into the unwritten portions of an existing blob. We can drop that, and so on. Actually, here's a thought, though. Igor, if I remember correctly, in the blob encoding there are all those fields for which parts of the blob are actually used or allocated. We could probably drop those, because we're never going to do any of these.
B
A
A
D
A
Okay, so this is another thing that I added: return the offset of an object on disk. This is added to Onode. This function is intended only for use with zoned storage devices, because in these devices the objects are laid out contiguously on disk, which is not the case in general. So this function is supposed to get the offset at which the object starts.
A
I've already forgotten how all these blobs and extents and so on work. I need to review this, but I think that this was correct.
A
This just assumes that objects are laid out contiguously: you have an object that starts at offset n, and all of its data is supposed to live at offsets n plus l, larger than n, which is, I guess, usually not the case. In the general case, it's possible that you have parts of it at offset n and other parts in some blobs at offset n minus 100 or something. I'm not sure. Anyway, this is another part that I added.
B
Yeah. Okay, so Igor, tell me if this makes sense. I think this strategy is a little bit off, because we want to allow these arbitrary write patterns, and in order to support that we want to allow blobs for a single object to exist in multiple zones. Which means that, I think, probably what we want is, for any given onode...
B
We should have a set of zones that are touched by that object. That can either be "this object exists in this zone somewhere, with one or more extents", so one key per onode-zone pair, or it could be per blob. We can fiddle with that.
B
But in the simplest case, just say "this onode has some data in this zone". And then we'd also need a similar structure in the write context that says what we're updating, like, this write...
B
The transaction has a new extent in this particular zone, or we've deallocated the last extent for a given object in that zone. So there's the delta there, so that we know when to create and remove the keys, basically for the cleaner.
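The proposal here, one key per onode-zone pair plus a per-transaction delta, could be sketched as follows. The Extent and ZoneDelta types and both function names are hypothetical; the sketch only shows how the sets of zones to add and remove might be derived from an object's extents before and after a write.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

struct Extent {
  uint64_t offset;
  uint64_t length;
};

// All zones that at least one extent overlaps.
std::set<uint64_t> zones_touched(const std::vector<Extent>& extents,
                                 uint64_t zone_size) {
  std::set<uint64_t> zones;
  for (const Extent& e : extents) {
    if (e.length == 0) continue;
    uint64_t last = (e.offset + e.length - 1) / zone_size;
    for (uint64_t z = e.offset / zone_size; z <= last; ++z) zones.insert(z);
  }
  return zones;
}

struct ZoneDelta {
  std::set<uint64_t> added;    // zones needing a new (onode, zone) key
  std::set<uint64_t> removed;  // zones whose (onode, zone) key can go
};

// Compare an object's extents before and after a transaction.
ZoneDelta zone_delta(const std::vector<Extent>& before,
                     const std::vector<Extent>& after, uint64_t zone_size) {
  std::set<uint64_t> old_z = zones_touched(before, zone_size);
  std::set<uint64_t> new_z = zones_touched(after, zone_size);
  ZoneDelta d;
  for (uint64_t z : new_z) {
    if (!old_z.count(z)) d.added.insert(z);
  }
  for (uint64_t z : old_z) {
    if (!new_z.count(z)) d.removed.insert(z);
  }
  return d;
}
```

A key is removed only when the object's last extent in that zone goes away, which matches the "deallocated the last extent" condition mentioned in the discussion.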
B
C
So yeah, I think, right.
B
A
Okay, so these are the main changes in BlueStore.cc, and then we have the ZonedFreelistManager.
A
A
This is scaffolding, and this just encodes and decodes the zone state. The ZonedFreelistManager has functions for writing those states into RocksDB and then loading them from RocksDB, and then this one reads the zone states.
A
And then there is the merge operator for updating, I think, the write pointer and the number of dead bytes.
D
A
In create, this is where we retrieve that piggybacked information, like the zone size, the number of zones, and the starting zone number. The starting zone number is the first zone number that...
A
Again, this mimics what the current FreelistManager does by just writing this information into RocksDB.
A
Similar to what the current...
A
The FreelistManager does the same. The interesting part is release: when we do a release, we increment the number of dead bytes.
A
B
Hey, I have a question about these host-managed drives in general.
D
B
You issue a write at offset x and then you crash, but the write actually succeeds, right?
C
B
D
A
All of them are open for write all the time.
A
Yeah, so I have another meeting at 12. I can do as much as I can, but I'll need to sign off.
B
This is good. I think we covered just about everything. I guess the one point there, though, is that when the allocator starts up and loads all its write pointers from the FreelistManager or whatever, we should also do one last check that looks at...
A
A
So this is the allocator. Right now, it just starts from the first zone and keeps iterating over the zones, and as soon as it finds a zone that fits the required size, it breaks out and returns that zone.
A
If we have iterated through all zones and we haven't found enough space, then it means we've run out of space.
A
Let's say it has successfully found a zone when it returns here. Then it increments the write pointer of the given zone and reduces the number of free bytes, and if there's no free space left in the current zone, it increments the zone number so that next time it starts from another zone. This is very rudimentary. So, find_zones_to_clean, this one...
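The allocation path described above, a first-fit scan over zones with a per-zone write pointer, can be sketched like this. The class is a simplified illustration with hypothetical names, not the ZonedAllocator itself, and it ignores conventional zones, locking, and cleaning.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class SimpleZonedAllocator {
 public:
  SimpleZonedAllocator(uint64_t num_zones, uint64_t zone_size)
      : zone_size_(zone_size),
        free_bytes_(num_zones * zone_size),
        write_pointers_(num_zones, 0) {}

  // On success stores the device offset in *offset and returns true.
  // Returns false if no zone has `want` contiguous bytes left.
  bool allocate(uint64_t want, uint64_t* offset) {
    for (uint64_t z = current_zone_; z < write_pointers_.size(); ++z) {
      if (zone_size_ - write_pointers_[z] < want) continue;  // does not fit
      *offset = z * zone_size_ + write_pointers_[z];
      write_pointers_[z] += want;  // advance this zone's write pointer
      free_bytes_ -= want;
      // Skip zones that are completely full on the next call.
      while (current_zone_ < write_pointers_.size() &&
             write_pointers_[current_zone_] == zone_size_) {
        ++current_zone_;
      }
      return true;
    }
    return false;  // iterated over all zones: out of space
  }

  uint64_t free_bytes() const { return free_bytes_; }

 private:
  uint64_t zone_size_;
  uint64_t free_bytes_;
  uint64_t current_zone_ = 0;
  std::vector<uint64_t> write_pointers_;  // bytes written so far, per zone
};
```

Note that free_bytes() can be larger than what any single zone can still accept, so an allocation may fail even when the total looks sufficient.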
A
A
If there is currently...
A
Yeah, so find_zones_to_clean: if there is cleaning currently ongoing, in which case the number of zones to clean will be non-zero, or if we're not low on space, then it just returns. Otherwise...
A
It does a partial sort of the zones by their number of dead bytes. It partially sorts up to the number of zones that we want to clean at once, and then it sets the zones to clean to the list of zones that we want to clean, based on the result of the partial sort.
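Selecting the zones with the most dead bytes through a partial sort could look like the sketch below. The function name and the representation (a vector of per-zone dead-byte counters indexed by zone number) are assumptions for the example; std::partial_sort is the standard library algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Returns up to `count` zone numbers with the most dead bytes, best first.
std::vector<uint64_t> pick_zones_to_clean(
    const std::vector<uint64_t>& dead_bytes, std::size_t count) {
  std::vector<uint64_t> idx(dead_bytes.size());
  std::iota(idx.begin(), idx.end(), 0);  // 0, 1, 2, ... (zone numbers)
  count = std::min(count, idx.size());
  // Only the first `count` positions end up sorted; the rest stay unordered.
  std::partial_sort(idx.begin(), idx.begin() + count, idx.end(),
                    [&](uint64_t a, uint64_t b) {
                      return dead_bytes[a] > dead_bytes[b];
                    });
  idx.resize(count);
  return idx;
}
```

A partial sort avoids ordering every zone when only a handful of candidates are needed, which matters on drives with tens of thousands of zones.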
A
If it finds zones to clean, then it will notify the condition variable and the cleaner thread will start cleaning. So this is the allocate path. Basically, we just have a very simple allocator that iterates over zones and returns the first zone that has enough free space. Every time it allocates, it also looks to see if we should start cleaning, and if so it kicks the cleaner thread so that it starts cleaning.
A
That's the allocate path. In the release path, we just go over the release set, an interval set, and keep updating the number of dead bytes for the zones. I mean, this is not in RocksDB; this is just in memory.
A
A
Yeah, this just updates it in memory, and then the FreelistManager will do it persistently.
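The in-memory side of the release path, walking the released intervals and crediting dead bytes to each zone, might be sketched as follows. The Interval type and the function name are hypothetical, and a plain vector stands in for the interval set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Interval {
  uint64_t offset;
  uint64_t length;
};

// Add the released bytes to the dead-byte counter of every zone they fall
// in. An interval that crosses a zone boundary is split at the boundary.
void note_released(std::vector<uint64_t>& dead_bytes, uint64_t zone_size,
                   const std::vector<Interval>& released) {
  for (Interval r : released) {
    while (r.length > 0) {
      uint64_t zone = r.offset / zone_size;
      uint64_t in_zone =
          std::min(r.length, (zone + 1) * zone_size - r.offset);
      dead_bytes.at(zone) += in_zone;  // throws if the offset is out of range
      r.offset += in_zone;
      r.length -= in_zone;
    }
  }
}
```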
A
There's some hack here as well. I don't remember it at the moment; I need to look at it more carefully to figure that out. But it looks like that code path is...
A
Yeah, there should be some code path here that has the part where you first allocate a block for the superblock.
A
A
Okay,
this
is
another
codepath
that
in
transactions
we
just.
A
So
currently
you
may
have
multiple
threads
issuing
rights.
We
have
multiple
threads
that
may
be
executing
on
these
codepaths,
so
there
may
be
allocation
and
right
that
happen
out
of
order.
A
So what we're doing here is, if it's SMR, we have added a lock so that allocation and write to disk happen in a single step. We take the lock here, and we release it down here somewhere.
A
Here, yeah, we release it here. The purpose of this is to make the allocate and write steps a single step, because it's possible that one thread allocates and the allocator gives it some offset, let's say 0, and then before it can write, another thread allocates and gets an offset like 10, and before the first thread writes, the second thread starts writing. So it ends up writing at offset 10 before offset 0 has been written.
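The idea of holding one lock across both the allocation and the write can be sketched as below. The ZonedWriter class is a hypothetical illustration: the "write" is simulated by recording the offset, so the sketch only demonstrates the ordering guarantee, not real I/O.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

class ZonedWriter {
 public:
  // Allocation and write happen under one lock, so no other thread can
  // write a later offset before this one has been written.
  uint64_t allocate_and_write(uint64_t length) {
    std::lock_guard<std::mutex> guard(lock_);
    uint64_t offset = write_pointer_;  // "allocate"
    write_pointer_ += length;
    written_.push_back(offset);        // "write": record the issue order
    return offset;
  }

  // True if writes were issued in increasing offset order.
  bool writes_in_order() const {
    std::lock_guard<std::mutex> guard(lock_);
    return std::is_sorted(written_.begin(), written_.end());
  }

 private:
  mutable std::mutex lock_;
  uint64_t write_pointer_ = 0;
  std::vector<uint64_t> written_;
};
```

If the allocation and the write took the lock separately, two threads could interleave exactly as in the offset 0 and offset 10 example above, which a sequential zone would reject.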
A
So this whole thing just makes sure that allocate-and-write is atomic. I've made a note here: they're trying to add this new command to the kernel, zone append, where you're not going to give the... Okay, I need to leave right now; I'm getting called. We can follow up on this.
B
I took some notes. I'll probably follow up with an email that's like a to-do list.