From YouTube: Log Spacemap Update by Serapheim Dimitropoulos
Description
From the OpenZFS Developer Summit 2018
slides: https://docs.google.com/presentation/d/1qxsbZGt1jCwhz-eHmilZS0ZATxGvvk7VLYBd1JnCCtI/edit?usp=sharing
A: Welcome to day two of the OpenZFS Developer Summit. We have, I think, five shorter talks today, so it's going to be a little bit more casual. We'll give the speakers a little bit of latitude if there's discussion that we want to have, in terms of the time, but I've asked the speakers to plan for about ten to twenty minutes each.
A: So welcome. Hopefully, folks, we're using a different streaming technology to stream it to the Internet today, on YouTube, so hopefully that's working as well. If not, let us know in the YouTube chat; I'll be monitoring that. With that, I'll hand it over to Serapheim in a moment. One more thing: we don't have a microphone for the speaker, so speakers, please try to project your voices, and folks in the audience, kind of in this general area, please try to keep the background noise down so that everyone can hear them.
B: This is a performance update on the log spacemap project. The log spacemap project is something that I presented at last year's summit; part of it was still an open question, and because of that we didn't have any concrete performance results. So that's what this presentation is going to be about. Just a small recap of what the project is about: we started working on this project from a problem that we found at Delphix. Basically, there were some pools that were experiencing degraded performance, and the reason for that was that each txg we were issuing a large number of I/Os for metadata updates.
B
Specifically,
we
were
offending
through
each
at
a
small
space
map
every
Dixie
and
the
reason
for
that
was
that
the
fragmentation
levels
were
very
high
and
the
workload
was
mostly
random
brands.
So
what
we
decided
to
do
about
it
all
basically
keep
all
the
changes
in
memory
and
don't
write
anything
to
this
right.
We
added
two
new
range
threes
per
meter,
slab,
one
for
lots
of
locations
and
one
front
of
fast
freeze
and
basically
how
things
would
go.
We
want.
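The in-memory bookkeeping described here can be sketched roughly as follows. This is a simplified Python model for illustration only; real ZFS keeps C range trees of offset ranges per metaslab, and plain sets of segments stand in for them here.

```python
# Simplified model of the per-metaslab unflushed-change bookkeeping.
# Real ZFS uses range trees of offset ranges; sets of (offset, size)
# segments stand in for them in this sketch.

class Metaslab:
    def __init__(self):
        self.unflushed_allocs = set()  # segments allocated since last flush
        self.unflushed_frees = set()   # segments freed since last flush

    def allocate(self, seg):
        # An allocation cancels a pending free of the same segment;
        # otherwise it is recorded as a new unflushed allocation.
        if seg in self.unflushed_frees:
            self.unflushed_frees.remove(seg)
        else:
            self.unflushed_allocs.add(seg)

    def free(self, seg):
        # Symmetrically, a free cancels a pending allocation.
        if seg in self.unflushed_allocs:
            self.unflushed_allocs.remove(seg)
        else:
            self.unflushed_frees.add(seg)
```

Nothing touches disk in this model; flushing a metaslab would simply drain both sets into its on-disk space map.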
B
Would
move
it
from
that
flash
freeze
to
that
class
dialogues
and,
if
you
want
afraid
you
move
it
from
them
classifications
to
them
fast,
freeze
and
now
the
question
is
okay.
Wouldn't
that
exert
a
lot
of
memory
pressure
of
the
system
and
in
reality
it
doesn't
as
much,
but
even
in
the
cases
that
we
do,
we
set
a
specific
limit
in
the
amount
of
memory
that
we
have
for
at
last
changes.
So
whenever
that
limit
is
exceeded,
we
start
flashing.
B
Some
metal,
slats
and
by
flashing
I
mean
that
this
doing
the
segment's
from
this
to
new
range
trees
that
we've
added
for
these
chains.
We
emptied
them
out
into
the
metal,
slab,
spaceman
and
now.
The
other
question
is
what,
if
we
crash,
we
have
all
these
things
in
memory
right.
What?
If
we
grass?
How
do
we
reconstruct
that
state?
B
And
the
answer
is
that
it's
it's
end
of
appending
to
each
metal
subspace
map
as
we're
doing
before
for
persistence,
we
issue
a
single
I/o
that
writes
all
the
metal
exchanges
in
a
single
pool,
wide
space
map
that
will
refer
to
the
log
space
map.
So
it's
thick
seed.
We
just
keep
all
these
changes
in
this
new
space
map
that
we
have
so
in
case
that
we
crossed
during
import
time.
B
We
just
read
all
the
space
maps
and
reconstruct
them
flat
state
and
now
the
other
question
is
won't
that
make
import
times
after
a
drastic
longer.
You
know
you
do
that
wholly
3d.
Well,
yes,
but
if
we
control
how
many
blocks
this
log
space
maps
have,
we
also
control
the
important
overhead
that
we
exert
at
the
pool.
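That import-time replay can be pictured as follows. This is a hypothetical sketch; the entry format and the function name are illustrative, not the actual ZFS code.

```python
# Hypothetical sketch of crash recovery with log space maps: read every
# surviving pool-wide log space map in txg order and re-apply its entries
# to the in-memory unflushed state of the metaslab each entry belongs to.

def replay_log_spacemaps(logs, metaslabs):
    """logs: {txg: [(metaslab_id, op, segment), ...]}.
    metaslabs: {metaslab_id: (allocs_set, frees_set)}."""
    for txg in sorted(logs):               # replay in txg order
        for ms_id, op, seg in logs[txg]:
            allocs, frees = metaslabs[ms_id]
            if op == "alloc":
                frees.discard(seg)         # alloc cancels a pending free
                allocs.add(seg)
            else:                          # "free"
                allocs.discard(seg)        # free cancels a pending alloc
                frees.add(seg)
    return metaslabs
```

The cost of this replay is proportional to the number of log blocks that survive, which is exactly why bounding the log's size bounds the import overhead.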
B
That
is
that
if
we
flashing
that
order,
all
their
logs
will
start
becoming
obsolete
by
obsolete
I
mean
that
their
entries
would
have
made
it
to
the
meta
subspace
maps,
which
means
that
we
wouldn't
have
to
read
them
during
import
time
and
I'm,
giving
an
example
of
that
just
for
demonstration
purposes.
So
let's
say
we
have
a
tool
like
200
meter,
slabs
right
we
are
currently
at
the
X
T
10.
You
know
you
can
see
metal
slab.
B
One
goes
from
blocks,
let's
say
from
0
to
10
minutes
left
to
from
11
to
20
so
far
and
so
forth.
We
read
into
all
all
of
them.
This
plastics
you
because
we're
in
the
whole
state
and
now
we
are
enabling
the
law
of
space
map
feature
weeks.
For
now.
Let's
say
with
last
one
medicine
practices
for
demonstration
purposes,
so
tipsy
11
comes
by
we're
at
all
our
changes
in
the
these
cool
white
locks
face
lock.
B
We
don't
touch
any
of
the
metal
slams
so
far
right,
so
you
can
see
the
first
two
blocks
are
freeze
from
box
3
2
4
8
9.
That
would
have
gone
to
metal
slap
1.
There
is
like
11
to
12
15
to
18.
We
have
got
2
metal
slab
too
bad.
We
keep
them
all
in
the
log
space
map
and,
as
I
said,
with
lines
in
one
metal
opportunity.
So,
basically,
by
flashing,
it
means
we
are
bending
to
that
metal,
slab
space
map
and
you
can
see
in
gray
over
here.
These
are
the
solid
entries.
B
So
in
the
next
week
she
would
flash
metal
slab
and
we
would
record
all
the
changes
of
the
pool
in
a
new
log
space
map,
and
you
can
see
we
have
gray
entries
in
both
the
old
books.
Based
on
that,
you
look
space
map
because
we've
lost
all
these
things
in
medicine.
So
after
we'll
do
like
a
whole
round
through
all
the
matter,
slabs,
let's
same
200
degrees
from
now,
because
we
have
200
meters
left
the
whole
log
of
2
X
11
is
going
to
build
solid
means.
We
don't
need
it
anymore.
B
There's
one
can
destroy
it.
So
that's
the
idea
behind
like
flashing
in
order,
because
we
can
get
rid
of
old
space
map
blocks.
So
that's
what,
where
we
stopped
last
time
so
now.
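The condition that makes old logs destroyable can be sketched like this. A simplified model, assuming round-robin flushing: a log written at txg T is entirely obsolete once every metaslab has been flushed at some later txg.

```python
# Sketch of why flushing metaslabs in a fixed order lets old logs be
# destroyed: a log space map written at txg T is entirely obsolete once
# every metaslab has been flushed at some txg later than T, because by
# then all of its entries live in the metaslab space maps.

def destroyable_logs(log_txgs, last_flush_txg):
    """log_txgs: txgs at which log space maps were written.
    last_flush_txg: per-metaslab txg of its most recent flush."""
    oldest_unflushed = min(last_flush_txg)  # the laggard metaslab
    return [t for t in log_txgs if t < oldest_unflushed]
```

With one flush per txg and 200 metaslabs, the laggard metaslab trails by 200 txgs, which is why a full round through the pool retires the oldest log.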
B: So now the question is: how many metaslabs should we flush each txg? The trade-offs are the following. If we flush fewer metaslabs, we issue fewer I/Os, but if the incoming rate of log blocks (basically the size of our incoming log space maps) is high, a lot of log blocks accumulate and import times take a hit.
B
If
we've
lost
more,
we
get
rid
of
log
space
maps
faster,
but
we
show
more
iOS
and
we
may
end
up
going
into
this
old
state
of
the
system
where
we're
appending
to
its
meta
stuff
space
map.
So,
as
you
can
see
the
work,
the
problem
is
were
codependent
and
ZFS
needs
to
adapt
to
that.
So
we
need
to
come
up
with
a
heuristic.
B
The
most
knife
here
is
weaken
the
one
that
we
that's
the
basis
for
all.
The
other
cases
that
you
came
up
with
is
the
block
limit.
Heuristic,
specifically,
we
said
like
a
limit
on
the
amount
of
lockbox
that
we
want
to
have
at
any
given
time
in
the
poll,
let's
say
like
a
thousand
lock
box.
So
as
we
have
more
incoming
blocks
and
log
space
maps
whenever
we
exceed
that
limit,
we
start
flashing
meta
slabs
until
we
get
below
that
limit.
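A minimal sketch of that block-limit policy, assuming we know, for each of the oldest logs, how many blocks destroying it frees and how many metaslabs have to be flushed to make it destroyable. Names and numbers are illustrative, not the actual ZFS implementation.

```python
# Naive block-limit heuristic: while the pool holds more log blocks than
# the limit, walk the oldest logs, flushing the metaslabs needed to make
# each one destroyable, until enough blocks have been reclaimed.

def blocks_to_flush(total_log_blocks, limit, oldest_logs):
    """oldest_logs: [(blocks_in_log, metaslabs_to_flush), ...], oldest
    first; flushing that many metaslabs makes that log destroyable."""
    flushed = 0
    for blocks, metaslabs in oldest_logs:
        if total_log_blocks <= limit:
            break                      # back under the limit; stop
        flushed += metaslabs           # pay the flush I/Os
        total_log_blocks -= blocks     # reclaim that log's blocks
    return flushed
```

Because everything happens in the txg where the limit is crossed, a burst of incoming blocks can force a large batch of flushes at once, which is the pathology discussed next.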
B
Basically,
that
limit
ask
access
like
an
upper
bound
in
the
overhead
of
the
important
that
we
accept
the
problem
with
that
heuristic
by
itself
is
that
it's
susceptible
to
lot
of
performance
pathologies.
These
are
some
of
some
sample
results
from
the
same
simulator
that
I
made
where
you
can
see
like
there
were
teeth.
These
are
Whitworth
flossing
basis,
almost
all
the
meta
slabs
in
the
pool
and
the
reason
for
that
will
could
be
many
scenarios.
B
It
could
be,
like
our
incoming
rate,
completely
changed
and
we
were
very
close
to
that
to
our
log,
but
we
had
a
lot
of
incoming
logs
on
that
specificity
and
like
not
a
lot
of
many
other
cases.
Basically,
the
problem
is
that
the
behavior
is
not
consistent
and
not
predictable.
We
can't
reason
about
it
and
because
I
don't
have
time.
This
is
a
lightning
talk.
There
are
a
bunch
of
other
heuristics
I've
open
sourced
my
simulator,
and
you
can
give
them
a
try.
You
can
specify
different
parameters.
B
Try
your
own
things
like
that,
and
I
want
to
talk
about
the
idea
of
heuristic.
The
idea
here
is
to
take
into
the
duration
the
distance
between
the
amount
of
logbooks
that
we
currently
have
and
the
limit
that
we
set.
So
if
we
are
far
away
from
the
limit
you
can
say
like
okay,
we
are
still
good.
We
don't
need
to
flash
as
much
if
we're
close
to
the
limit.
Would
you
say?
Oh
the
system
is
under
pressure.
Maybe
we
want
to
flash
a
little
bit
more.
B
The
second
consideration
is
the
current
incoming
rate,
regardless
of
how
far
you
are
from
the
limit.
If
your
income
in
rate
is
high,
it
means
that
you
can
approach
it
like
very
quickly,
so
you
need
to
take
that
into
consideration.
A
third
thing
is
the
distributions
of
metal
slug
flask
over
a
lot
of
space
map.
B
History
I
will
be
flashing,
a
lot
who
would
be
flashing
little
that's
important,
because,
basically,
our
class
in
history
correlates
to
how
easily
we
can
destroy
old
log
space
maps
and,
finally,
the
distribution
of
log
books
over
our
log
space
monkey.
Sorry,
how
has
the
incoming
rate
beam
in
the
past
few
TX
disease?
B
So
with
this
in
mind,
we
came
up
with
the
running
some
heuristic.
Basically,
the
way
that
this
works
is
that
it's
the
XT
can
take
the
current
incoming
rate
and
we
project
it
in
the
future
that
we
we're
gonna.
Let
make
projections
on
what
we
are
getting
so
far,
man
we're
saying
like
okay,
even
some
take
this.
For
now,
whenever
the
limit
is
exceeded,
how
many
meta
slots
would
we
need
to
flash
to
stay
below
the
limit,
and
then
we
take
the
average
of
these
metal
slabs
over
the
text.
Is
that
we
protected?
B
So
we
can
tell
okay
if
I
start
flashing
from
now
will
I
stay
below
that
limit?
How
many
medisoft
I
need
to
flash
from
now
in
order
to
stay
below
that
limit
on
that
giggety.
So
here's
an
example
that
I
have
you
can
see
that
that
table
is
basically
I'll.
Explain
all
of
this.
These
are
the
history
of
our
log
each
row.
It's
a
log
space
map.
So
the
first
row
is
the
log
space
nugget
of
Dixie
10.
B: So the idea of this whole table and the running sums is, at any point in time, to be able to say: if I wanted to get rid of N blocks, how many logs do I need to destroy, and how many metaslabs do I have to flush in order to destroy those logs? For example, if I just wanted to get rid of the first log, I would flush one metaslab, which would get rid of two blocks; if I flush three metaslabs...
B
It
would
get
rid
of
the
both
these
logs
log,
10
and
11,
and
it
would
last
eight
blocks.
You
get
the
idea
and
that's
why
we
call
them
the
running
songs.
Basically,
so
the
scenario
for
this
example
is
that
we
are
currently
at
txt
16.
You
know,
there's
no
log
over
here
for
60
16.
We
have
24
log
space
maps.
You
can
see
it
on
there
on
exam
right
here
too,
and
we
sent
a
block
limit
of
32
blocks
and
for
now
just
for
demonstration
purposes.
B
We
say
that
we
have
four
incoming
block
space
map
blocks,
so
we
start
running
our
heuristic
in
one.txt.
From
now,
the
limit
is
32
and
we
currently
have
24
blocks
in
our
pool
and
the
incoming
rate
is
4,
so
in
one
take
see
from
now,
we
would
end
up
with
4
blocks
below
the
limit.
The
limit
is
not
exceeded,
there's
nothing
to
do
in
two
ticks
trees.
From
now
who
were
left
with
4
lock
box
below
the
limit,
they,
as
I
said,
we
protect
the
incoming
rate.
B
So
we
get
exactly
to
the
limiting
to
take
this
from
now
now
in
3
take
this
from
now
we,
with
the
same
incoming
rate,
we
exceed
the
limit
right,
so
we
want
to
start
flashing.
So
in
order
to
get
below
the
limit,
we
need
to
free
at
least
4
blocks
right
I'm,
exactly
how
much
we
exceeded
so
the
first
row
in
the
table
with
a
block
running
some.
B
That's
at
least
four
blocks
is
Row
2,
that's
Dixie
11,
so
we
need
to
get
rid
of
Dixie
span
and
the
exist
11
logs,
which
requires
classing
three
meta
slams
and
that
would
release
eight
blocks.
That
would
get
us
below
the
limit.
So
we
need
to
flush
three
metal
slabs
in
three
to
exist.
In
the
future,
so
3
over
3
is
like
1
metal
slab.
Last
vertex
C-
and
you
know
we
can
go
on,
went
below
the
limit
by
4
are
the
incoming
rate.
Okay
was
four,
so
the
limit
was
exceeded.
B
Then
we
exceeded
the
limit
again
and
based
on
our
updated
table
who
say:
okay.
Now
we
need
to
flash
seven
metal
slabs
to
release
six
blocks,
because
we
accept
the
limit
by
four,
so
7
over
5
X
is
gonna,
be
we
need
to
flush
two
metal
stops
by
pixi
in
order
to
stay
below
the
length
and
we
keep
growing
until
we
go
over
our
whole
table
and
by
the
end
we
just
take
that
column.
When
we
take
the
maximum
of
the
meta.
B
Slabs
plastic
see
that
we
calculated
in
the
past,
which
can
pieces
that
we're
gonna
stay
below
the
limit
based
on
the
incoming
rate
of
box.
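Putting the pieces together, the running-sum calculation can be sketched roughly like this. A simplification of what the slides describe, not the actual ZFS code; the running-sum table in the test reuses the numbers from the example above (24 blocks, limit 32, incoming rate 4, logs of 2 and then 8 cumulative blocks).

```python
# Sketch of the running-sum heuristic: project the current incoming rate
# of log blocks over a horizon of future txgs; for each future txg where
# the projection exceeds the block limit, look up in the running-sum
# table how many metaslabs must be flushed to destroy enough old logs,
# average that over the txgs until then, and keep the maximum.

def metaslabs_per_txg(current_blocks, limit, rate, running_sums, horizon):
    """running_sums: cumulative (blocks_freed, metaslabs_flushed) pairs,
    oldest logs first, as in the table described above."""
    needed = 0.0
    for t in range(1, horizon + 1):
        excess = current_blocks + rate * t - limit
        if excess <= 0:
            continue                   # limit not exceeded this far out
        # First running-sum row that frees at least `excess` blocks.
        for blocks_freed, metaslabs in running_sums:
            if blocks_freed >= excess:
                needed = max(needed, metaslabs / t)
                break
    return needed
```

With the example's numbers, the first excess appears three txgs out and requires three metaslab flushes, giving a rate of one metaslab flushed per txg.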
B: So here are some sample simulation results. This is from a hypothetical pool of 300 metaslabs; we set the block limit to be 300 blocks, and the incoming rate was randomly chosen between 10 and 64 blocks.
B
We
want
to
come
up
with
the
same
default,
because
this
limit
indirectly
controls
the
classroom
rate,
so
the
higher
the
limit
is
less
meta
stuff
that
we
need
to
class
the
lower
the
limit,
the
more
pressure
on
the
system.
We
need
to
flash
more
so
the
driving
factor
on
deciding
on
that
or
that
we
want
each
flask
to
count
meaning
we
want.
Every
time
that
we
sent
an
I/o
to
append
to
a
marathon,
we
wanted
block
size
to
be
utilized.
B
The
other
thing
to
take
into
consideration
is
that
meta
stop
space
map
entries
are
more
generally
one
word
while
logs
face.
Biometrics
are
always
two
word
and
the
reason
is
because
we're
talking
about
a
cool
wide
space
map,
we
need
to
log
entries
because
there's
a
field
about
the
video,
so
we
can
be
specific,
which
meta
slab
on
which
we
that
we're
talking
about
so
so
far.
B
We
would
say
that
our
factor
of
log
entries
to
some
medicine
space,
my
matrices,
are
around
four
and
if
that's
not
enough,
well,
there's
also
consolidation,
the
log
space
month,
for
example,
you
may
have
a
block
that
you
allocated
freed
and
then
allocated
again,
but
in
the
meta
subspace
map
it
will
count
as
only
one
entry
as
like
one
block
allocated.
So
the
consolidation
comes
into
play.
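The back-of-the-envelope arithmetic behind that factor of four might look like this. An illustration only: it combines the two-word log entries with an assumed roughly-half share of entries surviving consolidation and obsolescence, which matches the ratio reported later in the talk.

```python
# Illustrative arithmetic for the log-to-metaslab entry factor: log
# entries are twice the size of common metaslab space map entries, and
# roughly half of the log entries are consolidated or obsolete by flush
# time, so one metaslab space map block absorbs about four log blocks.

WORDS_PER_LOG_ENTRY = 2    # pool-wide entries carry a vdev/metaslab field
WORDS_PER_MS_ENTRY = 1     # common one-word metaslab space map entry
SURVIVING_FRACTION = 0.5   # assumed share of log entries still live

factor = (WORDS_PER_LOG_ENTRY / WORDS_PER_MS_ENTRY) / SURVIVING_FRACTION
print(factor)  # 4.0
```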
B
So
now
performance
results.
These
C's
the
number
of
fire
ups
over
five
days,
a
five-day
long
experiment
that
we
ran.
Basically,
what
we
did
is
we
had
two
pools
with
the
same
setup,
but
one
of
them
had
the
log
space.
My
feature
enabled
and
we
started
doing
random,
writes
to
them
until
they
reached
high
fragmentation
and
their
reach,
but
they
also
it's
steady
state
in
terms
of
number
of
ions.
Basically,
there
weren't
any
fluctuation
everything
was
ready.
B
Also,
during
the
course
of
these
experiments,
we
verified
our
assumption
about
the
amount
of
obsolete
entries,
the
log
space
map
at
any
given
time.
So,
basically,
what
you
need
to
emphasize
more
is
this
ratio
over
here,
which
is
around
like
0.5
at
any
given
time,
which
means
that
we
can
verify
our
assumption
about
half
the
entries
being
obsolete
at
any
given
time.
B: Yes; actually, part of these changes was also changing how the number of metaslabs per vdev is decided. We made some changes for each metaslab to be a certain size (I don't remember if it's 8 gigabytes or something like that), but basically yes, with these changes you also have fewer metaslabs.
A: I think in your previous talk, where you talked about the motivation of the whole project, we talked about the fact that if you didn't have this, lowering the number of metaslabs would kind of achieve the same thing, because you're only appending to a smaller number of metaslabs. But that has...
A: ...its own trade-offs: import times were bad, memory usage was bad, the number of I/Os was bad, and yeah, you could trade off between any of those, but you're just going to make another one of them, or two of them, that much worse, and they're already bad. So this lets us get much more flexibility, and the system is much less sensitive to the number of metaslabs, so that trade-off largely goes away.
B: That's exactly it, yes. So basically, what we ended up doing is that for pools that are very small and pools that are very large, we set limits on the number of metaslabs, but in between, we basically have a fixed metaslab size, and we add more metaslabs as your storage capacity grows.
B
That's
a
good
question:
I
haven't
done
like
a
lot
of
testing
in
terms
of
that,
but
so
far
in
because
we
run
with
this
feature
enabled
so
far
since
like
February,
and
we
haven't
found
any
issues
with
it.
The
thing
is
that
if
you're,
if
you
don't
have
a
presentation,
you're
still
dependent
on
taking
coming
right,
like
basically
down
the
flashing
algorithm,
doesn't
work
in
terms
of
fragmentation
right
so
that
ideally
shouldn't
matter.
B: Yes, yes. Actually, that's exactly what made that first block limit heuristic go crazy, and we tried with a lot of different incoming trends. For example, you suddenly have something steady and then you jump, or you go very low; or something that looks like a sine wave, where you kind of have a lot of incoming rate and then it goes down; and also a randomized sine wave.
B: Yeah, it's very interesting, because we spent a lot of time on this and I started reading up on queueing theory, and I couldn't find any model that has this weird correlation between two distinct things, like in our case the metaslabs flushed and the log blocks that we can destroy.
A: Before Serapheim's changes, there's this idea of sync to convergence, where you might be syncing, and then you have to allocate some stuff, and then you have to write out the space maps. You still have that with Serapheim's changes, but because now, in the worst case, one more sync pass means that you're appending to one space map, we actually saw the number of passes required to sync to convergence improve.