From YouTube: December 2020 OpenZFS Leadership Meeting
Description
At this month's meeting we discussed: Forced export; RAIDZ expansion performance; visibility of .zfs/snapshot.
Meeting notes: https://docs.google.com/document/d/1w2jv2XVYFmBVvG1EGf-9A5HBVsjAYoLIFZAnWHhV-BM/edit
A: Looks like we don't have too much on the agenda for this month's December 2020 OpenZFS Leadership meeting. I'll start with just a few announcements and updates on projects we've been discussing. So, OpenZFS 2.0 was released. Congratulations to everyone who worked on it, and thanks.
A: I haven't seen any major issues or problems come up with it, so that's great. And then we also saw dRAID integrated to master.
A: So that's great. Brian, if you're on, I'd be curious to get your thoughts on the next release. I don't see him. All right, well, we can check with him next time. I think we had discussed trying to have a 2.1 release relatively soon.
A: That would include dRAID, which would be kind of an exception to our usual policy of not including major features in minor releases, but I think he had some motivations for doing that, I think.
A: All right. I saw some emails fly by on that, but I haven't...
C: Also, I don't know in what group it was discussed, but with FreeBSD branching in winter and releasing in March, it would be good if 2.1 could be released somewhere in winter. Then we could integrate it.
B: Yeah, so Brian had gone over it and pointed out a couple of minor things, and I'll update that over the weekend, and it'd be great.
B: And, importantly, if you have a second or third or other pools running, they can continue to move forward. Whereas if the unlucky pool happens to suspend while holding things like the namespace lock, then you can't even run zfs list until you deal with the suspended pool, and that, you know, degrades the user experience.
B: Yeah, so it's working. We have a bunch of test cases, but if people have ideas for other test cases, or could take a look at it or try it out, that'd be good. Cool.
A: Just a little bit since our last meeting, so it's pretty close to being fully functional.
A: There are some known issues, like if you crash at the wrong time, or if devices become degraded, like you lose a disk while you're in the middle of the expansion, that we still need to address. But aside from those corner cases, it basically totally works. It's pretty slow for some things, but those are all things that we know how to address, and we're working on them.
A: So if folks want to do more testing on that, that would be great. I was planning to get out, like, an alpha... I guess the current one is an alpha release. I was planning to put out, like, a beta by the end of the year that would draw more attention to the fact that we've made all this progress, yeah.
A: It should work. So the latest things that we've done are getting it so that, after you do the expansion, the writes are happening with the new data-to-parity ratio, which means that it takes up less space on disk, and also the performance is better: you get basically the normal performance. Whereas when you're reading or writing things that are, like, misaligned, then it has to generate a lot more ZIOs. There's actually some work that I'm...
A: What I'm working on right now is splitting out some of the performance work to integrate separately, so you'll see a review for that soon. What I'm addressing is one of the big problems after you do the expansion: when you're doing a read, and you're reading a block that was written before the expansion, it's, like, kind of going diagonally across the disks, because it was allocated originally as four wide and now you have five disks; then the parity is, every sector...
A: The parity is changing to a different disk, rather than, like, the parity all being on one disk for that given block. So the way we deal with that is by creating a bunch of ZIOs, like one for each sector, essentially, and then, like, stitching it all back together, which works great and integrates well with the existing code.
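To make the "diagonal" picture concrete, here is a toy C model (not OpenZFS code; the layout math is simplified from the description above) of what happens to a block written on a four-wide RAIDZ after the vdev is reflowed to five disks: sectors keep their allocation order but land row-major across the new width, so each row of the block starts on a different disk, and a naive read path ends up issuing one child I/O per sector.

```c
/*
 * Toy model, not OpenZFS code: where do the sectors of a block that was
 * written on a 4-wide RAIDZ land after the vdev is reflowed to 5 disks?
 */
#include <stdio.h>

int
main(void)
{
	int old_width = 4, new_width = 5;
	long block_start = 0;	/* linear sector index where the block began */
	int block_sectors = 8;	/* two old rows of 1 parity + 3 data */

	for (int i = 0; i < block_sectors; i++) {
		long lin = block_start + i;
		printf("sector %d: disk %ld row %ld  ->  disk %ld row %ld\n",
		    i, lin % old_width, lin / old_width,
		    lin % new_width, lin / new_width);
	}
	/*
	 * Because old_width != new_width, consecutive rows of the block
	 * no longer start on the same disk, which is why the current
	 * read path creates one ZIO per sector and stitches the
	 * results back together.
	 */
	return (0);
}
```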
A: But, just, man, allocating and de-allocating all of those ZIOs, and even all of the ABDs associated with them, is very, you know, it's very slow; it uses a lot of CPU to do that. Because you're talking about, for a 128K I/O, let's say: if the layout matches, like the logical and physical match, then you're only dividing that 128K across the five disks, so you have like five ZIOs. Versus, you know...
A: The worst case is, like, you have a 128K I/O and you have ashift nine; then you have like 256 ZIOs and ABDs that you're allocating (128 KiB split into 512-byte sectors is 256 of them), which just takes a lot more CPU. Oh...
A: Yeah, I've done a bunch of benchmarking on that, and we have a bunch of work in progress that's, like, kind of rough-and-ready quality, but it works, and it improves the performance to be pretty good. I don't remember the numbers off the top of my head, but it was like more than half the normal performance. So, you know, it wasn't all the way to the normal performance, but it was pretty close. And this is, I think, a system with like eight CPUs driving...
A: I forget how many... driving a bunch of SSDs. So, you know, throwing more CPUs at it would improve things further.
A: So the work I'm breaking out right now is basically being able to, like, pre-allocate the ABD and have it embedded in another data structure. So, when you're creating an ABD that's just, like, referencing another one, you can provide the actual abd struct that it'll initialize in place. That way, you don't have to do the kmem alloc for the abd_t itself.
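A rough sketch of that idea in C (all names here are invented for illustration and are not the actual OpenZFS ABD API): the caller embeds the abd_t in its own per-sector structure, and the ABD code fills it in as a view into an existing buffer, so no separate allocation is needed.

```c
/* Sketch only: hypothetical names, not the real OpenZFS ABD API. */
#include <stddef.h>

typedef struct abd {
	struct abd *abd_parent;	/* buffer this view references */
	size_t abd_offset;	/* offset of the view into the parent */
	size_t abd_size;	/* length of the view */
} abd_t;

/*
 * Initialize a caller-provided abd_t in place as a view into 'parent'.
 * No kmem alloc: the struct's memory belongs to the caller.
 */
static void
abd_init_view(abd_t *abd, abd_t *parent, size_t off, size_t size)
{
	abd->abd_parent = parent;
	abd->abd_offset = off;
	abd->abd_size = size;
}

/* Per-sector bookkeeping with the abd_t embedded in it. */
typedef struct sector_info {
	abd_t	si_abd;	/* embedded; initialized in place, never malloc'd */
	int	si_col;	/* which child disk the sector lives on */
} sector_info_t;

static void
setup_sectors(sector_info_t *si, int nsectors, abd_t *disk_buf,
    size_t sector_size)
{
	for (int i = 0; i < nsectors; i++) {
		abd_init_view(&si[i].si_abd, disk_buf,
		    (size_t)i * sector_size, sector_size);
		si[i].si_col = i % 5;	/* toy placement across 5 disks */
	}
}

int
main(void)
{
	abd_t disk_buf = { NULL, 0, 64 * 512 };
	sector_info_t si[64];

	setup_sectors(si, 64, &disk_buf, 512);
	return (0);
}
```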
A: Yeah, that's all handled. I mean, this type of ABD is just the one that references another one, so we aren't, like, setting up any pages; there isn't any, like, memory that's actually exclusive to that ABD. Because, like, to improve the performance that I mentioned...
A: Basically, what we do instead is we do the aggregation at the RAIDZ level. So that, instead of creating one ZIO for each 512 bytes, for each ashift-sized sector, we create one ZIO for each disk, just like we did before. But we still want to use the existing, like, RAIDZ parity generation and reconstruction logic.
A: So we need ABDs that point into each sector of that, and those are the ones that I'm talking about, like, pre-allocating. So, with the new way of doing it, you only have, like, the number-of-disks number of ZIOs, and you have a lot of ABDs, but with this change I'll be upstreaming soon, they're very, very lightweight.
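Putting the two pieces together, here is a toy C program (again with invented names, sketching the approach as described rather than the real code): one aggregated read per disk, plus cheap per-sector "views" into each disk's buffer for the parity and reconstruction logic, instead of one heavyweight ZIO per 512-byte sector.

```c
/* Toy sketch with invented names; not the real RAIDZ code. */
#include <stdio.h>
#include <stdlib.h>

#define	NDISKS	5	/* 4 data + 1 parity column (raidz1) */
#define	SECTOR	512	/* ashift = 9 */

typedef struct view {
	char	*base;		/* the per-disk buffer */
	size_t	off, len;	/* this sector's slice of it */
} view_t;

int
main(void)
{
	size_t data = 128 * 1024;		/* 128K logical block */
	size_t per_disk = data / (NDISKS - 1);	/* 32K read per column */
	size_t nsect = per_disk / SECTOR;	/* 64 sectors per column */
	char *bufs[NDISKS];
	view_t *views = calloc(NDISKS * nsect, sizeof (view_t));

	for (int d = 0; d < NDISKS; d++) {
		bufs[d] = malloc(per_disk);
		/* issue ONE read per disk here: 5 child I/Os total */
		for (size_t s = 0; s < nsect; s++) {
			/* no I/O and no buffer copy: just a pointer */
			views[d * nsect + s] = (view_t){ .base = bufs[d],
			    .off = s * SECTOR, .len = SECTOR };
		}
	}
	printf("%d child I/Os, %zu lightweight per-sector views\n",
	    NDISKS, (size_t)NDISKS * nsect);
	for (int d = 0; d < NDISKS; d++)
		free(bufs[d]);
	free(views);
	return (0);
}
```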
C: Okay, yeah, we'll see. Still, fifty percent sounds scary to me, but maybe for somebody who is less centered on performance, it...
A: ...useful. I'll look up the numbers, too, and I can give you the real numbers. Speaking of ABDs, unrelated...
A: Yeah, yeah, so that means, like, the overhead is one-eighth of what I was measuring there, since a 4K sector covers eight 512-byte sectors.
C: Well, even in production, we have right now one case we are investigating, and they have performance problems, with CPU usage going to about 50% on quite decent hardware. And there we have something like 16K... zvols on top of a five-wide, or 16K zvols on top of a five-wide RAIDZ. So even chunking at 4K or 8K is pretty significant; it's a significantly increased number of I/Os, overhead. So even that is not perfect and great, yeah.
A: But you should get... but with RAIDZ expansion, you should still get better performance than what you're talking about, as long as you have a large block size, because of the, like, integrated aggregation at the RAIDZ level. So you won't be doing ZIOs for every 4K; you'll only be doing the ZIO, you know, for the whole block.
A: Yeah, so that's the big one, because, I mean, like, ZIOs are way, way higher overhead than ABDs. Even so, without that change, you know, you're talking about like a tenth of the performance with ashift nine. So... oh.
C: Yeah, that sounds better. It's just, it's been a while since I looked at the expansion code last time; I don't remember it being there.
A: Yeah, that's new! It's in a PR against the expansion code; it's not in the main expansion code yet, so that's why you haven't seen it. Yeah, if I recall correctly, with the 4K sector size, the performance of those reads was, like, very close to without the expansion. It was like...
A: You know, you're paying for the one extra disk's data, because you have to read from all the disks, not just the ones that have the data. Right, because, you know, if you have like a five-wide RAIDZ1, normally a read would just be reading from four disks, but post-expansion you have to read from all five of the disks, because they all have data, and, and...
A: Yeah, so there's a little bit of overhead, but I think that, with the 4K sector size, it was pretty minimal, like less than 10%...
A: ...overhead, and that was of overall performance, with the eight CPUs. I'm sure that the CPU usage is higher, so there's still some more, you know, stuff to be measured there, I think, for your use case.
A: There's also work to be done on the performance of the expansion itself, which is very slow now, because it's doing a sector at a time, like creating a ZIO for every sector, and I'm not sure if we will get to addressing that before integration.
A: But since that's kind of a temporary thing... "oh, your expansion takes a long time", you know, oh well.
C: Since we don't have other things, I have one quick FreeBSD question. Does anybody know why we have a tunable for the ABD block size? Because, obviously, setting it less than the page size makes no sense, since it automatically drops to smaller blocks when even needed, and setting it higher is just a killer for the VM subsystem, for the UMA caches.
C: What I think would be interesting to do with that is some kind of ABD magic with the GEOM layer, so that we could skip some copying and write directly from the ABD.
B: Thanks, but I had a similar question, actually. I was looking at a kind of... well, I don't know if it's actually a regression, but a user reported, after upgrading to OpenZFS 2.0 on FreeBSD, that the .zfs directory doesn't work the same as it used to, specifically in a jail.
B: So when I looked in the code, in the, you know, .zfs control directory stuff: when you go into .zfs/snapshot on FreeBSD, there is a comment that "this is nasty", and then it changes the thread credentials to kcred, to basically root's credentials, and then does the mount, and then switches it back to the original credentials, so that a regular user can access the .zfs/snapshot directory. Whereas normally users can't mount, and if you enable usermount, there's a requirement that the user has to own the directory they're mounting on, and obviously the user doesn't own .zfs/snapshot/<name of the snapshot>.
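As a toy model of the hack being described (simplified; not the actual FreeBSD/OpenZFS source, whose details live in the .zfs control-directory code): the thread's credential pointer is swapped to root's credentials just long enough to perform the snapshot automount, then restored, so every other permission check still sees the real user. The catch, discussed next, is that this fakes only the thread's credentials, not the process's.

```c
/* Toy model of the credential swap; simplified, not the real code. */
struct ucred {
	int	cr_uid;
};

static struct ucred kcred_storage = { .cr_uid = 0 };	/* root */
static struct ucred *kcred = &kcred_storage;

struct thread {
	struct ucred	*td_ucred;
};

/* Stand-in for the real mount call, which checks td->td_ucred. */
static int
do_mount(struct thread *td, const char *snapname)
{
	(void) snapname;
	return (td->td_ucred->cr_uid == 0 ? 0 : 1);	/* EPERM otherwise */
}

static int
snapdir_automount(struct thread *td, const char *snapname)
{
	struct ucred *saved = td->td_ucred;
	int error;

	td->td_ucred = kcred;		/* "this is nasty": become root */
	error = do_mount(td, snapname);	/* mount the snapshot */
	td->td_ucred = saved;		/* restore the caller's creds */
	return (error);
}

int
main(void)
{
	struct ucred user = { .cr_uid = 1001 };
	struct thread td = { .td_ucred = &user };

	/* Succeeds even for uid 1001, thanks to the temporary swap. */
	return (snapdir_automount(&td, "mypool@today"));
}
```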
B: Some recent cleanup changed the in-global-zone macro to use curproc instead of curthread, so that code would need to be updated to fake the credentials of the process instead of the thread. But the question is: is it okay to do that? It's how FreeBSD has always done it, it seems, but does it make sense to allow regular users to be able to mount snapshots, and especially to do that from inside of a jail?
D: Does FreeBSD have the notion of, like, delegating datasets to a jail?
B: It does, and then they could mount it normally there. I don't actually know how the .zfs stuff responds when it's a delegated dataset. In this particular user's use case, the dataset is only mounted on the host and just happens to be inside the chroot that the jail is based on, so the jail itself doesn't have visibility of that dataset; like, you can't zfs list the dataset that it's ending up mounting.
D: Yeah, I'd have to go back and look on illumos for what it does for zones. Or, actually, the other thing... because my initial thought would be: if it's delegated, then yeah, it should work. But if it's just, you know, a subdirectory of a dataset, you know, it's maybe a little different. But I can't remember now; it's one of those things where you just never think about it. How...
B: Yeah, I never thought about how, you know, a user in a jail being able to access .zfs/snapshot worked, or the fact that nothing ever seems to unmount those. They get mounted with the ignore flag, so the regular mount command doesn't display them, but if you do mount -v, you can see the list of them.
B: It also came up because, while testing this, I broke SNMP on one of our servers, because when there were over a thousand snapshots mounted, the UDP buffer wasn't big enough to send back the list of all the file systems on the machine.
D: Does the .zfs work, though, if, for example, just part of the dataset is visible in the zone? It doesn't seem to... so, I mean, just as...
B: In this case, the whole dataset is inside the zone's chroot; like, its mount point is inside the chroot, but the dataset is not delegated to the jail.
B: And so the .zfs is visible; it's just that, because the credential hack isn't working right now, if you go in there and try to, you know... you can ls and see the list of snapshots, but if you do ls -l, you can't actually stat the directory, because you don't have permission, basically, because it fails to mount it when it tries to mount it to get the stat.
C: I'm sorry, I just want to say that, while we're not using jails, no delegation to jails, we're using previous versions of files for Samba, right.
B: I think it's only in master; like, I think it was October, maybe.
B: Yeah, but it was you guys that broke it, by fixing something else.
B: She made the changes for the in-global-zone check to be able to not go through, like, two functions and three macros to find out which jail you're in; a much more direct route. It looks much nicer. It's just that the hack only fakes the credentials for the thread, not the whole process, and now we examine the credentials of the process.
A: In terms of, like, how it should work: I think the .zfs/snapshot directories work over NFS, yes? So, like, it makes sense that it should also work locally, without having any special permissions, by default.
B
Yeah,
I
think
that
it
seems
to
make
sense
and
yeah,
like
you
said
nfs
and
the
the
samba
use
case
are
both
pretty
big
yeah.
A: By the way, Alexander, I looked up the numbers, and I was getting 85% of the normal performance, with all the performance improvements, for the reads post-expansion, and that's with the 512-byte sector size. So that's probably a worst case.
C: Yeah, even if it's 85% of performance, not of spent CPU, that's not as terrible, especially if it's for the worst case; who could tell, yeah? I... same as I always discourage people from removing vdevs unless they really need it, I guess it'll still remain: it has its shortcomings, but it's good to have it, even so.
A: Cool. What other topics do we have to discuss?