From YouTube: DRAID by Isaac Huang
Description
From the 2017 OpenZFS Developer Summit:
http://www.open-zfs.org/wiki/OpenZFS_Developer_Summit_2017
So today my talk is about dRAID. It's a new vdev driver for ZFS. It stands for Parity Declustered RAID, and it's essentially a software RAID implementation.
I guess the first question people have about dRAID is: since we already have a great software RAID implementation, which is RAIDZ, why bother with something else? After all, in the ten years or so of ZFS we haven't got any new vdev driver. RAIDZ, as we all know, has many great features.
It gets rid of the write hole problem without requiring any expensive special hardware, it's self-healing, and it has this famous feature of being able to resilver an empty pool in less than a second. However, there is one big problem with RAIDZ: the resilver time can be really slow and painful.
The table here comes from data gathered from the community about RAIDZ pool resilver times on spinning disks, because, as a developer, I don't have production pools to play with.
Some of the data fields are missing, but the most important data is the speed in the rightmost column, which is measured as the write throughput of the replacement drive; I think that is a good measure of resilver speed in general. The actual speed can be affected by many factors, like the age of the pool, fragmentation, and so on, but in general we all agree that the speed could use a lot of improvement. The new dRAID vdev driver solves this resilver problem while at the same time providing similar features.
As we will see: for example, the same parity levels, no write hole, it runs on just commodity hardware, and it's self-healing as well. Before we look at the details of how it works, we need to take a closer look at why resilver for RAIDZ can be quite slow, and there are actually three major reasons for that.
The first one is that the resilver speed is limited by the write throughput of a single replacement drive. No matter how you queue the I/O or feed the drive, there is a hard limit: you can't go faster than a single drive can write. There's also a similar limit on the read throughput during reconstruction, because when a RAIDZ vdev is resilvering,
it needs to read from all the healthy child drives and reconstruct the lost data. And apparently, if you have more child drives under a RAIDZ vdev, there is more aggregate read throughput available to the reconstruction.
But unfortunately, a single RAIDZ vdev does not scale to a large number of child drives. For example, if you have 40 drives, you typically don't configure them as a single 40-drive RAIDZ2 vdev, for various reasons.
For one, it would split a large block into many small I/Os to all the child drives. For example, on a single 40-drive RAIDZ2, a 128K block would be stored as 32 4K sectors across 32 data drives, plus 4K parity sectors on the parity drives, which is certainly not a good recipe for performance.
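To make that arithmetic concrete, here is a minimal Python sketch, my own illustration rather than ZFS code; the 4K sector size and drive counts are assumptions taken from the example above:

```python
SECTOR = 4096   # assume 4K sectors

def per_drive_io(block_bytes, ndata):
    """How many bytes of one block land on each data drive of a wide RAIDZ stripe."""
    sectors = block_bytes // SECTOR                 # 128K -> 32 sectors of data
    drives_used = min(sectors, ndata)               # the block may not even span all drives
    sectors_per_drive = -(-sectors // drives_used)  # ceiling division
    return drives_used, sectors_per_drive * SECTOR

used, io = per_drive_io(128 * 1024, ndata=38)       # 40-drive RAIDZ2 -> 38 data drives
print(f"{used} data drives get a ~{io // 1024}K I/O each, plus the parity sectors")
# -> 32 data drives get a ~4K I/O each: lots of tiny, seek-heavy I/Os per block.
```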
So now we have a clear understanding of why the RAIDZ resilver can be slow. Let's look at how dRAID solves these problems. The first problem we need to look at is the block pointer tree traversal.
First, we need to understand why resilver needs to scan the block pointers in the first place. The graph on the left shows a 5-drive RAIDZ1: each column is a drive, and there are four blocks shown in different colors.
The first thing we notice in this graph is that the stripe width is actually variable. The first block, the yellow one, consists of two 4+1 parity stripes, so its stripe width is five, but the stripe width of this block is four, and of this one only two. So apparently the resilver process needs this knowledge in order to rebuild the lost data and parity in case of a drive failure.
Imagine that the graph on the left were all of the same color: there would be no way to figure out where the data and parity sectors are. Unfortunately, such information can only be found in the block pointers, and that's the fundamental reason why resilver needs to scan every single block pointer in the pool in order to do the resilver.
Now, how does dRAID solve this problem? If we look at the first block, which spans the first two rows, it actually looks very much like traditional RAID, where the parity and data sectors are at known positions. It would be nice if all blocks were like this; then we wouldn't have this block pointer scanning problem at all. And if we look at the last block, there is an X at the end, which is a skip sector, basically padding that ZFS adds to certain blocks for allocation purposes.
Now let's imagine that we add a little bit more padding, more than a single sector where needed, to make every block span whole 4+1 stripe widths. Then we arrive at the graph on the right, which shows the same four blocks on dRAID. In this graph, first of all, we apparently use a lot more space, about 25% more than RAIDZ1; we will get to that point later. But for now, here is the key point.
Even if this graph were not drawn in different colors, which means even if we didn't have the block pointers, we could still do the resilver without any problem. For example, if drive B here has failed, we can simply read the surviving sectors of each row from drives A, C, D, and E and reconstruct the lost data onto a new drive. This is essentially how dRAID gets rid of the block pointer tree scanning problem: we don't have to know where the block boundaries are.
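As a rough illustration of why fixed row geometry is enough, here is a toy Python sketch of a row-by-row rebuild assuming single (XOR) parity; the real dRAID code is C inside ZFS and uses the RAIDZ parity math, so this is only a conceptual model:

```python
# Toy model of a sequential dRAID-style rebuild with single (XOR) parity.
# The geometry is fixed per row, so no block pointers are needed to rebuild.

def rebuild_column(rows, failed):
    """rows: fixed-width rows of byte values; failed: index of the lost column."""
    rebuilt = []
    for row in rows:
        # XOR of all surviving columns reproduces the missing one,
        # regardless of where block boundaries fall inside the rows.
        x = 0
        for col, val in enumerate(row):
            if col != failed:
                x ^= val
        rebuilt.append(x)
    return rebuilt

# 5 columns per row (4 data + 1 parity); the parity column already holds the XOR.
rows = [[1, 2, 3, 4, 1 ^ 2 ^ 3 ^ 4],
        [9, 8, 7, 6, 9 ^ 8 ^ 7 ^ 6]]
print(rebuild_column(rows, failed=1))   # -> [2, 8], the lost column's contents
```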
There are two nice side effects of this. The first one is that since we don't have to scan block pointers, we can just scan the vdev from beginning to end in totally sequential order, because we are not limited by the order in which blocks are found during the tree traversal. The second, as we just talked about, is that we don't have to read two blocks here to reconstruct two blocks there, and then read one more to reconstruct another one: we just go row by row.
So to summarize, we get rid of the block pointer tree traversal problem by inflating the allocated size of the blocks a little bit, and that gives us a completely sequential scan. From this point on, I'll call this mechanism "rebuild", because there are apparently lots of differences between it and the resilver.
So this gives us a completely sequential rebuild mechanism, just like traditional RAID, where we can simply rebuild from the beginning to the end in sequential order. But since we are part of ZFS, we can apparently do better than just acting like a traditional RAID that rebuilds from beginning to end, because the RAIDZ resilver can skip free space: it scans block pointers, and no block pointer points into free space. The dRAID rebuild can do something similar by consulting the space maps, so it also only touches allocated space.
Another feature is the write hole: it does not require any special hardware, because RAIDZ solves this problem by getting rid of partial-stripe writes, by implementing variable stripe width. A consequence of that is that the resilver has to scan the block pointers. dRAID solves it in a similar way, because there are still no partial-stripe writes even though we have a fixed stripe width: we always round up the allocation size to full stripes.
So now we've solved the first problem, the block pointer tree scanning, but there are still two other problems with rebuild speed: the single replacement-drive bottleneck, and the fact that a RAIDZ vdev does not scale to a large number of child drives, so the aggregate read throughput is not scalable either. dRAID solves these two problems, the read and write throughput bottlenecks, by using a mechanism called parity declustering.
There is a lot of complexity in this mechanism, but today I will just illustrate the high-level idea using an example with 11 drives. If we have 11 drives, in the RAIDZ way we could configure them as two 5-drive RAIDZ1 vdevs plus one hot spare. The numbers in each block are the drive number; here every single column is a physical drive, so all blocks in a column carry the same number, but just for ease of comparison
I put a number in every single block. In this graph it's clear that if we have a drive failure, for example if drive 0 has failed, we're going to read from drives 1 to 4 to reconstruct the lost data and parity, and that is going to be written to the hot spare, drive 10. So it's clear that the speed is limited by the write throughput of the single spare drive and by the aggregate read throughput of only four drives. Also, we have 11 drives.
After one drive failure we should have 10 healthy drives, but here we're only utilizing five of them to do the resilver I/O, so half of the healthy hardware resources are not in use at all.
Now, if we look at every row in this graph, every row is actually a permutation of blocks from the 11 drives. But since in this case every column is a physical device, every row is using the same permutation of the blocks.
The declustering process essentially shuffles these permutations in a certain way, while keeping the parity groups and the spare positions at fixed logical locations, that is, in the same columns. If we shuffle the blocks in each row like this, we end up with the graph on the right, which is an 11-drive dRAID1. It'll be easier if we look at the spare first: it is no longer a single physical drive.
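Here is a toy Python sketch of the idea, with assumptions of my own: a pseudorandom per-row permutation (the real dRAID maps are precomputed and stored with the vdev configuration), two 4+1 groups and one spare column held at fixed logical positions:

```python
import random

NDRIVES = 11   # 11 physical drives: two 4+1 groups + 1 distributed spare per row

def row_permutation(row):
    """Toy per-row permutation: perm[logical_column] -> physical drive.

    Logical columns 0-4 form group 0, 5-9 form group 1, and 10 is the spare.
    Only the logical-to-physical mapping changes from row to row."""
    perm = list(range(NDRIVES))
    random.Random(row).shuffle(perm)   # deterministic shuffle per row
    return perm

# Which physical drives does rebuilding failed drive 0 touch in each row?
for row in range(4):
    perm = row_permutation(row)
    logical = perm.index(0)            # drive 0's logical column in this row
    if logical == 10:
        print(f"row {row}: drive 0 holds the spare column, nothing to rebuild")
        continue
    group = range(0, 5) if logical < 5 else range(5, 10)
    reads = [perm[c] for c in group if perm[c] != 0]
    print(f"row {row}: read drives {reads}, write to the spare on drive {perm[10]}")
```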
It consists of blocks from all the drives in the pool. So it's easy to see that if we rebuild the blocks from a failed drive and write them into this logical spare, we distribute the write workload across all the drives instead of a single drive. Now let's take a closer look at the reads. Again, say drive zero has failed. If we look at the first row, drive zero holds a spare block there, so there's nothing to do.
In the first row we don't have to do anything, because drive zero's block is a spare block that is not in use. If we proceed to the second row, we can see that we're going to read from drives three, nine, eight, and seven, and write the lost block into that row's spare block. If we move on to the next row, it's clear that we are reading from a different set of drives. And so on, row by row, all the way down.
The table on the bottom shows the rebuild I/O distribution over all the drives. The first row shows the number of stripe units read from each drive, and the second row shows the writes. Clearly we are distributing the rebuild workload equally among all the surviving drives. There are actually two very important things this table does not show. The first is that this table shows the rebuild I/O distribution after this particular drive failed; if any other drive fails, the
rebuild I/O distribution is going to be the same. The second is that although we are using only eleven drives here, this mechanism actually scales to a much larger number of drives; due to limited space, I just can't easily show an 80-drive example here.
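Continuing the same toy model (again an illustration, not the actual dRAID permutation maps), a quick simulation suggests how the rebuild reads and writes spread out once many rows are shuffled independently:

```python
import random
from collections import Counter

NDRIVES, NROWS = 11, 100_000
reads, writes = Counter(), Counter()

for row in range(NROWS):
    perm = list(range(NDRIVES))
    random.Random(row).shuffle(perm)     # same toy per-row permutation as above
    logical = perm.index(0)              # failed drive 0's logical column this row
    if logical == 10:                    # drive 0 holds the spare column: skip
        continue
    group = range(0, 5) if logical < 5 else range(5, 10)
    for c in group:
        if perm[c] != 0:
            reads[perm[c]] += 1          # 4 reads per row, from drive 0's group
    writes[perm[10]] += 1                # 1 write per row, into the spare column

print("reads :", [reads[d] for d in range(1, NDRIVES)])
print("writes:", [writes[d] for d in range(1, NDRIVES)])
# Every surviving drive ends up with roughly the same read and write counts.
```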
So to summarize, we solve the read and write throughput bottleneck problems by distributing the read and write workload among all the drives, instead of just the drives that belong to a single redundancy group. And if we go back to this graph, we can also see that each parity group consists of one parity plus four data blocks, but at the same time the dRAID vdev has eleven drives. So dRAID essentially decouples the redundancy group size from the number of child drives. This is unlike RAIDZ:
if you configure an eleven-drive RAIDZ1, you're going to get ten data drives and one parity drive; the redundancy group is tied to the width of the vdev. dRAID completely decouples the two, and that's how it can scale to a large number of drives without splitting every large block across too many drives.
So far everything looks pretty good on paper, so I'm going to do a live demo on real hardware using the current code. I'm going to do the demo with a 43-drive dRAID. It's configured as four redundancy groups, each with eight data drives and two parity drives, plus three distributed spares. The pool is only very lightly used, because I have to squeeze this demo into the talk. Each drive is capable of doing about 150 megabytes per second of I/O, which is an important figure to keep in mind.
This is the zpool status output. We have a dRAID2, so double parity, but unfortunately we currently can't see the number of data drives per parity group, because the zpool command doesn't show that yet. We have 43 drives and they all belong to a single vdev. The interesting part here is the spares: we have three spares, but they don't point to any physical drive, because they consist of spare blocks that come from all 43 drives.
We can see from the top graph that all the drives are doing reads. The load is not perfectly equal among all the drives; we'll talk about that in just a little bit. The bottom graph shows the write load, and basically all the drives are doing writes as well. The per-drive write throughput looks a little bit low, somewhere under 20 megabytes per second, but keep in mind that there are 42 healthy drives doing those writes.
Here we can see that the drive we offlined has been replaced by a distributed spare, and if we go down, we see that this spare is now in use. Basically it acts just like a normal spare, except that it can only be used to replace a drive in its parent dRAID vdev and nothing else. So now, let's go back to the slides.
We have shown that the rebuild completed in about thirty seconds. That means the aggregate read throughput is over four gigabytes per second, which is much more than ten drives can deliver: if this were RAIDZ, only ten drives, actually only nine, would be doing the reads instead of forty. What's even more interesting is the write throughput.
The aggregate write throughput of the rebuild process is actually nearly 500 megabytes per second. This is a very important figure, because we said before that a single drive can do at most about 150 megabytes per second of writes. Now, we have these 43 drives, so it would be great if we could do a comparison by configuring the same 43 drives as four RAIDZ2 vdevs,
each with ten drives, plus three hot spares, put the same data on the pool, and compare the resilver speed. Unfortunately we don't have the time to do that today, but fortunately we don't have to, because we know that the RAIDZ resilver is going to write the lost data and parity to a single replacement drive, which can do at most 150 megabytes per second. So even if it were able to resilver at that full speed, we would still be at least three times faster.
What is not shown here is that this mechanism actually scales to more drives. Say we have 80 drives: if we configure them similarly, the read and write workload is still going to be evenly distributed over all the drives. But if you do the same thing with RAIDZ, configuring it as eight 10-drive RAIDZ2 vdevs, the speed is still limited by a single drive. So this rebuild process is not only faster than the RAIDZ resilver, it also scales to a much larger number of drives.
The last figure of interest here is the combined read/write throughput of a single drive, which is about 115 megabytes per second. This is lower than what the drive is capable of, which is about 150, and the main reason is that this pool is only 2% full. Due to the nature of parity declustering, when you have so many drives but so little data, it's kind of hard to distribute the load across the drives completely equally. That's why we are not fully saturating the drives here.
So far I've been painting a very rosy picture of dRAID. It offers the same features: the same parity levels as RAIDZ, no write hole, no special hardware required, and at the same time a fast, scalable rebuild mechanism. And it's free software, so you don't have to pay us a dime to use it. But there are some costs to consider. The first one we have seen previously: there is space inflation.
As far as this graph shows, we are using 25% more space than RAIDZ. When we compare the space inflation, the first thing to keep in mind is that we are comparing dRAID to RAIDZ, because they offer similar functionality; we're not comparing dRAID inflation to zero inflation. So if we look at that block, the green block here, dRAID adds two sectors of padding, but RAIDZ also adds one, so dRAID actually introduces
only one additional sector. From this graph it's also very clear that if your workload, your data, consists of many very small blocks, the cost of inflation is going to be very high. On the other hand, if you have mostly larger blocks, the cost gets amortized and it won't look this terrible. Now we should look at two special cases. The first special case is the first block: we can see that this block does not need any padding at all.
This is because we only need to pad the physical size of the block up to a multiple of D sectors, where D is the number of data drives in a redundancy group. In this case, assuming 4K sectors, we only need to pad the size up to a multiple of 16K. Now, if you do not use compression, the ZFS logical block size is the same as the physical block size, which is always a power of two.
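A small sketch of that padding rule, under my own assumptions of 4K sectors and a 4-data + 1-parity redundancy group as in the slide (this is not the actual ZFS allocation code):

```python
SECTOR = 4096   # assume 4K sectors
D = 4           # data drives per redundancy group (assumption from the slide)
P = 1           # parity drives per group (assumption from the slide)

def draid_asize(psize):
    """Allocated size: pad the data up to a multiple of D sectors, then add parity."""
    data_sectors = -(-psize // SECTOR)          # ceil(psize / sector)
    padded = -(-data_sectors // D) * D          # round up to a multiple of D
    groups = padded // D
    return (padded + groups * P) * SECTOR, (padded - data_sectors) * SECTOR

for psize in (4096, 16 * 1024, 128 * 1024):
    asize, pad = draid_asize(psize)
    print(f"{psize // 1024:>4}K block -> {asize // 1024}K allocated, {pad // 1024}K padding")
# A 4K block needs 12K of padding (the worst case on this slide); 16K and 128K need none.
```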
If that is the case, then it is a perfect workload for dRAID, because any block of 16K or larger requires zero padding, so there is zero inflation for such blocks. Then, if we look at this block, the gray one, this is apparently the worst case: it is a single-sector block and we add three sectors of padding to it. If we look at the whole picture, dRAID adds six sectors of padding in total, and this block alone accounts for three of them.
So if we can somehow mitigate this special case, this worst case here, if we can get rid of the padding for these blocks, we cut the inflation in half, as far as this graph shows, and that's exactly what dRAID does. In dRAID, and thanks to the previous talk I don't have to explain what a metaslab is, most of the metaslabs are laid out exactly like the graph we showed on the left, which is parity plus data. But there are some metaslabs
that we configure as mirrors, so that for this worst-case block, instead of saving it as P0 plus D0 plus three sectors of padding, we save it in a different place, in a mirror metaslab, as D0 and D0, that is, twice. That way we totally get rid of the space inflation for this case, and for this graph we cut the overall inflation in half. Also, for writes we don't have to compute P0, and for reads we can read from either copy of D0, so we get better IOPS for reads.
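A toy sketch of that allocation choice; the metaslab names and the single-sector threshold here are my simplifications, not the actual dRAID policy:

```python
SECTOR = 4096
D, P = 4, 1     # data/parity drives per redundancy group (assumption)

def allocate(psize):
    """Pick a layout for a block: mirror the tiny ones, parity-stripe the rest."""
    sectors = -(-psize // SECTOR)
    # A single-sector block laid out as parity+data would cost P + D sectors
    # (P0, D0 and 3 sectors of padding); a 2-way mirror costs only 2 sectors
    # and needs no parity computation, so such blocks go to a mirror metaslab.
    if sectors == 1:
        return "mirror metaslab: D0, D0"
    padded = -(-sectors // D) * D                    # round data up to D sectors
    return f"parity metaslab: {padded // D * P} parity + {padded} data sectors"

for size in (4096, 8192, 64 * 1024):
    print(f"{size // 1024:>2}K ->", allocate(size))
```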
So, to summarize the block size inflation problem: dRAID doesn't work well with small blocks, but we have a mitigation for the worst-case scenario using mirror metaslabs. There are also other caveats to consider. The first one is that since we do not scan the block pointers, and the block checksum is part of the block pointer, there is no way to verify block checksums during the rebuild process.
After the rebuild, users who want to verify checksums can do so, for example by scrubbing the pool. But the point is that we restore the redundancy very quickly, in a very short amount of time, so any further drive failure during the long scrubbing process does not hurt the pool; we're not going to lose any data. Also, the parity group width and spare capacity are chosen when the vdev is created. For example, this one
was created with four parity groups and its distributed spares, and you cannot change that afterwards. In addition, the spares are part of the dRAID vdev: they are not traditional hot spares, which you can share among different top-level vdevs and even among different pools. The dRAID distributed spare can only be used to replace a drive in its parent dRAID vdev. So those are some other limitations.
As for the current project status: the project is, I think, mostly feature complete. The code has been pushed to GitHub for a while, but that code is actually a bit outdated, because we are waiting for the metadata allocation classes feature to be merged first; we need that feature to manage our special mirror metaslabs.
Once that feature is merged into ZFS on Linux, I'll be able to refresh the pull request with the latest code. I have the documentation on GitHub as well, and I always try to keep it updated every time I update the code. Of course, we need a lot of help from the community: code review, testing, patching, and even porting it to other operating systems; by the way, currently we only have this on ZFS on Linux. So now I can take questions.
If it's a block that is mirrored instead of... sorry, the question was: RAIDZ has this problem with random read performance; does dRAID do anything to improve that? I think the one thing that improves reads is that we store smaller blocks as mirrors, whereas RAIDZ stores them in the normal RAIDZ layout, so we can read from either copy and get better IOPS there. Basically, you can imagine that a small block is simply mirrored, so dRAID acts more like a mirror for smaller blocks.
For this particular one, yes, but that's not a general conclusion. If we generalize it a little bit: if you configure the number of data drives in a parity group to be a power of two, and your sector size is of course a power of two, then any block whose size is D times the sector size or larger, without compression, is not going to require any padding, any inflation at all.
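A quick illustrative check of that claim, again assuming 4K sectors (this is not ZFS code):

```python
SECTOR = 4096   # assumed 4K sectors

def padding_sectors(psize, d):
    """Sectors of padding needed to round a block up to a multiple of d data sectors."""
    sectors = -(-psize // SECTOR)
    return -(-sectors // d) * d - sectors

D = 8   # data drives per group, a power of two
for kb in (32, 64, 128, 256, 512, 1024):          # power-of-two block sizes >= D * 4K
    assert padding_sectors(kb * 1024, D) == 0
print(f"no padding for power-of-two blocks of {D * SECTOR // 1024}K or larger when D = {D}")
```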
Yes. The question was: if I replace all the drives in a dRAID vdev with bigger ones, will that essentially increase the size of the dRAID top-level vdev? The answer is yes, but you have to wait for all the drives to be replaced to actually see the space increase. It basically behaves just like any other top-level vdev in that respect.
Currently it does not, but yes, there will be a command to see the configuration. We save the configuration file somewhere; by the way, this interface is probably going to change, but the short answer is that we will provide something to show it. For example, in this case the pool was created from this configuration file, which contains the permutations we use. We can look at its contents; by the way, this is also saved in the labels on all the drives.
The question was: is there a sweet spot in the configuration for rebuild performance? We haven't done much performance testing yet, but in theory, since we are scanning the space maps, we are less impacted by things like fragmentation, because we are scanning allocated segments instead of individual blocks. Of course, there is still going to be some impact if there's a lot of fragmentation in a space map.
Yes, but the thing is, we don't overwrite anything that is already there. The only way this rebuild process can produce data that does not match the checksum is if one of the drives silently returned corrupted data to us, or what we wrote to disk somehow got corrupted. That's the only case, and yes, you will notice it when you scrub the pool or when you actually read the block, because that's when the block checksums are verified.
On the spare, if we use this as an example: say this is single parity and we failed one drive and replaced it with the distributed spare, without adding any new hardware, any new drive. If another drive fails during the subsequent scrubbing process, at that point we have two failed drives, but we would still be able to do I/O, because the redundancy lost with the first failure was already rebuilt onto the spare blocks of the other drives.
Okay, so that would be exactly the same as what you would do otherwise. Basically, what you need to do is replace that drive with a physical drive using the zpool replace command. Oh sorry, right, the question was: since we have replaced this drive with the distributed spare, and a consequence of that is that the distributed spare is now in use, what operation is necessary to bring the pool back to its original state?
At this point what you can do is replace the failed drive with a new physical drive, and by the way, that process uses the actual resilver. After the resilver is done, the spare space becomes available again. So if you have rebuilt a failed drive onto the distributed spare, and you have plans to later replace it with a physical drive, which you don't have to do, because redundancy is already restored,
but if you do have such plans, then you don't even have to scrub the pool, because during the actual drive replacement, where you replace the distributed spare with the new physical drive, all the block pointer checksums are verified, since that replacement uses the resilver process.
The question was: in this model, would there be any point in creating multiple dRAID vdevs if you have a lot of drives? I think, well, if you grow the number of drives during the lifetime of the pool, for example, then since you cannot grow a dRAID vdev, you have to put the additional drives into a new dRAID vdev. That's one case where you apparently have to do it. There are also cases related to scale: fundamentally, the most difficult part of this parity declustering
process is to distribute the load evenly, totally equally, across all the drives. We showed an example with 11 drives where the read and write throughput is distributed completely equally among all of them, but if you have over a hundred drives, it becomes more and more difficult to do that, and there will be some slight variance in the load. In that case it may make sense to create two smaller dRAID vdevs.