From YouTube: ZFS Native Encryption by Tom Caputi
B
Hi
everybody
I'm
Tom
Caputi,
and
today,
I'm
gonna,
be
talking
about
something
I've
been
working
on
for
the
last
nine
months
or
so,
which
is
basically
adding
encryption
at
rest
into
ZFS.
So,
basically,
I'm
gonna
start
out
with
an
overview
of
the
implementation
of
you
know
what
you're
gonna
see
as
a
user.
B
What
you
can
kind
of
expect
from
you
know,
a
sysadmin's
point
of
view
and
then
I'm
gonna
kind
of
build
up
the
encryption
implementation
as
we
go
through
the
presentation,
so
you
can
kind
of
get
an
idea
of
how
it
works
at
the
block
level.
So,
first
of
all,
what
is
encryption
from
a
really
basic
40,000
foot
view?
B
Basically,
we
want
to
prevent
somebody
known
as
the
attacker.
We
want
to
prevent
them
from
accessing
data
that
belongs
to
us
data
that
we're
considering
private
permissions
are
not
good
enough
to
protect
against
this,
because,
no
matter
what
a
root
user
can
always
change
the
permissions
on
pretty
much
anybody's
stuff.
So
if
you
have
files
or
directories,
you
know,
if
you
don't
trust
the
root
user
on
the
system,
you
can't
really
do
anything
to
protect
your
data
from
them.
B
Kernel
bugs
have
routinely
been
exposed
in
Linux,
especially
in
the
past
couple
of
months,
where
there
are
kernel
bugs
that
have
allowed
for
privilege
escalation,
which
allows
other
you
know,
untrusted
users
to
get
to
root
and
then
from
there
they
can
access
your
data.
So
this
really
isn't
good
enough,
and
the
other
thing
is.
Even
if
you
know
your
data
is
in
some
machine
and
you
know
that
nobody's
gonna
be
able
to
escalate,
and
maybe
you
are
root
on
your
machine.
B
Somebody
can
always
move
the
disks
to
a
different
machine
or
install
a
new
operating
system
and
they'll
be
able
to
read
the
disks
no
matter
what.
So
really
permissions
are
not
good
enough
to
protect
against
these
kinds
of
things.
The
solution
is
going
to
be
to
encrypt
the
data,
and
what
that
means
is
that
the
data
that's
on
disk
should
look
pseudo-random
to
everybody,
except
somebody
who
has
the
private
secret
key.
B
One
of
the
biggest
issues
with
this
from
a
performance
standpoint
is
that
ZFS
is
able
to
get
a
lot
of
performance
out
of
the
fact
that
it
compresses
data
before
it
goes
to
disk.
If
you
have
encryption
on
remember
your
data
is
gonna,
look
pseudo-random,
so
pseudo-random
data
there's
no
patterns
there,
there's
nothing
really
that
can
there's
no
compression
that
can
really
be
applied
to
it.
So
you
will
lose
a
good
amount
of
performance
there.
B
You
also
lose
dedup
capabilities,
because
if
your
data
truly
is
pseudo-random,
two
pieces
of
data
that
might
even
be
the
same
are
going
to
look
completely
different.
The
other
thing
is
that
it's
gonna
write
out
metadata,
headers
and
stuff,
because
encryption
as
I'm
going
to
show
you
in
a
little
bit
does
require
having
some
metadata
that
needs
to
come.
B
That
needs
to
move
along
with
the
data
these
in
eCryptfs
and
a
few
of
these
other
solutions
are
going
to
be
written
in
the
header
in
the
file
which
can
disturb
the
which
can
basically
disturb
the
file
alignment
and
therefore
things
like
databases
might
not
perform
as
well.
Now,
that
being
said,
there
are
even
more
restrictions
B
If
you
do
encryption
at
the
disk
level
at
the
disk
level,
let's
say
you
have
some
kind
of
RAID
system
set
up,
and
then
you
have
a
dm-crypt
for
those
of
you
who
use
Linux
set
up
on
top
of
that.
Basically,
if
you
have
multiple
copies
of
any
data,
that's
going
to
be
encrypted
multiple
times,
because
again,
each
block
is
going
to
get
encrypted
a
different
way.
Each
time.
There's
no
intelligence
there,
and
so
that's
gonna
mean
extra
CPU
overhead
for
each
encryption.
B
The
other
thing
is
that
you
can't
do
any
kind
of
ZFS
commands.
You
can't
even
recognize
that
this
is
a
ZFS
pool
until
you've
loaded
your
keys
and
you
know,
got
the
underlying
block
device
decrypted.
So
what
this
kind
of
means
from
a
user's
perspective,
is
that
you're
gonna
have
the
keys
loaded
all
the
time
I
mean
because
you
just
if
you
want
to
use
the
data
they
have
to
be
there.
B
This
will
also
mean
that
if
the
keys
aren't
there,
you
can't
do
basic
pool
operations
like
scrub,
resilver,
nothing
like
that.
You
also
can't
send
data
without
having
the
keys
loaded
and
the
last
big
disadvantage
of
both
of
these
things
is
that
it
kind
of
adds
more
complex
management,
because,
right
now
you
can
manage
all
of
your
storage
through
the
ZFS
and
zpool
commands.
If
you,
you
know,
if
you're
using
ZFS,
but
with
this,
you
need
to
add
another
layer
which
will
just
handle
the
encryption
either
above
or
below.
B
So
this
is
kind
of
a
big
win.
Just
from
an
administrative
standpoint,
so
for
those
of
you
who
don't
know,
I
thought
I'd
include
a
little
slide
on
how
we're
planning
on
using
this
at
Datto.
Basically,
Datto,
to
sum
up
one
of
our
biggest
products.
In
a
nutshell,
it's
basically
we
have
a
backup
agent,
which
is
a
piece
of
software
that
lives
on
a
client's
machine.
B
The
clients
are
gonna
back
up
their
data
to
a
zpool
which
sits
on
a
separate
machine
on-site
at
their
at
their
office
or
wherever
it
may
be,
and
then
that
is
going
to
back
up
to
the
cloud
using
zfs
send.
The
advantages
of
native
encryption
here
are
going
to
be
that
we're
going
to
be
able
to
get
a
lot
higher
performance
encryption
without
losing
compression
because
it
is
baked
into
ZFS.
B
We're
gonna
have
a
much
cleaner
implementation
than
we
currently
do,
because
what
we
currently
have
is
a
bunch
of
stacked
block
devices
that
perform
the
encryption
intermediately,
and
the
last
thing
is
that
we're
gonna
be
able
to
back
up
our
customers'
data
to
our
off-site
server
without
being
able
to
decrypt
it,
which
you
know
from
our
users'
perspective,
they're
gonna
want
that
because
they
don't
know
if
they
can
trust
us
necessarily
and
we
also
don't
want
the
liability
of,
you
know,
being
able
to
decrypt
it.
B
So
now,
let's
get
into
what
is
actually
going
to
be
encrypted
in
a
ZFS,
encrypted
volume
or
in
a
ZFS
encrypted
pool.
Basically,
all
of
the
user
data
file
data
and
metadata
will
all
be
encrypted.
That
includes
ACLs,
names,
permissions,
attributes,
all
of
that;
directory
listings,
all
of
that;
zvol
data,
FUID
mappings,
master
keys,
which
I'll
get
into
in
a
little
bit
about
how
that
works,
but
your
master
keys
are
actually
going
to
live
on
disk,
but
they
will
be
encrypted
separately.
B
To
give
you
an
idea
of
what
this
is
going
to
look
like
from
a
sysadmin
point
of
view,
these
are
going
to
be
kind
of
the
new
commands
that
are
going
to
exist.
There's
a
modification
that's
been
made
to
zfs
create
that
will
allow
you
to
add
two
new
properties.
One
is
encryption,
which
is
basically
your
encryption,
algorithm
and
I'll
get
into
what
we're
gonna
support
for
those
of
you
who
know
encryption
fairly
well
and
the
other
one's
keysource,
and
keysource
is
basically
just
how
you're
going
to
provide
your
key.
B
Your
password,
your
you
know,
hex
key,
your
raw
key,
however
it
may
be,
to
ZFS.
Along
with
that,
we
have
a
new
subcommand,
zfs
key,
which
will
help
you
manage
your
keys
in
ZFS,
so
that
you
know
you
can
encrypt
and
decrypt
your
data.
Basically
there's
gonna
be
zfs
key
-l,
which
will
load
your
key
into
the
system,
allowing
you
to
encrypt
and
decrypt
anything
that's
there.
-u
will
unload
the
key,
basically
doing
the
opposite.
B
-c
will
allow
you
to
change
your
key
so
that
you
can
change
your
password,
change
your
key,
if
you
feel
like
it
may
have
been
compromised
or
just
want
the
security
of
you
being
able
to
rotate
it
and
that
will
not
re-encrypt
all
of
your
data
by
the
way,
the
other
one.
There
have
also
been
modifications
made
to
a
number
of
zfs
and
zpool
commands
so
that
you
can
load
the
keys
as
you're
mounting
and
unmounting
things.
B
The
key
things
to
remember
here
are
that,
as
long
as
the
key
is
loaded,
your
data
sets
will
be
mountable,
so
they'll
be
able
to
be
brought
up
and
they'll
look
just
like
normal
ZFS
file
systems
to
everything
else.
Your
child
data
sets
are
going
to
inherit
encryption
properties
and
the
key
source
by
default.
B
If
you
just
take
the
default
on
encryption,
it
will
default
to
off.
But
if
you
just
say
encryption=on,
that
will
default
to
AES-CCM
256-bit.
The
keysource,
which
again
is
how
you're
going
to
give
your
key.
You
can
either
specify
that
you
want
to
do
it
from
a
prompt.
So,
basically,
when
you
say
zfs
mount
or
zfs
key
-l
or
one
of
those
other
kinds
of
commands,
it
will
prompt
you
for
your
key
at
the
command
line,
there's
also
a
file.
B
So
if
you
want
to
have
your
key
in
a
separate
flash
drive
and
you
plug
in
the
flash
drive
and
then
you
know,
then
your
data
sets
are
mountable.
You
can
do
that
for
greater
automation.
As
far
as
the
formats
you
can
give
the
key
either
in
raw
bits.
You
know,
just
as
it
actually
is
in
hex
or
a
passphrase.
B
The
last
thing
is
kind
of
I
encourage
people
to
Wikipedia
it
if
they're
interested,
but
basically
there
is
something
called
a
CRIME
attack
that's
not
applicable
to
99
percent
of
applications.
It
can
be
if
you're
really
concerned
about
it.
You
can
get
around
it
just
by
turning
compression
off,
but
for
most
applications.
This
won't
really
matter
so
we're
not
making
it
mutually
exclusive.
B
So
now,
I'm,
gonna
kind
of
show
you
and
walk
you
through
the
implementation
of
how
the
actual
data
gets
encrypted
on
disk
and
I'm
gonna
kind
of
start
from
the
ground
up.
So
what
I
mean
by
that
is
first
we're
gonna
start
off
by
talking
about
the
encryption
scope.
What
is
actually
going
to
be
encrypted?
B
Obviously
it's
going
to
be
the
user's
data,
but
are
we
gonna
do
it
at
the
file
level
or
the
block
level?
We're
going
to
be
doing
it
at
the
block
level,
and
this
is
gonna
be
for
a
number
of
reasons.
First
of
all,
we're
gonna
be
able
to
encrypt
each
block
separately,
which
means
that
if
you
have
a
very,
very
large
file
in
order
to
encrypt
or
decrypt
anything,
you
need
to
encrypt
and
decrypt
in
that
unit,
so
this
will
kind
of
limit
it
to
the
maximum
block
size.
B
So
if
your
maximum
block
size
is
128
K,
the
most
you
will
ever
have
to
encrypt
and
decrypt
at
once
is
128
K.
We're
gonna
store
the
encryption
parameters
in
blkptr_t,
and
that's
going
to
be
important
in
a
little
bit
I'm
going
to
show
you
how
that's
all
going
to
work
and
by
limiting
the
scope
of
the
encryption
to
a
block.
Instead
of
to
a
file,
we
can
make
sure
that
only
blocks
can
get
lost.
B
So
now
the
two
types
of
encryption
and
again
this
is
from
kind
of
a
40,000
foot
view
is
asymmetric
encryption
and
symmetric
encryption.
Asymmetric
encryption,
just
for
completeness
is
kind
of
like
the
kind
of
encryption
that's
done
for
SSH
and
TLS
handshakes.
It's
meant
for
verifying
people
and
verifying
trust
between
users
and
making
sure
that
everybody
who's
talking
is
who
they
say
they
are
and
that
they
trust
them.
This
is
very
slow
and
usually
what
happens
immediately
after
this
B
is
they
exchange
a
symmetric
encryption
key.
A
symmetric
encryption
key
uses
a
single
key
for
both
encryption
and
decryption,
whereas
asymmetric
encryption
has
a
private
and
public
key
pair.
Now
this
is
going
to
be
way
way
faster
than
doing
asymmetric
encryption
for
everything
on
an
x86-64
architecture.
B
If
you
have
the
AES-NI
instruction
set,
which,
if
you're
using
an
Intel
processor
you
probably
do,
it
will
actually
be
about
a
thousand
times
faster
to
do
this
encryption
and
decryption
than
it
would
be
if
we
were
trying
to
encrypt
everything
with
RSA
or
some
similar
algorithm
like
that.
So,
let's
start
off
by
talking
about
the
basic
unit
of
encryption.
This
is
called
a
block
cipher.
A
block
cipher
essentially
takes
one
block
of
plain
data
and
your
encryption
key
and
will
turn
it
into
one
block
of
output
data.
B
Now
this
it's,
you
know
this
is
good,
and
this
is
what
we're
gonna
be
basing
everything
off
of,
but
this
has
a
number
of
really
severe
limitations,
the
biggest
of
which
is
that
it
only
works
on
a
fixed
block
size
which
is
128
bits.
Most
of
us
want
to
encrypt
more
than
16
bytes
of
data,
so
we're
going
to
need
to
modify
this
and
we're
going
to
need
to
turn
it
into
a
stream
cipher.
The
most
basic
stream
B
cipher
is
called
ECB
mode,
or
electronic
codebook
mode,
and
what's
going
to
happen
is
basically
we're
just
going
to
take
the
AES
algorithm
on
each
block
of
the
plain
data
and
kind
of
just
do
it
in
series
on
each
single
block.
So
that's
going
to
actually
encrypt
all
of
our
data,
and
you
know
we
can
now
encrypt
as
much
as
we
want.
But
this
has
a
severe
problem
too.
B
For
those
of
you,
who've
studied
cryptography
a
little
bit.
This
is
kind
of
the
picture
that
everybody
knows
and
sees,
and
it
kind
of
explains
the
big
problem
with
ECB
encryption.
The
problem
is
that
you
can
still
see
patterns
in
your
data
because
everything
was
kind
of
encrypted
the
same
way.
So
each
block,
you
know
still,
you
can
still
see
patterns
in
all
the
data.
What
we
really
want
is
pseudo-random
data.
We
don't
want
to
be
able
to
detect
any
patterns.
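The pattern leak described above can be sketched in a few lines. This is only an illustration: `toy_block_encrypt` is a hypothetical stand-in built from HMAC-SHA-256 (it is not AES and not even invertible); the one property it shares with a real block cipher in ECB mode is that the same key and the same 16-byte input always give the same 16-byte output.

```python
import hashlib
import hmac

BLOCK = 16  # 128-bit cipher block, the size mentioned in the talk


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a block cipher (NOT real AES).

    Deterministic and keyed: same key + same block -> same 16 bytes out.
    """
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ecb_like(key: bytes, data: bytes) -> list[bytes]:
    """Process each 16-byte block independently, as ECB mode does."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    return [toy_block_encrypt(key, b) for b in blocks]


key = b"k" * 32
plaintext = b"A" * 16 + b"B" * 16 + b"A" * 16  # blocks 0 and 2 are identical
out = ecb_like(key, plaintext)

# Identical plaintext blocks produce identical output blocks, so an
# observer can see the repetition without knowing the key.
print(out[0] == out[2])
print(out[0] == out[1])
```

The repeated output block is exactly the kind of visible pattern that the well-known ECB picture shows.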
B
Basically,
the
big
requirements
that
you
need
to
know
about
in
order
to
get
the
rest
of
this
presentation
are
that
we're
limited
to
up
to
a
hundred
and
four
bits
or
thirteen
bytes.
That's
the
maximum
size
that
we
can
allow
of
an
IV
for
the
two
modes
that
we're
going
to
be
supporting
and
96
bits
or
twelve
bytes
is
what's
recommended
by
NIST
because
of
a
number
of
performance
issues.
B
The
other
really
important
thing
is
that
reusing,
an
IV
with
the
same
key
will
result
in
a
catastrophic
failure
of
the
encryption,
which
basically
means
that
even
if
you
don't
have
the
key,
you
can
decrypt
both
blocks
of
the
data.
This
is
really
really
bad
and
it
means
that
we're
lying
to
the
people
we
told
that
we
encrypted
their
data.
So
obviously
we
don't
want
to
do
that.
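Why IV reuse is so damaging can be shown with a toy counter-mode cipher. The sketch below is an assumption-laden illustration, not the ZFS code: the keystream is built from SHA-256 rather than AES, and the names `keystream` and `toy_ctr_encrypt` are made up for this example. The point it demonstrates holds for any counter-based mode: with the same key and IV, the keystream is identical, so XORing two ciphertexts cancels it out.

```python
import hashlib


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream: hash(key || iv || counter), repeated as needed."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_ctr_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Counter-mode style encryption: plaintext XOR keystream."""
    return xor(plaintext, keystream(key, iv, len(plaintext)))


key = b"\x01" * 32
iv = b"\x02" * 12          # 96-bit IV, reused on purpose
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = toy_ctr_encrypt(key, iv, p1)
c2 = toy_ctr_encrypt(key, iv, p2)

# The keystream cancels: c1 XOR c2 equals p1 XOR p2, with no key involved.
leak = xor(c1, c2)
# Anyone who knows (or guesses) one plaintext recovers the other.
recovered = xor(leak, p1)
print(recovered)
```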
B
So
the
big
things
to
take
away
from
here
are
the
IV
needs
to
always
be
unique
for
these
two
modes
and
we
get
12
bytes
of
it.
The
last
thing
that
we
want
to
add
to
this
kind
of
basic
block
of
encryption
is,
what's
called
authenticated
encryption.
Now,
what
this
means
basically,
is
once
we've
written
our
encrypted
data
out.
How
do
we
know
that
nobody's
changed
it?
When
we
go
to
decrypt
it?
B
We
want
to
make
sure
that
what
we're
reading
is
exactly
what
we
wrote
out,
because,
with
a
regular
like
sha-256,
checksum
or
any
other
kind
of
checksum,
anybody
can
produce
that
checksum
and
if
you
can
change
the
ciphertext,
you
can
get
our
applications
to
decode
garbage,
and
you
know
that's
still
not
good
from
an
you
know.
We
still
want
to
be
able
to
make
sure
that
our
data
is
secure.
We
at
least
want
to
know
that
somebody
B
Was
trying
to
alter
our
data
so
we're
gonna
add
something
called
a
MAC.
A
MAC
is
basically
a
message
authentication
code
and
it's
basically
a
checksum,
a
cryptographically
secure
checksum
like
SHA-256
or
SHA-512,
or
one
of
the
other
ones.
That
will
require
a
secret
key
to
produce.
So
in
this
case,
that's
also
going
to
be
the
encryption
key.
So
basically,
when
we
go
to
decrypt
this
data,
we
can
check
the
MAC
against
what
we
actually
had.
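The difference between a plain checksum and a MAC can be sketched with Python's standard `hmac` module. This is a minimal illustration under stated assumptions: the talk's design gets its MAC from the authenticated encryption mode itself, whereas this example uses HMAC-SHA-256 truncated to 128 bits as a stand-in, and the function names are hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, ciphertext: bytes) -> bytes:
    """Keyed checksum: cannot be recomputed without the secret key."""
    return hmac.new(key, ciphertext, hashlib.sha256).digest()[:16]  # 128 bits


def verify_tag(key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the stored tag against a fresh one."""
    return hmac.compare_digest(make_tag(key, ciphertext), tag)


key = b"\x07" * 32
ciphertext = b"pretend this is an encrypted block"
tag = make_tag(key, ciphertext)

# An attacker flips one byte but cannot produce a matching tag.
tampered = b"Pretend this is an encrypted block"

print(verify_tag(key, ciphertext, tag))
print(verify_tag(key, tampered, tag))
```

With an unkeyed checksum such as plain SHA-256, the attacker could simply recompute the checksum after modifying the data; the secret key is what prevents that.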
B
So
what
we're
gonna
add
is
this
HKDF
function.
An
HKDF
function
is
really
simple.
All
it
does
is
it
takes
a
master
key
and
then
a
salt,
which
we're
going
to
store
somewhere
too,
and
between
the
two
of
them
they
will
basically
result
in
an
encryption
key,
and
this
encryption
key
will
be
what's
actually
used
to
encrypt
the
data.
We
can
generate
a
new
salt
every
once
in
a
while,
and
that
what
that
will
do.
B
This
is
relatively
quick
to
calculate,
and
it's
basically
like
I,
said
the
whole
point
of
it
is
to
prevent
the
master
key
from
kind
of
going
stale,
because
we're
running
out
of
IVs
that
we
can
use
now.
That
being
said,
this
salt
and
this
resulting
encryption
key,
are
going
to
be
usable
for
quite
a
while.
So
we
don't
need
to
change
it
B
on
every
single
operation,
so
we're
gonna
be
able
to
cache
this
for
a
while,
and
basically
we
will
regenerate
it
every
certain
number
of
transaction
groups
or
every
time
that
you
re-import
the
pool.
I'll
get
back
to
that
at
the
end.
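The key derivation step described here can be sketched with the standard HKDF construction (RFC 5869) on top of Python's `hmac` module. The choice of SHA-256, the 8-byte salt, the 32-byte output, and the function name are assumptions made for illustration; the talk does not specify them.

```python
import hashlib
import hmac
import os


def hkdf_sha256(master_key: bytes, salt: bytes, info: bytes = b"",
                length: int = 32) -> bytes:
    """HKDF (extract-then-expand) with HMAC-SHA-256."""
    # Extract: mix the salt and the master key into a pseudorandom key.
    prk = hmac.new(salt, master_key, hashlib.sha256).digest()
    # Expand: generate as many output bytes as requested.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


master_key = os.urandom(32)   # long-lived, never used to encrypt data directly
salt1 = os.urandom(8)         # stored alongside the data
salt2 = os.urandom(8)         # a fresh salt generated later

k1 = hkdf_sha256(master_key, salt1)
k2 = hkdf_sha256(master_key, salt2)

# Same master key + same salt always re-derives the same encryption key;
# a new salt gives an unrelated key, which resets the pool of usable IVs.
print(k1 == hkdf_sha256(master_key, salt1))
print(k1 == k2)
```

Because the derivation is cheap and deterministic, the derived key can be cached for a while and only the small salt needs to be stored.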
So
now
this
is
what
our
diagram
looks
like.
If
you
notice
we
replaced
the
regular
encryption
key
and
everything
here,
that's
highlighted
in
red
is
stuff
that
we
added
to
it.
B
Basically,
and
now
we
have
our
master
key,
which
is
going
to
be
used
with
the
salt
to
produce
the
actual
encryption
key,
and
then
that's,
what's
gonna
be
used
to
encrypt
the
data.
So
now
we
have
these
two
values
right
here:
the
salt
and
the
IV,
and
the
question
is:
where
are
we
going
to
get
these
from?
The
answer
is
we're
going
to
randomly
generate
them
with
a
pseudo-random
number
generator.
I
can
go
over
all
the
math
if
you
guys
really
want
to
know
about
it.
B
But
basically,
if
you
calculate
out
the
numbers,
what
we're
guaranteed
pretty
much
is
that
if
we're
encrypting
1
million
blocks
per
second,
we
will
have
a
1
in
1
billion
chance
of
reusing
the
same
IV
with
the
same
key
in
41
million
years
and
by
then
I'll
be
retired.
So
now
this
is
our
whole
diagram.
So
if
you
look
at
the
things
that
we
have
to
store,
I
highlighted
them
in
blue.
They're
B
basically
the
salt,
the
IV
and
the
MAC.
The
cipher
data
is
just
gonna
go
where
the
original
plain
data
was
gonna
go
anyway,
so
we
don't
really
need
to
care
about
that
too
much.
It's
just
gonna
get
applied
to
the
transforms
so
to
talk
about
where
these
things
are
gonna
go.
Basically,
we
kind
of
made
a
bit
of
a
Union
out
of
block
pointer.
It's
not
a
union
in
the
code
for
those
of
you
who
might
have
been
concerned
about
that.
B
But
basically
we
took
a
bunch
of
fields
in
the
block
pointer
and
started
using
them
where
we
could.
If
you
look
at
the
salt,
the
salt
is
stored
in
the
fill
count.
We
can
do
this
because
we
only
encrypt
level
zero
data,
and
level
zero
data
always
has
a
fill
count
of
one.
So
we
can
assume
that
this
is
one
just
because
this
block
is
encrypted.
B
So
therefore
we
can
store
the
salt
there.
We
don't
really
need
that
field.
For
the
MAC,
we're
going
to
store
that
in
the
checksum,
it's
basically
going
to
be
128
bits
long,
which
is
half
the
checksum,
and
the
reason
that
we
can
do
this
is
because
the
MAC
and
the
checksum
kind
of
serve
the
same
purpose.
The
checksum
is
there
to
make
sure
that
data
is
exactly
as
we
wrote
it.
The
MAC
is
the
same
thing,
but
it
also
protects
against
malicious
users.
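The field reuse described above can be modeled with a toy structure. Everything here is hypothetical: `ToyBlockPointer` is not the real `blkptr_t` layout, and which two of the four 64-bit checksum words hold the MAC is an assumption made for the example. It only shows the idea that an encrypted level-zero block has an implied fill count of one, so the 64-bit fill field is free to hold the salt, and that a 128-bit MAC fits in half of a 256-bit checksum.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlockPointer:
    """Illustrative model only; not the real blkptr_t layout."""
    fill: int = 1                      # 64-bit fill count field
    checksum: list[int] = field(default_factory=lambda: [0, 0, 0, 0])  # 4 x 64 bits
    encrypted: bool = False

    def store_params(self, salt: bytes, mac: bytes) -> None:
        assert len(salt) == 8 and len(mac) == 16
        self.fill = int.from_bytes(salt, "little")          # salt reuses the fill field
        self.checksum[2] = int.from_bytes(mac[:8], "little")  # MAC takes half the checksum
        self.checksum[3] = int.from_bytes(mac[8:], "little")  # (word choice is an assumption)
        self.encrypted = True

    def fill_count(self) -> int:
        # Encrypted blocks are level zero, so the fill count is implied.
        return 1 if self.encrypted else self.fill

    def salt(self) -> bytes:
        return self.fill.to_bytes(8, "little")

    def mac(self) -> bytes:
        return (self.checksum[2].to_bytes(8, "little")
                + self.checksum[3].to_bytes(8, "little"))


bp = ToyBlockPointer()
salt = bytes(range(8))
mac = bytes(range(16, 32))
bp.store_params(salt, mac)
print(bp.fill_count(), bp.salt() == salt, bp.mac() == mac)
```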
B
The
last
thing
is
the
IV.
Matt
Ahrens
and
I
talked
about
this
for
probably
about
a
month
and
a
half
about
where
we
were
gonna
store
this.
We
threw
around
a
whole
bunch
of
ideas.
Basically,
to
kind
of
sum
it
up,
originally
this
was
stored
in
the
padding,
which
is
right
here,
and
that
seems
like
a
great
idea
until
you
realize
that
it
would
use
three-quarters
of
the
padding,
and
that
would
mean
that
we
can't
store
any
more
uint64s
in
the
future.
B
So
this
kind
of
locks
out
block
pointer
pretty
much
forever.
So
we
kind
of
didn't
want
to
do
that.
We
also
toyed
around
with
the
idea
of
generating
the
IV
from
different
fields
in
there.
So
you
know,
for
instance,
we
could
take
the
place
on
disk,
but
things
get
rewritten
to
the
same
place
on
disk
quite
often.
We
could
have
added
in
the
birth
time,
but
the
problem
with
adding
in
the
birth
transaction
group
is
that
ZFS,
B
if
you
hard
shut
down,
can
rewind
a
little
bit,
because
you
know
it
will
write
out
this
block,
but
that
won't
be
the
block,
that's
in
the
uberblock.
So
now
we
need
to
kind
of
have
this
count.
So
now
we
kind
of
you
know
it's
not
really
secure.
We
can't
really
guarantee
that
that
is
truly
unique
and
that
we
never
reused
it.
B
We
thought
about
using
the
bookmark,
but
the
bookmark
gets
complicated,
because
when
you
have
snapshots
or
dedup
turned
on,
that
block
could
be
referenced
from
multiple
different
places.
So
none
of
these
were
really
going
to
work
out,
and
so
that's
partially,
why
it's
randomly
generated
and
partially
why
it's
stored
where
it
is,
and
this
is
for
those
of
you
wondering
from
before
why
we
can't
do
more
than
two
copies
of
encrypted
data.
B
Now,
if
you've
been
listening
to
me,
this
whole
time,
I've
been
saying:
hey,
we
really
can't
use
the
same
IV
with
the
same
key
and
that's
exactly
what
I'm
proposing
here.
But
the
difference
is
that
we're
now
also
using
the
same
data
and
because
we're
using
the
same
data
here,
what
we've
really
only
done
is,
instead
of
leaking
all
of
everything,
we've
actually
just
duplicated,
exactly
what
we
had
before
and
we
can
use
that
to
decrypt
and
we
can
use
that
to
detect
deduplication
now.
B
This
is
technically
still
a
leak
of
information,
because
again
we
wanted
it
completely
pseudo-random,
so
that
nobody
can
detect
any
patterns
in
but
the
dedup
tables
were
going
to
leak
this
information
anyway
and
if
dedup
is
important
to
you.
This
is
kind
of
something
that
you
can
be
aware
of,
and
you
know
and
understand.
B
So
how
are
we
going
to
generate
the
same
IV
and
salt
for
everything
for
equivalent
blocks
of
data?
The
only
real
way
to
do
that
is
to
take
basically
a
checksum
of
the
data,
and
because
this
is
encryption,
we're
not
going
to
take
just
a
regular
checksum
of
the
data.
We
are
going
to
take
an
HMAC.
An
HMAC
is
basically
the
same
kind
of
thing
as
a
MAC.
It's
a
secure
checksum
that
you
need
a
secret
key
to
regenerate.
The
only
difference
is
that
B
we
don't
also
have
to
encrypt
the
data
in
order
to
make
this
happen.
Basically,
this
is
going
to
result
in
a
256-bit
HMAC,
and
then
that
is
going
to
be
split
up
into
64
bits
of
the
salt
and
96
bits
of
the
IV
and
that's
what's
going
to
be
stored.
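The dedup-friendly derivation just described can be sketched as follows. The talk only says that a 256-bit HMAC of the data is split into a 64-bit salt and a 96-bit IV; the use of HMAC-SHA-256, the choice of which bytes become the salt and which the IV, and the function name are assumptions for this example.

```python
import hashlib
import hmac
import os


def dedup_salt_iv(hmac_key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    """Derive salt and IV from the data itself, keyed by a secret HMAC key."""
    digest = hmac.new(hmac_key, plaintext, hashlib.sha256).digest()  # 256 bits
    salt = digest[:8]     # 64 bits (byte positions are an assumption)
    iv = digest[8:20]     # 96 bits
    return salt, iv


hmac_key = os.urandom(32)
block_a = b"same data" * 100
block_b = b"same data" * 100
block_c = b"other data" * 100

# Equal plaintext blocks get the same salt and IV, so they encrypt to the
# same ciphertext and dedup can match them. Different data gets different
# parameters, so an IV is only ever reused with identical data.
print(dedup_salt_iv(hmac_key, block_a) == dedup_salt_iv(hmac_key, block_b))
print(dedup_salt_iv(hmac_key, block_a) == dedup_salt_iv(hmac_key, block_c))
```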
So
now
this
is
where
you
get
your
money's
worth.
This
is
the
picture
of
how
encryption
is
going
to
work
completely.
B
Basically,
I
highlighted
the
parts
that
are
gonna
be
different
in
red.
So
for
non-dedup,
you're
just
going
to
have
the
plain
data
and
then
you're
gonna
have
the
pseudo-random
number
generator
generating
your
salt
and
IV.
For
dedup,
it's
just
slightly
more
complicated
we're
going
to
be
using
this
HMAC
function
to
generate
your
salt
and
IV,
but
that's
really
the
only
difference
other
than
that
everything's
gonna
work,
kind
of
the
same.
B
Basically,
there's
two
last
things
to
talk
about
as
far
as
diagrams
and
encryption
go
and
actually
getting
data
encrypted
to
disk.
The
first
one
is
I
said
before
that
we
don't
want
to
have
to
re-encrypt
all
of
the
data,
that's
on
disk
just
because
somebody
might
have
compromised
their
password
or
you
know,
just
because
the
user
wants
to
rotate
their
password.
That
would
be
really
really
inefficient.
B
So
we're
gonna
add
one
layer
of
indirection
between
the
master
key
that
I
was
talking
about
before,
which
is
right
here
and
between
the
actual
key
that
the
user
gives
you.
So
the
user
is
going
to
supply
this
wrapping
key
we're
gonna,
call
it
and
the
wrapping
key
is
going
to
be
used
to
encrypt
the
master
keys
and
the
HMAC
keys
on
disk.
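The layer of indirection described above can be sketched with a deliberately simplified wrap function. This is a toy: `wrap` XORs the master key with a pad derived from the wrapping key, which stands in for the real authenticated encryption of the master key and must not be used for actual key storage. All names are hypothetical. What it shows is that changing the user's key only rewrites the small wrapped blob, while the master key (and therefore all data encrypted under it) stays the same.

```python
import hashlib
import os


def _pad(wrapping_key: bytes) -> bytes:
    """Toy 32-byte pad derived from the wrapping key (illustration only)."""
    return hashlib.sha256(b"toy-wrap" + wrapping_key).digest()


def wrap(wrapping_key: bytes, master_key: bytes) -> bytes:
    """Toy key wrap: XOR a 32-byte master key with the pad."""
    return bytes(m ^ p for m, p in zip(master_key, _pad(wrapping_key)))


def unwrap(wrapping_key: bytes, wrapped: bytes) -> bytes:
    return wrap(wrapping_key, wrapped)  # XOR is its own inverse


def change_key(old_key: bytes, new_key: bytes, wrapped: bytes) -> bytes:
    """Rotate the user's key without touching any encrypted data blocks."""
    return wrap(new_key, unwrap(old_key, wrapped))


master_key = os.urandom(32)       # randomly generated, encrypts the data
old_wrapping = os.urandom(32)     # supplied by the user
new_wrapping = os.urandom(32)     # the user's replacement key

wrapped = wrap(old_wrapping, master_key)           # what is stored on disk
rewrapped = change_key(old_wrapping, new_wrapping, wrapped)

print(unwrap(new_wrapping, rewrapped) == master_key)
```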
B
Basically,
this
is
going
to
be
the
case
for
when
you're
specifying
a
key
via
raw
or
hex.
So
this
is
what
it's
gonna
look
like,
but
basically
all
we're
doing
is
we're
taking
the
master
key
which
we're
randomly
generating
and
we're
going
to
encrypt
that
on
disk
in
a
separate
object,
which
is
called
a
DSL
crypto
key.
When
it
comes
to
passphrases,
passphrases
present
a
slightly
different
problem.
Passphrases
are
variable
length
and
they're
reputedly
extremely
weak.
B
So
what
we're
gonna
do
to
fight
that
is
basically
something
called
PBKDF2.
PBKDF2
turns
a
passphrase
into
a
usable
key.
So
basically,
what's
gonna
happen
is
we're
gonna
run
this
function
and
the
function
is
designed
to
be
just
very,
very
hard
to
compute.
It's
gonna
take
a
whole
bunch
of
iterations.
Basically,
and
the
idea
is
it's
going
to
generate
out
this
key
at
the
end
of
it.
B
But
if
somebody
were
to
try
to
do
this
over
and
over
again,
it
would
be
very,
very
computationally,
expensive
and
it
would
just
take
too
long
to
actually
do.
For
the
person
who's
actually
trying
to
decrypt
their
data
correctly
and
is
using
the
correct
key,
they
only
have
to
pay
this
price
once,
so
it's
not
so
bad,
but
anybody
who's
trying
to
brute-force
the
password
will
have
to
pay
this
price
many,
many
times
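Turning a passphrase into a fixed-length wrapping key can be sketched with `hashlib.pbkdf2_hmac` from the Python standard library. The hash function, salt size, iteration count, and 32-byte output below are illustrative assumptions, not values taken from the talk.

```python
import hashlib
import os


def wrapping_key_from_passphrase(passphrase: str, salt: bytes,
                                 iterations: int = 100_000) -> bytes:
    """Stretch a variable-length passphrase into a 256-bit key."""
    return hashlib.pbkdf2_hmac(
        "sha256",                      # underlying hash (assumption)
        passphrase.encode("utf-8"),
        salt,                          # stored next to the wrapped master key
        iterations,                    # the deliberate cost per guess
        dklen=32,
    )


salt = os.urandom(8)
key = wrapping_key_from_passphrase("correct horse battery staple", salt)

# The legitimate user pays the iteration cost once per unlock; an attacker
# pays it again for every passphrase guess.
wrong = wrapping_key_from_passphrase("correct horse battery stapl3", salt)
print(len(key), key == wrong)
```

Raising the iteration count makes each guess proportionally more expensive, at the price of a slower unlock for the legitimate user.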
a
couple
of
additional
topics,
just
to
kind
of
give.
B
You
guys
an
idea
of
some
of
the
things
that
are
also
coming
with
this.
As
far
as
ZIL
encryption,
ZIL
blocks
need
some
slightly
different
considerations,
because
ZIL
blocks
first
of
all
are
pre-allocated,
and
so
we
can't
save
encryption
parameters
into
the
block
pointer
after
we've
already
written
out
the
data
block.
So,
instead
we're
gonna
have
to
do
something
a
little
different.
B
The
idea
of
the
new
compressed
ARC
is
that
we're
storing
data
exactly
as
it
exists
in
the
pool.
So,
basically,
when
you
read
data
into
the
ARC,
it
will
be
compressed
but
decrypted,
and
the
reason
for
that
is
so
that
we
can
reuse
the
data
over
and
over
again
without
having
to
decrypt
it
all
the
time,
because
there's
no
real
benefit
to
that.
B
Once
we
have
the
data
decrypted,
when
we
go
to
write
it
to
the
L2ARC,
we're
going
to
re-encrypt
it
and
yeah.
That's
basically
how
it's
gonna
work,
but
it's
basically
gonna
be
exactly
the
same
as
it
is
on
disk
and
when
it
gets
read
off
of
the
L2ARC,
it
will
use
the
blkptr_t's
IV
and
salt
to
decrypt
it.
B
The
last
thing
is
something
that
I'm
kind
of
excited
to
talk
about,
and
this
is
the
idea
of
a
raw
send.
So
the
idea
is
that
we're
going
to,
in
addition
to
the
compressed
sends,
which
have
just
recently
been
merged,
we're
going
to
be
able
to
take
the
data
exactly
as
it
is
on
disk
and
move
it
up
to
an
off-site
storage
facility.
B
Even
if
it's
untrusted
and
it
will
be
still
encrypted-
and
basically
the
idea
here-
is
that
you
can
send
data
to
an
untrusted
server
without
risking
that
it
can
be
decrypted.
So
this
will
make
ZFS
kind
of
a
true
platform
for
end-to-end
encryption
so
that
you
know
it
could
really
be
used
for
anything,
including
high-security,
security-sensitive
kinds
of
applications.
B
An
admin
will
always
be
able
to
take
backups
of
their
pool
efficiently,
even
without,
you
know,
the
person
behind
the
data
having
to
trust
them,
essentially.
As
far
as
the
current
status,
this
is
completely
implemented
except
for
raw
sends.
That's
gonna
be
coming
in
a
separate
pull
request.
It's
ready
to
review.
I'm
begging,
pleading
for
people
to
please
review
it
and
take
a
look
and
tell
me
what's
wrong.
B
Tell
me
you
know
anything
that
that
you
can
think
of
that
might
be
a
problem
with
it.
You
know,
I've
been
kind
of
sick
and
tired
of
rebasing
it
on
top
of
the
current
master
for
the
past
few
months,
so
I'm,
you
know
so
I'm
definitely
trying
to
get
it
pushed
through.
The
primary
pull
request
is
on
Linux
because
that's
where
we
at
Datto
are
based,
but
there
are
pull
requests
out
that
are
basically
tracking
the
same
changes
on
OS
X
and
illumos.
B
I
also
wanted
to
give
a
quick
thank
you
to
Jorgen
Lundman
for
helping
to
maintain
the
OS
X
and
illumos
ports
and
by
helping
I
mean
doing
it.
Matt
Ahrens
and
Brian
Behlendorf,
I'm
gonna
guess,
but
basically
for
answering
all
of
my
questions
about
everything.
That's
happened
and
George
Wilson
and
Dan
Kimmel
for
helping
me
work
through
the
ARC
changes
that
just
got
merged
in
a
couple
weeks
ago.
B
Yes,
that
will
work
with
the
same
ZFS
send
stuff,
because
the
entire
structure
of
the
pool
will
still
oh,
the
question
was:
will
send
still
be
able
to
work
with
the
data
encrypted
and
still
be
able
to
do
everything
incrementally.
The
answer
is
yes,
because
the
entire
pool
structure
is
still
left
unencrypted
and
that's
kind
of
a
big
win,
because
we
can
do
that
and
we
can
do
scrubs
and
sends.
B
Embedded
block
pointers
you
can't
they.
This
is
not
like
a
zpool.
These
features
are
incompatible,
but
at
the
ZIO
layer,
if
you
have
encryption
turned
on,
embedded
gets
turned
off
effectively,
which
is
a
bit
of
a
performance
loss
for
some
applications,
but
that's
kind
of
I
have
to
store
the
encryption
parameters
somewhere.
So.
B
The
question
is:
will
a
root
user
on
the
system
be
able
to
get
to
the
master
key
regardless,
as
long
as
the
system
is
running?
The
answer
is
the
master
key
is
never
exposed
to
the
users.
It's
completely
kept
within
ZFS
and
the
keys.
The
wrapping
keys
aren't
really
exposed
through
that.
B
The
only
way
that
they'd
be
able
to
get
through
that
is
through
some
kind
of
like
/dev/mem
and
/dev/kmem
interface
that
still
exists
on
very,
very
old,
very,
very
insecure
systems,
but
that
I'm,
sorry
or
yeah.
You
could
get
it
from
a
debugger,
but
most
people,
yeah
they're.
So
the
one
of
the
big
things
that
I
wanted
to
point
out
was
that
this
is
encryption
at
rest.
This
is
kind
of
like
encryption
is
broken
into
three
different
kind
of
categories:
in
transit,
at
rest,
and
in
use.
B
I
can't
really
do
much
to
protect
encrypted
data
in
use
because,
for
instance,
that
will
even
live
in
the
page
cache
in
Linux
or
illumos
or
other
things.
So
there's
not
a
lot.
I
can
do
there.
This
could
be
added,
but
this
will.
This
will
definitely
provide
a
big
layer
of
security
that
you
know
will
be
very
hard
to
overcome.
B
Yes,
but
you
have
to
have
also
given
them
your
key
at
the
same
time,
because
you
would
have
had
to
enter
your
passphrase
on
the
remote
server.
You
can
back
the
data
up
to
there
and
be
able
to
get
it
back
and
yeah
and
it
will
be
able
to
be
integrity
checked,
but
it
won't.
You
can't
decrypt
it
unless
you
give
them
the
passphrase
as
well.
B
The
question
is:
will
the
raw
send
send
the
encrypted
data
as
it
exists
on
disk,
or
the
unencrypted
data,
or
both?
Yep,
I
think
I
have
the
answer.
So
tell
me
if
this
answers
your
question
basically
I'm
gonna
be
adding
a
flag
which
will
do
a
raw
send
and
the
if
you
have
that
turned
on
it
will
be
exactly
as
it
is
on
disk.
If
you
don't
have
that
flag
turned
on,
then
it
will
just
do
a
regular,
send
and
you'll
have
to
have
the
keys
loaded
just
like
it
would
normally.
B
Oh
I
can't
hear
you
too
well,
but
it
sounds
like
you're
just
asking
for
performance
information.
We
did
some
testing
on
performance
and
basically,
what
we
found
is
approximately
anywhere
under
a
5%
CPU
overhead.
There
wasn't
a
big
overhead
as
far
as
throughput
to
disk
is
concerned,
because
that's
kind
of
the
bottleneck,
but
as
far
as
CPU
overhead,
it
could
add
as
much
as
5%.
Usually
we
saw
it
at
about
2%.
B
So
the
two
questions
were
this
well
I'll
answer
the
second
one,
because
I
remember
that
the
second
one
was
some
systems
like
LUKS
have
the
ability
to
have
multiple
keys
or
yeah
multiple
slots
to
unlock
the
master
keys
currently
right
now,
no,
but
that
could
easily
be
added.
The
encryption
keys
are
stored
in
a
ZAP,
so
it'd
be
not
too
bad
to
you
know.
Add
that
functionality
later
in
the
future,
but
right
now
it's
just
a
single
key
to
decrypt.
What
was
your
first
question?
I'm
sorry.
B
Have
we
had
anybody
with
the
extensive
crypto
background
check
all
of
the
protocols
and
procedures
that
we're
doing?
Basically,
the
answer
is
not
yet
but
we're
in
the
process
of
getting
that
done.
There's
a
big
thing
anytime
you
work
with
cryptography
where
any
time
you
ask
somebody
for
their
opinion,
they
say
I'm,
not
a
cryptographer,
but
and
then
they
give
you
their
opinion
and
we've
hit
a
lot
of
that.
But
we're
working
to
get
that.
B
So
for
that,
I
didn't
have
time
to
cover
this,
but
we're
actually
using
illumos,
which
we
have
ported
to
OS
X
and
illumos
and
Linux.
The
reason
for
that
is
because,
in
order
to
get,
we
needed
one
crypto
framework
to
kind
of
work
across
everything,
so
we
made
a
separate
module
called
the
ICP
which
will
also
support
the
new
checksums
like
SHA-512
and
Skein,
and
a
few
other
ones.
B
Basically
originally,
when
I
first
started
working
on
this,
it
was
kind
of
based
on
it,
but
in
the
time
that
has
happened
since
then
we
found
a
couple
of
problems
with
it
that
you
know
I
can
get
into
offline
if
you
want,
but
basically
we
found
a
couple
of
minor
flaws
that
we
thought
should
be
addressed.
It's
not
compatible
pool
wise
at
all,
but
that
kind
of
wasn't
a
goal.
B
Do
we
have
support
for
any
kind
of
external
encryption
stuff
not
at
the
moment,
and
the
reason
for
this
is
basically
because
open
ZFS
needs
to
cater
to
OSX,
illumos
and
open
ZFS
at
minimum,
and
it's
very
hard
to
kind
of
get
those
things
to
work
across
all
three.
It
could
be
added
in
the
future
if
we
could
figure
out
how
to
get
a
framework
to
work
with
the
hardware
encryption
devices.
B
Is
FreeBSD
on
my
radar?
I've
just
written
the
Linux
version.
I
really
want
to
thank
Jorgen
Lundman,
and
I'm
hoping
I'm
pronouncing
that
name
right,
because
I've
never
actually
talked
to
him.
It's
all
been
through
emails,
but
he
has
been
maintaining
the
illumos
and
OSX
ports
so
that
I
don't,
but
it
should
not
be
incredibly
hard
to
port
at
all.
B
Is
there
a
way
to
protect
a
USB
key?
If
you
have
the
passphrase
on
a
USB
key?
Is
there
a
way
to
protect
the
USB
keys
stuff?
B
There
are
plenty
of
ways
to
do
that,
including
things
like
eCryptfs
and
other
encryption
things
for
that
that's
kind
of
outside
of
ZFS.
But
the
idea
is
that
if
you
wanted
to
read
it
from
a
file,
it
was
more
gonna
be
automated
anyway,
because
if
not,
then
you
can
use
a
password.
You
know,
just
like
normal
and
I
will
be
generating
pretty
much
the
same
thing
that
the
encryption
key
on
the
USB
would
be.