From YouTube: Securing the Cloud with ZFS Encryption by Jason King
Description
From the 2019 OpenZFS Developer Summit
slides: https://drive.google.com/open?id=14uIZmJ48AfaQU4q69tED6MJf-RJhgwQR
B
You know, we use ZFS, obviously. The big request that has come up from customers is protecting data at rest on all these systems, and so the obvious answer is to use ZFS encryption. But, as with many things dealing with encryption, one of the big challenges is then: okay, how do you manage the keys for all of that? And so that's what I and some others have been working on: how to deal with that within Triton and SmartOS.
B
So there's the KBMAPI, which is the service, and then on each compute node there's a daemon. And the key part, which is kind of the interesting part, at least what people find useful, is that we're using PIV tokens, mostly YubiKeys, but anything that implements the PIV standard, basically to protect the key for the zpool. One thing that we've done is: on each compute node, we just use a single key for the entire zpool, and so we encrypt the entire zpool, not just individual datasets, because we use so much snapshotting and cloning.
B
All of our images for containers and for virtual machines are basically ZFS send streams that we clone and snapshot, so trying to get any finer-grained just doesn't really buy you anything and becomes incredibly complex to try to manage. Just for those that missed it: with ZFS encryption, you need that single encryption root for all your clones and whatnot. And so the PIV token, I liken it to something like two-factor authentication; some crypto purists may disagree that that's the best conceptual way to think about it.
B
But basically, you use that to encrypt the actual symmetric key that you're using for the pool. That way, only the thing with the private key can decrypt it. In this case, I use the terms PIV tokens and YubiKeys interchangeably, just because there's some terminology overload, so it makes it a little easier to keep straight. Basically, on the devices themselves, they have public and private key pairs, and so we use one of those keys.
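The wrapping idea described above can be sketched as follows. This is a toy illustration, not the Triton implementation: it uses textbook RSA with tiny, insecure parameters purely to show the shape of the scheme, where the token's public key wraps the pool's symmetric key and only the private key, which in reality never leaves the PIV token, can unwrap it.

```python
import os

# Toy textbook RSA with tiny, INSECURE parameters, purely to
# illustrate the shape of the scheme: the token's *public* key
# wraps the pool's symmetric key, and only the *private* key
# can unwrap it.
P, Q = 61, 53
N = P * Q   # public modulus (3233)
E = 17      # public exponent
D = 2753    # private exponent (E * D == 1 mod lcm(P-1, Q-1))

def wrap(key: bytes) -> list[int]:
    """Encrypt each key byte with the public key (done on the host)."""
    return [pow(b, E, N) for b in key]

def unwrap(wrapped: list[int]) -> bytes:
    """Decrypt with the private key (in reality, done inside the token)."""
    return bytes(pow(c, D, N) for c in wrapped)

pool_key = os.urandom(32)          # stand-in for the pool's symmetric key
stored = wrap(pool_key)            # this ciphertext is safe to store on disk
assert unwrap(stored) == pool_key  # only the private-key holder recovers it
```

A real deployment would of course use the token's full-size RSA or ECC key with proper padding; the point here is only the asymmetry between wrapping and unwrapping.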
B
So basically, during the boot process, what happens is, early on, you set up the administration network before we import any of the storage. Then, if the pool is encrypted, it will contact this KBMAPI service to request the PIN for the YubiKey that's on the system, basically to provide the PIN to unlock the pool. And with that, the request itself is actually signed by the YubiKey, so the token has to be present to even get the PIN.
B
So someone can't just say, "oh hey, give me the PIN to this server"; the token actually has to be present. Then once it has that, it can decrypt it, you can load the key, and now that the pool is already imported, we can load the key, mount up all the file systems that are there, and proceed on normally. Let me check my notes here... okay.
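The signed PIN request can be illustrated with a minimal sketch. The function names and the shared-secret HMAC here are assumptions for illustration; the real flow uses a signature made by the YubiKey's private key, but HMAC from Python's standard library shows the same idea: the service only releases the PIN when the request proves possession of the token's secret.

```python
import hashlib
import hmac
import os

# In the real system the token's signing key never leaves the device;
# here a shared HMAC key stands in for it.
token_secret = os.urandom(32)

def sign_request(body: bytes) -> bytes:
    # Done by the party holding the token: sign the request body.
    return hmac.new(token_secret, body, hashlib.sha256).digest()

def release_pin(body: bytes, signature: bytes, pin: str):
    # Service-side check (hypothetical shape): only a request signed
    # with the token's key gets the PIN back.
    expected = hmac.new(token_secret, body, hashlib.sha256).digest()
    if hmac.compare_digest(expected, signature):
        return pin
    return None

body = b"GET /pivtokens/<guid>/pin"
assert release_pin(body, sign_request(body), "123456") == "123456"
assert release_pin(body, b"\x00" * 32, "123456") is None
```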
B
This protects from... so you have to worry about the PIN leaking. And then we generate a random PIN, we register it with the service, we generate the pool key and create the ebox to store it, and then we create the zpool. What we do for the ebox itself: it's just a serialized structure, and we basically encode it and store it as a user property on the root dataset of the pool.
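Storing a serialized structure in a ZFS user property can be sketched like this. The property name, field layout, and GUID value are invented for illustration; the real ebox format is defined in the design documents mentioned at the end of the talk.

```python
import base64
import json

# Hypothetical ebox contents: the wrapped pool key plus the metadata
# the talk mentions (the token's GUID and recovery material).
ebox = {
    "version": 1,
    "guid": "55AA00000000000000000000",  # PIV token GUID (placeholder)
    "wrapped_key": base64.b64encode(b"\x01" * 48).decode(),
    "recovery": {"parts": 10, "threshold": 5},
}

# Serialize and encode so the value fits in a ZFS user property,
# e.g. `zfs set com.example:ebox=<blob> pool` (property name invented).
blob = base64.b64encode(json.dumps(ebox).encode()).decode()

# Reading it back at boot is just the reverse.
decoded = json.loads(base64.b64decode(blob))
assert decoded == ebox
```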
B
So then we can just read that, and since it's all encrypted, obviously without the YubiKey, without the PIN, you can't get it; you can't decrypt the zpool key without that. It also contains some additional metadata: it records the GUID of the token, as well as some bits for recovery, which is the other piece, and the one that has been the really long part of the work. Because an obvious problem with that setup, of course, is: what if the token gets lost, or gets damaged, or just gets erased?
B
And then what do you do? So one thing we do have is a recovery procedure, where basically we do a form of key escrow. When we create that ebox that I talked about, we actually create two copies of the key: one protected by the PIV token, and the other one we split into n parts, where the value of n is decided by the operator, along with a threshold value m, which can be less than or equal to n.
B
The idea is that the key is split up into n parts, and you need at least m of them to recreate the key. The idea, then, is that employees, or, you can have sort of a break-glass type thing, depending on what the operator wants to do, where the key is split up and protected by PIV tokens that are assigned to people, like personal YubiKeys. And then what happens is there's a challenge/response process.
B
So if that does happen, you say, "I need to recover this compute node," and it'll give you a number of challenge phrases based on the configuration. Then you get however many people you need, based on whatever policy you've set, and they take the challenge string; there's software that they run on their laptop or desktop with their own personal YubiKey. They pass it in and provide the PIN for their personal YubiKey.
B
It gives a response, and once you have enough of those, it can extract the key, unlock the pool, and things can boot up. Of course, at that point, you can replace the YubiKey with a new one, or whatever you need to do. That way, if the token that's on the box is damaged, your data is not gone, although you should always still have a separate disaster-recovery backup plan; obviously this isn't a substitute for that.
B
Otherwise, losing the token would leave all the data inaccessible. The other bit about that: the split is done using a Shamir secret sharing scheme, if you're familiar with that. The idea, of course, is that if, say, you have ten parts and your threshold is five, to pick some numbers, then even having four of those doesn't give you any information about the final key. You have to have at least five, so you can't get, like, part of it.
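A minimal sketch of Shamir secret sharing over a prime field shows why four of five shares reveal nothing: the secret is the constant term of a random degree-(m-1) polynomial, and fewer than m points leave that constant term completely undetermined. This is an illustrative implementation, not Triton's.

```python
import random

P = 2**127 - 1  # a Mersenne prime large enough for a small secret

def split(secret: int, n: int, m: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares with threshold m."""
    # Random polynomial of degree m-1 with the secret as constant term.
    coeffs = [secret] + [random.randrange(P) for _ in range(m - 1)]
    def f(x: int) -> int:
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def combine(shares: list[tuple[int, int]]) -> int:
    """Recover the secret from >= m shares: Lagrange interpolation at x=0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = random.randrange(P)
shares = split(key, n=10, m=5)
assert combine(shares[:5]) == key   # any 5 shares recover the key
assert combine(shares[3:8]) == key
# 4 shares interpolate some other polynomial; matching the real key
# is overwhelmingly unlikely.
assert combine(shares[:4]) != key
```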
B
Behind all that, that's also just trying to protect the key. And then, of course, new people come in, people leave organizations, people lose their YubiKeys. So the other thing we had to do is build all this plumbing so that you have this policy: how many parts, your threshold value, which tokens you use, and the ability to update it when changes occur because of all those events.
B
That's also, if you saw some of the discussions about using channel programs, part of why we're looking into those. Since we're storing this as a property, being able to essentially atomically update a property, change the pool key, and do things like that, then make sure that it either happens or it doesn't, means you're not left in some intermediate state that then requires manual cleanup, just because that's never fun. So that's the basic, very high-level overview.
B
You
can
get
way
off
into
the
detail,
so
so
that's
all
the
I
just
what
a
cover
just
with
the
presentation.
The
two
links
here
are:
some
of
the
design
are
the
two
design
documents
that
go
into
much
more
detail.
The
first
one
is
a
more
theoretical
abstract
in
terms
of
the
concepts
RFD
77
173
is
more
into
the
details.
One
thing
we're
late
to
this
is
boom
bees
actually
intended
to
do
more
outside
of
just
managing
ZFS
keys.
B
But that's all I was going to talk about for this, just because it's probably the most relevant part for people here. You can freely browse those, and, like I'm saying, they go into far, far more detail. So that's basically it; like I said, I tried to keep it, hopefully, short. So, if there are any questions...
A
Cool, thanks Jason. It's really cool to hear about. You know, I think we design a lot of stuff, including the encryption stuff in ZFS, to be able to do anything, but a lot of times it takes a bunch of work on top of that to make it solve real problems. So it's really cool to hear how you've done that with the encryption stuff. We do have time for questions, if anyone here has them; I know there are some folks online.
B
And, I mean, usually it's kind of a two-factor setup, but using the YubiKeys seems to interest most people; that's kind of also maybe the most, I'd say, different, or maybe unique, thing about it. And again, the threat models here were things like: someone steals the drive, someone steals the server, or you dispose of the drives, and you're just trying to protect the data. Obviously, there are threat models that this doesn't cover.
B
It's usually the bits... I've done demos before, although they're kind of anticlimactic, because the whole point of all this integration is that it should pretty much look and feel just like the encryption wasn't there; it's hopefully all in the background and things just work. So when I demo it, it's like: okay, I set up the compute node, tell it it's encrypted instead of not, and it sets everything up. Aside from doing, like, a `zfs get` or whatnot, it all looks the same.
A
One question, which is: so I get that each compute node has the YubiKey, like, basically permanently installed in it, so if it reboots, it's going to request the PIN from the PIN server. How do you manage the PIN server? Like, if that dies and reboots, does it also just come back automatically, or is that more of an event where it needs manual intervention?
B
Obviously,
there's
security
implications
for
that.
We
may
look
into
other
techniques
to
alternate
to
secure
that
the
head
node.
Just
because
were
the
reasons
for
choosing
UV
keys
versus,
say,
like
an
HSM.
Is
that
they're,
like
an
order
of
magnitude
cheaper,
you
know
you
be
keys,
are
40
50
bucks
versus
500
each
which,
of
course,
you
know
times
a
few
thousand
machines
or
more?
What
adds
up,
but
you
know
in
terms
of
the
headnotes
and
sets
you
know
one
or
two
machines.
B
So you could manage or protect things there in a way that you maybe don't use for the rest of the nodes, and there is some discussion about that, actually, in RFD 77. Again, it's just trying to optimize, because obviously you don't want encryption to be like: oh yeah, but it's going to cost you an extra thousand dollars per compute node, yeah.