From YouTube: ZFS & Containers by Michael Crogan
Hello, so I'm going to talk about ZFS and containers, and I'm going to attempt to condense exactly 20 minutes of content into 20 minutes of time, so I'll do my best. I've architected a project which can be built upon during the hackathon. These are the URLs to the GitHub project and also to a Vagrant image which will let you use OpenVZ and ZFS simultaneously.
You have divergence of your data, inefficiency in its transfer, and inflexibility in expressing what changes you want to propagate. I wanted to advocate ZFS as a back end for a variety of reasons and to reduce the barrier to entry for new use cases. Dedup is known to be expensive, so there are other ideas around that.
The project's aim is to make it transparently easy, with one command, to say: I've made a change to my data set and I want to snapshot it in time, or I want to derive something from it.
I'll go into more detail on the data aspect next, but I also wanted the project to be versatile and flexible, efficient in terms of incremental transfer, with a notion of dedup that I'll explain later; to enable some kind of greater privacy through a dataset injection that I describe later; and to advocate that this may be very well used in a dev/build/production environment. In a dev environment you get perfect fidelity by synchronizing the exact snapshots that are in production.
You gain efficiency by having a common starting point, and your incremental release is a small delta from your starting point. So I leveraged ZFS and OpenVZ together. Briefly, OpenVZ is similar to Docker: it's a container technology into which you may put an OS image and an application. OpenVZ adds the ability to checkpoint your state, which is handy for debugging, failover or migration, things like that, but under the hood the same principles can be applied with Docker and CRIU.
It turns out that checkpointing your entire OS image in OpenVZ is very efficient: if you're only using 200 megabytes in your applications, it'll write a 200 megabyte image representing your whole OS state. In contrast, sending a full VM image of your entire OS may take 8 or 16 gigabytes or whatever it is, making it problematic to do direct checkpointing or migration. And the whole use of snapshots and clones is very fast and efficient in ZFS. So observe a practice which is an example of a use case.
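As a rough illustration of that checkpoint step, a minimal sketch using the stock OpenVZ tooling might look like the following; the container ID and dump path are placeholders, not taken from the talk.

    # Checkpoint a running OpenVZ container; the dump only contains the
    # process/memory state actually in use, so a small workload stays small.
    vzctl chkpnt 1007 --dumpfile /vz/dump/1007.dump

    # Later, bring the container back from that dump on the same or another host.
    vzctl restore 1007 --dumpfile /vz/dump/1007.dump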
It can be generalized, but this is the example for this discussion. So I have a core starting point: a data set which is like a container image. It's like an entire OS, a Debian 7 starting point, within OpenVZ, and it appears within a hierarchy with the ZFS back end. That core starting point, 1007 I call it, is cloned, copy-on-write, and becomes the basis for an intermediate "common packages" container data set, meaning packages that most of my containers include, like the build environment.
This is an efficient intermediate point for encoding, and then from that I have specific containers: maybe a web server, a build environment, and these are lightweight to encode relative to the 2007. And this is just what it looks like under the hood: I constructed a root directory, meaning within 1007 here are your sbin, etc and so on, and your dump is a single file. To clarify some of the assumptions around this, I assume that derivative hierarchies are largely additive.
If we have a starting point of your base and we have a build environment, largely we're adding content. It's not true all the time, but it is true by construction, or in this particular use case. And because there's commonality in the way that Debian or other software pulls in dependencies, there's identical content that's actually shared between these containers that would otherwise be isolated, and I choose this intermediate point deliberately to be efficient. The last assumption is that the size of the data is larger than the size of the metadata.
This is just showing, more concretely, the structure of the actual content of the data set. So, just to be clear, this is one example: up to the point of the root, this is one ZFS data set, and then up to the dump is another ZFS data set. And likewise, when I checkpoint, maybe on a daily basis or when there's an update, this is just showing that it's a recursive snapshot, and then there's a serial notion of checkpointing over time for a fixed container.
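A minimal sketch of that layout and the recursive snapshot, with made-up pool and dataset names (the talk doesn't give the exact paths):

    # One dataset for the container's root file system, one for its dump file.
    zfs create -p tank/ct/1007/root
    zfs create    tank/ct/1007/dump

    # A recursive snapshot captures both child datasets at the same instant,
    # giving the serial, per-container checkpoint history described above.
    zfs snapshot -r tank/ct/1007@2014-01-15
    zfs list -t snapshot -r tank/ct/1007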
This indicates how I basically start with a basis and derive an intermediate point. I take the example of build-essential, which pulls in a lot of packages, and then from this 2007 intermediate point I add a small addition, nginx, and this is done through clones and snapshots. Here's another example with Apache: a different container ID, with 2007 as the common starting point. And one obvious claim is that the storage footprint of using this ZFS ecosystem is more efficient; it's also more efficient to send, and to know what is actually differing in your intermediate containers. So when I say 5007 given 2007, it's considered an incremental delta from 2007 as a starting point. If we were to compare this to a full concatenation of each of these, that would be duplicating a lot of content that ZFS can encode more efficiently.
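A sketch of that derivation chain, again with assumed dataset names and snapshot labels rather than the ones from the slides:

    # 1007 is the Debian 7 base; snapshot it, then clone copy-on-write into the
    # intermediate "common packages" container 2007 (build-essential goes there).
    zfs snapshot -r tank/ct/1007@base
    zfs clone -p tank/ct/1007/root@base tank/ct/2007/root

    # Specific containers (nginx, Apache, ...) are in turn clones of 2007, so
    # each one only stores its small delta on top of the shared starting point.
    zfs snapshot -r tank/ct/2007@common
    zfs clone -p tank/ct/2007/root@common tank/ct/5007/root
    # "5007 given 2007" is then just the delta since @common, which is what
    # gets stored and sent instead of a full image (see the send sketch later).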
So this is ZFS and containers: combining the two, we can do lightweight snapshot, clone and rollback, and because we are leveraging OpenVZ, it can also include the state of your running system. So, to be able to roll back: when we take a snapshot and it includes state, we do a recursive snapshot of the 1007 data set, and it includes a dump file that is the result of a vzctl suspend.
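Putting the two halves together, a stateful checkpoint might be sketched like this; the container ID, pool name and dump path are assumptions for illustration:

    # Suspend the running container and write its process/memory state into the
    # dump dataset, then capture file system and dump in one recursive snapshot.
    vzctl chkpnt 1007 --dumpfile /tank/ct/1007/dump/Dump.1007
    zfs snapshot -r tank/ct/1007@checkpoint-2014-01-15
    # Resume the container from the dump so it keeps running afterwards.
    vzctl restore 1007 --dumpfile /tank/ct/1007/dump/Dump.1007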
So we capture the state of your files and also a single-file image representing what was running at that time, and this shows what it would look like under the hood, at a low level; this is roughly what would be done to represent that. And here are some example uses of the tool. You see it's very usage driven: a checkpoint will automatically snapshot your state, and you can choose to snapshot every container that's running,
every dataset you've got, or choose a particular container to derive a clone from: suppose you've got your 1007 and you want to derive 2007, your intermediate-step container. This functionality can be built over time, but these are some example commands. You can also see what's differing locally versus your last snapshot, and suspend and resume.
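The exact command names weren't captured in the transcript, so the following is only a hypothetical sketch of that usage-driven interface, with invented subcommand names standing in for the project's real ones:

    # Hypothetical wrapper commands over zfs/vzctl; names are illustrative only.
    ztool checkpoint 1007            # suspend + recursive snapshot of one container
    ztool checkpoint --all           # checkpoint every running container
    ztool clone 1007 2007            # derive a new container from an existing one
    ztool diff 1007                  # what changed locally since the last snapshot
    ztool suspend 1007 && ztool resume 1007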
This gives fidelity of the container, as mentioned, with best practices: efficient storage, integration with the checkpoint/resume of your OS and application state, and it's possible to do something that you usually don't do, which is have an environment and then roll it back. Sometimes I accumulate data, you know, temporary data, and my file system grows and grows, but I realize it was just a temporary experiment, so this makes it easy, one command, to essentially roll back.
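Underneath, that one-command rollback could look roughly like this, a sketch with assumed dataset names; note that -r on rollback discards any snapshots newer than the target:

    # Stop the container, roll each dataset in its hierarchy back to the chosen
    # checkpoint, then restore the process state that was dumped at that time.
    vzctl stop 1007
    zfs rollback -r tank/ct/1007/root@checkpoint-2014-01-15
    zfs rollback -r tank/ct/1007/dump@checkpoint-2014-01-15
    vzctl restore 1007 --dumpfile /tank/ct/1007/dump/Dump.1007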
Now, the core technology of the tool is synchronization of these data sets across hosts, and to do so manually is complicated, especially involving three hosts or coordinating efforts. Without that, it must be done manually, or maybe there's another tool out there, but the data sets become out of sync, and that's not what I want when I have multiple hosts and I want to essentially propagate or share a container across them. So the project identifies a common snapshot starting point.
Suppose you want to coordinate between n hosts: if you have a peer or a central repository, you can simply include in your practice synchronizing to the central place first, but any number of enhancements can be built upon this. The core notion is a push and a pull. As an example, suppose I'm working in a development environment and I notice a bug; I can quickly send the state of my running system and everything in it.
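A minimal push sketch, assuming both hosts already hold the shared @common snapshot; the hostname and dataset names are placeholders:

    # Snapshot the current state, then send only the delta since the common
    # starting point; the receiving host applies it on top of its own copy.
    zfs snapshot tank/ct/5007/root@bug-report
    zfs send -i @common tank/ct/5007/root@bug-report \
      | ssh build-server zfs receive -F tank/ct/5007/root
    # Repeat for the dump dataset, or use zfs send -R to cover the whole hierarchy.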
Likewise, in a production environment, a centralized repository can roll out the incremental builds, and assuming most of the data doesn't change, this can be like a 10x speed-up. And again, you can quickly fork your data and roll back.
A work in progress of the script is injection of a data set, like your home directory. If you have a home directory, it might be part of your build environment, and you don't necessarily want that evolving as part of your shared ecosystem.
So in the spawning of the container you can inject /home/user, or whatever it may be, into the container as it runs, and this tool can treat that data set independently: it can choose to share the home directory, or, if it's a temporary directory, it might exclude it entirely from synchronization. The injection lets you consider the holes or pieces within your environment that you want to consider or do not care about.
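One conventional way to do that injection in OpenVZ is a per-container mount script that bind-mounts a separately managed ZFS dataset into the container at start; this is only a sketch under that assumption, with made-up paths:

    #!/bin/bash
    # /etc/vz/conf/1007.mount -- run by vzctl when container 1007 starts.
    # Bind-mount an independently managed dataset (e.g. a home directory held
    # in its own ZFS dataset) into the container, so it can be shared with or
    # excluded from synchronization on its own terms.
    source /etc/vz/vz.conf
    source "${VE_CONFFILE}"
    mount -n --bind /tank/home/user "${VE_ROOT}/home/user"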
So, I think we have time, so there's also an idea of a rebase; this is the generalized idea around greater efficiency.
Suppose we have a collection of containers and they live in this ecosystem. What we can do, if we've accumulated them over time, maybe less efficiently, is we can rebase: we can reconstruct the identical content, but in such a way that it is efficiently encoded by ZFS through deliberate copy-on-write constructions. So, a very simple way of doing this, this is just the overview.
Now, there will be overlap and there will be parts that don't overlap. Where content is common, it will simply be encoded once in your result; and where there is non-overlap, there will be files that differ and there will be conflicts. But the notion is that the percentage of data that may differ, for example your package database may differ for each individual container, is, in terms of storage footprint, a much smaller portion of the data than what is common.
So the idea is, you derive a consolidated container. You clone that to form primed versions of containers one, two, five and seven, and then you do a perfect-fidelity rsync from the original content to the primed version. You get identical content, deletion is lightweight in terms of footprint, and you're conserving space because you're getting a maximal notion of the union. And, as I said before, the part of the content that differed still exists, it will be encoded with high fidelity, but it is small, so it is efficient to do this.
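A rough sketch of that rebase, with placeholder names; the consolidated dataset is assumed to already hold the union of the containers' content:

    # Snapshot the consolidated (union) container and clone a primed copy per container.
    zfs snapshot tank/ct/consolidated/root@union
    for id in 1 2 5 7; do
      zfs clone -p tank/ct/consolidated/root@union tank/ct/${id}-primed/root
      # rsync with --delete restores perfect fidelity: anything not in the
      # original container is removed from the primed clone, and deletions in a
      # copy-on-write clone cost almost no space.
      rsync -aHAX --delete /tank/ct/${id}/root/ /tank/ct/${id}-primed/root/
    done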
But if this were enabled, then one could encrypt, with, say, a workplace key, those private or proprietary changes to a public data set, and the same with taking that further to the personal component. It's not perfect from an encryption standpoint, because you have some known or anticipated data, you know, chosen or known plaintext, but it may achieve useful effects to be able to add encryption.
Likewise, if Debian upstream, or if we ourselves, knew exactly which directories potentially contain personally identifying information, we would know exactly what to inject; we can consider temp directories as orthogonal, and likewise there's a variety of advantages, sorry, enhancements to the tool. I'll be adding issues, in terms of potential enhancements on the actual content, later today, but also, perhaps if this were taken up as a hackathon project, other enhancements could be taken up.
Yeah, so the question is: how can this be put to practical use, how do we enable it or use it? So, the Vagrant image is a starting point which has an OpenVZ plus ZFS entire Linux distribution as a starting point, and into which you can enclose the project. So that's the starting point: you essentially can either do this yourself or use this as a starting point. OpenVZ is the technology, and you can build your ZFS into that.
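Getting started would then look roughly like the usual Vagrant workflow; the box name below is a placeholder, and the box URL is the one mentioned at the start of the talk:

    # Fetch and boot the prepared OpenVZ + ZFS environment, then work inside it.
    vagrant box add openvz-zfs <box-url-from-the-slides>
    vagrant init openvz-zfs
    vagrant up
    vagrant ssh          # the GitHub project can be cloned and run inside the VM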
Yeah, so there are two parts to it. The mashing of everything together: if there's an identical file in multiple containers, it'll only be stored once, and after cloning, the clone operation takes no storage footprint. Then there's an incremental part for each individual container: we do an rsync, so there's an rsync for container one, container two, container five. Is that how you'd imagined it, or maybe I didn't understand?
Well, so, because the notion is additive growth of either snapshots or derived containers, it's more efficient to go backwards in time. If we were to go forwards in time, we'd be re-encoding things in more than one way. So, as an example, imagine container five and container six share a lot of content, that is, they would contain the same file; if we encode it forwards in that way, we're essentially replaying the same content again. Going backwards, you identify what actually is duplicated.