From YouTube: Kubernetes UG VMware 20201001
Description
October 1, 2020 meeting of the Kubernetes VMware User Group - presentation on stateful application backup and disaster recovery for Kubernetes clusters running on the vSphere platform.
A
Hi, welcome to the October 1 meeting of the Kubernetes VMware User Group. Today on the agenda we're going to talk a little bit about storage and disaster recovery. Some of my usual co-hosts are likely to be joining us late today. Myles Gray is tied up with some activity related to the VMworld conference that's going on, but he thinks he's going to be joining us late using his cell, though he may be in listen-only mode. And Bryson Shepherd just Slacked me saying that he's tied up in another meeting for now, but it sounds like he might be joining us a little late. I'm hoping he can join us, because we wanted to talk a little bit about a recent incident he had where he was trying to move a Kubernetes cluster from one vCenter to another. It took a little while, but I guess he figured out enough to get it working, and I'm hoping he'll be able to share what it took to get that to work.

A little while ago we had on the agenda a topic by Gopala about what's new in the vSphere 7 Update 1 release. He's unable to make it today, but he'll join us at a future meeting. We still do have Dave Smith-Uchida - did I get your name right? I'll let you say it again if I didn't. (It's Japanese.) Okay - he's going to talk to us about the intersection of cloud native storage and data protection. So with that said, I'll let you get started, Dave.
B
Okay. So I have slides, but this is a very small group, so feel free to jump in if you have questions - or, you know, if you already know all this stuff or whatever, I'm happy to move in a more freeform direction as well. Yeah, here's Bryson. So, I'm Dave Smith-Uchida, and I'm currently styling myself as the Velero architect. I've moved over to the MAPBU.
A
And Dave, I'll jump in - I sometimes hear from newbies to this, so maybe a little intro on what Velero even is, just in case somebody hasn't heard of it. And also the acronym - we're horrible at acronyms at VMware. MAPBU is the Modern Applications Business Unit; this is the part of VMware that engineers Kubernetes, as well as a full build-run-manage of application things using open source technologies to run on top of Kubernetes.
B
Yes, and that's a very good explanation - I need to remember that one. And yeah, so Velero is our open source backup product for Kubernetes, and I'll go into that.
B
Okay, so there's kind of an open question - I've heard varying opinions on what kind of data protection you need for Kubernetes. Some people will say you don't need it, and other people will say yes, you do. So I tried to break down my view of what's going on and what kind of protection you may need. So, my understanding of Kubernetes...
B
It started out being pretty much a front-end scaler - to scale up and scale down compute and to handle what you would consider to be a stateless app. That may be either something that doesn't actually have any state, or it may be a front end for another service that already controls all of its state, has ways to keep track of all that stuff, and is already protected. So really, the thing that's running in Kubernetes is just compute, network, I/O, etc., that scales up and down as needed. In that case, if you just keep revision control on your YAMLs, you're fine, because that's your application: typically you can kill your application, reload it from the YAMLs, and it should come up the same way, and all of your actual state is stored somewhere else safely.
B
Then we move on to people storing things in a cloud service. So you bring yourself up in, say, AWS, and you put things in S3 and DynamoDB or whatever services the cloud provider is giving to you. Again, you want to have your revision control on the YAMLs, and you may or may not want backups and snapshots of the actual stuff in the cloud service.
B
Some
of
that
will
depend
on
things
like
scale.
You
know
as
you
as
you
move
into
extremely
large
scale.
Apps
like
take
a
like
consider
like
netflix,
the
idea
of
snapshotting
and
backing
up
all
of
netflix
gets
a
little
crazy
and
that's
not
really
going
to
work,
and
so
you
know
an
application
at
cloud.
Scale
is
usually
built
to
have
all
of
its
internal
replication,
redundancy,
etc,
so
that
it
will
not
lose
data
and
it
can
survive
all
kinds
of
bad
things.
B
So that's kind of my take on things with cloud services. Now, we've been starting to push this idea of running persistent volumes - running applications inside Kubernetes that store their state in persistent volumes managed by Kubernetes - and now the responsibility for who's going to protect the data has really moved back to the application.
B
So you need to do things like back up the data in these persistent volumes, and you need to back up your Kubernetes resources - there's state in Kubernetes now, like which volume was attached to which StatefulSet. All these kinds of things need to be handled, and that's where you really start looking at: hey, I want to do backup, restore, and remote replication of my application and its data. Another model we're seeing is people building applications that keep the application state in custom resources in Kubernetes.
B
So your runtime state, which may actually be really important to you, is no longer in your original YAMLs. Your original YAMLs tell you how to set things up and run the servers and so forth, but then the applications start keeping track of things in CRs - like where they are in terms of processing something that might be stored in another service, like a database or an object store - and now that information is actually really important.
A
Can I ask you a question? I'm aware of some of these Kubernetes plug-ins that maintain state too - there are a few CNIs, network plug-ins, that maintain state associated with the networking implementation. Have you ever seen any big listing of where these things keep their state? Because I'm thinking it'd do the world a service if somebody who is in the backup business somehow tried to consolidate that.
A
Some of those store their things in etcd, in which case an etcd backup might get it - although some of them store it in an etcd that is potentially not the same one used by Kubernetes - and some of them use databases. I don't know - it strikes me that you might want a general-purpose tool that could capture all of this stuff.
B
That's
something
I
mean
valero
will
capture
it
all
some
of
the
tricks
are
things
like.
I
think,
the
the
really
different
one.
One
of
the
things
I
see
is
really
different
between
the
way
kubernetes
works
and
the
way
we've
traditionally
used.
Vsphere
is
typically
your
application
that
runs
in
a
set
of
vms
doesn't
actually
interact
with
the
vc
or
control
plane
very
much.
It
doesn't
care,
it
says.
B
Kubernetes applications really intertwine the control plane with the application - and this is true for more than just vSphere. We start putting in things like volume IDs and volume handles, and these become globally unique IDs. Say, for example, you just take all your Kubernetes resources and replicate them: the volume handles are pointing at the same globally unique ID, and if you're not careful you will wind up with two applications pointing at the same underlying volume.
B
Well, I mean, it depends on how you look at it. I kind of view Kubernetes as being analogous to, say, an operating system, where I would want to control my processes. If I'm in Linux, I can control my processes; I can list them out.
B
In Kubernetes I can do the same thing, but with pods, and I can, for example, adjust my number of replicas on the fly by just slamming a value into the YAML and increasing the size of my StatefulSet, and let Kubernetes do the scaling for me. But...
B
Yeah, and things like the Kubernetes namespace - the Kubernetes namespace and control plane itself - are not a self-contained entity; they point out to external resources, which is different from a VM, because a VM is more self-contained. So, Velero is our backup application. It's designed to back up Kubernetes resources to object storage. We have a number of different object store plugins: we can talk to S3-compatible stores, including on-prem things like MinIO, and AWS S3.
B
We
have
plug-ins
from
google
cloud,
azure
a
whole
bunch
of
other
things
and
it's
possible
to
write
your
own.
It's
it's
pretty
open
to
plug
those
in
and
we've
in
fact
done.
Some
of
that
dell
emc
power
protect,
has
done
integration
with
valero
and
they've
written
an
object
store
plug-in
so
that
the
valero
data
goes
directly
to
power
protect
rather
than
through
some
kind
of
an
s3
compatible
api.
B
Valero
runs
on
every
kubernetes
platform,
so
you
know,
we've
got
it
running
on
vsphere
aws.
If
you
want
to
run
it
in
mini
cube
or
on-prem
red
hat,
open
shift,
it
runs
anywhere
everywhere.
It's
not
tied
to
the
vmware
infrastructure
at
all
or
any
particular
cloud
provider.
B
It's
part
of
tanzan
and
we
have
integrations
going
on
with
tens
of
mission
control.
Where
you
can
do
backup
from
mission
control.
You
can
control
your
different
clusters.
You
can
say
hey
back
this
one
up
and
it'll
trigger
valero,
as
I
mentioned,
we've
done
integrations
with
dell
emc
power
protect.
So
this
is
actually
the
core
of
the
power
protect,
kubernetes,
backup
and
red
hat
uses
it
in
their
openshift
as
their
backup
application.
So
it's
gotten
quite
a
bit
of
traction
in
the
market.
B
Yeah, and it's integrated - but yeah, all the source code is there. You can look into just about everything, and we're very committed to maintaining that open source stance with the product. We've even done that with the vSphere plug-in: we've kept as much open source there as we can.
B
Yes - and you can also get paid support for it through Tanzu, if you'd like that. So, basic capabilities for Velero: it can back up all your cluster resources - everything in the cluster, you can just say back up everything - you can back up individual namespaces, and you can select by label. The label could, for example, span all of your namespaces, and Velero will look for things with that label everywhere in the cluster. It gets installed inside the cluster; it is cluster-scoped.
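Those selection options map directly onto fields of Velero's Backup custom resource. A minimal sketch, assuming the standard velero.io/v1 API; the names and namespaces here are illustrative:

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: app-backup            # illustrative name
  namespace: velero
spec:
  # Omit includedNamespaces (or use "*") to back up everything in the cluster.
  includedNamespaces:
    - app-frontend
    - app-database
  # Optional: select resources by label; a selector can effectively span
  # namespaces when includedNamespaces is left open.
  labelSelector:
    matchLabels:
      tier: database
```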
B
At the moment, we can back up persistent volume data either using something called Restic or using a snapshot plug-in - I'll go into a little more detail on exactly what Restic and the snapshot plug-ins do in a minute. We have things like execution hooks, which are just things that execute; you can use them for quiescing an application, or really anything you like. It has a backup scheduler - it's pretty basic, but it does the job.
B
You can control it from the command line or by writing custom resources - the command line basically just writes CRs, and then the server reacts to them - and we can restore into new clusters or namespaces, or into an existing cluster and namespace.
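Since the CLI just writes CRs, the basic scheduler mentioned above is itself only another custom resource the server reacts to. A sketch of a Schedule object, again assuming the standard velero.io/v1 API; the cron expression and names are illustrative:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-backup        # illustrative
  namespace: velero
spec:
  schedule: "0 2 * * *"       # cron syntax: every day at 02:00
  template:                   # an ordinary Backup spec, stamped out on each run
    includedNamespaces:
      - app-database
    ttl: 720h                 # keep each backup for 30 days
```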
B
So this is an overview of the Velero architecture, without any of the fancy stuff - plugins and all the rest. We have the Velero CLI, or kubectl, which can be used to control it. These write custom resources for Velero - things like Backup - and then the Velero server, which runs inside its own pod, reacts to these custom resources and orchestrates things: it talks to the API server and serializes things from there.
B
We also call plugins for various things. And then we have Restic, which is an open source file-system-to-S3 backup application. We've integrated it inside of Velero so that we can back up systems that don't have snapshotting capabilities. It's not really scale-out - it's distributed, controlled by these CRs. And actually this is kind of an interesting pattern, because I was talking earlier about backing up the application state.
B
On restore, we pull that tar file that we made down from the object store, look inside it, and start restoring objects - and there's a defined order for it. We do things like custom resource definitions first, then namespaces, storage classes, then the volumes - persistent volumes, then the persistent volume claims that are built out of those - and we work down towards things like pods. The goal is to get the storage in place before the pods get started.
B
So
I've
been
talking
about
rustic
versus
snapshot
plugins,
so
restick
is,
as
I
mentioned,
it's
a
file
system
to
object,
store,
backup
application
that
we
use.
It
is
storage
system
agnostic,
but
because
it's
storage
system
agnostic,
it
doesn't
do
things
like
take
snapshots
and
it's
basically
walking
across
the
file
system.
So
you
have
to
be
careful
depending
on
your
use
case,
like
one
of
our
our
sample
cases
is
a
log
file
backup
you
don't
need
to
quiesce
anything
because
you
know
pretty
much
rested,
grabs
the
file
and
the
file's
being
written
sequentially
anyway.
B
But if something is randomly writing in a file and we're walking across it sequentially, you can get pretty bad results, with no write ordering being applied. So in order to be safe with this, we need to actually quiesce the application, or you can do something like execute fsfreeze via an execution hook.
A
Can I ask you something, Dave? (Absolutely.) Those application hooks - do they also potentially do things like log truncation, or are they just for quiescing?
B
No, they're basically "run this command in this pod." So they're pretty open - they're pretty open-ended.
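The "run this command in this pod" shape, including the fsfreeze case mentioned above, is commonly expressed as pod annotations that Velero's backup hooks pick up. A sketch using the standard pre/post hook annotations; the container name, image, and mount path are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-database            # illustrative
  annotations:
    # Freeze the filesystem just before the volume is backed up...
    pre.hook.backup.velero.io/container: my-database
    pre.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--freeze", "/var/lib/data"]'
    # ...and thaw it as soon as the backup of this pod finishes.
    post.hook.backup.velero.io/container: my-database
    post.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--unfreeze", "/var/lib/data"]'
spec:
  containers:
    - name: my-database
      image: example/database:latest   # illustrative image
      volumeMounts:
        - name: data
          mountPath: /var/lib/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-database-data
```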
B
So these are crash-consistent, because they're snapshots, but it's a per-volume thing: there are no volume groupings, so we're not getting cross-volume write-order fidelity yet. And Velero's view of the world is essentially the EBS view of the world, which is: hey, I told EBS - Amazon's block storage - to take a snapshot; EBS says, okay, here's your cookie, here's your snapshot ID; and then it does all the work internally to make sure your data is safe.
B
It will go ahead and move that data to S3, where it gets replicated all over the place, and the lifecycle of the snapshot becomes independent of the storage object. So if you lose the storage, or even if you delete the object, your snapshots are still good and you can clone new volumes from them. But this all happens internal to EBS.
A
So can I ask you - I think I know the answer, but just to confirm - the advantage of this volume snapshot integration is that for certain classes of stateful apps, like databases, you typically want to quiesce, and you want that quiesce to be as short as possible. You temporarily stop the application from doing its normal day job; if you have this snapshot assist, the snapshot freezes the state and the backup happens from the snapshot, so you can resume full operation pretty quickly. Is that it? (Yes.)
B
And typically the snapshot operation is much faster than walking the file system and copying the data, and it may be much faster to restore from, too. In EBS you can do an essentially instant restore, because when you say "create a volume from a snapshot," it comes back pretty quickly with "here's your volume" and then fills it in on demand in the background - versus...
B
We'll get to all of this with the vSphere plug-in, but yes, that was a challenge - that was some of the fun - because our plug-in is significantly more complex than, say, the EBS plug-in. EBS does a lot of the work for Velero, and the EBS plug-in is, I don't know, 500 lines; the vSphere plug-in - I haven't done a code count yet, but it's up there, because it has to handle a bunch of things that AWS would have handled for us. And then, typically, the volume snapshot...
B
We started referring to these as local snapshots versus durable snapshots. When I talk about the EBS snapshot, I think of it as a durable snapshot, because I can hold on to that cookie and rely on the storage system to do the right thing for me underneath - from the code's point of view, I don't have to care.
B
We're not sure yet. But right now, what happens is the plug-in for vSphere handles snapshotting, and it also moves the data into S3 storage - we've only got S3 APIs at the moment. It is doing full backups only; we're looking at what the right thing to do there is. There are some commercial solutions that integrate with us that handle incrementals properly.
B
They're multi-volume consistent - or at least they're close - so, you know, baby steps. Currently the release version is 1.0.2, and this handles vanilla Kubernetes: that is, Kubernetes installed in VMs running on vSphere, using the CSI driver and the FCD/CNS stuff underneath.
B
We're currently in the process of finishing up the 1.1.0 release, which will support Project Pacific - I'll go into the stuff we had to do in there to get that working - and this will support all flavors of Kubernetes on vSphere.
B
So what we're doing right now is, the plug-in lives on top of VADP, so CBT is available. For us it was less a matter of doing the changed block tracking - the APIs and all of this are traditional vSphere, so we didn't change anything about how CBT works on the vSphere side.
B
It
came
down
to
us
like
a
matter
of
how
much
resource
are
we
putting
into
the
repository,
because
in
order
to
handle
incrementals
properly,
you
need
a
fairly
sophisticated
format
that
lets
you
like
merge
down
incrementals
into
your
folds,
handles
things
like
deleting
incrementals
in
the
middle
of
your
sequence,
we're
looking
at
what,
where
to
put
the
resource
in
terms
of
the
open
source
plug-in
versus
the
commercial
offerings,
so
power
protect,
for
example,
takes
advantage
of
cpt
with
the
plug-in
and
does
incremental
backups
into
power
protect
just
like
any
other
vmdk.
A
For data protection - it turns out that at the storage layer of vSphere they built in an API, going back maybe a decade, to support these operations for backing up VMs, but it's still applicable even when you layer Kubernetes on top of VMs as part of your Kubernetes deployment. CBT stands for changed block tracking. The idea is that in practice most of these stateful apps, like databases, don't actually rewrite the whole database every day.
A
People tried to optimize this, so that if you could manage to make your backup essentially a delta against the previous day's backup, the backup would take up far less storage. There are actually ways to create what they call synthetic full backups out of this, so that even if you only captured the incremental changes, you could still very quickly reinstall the full image. There's even a potential optimization on the restore where, if you know you need to roll back, you can use changed block tracking to get the delta between today and yesterday and only overwrite the things that would take you back to yesterday.
B
Yeah - so as Steve was saying, things like fulls and incrementals: if you go back to the tape days, we used to do a weekly or monthly full backup, then daily incrementals, and maybe a weekly incremental against the full. In order to get back to a certain state, you'd start with your full backup and then start applying incrementals.
B
On top of that - and that's just lots of fun. The synthetic full backup is really what you want: the storage system has merged everything down for you, so that when you do a restore you don't have to apply full, incremental, incremental, incremental - you get everything back in one shot. That does require a lot more sophistication in the back end, though, because in the current implementation we basically treat S3 like tape: we write things into it fairly naively, and we don't have a sophisticated format in there. So that's on the list of things to do.
B
So this is the 1.0.2 architecture, which is simpler than the 1.1.0 architecture. In this architecture, say we're doing a backup: we tell Velero to do a backup, and it runs through, doing the stuff we were talking about before - serializing all of the resources from the API server. It comes along, finds a persistent volume, and calls us in the plug-in and says: hey, snapshot this.
B
So we go ahead and talk to vCenter and say: snapshot this FCD. That bubbles down to the host, and a typical vSphere snapshot is taken - and depending on what storage you're running on, you'll get different results.
B
On straight VMFS you'll get a redo log; on vVols you'll get a vVol snapshot. But from our point of view it's all the same, and it's compatible with all the storage in the same way that regular VM snapshots are - in fact it uses the same mechanisms for the VMDK snapshotting that a VM snapshot would. There are no changes in the on-disk format for FCDs, except a little metadata that we add. So then we say: okay, we have a snapshot, and we give back a snapshot ID.
Now, what we do in the background: we said okay, we took the snapshot, but now we have to upload the data someplace. We have a similar architecture for the rest of the app - we write a custom resource that says "hey, upload this disk" - and we have a scale-out data mover. There are multiple data movers running in the system; they do a little leader election on who should handle a given upload request, and then they connect to the snapshot data using VADP and NBD. NBD is network block device - an over-the-wire, TCP/IP-based protocol for reading the blocks out of a snapshot or a VMDK - so it kind of comes in from the side rather than through the VM. There's another technique called hot-add, which we aren't using at the moment, where you actually attach the snapshot to a VM and do reads and writes via the regular I/O path.
We don't really want to leave a lot of vSphere snapshots hanging around, so we don't - what we do is move the data out. In my parlance it's a durable snapshot at this point, because its lifecycle is completely independent of your storage. That is, unless you put your S3 server in the same datastore you're backing up - and please don't do that.
B
So that was the 1.0 architecture, and it handled the vanilla Kubernetes case. Then there's Kubernetes with vSphere - vSphere with Kubernetes, is that the correct term? I call it Project Pacific - and that has a bunch of changes in how Kubernetes works, and we had to accommodate that.
B
One of the things that changes in Project Pacific is the level of permissions and so forth that are available in the supervisor cluster. Project Pacific integrates Kubernetes more into the vSphere control plane, and we have this concept called a supervisor cluster.
B
The supervisor cluster is a special - if you like - Kubernetes cluster that has very tight integration with vSphere. It's designed to let you manage quotas on various groups, give the vSphere admin more insight into Kubernetes, and allow the Kubernetes developer or DevOps person to work with standard Kubernetes APIs. It's designed to handle things like multi-tenancy and isolation in various ways, and one of the things it does is restrict the permissions that are available.
A
Can I ask a question? (Absolutely.) Your slide says it installs Velero in the supervisor cluster. Is it fair to say that the typical mission, even though it's installed in the supervisor, is to back up the guests? And to go further: is there any reason to back up the supervisor cluster itself, or is that something that could just be recreated?
B
Well, it really depends on how you're using it. We do have the ability to run user workloads in the supervisor cluster - Paul was supposed to talk about Data Persistence Services; was that the new name for it? Those are going to run in the supervisor cluster, so we're setting ourselves up here to back up these things that live in the supervisor cluster, and there's some shared infrastructure between supervisor and guest.
B
I'll show you in the diagram - there are services in the supervisor cluster that serve the guest cluster, but there's also the ability to do things straight up in the supervisor cluster.
A
Okay, thanks. Just another time check - maybe you can fit in the rest.
B
I can move along - I've only got three or four more slides. So, the new pieces of the plug-in: we added this Velero app operator. It's an operator that you can install via vSphere, and it gives us certain capabilities that we needed.
B
This is the same mechanism we're using for Data Persistence Services, and eventually we're planning to open it up to a lot more applications running in Kubernetes on vSphere. There's this backup driver that got added in - a common piece of infrastructure between the guest and the supervisor - and we've moved the code that actually handles the snapshotting out of our plug-in proper into this backup driver. It also supports the paravirtualized architecture, guest to supervisor.
B
The guest cluster in Project Pacific is designed for really strong isolation, and one of the key points was that things running in the guest cluster should not talk to vCenter, period. What we've done is mediate this through the supervisor cluster's Kubernetes APIs, and there's this pattern called a paravirtualized driver, where a supervisor cluster resource backs a guest cluster resource. Take persistent volumes, for example.
B
In a vanilla or supervisor cluster, the persistent volume has a volume handle, and on vSphere that maps to an FCD ID, which uniquely identifies a disk - the VMDK. But that requires interaction with vCenter, because that's where the database of FCDs lives, and that's where the APIs for connecting an FCD to a VM are, so you can actually work with it live. In the guest cluster, rather than doing that, we built the paravirtualized model: the PV in the guest cluster has a volume handle which is actually a PVC - a persistent volume claim - in the supervisor cluster.
B
So when you go to allocate storage, the paravirtualized CSI driver first creates a PVC in the supervisor cluster, and then all the usual mechanisms kick in there - things like quotas happen - and when the PVC is allocated and bound, the guest cluster can bind that name into the PV, and then the usual Kubernetes things happen. But this way there's no actual communication between anything in the guest cluster and vCenter - at least no direct communication.
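The indirection described here can be pictured as a guest-cluster PV whose CSI volume handle names a supervisor-cluster PVC instead of an FCD ID. A heavily simplified sketch - the field values and namespace are hypothetical, and the real objects carry much more detail:

```yaml
# Guest cluster: the PV's handle is a supervisor PVC, not an FCD ID.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: guest-pv-1
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteOnce"]
  csi:
    driver: csi.vsphere.vmware.com     # paravirtualized CSI driver
    volumeHandle: guest-pvc-1          # name of the backing supervisor PVC
---
# Supervisor cluster: this PVC is what ultimately resolves to an FCD/VMDK,
# and only supervisor components ever talk to vCenter about it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: guest-pvc-1
  namespace: tenant-namespace          # hypothetical supervisor namespace
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```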
B
We also introduced the concept of a backup network here, based on some other work, and that lets us isolate the backup traffic from the management network, and it lets us nail down who can talk to vCenter from the supervisor cluster. The guest cluster does not have access to the backup network - at least, it shouldn't.
B
So here's the big, crazy diagram of the new architecture. Let's just walk through what happens in a guest cluster. In this setup, Velero is installed in the supervisor cluster via the app operator, which gets all the pieces and parts for Velero installed: the plug-in, the backup driver, the data movers.
B
Currently you have to install the data movers as VMs from OVFs yourself - we're working on fixing that - but you have a scale-out set of data movers there, and from the supervisor cluster things work very similarly to the way they worked in vanilla. In the guest cluster, you'll also install vanilla Velero, and you'll also install our plug-in. It will detect that it's running in a guest cluster, and it gets credentials from the supervisor cluster that allow it to write records into the supervisor cluster. So we go ahead.
B
A backup starts with writing a CR in the guest cluster. Velero does the usual thing, serializing the resources; when it comes to a persistent volume, it calls our plugin. The plugin now, rather than going directly to an API, writes a resource for the backup driver saying "snapshot this volume" in the guest cluster. The backup driver knows that it's paravirtualized; it takes that record and writes a record for the corresponding PVC in the supervisor cluster, for that backup, and then the backup driver in the supervisor cluster resolves the PVC down to an FCD ID.
B
Then it goes back to taking a vCenter snapshot, says "hey, I took a snapshot," things bubble back up to Velero, and in the supervisor cluster it starts doing the upload in the background. When the upload finishes, it removes the snapshot. So that's kind of a lot of ins and outs and round-and-rounds. Does that make sense - any questions on this?
B
So we do S3 multi-part uploads, and if the data mover crashes or you lose the network connection, we come back and try again - relentless forward progress is the Kubernetes catchphrase, right?
B
So this is a view of the backup network, which is customer-defined. You can just hook all your pods to the management network and everything works fine - but you're not supposed to do that.
B
We asked for that originally and got some pushback on it, but with some new changes in vSphere there's the ability to add a NIC that gets a vSphere NFC label on it, and that will put all of the backup and VADP traffic onto that NIC, which you can attach to a separate network.
B
But everybody's network situation is very different. So that's kind of the run-through on Velero for backup. You can also use Velero for migration, via the object store. If you use Restic - Restic is storage-system agnostic - you can back up to an object store and then restore that to pretty much any Kubernetes cluster, because it's not tied to any particular storage technology.
B
If you don't have shared storage, vSphere to vSphere, you can obviously use the plug-in - take it up to S3 and then back down - and we're looking at how to migrate clusters that actually have shared storage underneath. I think Bryson was doing some things with that, and we could discuss how successful that was and what craziness you hit.
A
Okay - so, Bryson, on Slack we've been having a chat about an incident you had where you were trying to move... I think you were trying to move a Kubernetes cluster, as opposed to moving a workload to a new cluster. Is that true? And you had some issues?
C
So I wasn't actually moving the cluster; it was moving the vCenter that was controlling the VMware cluster. But in doing so, we had to make changes to be able to point to the new vCenter, and we also had issues when that happened - not knowing who was actually doing the moving or how they did it - where the VMs were coming up with different UUIDs.
B
Yeah - you're using the in-tree driver at the moment? (Yes.)
A
And I think you told me on Slack that it came down to needing to update the VM UUID in a couple of places. Maybe you can describe how you did it, in case somebody else runs into this.
C
I mean, we're getting into talking more about PVs than we are about Velero and stuff, but (sorry, I'm on call and I just got an alert) it's about how the storage communicates.
C
So specifically: is the cloud config pointed at the right vCenter? People can move you to a different vCenter and not tell you. And then, once you know it's on the right one... sometimes if they move you to a new vCenter, they may not have set the permissions correctly, so you need to make sure the permissions are set up correctly. And then, even with the permissions set up correctly, we've encountered something weird lately with our PVs, where it seems like there's still some connection to the old vCenter, even after restarting the controller manager and the kubelet on all the nodes.
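For reference, with the in-tree vSphere cloud provider the vCenter a cluster points at is set in the cloud config file handed to the kubelet and controller manager, commonly /etc/kubernetes/vsphere.conf; a minimal sketch with hypothetical values looks roughly like this:

```ini
[Global]
user = "k8s-svc@vsphere.local"
password = "secret"
insecure-flag = "1"

[VirtualCenter "new-vcenter.example.com"]
datacenters = "DC1"

[Workspace]
server = "new-vcenter.example.com"
datacenter = "DC1"
default-datastore = "vsanDatastore"
```

After the file is updated to point at the new vCenter, the controller manager and kubelets have to re-read it, which is consistent with the restarts (and ultimately reboots) described in this discussion.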
C
The controller manager... we started restarting several things, and the thing we found that ultimately works is just rebooting each of the control plane nodes. So that's if there are differences with the permissions. But the other thing we ran into lately (it can happen in a couple of different situations) is essentially that the VM gets re-registered with vSphere and it gives it a new UUID, and the deceptive thing there is when you look at your node from the Kubernetes side.
C
If you look at that, there are actually two places in that node definition for the UUID, and in one of those places it automatically gets updated and in the other place it doesn't. So then you end up with the controller manager saying "hey, I can't find this VM" and "hey, I can't find a VM with this UUID". So the fix there was to correct that second place in the node definition, and then it was able to find it.
A
So where is this node definition located?
C
Oh, it's just a get nodes: kubectl get nodes. Okay.
C
Yeah, kubectl edit node and update the UUID... wait, I can't remember; you might have had to patch it. I can't remember.
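The exact fields aren't named here, but with the in-tree vSphere provider the two places a node's VM UUID usually shows up are spec.providerID and status.nodeInfo.systemUUID, so a quick way to compare them (the node name is hypothetical) is something like:

```shell
NODE=worker-01
# The UUID the vSphere cloud provider uses to look the VM up:
kubectl get node "$NODE" -o jsonpath='{.spec.providerID}'; echo
# The UUID reported by the guest OS on the node itself:
kubectl get node "$NODE" -o jsonpath='{.status.nodeInfo.systemUUID}'; echo
```

If the two disagree after a vCenter move, that matches the symptom described here. Whether kubectl edit, kubectl patch, or deleting the node and letting it re-register is required to fix the stale field can depend on the Kubernetes version, since some node fields are immutable once set.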
B
Yeah, it's kind of an interesting thing to happen there, where the VMs running Kubernetes were simply switched from one vCenter to another, and otherwise they're pretty much running the same thing, with all the same volumes attached and networking and all that. But yeah, there's definitely that interconnection between the Kubernetes control plane running inside the VMs and talking out to vSphere and vCenter. That's definitely an interesting case there.
A
C
What we found was the same thing: if we updated the storage class with the new topologies, and the tags were all set on the vCenter, it would report back that it couldn't see them until (and again, we restarted the controller manager and several different control plane services) yeah, we found that we just had to reboot the control plane nodes to ultimately get the connection restarted.
C
B
They stayed up the whole time, yep. So that explains, at least marginally, why the connections remained to the original vCenter.
A
I mean, I'm ignorant here, but can you fill me in on how you managed to do this, to move a VM across vCenters while it was still running? Is that moving the whole ESXi host to a new vCenter, or is it some form of that?
C
B
A
Cool, okay! Well, it strikes me (I'm conjecturing here) that maybe this is just something the team didn't think of when they were writing test cases for this. But we'll take this back, and it does make sense now that there is utility in being able to support this scenario, just to minimize outages and impacts when you move or update vCenters.
B
Yeah, I think what we're going to see is a lot of... you know, when you come in from the Kubernetes point of view, that's not necessarily the way you think about things with VMs. You know, coming out of the cloud world, you would never move an EC2 instance from one AWS account to another; you simply can't move them across regions, and other than that...
B
A
Okay, well, thanks, Bryson. I don't know if anybody on this call is going to have that use case, but these things get recorded, so you never know. We'll try to maybe backfill this even into the documentation to save the next person, and one thing we can do going forward is maybe make it not so difficult to support this kind of pathway.
C
A
Okay, well, thanks. We're at the top of the hour, but I've got a couple of housekeeping things before we go. The first one: this group maintains a YouTube playlist, and there was a slight delay, but the CNCF had their recent online conferences, Cloud Native China and KubeCon Europe, and the video recordings of those sessions have been posted.
A
The final thing, going forward: Kubernetes project-wide, they've had incidents with Zoom trolls and they're tightening down security, so we have to have a passcode on future meetings. I didn't want to do it today. The rule is actually that you either have a passcode or a waiting room, but a waiting room is kind of intensive for people who join late, because you've got to pay attention to the participant list and it's easy to have somebody floundering there when you didn't notice them. So I'm going to add a passcode.
A
So unless anybody's got any final items, I'm going to close this down till next time. And one thing that you're free to bring up: if anybody's got ideas for topics that they want covered in the next meeting, which will be the first week of November... we've got one carryover, where Gopala is going to return and cover the changes that have happened with vSphere 7 Update 1, but I think we'll likely have room for one or two other things, if anybody would like to suggest something, either now or write it in the notes document later. Also, Dave,
A
if I can get a link to your slide deck, I'll paste it in the notes document after the meeting ends.
A
B
A
Thank you. Okay, thanks. So, last call for any final remarks... Okay! Well, thank you, everybody, for attending, and I'll get the agenda notes doc updated and the recording posted within a few days.