From YouTube: Velero Restore Hooks Deep Dive Session
Description
Purpose
Deep dive discussion on Velero restore hooks product requirements and design approach.
Documents
Product Requirements: https://github.com/vmware-tanzu/velero/pull/2679
Design: https://github.com/vmware-tanzu/velero/issues/2609
B
We also have an issue related to the product requirements that are tied to the restore hook epic. All of the items are linked there. But for today: we've had a lot of feedback on the design proposal for restore hooks, and that's what we want to spend some time diving into. It's been discussed a little bit in some of the community meetings, at first glance and around some of the use cases defined, but we're going to be taking a deeper look at it today. So, Nolan, I'll...
A
Allow them to get adopted by whatever controller would be managing them. So StatefulSet, Deployment, you know, whatever might be managing those pods; it might even be a custom controller. And then Velero would run the restore hooks in that context. The issue raised in the PR was that it's possible the pod could start up, do some work, and then Velero could come in, run the restore hook, and get interrupted, or interrupt the work, at some point.
A
Instead of going back and forth on the design proposal to say, you know, what's the best way to address this: one other idea was to attach init containers to restored pods. So basically manipulate the pods; we'd say, here's an init container for your restore hook. The details there aren't fully baked yet, but the init container would essentially be where the hook runs, to put the data in the right place. So maybe it would run for a database; maybe it would run the whole database restore.
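The init-container idea above can be sketched roughly as follows. This is a hypothetical shape, since the details were described as not fully baked: Velero would mutate the restored pod spec, prepending an init container that runs the hook before the application container starts (image and command here are illustrative):

```yaml
# Hypothetical: a Velero-injected restore-hook init container.
apiVersion: v1
kind: Pod
metadata:
  name: postgres-0
spec:
  initContainers:
  # Injected at restore time; init containers run to completion before the
  # app container starts, which avoids the interrupted-work race described.
  - name: restore-hook
    image: example/restore-helper:latest   # hypothetical image
    command: ["/bin/sh", "-c", "cp -R /backup/. /var/lib/data/"]
    volumeMounts:
    - name: data
      mountPath: /var/lib/data
    - name: backup
      mountPath: /backup
  containers:
  - name: app
    image: postgres:12
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: postgres-data
  - name: backup
    persistentVolumeClaim:
      claimName: postgres-backup
```

Because init containers are guaranteed to finish first, the database never sees a half-populated volume.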
A
That's going to become relevant for Velero in a few releases: making sure that this is idempotent in restores, so that if I have two restore operations happening on the same workload, Velero doesn't blow away the data or have race conditions in the restores. Because right now Velero does one restore at a time, but on our roadmap we want to get Velero opened up to be more concurrent and, you know, behave the way people really expect it to in the Kubernetes cluster, so that it can do basically as many restores as the cluster can handle. So I think those are the two main issues I see right now: making sure we populate the database, or application (I'm using a database because I think that's the most common application that we're seeing trepidation over), and making sure we coordinate in cases where it matters to the application that there are, you know, multiple nodes.
C
I would like to add or clarify one aspect, which is: you said it would be sufficient to run this hook on one of those pods if we are running multiple replicas, but it may be the case that, if the database is a quorum-based database, you will want to run it on multiple replicas, or a majority of them. So that is another thing that we may want to keep in mind.
A
Okay, good point. So yeah, given those... I don't know if folks have read the proposal or if we want to share it; I didn't have time to put together graphs or anything. But I will also mention that upstream there is a proposal for modifying the kubelet, something called container notifiers, that we would love to use, but that's still very much in the early KEP phase. And frankly, I don't know if it will even get accepted, so we kind of want to put something in sooner rather than later, rather than waiting for the Kubernetes feature. When it comes, we will gladly move there, but for now we would like to get something that lets users use the data that they captured with our backup hooks and get it going. Our stated goal on the roadmap is to get this out in a 1.5, August timeframe. I don't 100% know if this is feasible.
D
But our workaround at the moment is writing some scripts to orchestrate some of the functions that would happen in a restore hook. So one of our use cases right now is to back up not just a database that is running on Kubernetes. It could be a database that is external to Kubernetes, but because it's a part of the platform that's running...
D
We believe that the backup and restore functionality has to include that database, even if it is external. So, for example, what might happen is: the Velero restore will have occurred and put the backup of the database on the persistent volume of a pod that is then going to restore that into an external database that's not running on Kubernetes. So that's sort of one of the use cases that we're aiming for, and our workaround is to just script that, so Velero runs...
E
I can give another kind of concrete example; it's not that different from what Scott just described. One of the differences is that we're actually running our database inside Kubernetes as a StatefulSet, and in this example it's Postgres. But what we've done is: we've attached a new emptyDir volume, a PVC, to Postgres and included just the emptyDir in the Velero backup, not the actual main PGDATA directory, and we have a pre-backup hook that does a pg_dump from volume A into volume B.
E
So the pg_dump data is the only thing that gets backed up. We inject an init container into Postgres that has a custom bash script in it (it sounds a little bit like what you're doing) that looks for the existence of data in this emptyDir volume. That would signify that we're in a restore mode, and it will do a pg_restore back to the main volume once Postgres has already started up.
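The pre-backup side of this pattern can be expressed with Velero's existing backup-hook annotations on the pod template; the volume names, database name, and dump path here are illustrative, and the manifest is trimmed to the relevant fields:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  # (selector, serviceName, etc. trimmed)
  template:
    metadata:
      annotations:
        # Run pg_dump into the second volume just before the backup is taken.
        pre.hook.backup.velero.io/container: postgres
        pre.hook.backup.velero.io/command: '["/bin/bash", "-c", "pg_dump -U postgres mydb > /backup/dump.sql"]'
        # Back up only the dump volume, not the live PGDATA directory.
        backup.velero.io/backup-volumes: pg-dump
    spec:
      containers:
      - name: postgres
        image: postgres:12
        volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
        - name: pg-dump
          mountPath: /backup
      volumes:
      - name: pg-dump
        emptyDir: {}
```

The restore-side init container then looks for `/backup/dump.sql` to decide whether it is in "restore mode", as described above.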
E
We have the race condition there that Nolan spoke about a minute ago, and then it deletes the data so that it becomes an idempotent restore. It's a little bit fragile, in the fact that if something goes wrong, it's going to be stuck in a continuous restoring process, and that race condition creates a problem. We've done a couple of things, experimental, that we haven't moved into production yet, to kind of work around the race condition; one being, instead of doing a pg_restore...
A
Mark, I have a question here. So let's say you've done a restore once, and then you took a backup of that again, and then you try to restore that again. Does that init container stick around, so that you get copies of that init container on another restore? Is it intelligent enough to detect that the init container is already there?
E
Yeah, it's a good question. The way our application starts up, it's very tightly integrated into Kustomize, and so the init container is injected in at the last minute using Kustomize. So there's always exactly one. But yeah, that would be a problem if we were trying to declaratively describe this YAML as a single manifest.
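Injecting the init container "at the last minute" with Kustomize, as described, might look like a strategic-merge patch; the file and container names here are illustrative:

```yaml
# kustomization.yaml
resources:
- statefulset.yaml
patchesStrategicMerge:
- restore-init.yaml
---
# restore-init.yaml: merged into the Postgres pod template on every deploy,
# so exactly one copy of the init container is always present.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  template:
    spec:
      initContainers:
      - name: pg-restore-check
        image: postgres:12
        command: ["/bin/bash", "/scripts/restore-if-needed.sh"]
        volumeMounts:
        - name: pg-dump
          mountPath: /backup
```

Because the patch is re-applied on each render rather than accumulated in a stored manifest, repeated backup/restore cycles cannot stack up duplicate init containers.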
D
I have another example. We're not that far down the path yet, but it's related to the hook and the lifecycles of things. It's maybe not necessarily related to backing up and restoring of actual data, but it's around providing the metadata of the contents of the backups and the contents of what got restored. So currently we're implementing in our hook something that emits some log output that can then be retrieved later on and parsed, to say: okay, this is the contents. Specifically, with Cloud Foundry...
D
So we want to see the orgs and spaces within the Cloud Foundry that got backed up. And we don't have it in our roadmap, but a use case could be: I've now performed a restore and I'm going to invoke a function, some code, to obtain the metadata of the Cloud Foundry that we've just restored. Does it match what got backed up? That kind of thing. So the lifecycle events are definitely important to hook certain things on to, for the operators of this.
C
But this, to me, sounds like a need for more of a volume restore hook rather than a pod restore hook. Meaning we don't run these restore actions just when we restore pods of applications; we run this hook on restoration of volumes. So we restore the volume, populate the data, and kind of tie the pod and the volume together when we restore the pod.
E
As you said, I think one of the challenges that we run into when we go down that path is: it works often, but there are definitely some systems, databases in particular, where the database engine needs to be up and running; the pod needs to be available. We can't manipulate the volume without also having a database engine running and attached to it.
A
On that future state (I'm gonna link this in chat): when you say this is a volume restore hook, it makes me think of this tech, these generic data populators. These generic data populators would be an awesome thing once they actually exist, because then the database could be running, right? You could have a Postgres or an Elasticsearch or, I don't know, pick your database.
A
This is gonna take a long time to get going, because it's based on manipulating the Kubernetes API, but the idea would be: you can set a data source on a PVC to a custom resource object, and you can then have a controller or operator that would fill it, and Kubernetes would not declare the PV as ready until this operator returned.
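The data-populator idea being referenced (the upstream Kubernetes "volume populators" proposal) would look roughly like this: a PVC whose data source points at a custom resource, with an external populator controller filling the volume before the claim becomes usable. The CRD kind and names below are hypothetical:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  # Points at a custom resource instead of a snapshot or another PVC.
  # A populator controller watches for claims referencing this kind,
  # writes the data, and only then does the volume become ready for pods.
  dataSource:
    apiGroup: example.com        # hypothetical populator CRD group
    kind: RestorePopulator
    name: postgres-restore-1
```

In this model the database pod could already be running its normal image, since population happens entirely at the volume layer before the PVC binds.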
A
I recognize it's not going to be 100% the case. I think we need a switch there that says run this on everything, or run this on one. I think we need a way to determine that, because Velero right now does not have a way to introspect the application. I don't know that we ever will, until we get some sort of... whether we use Helm or, you know, whatever, we don't have anything that tells us about the application. Yeah, that's...
D
The pod does happen to be running a container that has some of our CLI tools that are relevant for our backup and restore needs. But I am curious about the fact that we're running a pod that is doing nothing except being something to hang hooks on. Well, I guess hooks are for hanging, but we have something to hang the hooks on, or it's the other way around; sorry, I got lost in the English of it. But this is sort of what I heard as Ashish was talking, like...
E
I had one kind of follow-up question to your statement, too. Since we moved our backup over to the API, it's a deployment that's running, and that could, in theory, be multiple replicas. Does Velero have (I'm not aware of any, but you're definitely the expert here on Velero) the same concept of being able to say: back up one pod from this Deployment or ReplicaSet, I don't need all of them snapshotted?
A
With that annotations method, the idea was to say, for the restore hooks: this is a once-or-all kind of mode. So I run this restore hook on everything, or I only put this on one pod. And in that case the restore hooks would almost have to go on the container spec of the pods, right? The Deployment or the StatefulSet or whatever the controller owner is; the hook would have to go there. Yeah.
A
Yeah, I think if we implement something here that would be useful for the backup hooks, it makes sense to move it over. And I think also, specifically from the concurrency perspective, we're probably gonna have to revisit backup hooks as we start looking at making Velero more concurrent. I think backup hooks are probably gonna be affected. I can't say for certain that they will be, but probably, because, just to state it quickly, when you're taking a backup...
E
The only way, period, to do that would be to do something like what Scott's doing, where you're just throwing a pod in that's in a sleep-infinity or something along those lines, right? I would have to add a dedicated pod, or something with a replica count of one if I wanted to use a higher-level component than a pod, and say this thing exists only to do restores.
C
I was going to say, just to kind of propose a path forward: can you follow something like what we do for restic, where we have a restic restore helper, which is an init container that waits forever while the volume that the pod wants gets populated? I think that is very similar to what you said you were doing, Scott.
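The restic restore-helper pattern referenced here is an init container that simply blocks until the restore process signals that the volume is populated. A minimal sketch of that shape, with an illustrative sentinel-file path rather than the helper's actual protocol:

```yaml
# Sketch of the wait-until-populated pattern (paths and names illustrative).
initContainers:
- name: restore-wait
  image: busybox
  command:
  - /bin/sh
  - -c
  # Block until the restore process drops a sentinel file into the volume,
  # so the app container never starts against a half-populated volume.
  - "until [ -f /restores/.velero-done ]; do sleep 1; done"
  volumeMounts:
  - name: data
    mountPath: /restores
```

The restore process (Velero, or a script like Scott's) writes the sentinel last, which turns "volume is ready" into an explicit handoff instead of a race.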
A
Right, but that's not on us to do. For us, what I think is there for Velero to do is to provide the building blocks such that it's safe for these things to execute and not step on each other's toes. And pg_restore isn't the only way to get the data into Postgres; there's, like, copying it into PGDATA without it running. But that's not to say Postgres is the only workload we'll think about here. But if we have this, like...
E
I was just gonna say, from our perspective: KOTS is a tool for application developers to distribute their application, and we have Velero integrated really tightly into that. And so we say, you know, just, hey, here's the Velero docs, go here and follow the process, and that's what they're using to snapshot the application. And we get questions from application vendors constantly.
E
You know, when they're doing that, around, like: okay, that works, but what do I do about the restore? And we can help them out and say, you know, just add this init container and drop this pod in. But, I mean, we would like this to be as Velero-native as possible, so that we could just say: look, you use an annotation to add a bash script in here in order to tell us how to get the data; add an annotation in here.
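What's being asked for is the restore-side mirror of the existing backup-hook annotations. At the time of this discussion no such annotations existed, so the names below are purely illustrative of the shape being proposed:

```yaml
metadata:
  annotations:
    # Hypothetical restore-hook annotations, mirroring pre.hook.backup.velero.io/*:
    # Velero would inject an init container built from these values at restore time.
    init.hook.restore.velero.io/container-image: postgres:12
    init.hook.restore.velero.io/command: '["/bin/bash", "-c", "psql -U postgres mydb < /backup/dump.sql"]'
```

The appeal for application vendors is that this is declarative and lives on the workload itself, so no extra helper pod or Kustomize patch is needed.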
E
That adds to... like, that's the reverse of that. Implementation details, like, we're very much open to whatever works. We're only coming with our experience so far, working with application vendors and how they're working around this right now, how they're actually able to do custom restore hooks, if you will.
E
Something else is that you would give the user, the developer, the ability to work around that race condition by attaching it other places. And to me, a restore hook that I can attach to any workload that I want to feels like it's a potential MVP, and then iterating down the road is the ability to say, like, you know, one or all, or things like this, where it allows the user to write less YAML and use the built-ins in Velero.
C
On the topic of one-or-all: I think it should still work if you run the restore hook on all pods of an application, right? So if you are taking a backup of Postgres and you have three replicas, the snapshot of each of those replicas will be at different points in Postgres, but once they come up, they should all converge.
A
I think, and to clarify what I meant by once-or-all: that was kind of looking for a compromise between what, I think it was Andrew, pointed out, that for some databases you sometimes only want to run it once, and something that Ashish mentioned, where you run a database in a quorum-based setup.
A
I don't know how the mechanics of that would work, because we would have to get, like, fiddly with how, say, a DaemonSet works. How would we exactly control which pod? Because if Velero lets the DaemonSet do it, like, if we say we just restore the DaemonSet and let the pods go, we'd have to get real fiddly with how we go in and add the pods, or we would have to restore the pods and the DaemonSet on top of them.
A
And then we run into the problem which we talked about on the issue, which is that the DaemonSet may not recognize those as the pods it made, and start up its own, because we edited them. So the once-or-all thing, maybe that's its own design discussion. And I'm kind of thinking along the lines you are, Mark, that the MVP is, like...
A
I mean, in my opinion, I would love to, like, if it's something we could solve for now to make concurrency easier to implement, that would be awesome. But concurrency is gonna be tricky to implement whether it's now or we're putting it in later. I think there's no way around the fact that getting concurrency implemented is going to be tricky. So...
E
I think that makes sense. I mean, we definitely have a lot of, you know... we talked really specifically about our Postgres backup and restore, but we have other workloads that we're backing up and restoring where, you know, some don't even need restore hooks at all, and some of them can run totally inline, like, you know, MinIO and things like that. So...
B
Thank you both, we really appreciate that. Right, so for next steps: we have a product requirements document that outlines some of the high-level use cases. We'll update some of the use cases discussed here, as well as the design doc, and then Andrew and Mark, you can continue the conversation in the community meetings and in the comments to get this moving whenever y'all are ready.