From YouTube: 2021-03-03 Geo DR technical iteration sync
Description
Discussion about parallel tracks of work that can be done on the technical side.
A
Cool. So, as I was just summarizing, I see that there have been discussions last week in issues and also in the document about…
B
My connection is bad. Is it better now?
B
Let's send it over to my phone, maybe. But I put some notes into the working group doc just for this meeting, because I thought through all of this, and I think there are a few things which make it fairly easy to decide right now. I mean, we have hard facts. We have this constraint of, let's say, $80k per month as a budget. And one hard fact is that the cheapest solution for everything is storage snapshots; that we know.
B
Another
heart
is:
if
you
use
this
snapshots
for
restoring,
then
that
would
always
be
way
below
above
one
hour
for
rto
target,
and
if
we
want
to
provide
something
better
for
our
premium
customers,
then
we
need
to
work
on
a
sync
solution
right
and
even
if
we
don't
decide
about
the
solution
right
now,
I
I
think
what
we
always
will
come
to
as
work
that
is
needed
is
that
we
need
to
build
a
scaled-down
copy
of
gprot
there's
no
way
around
it,
because
we
need
to
make
things
be
working
there
and
feeding
just
up.
B
It
meets
the
vertical
a
real
g
prod,
so
it
always
needs
to
be
something
like
that.
So
this
work
needs
to
be
done
anyway.
Then
we
need
to
build
in
for
automation
around
failing
over
failing
back
orchestrating
all
of
this
and
scaling
up
and
down.
This
always
will
be
needed
as
work,
and
we
need
to
work
that
we
support
deployments
on
the
secondary
side
without
database
migrations,
probably
but
still,
and.
B
Right, yes, so that's clear. And then, if we look into this business goal of supporting a below-one-hour RTO target for premium customers, then we need to build something which synchronizes Gitaly directly over to the secondary site. The snapshots are not good enough for that one, so we either need to use Geo, or set up streaming from dedicated Gitaly shards, or something like that. And that always means we need support for selectively syncing, or for putting customers on selected Gitaly nodes by plan, right?
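To make the selective-sync idea concrete, here is a minimal sketch assuming plan-based shard placement; the shard names, plan names, and the modulo routing are all illustrative assumptions, not GitLab's real assignment scheme:

```python
# Hypothetical sketch: place paid customers' repositories on Gitaly
# shards that are synced to the secondary site, and everyone else on
# shards covered by snapshots only. All names and the routing rule
# are illustrative, not GitLab's actual scheme.

PREMIUM_PLANS = {"premium", "ultimate"}

SYNCED_SHARDS = ["gitaly-dr-01", "gitaly-dr-02"]            # replicated to the DR site
SNAPSHOT_ONLY_SHARDS = ["gitaly-std-01", "gitaly-std-02"]   # snapshot coverage only

def pick_shard(plan: str, repo_id: int) -> str:
    """Route a repository to a shard pool based on the customer's plan,
    spreading repositories across the pool by id."""
    pool = SYNCED_SHARDS if plan in PREMIUM_PLANS else SNAPSHOT_ONLY_SHARDS
    return pool[repo_id % len(pool)]

if __name__ == "__main__":
    print(pick_shard("ultimate", 7))  # -> gitaly-dr-02
    print(pick_shard("free", 7))      # -> gitaly-std-02
```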
B
It would be super cheap to achieve something, at least in the short term, because even if you increase the frequency of the snapshots, we only pay for the incremental size of the snapshots, and if you have a higher frequency, then the increments get smaller. So we would stay below $10k additionally, even if we do it every half hour, I estimate. So that would be a very cheap solution for building something for all customers.
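A back-of-envelope check of that cost claim: only the 16 TB data size and the sub-$10k ceiling come from this discussion; the churn rate, snapshot price, and retention window below are illustrative assumptions.

```python
# Why higher snapshot frequency barely changes cost: increments are
# priced on changed bytes, so half-hourly snapshots just split the
# same daily churn across more, smaller increments.

TOTAL_DATA_TB = 16          # Gitaly data size mentioned in the meeting
DAILY_CHURN_FRACTION = 0.02 # assumed: ~2% of data changes per day
PRICE_PER_GB_MONTH = 0.026  # assumed snapshot storage price, USD
RETENTION_DAYS = 14         # assumed: keep two weeks of increments

daily_churn_gb = TOTAL_DATA_TB * 1024 * DAILY_CHURN_FRACTION
retained_increment_gb = daily_churn_gb * RETENTION_DAYS
monthly_cost = retained_increment_gb * PRICE_PER_GB_MONTH

print(f"daily churn: {daily_churn_gb:.0f} GB")
print(f"retained increments: {retained_increment_gb:.0f} GB")
print(f"approx. monthly snapshot cost: ${monthly_cost:,.0f}")
# With these assumptions the result lands far below the $10k/month
# ceiling discussed above, even before tuning any of the inputs.
```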
B
But
then
we
would
have
the
problem
of
being
the
database
and
the
goodly
charts
being
out
of
sync
when
we
fail
over.
So
that's
the
big
downside
of
that
one.
Any
other
solution
for
everybody
would
be
something
with.
I
don't
know
an.
C
Whether or not that's even a viable solution… you know, we brought down a database system because we did a snapshot that was live. That was a mistake on our part, but all of our file servers are technically live objects. If we do a live snapshot, we risk doing something bad to a specific repository, potentially.
A
Yeah. Like, the first thing: I wouldn't focus on achieving the RTO and RPO targets. That is so far off right now that, at this moment, for the technical work we need to do, I don't think we should constrain ourselves, neither for the stuff we want to do immediately nor for the stuff that we need to plan and get others to participate in.
A
That
needs
to
happen
in
parallel
right,
so
one
track
is
going
to
be
staying,
standing
up
the
site
standing
up
like
and
playing
what
can
be
done
with
the
things
we
have
right
now,
even
if
they're
slow,
even
if
they
are
not
optimized,
even
if
they
are
going
to
take
hours,
if
not
days,
to
complete
that
is
okay,
while
things
are
being
built
in
the
background
right.
So.
C
So
we
could
focus
our
efforts
in
that
realm
and
at
least
getting
everything
stood
up
and
ready
to
go.
At
that
point,
it's
just
a
maybe
we
could
start
testing
some
of
these
other
items,
such
as
the
snapshot
recovery
option
just
to
see
where
that
lands,
what
is
required
for
it
and
what
we
would
need
to
maybe
mount
that
to
servers
that
are
stood
up
in
that
area.
C
That
way,
it's
not
something
we're
duplicating
the
information
on
the
secondary
site,
but,
like
I
don't
know
what
that
looks
like
because
you're
going
to
have
a
file
server,
that's
not
syncing
data,
and
then
we
say:
hey
remount
it
with
this
data
and
all
of
a
sudden.
You
have
data
available
to
you
and
then
it's
a
weird
thing,
but
I
think.
B
I would think that we just don't have any Gitaly nodes on the other side. We just have database replication, and in case of failover, we spin up everything that is needed and mount the disks that we create from snapshots. So there would be no working GitLab site on the other side, and there's no Geo involved in this. It's just database replication, plus infrastructure which can be scaled up on demand, plus automation to get the snapshots built into disks and mounted to the right servers.
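A minimal sketch of that failover flow, assuming a cold secondary with snapshot-backed Gitaly disks; every function here is an illustrative stub, not an existing GitLab or cloud-provider API:

```python
# Sketch of the failover flow described above: nothing runs on the
# secondary site until failover; then compute is scaled up on demand
# and the latest snapshots are turned into mounted disks. All
# functions are placeholder stubs, not real APIs.

def provision_compute(servers, zone):
    print(f"scaling up {len(servers)} servers in {zone}")

def promote_database_replica(zone):
    print(f"promoting the streaming database replica in {zone}")

def create_disk_from_snapshot(server, zone):
    # This is the 20-40-minutes-per-terabyte restore step
    # discussed a little further down.
    disk = f"disk-{server}-{zone}"
    print(f"building {disk} from the latest snapshot of {server}")
    return disk

def mount_disk(server, disk):
    print(f"mounting {disk} on {server}")

def switch_traffic(zone):
    print(f"pointing traffic at {zone}")

def failover(gitaly_servers, dr_zone):
    provision_compute(gitaly_servers, dr_zone)  # infrastructure, scaled on demand
    promote_database_replica(dr_zone)           # the DB was replicating all along
    for server in gitaly_servers:               # snapshots -> disks -> mounts
        mount_disk(server, create_disk_from_snapshot(server, dr_zone))
    switch_traffic(dr_zone)

if __name__ == "__main__":
    failover([f"file-{i:02d}" for i in range(1, 4)], dr_zone="europe-west4")
```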
B
Yeah, I mean, it will take time to get all the servers up, with this amount of servers and also a high amount of disk snapshots. Building disks out of snapshots for 60 servers, that can take a long time. I know that for databases which are one terabyte, it takes 20 to 40 minutes to build a disk out of a snapshot, and we have 16 terabytes for Gitaly, so it will be around an hour.
B
I guess. But it depends on a lot of things, like from which region we get the snapshot, and maybe other things, like how many increments we had on the snapshot. So we really need to test this timing a little bit. But for an RTO above one hour, that's fine, I think. Like, if you say we want to be back within four hours, I think that should clearly work.
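A rough worked version of that timing estimate: the 1 TB in 20-40 minutes figure and the 16 TB across 60 servers come from the discussion, while the parallelism assumption and the largest-disk size are illustrative.

```python
# Back-of-envelope restore timing for the snapshot-based failover.

TOTAL_TB = 16
SERVERS = 60
MIN_PER_TB, MAX_PER_TB = 20, 40  # minutes to build a disk from a 1 TB snapshot

tb_per_server = TOTAL_TB / SERVERS

# If all restores run in parallel, wall-clock time is set by the
# largest disk, not the sum. Assume (illustratively) that the
# biggest shard is ~1.5 TB.
largest_disk_tb = 1.5
par_lo, par_hi = MIN_PER_TB * largest_disk_tb, MAX_PER_TB * largest_disk_tb

# Fully serial restores would be far worse:
ser_lo, ser_hi = TOTAL_TB * MIN_PER_TB / 60, TOTAL_TB * MAX_PER_TB / 60

print(f"average data per server: {tb_per_server:.2f} TB")
print(f"parallel restore, largest disk: {par_lo:.0f}-{par_hi:.0f} min")
print(f"serial restore, worst case: {ser_lo:.1f}-{ser_hi:.1f} h")
# Parallel restores land near the "around an hour" estimate above;
# serial restores would blow well past a four-hour RTO.
```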
B
That's why I think we can clearly do the work that needs to be done anyway: just standing up new infrastructure in a different region and working on the automation to scale this up and down as needed. And then we can find the solution to sync the data; I mean, we can leave that out for now.
A
Let's do a bit of playing here. Let's say we're not going to do any of the disaster recovery working group work, but we're going to focus on the Kubernetes migration: we're going to focus on moving the API nodes and the web nodes, and then have Kubernetes handle the actual scale-up and scale-down.
A
If at that point, after we have migrated that, we have a secondary site with VMs for the data stores, but a single Kubernetes cluster, like, you know, a cold cluster basically, waiting for something to happen, would that mean less work for us to actually stand a bunch of those nodes up, or does it mean it's pretty much the same amount of work?
A
So, and then, we're just talking here, right, we're not making any decisions here. Let's say, in this hypothetical, that we decided that Gitaly is also in Kubernetes, Gitaly itself. The storage is somewhere else: it could be, you know, persistent volumes, it could be, I don't know, wherever. But Gitaly has the option of you saying: hey, my storage is somewhere else, over the network or whatever else.
B
Because whether we spin up just Terraform, which we can copy over from the existing Terraform and adjust, or we spin up a Kubernetes cluster, it always means that we need to set something up, get the configurations right, and test it. So I think it will always be some amount of work, and I think there will not be such a big difference.
A
But there is a difference, Henry. There is a difference in how much work it is to move from the moment where you actually have something to test to the moment where you actually have the numbers, the RTO and RPO, going down. Even if it's the same amount of build-out work. Okay, let me try to explain a bit. Let's go back to the original plan: everything is VMs on the secondary site, right? So we are going to build one API node, we are going to build…
A
Whatever
else
is
necessary
there,
and
by
the
time
we
finish,
we'll
be
able
to
say
all
right.
We
have
this
24-hour
target
that
we
can
play
with
now
getting
down.
The
target
is
going
to
be
much
harder
because
by
the
time
we
fail
over
and
then
scale
up
and
restore
and
do
all
of
that
stuff.
It's
going
to
take
a
long
time
and
a
lot
of
effort
to
get
things
done,
but
if
we
play
the
hypothetical
and
say
that
a
secondary
site
is
majority
kubernetes,
so
the
scale-up
itself
is
fast
enough.
A
So the only thing we need to focus on is the storage itself. Then you have only one item to focus on, and that item is figuring out how to restore from a snapshot, while the rest is basically just waiting for that. So you've moved the focus onto one thing, instead of five different things that need to happen at the same time. Now, for that secondary option of Kubernetes, the build-out cost is still the same, I agree with you.
A
You still need to connect a lot of these things, but you get to the point where you can play with the numbers, the RTO and RPO targets, much quicker from the moment you have the build-out done.
A
So if we focus on starting that build-out, with whatever we agreed on for the sizing of the VMs and the automation that needs to happen, that will give us enough work for probably a week or two to get those things up and running. In parallel, you're doing the API migration, right? So the next thing we can do is play a bit with Gitaly and figure out… I'm talking about, like, a test node somewhere; you have some data on a test…
A
…node. Stand up a Kubernetes cluster with Gitaly inside, try to connect it to whatever storage is out there, snapshots, yeah, and just see how difficult it is to actually get that done. If it's only a configuration change, that gives us step number three, which is: all right, we need to make a decision whether we are going down the VM route or the single-cluster route, and that allows us to build out whatever we need there.
A
All the configuration we need there, like you said, Kubernetes or VMs, is going to be the same; you just place it in a different place. And another reason why I'm suggesting this thing for consideration, and this is for both of you, Scarbeck and Henry: I know that you're having problems with Chef and Omnibus reconfigures and Geo in that whole thing, where one is overriding another and it's really hard to connect all of those things.
C
I will say that we have an open issue on the GitLab board, because we don't have it documented how to do a failover for Geo if everything is Kubernetes. Like, if you have two Kubernetes GitLab instances with Geo, there's no documentation as to how to perform a failover. I suspect I know how to do it, based on reading up on the configurations and the work I've been doing recently, and it's…
C
Nicely
the
orchestration
is
the
important
part,
because
we
are
missing
that
inside
of
the
omnibus
multi-node
configuration
for
jl.
So
I
enjoy
us
thinking
about
the
kubernetes
route
and
a
story
about
the
storage.
Obviously,.
A
Yeah, so those are, like, a couple of different parallel tracks. And then, in parallel to all of that, we have the talk with, well, the rest of GitLab basically, which is: how are you going to sync that data? Right, like, you need to give us: how are you going to automatically sync only private, sorry, only paid customers? So that can happen in parallel, but we already have this couple of tracks that we need to do regardless of the storage story, right? Yep.
B
…a cluster in staging with GET, right? But this wouldn't help us, at least not for anything, because we can't use GET to stand up a secondary site; it's not usable for our deployments and everything else. So it maybe helps the Geo team to test some things, but it's not really related to standing up a DR site.
A
It does help, it does help with that last part of the story that I just mentioned, which is, right: if they stand up a secondary site right now on staging, that allows us to actually talk with them and the rest of the teams, Gitaly and so on, about how they are going to sync smaller portions of data, and it allows us to get the practice of: now fail over, how does the application behave? So this is on a tiny bit of a different level.
A
So
there
is
a
level
of
infrastructure
where
you
do
the
failover.
You
do
all
of
that
stuff,
so
that
practice
is
going
to
become
a
bit
useless,
but
the
level
above
of
how
does
the
application
behave?
What
kind
of
testing
apps
do
we
have?
What
kind
of
feature
gaps
do
we
have?
That
is
still
going
to
be
useful
for
that
part
of
the
work
you
know
you
know
what
I
mean
like.
A
No, no, no, no, no. GET, yeah, that's not going to happen any time soon for us, at least right now. But we can always be the ones providing feedback and saying: look, if you want us to use GET, we need this to stand up a cluster easily. We have that right now; we can do that relatively simply. But if you want us to use GET, then go green.
C
…is actively working on trying to make GET bring up a Helm cluster with GitLab running as part of it. So…
A
Based on the tracks I just explained, basically, by the time we get to the point of deciding for the cluster or not, we might actually even have something to test. That's true. The database and Redis we still need to build anyway. So, like I said, the database thread is something we need to figure out and do right now, regardless of everything. That's not in the automation.
C
I enjoyed this approach. Not for this meeting, but I would love to maybe chat more about Redis and see if we could exclude that from the need to build out, primarily to start. But I do enjoy this line of thinking; I think this would be a good way to start off the meeting today.
C
I don't think we've really discussed how we want to approach this at all; it's been more about trying to figure out our business requirements. But I don't think the business really understands what the consequences of certain decisions are. I think once we start testing and showcase what our capabilities are, they'll have a better understanding of what that looks like in the future, and then we can figure out what we need to do with budgeting at that point.
A
Yeah, that's why I also said, in that point number three: we don't have anything right now. So talking about a sub-one-hour RTO or, you know, anything below that, honestly… even 24 hours is nothing right now.
A
Because it's so clear in my head, these different types of work streams, I'm asking the two of you to write down what we talked about, these separate streams, because you writing it down is going to open up, like, so many different questions. Maybe we have to answer them, maybe we don't. But if you are able to explain how we want to parallelize some of this work, and who tackles what, and when I say who, I don't mean Henry or Scarbec, I mean infrastructure, Geo, Gitaly and so on, product and so on.
A
Then it's gonna be much easier for us to have a larger discussion, because if I keep talking all the freaking time, then that's not gonna go well.
A
An issue is fine, just document it, and let's open it up with that. And certainly, just to be clear, I know this is a recorded call.
A
Yeah, and that's why I also want us to focus on the first two steps right now when it comes to what we in infra need to do, because next week something might come down the line and say: hey, disaster recovery is number five on the list. Which happens; it has happened before. So I wouldn't… all right, I'll upload…
A
This
recording
and
post
it
in
the
channel
and
yeah
go
out,
go
and
create
those
issues
and
ping
me
if
you
have
anything
unclear
about
what
we
talked
about
happy
to
help
right
need
to
drop
off
to
the
next
call.
Okay,.