From YouTube: Exploring DR flows
A
Hello to everyone who may or may not be listening. We are going to chat about some disaster recovery strategies and user journeys, and we should summarize this call later on in the issue as well. I've prepared a Mural board; I don't know if either of you has worked with that before. So this is essentially just a drawing or collaboration tool: you can put stickies on things, and you can draw out flows.
A
B
Okay, can I share what I would like to talk about? Sure. I think maybe first the big picture: what would we, or can we, expect from a failover, and what is it about? Then maybe what technologies are available for those kinds of things, and then maybe looking at what of that we could apply. Those would be the things I would look into, because I think we need to start a little bit from the beginning for all of this.
A
B
Let me start by talking about what I think about the requirements that we have set and why. I think when the discussion started, even before the working group was founded, we set some RTO and RPO goals and some considerations about cost and implications, and I think those considerations made some sense; they were set in this issue in the handbook for disaster recovery.
B
Yeah, and I think there was this consideration of how much it would cost, and what of that we can achieve without, you know, paying too much, and things like that. But if you start at the very beginning and think about what disaster recovery is: I think the disaster recovery we are talking about here is about a region failing, right — everything being down, all of GitLab — and what to do about that then. And if you think about that, this is always a major event.
B
A
A
B
Yeah, that's the point: how quickly can we recover, and when do we make the decision to fail over, right?
B
If you have something to fail over to — and the expectation of our customers. I think we should try at first not to be too ambitious with what we can reach for our customers, because I think most customers, even paying customers, would understand that if a geolocation fails for whatever reason, this is a major event and it can take hours to come back — but it shouldn't happen very often, right? So for RTO and RPO targets, at least at the beginning.
B
You know, an RTO and RPO target — and it is easier for us to achieve this fast if we don't set too high a target for that, right? Say we commit to being back within 10 hours, for instance; we can maybe reach that solution within a few months, so we would have something that we can already deliver as a promise. The lower we go with these targets, the harder it will be to accomplish this. So I think.
A
No,
I
think
my
interpretation
here,
though,
is
that
these
are
sort
of
ambitious,
rto
and
rpo
targets
that
we
want
to
sell
at
some
point
right.
I
would
not
write
that
down
anywhere
before
we
are
very
certain
that
we
can
really
deliver
on
that
right
because
otherwise,
you're
also
legally
liable
and
all
sorts
of
other
things
right,
but
I
think
the
journey
to
that
point
is
likely
the
one
that
you
describe
right.
A
It's
not
going
to
be
step,
one
perfect
solution,
five
minutes
full
failover
right,
it
will
take
a
lot
longer
and
but
something
is
probably
better
than
nothing
right
and
I
think
at
the
moment
we
had
nothing
as
in
you
know
the
best
effort.
We
could
probably
pull
something
off,
but
nobody
knows
exactly
what
that
means
is
sort
of
my
understanding.
B
Yeah
and
if
you
look
at
what
can
we
achieve
within
a
short
amount
of
time,
I
think
then,
if
you
look
at
technical
solutions,
then
the
main
problem
always
is:
how
can
we
get
all
the
data
like
the
state
over
to
a
second
site
right,
and
how
can
we
spin
up
infrastructure
to
be
able
to
serve
everything
from
another
geolocation
yeah.
B
That's
the
solution
to
that
that
we
are
aiming
for
and
and
also
if
you
look
at,
that
there
are
coast
implications.
Can
we
do
this
for
everybody
or
do
we
need
to
select
somebody
for
that?
One,
and
this
again
has
implications
of
our
application,
can
support
the
selective
thing
right.
So
it's
it's
not
easy
and
hard
to
make
decisions,
and
if
you
go
with
the
decision
that
what
was
taken
for
using
premium
plus
customers,
then
we
rightfully
like
the
discussion
discussion
and
the
issue
was
come
to
the
conclusion.
C
The consequences at the beginning — we can't, yeah, I don't think we can do that at all. So yeah, I think that's part of that discussion and actually what I would like to explore.
C
I sort of want to leave any sort of customer banding out of this initial — not this discussion, but our initial implementations — because it is additional work to be selective. And I think where we actually need to be selective is in two areas. One is the recovery point — how close the delta is between the failure and the stuff that we have saved — I think that's something that we could.
C
We would want to effectively scale that based on, as a feature of, different customer plans. And then the second one is who gets access first — the recovery time — because one of the challenges I brought up in the call this week was that we're going to delay large customers' workflows, which are usually programmatic: shared CI runners, other things they're doing, hitting the registry programmatically for various different things.
C
So we should also probably have not a wide grand opening, where everybody can come in and get their stuff again, but a scaled opening: like, okay, you're this type of customer, you're going to get in slightly faster, so that you can get back to your day-to-day, because it's going to be impactful — okay, we have to get caught up — and then start scaling in the other folks. But I think it needs to be everybody, yeah.
C
B
A
A
B
So I think that everywhere we are using those attached PD-SSD disks. There's an option to use local SSDs to get even higher performance — that was discussed for databases, for instance — but I think we still have those network-attached disks, okay.
B
Special things that Google provides as a storage layer. And I think we do these snapshots every 24 hours; I'm not sure if they are multi-region, but by default the snapshots in GCP are multi-region if you don't change that, so we have something like that. We should of course check all of that, yeah.
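As a rough illustration of that "check all of that" step — assuming the gcloud CLI is installed and authenticated, and using an illustrative project name — the snapshot resources can be listed and their storage locations inspected:

```python
# Hypothetical helper to verify where our PD snapshots are actually stored.
# Field names follow the GCE snapshot resource (creationTimestamp, sourceDisk,
# storageLocations); the project name below is a placeholder.
import json
import subprocess

def list_snapshot_locations(project: str) -> None:
    out = subprocess.run(
        ["gcloud", "compute", "snapshots", "list",
         "--project", project, "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    for snap in json.loads(out):
        print(snap["name"],
              snap.get("creationTimestamp"),
              snap.get("storageLocations"))  # multi-region vs. regional location

if __name__ == "__main__":
    list_snapshot_locations("gitlab-production")  # illustrative project name
```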
A
Yeah, that's fine, but you know, that was my first fear. But if I just paraphrase what we said — because I think we may be closer to complete agreement on some of those things, at least here — it's like: for me, not copying certain data from free users is not an option. I think we can't lose data, right? That's a no-go.
A
So if we, for example, had a scenario where you fail over to a different region and you open it up for premium and free users — free users don't have their Git data, but they have the database data, they start pushing stuff, and then, a day later, you have your other side up again, and all of a sudden you have a split-brain situation. You can't really reconcile any of that. That's not something we can do, I think.
A
A
We can make it available at a later point in time — ideally hours later, not weeks — and I think that path needs to be laid out and be part of the initial implementation, because I don't think it is feasible for us to not have that. That's my personal opinion. Something like 95 percent of our data and people using this are on free. I understand that we may not want to make hard guarantees for this, but we can't leave them behind.
B
That's
why
we
are
taking
the
snapshots
right.
I
mean
yes,
try
to
protect
data
for
our
customers,
all
of
our
customers.
We
have
that
in
place
and
if
they
are
already
mighty
region,
then
we
already
know
the
course,
and
if
you
want
to
have
a
better
rpo
than
24
hours,
then
you
need
to
calculate
how
expensive
it
would
be
to
do
that
more
often,
yeah,
okay,
but
the
basics.
Basics
is
at
gcp.
B
The
way
to
have
multi-region
disaster
recovery
is
to
use
cloud
storage.
That's
the
thing.
We
could
also
try
to
do
something
with
object,
storage
or
something
else.
But
cloud
storage
is
the
thing
at
gcp,
mainly
for
for
being
safe
and
multi-original
disasters
right
and
with
the
snapshots.
That
would
be
one
way
which
would
work
for
goodly.
I
think
it
would
work
best.
This
way.
C
Yeah-
and
I
agree
with
that-
and
I
want
to
add
one
other
thing
for
consideration,
is
you
mentioned
split
brain?
I
I
think
we
should
also
consider
not
failing
back
like
you
know.
If
we
have
a
dis,
you
know
if
we
have
something
that
moves
us
from.
You
know
us
east
one
to
you
know
maybe
the
west
coast
or
somewhere
else
in
the
in
the
united
states
for
gitlab.com.
C
We
just
stay
there
and
then
the
former
primary
site
becomes
the
secondary
site.
If
it
ever
comes
back.
A
Yeah
no,
but
I
think
that
also
means
that
the
requirement
is
that
we
are
able
to
eventually
scale
up
the
disaster
recovery
side
to
the
same
size
as
it
was
before
right
yeah,
and
I
think
this
is
this-
is
how,
like,
I
think
in
my
mind,
you
know
the
process
looks
like
this
at
this
point
with
those
requirements
right,
you
have
a
catastrophic
event.
You
know
you
need
to
determine
that
you,
the
the
best
way
forward
here,
is
to
actually
fail
over
to
your
recovery
sites.
Right,
let's
say
you
are
able
to
make
that
determination.
A
A
You know, to have most of their data, with a relatively ambitious RPO target, and we must be able to let them in then. Those folks need to be able to essentially continue working, having lost ideally no data, or very little data. Following this, we must have the ability to open up the platform gradually, ideally, for everybody else who is not in sort of the first wave — which means we need to scale up the infrastructure. We need to reattach that data.
A
B
That's not that hard to accomplish. I mean, it's still some work, of course, but scaling up via Terraform isn't hard.
C
B
It's a lot of data, but it's in the range of one hour, maybe, for such a huge disk to be restored and attached to a new instance — I would say, as a rough estimate, yeah.
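For concreteness, that restore path could look roughly like this — a minimal sketch assuming the gcloud CLI, with all disk, snapshot, instance and zone names invented for illustration:

```python
# Sketch of the restore step discussed above: create a new PD-SSD from the
# latest snapshot in the DR region and attach it to a freshly provisioned
# Gitaly VM. All resource names below are placeholders.
import subprocess

def run(*args: str) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)

def restore_and_attach(snapshot: str, disk: str, instance: str, zone: str) -> None:
    run("gcloud", "compute", "disks", "create", disk,
        "--source-snapshot", snapshot, "--type", "pd-ssd", "--zone", zone)
    run("gcloud", "compute", "instances", "attach-disk", instance,
        "--disk", disk, "--zone", zone)

if __name__ == "__main__":
    restore_and_attach("gitaly-01-data-20240101", "gitaly-01-data-dr",
                       "gitaly-01-dr", "us-west1-b")
```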
A
Because
I
think
for
me
totally
feasible
yeah.
So
if,
if
we
go
down
here
right,
it's
like,
I
can
also
share
my
screen
if
you
like,
but
I
made
like
this
like
little
mini
diagram
here
yesterday
evening
with,
like
stateless
services,
yeah
exactly
this
one
here.
A
Visiting
squid
is
approaching
so
the
way
I
I
saw
this
and
you
know
this
is
all
very
like
it's
still
a
high
level
right,
but
I
think
it's
maybe
enough
to
talk
about
at
first
like
and
henry
you
can
correct
me
here.
It's
like.
We
have
sort
of
stateless
services,
api
nodes
web,
all
of
those
things
right
and
we
have
them
in
our
cost
estimate
already
right.
We
don't
need
to
replicate
anything
there
right,
it's
like
they.
A
They
have
no
state
right
and
we
can
scale
them
down
as
much
as
possible
on
a
secondary
site
right.
So
yes,
that's
fine.
If
we
are
terraform,
we're
also
able
to
scale
them
up
or
it's
in
kubernetes
right
and
it
does
it
automatically
right.
That's
that's
cool
right.
It
still
work,
but
we
don't
really
have
an
issue
here.
Then
we
have
database
stuff,
you
know
which
is
mainly
our
postgres
database
right,
so
that
does
need
to
be
replicated
either.
A
You
know
to
a
secondary
site
with
a
reduced
node
number
for
sort
of
the
first
wave
or
you
know
already.
We
just
say
this
is
fine
from
a
cost
standpoint.
We
just
mirror
this
right
and
there
is
a
technology
with
patrony,
right
and
standby
clusters
that
allows
you
to
do
that
right,
but
that
that's
the
thing
that
we
need
to
do
and
we
need
to
figure
out
right.
Then
there
is
right.
Then
there
is
object.
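For reference, the Patroni standby-cluster mechanism mentioned here is configured roughly as below — a minimal sketch following the Patroni documentation, with placeholder hostnames, expressed in Python only to keep the snippets in one language:

```python
# Minimal illustration of the Patroni "standby cluster" idea: the DR region
# runs a full Patroni cluster that replicates from the primary region's leader
# instead of electing its own writable leader. Keys follow the Patroni docs
# (bootstrap.dcs.standby_cluster); host/port values are made up.
import yaml  # PyYAML

standby_dcs = {
    "bootstrap": {
        "dcs": {
            "standby_cluster": {
                "host": "primary-db.gprd.example.internal",  # placeholder
                "port": 5432,
            }
        }
    }
}

print(yaml.safe_dump(standby_dcs, sort_keys=False))
```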
A
Then there is object storage — all the data that we have in there — but that is, I think, also already cross-region. So the object storage itself takes care of the replication; we don't really need to do anything there. What we do need to understand is what the RPO targets are for the object storage. I think it's 10 minutes guaranteed, as in, across regions there is some guarantee from GCP that they will re-replicate to the other region within a certain amount of time.
A
I
don't
know
what
that
is,
but
we
can
look
that
up
right.
That's
important
to
know,
because
if,
if
that's
something
that
you
can
choose
and
pay
more
right,
then
we
need
to
consider
what
that
is
or
if
the
default
is
enough.
We're
also
okay
right.
So
that's
that's
object.
Storage.
I
think
there's
not
much!
That
needs
to
be
done
there
and
then
the
more
important
thing
is
sort
of
all
of
the
gitterly
gettingly
data
right
here
between
these
two
nodes,
because
this
is
where
most
of
the
cost
sits
at
the
moment.
A
I
think
right.
If
you
had
live
italy
nodes
on
both
ends
right,
then
the
amount
of
money
that
costs
apparently
is
troublesome,
even
though
you
know
there's
maybe
a
separate
discussion
to
be
had
it's
like.
If
we
could,
you
know,
eat
that
cost
for
like
three
months
and
actually
have
a
working
disaster,
recovery
solution
right
and
then
we
work
on
making
this
better.
A
Is
that
maybe
acceptable,
because
we
mitigate
the
risk
of
complete
failure
right
so
but
that's
another
another
discussion
now
you
know
what
I'm
saying:
it's
like
the
cost
of
not
doing
something
and
saving
the
money
every
month.
Essentially
right
still
exposes
us
to
the
risk
of
a
disaster
right,
that's
a
business
calculation.
A
In
any
case,
so
I
think
the
only
thing
we
really
need
to
figure
out
only
thing
at
the
moment
is
how
to
split
out
this
here
at
the
moment,
and
I
think
I
think
what
you
what
you
said
is
there
like.
I
can.
I
can
think
here
of
I'm
getting
up
my
stickies
again
here.
I
can
think
of
three
different
ways
of
of
handling
this
right
and
correct
me.
If
I'm
wrong,
we
just
use
everything
disk
snapshots,
right,
which
I
think
I
don't
know
what
the
the
fastest
snap
incremental
snapshot.
Time
is.
A
B
And even then, I would need to see if we can do this, because we saw that when we take the snapshots we slow down disk reads for a short moment — we saw this on our databases — so we'd have to see what the implications are. I mean, we do this daily now without issues, but something like a snapshot every hour — and then see how much that would cost, of course, yeah.
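To make the hourly-snapshot experiment concrete, a schedule along these lines could be attached to the Gitaly data disks — a sketch only, with placeholder policy, disk and location names, and assuming GCE snapshot schedules via resource policies:

```python
# Not something we run today -- just an illustration of what an hourly snapshot
# schedule for a Gitaly data disk could look like.
import subprocess

def run(*args: str) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)

def create_hourly_schedule(policy: str, region: str) -> None:
    run("gcloud", "compute", "resource-policies", "create", "snapshot-schedule",
        policy, "--region", region,
        "--hourly-schedule", "1",          # one snapshot per hour
        "--start-time", "00:00",
        "--max-retention-days", "2",
        "--storage-location", "us")        # keep snapshots multi-regional

def attach_to_disk(policy: str, disk: str, zone: str) -> None:
    run("gcloud", "compute", "disks", "add-resource-policies", disk,
        "--resource-policies", policy, "--zone", zone)

if __name__ == "__main__":
    create_hourly_schedule("gitaly-hourly-snapshots", "us-east1")
    attach_to_disk("gitaly-hourly-snapshots", "gitaly-01-data", "us-east1-c")
```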
A
A
B
Yes, one complication, of course, is that we would stream and replicate the database, and the database would be within a minute of the point when we switched over, while the data in Gitaly would be up to one hour behind the point we failed over. So they would not be in sync anymore; people would need to re-push their things to get back up to the state they were at before, and you would see problems where the database points to MRs which are not on the Gitaly nodes, because we used the snapshot, and things like that.
A
Okay, but that's one way of doing this — you just do it in that way. That's maybe the simplest solution, and it doesn't use anything else.
A
A
This assumes we know where the free stuff is, right? We essentially say we have specific nodes — or disks — that only host free data, and others that host Premium-plus data. I don't think that's true at the moment, I don't, yeah.
B
But I did some calculation based on the fact that 95 percent of the repository storage space is used by free users, so only five percent of repository size is used by Premium-plus users. That means we could try to just migrate all the premium users to three dedicated Gitaly nodes — or maybe four or five, just to be safe, because they are causing more traffic, I think — and then we would have an easy infrastructure.
B
An easy split, syncing things for premium customers, because we know it's these three or four nodes that we need to sync via Geo or something — yup — and so that would be a very simple and boring way to separate those.
A
Yeah
I
I
actually
agree
with
that.
I
think
what
we
can
do,
then
is.
We
should
still
snapshot
these
things
right
as
backstab
backups
and
that's,
I
think,
that's
nice
to
do,
but
we
can
then,
for
example,
say
to
geo
these
things
here
need
to
be
replicated
as
fast
as
possible.
Right
and
geo
is
async,
so
it's
not
100
in
sync.
But
if
you
do
this,
you
can
probably
you
know
stay
within
you
know
we
would
have
to
measure
right,
but
I
would
estimate
within
a
minute.
B
Yeah-
let's,
let's
not
bring
you
into
this
right
now
because,
let's
just
say,
we
need
to
sync
those
synchronously.
If
you
can
right,
I
mean
there
are
several
technical
options
for
that
and
maybe
geos
even
the
best
one,
because
it's
already
there
and
working,
but
there
are
also
things
like
set
of
a
streaming
application
and
other
ways,
I
think,
to
sync
file
systems.
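If the Geo route were taken, selective sync is exposed through GitLab's Geo nodes API — a hedged sketch of restricting a secondary to the Gitaly shards holding Premium-plus repositories; URL, token, node ID and shard names are all placeholders:

```python
# Illustrative only: restrict a Geo secondary to a set of repository shards.
import requests

GITLAB_URL = "https://gitlab.example.com"
TOKEN = "glpat-REDACTED"            # admin personal access token (placeholder)
SECONDARY_NODE_ID = 2               # id of the DR secondary in /geo_nodes

resp = requests.put(
    f"{GITLAB_URL}/api/v4/geo_nodes/{SECONDARY_NODE_ID}",
    headers={"PRIVATE-TOKEN": TOKEN},
    json={
        "selective_sync_type": "shards",
        "selective_sync_shards": ["premium-01", "premium-02"],  # illustrative
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("selective_sync_type"),
      resp.json().get("selective_sync_shards"))
```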
B
B
C
A
So if we do it in this way, then essentially we have two — sort of two streams. And this is all, as you can see, a bad drawing, but it helps me to write this up later on. We essentially have two streams of data into the secondary site.
A
We have the free stream, which is a little bit longer, and we have some other mechanism that separates out our premium users, to be able to deliver this at the different RPO target — which may be more expensive, because we can't use some of this technology and we need to keep the SSDs running. But that's kind of how I see this.
B
A
Which, I think, is the main benefit: this here is the free thing, and this is the premium thing. Because — and this is again something where my personal knowledge stops a little bit — we would have a live Gitaly instance that has these things available on the secondary site, and it essentially could use them immediately as soon as we fail over, whereas the other approach probably means we need to stand up more Gitaly servers, attach the disks, restore them.
A
Is that correct in this scenario? Because the other approach is the disk snapshots, which still, I think, have some cost for restoration — I don't know what that is — but you wouldn't have to do it on a continuous basis.
C
A
Region, yeah. So in this scenario we do assume a certain capacity, which means this only works if we are able to essentially lock out free users until a point in the future.
A
That's
also
something
we
shouldn't
forget
right,
it's
like
if
you
would
make
like.
If
you
just
stand
up
your
secondary
site,
you
repoint
dns.
You
know
everybody
is
happy,
but
you
let
three
users
in
they
wouldn't
have
their
they
get
data.
So
there
must
be
some
kind
of
mechanism
that
allows
us
to
say
we're
operating.
A
You
know
in
a
degrade,
degraded
state
right.
Currently,
you
know
we're
still
restoring
free
users
and
those
people
can't
actually
log
in
and
when
they
or
get
any
access
to
anything
really
right,
because
they
we
need
to
still
restore
their
their
git
data.
Mainly
all
of
the
other
stuff
would
be
there.
The
database
is
there.
The.
B
A
True, yeah — I mean, fair. We can talk about how to do this. Luckily, I think, we are just shipping maintenance mode, and this would be an interesting additional feature for the future that we could build on top of that: essentially a mechanism to regulate who can log in and who can't. But I don't know — I would have to talk with the teams that handle this kind of stuff. Okay.
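As a rough illustration of the degraded-state idea, GitLab's application settings API can toggle maintenance mode with a banner message, which blocks writes while the remaining data is restored; URL and token are placeholders, and the per-tier gate discussed here would still need to be built:

```python
# Sketch: put the recovered site into maintenance mode with an explanation.
import requests

GITLAB_URL = "https://gitlab.example.com"
TOKEN = "glpat-REDACTED"  # admin token (placeholder)

resp = requests.put(
    f"{GITLAB_URL}/api/v4/application/settings",
    headers={"PRIVATE-TOKEN": TOKEN},
    json={
        "maintenance_mode": True,
        "maintenance_mode_message": (
            "We are recovering from a regional failure. "
            "Free-tier repositories are still being restored."
        ),
    },
    timeout=30,
)
resp.raise_for_status()
print("maintenance_mode:", resp.json().get("maintenance_mode"))
```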
C
Yeah,
because
the
one
of
the
dangers
here
is
like
well,
I
mean
it's
dangerous
to
the
the
users
the
free
users
like
they
could
just
like
push
all
their
stuff
and
create
new
repos,
and
then
they
would
have
basically
duplicated
duplicate
english.
Today,
duplicate
repos,
going
on
so
yeah
maintenance
mode
or
some
some
mechanism
to
notify
them
that
they're
in
a
queue
of
some
sort.
Whether
or
not
we
actually
can
tell
them
where
they're
at
in
the
queue,
but
that
we're
you
know
still
waiting
on
restore
for
their
stuff.
A
A
It's
like
essentially
like
if
we
keep
the
site
at
a
really
minimal
level
right,
even
probably
seeing
more
traffic
from
premium
only,
and
I
don't
think
we
know
exactly
what
that
looks
like
at
the
moment
right,
but
you
would
have
to
like
sort
of
increase
the
size
initially
already
right
for
for
premium
users
to
get
something
out
right,
and
then
you
probably
have
to
scale
it
up
even
more
to
start
to
get
to
like
dot
com
levels.
A
C
B
A
Yes, okay. So this to me looks like sort of a path forward here that is pretty boring, but possible.
A
A
Another option: you don't do any of this disk snapshotting, and Gitaly manages all of this via object storage.
A
This is something that has been floating around in the Gitaly team for a while, where — not only for GitLab.com but also for customers — they would want to provide an option for folks to store Git data in object storage, so that, as a mechanism, Gitaly would just load this from object storage and keep it in memory.
A
I
don't
think
this
is
necessarily
the
solution,
but
it
may
be
something
that
could
happen
in
the
in
the
future
right,
and
that
means
we
probably
wouldn't
have
to
separate
anything-
and
you
know
anymore
because
you
know
like
object.
Storage
is
relatively
cheap,
but
I
personally
this
sounds
to
me
like
a
rather
significant
amount
of
work.
That
will
not
happen
like
in
the
next
two
months
right
so.
B
Yeah
also
without
knowing
the
details,
but
I
would
wonder
about
the
performance
implications
of
that,
because
I
mean,
if
you
run
from
ssds
and
git,
is
very
much
tailored
to
work
with.
Five
is
a
system
right
and
I
don't
know
how
you
would
translate
this
over
to
object,
storage
and
how
that
would
look
like
then.
C
Yeah,
I
mean
sorry,
I
I
could
see
a
world
where,
like
we,
we
do
use
object,
storage
for
ultimately
like
the
keeping
the
state
of
of
the
repos,
but
then
we're
doing
some
sort
of
behind
the
scenes
streaming
between
the
object,
storage
and
a
local
ssd
for
the
file
notes
themselves,
so
that
they're
highly
performant.
But
so
you
know
a
whole
other
scenario
of
like
okay.
C
We
need
to
to
reduce
like
the
the
time
between
what
what's
on
the
the
local
disc
and
that,
but
I
think
that
would
dramatically
improve
our
our
rpo
because
of
the
fact
that,
like
we
might
be
in
milliseconds
or
even
zero
for
those
types
of
things,
if
we
were
storing
an
object,
storage,
ultimately,
there's.
A
Actually
also
like
there
is
another
discussion
going
on
that.
I
I've
acquired
the
backup
and
restore
category
a
year
ago
and
have
told
people
that,
unless
we
hire
more
folks,
we
can
actually
work
on
this,
which
is
still
true.
But
there
is
some
very
interesting,
like
I
think,
opportunity,
for
example,
to
build
sort
of
an
incremental
backup
solution
for
for
git
right,
which
ties
in
with
some
of
those
things
right.
A
If
you
are
able
to,
for
example,
you
know
like
push
all
of
the
like
actual
changes
that
are
made
on
a
repository
into
object,
storage
right
and
keep
other
state
in
in
memory
or
on
ssd
right.
You
could
have
a
system
where
you
continuously
like
push
these
changes
to
to
repost
into
object,
storage
incrementally,
which
I
think
would
be
really
cool
for
lots
of
folks,
doesn't
even
need
to
be
only
object
storage,
but
it's
just
also
a
way
to
provide
easier
ways
to
backup
data
for
people
that
are
not
in
gcp
and
don't
have.
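A toy sketch of that incremental idea — bundle only the commits added since the last backup and push the bundle to object storage; repo path, ref and bucket names are illustrative, and a real solution would live in Gitaly rather than a script like this:

```python
# Pack everything reachable from HEAD that the last backup did not contain,
# then upload the bundle to a bucket.
import subprocess

def incremental_bundle(repo_path: str, last_backup_ref: str, bucket: str) -> None:
    bundle = "/tmp/incremental.bundle"
    subprocess.run(
        ["git", "-C", repo_path, "bundle", "create", bundle,
         f"{last_backup_ref}..HEAD"],
        check=True,
    )
    subprocess.run(
        ["gcloud", "storage", "cp", bundle, f"gs://{bucket}/incremental.bundle"],
        check=True,
    )

if __name__ == "__main__":
    incremental_bundle("/var/opt/gitlab/repo.git", "refs/backups/last",
                       "gitlab-git-backups")  # all placeholders
```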
B
Okay, now that we are at the possible solutions —
C
B
— let me talk about the challenges that I still see here. One thing is, I'm not very sure about how best to sync Redis; maybe there's an easy way to do that, even if it's just, you know, having —
C
A
I have a note here — it is Geo-specific, but I just want to mention it: if you stand up the secondary site as a Geo secondary, we don't actually replicate Redis at all, because I think the caches and everything would need to be regenerated anyway. I don't think there is a need to really sync those things; I think you would probably have performance issues, but we don't do that right now.
B
I think we have state in Redis; I think we would lose all Sidekiq jobs and things like that.
C
C
Yeah, we have two Redis clusters: one — that, Fabian, you had mentioned — with a general cache, which we would likely just rebuild; and then the other one is where we actually have a bunch of queues — work queues, Sidekiq queues, as Henry mentioned — and if we were to lose that, we would lose all these running or queued-up jobs. So, I mean, okay.
B
For a first iteration, that's something we need to think about — we should note it as an open question for research. Then the other thing is: how do we want to separate user groups?
C
B
B
Yeah, like, how do we selectively sync? The two approaches we mentioned are via Geo, by enabling it to sync for premium customers only, and the other way would be to migrate premium customers over to dedicated Gitaly nodes and then sync those by whatever means we see fit and that make sense — and maybe there are other approaches, I don't know. Then the other issue that we really need to look into is the gap between database state and Gitaly storage state after we restore from snapshots, because there will be a gap.
B
How do we deal with that? And — there was something else in my mind, I forgot about it, maybe it comes back later — oh yeah, the other thing is: how do we test whether the secondary site works? Because with this approach of only partially having customers synchronously replicated over to the site, you can't really do a failover test, right, and —
C
B
Testing whether it works is really a challenge in this setup. And, as I mentioned already, I would love to see something like shooting part of the traffic over to a DR site to see that it's working all the time, but that's hard, I think, because we would need to make certain changes in our application for this to work — but for the far future I would like to see something like that. In general it really is a question: how could we test the failover?
A
I
have
two
ideas.
Actually
I
have
a
few
more
so
I
I
think
I
really
personally,
I
really
like
the
idea
of
rooting
traffic.
This
is
actually
something
that
geo
would
like
to
do
at
some
point,
for
other
reasons
as
well
like,
for
example,
we
have
this
like
read-only
secondary
web
interface.
That
is
pretty
not
user-friendly
right
and
so
for
things
like
this.
Actually,
a
mechanism
where
you
go
to,
like
you,
have
a
geo
aware,
load,
balancer
right,
you
go
to
the
site
that
is
closest
to
you,
you're
being
presented
with
the
web
page.
A
You
make
a
change
right,
it
routes
back
to
the
primary
web
like
database
and
then
comes
back
to
your
site.
Doing
that
you
know
loop
is
something
I
would
really
like
to
do.
We
don't
know
how
you
know.
That's
there's,
there's
issues
with
this
right,
latency
and
race
conditions
and
all
that
kind
of
stuff.
A
But
that's
that's
definitely
something
interesting.
I
think
when
we
say
we
need
to
test
this,
I
think
we
also
distinguish
what
we
are
actually
trying
to
test
right.
It's
like
are
we
trying
to
establish
that
users
would
be
able
to
use
the
secondary
site,
as
is
right?
That
is,
I
think,
what
we
accomplished
by
rooting
traffic
right.
A
We
also
need
to
be
able
to
test
all
of
these
other
associated
procedures
right,
like
restoring
snapshots
from
disk
right
and
being
able
to
like
scale
up
the
infrastructure-
and
you
know
like
turning
this
into
a
rewritable
instance
and
repointing
dns,
all
of
this
other
like
stuff
that
we
need
to
do
on
top
of
this
actually
like
working
right,
and
I
think
that
would
require
a
probably
some
sort
of
bubble
test
right
where
you,
you
essentially
like
isolate
your
disaster
recovery
site.
A
So
for
the
dr
side,
it
looks
as
if
your
primary
site
is
catastrophically
unavailable
right.
You,
you
do
all
of
the
things
that
you
would
do
well
in
case
of
an
actual
disaster,
but
you
don't
tell
users
that
this
is
happening
and
everybody
is
still
working
on
the
primary
side
right.
So
there
is
no
user
impact,
but
you
can
run
through
the
entire
process
and
see
you
know
if
it
actually
does
what
you
think
it
should
right.
A
Yeah,
that's
exactly
what
what
this
is
and
what
that
tests
is,
that
your
promotion's
low
and
your
entire
disaster
recovery
flow
is
actually
functioning
right,
which
I
think
is
is
also
really
important.
A
And
I
think
there
are
maybe
I
think
there
are
a
couple
of
other
like
more
nitty
gritty
issues
here
right
the
whole
like
petroni
standby
thing
right,
it's
like
petroni,
runs
on
a
non-omnibus
version
on
production
right.
So,
as
far
as
I
know,.
B
A
Know
that
it's
fairly
well
known
how
it
works,
but
I
think
there's
still
intricacies
to
this
right.
It
does
work,
though,
like
we
have
it
running
in
ngo
now
in
elsa
or
beta,
but
I
think
it's
still
like
work
to
set
all
of
this
up.
B
Yeah-
that's
that's
true,
but
but
we
know
that
this
works.
I
mean
this
is
an
often
used
scenario,
and
you
know
that
this
kind
of
technology
works
and.
A
Assumes
well,
I
think,
then,
the
the
other
thing
like
at
least
work
wise
that
I
can
come
up
with,
is
just
the
automation
right.
It's
like
associated
with
you
know
like
all
of
the
like
processes
here
where
it's
like,
because
I
think
this
is
this
is
still
you
know.
Even
if
we
scale
down
everything
right,
it's
still
going
to
be
a
bunch
of
servers.
You
don't
want
to
do
a
lot
of
manual
things
right,
so
there's
at
least
work
in
setting
all
of
this.
A
B
We will have to automate something. I think it's not as bad as you think, because we will essentially need to mirror our production site to another region, and that means we more or less just copy our Terraform over to that side — using different credentials, maybe, and things like that — and then just scale down the node numbers, which is just changing a few numbers in a file if we do it manually. Of course, there are some more details to that.
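A minimal sketch of that "just changing a few numbers" idea — driving a hypothetical DR-region Terraform root with small node counts day-to-day and full-size counts during a failover; the directory and variable names are invented for illustration:

```python
# Wrapper that applies the DR Terraform root with different node counts.
import subprocess

DR_DIR = "environments/gprd-dr"  # placeholder Terraform root

def apply(node_counts: dict) -> None:
    cmd = ["terraform", f"-chdir={DR_DIR}", "apply", "-auto-approve"]
    for name, count in node_counts.items():
        cmd += ["-var", f"{name}={count}"]
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

WARM_STANDBY = {"api_node_count": 2, "web_node_count": 2, "sidekiq_node_count": 1}
FULL_SIZE    = {"api_node_count": 40, "web_node_count": 30, "sidekiq_node_count": 20}

if __name__ == "__main__":
    apply(WARM_STANDBY)   # normal operation: keep the DR region small
    # apply(FULL_SIZE)    # during failover: scale to production capacity
```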
B
Maybe, but the main thing really isn't that hard to accomplish: scaling the stateless infrastructure up and down is really not that hard. In my opinion, most of the work would go into setting this up once, and then maybe writing automation to make the scaling up and down a little bit easier.
C
B
C
That's where I might have a bit more concern about the work. So yeah, I think there's a big spike of work in terms of having this mirrored Terraform and what have you, in maintaining the DR site and adjusting its size. But I would also love to have automation around: okay, we need to fail over, so all the tasks that we would normally be doing manually — like, oh, DNS.
C
We need to point it over here, and we need to figure out which IPs — so there are a lot of safety mechanisms where removing a human from the equation in a stressful situation would be helpful to have automated. DNS is the only thing I can think of right now, but I know there's a handful of other things where we want to programmatically kick off things in a sequence. So I think there is a slew of automation here.
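One of those steps, scripted, might look like the following — a sketch that assumes the zone lives in Cloud DNS; the zone name, record and IPs are placeholders, and a real runbook would fetch the current record value first:

```python
# Repoint the main A record to the DR load balancer via a Cloud DNS transaction.
import subprocess

def run(*args: str) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)

def repoint(zone: str, record: str, old_ip: str, new_ip: str, ttl: str = "300") -> None:
    run("gcloud", "dns", "record-sets", "transaction", "start", "--zone", zone)
    run("gcloud", "dns", "record-sets", "transaction", "remove", old_ip,
        "--zone", zone, "--name", record, "--type", "A", "--ttl", ttl)
    run("gcloud", "dns", "record-sets", "transaction", "add", new_ip,
        "--zone", zone, "--name", record, "--type", "A", "--ttl", ttl)
    run("gcloud", "dns", "record-sets", "transaction", "execute", "--zone", zone)

if __name__ == "__main__":
    repoint("gitlab-com", "gitlab.example.com.", "203.0.113.10", "198.51.100.20")
```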
C
C
We would also need to automate updating our list of IPs that our traffic will come from, for the different customers that need to change that within their own infrastructure. So there's a whole bunch of fun things. I mean, honestly, we should probably do a test once we're feeling ready with this, with a couple of marquee customers that are willing to help us — at least a tabletop exercise, like: hey, what would you need to do if we did this? Oh, we would need to work with our IT department to update this allow list of IPs for your new range, et cetera. So yeah, a lot of devil-in-the-details type stuff. Henry, you're right.
A
Okay, but I think, if I look at this, there are at least two things that we just can't do at the moment, one of which is selecting the premium customers' data for Gitaly — I think that capability does not exist, as far as I'm aware.
A
B
Yeah, what would need to be done is to enable the application to route premium customers to certain Gitaly nodes that we specify. The migration of already existing customers could be done by infrastructure — we have ways to do that — but the application would need to route new repositories of premium customers to dedicated Gitaly nodes, and that needs to be built into the product somehow.
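The migration half of that already has an API surface: GitLab's repository storage moves API can schedule moving a project's repository onto a named Gitaly storage. Routing new premium projects there would still need product work, as noted above. This is a hedged sketch; URL, token, project ID and storage name are examples:

```python
# Illustrative only: schedule moving one project's repository to a dedicated
# Gitaly storage.
import requests

GITLAB_URL = "https://gitlab.example.com"
TOKEN = "glpat-REDACTED"  # admin token (placeholder)

def move_project(project_id: int, destination: str) -> dict:
    resp = requests.post(
        f"{GITLAB_URL}/api/v4/projects/{project_id}/repository_storage_moves",
        headers={"PRIVATE-TOKEN": TOKEN},
        json={"destination_storage_name": destination},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(move_project(1234, "premium-01"))  # move one premium project
```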
A
Okay — I don't know how that works, I just don't, but I think that's something that can be figured out, at whatever level makes the most sense. Okay, so that was kind of exactly what I was hoping for: to get a little bit more into the details here.
C
Sorry, I think there's one more problem — and maybe I glanced over you mentioning this — but I do see an issue with the database potentially being out of sync with some of the snapshots, because that's going to be, at worst, a horrible user experience; it might be really confusing. And one of the concerns I think we should have here is customer support, because we can't scale people, and so with a lot of additional, like —
C
Oh,
I
like
it,
says
the
mrs
are
here
but
they're
not
like.
I
know
we
can
have
templates
and
stuff
for
that,
but
I
think
we
should
figure
out
like
a
way
to
I
don't
know
be
able
to
for,
like
our
data
store,
to
have
knowledge
of
like
what's
actually
there
and
and
some
system
that
can
rectify
and
reconcile.
I
guess
the
the
delta.
A
Yeah
and
like
I
think
this
is
probably
the
most
difficult
thing
like
I
don't
know
much,
but
I
know
that
the
like
the
overall,
like
compartmentalization
of
our
data
in
the
database,
is
not
very
clear-cut.
I
think
so.
B
Yeah, and I think it's going to be different for each of the Gitaly nodes, because they might have snapshots which are older or newer, so I think that's super hard to get right, and I think it will never work perfectly, even using Geo for syncing, because that's not fully synchronous.
A
So you will have some instances — essentially, as soon as you lose any kind of data in one source but not the other, you will have some brokenness. I think that's —
B
I think we need to count on the fact that we need to advise customers, in these cases, to re-push their local repository data if they see issues, because the good thing is that normally all of the customers still have local copies.
A
A
Well, I think the only way to really do this is if you had a Gitaly Cluster setup that allows you to have consistency across regions, but I think that comes with —
B
Praefect isn't supporting this, and the Gitaly team was saying that Praefect isn't built for that, and the latencies would be too high because they try to be in sync — which means that latency would be.
A
B
A
But, you know, I think there are many other things that we could put down as future work. At the moment I like your approach, and we are saying something is better than nothing, right? If, say, a hurricane destroys the data center for GitLab.com, and otherwise we would have three weeks of downtime, but instead we are actually up and running in —
A
— you know, a couple of hours, let's say, for everyone, and people have to re-push their repositories because they lost one MR — I think that's better, and there's obviously a lot of room for future improvement.
C
Yeah,
I
don't,
I
don't
think
we'll
solve
for
the
issue
that
I
I
called
out.
I
think
we
just
need
to
be
prepared
with,
like
a
maintenance
mode
or
whatever,
to
explain
this
to
literally
everybody
that
hey
you're
going
to
see
some
discrepancy
here
and
here's
how
you
can
solve
for
it.
So,
yes,
absolutely.
I.
B
Completely
agree,
one
last
point
I
I
still
also
want
to
mention
is
for
infrastructure.
Is
that
if
gitlab.com
fades
in
a
region,
then
we
need
to
make
sure
that
infrastructure
like
chef,
the
ops
instance
and
all
the
things
that
we
need
to
manage
gitlab.com
and
the
the
starstore
recovery
site
is
also
still
working.
I
mean
we
need
to
really
make
sure
that
we
don't
have
a
single
point
of
failure
in
this
infrastructure.
A
Because
I
think
there
was
a
discussion
like
many
months
ago
about,
for
example,
having
a
secondary
site
for
for
ops.
I
don't
think
that's
a
thing
at
the
moment,
because
people
realize
that
if
that
instance
goes
down
you're
unable
to
actually
do
what
you
would
need
to
do
elsewhere,
I
don't
know
how
that
is.
Centered.
B
You
have
ops
in
a
different
zone
or
region,
and
maybe
I
think
that
was
the
thing
that
we
have
set
up,
but
I
need
to
look
it
up.
Yeah.
B
It
into
a
different,
a
set
for
sure
or
maybe
in
a
different
region,
and
look
into
that
yeah.
A
But
okay,
so
I
think
because
we
we
are
almost
out
of
time
for
this.
What
I
can
do,
maybe
even
today,
is
I'll.
Take
all
of
these
notes
and
create
an
issue
and
say
these
are
some
of
the
the
scenarios
and
how
we,
how
we
see
this,
and
these
are
some
of
the
potential
solutions
for
for
this
right.
A
These
are
some
of
the
likely
pain
points
and
what
we
need
to
do
and
then
I
think
that's
already
a
more
concrete
picture
compared
to
before
right,
and
I
think
then
we
need,
I
think
once
like.
If
we
actually
like
have
some
agreement
on
on
this,
then
we
can.
We
can
look
how
how
to
do
that
right
and
I
yeah,
I
don't
think
we
can
get
around
using
some
cloud
service
provider.
A
Specific
implementations
like
the
disk
snapshots
right
that
that's
with
the
cost
constraints,
and
I
don't
think
I
don't
think
we
will
get
around
that
which
is
fine
right
and
I
think
that's
also,
but
it
means
there
will
be
some
some
specificity
for
how
gitlab.com
solves
some
of
these
problems
versus
maybe
some
of
our
self-hosted
customers
right.
But
that's,
okay,.
B
At
a
certain
size,
every
customer
will
come
to
the
same
conclusion.
I
think
because,
of
course
yeah
I
think
a
good
thing
would
be
to
do
the
calculation
for
the
snapshots
every
hour
or
maybe
something
like
that
for
okay.
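The shape of that calculation could be something like the following back-of-the-envelope, with entirely made-up numbers — the point is the structure (incremental snapshots only store changed blocks, so churn rate and retention dominate), not the figures:

```python
# Back-of-the-envelope for the "snapshot every hour" question. All inputs are
# assumptions for illustration, not measured values or real prices.
TOTAL_GITALY_TB = 500                 # assumed total PD-SSD footprint
HOURLY_CHURN_PCT = 0.2                # assumed % of blocks rewritten per hour
RETENTION_HOURS = 48                  # keep two days of hourly snapshots
SNAPSHOT_PRICE_PER_GB_MONTH = 0.026   # illustrative multi-regional snapshot price (USD)

full_baseline_gb = TOTAL_GITALY_TB * 1024
incremental_gb = full_baseline_gb * (HOURLY_CHURN_PCT / 100) * RETENTION_HOURS
stored_gb = full_baseline_gb + incremental_gb

monthly_cost = stored_gb * SNAPSHOT_PRICE_PER_GB_MONTH
print(f"~{stored_gb:,.0f} GB retained -> ~${monthly_cost:,.0f}/month")
```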
C
A
A
A
A
I don't know — it's a little bit silly on some level, because it mimics — what is the word — skeuomorphism, right? It looks as if it were a physical thing. But in my previous jobs, which were not always working from home, I personally really enjoyed the engineering sessions where you just have whiteboards and stickies and you start drawing these things out; when you capture that and then write it up, I personally find it quite fun.
A
It feels like a good thing to do when you have something amorphous that you need to get into, because otherwise you spend ten weeks in an issue. Okay — I think what obviously interests me is this: it is probably a fair statement to say that most of these things can be accomplished without using Geo at all.
A
Some of these things may work better with what we have already, and I would personally want to consider where we can come in, because we do have this experience — the promotion sequences for a secondary site, putting it into a read-only mode, these things — which is, I think, a separate bit from all of the replication logic, and I think that would be quite —
A
I would really like that to be part of this, because there would be so much good feedback on how to do this at scale that would feed back into the product, even for folks who run the 50k reference architecture, because many of these problems are very similar. I think that would be quite exciting.
C
B
A
Yeah, I actually have an epic open for the team, and I'm meeting with them next week about sort of an MVC to only sync data for specific customers.
A
Customer types, right. The initial thoughts from Mike and Douglas were that we would need to manage a specific sort of relationship in the code, but that it is actually possible. They were concerned about the complexity with, for example, all of these other data types like job artifacts and whatnot, but that's not even required, because that's all in object storage. So we are really only talking about Git data, which is projects, design repositories, snippets — a bunch of different types of stuff that you need to concern yourself with, but —
B
C
B
B
I mean, we have something in our database which says, for each customer, on which shard the repository is located, right. So yes, because —
A
— already. So maybe — I never really understood how storage shards work in Gitaly, to be perfectly honest.
A
Yeah, but in any case, I think what we want to do from the Geo team side is: if we can come to a conclusion on the path we want to take, then we can start thinking about what we can do in the product to support you, and I think we would also be excited to help you set some of these things up.
A
You know, looking at the automation, we have quite a few plans. My biggest plan for Geo overall is to make the promotion sequence more automatic and easier, and I think there are some interesting ways of potentially doing this that are, let's say, more generic — you could handle it with Chef or with Consul or whatever, but the difficulty there is how to manage the state of all of the individual gitlab.rb files. So, cool.