From YouTube: 2021-01-25 Delivery team rollbacks discussion
A
Cool, okay. So we've got the discussion notes doc up already. I think the rough scope for today, as this is the first discussion we've had on rollbacks, is to try and work out where we want to go: what might be a suitable scope for rollbacks, and how do we progress things. So feel free to lead this in whatever direction we feel is going to be useful. My initial question really was: do we want to focus this entirely on auto-deploys rolling back?
B
Well, we don't really roll back anything else — I mean, we never roll back anything, so that's the first statement. But auto-deploy is easier, and we are in a situation where it may actually make sense to do it, because we deploy multiple times a day, so the changes are really small. And I mean, I see no value in — I mean, there's value, but it's just the trade-off between the time investment and what we get out.
B
The first thing is always: can we roll back? And the answer usually is: well, we have no idea, so let's go and try to roll forward, and understand where the merge request that broke it is. We may aim for a situation where we say "yes, we can, because we know" — then we roll back, and in the meantime someone finds out where the broken merge request was, rolls it back, and then we roll forward.
C
And they're, you know, managed by Chef runs, which we don't have a pipeline for anyway. For k8s configuration changes, I think it'd be good to have a rollback option. This would be anything — any kind of configuration change that we make outside of auto-deploy, including adding environment variables, changing configuration, changes to gitlab.rb, et cetera. And then we have the satellite projects: Registry, Shell, et cetera. For those, I think it would be good to have a rollback option the same way we have for auto-deploy.
D
I'm tying into Jarv's statement about the need for rollback for configuration changes. Right now we do suck some Chef changes into Kubernetes to deploy them, and similar. If we do a rollback in Kubernetes related to a configuration change that was spawned from a change stored inside our Chef repository or our Chef secrets, we don't currently have a way to connect those two together — that's usually via someone's memory banks, hosting that data inside their brain.
A
Cool, okay. I think that's a good point. What's the scope of rollback people are thinking about? Are we thinking that we either might want, or it might be a good idea, to have parts of auto-deploy roll back separately from the whole auto-deploy? Would we want to roll back just the k8s deploy, for example?
C
The scope I'm thinking of is incident mitigation, and when I think of incidents that we've had in the past, there has been a desire to isolate rollback to either fleets or k8s clusters, even individual zones — so being able to roll back one cluster instead of all three clusters, just so that we can help isolate.
A
Yeah, we can do that. In terms of what we are trying to achieve with rollbacks — maybe from the OKR perspective — what's the direction we want to go in as a team?
C
Yeah, that's true. So I guess if we have a rollback for configuration changes, we get that for free. I would argue, though, that we also get auto-deploy for free in k8s workloads, because they all go through the same pipeline.
C
Why is that a problem? I mean, you think that it'll be hard to reason about whether the rollback —?
B
Yeah, that's the point, right? Because I'm also scared of the development process itself. It took two years for us to have a framework for developing migrations; it was actually much harder to do multiple environments at the same time. And when I pointed out these things — I don't remember where — the team was just asking: how can we deploy? We pointed out things like: how do you plan to do rollbacks?
A
Right — like, how do we know that something hasn't moved forward? So I suppose that's probably a question for me on rollbacks: presumably there's a window in which we can roll back, and it has an impact for another period of time. It feels like Registry in particular I would rather avoid, because I think it's a slightly different project; we need to work with Package to understand the impact.
E
I think a difference to take into account there, too — sorry — is that currently we sort of support multiple versions, as in old and new, but it's always moving forward and it's only two versions. If we start supporting rollbacks for, say, individual clusters, you can end up in these cases where, say, the frontends get version B, but now suddenly Sidekiq is rolled back to version A. In theory it's kind of the same concept.
E
You know, the direction is different, but I think — or at least I would suspect — our code doesn't handle going back as well as it handles going forward. So, as an example, we have the split between regular and post-deployment migrations.
E
But then, if you roll back, suddenly your code is only capable of understanding the old format, while some comments might be in the new format. To handle that, you basically have to say: okay, well, we support the old format until version X. But that doesn't really solve the problem, because if you then reach version X and have to revert, you have the same problem.
E
So you basically delay it, and you end up with these cases where you have to code very defensively, or you actually have to do a sort of three-step process: you add a change such that, you know, if there's a new format, you can't roll back; you then add another change that, you know, stops using the old approach, so that you might be able to roll back.
C
We need to support this — the application needs to do this, full stop. I mean, we upgrade Sidekiq in parallel, unless we want to start upgrading Sidekiq before we upgrade the rest of the frontend; you know, this happens now, so the application needs to support it. One recent example with the Registry: we deployed the Registry, there was a metric change, and then our Apdex dropped. In that case I wouldn't want to have to roll back the entire stack — I just want to roll back.
C
I have the Registry developer on the incident call; he just wants to roll back to the previous version. I want to click a CI job that does that — I don't want to have to roll back everything. This is why I think that targeted rollbacks for incident mitigation are essential. It just comes up over and over in my experience that you want these targeted mitigations; you don't want to have to go through an entire pipeline.
B
Right — and I wrote "skip Gitaly rollback", so it relates to my point. I said that we should not touch Gitaly by default, and it should be an option to eventually roll it back, because we have the same problem here: Gitaly is usually very backward compatible, but forward compatibility is really hard to get right, and we are not really stressing this enough in development. So this will not happen very soon — but then, think about this.
B
We're talking about changes that have something like three, four, eight hours of code changes, because we do deploy four times a day, three times a day, and we tend to deploy every version. So we really need to focus on this, because this is the key aspect: it keeps the change really small. And what I'm thinking here is still —
B
If something like this happened and we are in an incident, we should not just keep deploying. We have to figure out what is broken, fix it, and move forward, because otherwise we have this nightmare scenario where one part of the fleet is running version n minus two, another part is running version n minus one, Sidekiq is at version n plus three — and then you just can't handle this.
C
I'm just thinking that, outside of a post-deployment patch, rolling back is the only way we can, you know, fix something quickly. I agree — it's really the best option we have now. Going forward, outside of a post-deployment patch — and maybe that's what we do — I think rolling back would be the right thing to do.
A
Pause the pipelines — and I think that has to be true for whatever we're rolling back, because knowing we've rolled back is the challenge, right? And investigating why, and things like that. So that makes sense. So if we're saying an end goal is that everything can be rolled back — which I think is what we're saying — what makes sense to focus on initially? Is it auto-deploys as a whole, or is it something else?
C
Yeah, it would be terrible, right? And we've tried to create rollback pipelines that reverse jobs or reverse order. And I guess my question is: do we really want to do it? Like, is this —?
B
This is not true, Jarv — you're thinking about a human detecting the problem at the end of a complete deployment and then going back. But if you have this thing automated, you can stop earlier. So if you are 10% into the fleet and your Apdex drops, you can stop and roll back, and it's going to be faster.
C
Yeah, I think that's a good part of this, but I also think it's very rare that our metrics would detect it, yeah.
D
And a lot of people want to spend some time troubleshooting and looking at errors before we decide whether or not we want to roll back, and that also slows us down in kicking off a rollback quickly.
B
Because you have no other options — so what do you have now? Nothing. I mean, you can just try to figure out what's happening and try to detect what part is broken, because there's no rollback option today. No release manager will ever say "let's roll back", because you have to manually check and make sure that there's no post-deployment migration, which is not easy, because it really depends on what you deployed — it's not just the content of the last package.
B
What I was thinking — so when you think about selectively rolling back, you're thinking, if I'm correct, about splitting the fleet by cluster names and things like that, and by services, right? So you want to roll back only Registry, or you want to roll back the git fleet, or the git fleet in zonal cluster X. You're thinking about this, right, Jarv?
C
Yeah. Basically, epic 373, which has a mock-up of it, would just be a manual job next to each fleet or each cluster that would do a helm rollback or an ansible rollback.
B
Yeah, what I'm thinking here is this: when we have deployment and rollback pipelines that are really straightforward, so that every single job handles one thing — you don't have a catch-all "do the Kubernetes deployment", but rather "trigger this Kubernetes deployment for this cluster" or whatever — then you can selectively run only those jobs that you want, so that you have the ability to roll back everything. And by "everything", strings attached, because I will not roll back Gitaly.
B
In any case — for me, Gitaly is a manual step that you can do after, if you want. I can explain this a bit better, but the point is that if you can roll back everything, then you can also roll back selectively.
C
I see — so with the refactored pipeline there'll be one trigger, and then we would just have a rollback trigger. But I think — isn't part of this deciding whether we're going to do something sooner than the refactor or not? I'm not sure.
B
So the idea is that the most important parts of the pipeline refactoring should kind of move from the current OKR to the rollback OKR, so that we have what we need in place.
D
Precisely. So what if we create that method in some way, shape, or form? When it gets around to time to build a rollback pipeline — perhaps when a deployment pipeline gets created, it's smart enough to know whether post-deployment migrations are contained inside of it. If they are, don't create the rollback capability in that pipeline.
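A sketch of that idea, assuming a hypothetical HAS_POST_DEPLOY_MIGRATIONS variable computed when the pipeline is generated:

    # Only expose the rollback job when the package being deployed
    # contains no post-deployment migrations.
    rollback:web:
      stage: deploy
      rules:
        - if: '$HAS_POST_DEPLOY_MIGRATIONS == "true"'
          when: never
        - when: manual
      script:
        - ansible-playbook rollback-web.yml  # hypothetical playbook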
B
Because I think you would find it either way, yeah. But the thing is that if you run it just before the next deployment, it means that you don't know — you can't reason about it.
B
Deployment
and
there's
also
another
aspect
of
it
that
you
need
to
run
migration.
The
regular
one
before
the
cannery
deployment.
C
And
but
but
while
you're
yeah
I
mean
I
don't
I,
I
guess
I
I
see
your
point,
but
I
think
it
still
kind
of
puts
us
in
a
better
place
than
we
are
now,
because
how
often
are
problems
triggered
from
deployed
migrations
doesn't
happen
very
often
and
to
reason
about
that
we
could
always
just
roll
back
the
application
and
then,
if
the
problem
is
still
there,
then
we
would
say:
aha,
it's
supposed
to
play
migrations
like
we
would
just
keep
on
rolling
back
up
until
the
last
post-deploy
migration
right
and
then
you
would.
B
So what happened last time is that post-deployment migrations are kind of outside of our control, right? It requires an effort in socializing the idea in development, and making sure that we can actually do this. So if we can do our part of the homework, so that we actually detect them and know when they are there, then it's just a matter of showing — yeah.
B
We
were
able
to
roll
back
in
this
case,
and
then
you
can
do.
These
are
all
the
incident
that
we
were
not
able
to
roll
back
because
of
this,
and
because
this
tied
also
to
the
discussion
about
automating
completing
automating
deployment,
because
right
now
we
still
have
the
baking
time
and
we
click
and
we
click
the
play
button.
But
if
we
have
more
control
over
this
and
knowing
that
it
is
a
rollbackable
migration,
sorry
a
rebeccable
deployment,
then
you
can
just
be
more
easily
rolling
forward
than
because
you
can
roll
back.
F
Should we also identify the types of post-deployment migrations we have? Currently you can have two types. There are post-migrations that trigger a background migration, and I think in that case it is not possible to roll back. But there is another type of post-deployment migration, like the ones that add or remove indexes; those are declared inside the post-migrations because they are adding indexes to large tables such as projects, namespaces, or ci_builds.
F
But honestly, the indexes that are added in post-migrations are normally used for tooling — the usage research tooling, the one that is only executed every 13 days — so those are indexes that are not truly needed at the moment they are applied.
A
Cool. So the next comment we've got here, from Alessio: "we need a rollback pipeline". So how do we get to that, yeah?
B
We will not run them. What I was thinking is that if we run the deployment of a previous package, then the migrations contained in that package should already be in the database, so just running the deployment job would do nothing — I'm talking about regular migrations, the ones that we do up front — so it would be kind of a no-op, and it should be safe. Gitaly: in my opinion, we should avoid rolling back Gitaly.
B
I
was
also
working
on
the
pipeline
refactoring
before
the
school
and
in
the
original
attempt
to
run
to
make
the
rollback
pipeline.
The
rollback
deployment
is
already
there.
So
there's
there's
a
variable
that
say
if
we
are
rolling
back,
don't
do
easily
deployment
and
yeah.
This
is
what
I
was
thinking.
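Assuming such a variable, the gating might look roughly like this — the variable, job, and playbook names are guesses, not the actual refactoring branch:

    # Skip the Gitaly deployment whenever the pipeline runs as a rollback.
    gitaly:deploy:
      stage: deploy
      rules:
        - if: '$DEPLOY_ROLLBACK == "true"'
          when: never
        - when: on_success
      script:
        - ansible-playbook deploy-gitaly.yml  # hypothetical playbook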
B
This is a very good question. My point — what I'm thinking here — is that it's probably not worth touching the current .gitlab-ci.yml, because it's kind of huge. I would rather — because we still have to think about how we can — I mean, we know how to find post-deployment migrations; we need to actually code it. I think we'd enhance the Ansible script to detect this.
B
On
the
other
hand,
I'm
thinking
has,
as
I
speak,
so
what
what
will
happen
with
the
pipeline?
Refactoring
is
just
that
we
are
going
to
collapse,
jobs
together,
because
right
now
we
have
something
like
this,
so
we
have
think
about
the
web
fleet.
Just
as
an
example
right,
then
we
have
some
kind
of
skeleton
of
what
does
it
mean
to
deploy
the
web
fleet,
and
then
we
have
many
jobs
for
every
environment,
so
we
have
kind
of
gstg
web
fleets.
B
Then
we
have
gs,
gpr
dc
and
I
web
fleet,
and
then
we
have
gprd
so
all
the
environments,
but
they
are
the
same
job
except
from
some
variables
that
are
yeah
that
detects
the
stage
in
the
environment.
Basically,
so
in
the
current
situation,
this
information
are
not
detected
but
are
kind
of
supplied
by
this
gitlab
ci
yaml
file.
So
let
me
let
me
try.
B
What
I'm
working
right
now
is
doing
the
opposite,
so
you
can
have,
because
you
can
have
only
one
it
either
have
the
the
environment
and
eventually
the
canary
stage
or
not
so
the
same
job
would
just
detect
the
content
of
it
and
generate
those
information.
So
in
theory
we
should
be
able
to
run
the
same
job
and
we
and
ansible
will
do
the
right
thing,
because
the
variables
provided
are
the
right
one.
B
So if we start adding the logic for rolling back, it should just work, because it's still a matter of running it — basically running a deployment with the previous version, and cleverly skipping some jobs: do not run the post-deploy migrations, do not run the Gitaly deployment. So maybe we can.
C
So, in other words, the way you're seeing it is that we would have the release-tools pipeline, and there would be a manual trigger job that would be like a rollback, and that would trigger the Ansible pipeline with rollback equals true, which would then run the appropriate jobs at the previous version. And the deployer pipeline kind of stays the same, I guess — there's no special logic there, because you're just passing in the previous SHA and rollback.
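A rough sketch of that trigger job; the variable names and the downstream project path are assumptions:

    # Manual rollback trigger in release-tools: re-runs the deployer
    # pipeline with the previously deployed version and a rollback flag.
    rollback:
      stage: rollback
      when: manual
      variables:
        DEPLOY_ROLLBACK: "true"
        DEPLOY_VERSION: "$PREVIOUS_DEPLOY_SHA"
      trigger:
        project: gitlab-com/deployer  # hypothetical project path
        strategy: depend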
B
It
and
it
will
also
allow
us
to
start
thinking
about
dog
fooding,
the
the
rollback
feature
in
gitlab
itself,
because
I'm
quite
sure
it
works
more
or
less
the
same
way.
It
just
runs
the
same
pipeline
with
with
the
version
from
the
previous
deployment
and
something
like
rollback
equal
through.
So
I
would
just
say:
let
me,
let's
make
sure
that
the
variable
is
the
right.
It's
the
right
one,
so
that
we
are
kind
of
gitlab
wrote
back
compatible
in
a
certain
way,
but
yeah
the.
C
So then it would be the case that API, web, Sidekiq — well, API, web, Sidekiq, Pages — would all be treated as a unit: everything that's upgraded in parallel would be rolled back in parallel. Okay.
B
But we have paying customers that want this no-downtime deployment, so it's kind of what we should be doing, right?
A
Would we — on that proposal, Alessio, would you... would we?
B
Okay. So I think it was in another discussion — the one about deploying from master, but I'm not really sure — there was this concept of always testing rollback on staging, regardless of incidents.
B
So
if
we
have
this
pipeline-
but
we
let's
say
run
it
once
now,
because
we
are
coding
it,
then
we
never
run
it
for
six
months.
Then
we
just
cross
our
fingers
and
hope
that
it
still
works
when
we
actually
yeah
need
it
in
six
months,
three
months
whatever
so
one
idea
was
that
in
parallel
to
canon,
redeployment
or
something
like
that,
we
could
think
about
rolling
back
staging
and
then
rolling
forward
again.
B
So
that,
with
this
kind
of
approach
either
we
we
know
if
the
rolling,
if
rolling
back,
is
an
option
because
it
actually
works.
And
then
when
we
have
an
incident
it
will
make
absolutely
no
sense
to
start
from
from
staging,
because
you
have
the
problem
now.
So
you
want
to
mitigate
it
as
soon
as
possible,
and
then
you
can
just
drain
cannery
and
roll
back
production.
B
At the same time, we were hoping to have some possibility for our developers to actually run this. I ran some tests against canary, but with canary drained, because they were not able to reproduce it locally. But I mean, this is very much in the future. So I was thinking: right, we have an incident —
F
So I like the idea of rolling back on staging and then rolling forward, but I do wonder whether saying that everything was okay rolling back in staging is going to give us a reliable measure, because staging is quite different from production — in the traffic, in the database, in the feature flags that are enabled there. So I'm not sure about that one.
B
Yes, you're right. I don't know about the afternoons, Myra, but at least in my mornings I try to keep an eye on this. What really happens is that, because of a bug in our release-tools logic, we tend to tag two versions for every auto-deploy branch. So usually we have one merge request of difference between what is actually in staging and what is in production — because, for instance, if I have just some minor changes, I don't go for a second production rollout.
A
Are we confident enough with it to be able to test out rollbacks, or do we need to actually do more of the ideas we proposed in the deploy-from-master discussion and, you know, keep it a little bit closer to the version we're actually deploying?
B
Well, it's not exactly — no, it's still not the same, because that's the problem, right? We keep thinking about packages, but we never think about the delta. So — yeah, canary and staging will get every single package, but production usually skips one or two. We will not solve this one, but I mean, we are moving from not having this as an option at all to actually having rollback. So, I don't know, I mean —
B
Some
sometimes
seems
like
we
are
trying
to
really
solve
problems
that
are
really
far
far
away
and
we
still
know
nothing
about
the
the
process,
and
so
I
think
that
we
are.
There
are
several
steps.
We
could
do
and
start
iterating
on
it,
so
that
then
we
can.
Then
we
would
know
better
and
figure
out
if
we
actually
have
to
change
something
else.
D
I
think
alessio
raises
an
interesting
point
because
we
skip
revisions
in
production.
We
really
need
to
figure
out
if
a
post-deploy
migration
is
going
to
halt
us,
and
I
think
we
need
to
leverage
what
we
use
for
our
release
tracking
feature
to
use
that
as
a
gauge,
because
we
could
create
like
a
chat,
ups
command.
D
That
says,
can
I
roll
back-
and
it
just
knows,
what's
currently
on
production,
maybe
what's
being
deployed
to
production
versus
what
was
previously
on
production,
because
we
can't
just
compare
the
last
shot
that
was
on
canary,
because
that
might
be
a
revision
that
never
made
it
to
production.
For
example.
B
I was looking into this in issue 1061. There is a script — actually a one-liner — that tells you if you can or cannot, I think, but I'm not sure, because I'm not entirely confident in this part of the deployment. But what I'm thinking here is that you could have a pipeline that connects to the deployment box that runs migrations and runs, basically, gitlab-rails db:migrate:status and egreps for anything that's down, yeah.
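Wrapped in a pipeline job, the check described could look roughly like this — the host variable and the exact egrep pattern are illustrative, not the actual script from issue 1061:

    # Fail the job if any migration the current package knows about has
    # not run yet ("down") — in that case rolling back is not safe.
    check:pending-migrations:
      stage: check
      script:
        - ssh "$DEPLOY_HOST" 'sudo gitlab-rails db:migrate:status' > status.txt
        - '! egrep "^\s*down" status.txt'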
E
Are there any files added or changed in db/post_migrate? And then we just list those files in the JSON that we store in this metadata repository. That way, later, with tooling, we could just, you know, go to that repository, fetch the JSON, and get the list of migrations — so you wouldn't just know "oh, there are migrations", you'd also immediately see what the files are.
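A sketch of recording that list at deploy time; the job name, SHA variables, and JSON shape are assumptions:

    # Record which post-deploy migrations changed between the previous
    # and the current deploy, as a JSON list for the metadata repo.
    record:deploy-metadata:
      stage: record
      script:
        - git diff --name-only "$PREVIOUS_SHA" "$CURRENT_SHA" -- db/post_migrate/ > migrations.txt
        - jq -R -s 'split("\n") | map(select(length > 0)) | {post_deploy_migrations: .}' migrations.txt > deploy.json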
E
As I say that — the alternative is that when we record our deployments, we know the current SHA that we deploy and we know what the previous SHA is, because there we do store it in a more easily accessible manner, so we could record it then. The problem is that we then record the data after the deploy, not before it — whether that's —
E
Right, but the difficulty there is — to put us right on the release-tools side of things — there are really two points where we can hook it in. One is when we tag these auto-deploy tags, which is essentially at the start of the deploy; but at that point, figuring out what the previous deploy is is more difficult.
E
And long term, of course, we can change that — I think long term this release-tools code will change quite dramatically, you know, as we move to Kubernetes, for example. Hopefully at some point we stop deploying this big package, and then this entire tooling needs to change anyway, because right now it assumes that we deploy this big fat package and we sort of yank the SHA out of this, you know, big version identifier.
E
So
at
that
point,
when
we
do
that,
then
yes,
you
generate
this
diff
before
a
deploy,
store
it
and
then
you
can
do
things
like.
Oh,
we
changed
this
file.
Are
you
sure
you
want
to
deploy
automatically
things
like
that,
but
I
think
for
now
probably
it's
easiest
to
hook
into
the
deploy
tracking
that
we
already
have
and
then
just
store
a
list
of
migration
files
for
later
review
and
yeah.
Then
we
need
a
separate
tool
that
somewhat
takes
that
data
and
presents
percentage
shut
up
or
whatever
it
is.
E
— that, and, like, write down a procedure for this, because even if you fully automate it, it might break, might not work, so it's nice knowing what to do manually. Yeah — and then the next step is to record those migrations automatically, and then the third step would be some tool that presents the data.
B
In,
in
any
case,
I
would
like
to
so,
I
don't
really
believe
that
the
looking
at
git
diff
is
the
right
option
is
just
the
only
one
that
we
have
now
and
is
the
worst
one,
because
often
times
especially
on
staging,
we
may
have
things
that
got
reverted
or
things
like
that.
So
the
content,
that's
why
I
was
stressing
out.
B
So
the
idea
was
this
one
that
when
you
promote
a
production
build
so
when
you
promote
production,
build
your,
you
already
had
the
regular
migration
because
we
do
them
on
canary.
E
— production. I think in that case we can do it, because we track the finished migrations in the database, in a simple table that you can query. What you would essentially do is fetch all those versions, fetch all the files on disk in db/post_migrate, and basically get the diff of that.
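Concretely, that diff could be computed roughly like this — the schema_migrations table and db/post_migrate directory are real Rails/GitLab layout, but the job itself and the omnibus path usage are assumptions:

    # Post-deploy migration files shipped on disk whose versions are not
    # yet recorded as finished in the schema_migrations table.
    check:unapplied-post-migrations:
      stage: check
      script:
        - gitlab-psql -t -c 'SELECT version FROM schema_migrations' | tr -d ' ' | sort > applied.txt
        - ls /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate | grep -o '^[0-9]\+' | sort > on_disk.txt
        - comm -13 applied.txt on_disk.txt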
E
The challenge then becomes that we need some sort of program running on these hosts. That's doable — you know, whether it's via grep or however you do it doesn't matter, but you need to somehow run it on the host. With the Ansible approach I think that works, because we already essentially SSH in and run the commands. I'm not sure how well this would work if we do Kubernetes, because the deployment approach is very different there. And what you could do —
E
The
problem
you
get,
then,
is:
if
you
do
that
for
20
hosts,
you
get
the
data
20
times
because
they
don't
know
which
ones
is
sort
of
the
the
leader
in
that
sense,
so
I'm
not
really
sure
how
we
would
implement
this
appropriately,
given
the
state
that
we
are
in,
but
also
the
state
that
we
want
to
move
towards
when
it
comes
to
deployments.
B
So
that's
exactly
that
type
of
things
right,
because
you
want
to
be
able
to
run
a
pod
with
certain
script
or
whatever,
after
or
right
or
right
before,
right
after
migrations,
and
things
like
that,
so
I
mean
it's
as
soon
as
we
know
how
to
do
it.
It's
just
a
matter
of
making
sure
that
we
can
port
this
over
to
to
kubernetes,
and
even
if
we
don't,
we
can
still
run
migrations
outside
of
kubernetes,
because
we
will
ship
packages
for
to
our
customers.
C
I think we're going to have migrations done outside of the Kubernetes cluster — I'm not sure how that's going to work. And I see us having, you know, the four clusters, and us being able, at least for the frontend, to just do all-or-nothing rollbacks for each cluster, because with autoscaling — I mean, we do this now, where we completely drain a cluster and then the other clusters are able to take on the extra load.
C
This doesn't work with Sidekiq currently, because it runs in the regional cluster, and I don't think we're going to be splitting Sidekiq. But at least for the frontend — web, API, and git — I see that we'll probably just do full cluster rollbacks, and we'll drain clusters if we need to.
F
I'm sorry — yeah, sure. So there is a blueprint being developed by the database team about testing database changes in a production-like environment, and the idea is to test the migration, whether it's a regular DB migration or a background migration.
A
Yeah, great — thanks for sharing that. And this is the other thing that's actually interesting about rollbacks: hopefully these will help us go back and work with development on how to make code changes and migrations safer, so that we don't need to be rolling back stuff. So yeah, that's a good one, I think, for everyone to know about, so we can help people use that as well.
A
So we've got about 20 minutes left — how do we want to take this forwards? I think, to be clear: this OKR will pick up from where we're up to with coordinated deployments, so it's not the case that we need to shut all that work down and immediately begin rolling back on Monday; one should feed into the other. But what other things do we either want to prioritize, or test out, or investigate further to help move this?
A
Cool, okay, yeah, that makes sense. And then, alongside that, or as part of that, let's also figure out the pieces of the coordinated deployments work that will help either the early stage of rollback or the next stuff. I'm thinking: is there any tech debt that it would make sense to have paid down, or anything around notifications? I suppose that's the other interesting thing about rollbacks — keeping track of what has deployed, or not deployed, or partially deployed — and then we can front-load that stuff in the early part of the quarter.
B
Are those ideas — mine and Jarv's — really different? I mean, it's kind of doing a subset versus doing it as a whole; as long as you can still do both, it's still the same process. The only thing that we kind of excluded is — no, let me rephrase. There could be configuration changes, and rolling out Registry, for instance, or GitLab Shell, is a configuration change right now, because it lives as configuration in the k8s workloads. And I think we said that those are part of what we want to do, but kind of outside the scope of the early part, because we already have this, right? You can just reverse the change and deploy again. Unless I misunderstood.
C
My
my
my
preference
here
would
be
to
focus
on
alessio's
proposal
for
the
deployer
pipeline
and
then,
in
parallel,
add
rollback
manual
jobs
to
the
gates
workloads
pipeline,
which
will
allow
us
to
revert,
which
is
basically
just
it's
just
a
manual
job
that
sits
next
to
the
upgrade
job
that
will
do
a
helm
rollback,
and
this,
I
think,
is
important
for
the
configuration
changes
and
also
we
could
use
it
for
application
changes
that
are
treated
like
configuration
changes
like
registry.
C
I
mean,
I
think,
it's
a
pretty
simple
thing
to
do.
Anyway.
It's
not
going
to
take
a
long
time
to
make
that
pipeline
change.
What
do
you
guys
think
of
that.
C
Yeah, it would. And what would happen is: if we auto-deployed after that, it would fail, because we do this diff check to see if there are any unexpected changes, and it would fail the dry run. So we would have to wait until someone made a new MR — either we would do a proper revert and revert the change so that it's clean, or we would move forward. But it would at least allow us to roll back a cluster quickly if we needed to.
C
Not
the
trigger
well,
the
trigger
will
fail,
but
what
actually
fails
is
the
downstream
job
that
does
the
diff,
because
when
we
do,
the
diff
we
also
take
a
look
to
see
are:
is
the
set
of
changes
consistent
with
an
image
only
update,
and
in
this
case
we
would
see
other
things
that
are
pending
and
it
would
just
fail.
C
Yeah
but
I
think
yeah
it
wouldn't
prevent
the
rest
of
the
fleet.
It
would
just
prevent
the
kubernetes,
but
this
is
good
because
otherwise
we
would
revert
the
change
and
then
the
next
auto
deploy
would
just
like
apply
it
back,
because
right,
because
what
like
you
said,
what's
on
the
git
repository,
isn't
consistent
with
the
cluster.
A
Oh
okay,
that
sounds
good
and
then
one
other
piece
which
I'm
sort
of
thinking
about
which
I'll
put
on
the
okr
issue
most
likely,
is
how
we
link
this
to
something
measurable,
so
marian
suggested
mean
time
to
resolution,
which
makes
sense
from
this
being
an
incident
related
thing.
That's
not
a
number,
that's
being
tracked
at
the
moment,
and
obviously
it's
also
a
number
that
when
we
start
tracking,
it
will
be
hugely
impacted
by
lots
and
lots
of
other
things.
A
So let's do that, and have a think about whether there are follow-up conversations we need to have, or whether we have some issues we could review, for example. But I think it'd be a good idea to keep these two aspects of rollbacks in one conversation, so that we actually are —
A
I
want
to
make
sure
that,
when
anyone's
a
release,
manager
you're
all
comfortable
with
here
are
rollback
options
and
it's
not
like
kubernetes
rollbacks
being
totally
separate
from
code
rollbacks.
So,
let's
even
if
they're
two
different
solutions,
let's
also
discuss
both
of
them.
A
Nope — fantastic. Okay, so I'll set up something for next week; I'll probably just put in an hour. Hopefully we'll be starting to wade through some of the uncertainty as we go, and we can check in on where we're up to and work out some next steps. Cool, all right — speak to you all soon. Take care, bye.