From YouTube: Auto Rollback in GitLab CI/CD
Description
This video explains the new feature in GitLab 13.7 - Auto Rollback.
If you have questions, feedback or suggestions on this feature, please leave a comment in the issue https://gitlab.com/gitlab-org/gitlab/-/issues/35404 or create a new issue https://gitlab.com/gitlab-org/gitlab/-/issues/new.
A
Hi everyone, I am Shinya from the Release group. Today I want to talk about this cool feature. It's called Auto Rollback. This feature will be shipped in GitLab 13.7. The point of this feature is to automate some of the CD workflow.
A
In theory, all of the commits are safe because they were verified in merge requests by CI pipelines, probably by a bunch of testing jobs, for example RSpec jobs. These jobs make sure that the code which will land on the production environment is safe. But sometimes a problematic commit could slip into the production environment, for example when the feature is logically correct and logically works.
A
So this feature is about rolling back to a previous, stable environment if the recent deployment had some trouble. If there's a problem, a new alert is raised, and on receiving that alert, GitLab automatically creates a new deployment targeting a previous stable commit and automatically mitigates the production issue.
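The flow described here (alert raised, previous stable commit found, new deployment created) can be sketched roughly as follows. This is a minimal illustration, not GitLab's actual implementation: the `Deployment` class, the function names, and the status strings are all hypothetical, and it assumes that "previous stable" means the most recent successful deployment before the latest one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one deployment to an environment."""
    id: int
    sha: str
    status: str  # assumed values: "success", "failed", "created"


def pick_rollback_target(history: List[Deployment]) -> Optional[Deployment]:
    """Return the most recent successful deployment before the latest one."""
    if len(history) < 2:
        return None  # nothing older to roll back to
    for deployment in reversed(history[:-1]):
        if deployment.status == "success":
            return deployment
    return None


def on_critical_alert(history: List[Deployment]) -> Optional[Deployment]:
    """On a critical alert, create a new deployment of the previous stable commit."""
    target = pick_rollback_target(history)
    if target is None:
        return None
    # The rollback is a new deployment that reuses the stable commit SHA.
    return Deployment(id=history[-1].id + 1, sha=target.sha, status="created")
```

For example, with a history of a stable deployment followed by a bad one, `on_critical_alert` returns a new deployment that points at the stable commit's SHA.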
A
So that's the point of this feature. Let's dive into the demo. Okay, here we are seeing this demo project. Let me briefly explain it: this is a Ruby on Rails application that is already configured with Auto DevOps.
A
You can learn more about Auto DevOps in the official documentation page, but basically you don't need to do anything to set up pipelines or CI jobs. Everything is automatically configured, and the application is deployed to a Kubernetes cluster.
A
So I already configured this. Let's check out the environment page here. The production environment is already created. Let's check out the web page. Since this is a very basic application, it just shows a simple page, but that's enough for the demonstration. We are also seeing the status of the Kubernetes cluster: there are two pods on the production environment. Next, let's take a look at the monitoring dashboard. Here are a couple of metrics showing how this environment performs.
A
So if something went wrong in our application code, we get a 500 error. A 500 error means internal server error, that is, something went wrong during the processing of user requests. So ideally the rate of 500s should be zero percent or nearly zero percent. But sometimes you might see a spike in this rate.
A
That happens if a bad commit, a broken commit, is deployed to the production environment. So let's try to make a bad commit here, intentionally.
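As a rough illustration of the error-rate metric discussed above, the sketch below computes the percentage of requests that returned a server error and compares it against a threshold. The function names and the 5 percent default threshold are assumptions made for this example; they are not taken from the demo or from GitLab's alerting configuration.

```python
from typing import Sequence


def error_rate_percent(status_codes: Sequence[int]) -> float:
    """Percentage of responses that are server errors (5xx, which includes 500)."""
    if not status_codes:
        return 0.0  # no traffic, so no errors to report
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return 100.0 * errors / len(status_codes)


def should_alert(status_codes: Sequence[int], threshold_percent: float = 5.0) -> bool:
    """True when the error rate exceeds the (hypothetical) alert threshold."""
    return error_rate_percent(status_codes) > threshold_percent
```

With healthy traffic the rate stays at or near zero and no alert fires; after a broken commit where every request fails, the rate reaches 100 percent and the check returns `True`.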
A
This should be caught at the testing phase, but here we are in the situation where the bad commit slipped through the testing phase and landed on production. Let's wait a bit until this gets to production. It doesn't take long, but let's resume after this pipeline has finished. Okay, so the deployment pipeline has just finished and this problematic commit landed on the production environment. Let's take a look at the alerts.
A
We are seeing commit a1fc020. This was the bad commit we made, and right after the deployment there's a huge spike here. The error rate increased to 100 percent. In a typical situation, SREs would start investigating what went wrong.
A
What caused this spike? If they figure out that this deployment is related to this incident, maybe they perform a rollback. But what's interesting here is that we see another deployment. This deployment was created by Auto Rollback, the newly introduced feature in 13.7. So this deployment was triggered by this alert, which was raised for the critical error rate, and GitLab automatically tried to mitigate the problem by redeploying the previous stable deployment.
A
So, interestingly, we can see that the spike is mitigated from 100 percent to zero right after this Auto Rollback.
A
So this feature frees operators from the duty of constantly watching the metrics and the alerts by mitigating the problem automatically. Let's take a look at the alert page again. Here, the alert is gone. This is because the problem was resolved by the Auto Rollback.
A
Lastly, let's take a look at the environment page. Here's the deployment index page. We are seeing the history of deployments, and here the latest one, number 20, is the deployment created by Auto Rollback. This is the safe one. The previous one, a1fc, was the deployment we made with the problematic commit that triggered the high spike.
A
Please make sure that this meets your criteria before you actually enable the feature. To enable this feature, you need to visit the project configuration page. Here are the steps to enable the feature.
A
This feature is disabled by default, but it's worth considering. If you have any questions, feedback, or suggestions to improve this feature, please leave a comment in this issue or create a new issue here in this GitLab project. Your valuable feedback is always welcome.