From YouTube: 2022-07-20 GitLab.com k8s migration (EMEA/AMER)
A: Welcome to July. Goodness, July 20th already; the year is going by very quickly. I've only got one little item to discuss, or actually to demo, on our agenda today. I've been working on trying to figure out ways to help prevent auto deploys from being blocked in the case that a cluster is down for maintenance, whether that be planned or unplanned.
A: If it is planned maintenance, we could certainly perform what I'm about to demo ahead of time. If it's unplanned, we're obviously probably suffering some major incident, and we could perform this same style of remediation at that moment in time, but I would imagine the deploys would probably be blocked regardless because of other situations. Still, in the interest of potentially avoiding all the deploys being blocked, I would like to demo or showcase how we could potentially avoid such catastrophic events.
A: Normally this all happens via CI; I'm going to replicate locally, effectively, what CI does. So if I go into our favorite repo... okay, I don't care. This is the repository that we use to perform auto deploys. When a CI job wants to perform a deploy, it simply runs a k-ctl upgrade, so our diff jobs, for example. This is going to be targeting our pre cluster.
A: Our deployments contain two stages, where the first one is a dry run, but both use the upgrade command. So if we do a dry run real quick, we'll hopefully see that there are no changes in preprod; there's nothing exciting in the diff that I have for this cluster, so we shouldn't see any changes.
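(For context, a minimal sketch of that two-stage flow from a shell; the exact k-ctl invocation below is an assumption for illustration, not copied from the repo:)

    # Illustrative only: the auto-deploy jobs wrap the upgrade in a k-ctl script;
    # the environment variable and subcommand names here are assumed.
    export ENVIRONMENT=pre        # assumed way of targeting the pre (preprod) cluster
    ./bin/k-ctl dry_run           # stage 1: render the diff, change nothing
    ./bin/k-ctl upgrade           # stage 2: perform the actual upgrade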
A: There are more things that happen inside of CI, such as being able to pull the key that's used and then authenticate, but all of that happens as a step prior to us actually running k-ctl inside of CI. We have some hidden text on my screen because of my color choices in my shell. So now, assuming this is an auto deploy, the next step would be to run this precise command, which would actually perform the upgrade.
A: Given that we don't have any changes here, I'm not going to run this command, because it's quite useless. Now, if we are undergoing maintenance (I plan on creating a runbook for this; that's a work in progress at the moment), effectively what we'll do is just set cluster... whoops, CLUSTER_SKIP equals the name of the cluster; in this case pre is the name of the environment.
A: This does the exact same thing, except it adds a little notification saying: hey, we're skipping this cluster because you told me to, and we're going to exit cleanly. So if I echo the exit code, we'll see that it is zero. The reason we exit cleanly is because we target specific clusters for upgrades during auto deploys: we deploy to our regional cluster and cluster B at the same time, and then we deploy to cluster C and cluster D.
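(Roughly what that skip looks like on the command line; the CLUSTER_SKIP variable comes from the demo, while the rest of the invocation is illustrative:)

    # Skip the named cluster: the wrapper prints a notice and exits 0 instead of
    # deploying, so the rest of the auto-deploy pipeline is not blocked.
    CLUSTER_SKIP=pre ./bin/k-ctl upgrade

    # Confirm the clean exit code.
    echo $?    # expected: 0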
A: So I guess there are two follow-up items that I'm currently working on. One is the fact that this exits cleanly, so there's the chance that if someone leaves this variable hanging around, with a cluster still set in that variable configuration, we might accidentally skip deploying to a cluster without actually realizing it. To solve that problem, I'm going to try to figure out if there's a way that we could parse which GitLab version is running via metrics and create an alert for it.
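(One possible shape for that check, sketched against the Prometheus HTTP API; the metric name and the Prometheus URL here are placeholders, since, as noted below, it isn't clear yet where the cluster-side version information would come from:)

    # Hypothetical: list the GitLab versions reported per environment so a mismatch
    # (a skipped cluster left behind) could be alerted on. Names are placeholders.
    PROM="https://prometheus.example.internal"
    QUERY='count by (environment, version) (gitlab_version_info)'
    curl -sG "${PROM}/api/v1/query" --data-urlencode "query=${QUERY}" | jq '.data.result'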
A: I know we have this information on our Omnibus installation, so all of our GitLab nodes are exposing which version of GitLab is installed, but I don't think our clusters are, or at least I can't quickly find it, so I'm going to try to figure out where that could come from and see what we could do to add an alert. And then there's the CI job timeout: I think it's 60 minutes, or it might be two hours.
A: I can't remember off the top of my head. I'll create an alert that takes that timeout into account, such that if we do have a legitimate fire of some kind, we're not alerted unnecessarily; rather, it'll be a very fine-tuned alert instead. And lastly would be improvements to our runbooks, such that we have the actual procedure that I'm proposing is necessary.
A: I'll just share my screen really quickly. I'm effectively going to be documenting this, but in a nicer format, inside of a runbook, where the first step is to identify a cluster; on my shell here I locally tested preprod. We would input the CI variable (I set it on the command line, but obviously we would use our CI variables for the ops project), and then we would need to notify the release managers.
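(If we go the CI-variable route, a hedged sketch using the standard GitLab project-level variables API; the ops project ID and token are placeholders:)

    # Set CLUSTER_SKIP as a CI/CD variable on the ops project (placeholders below).
    curl --request POST --header "PRIVATE-TOKEN: <token>" \
      --form "key=CLUSTER_SKIP" --form "value=pre" \
      "https://ops.gitlab.net/api/v4/projects/<project-id>/variables"

    # Remove it again once the maintenance window is over.
    curl --request DELETE --header "PRIVATE-TOKEN: <token>" \
      "https://ops.gitlab.net/api/v4/projects/<project-id>/variables/CLUSTER_SKIP"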
A: It should be, you know, less than two million requests per second, for example; I think that'd be a good way to check for that, and we could probably script that out. Then we perform whatever maintenance we want. So, coming soon, I guess I'll be taking one of the staging clusters down, just to test that all of this is working properly, that we have a solid runbook, and that we've got the necessary pieces in place.
A: Part of that is the alerts: making sure we silence the appropriate alerting, so that when we're performing normal maintenance we're not unnecessarily paging the EOCs. Then afterwards, if necessary, we bring the cluster back to parity. So if, by chance, we had a deployment occur while a cluster was down, we want to make sure that cluster is running the correct version of everything prior to it taking traffic again; that way, we're not accidentally downgrading our services and inducing an outage because of that type of situation.
A: I need to figure out how to do that, and I've got ideas. Currently my thought is that we would simply replay a deploy job that was previously skipped: we would remove that variable, replay the job, and that brings the cluster back into parity. After confirmation that all is good, we could then run the set service data command, bring traffic back online, and then we're effectively complete.
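(The replay step could look something like this; just a sketch using the standard GitLab jobs API, with the project and job IDs as placeholders:)

    # Retry the deploy job that was skipped (or ran without this cluster) so the
    # cluster converges on the same version before taking traffic again.
    curl --request POST --header "PRIVATE-TOKEN: <token>" \
      "https://ops.gitlab.net/api/v4/projects/<project-id>/jobs/<job-id>/retry"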
B: I was thinking we could have a chat about the OKRs, but I think we're still missing a few people and so on. I think it's something that we should keep doing in the issue, and if there are more questions we can pick it up in a different session.
A: How about this: if we have more discussion items on the issue, maybe we can bring something to next week's agenda, if you'd like.