From YouTube: 2022-11-09 AMA about GitLab releases
Description
Delivery Group's monthly AMA about GitLab deployments and releases
A
Okay, I'm going to go ahead and get started, so welcome. This is November the 9th, 2022, and this is the Delivery group's monthly AMA. Thank you all for joining. We have a question in the agenda, but I don't think Kesha is here, so I'll verbalize it: what are the main challenges you've faced daily? Who would like to go ahead and share? I'm not sure Harper has enough time, but give us a summary. Who would like to share some of the daily challenges of working in the Delivery group, or of managing deployments and releases?
B
So I'll say that there is a main challenge that is a recurring one, which is that as part of the Delivery group you do shifts as release manager. So basically you can think of what you want to achieve during a year, and you have your quarters, your OKRs and everything, but then there's a full month where you are doing release management, where basically everything else stops.
B
This is really hard on a personal level, as well as on a team level. Then, if we go to what we do as Delivery outside of being release manager, one of the main challenges in our case is communication, because we are kind of at the end of the process. When we want to change something, when we want to implement something new, we have to not only implement it but also convince everyone, starting from product, development, whoever, of why we're doing something and what we're doing. So this is another big challenge, and then finally there are the challenges as release manager.
B
When you are doing release management, I would say the big challenges are incidents, because these are the things that block your day-to-day activities. A deployment to GitLab.com takes hours, and basically we try to fit as much as we can into the 24 hours, but getting out of an incident, if it involves a package that has to be rebuilt or fixed or something like that, can easily take eight-plus hours.
A
Nice, thanks for sharing that. Yeah, I definitely resonate with a lot of those. I think it's very interesting: as a release manager, some days are incredibly quiet and it's just promoting to GitLab.com. We just have one manual step in our process, so we just hit the button and hopefully it all moves through slowly, and we're all good there. And then there are other days, particularly around security releases and the monthly release, where you're actually trying to coordinate all of the different tasks you're trying to accomplish, and if you have a failing pipeline it's incredibly time consuming. That's probably the bit that I find most challenging: you're in this reactive mode and trying to coordinate all the pieces.
C
Right, as a follow-up question, or maybe not quite: I was also told that when someone merges an MR with a feature flag and they did not enable it in staging, they won't be able to enable that piece of work in production.
C
I have another question. At my previous company we obviously had sandboxes and a pre-production environment, all these different environments where you could actually test out your code without actually deploying it, before merging it into master. And when I joined GitLab recently, I realized that we primarily do testing locally, and then we merge it into master and that basically goes through different stages, where you QA in staging and then in Canary and so on. I was just wondering about the decision not to have any testing or QA environments, and instead to push it all towards after merging into master. How did that decision come to be, and are there any challenges around that? Because I guess people that are newly joined might merge something that wasn't quite ready, maybe, I don't know, but they didn't really have a sandbox to test it in. If you know what I mean, I hope that makes sense.
E
So one angle is that GitLab receives a lot of commits daily and even hourly, so we need to be fast enough to try to keep the pace. Actually, a funny thing is that in the beginning, three or four years ago, we only performed two or three deployments to production, because we were testing commits in a different environment, but that was so slow that by the time we promoted to production everything was broken, because we were not able to test it in a real environment.
E
So because of that, we decided to be more agile and try to have this auto-deploy pipeline that promotes everything at a very fast pace. And we do have some QA environments: staging, Canary, we also have staging-ref, and we also have production Canary, and in each of these environments we have the respective QA that runs smoke and reliable tests. And in terms of when something is not ready and you want to ship it just to continue with your development, we have other tools like feature flags, where you can merge something into master and it shouldn't have an impact, because the changes are under a feature flag.
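A minimal sketch of the kind of feature-flag gate being described, purely illustrative and not GitLab's actual code; the flag name and helper functions here are hypothetical:

    # Illustrative sketch only; the flag name and helper functions are hypothetical.
    FLAGS = {"new_checkout_flow": False}  # merged to master with the flag defaulted off

    def feature_enabled(name):
        # A real system would consult a flag service, scoped per environment or actor.
        return FLAGS.get(name, False)

    def legacy_checkout(cart):
        return sum(cart)

    def new_checkout(cart):
        # The new, still-unfinished code path; safe to merge because it stays dark.
        return sum(cart)

    def checkout(cart):
        if feature_enabled("new_checkout_flow"):
            return new_checkout(cart)
        return legacy_checkout(cart)  # existing behaviour keeps serving production

    print(checkout([10.0, 2.5]))  # 12.5 with the flag off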
D
And even with merges into the primary GitLab project, or the other projects that make up this whole suite, there are actually pipelines and the ability to run all of these: build the commit, deploy it, run a full end-to-end test, even run some surrounding full-application-suite tests in some scenarios. They aren't always used, to be fair, but they are there; they're just not always enforced.
D
There's a suite of tests that has the ability to do a full build and deployment of your code and all the code related to whatever your changes are. So if, for example, there's an API change in Gitaly and we need to make that same API change in the Rails code that makes use of it, you can actually deploy all of those components as a whole suite that will build everything, deploy it, and actually perform tests against it.
G
Yeah, but I think what you're getting at is that, ideally, we would have a pre-production environment that's very similar to what we have in production, where you could click a button, deploy your code, and just do your manual testing that way, right. We don't have that right now, because it's difficult to set up a data set of that size and to enable a thousand people to be able to work on it. So we have these proxy things, where we have the staging and Canary deployments and the automated QA; it's not completely the same. I can see there being a case for having a much nicer environment that actually mimics production as closely as possible, but I don't think we're there yet. There are a lot of little things we can do before we need to do that, but it's something we should probably consider.
B
We are actually working on something that is, say, partially related to this, because we are extending the standard backport policy from only the current milestone to three versions back, and so we actually have the problem of not having a long-running environment for all three stable versions. So we are working on a way to have those environments set up and running, in a kind of continuous-delivery way from the branches instead of just installing the packages, and as we are going through this we are actually thinking about it.
B
There is a problem with the number of merge requests, because if you take a look at the number of merge requests that we have in the GitLab project, if everyone is going to spin up a Kubernetes cluster, or just a namespace inside a cluster, and run pods, this is also going to be a huge cost issue. So there are many things connected to this.
A
I would just mention, as a sort of final note: near the top, just above the agenda, I've added three links to the three big things that the Delivery group is working on this quarter. The first one, the maintenance policy extension, is what Alessio just mentioned, so hopefully we'll be able to open up the maintenance policy a little bit within the next quarter or two and be regularly accepting bug fixes back a couple more versions. We've also just started working on our deployment pipeline observability work.
A
At the moment we don't have brilliant data or trending around our deployment pipelines, so we can't easily see whether we commonly have long-running pipelines, or whether a certain job has trended up, or whether it fails in certain patterns. So this is going to be the first piece: actually starting to build that out.
A
That way we can make better decisions about how to improve the pipelines. And then our final piece is that we're working to make the Kubernetes clusters easy to rebuild. This is a good bit of maintenance that sits on top of the Kubernetes migration work we've been doing, to hopefully make the clusters a bit more flexible, and hopefully that unlocks some other deployment approaches for us in the future as well.
A
Yeah, this is a great question. So what would happen in this case is that it would most likely, hopefully, be caught on our staging Canary, the first environment where we run QA tests on the package; we would most likely end up with failing tests. And I should say, actually, the very first step is that, with a bit of luck, it would have failed to merge, right.
A
Hopefully it would have failed tests on the merge pipeline and wouldn't actually have merged, but in the event it did, it would most likely have been caught on our staging Canary environment: the deployment would have rolled out the package and the tests would fail. At that point we would ask the quality on-call engineer, who is one of the software engineers in test.
A
They have a rotation; they would do an investigation to figure out which test is failing and what the cause of that is. Most often they would identify the MR quite quickly, and if you weren't online we would use the dev escalation process: the development DRI would revert the change to unblock the pipeline. So that's the most usual path.
A
There are times where the failure can be a bit obscure, and maybe we can't necessarily identify the exact MR causing it, in which case there's a bit more investigation. Again, dev escalation is the process we use to engage people from within development when we don't know who specifically, or which specific stage group, to go to, until we investigate and find the actual cause, and then we revert that out. So we have processes to catch us, but they are fairly involved.
A
They do generally require three, four, maybe more people to be involved, and they also take a while. One of the things Alessio mentioned earlier as one of the actual pain points in our process is reverting: it's a really slow process, because a revert MR is basically the same as any MR, so an MR has to be created, it has to get merged, we have to build a new package, and we have to deploy that.
A
So actually the turnaround time between a broken staging Canary and a recovered staging Canary is quite a lot of hours, which is why, certainly in Delivery, we are huge fans of feature flags. If you're shipping something that might be risky, a feature flag is a really great way to do it, because it puts the control back on your timeline: we can just get the code change deployed to the environment, but it's completely up to you when you turn it on. You can do that in your day, you can see the tests running, and if there are any problems you can turn it off. So it's certainly a much shorter recovery loop for us.
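As a rough, purely illustrative sketch of that shorter recovery loop; the rollout steps, metric source and threshold here are assumptions, not GitLab's tooling:

    # Illustrative sketch only; the rollout steps, metric source and threshold are assumptions.
    import time

    ROLLOUT_STEPS = [1, 10, 50, 100]  # percentage of actors seeing the new code path

    def error_rate():
        # A real check would query monitoring; pretend everything is healthy here.
        return 0.001

    def set_flag_percentage(flag, percent):
        print(f"{flag} enabled for {percent}% of actors")

    def gradual_rollout(flag, max_error_rate=0.01, observe_seconds=1):
        for percent in ROLLOUT_STEPS:
            set_flag_percentage(flag, percent)
            time.sleep(observe_seconds)  # observation window; minutes or hours in practice
            if error_rate() > max_error_rate:
                set_flag_percentage(flag, 0)  # turning the flag off is the whole "rollback"
                return False
        return True

    gradual_rollout("new_checkout_flow")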
B
There's also something to be said here, which is that we have rollback as an option. That lowers the time it takes to recover the environment, but it does not lower the time it takes to get out of the incident, because in order for us to restart the regular deployment process we still need to have that thing reverted or fixed, usually reverted, and then merged, packaged and everything. So the total number of hours is still there.
A
Yeah, it's probably maybe once a week, I would guess, on the whole. I think maintainer approvals, reviewers and our merge pipeline tests do a great job, and a lot of feature flags are in use as well, so I think an awful lot of these sorts of things are avoided, but I would say on average around one a week.
A
The real pain point for these things is the impact they have on other teams. The most significant time is in the several, say between two and four, days before the monthly release, because if we don't do a deployment for seven or eight hours, quite a lot of changes don't make the monthly release because of the hard deadline. So the impact can be quite wide, but we do only see a reasonably small number of these.
A
But I do think that with our process, our reviews and all the tests that go on in the merge pipelines, an awful lot of stuff is caught very early on. So we shouldn't be in fear of this happening, but a feature flag is a great way to go if we have something that's slightly risky.
B
There's also the impact of the thing that is broken that has to be considered, right. If you're talking about a major feature broken with no workaround, that's exactly what we described: we're going to stop everything, work to revert, and nothing will continue. But if we're talking about a minor feature that is not behaving correctly, but there are workarounds, or it's not really used that often, maybe it's something that is not even covered by a QA reliable test.
B
So we can't notice it until it's in production, and at that point you ask, okay, what's the impact of this? We can decide to keep it buggy and just fix it, and in the next 6 to 12 hours the fix will be rolled through all the environments, and so it gets fixed. So rolling back is not always the option; we're talking about priority one and two issues, mostly.
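A tiny decision sketch of that trade-off, illustrative only; the severity labels and timings just echo the rough figures mentioned above, not a formal policy:

    # Illustrative sketch only; severity labels and timings follow the rough
    # figures mentioned above, not a formal policy.
    def handle_regression(severity, workaround_exists):
        if severity <= 2 and not workaround_exists:
            # Major breakage: stop deployments, revert or roll back.
            return "halt and revert / roll back"
        # Minor impact or workaround available: keep shipping and fix forward; the
        # fix rolls through all environments with the next deployments (~6-12 hours).
        return "fix forward in a follow-up MR"

    print(handle_regression(severity=1, workaround_exists=False))
    print(handle_regression(severity=3, workaround_exists=True))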
C
And when something goes into production, I might have read or heard somewhere that people actually manually enable the deployment to the next stage. Is that what happens, correct?
B
Right, and there is a baking time of one hour before that. So when we have had the package running on the Canary stages for one hour, the release managers receive a ping, and there is information about the health of the system that gets taken at that point in time, and then you just click the button.
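Roughly, and only as an illustrative sketch, the gate looks like this; the health check and notification mechanism are assumptions:

    # Illustrative sketch only; the health check and notification mechanism are assumptions.
    import time

    BAKE_SECONDS = 60 * 60  # the package runs on Canary for an hour before promotion

    def canary_healthy():
        # A real check would look at error rates, apdex, saturation and so on.
        return True

    def notify_release_managers(message):
        print(f"ping release managers: {message}")

    def wait_for_bake_then_ping(package, started_at):
        remaining = BAKE_SECONDS - (time.time() - started_at)
        if remaining > 0:
            time.sleep(remaining)
        notify_release_managers(
            f"{package} has baked on Canary, health ok: {canary_healthy()}. "
            "Promotion to production waits for a human to click the button."
        )

    # Pretend the package started baking an hour ago so the example returns immediately.
    wait_for_bake_then_ping("example-package", time.time() - BAKE_SECONDS)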
B
Release managers are not wearing a pager, so there is no expectation for us to actually be online when this happens, and this is one of the main reasons we started working on automating this a couple of quarters ago. But we were starting from far, far away from that point: we implemented rollbacks and all the tests, but we never went through the phase of just automating the rollout, because we have no availability list of release managers.
C
Sorry, go ahead.
A
I was just going to add to that: when a deployment is rolling out to the production environment, if there are any problems or questions, it's the release managers who would join the incident to help the engineer on call. So we have a kind of responsibility there: when changes are rolling to production, we guarantee that there's a release manager available at that time. So just to keep those two things in sync, we still have the manual promotion.
C
Right, and we have release managers in, I assume, every time zone, so they're covering 24 hours a day?
A
That's right, yep.
F
This most recent line of questioning reminds me of the fact that we still have to be available: a Delivery team member must be available whenever a deployment package is going out. So I guess the challenge here is, what could we do to remove ourselves from having to be online, and make it fully automated, without us needing to be around to babysit the situation? What can we do to enable, I don't know, infrastructure or maybe some other team to just say, oh well, here's a problem?
A
Actually, rather than passing the responsibility to another team, I think having the automated rollback is the solution, right: increase the health checks on our environments and roll back if certain flags are hit. That way, even if the deployment paused at the next stage, we're not relying on any human to have to be the one watching. The risk of that, of course, is that you really have to get those flags to be accurate, because otherwise you have a lot of rollbacks, which also take time, when you didn't need them. I think that's probably the biggest challenge to actually enabling something like that.
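A minimal sketch of that idea, with made-up health signals and thresholds; none of this reflects existing tooling:

    # Illustrative sketch only; the health signals, thresholds and rollback hook are made up.
    HEALTH_CHECKS = {
        "error_rate": (lambda: 0.002, 0.01),      # (reader, maximum allowed)
        "p95_latency_seconds": (lambda: 0.4, 1.0),
    }

    def failing_checks():
        return [name for name, (read, limit) in HEALTH_CHECKS.items() if read() > limit]

    def rollback(environment):
        print(f"rolling {environment} back to the previously deployed package")

    def post_deploy_gate(environment):
        failures = failing_checks()
        if failures:
            # Roll back automatically instead of waiting for a release manager to notice.
            rollback(environment)
            return False
        return True

    print(post_deploy_gate("production-canary"))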
E
I think another challenge would be our ability to halt a deployment when a certain metric has been reached. Right now there is a point in time at which we cannot simply cancel a deployment in the middle of it, and for us to be removed from being release manager watchers, we would need something like that: to halt a deployment and then start a rollback automatically.
C
Sure. So we talked a lot about feature flags and that being an option for everybody. Would you say that, by default, people should just throw feature flags at the majority of their work, just to make sure that it's easy to disable it when it needs to be, and easy to test it and make sure that everything works out?
E
I will say that for sensitive changes, yes: when you are involving projects, CI builds, or something that might affect a lot of users, it is easier and safer to roll it out under a feature flag and then just ship it gradually. And feature flags are actually very cheap development-wise: you just need to add a couple of lines, and then to remove the flag later is also easy. For trivial changes, I would say it is not necessary.
E
I believe we do; it is documented somewhere, yes.
A
We'll see if we can find that for the catch-up, because I think it's a little bit nuanced, but I think generally they're cheap. We do need to remove them, though, so there's a little bit more overhead there.
C
Yeah, I did read something in our docs about that. It is quite high level, I guess; it just says if you have a sensitive change, then put a feature flag on it, that kind of thing. So it's not exactly saying what area, or what level, or how many changes, or anything like that; it's more that engineering needs to decide what they think and then just go by that.
A
Yeah, and I'd say that's very much going to come down to impact. As Alessio was mentioning earlier, we wouldn't always roll back a change, so if it's a low-impact problem it's something we would fix forward, and in that case it probably wouldn't be expected that a feature flag would exist.
A
Fantastic, okay, we are at time, so thank you so much. Thank you, Kesha, for bringing so many questions, and everyone else who joined in the discussion; it was really great to chat to you all today. Enjoy the rest of your day, and hopefully we'll see you next month. Thanks a lot, team.