From YouTube: 2022-05-04 GitLab.com k8s migration EMEA/AMER
A
Good morning, everyone, and welcome. May the fourth be with you. Igor has not yet hopped in, so why don't we skip Igor? Henry, you've got the first item on today's agenda.
B
Yeah, I have to admit I'm still catching up a little bit from the last three weeks. Vlad actually did some work on this and also paired with Ahmad.
B
So please chime in once you have more details than I can add here, but here is what has happened on Camo proxy. Basically, Vlad did a lot of work on creating a Helm chart and working on the GitLab files to create a release. It's still in a branch, but it was also experimentally deployed to pre already, so we have Camo proxy pods running there, and pre was configured to use the Camo proxy in Kubernetes. I think we even tested this and could see some images. But I just looked at it, and right now it doesn't seem to work, so I'm not sure of the current state; maybe we're actually still experimenting on this one.
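For orientation, a minimal sketch of what such a Camo proxy Deployment could look like; this is not the actual chart from the branch, and the names, namespace, and image are assumptions (go-camo is a common Camo implementation):

```yaml
# Hypothetical sketch only; not the chart from Vlad's branch.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: camoproxy            # assumed name
  namespace: camoproxy       # assumed namespace
spec:
  replicas: 2
  selector:
    matchLabels:
      app: camoproxy
  template:
    metadata:
      labels:
        app: camoproxy
    spec:
      containers:
        - name: camoproxy
          image: cactus/go-camo:latest       # assumed image
          args: ["--listen=0.0.0.0:8080"]
          env:
            - name: GOCAMO_HMAC              # key used to verify signed proxy URLs
              valueFrom:
                secretKeyRef:
                  name: camoproxy            # assumed Secret name
                  key: hmac-key
          ports:
            - containerPort: 8080
```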
B
The other thing that we started to look into yesterday is enabling logging. I think Vlad incorporated some changes yesterday to enable pubsubbeat and the things missing on pre to get log files through, but I guess this still needs some work; fluentd still has to be configured. We need to check on this.
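As a rough illustration of the fluentd side that still needs configuring, here is a hypothetical snippet; the output plugin (fluent-plugin-gcloud-pubsub-custom), project, and topic names are all assumptions about how pre would ship logs into Pub/Sub for pubsubbeat to pick up:

```yaml
# Hypothetical sketch; not the actual pre configuration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-output        # assumed name
  namespace: logging          # assumed namespace
data:
  output.conf: |
    # Ship matched logs to a Pub/Sub topic; pubsubbeat consumes the
    # topic on the other side and forwards the events onward.
    <match kubernetes.**>
      @type gcloud_pubsub     # assumed output plugin
      project gitlab-pre      # assumed GCP project
      topic pubsub-camoproxy-inf-pre   # assumed topic name
      <buffer>
        flush_interval 5s
      </buffer>
    </match>
```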
B
As for what I think the next steps here are: we should really try to merge this MR that we have for development, just to get a state out on pre, because it shouldn't break anything there; make sure that it works on pre as expected; look further into logging and monitoring; and then figure out how we can handle load balancing. That will be the next challenge when we want to migrate this in gstg and gprd, where we already have an existing Camo proxy setup with a load balancer that DNS points at to proxy traffic to. We need to find a way to connect from Kubernetes to this LB, maybe, and then transition over. That's the next thing to figure out here.
B
Right now we have a Google load balancer for that, for the VMs. For Kubernetes, I'm still not sure what the best way to go would be: should we introduce something in Kubernetes, and then how can we transition over, or should we just use the existing load balancer and make it connect into Kubernetes? I don't know; there's something to research still. Did I forget something?
D
No, I think you covered it. One thing: I think right now Vlad is experimenting with the external load balancer, so creating a load balancer in GKE, and this goes into Kubernetes as well, if I remember correctly. So I think he's also experimenting with this.
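The GKE-provisioned option being experimented with could look roughly like the following; the names and the reserved IP are assumptions, and how the DNS cutover from the existing VM load balancer would work is exactly the open question above:

```yaml
# Hypothetical sketch: let GKE provision an external load balancer
# for the Camo proxy pods via a Service.
apiVersion: v1
kind: Service
metadata:
  name: camoproxy              # assumed name
  namespace: camoproxy         # assumed namespace
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.10 # assumed reserved static IP; a stable IP eases a DNS cutover
  selector:
    app: camoproxy
  ports:
    - port: 443
      targetPort: 8080
```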
A
Excellent. I don't have any further questions. Anyone else have any further questions on that?
A
Cool. Henry, let's talk about your proposal here.
B
Oh yeah, the order changed, I see. I just saw that we have an issue, I think from Graham, about problems with our pod disruption budget right now. This is issue 31, which I linked here, and looking into that, I figured out that we have a very old issue for increasing the stabilizationWindowSeconds, which I also linked here in the description in the document.
B
The thing is, what we have right now is that our deployments are all scaling up and down a lot, right? Our HPA is looking at our load; in all cases we use average CPU load, I think, to decide if we need to scale up and down, and our traffic is, you know, very bursty sometimes and changes a lot over time. So we are constantly scaling pods up and down, very often, in most of our deployments. And the one problem here is with the pod disruption budget and how we are scaling: it seems that we don't have enough pod disruption budget left to, you know, shut down more pods if we want to.
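For reference, a pod disruption budget along these lines is what constrains how many pods can be taken down voluntarily; this is a minimal sketch with assumed names and values, not our actual manifest:

```yaml
# Hypothetical sketch: a PodDisruptionBudget for the webservice pods.
# When churn from HPA scaling has already consumed the budget, further
# voluntary disruptions (evictions, node drains) get blocked.
apiVersion: policy/v1            # GA as of Kubernetes 1.21
kind: PodDisruptionBudget
metadata:
  name: gitlab-webservice        # assumed name
spec:
  maxUnavailable: 1              # assumed value; this is the knob in question
  selector:
    matchLabels:
      app: webservice
```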
B
The thing with our webservice pods especially is that they take very long to, you know, be created, I think a minute or something like that, and it's a very resource-hungry process just to spin them up. If they are then taken down again after five minutes or so because we scaled down, then we spent a minute just scaling something up and five minutes later we scaled it back down. So it's a lot of resource waste.
B
When this issue about the stabilization window was created, there was only beta support in Kubernetes for it and we couldn't easily deploy it. But now, with Kubernetes version 1.21, which we are running, I think it should be straightforward to set the setting, and by setting it to a value which is longer than the default five minutes, I think we can prevent scaling down too early, and thus we should stay more stable over, you know, traffic spikes and avoid a lot of these issues that we see with pod scaling. That maybe could also help with the pod disruption budget issue that Graham was looking into.
B
So I guess it would make sense to set it to some value for our webservice pods which is much longer than five minutes and then see how we go from there. The thing that we need to figure out is, of course, cost: because of this slower scale-down we maybe use a little bit more resources and spend more, but I think by not scaling up and down as often, we maybe even save resources. So I think it's worth looking into that again.
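Concretely, the proposal maps to the HPA `behavior` field; a sketch, where the 20-minute window is an example value rather than a decided number and the names and targets are assumptions:

```yaml
# Sketch of the proposed change; values are illustrative only.
apiVersion: autoscaling/v2beta2   # `behavior` is available here on Kubernetes 1.21
kind: HorizontalPodAutoscaler
metadata:
  name: gitlab-webservice         # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gitlab-webservice
  minReplicas: 10                 # assumed
  maxReplicas: 100                # assumed
  metrics:
    - type: Resource
      resource:
        name: cpu                 # "average CPU load" from the discussion
        target:
          type: Utilization
          averageUtilization: 70  # assumed
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 1200  # the default is 300, i.e. five minutes
```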
F
I have a question, as I'm reading the linked issue, which is one year old, so I want to check something. It states that we see a lot of 500 errors during scaling events. Does that mean that we start routing requests before the pods are ready, or is this happening when we tear pods down, so that basically we are closing connections that are still serving traffic? Or both?
B
I also think we did a lot of improvements over the last months and years to fix scaling up and down, and to have long enough windows and readiness checks.
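Those two failure modes map onto standard mitigations like the following sketch; the container, endpoint, and timings are assumptions, not our actual chart values:

```yaml
# Hypothetical sketch of the two mitigations: don't route traffic to a
# pod until it is ready, and delay shutdown so connections can drain.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitlab-webservice            # assumed name
spec:
  selector:
    matchLabels:
      app: webservice
  template:
    metadata:
      labels:
        app: webservice
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: webservice
          image: example/webservice:latest   # placeholder image
          readinessProbe:
            httpGet:
              path: /-/readiness     # GitLab's readiness endpoint
              port: 8080
            periodSeconds: 5
          lifecycle:
            preStop:
              exec:
                # Keep serving while the endpoint is being removed
                # from the load balancer, then exit gracefully.
                command: ["sleep", "15"]
```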
A
All right. So with that, Amy's going to spin up a potentially new issue to address Henry's idea here, but that's not going to stop what Graham is currently working on, though.
E
No, I was actually going to suggest that maybe, Henry, you want to take over this issue from Graham and figure out what the best next steps might be.
B
Okay, I will look into the issue from Graham.
C
So the mental model that I have for that is: we're spending a lot of time booting up pods, and that's the main thing that we would save on. By having less flappiness, we kind of amortize that cost and therefore potentially lower the overall resource utilization. Does that match what other people think?
C
I have the next one. So we're finally getting to make some more progress on the rollout of host names. We were waiting on an omnibus change that has now landed. We don't have anything to demo yet, I don't think, but we're looking to get that onto pre by the end of the week, and then, once it's working on pre, we should also have the procedures in place to do it on staging. There's some interesting stuff that we're still discovering, in particular around how the chef-client and reconfigure interact, like which of those does what. Skarbek made a really interesting discovery yesterday. For most of our fleet, we run a reconfigure on every chef-client run; on Redis we don't, in order to protect us from surprise Redis restarts.
C
That means, however, that we're pinning a very old version of the package. It also means that stuff that usually gets done regularly by reconfigures is now very stale on those boxes, and the issue we ran into in this case was actually deprecations, because those only get processed... like, the file that the package installer looks at only gets written by reconfigure. That means, if you try to upgrade from an old package to a new package, it'll use the old settings unless you run a reconfigure before trying to install the new package. So there's some weird dependency-ordering stuff there. Huge kudos to Skarbek for figuring that out. Hopefully that's the biggest hurdle on this particular rollout; we'll see how the rest goes. Probably some more dragons to be discovered.
C
So that's the host names side of things, slow and steady. The other one is process-exporter; this is on the observability side of things for Redis. The Helm chart does ship with a redis exporter, and that does all of the polling on the Redis instance itself. However, we also want to have per-thread CPU statistics, because we want to differentiate between the main thread, the I/O threads, and the background thread. Our saturation metric is on the main thread, and the redis exporter doesn't give us that information, so we need to add the process-exporter. Luckily, the chart does support sidecars, so hopefully we'll have that exporter in place soon.
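As an illustration, a process-exporter sidecar along these lines would expose per-thread series (the namedprocess_namegroup_thread_* metrics), distinguishing the redis-server main thread from the io_thd_* and bio_* background threads; the image tag, names, and how the chart wires in sidecars are assumptions:

```yaml
# Hypothetical sketch; assumed names throughout.
# Sidecar container for the Redis pods:
- name: process-exporter
  image: ncabatoff/process-exporter:0.7.10   # assumed tag
  args:
    - --procfs=/proc
    - --config.path=/config/process-exporter.yml
  ports:
    - containerPort: 9256
      name: metrics
  volumeMounts:
    - name: process-exporter-config
      mountPath: /config
---
# /config/process-exporter.yml: match the redis-server process; with
# per-thread recording enabled (the default), each thread name gets
# its own namedprocess_namegroup_thread_cpu_seconds_total series.
process_names:
  - name: "{{.Comm}}"
    comm:
      - redis-server
```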
A
All right. So, gitlab-sshd: I'm struggling to figure out where certain issues lie, so I've pinged Sean again. As a quick reminder, we rolled back again because we were having issues in canary, so gitlab-sshd is not taking any traffic in canary or production. It was identified that we had a lot of errors coming out of canary, a very generic "context canceled" error message; we don't know where that is spawning from at the moment. So I'm trying to figure out if we have an issue to address that, and I don't see one, so I'm asking Sean about that. The other item was related to metrics, where a metric was simply renamed and that was not reflected in our dashboard.
A
I thought I saw an issue for this in the past, but I struggled to find it, so again, I'm still trying to figure out what that is. The third one was just generic load and performance testing of some kind. You know, we've rolled this back multiple times now; I feel like we should have been able to detect some of these issues in staging prior to rolling them out to production. So I'm trying to get us into a state where we're testing this a little bit more thoroughly in some way, shape, or form. Igor has picked up that work, so I'm eager to see what those results are, but I'm kind of going to enforce that we block migrating gitlab-sshd into production until at least those three issues have been addressed to some extent.
A
So at the moment it's a waiting game for us. I'm eager to get this into production because it's kind of an exciting project and it would benefit both us as well as self-managed users. I'm not trying to rush it; I'm just trying to express that I am eager for this to get rolled in. So I don't have any questions; it's kind of a waiting game, and I'm not driving these improvements.
A
These are kind of on the GitLab Shell team, and they've been preoccupied with another issue between our last attempt and this week, so there hasn't really been much movement in the first place.
C
During the last rollout attempt, the communication with that team wasn't so proactive. Once it failed and we started talking to them, that was fine and they were very responsive, but involving them earlier on, and actually having them join a call so we roll it out together, is something I'd like to see.
A
Okay. I know you've got a couple of CRs; I think you've got two, one for canary and one for production. I'll go through the canary one and update it so that they get notified, and then also comment on that issue. That way, they're aware that we're going to be doing that. I think that's an excellent idea and I fully support it.
A
Okay, well, in that case, everyone enjoy the rest of your day. I look forward to seeing you all next week.