From YouTube: 2021-06-02 GitLab.com k8s migration EMEA
A
Okay, I'll... go back, but he might be.
C
So we have this fine-tuning... my cat is annoying. We have this fine-tuning of the API on GKE issue, and what we did so far is try to decrease the number of minimum replicas, because we already hit the floor most of the time, and also increase the HPA target, right?
C
So in the first try we raised it a little bit too high, which caused a slight apdex drop because we used too much CPU, but now we have tried it with a less ambitious number, which is a target value of 28,800 requests, and in general this looks fine.
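
For context on why raising the target helps: the horizontal pod autoscaler sizes the fleet from the ratio of the observed metric to the configured target, so a higher target average value means fewer desired replicas. A minimal sketch of that calculation; the 28,800 figure is the target mentioned above, and every other number is invented for illustration.

```python
import math

def hpa_desired_replicas(current_replicas: int, observed_avg: float, target_avg: float) -> int:
    """Kubernetes HPA rule of thumb: desired = ceil(current * observed / target)."""
    return math.ceil(current_replicas * (observed_avg / target_avg))

# Invented example: 20 pods currently averaging 30,000 on the scaling metric.
print(hpa_desired_replicas(20, 30_000, 24_000))  # stricter target -> 25 replicas
print(hpa_desired_replicas(20, 30_000, 28_800))  # relaxed target  -> 21 replicas
```
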
C
So it should be clear, then, what to do and when in the next steps. How much detail should I go into? Because we are now talking about things we already discussed before in the same group.
A
Yeah, we can just link that in. Graham will have to watch two videos, I think, but that's totally fine. Are there any live dashboards or other interesting things?
C
Yes. So the interesting thing is, first of all, looking at the apdex: of course, since we did the change, you see this spike here. This is when we deployed the change to canary, and each time we deploy to canary we get an apdex drop spike, and you see the apdex didn't change after that. So it's nearly the same. With the higher value we saw a decrease in apdex, so this time we seem to be good.
C
If you look at the Ruby thread contention, we started the change around here, and you see that thread contention, the red line, got a little bit higher, but not as much as in our first try. So we are also fine there. Also, I checked back over the last 30 days, and when we were running on VMs our Ruby thread contention was even higher than we have had it most of the time in Kubernetes.
C
So we still have some headroom there, but I think we are okay here. So I think we are really fine to try this on one production cluster now. We can also see how the node count came down a little bit in canary; this is the red line.
C
We started the change here, but we didn't get much lower than one day before, which is around here, so it's hard to see whether this has a big effect on canary, especially because we just have a few nodes and we are not at a low-traffic time right now. I expect this to go down tonight. The same goes for pod count: there's not much of a difference to see, but a little bit at least. Then, let's zoom in to where we did the change.
C
Lowering requests doesn't mean that we use more CPU per pod, right? It's just that we can't scale higher if we spike, and so it depends very much on the patterns of the spikes that we see on our individual containers. I think that's the thing which is hard to predict. Looking at the CPU, I don't see it spiking too much.
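
To illustrate that trade-off: the CPU request only determines how tightly pods are packed onto a node, so a smaller request means more pods per node and less room for each of them to burst into when they all spike at once. A rough sketch where the allocatable capacity and both request sizes are hypothetical.

```python
ALLOCATABLE = 15_000  # hypothetical schedulable CPU on one node, in millicores

def packing(cpu_request: int) -> tuple[int, float]:
    """Pods that fit on the node, and the rough per-pod CPU ceiling if every
    pod on the node spikes at the same time (node CPU shared evenly)."""
    pods = ALLOCATABLE // cpu_request
    return pods, ALLOCATABLE / pods

for request in (4_600, 3_000):  # illustrative request sizes only
    pods, ceiling = packing(request)
    print(f"request={request}m -> {pods} pods/node, ~{ceiling:.0f}m each during a shared spike")
```
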
B
Maybe we should... if Ruby contention is... I'm just thinking long term, because we had mentioned this in our other meeting. But I wonder if Ruby thread contention is the driver, or maybe the first driver, for where we start to see issues. Maybe we could leverage that as a method of scaling, as our custom metric. Yeah, we could make sure that we don't ever exceed 80 percent Ruby thread contention.
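
A tiny sketch of how that idea could look as a scaling signal; the 80 percent ceiling is the number mentioned above, while the metric samples and pod names are entirely made up.

```python
# Hypothetical per-pod Ruby thread contention readings (fraction of time threads
# spend waiting); only the 0.80 ceiling comes from the discussion above.
samples = {"api-pod-a": 0.71, "api-pod-b": 0.88, "api-pod-c": 0.83}
CEILING = 0.80

avg = sum(samples.values()) / len(samples)
if avg > CEILING:
    # In practice this average would be fed to the autoscaler as a custom metric,
    # instead of or alongside CPU, so the fleet grows before contention bites.
    print(f"average contention {avg:.0%} exceeds {CEILING:.0%}; scale out")
```
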
C
This is nearly doubling the number of nodes, and I think this is causing a lot of CPU and, I don't know, maybe warm-up time, so this really seems to be an issue.
B
Do you think we may be prematurely making our nodes ready? Or pods ready, rather.
C
Talking about workers, there was an interesting remark from Matthias in the issue, where he mentioned that instead of increasing the number of workers we could increase the number of Puma threads, because that would get around the lock contention and still give us memory reuse.
C
I'm not the expert here; I am also not sure. I just know that we have this minimum of one and maximum of four threads for Puma right now in GKE, but I don't know how Ruby or Rails decides that we now use, I don't know, two or three threads and four or five workers, how this is mixed, right? So I wanted to look into that, but I don't know it yet.
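
For background on how the two knobs combine: each Puma worker is a separate process with its own GVL, and the threads inside a worker share that lock, so a pod's request concurrency is roughly workers times threads. A toy sketch; the one-to-four thread range is the setting described above, and the worker counts are assumptions.

```python
MIN_THREADS, MAX_THREADS = 1, 4  # the per-worker thread range described above

def pod_concurrency(workers: int, threads: int = MAX_THREADS) -> int:
    """Rough upper bound on simultaneous requests per pod: one slot per thread."""
    return workers * threads

for workers in (2, 4):  # hypothetical worker counts per pod
    print(f"{workers} workers x {MAX_THREADS} threads -> {pod_concurrency(workers)} request slots per pod")
```
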
A
What are the risk areas? Like, if that's a bad idea, what would be the impact?
C
I mean, it's just how you distribute load across different CPUs, right, and that's what you're juggling with when you change things there. So it's kind of a tuning thing. I don't see...
C
I guess we could just leverage Matthias Kepler chiming in on this issue, because I think he's very much into the details of this anyway, and he's very interested in it, because he was asking questions about what we are doing with tuning in GKE right now; he wanted to understand it better. Maybe he can even just give us advice on how this juggling of threads and workers happens internally. So we can just make use of his interest. Yeah, I will ask him in the issue.
A
Great, okay. So it looks like we're making good progress today. Is there any additional stuff we want to go through on tuning?
A
The issue, as we've got it written at the moment, has quite a long list of runbooks in it. Is that achievable? Is that how we want to tackle this stuff?
B
The work that I'm doing is mostly just updating existing documentation, because a lot of it is out of date. I am going to ask for feedback when I have a merge request ready, but I'm still touching various bits and pieces of documentation, so I'll ask for feedback when we get to that point.
B
I do want to see if I can get some feedback as to what runbooks we should add that might be specific to the API. I'm trying to shy away from creating "hey, this is Kubernetes, and this is API", because there are going to be a lot of troubleshooting steps that you could take with Kubernetes in general, not specific to the API, so I'm trying to avoid that part. But if we are missing something that's specific to Kubernetes that we don't have anywhere, that's something we definitely need to address in some way, shape or form.
A
Okay, sounds good. Please keep the description updated with whatever you decide that issue should focus on.
A
Awesome. And are there other pieces to the API service? We've got runbooks, we've got the tuning to wrap up, we've got the retro issue, but otherwise, are we feeling that the API is complete?
C
I have a question about production changes. You know, when we do production changes via Chef now, we often need to create change request issues, depending on the impact, the possible impact that it could have, right? And I'm asking myself, if we do this via Kubernetes now, how do we evaluate the possible impact of changing configurations? Because in most cases it works differently from what we often do with Chef, where we just, you know, stop the Chef client somewhere, try it on just one node and things like that, and I'm not sure how we should deal with that.
C
Configuration changes much more often, so it's more about thinking about a policy for how to work with that.
C
It's about change request issues: when we do changes in production with Chef, we often create one if we think there's a high impact.
D
I think it's really the judgment of the... I mean, we need to trust the judgment of the SRE. We typically use change requests for changes that involve a lot of distinct steps and manual actions, and where there's continual monitoring required and possibly a handover necessary. So for these types of changes, I would say, yeah, maybe it would be better. We're kind of using this issue as a change request, because it has links to monitoring, but perhaps it would be better if we did a change issue.
D
I dislike the bureaucracy as well. The good news is that as soon as Igor finishes the review, I have an update to Woodhouse that allows you to fill out a change issue similar to how we fill out an incident; it lets you fill in all the fields and everything in an interactive form. So it does make it a little bit easier, not that much easier, but at least you don't have to manually edit the description and all that jazz, which you have to do now.
B
These are very small, concise changes, and we're testing them on individual clusters. We don't know what's going to happen, but we are focusing our efforts on specific things, we're focusing them on specific clusters, and, thanks to Andrew's wonderful work, we have amazing metrics to peel through to determine relatively quickly whether or not we're doing bad things.
D
Well, I would say this fine-tuning issue is sort of becoming a change issue, right? It has links to monitoring, it has a timeline. Why don't we just promote it to a change issue, just so we can at least reference it? We could do that. I don't feel super strongly about it; if you think it's better just to keep doing what we've been doing, that's fine with me.
D
Yeah, we could just have one change issue for the changes that we're making, but that's basically the same issue that we have now. The main thing, the one place where I think we could have done better, was on the change to increase the target average utilization, which caused the apdex drop. I don't think there was a very good handover; the on-call probably wasn't aware of that change, and, I don't know, maybe a change issue would have helped. Maybe not, you know, it's hard to say.
C
Wasn't it the case on our VM fleet that we also had very high thread contention all the time, when I look back 30 days?
D
I would argue the API VMs were saturated; we were way under-provisioned, and we realized after we did the Kubernetes migration how bad things were, right? I think maybe thread contention is a really good signal, and maybe we should have been paying more attention to it, because apdex is really low for, you know, our thresholds.
E
Yeah, and like I said to you on Monday, I think, there's another one of those that's firing constantly for the Sidekiq urgent-other fleet, and I'm kind of ignoring it at the moment. There's an issue, and we know what the problem is, but it's scheduled for something like two releases into the future, and maybe that's something we should go and take a look at; it might be something we need to action sooner than that.
D
Yeah, I think it wouldn't hurt to at-mention the SRE on call for these changes, just so that they're aware, especially if we do another change at the end of the week, because I think it was kind of bad last weekend: the on-call was getting these canary apdex alerts, and I don't think it was hard to narrow it down to the change that was made on Thursday.
D
Okay, well, I don't know. Maybe we should just keep on doing what we're doing, but try to let the on-call know as much as possible what changes we're making.
D
I think the SRE on-call Slack alias is probably the best thing we can do. Okay.
B
Cool. But Amy, going all the way back to your original question of whether we think we're done: I've got one cleanup issue left in this epic. When we first created a lot of our infrastructure and configurations, we kind of over-optimized, because there was an assumption that we would get rid of NGINX. Because of that, we've got a few IP addresses that are reserved in all of our clusters.
B
That's sucking down a few dollars per month, and our Kubernetes objects aren't even using those IP addresses, so we've got a little bit of cleanup that we can perform inside our configurations to make sure that we are more concise about what we have configured in our environments. I would love to knock that out before we close this epic.
B
That makes sense, and hopefully it's one of those quick wins. I would want to test this in pre to make sure that we don't accidentally lose the API from NGINX.
A
Amazing, great work, exciting stuff. Awesome. Andrew, was there anything you wanted to demo? I mean, no pressure, I know you're busy on loads of other stuff.
E
Yeah, I don't have a lot that I can really add this week, unless you're interested in knowing about Postgres saturation metrics, which I can give you lots of insight into now. But I don't know if that's the right audience. I'm really sorry about that.
E
I am hoping to have this all done by the end of the week, because I just want to get it off my plate; it kind of came up unexpectedly and I'd like to get it off as quickly as I can. So hopefully, next week... you know, I've put on that daily stand-up that I'll have it done by the end of the week, so hopefully I can clear it off and be done. Yeah, thank you.
A
Okay, cool. What I was going to say is that I might check in on your status before I go on. What I've been trying to work on this afternoon, and it's certainly not finished yet, so I'll keep pulling it together, is to capture the ideas that everyone had and work out how we can rank these things and what's really interesting.
A
So this is kind of the combined view. The numbers should really be weighted differently, but I've just added them all up. There are some things we might want to prioritize for the web migration for now and look at some things later, but what I did want to highlight is that all the observability stuff is ranking really highly.
A
So I think that's a really good sign, and I think we should, you know, make sure that we are working with you, Andrew, so that we can actually get all of this stuff done. It's going to be super valuable for the web migration, and it might help us move through some of this stuff.
A
Is it helpful seeing all of our pain points laid out in this way? Or are there other things that you'd like to add to this, to this approach, I mean?
A
Great, great. And I think, from the discussion this morning, we were kind of saying that it was interesting how a lot of this stuff actually circles around maybe three or four bigger problems, which is good to know, so we can work out how we actually fit those in over the coming months.
A
Awesome. So yeah, I'll pull all that stuff out, and then we can actually take a look through it and work it out. We still don't have issues for everything; I think that's totally fine, I don't want to spend too much time doing admin. So I'll pull that stuff together and share an issue out for the next round of reviews.
C
Yeah, I just added this right at the end, but I did a very cheap API CPU request calculator in a spreadsheet, to see how much we could fit on a node, and I can just show it quickly; it's this one, yeah. For it I just looked at one of the API nodes, at what kinds of containers we have running there.
C
So maybe I'm off here, because it looks different on other nodes, but I think it's more or less the same. These are the non-API-related containers running on each of the nodes, and this is the total allocatable node capacity we have. If we then take it that a webservice pod asks for 4,600 in CPU requests, we come up with three pods per node that we can fit in there, and it's very hard to fit more; we would need to reduce the CPU requests on the webservice pod drastically, or find room somewhere else, to fit one more pod on a node.
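
A rough reconstruction of that spreadsheet arithmetic; only the 4,600 CPU-request figure for the webservice pod comes from the discussion, while the allocatable capacity and the non-API (DaemonSet) overhead are placeholders to show the shape of the calculation.

```python
# All figures in CPU millicores. Only the 4,600m webservice request is from the
# meeting; the allocatable capacity and non-API overhead are placeholders.
ALLOCATABLE = 15_890          # e.g. a 16-vCPU GKE node after system reservations
NON_API_REQUESTS = 1_200      # sum of requests for the non-API containers on the node
WEBSERVICE_REQUEST = 4_600    # CPU request of one webservice pod

usable = ALLOCATABLE - NON_API_REQUESTS
pods_per_node = usable // WEBSERVICE_REQUEST
leftover = usable - pods_per_node * WEBSERVICE_REQUEST

print(f"{pods_per_node} webservice pods per node, {leftover}m to spare")
# -> 3 pods per node with these numbers; a fourth only fits if the request drops a lot.
```
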
C
Yeah, we can just adjust the numbers here and see how it fits, and look into what other nodes look like. It was a very quick look at one node only, but it looks tight however you adjust the numbers.
E
A thousand requests' worth of mining, Henry. A thousand requests' worth of Bitcoin mining, yeah.
A
Thanks for sharing that, Henry, super. Does anyone else have anything they want to go through today? No? Okay. Well, thank you for the demos and discussions, and great work on the continued tuning. Looking forward to us getting the API service over the line, so nice work. All right, enjoy the rest of your Wednesday; speak soon.