From YouTube: 2020-10-15 GitLab.com k8s migration EMEA
Description
Discussing the Git HTTPS migration
A: Cool, so hello everyone, welcome to another demo, another Thursday. Let's kick off with a little view of the blockers. The build logs work is making good progress: they've got it running in production, they're fixing up the last issue, and then they're planning to increase up to 25% of traffic, so that one hopefully will be progressing nicely. So this second one, the issue's been split out, so we've got these two issues; no great progress on either since our last demo. Are either of these blocking us yet, or are we still good?
B: To mix that traffic in with the Git HTTPS traffic.
B: To get an idea of what the time frame is for this: is it going to be next week, or the week after?
A: Cool, and then Prometheus metrics.
C: We decided last meeting that we're just going to move forward without this. I just want to keep it here, because I want it known that we don't have metrics coming from the shell itself.
A: Right, yep, cool, okay, that makes sense. And then the other one we had on there with the cross-AZ stuff, which I've removed; we're unblocked on that, which is awesome. And then the Pages stuff is in progress, so hopefully we'll be partially unblocked on that pretty soon as well. So good progress, I think.
B: Yeah, sure, I can give an overview. There's not a whole lot to demo here; we're still in the middle of the change issue, and we had a small blocker this morning. But let me first give an overview of what we have.
B: So we have the Git HTTPS backend, and in each backend there is the zonal cluster.
B: So a single zonal cluster, and this depends on the zone of the load balancer. So there's just one; for example, we'll just use us-east1-d as an example, and then we have a weight of 10.
B: Yeah, so the weight indicates how much traffic is going to each of these servers.
B: What you'll see is a single server for the zonal cluster with a weight of 10, then you'll have eight virtual machines with a weight of 10, and then you'll have the regional cluster for canary with a weight of five. The way you calculate the percentage of traffic is basically: you add up the weights, and then you see what percentage of the total weight goes to that server.
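A minimal sketch of the weight arithmetic described above; the backend names are illustrative, while the weights match the example given in the demo (one zonal-cluster backend at 10, eight VMs at 10 each, canary at 5).

```python
# Weight-based traffic split, as described in the demo. Backend names are
# made up for illustration; the weights are the example values given above.
weights = {
    "zonal-cluster": 10,
    **{f"git-vm-{i:02d}": 10 for i in range(1, 9)},  # eight VMs, weight 10 each
    "canary-regional-cluster": 5,
}

total = sum(weights.values())
for backend, weight in weights.items():
    share = 100 * weight / total
    print(f"{backend:25s} weight={weight:3d} -> {share:5.1f}% of traffic")

# Raising only the zonal cluster's weight (10 -> 25 -> 50 -> ...) shifts a
# growing share of traffic onto Kubernetes without touching the VM weights.
```

With these example weights the zonal cluster receives roughly 10% of requests, which lines up with the figure mentioned later in the demo.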
B: So maybe we'll go from 10 to 100 in steps, starting to give it more weight, and then eventually, once we have the zonal cluster taking over the majority of traffic and we're more confident in it, we will just remove the VMs altogether.
B: So that's kind of an overview of what we're doing. On the estimated pod count in this zone: for the readiness review, we did some rough calculations in order to have a minimum number of pods.
B: The min pods is set to 30, which is a change that we made this morning. My first thought was that we would do 50, to have the full minimum, but I realized that's going to blow out the number of nodes; it basically blows out the node count to 30 VMs across the three zones. And since we aren't yet putting all the traffic on the Kubernetes cluster, I want to start with 30 to see how it behaves, and then we'll increase it to 50 if we need to.
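A rough sketch of the node-count math behind that decision; the pods-per-node figure is an assumption for illustration (it isn't stated in the demo), while the min-pod values and the three-zone layout come from the discussion above.

```python
import math

# Estimate how many nodes a given min-pod floor implies across the zones.
# pods_per_node is a hypothetical packing density chosen for illustration.
zones = 3
pods_per_node = 5

for min_pods_per_zone in (30, 50):
    nodes_per_zone = math.ceil(min_pods_per_zone / pods_per_node)
    total_nodes = zones * nodes_per_zone
    print(f"min pods/zone = {min_pods_per_zone}: "
          f"~{nodes_per_zone} nodes per zone, ~{total_nodes} nodes total")
```

Under that assumed density, a floor of 50 pods per zone works out to roughly 30 nodes across the three zones, which is the blow-out described above.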
A: It does. I wonder, if we don't have the size right, if it's too small, what will the impact be?
B: The reason why I'm setting this floor is just so that when we shift the traffic initially we don't run the risk of overwhelming it. But we will be doing a gradual transition anyway, so I would say once we do the full transition, what we'll probably want to do is back it off to a floor that will allow us to scale up and down with traffic.
B: So I put some links here of what I'm paying attention to. We have the logs and the Git overview dashboard; this is sort of the basic stuff that we're looking at.
B: What I'm doing here is saying: I'm looking at type equals git, I'm looking only for things that are coming from Kubernetes, I don't want to look at the canary stage right now, and I'm just filtering out readiness checks. So those are the logs, and then we have the Git overview. The Git overview doesn't help us too much, just because with this dashboard it's not possible to differentiate Kubernetes and non-Kubernetes traffic.
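The filter just described could look roughly like the following Elasticsearch bool query; the field names (json.type, json.kubernetes_pod, json.stage, json.uri) are assumptions for illustration and may not match the actual Kibana fields.

```python
# Hedged sketch of the log filter described above, as an Elasticsearch bool
# query. All field names are assumed for illustration.
log_filter = {
    "bool": {
        "filter": [
            {"term": {"json.type": "git"}},                # type equals git
            {"exists": {"field": "json.kubernetes_pod"}},  # Kubernetes traffic only
        ],
        "must_not": [
            {"term": {"json.stage": "cny"}},                 # exclude the canary stage
            {"match_phrase": {"json.uri": "/-/readiness"}},  # drop readiness checks
        ],
    }
}
```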
B: So I think this overview will give us: okay, are we meeting our SLOs overall as we shift traffic over? But I'm looking at other things as well.
B: Most of my focus is on latency, looking at the 50th and 95th percentile of latency for both Workhorse and Rails. So here we have a direct comparison of virtual machines versus Kubernetes; Kubernetes is on the top, virtual machines are on the bottom, so this is doing a query basically for not-Kubernetes and Kubernetes. This is Workhorse percentiles, the 50th and 95th percentile. You can see that it looks like the VMs are doing much worse than Kubernetes so far; on the 95th percentile we're seeing all these little spikes that go up. I know we believe that the Git fleet is under-provisioned right now, and this could be related to that, but we aren't sure. It'll be interesting to see what happens to these two graphs as we start moving more traffic over.
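Purely to illustrate what the p50/p95 panels compare, here is a small sketch; the sample latencies are synthetic, and in the real dashboards these percentiles come from Prometheus histograms rather than raw samples like this.

```python
import numpy as np

# Synthetic request durations standing in for the two fleets; the shapes are
# invented purely to show how the p50/p95 comparison is computed.
rng = np.random.default_rng(0)
vm_latencies_s = rng.lognormal(mean=-1.5, sigma=0.8, size=10_000)
k8s_latencies_s = rng.lognormal(mean=-1.7, sigma=0.5, size=10_000)

for name, samples in (("VMs", vm_latencies_s), ("Kubernetes", k8s_latencies_s)):
    p50, p95 = np.percentile(samples, [50, 95])
    print(f"{name:10s} p50={p50:.3f}s  p95={p95:.3f}s")
```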
B: Right now we only have 10 percent of traffic in the Kubernetes cluster, so I wouldn't put too much stock into it. Another thing that sort of conflates this, and Skarbek, I'm curious what your thoughts on this are, is that right now on the Git VMs we mix both SSH and Git HTTPS traffic, while on the Kubernetes cluster we're only doing Git HTTPS. So it might be interesting for us to do a more apples-to-apples comparison: we could remove some of the VMs from Git SSH traffic and just have VMs doing only Git HTTPS versus Kubernetes doing only Git HTTPS.
B: Yeah, so I'm thinking about that right now. I think the transition to Kubernetes for Git HTTPS is going to go on until Monday or Tuesday, because I just want to be super slow and careful with it. Here's a similar graph for Rails, and again it's interesting, we see this same spike. Now, Rails is only showing Git HTTPS because for Git SSH... well, that's not true, actually; only the authorization requests go through Rails, but I would say most of these requests are coming from Git HTTPS.
B: It's down around 0.15 seconds consistently for the 50th percentile; it jumps around a bit, but I'll be interested to see what these graphs look like as we start to shift more traffic over.
B: So, as I mentioned, there was a blocker.
B: The blocker I found was that we don't have the metrics yet in Thanos. Our RPS, you can see here, started to go down as soon as I started to shift traffic into the Kubernetes cluster, and that's just because we have Prometheus running in the clusters, but Thanos, which is our aggregator for metrics and feeds the dashboards, wasn't configured yet for the zonal cluster. That's fixed now, so hopefully on the next run we will see the RPS stay constant.
B: This was a bit concerning: load balancer component error rates. I need to dig into this a little bit more. This started to creep up as I started to move traffic over, and oddly it seemed like the errors were coming not from Kubernetes but from the virtual machines. So I don't know what exactly is going on there. I'm going to do a smaller jump next time, because I had already reverted after this when I started to see it climb.
B: I just undid it and set the weight back to what it was before. We never got an alert, because we were below our SLO for this, but next time I'm going to maybe do a smaller jump to see if we're going to see errors again.
B: That is pretty much it. Yeah, you can see here we were seeing mostly 500 errors, and they were actually coming from the virtual machines. When I dug into the 500 errors it wasn't clear to me yet what's causing them, so I just need to look into that a bit more. And yeah, so I'm waiting for this query... there we go. This looks like it's canary; let's take out canary.
B: Yeah, so still no metrics in Thanos, so we have to wait. I just did a Chef MR to add the zonal cluster endpoints to the Thanos configuration, so once that's done, you should start seeing metrics.
C: Okay, with the addition of the zonal clusters and the addition of the new Thanos-related work that's been coming down from the observability team, I have a concern that our existing metrics and our logging mechanisms are not going to keep up with the changes that we have yet to place under runbooks, and I have this fear that eventually we're going to have some form of outage and we're not going to have a quick and easy way to decipher where the problem might lie.
C: Due to this fear, I kind of want to see us try to focus on tech debt in the future. I don't know what's coming up next after we finish what we have going with Sidekiq and after Jarv finishes up the Git HTTPS stuff; I don't know what's really next, but I would love to see some improvements made to our dashboards. We also have the Helm upgrade, which has been stuck forever.
C: I would love to see us get off the Helm 2 version, because it's kind of crappy. I don't know how to steer us in the direction of focusing on tech debt; I know we've got an epic that captures this data, but, you know, it's kind of the opposite of our OKR. I'm wondering what we need to do to ensure confidence in our runbooks and our monitoring, now that we've got multiple clusters.
A: Fully sold. I think it's a good point for us to review this stuff, like with the multi-cluster work; having that unblocks the Helm upgrade.
B: Yeah, I think what would help for me, Skarbek, since you have a fresher set of eyes on this stuff, is maybe for you to give us an idea of what you want to see. I was talking to Amy about this in our one-on-one this week, and I think, as soon as this migration is complete, we're going to improve both the architecture overview and the documentation and the runbooks. Some ideas that I've had:
B: One is: how do you configure your workstation to troubleshoot each zonal cluster with kubectl? That would be the first thing. Next is the endpoints for Prometheus; those are kind of the same, you just substitute out the cluster name, but we can write that down somewhere.
C: I think our next part would be dashboards and metrics. We don't have any sort of saturation metric or alerting for our clusters at all. If, for whatever reason, one of our clusters starts to take more traffic, you know, something might be wrong.
C: Google's routing traffic, for example; I'm just thinking off the top of my head. We're going to have the cluster in us-east1-b spin up a crap ton of pods and also spin up some nodes to match it, and we're not going to understand why. It would be good to have some sort of saturation alerting, or just some sort of dashboard to look at, to figure out how well distributed our traffic is. That way, in times of dire need or outages, we have the ability to say: oh look, something is wrong with the cluster sitting in us-east1-d, stuff like that.
C: Similar to that, some of our Kubernetes dashboards don't work, so I'm not sure what information they would display, but we also don't have information on, say, a node that is going haywire in comparison to other nodes.
B: Yeah, it was my understanding that we're going to just deprecate the mixin dashboards and fold what we want into the general dashboards. That's a little bit more work, but first we need to figure out what we need and what we want to see. So I don't know, maybe we can work with the observability team on this.
C: I created a few issues and associated them with the tech debt epic that we have created in our backlog.
C: A few additional things that I would like to see: that one revert that you had yesterday, late your time. I created an issue about that, because I thought it was kind of odd that we had a spike in nodes in one zone and not the others, and I'd like to figure out if we could address that in some way, shape or form.
B: You mean the 50, setting the min pods to 50, and that created... yeah. No, there was nothing unusual there. I applied it in two zones and then I saw that we spiked up to ten, so I didn't apply it to the third zone, and that's when I reverted it.
B: And I think, I mean, I wanted to increase the maximum, because our max node autoscale limit was 10, so I didn't want to be running at the max.
B: So what I have, then, is: we add a region filter to the general dashboards, we take a look at the pod info dashboard and incorporate that into the general dashboard, and then we take a look at whatever information we need about nodes and incorporate that into the general dashboard as well.
B: I mean, we do have, I think, an epic that Skarbek created a long time ago that's called technical debt.
A: Yeah, we do have a technical debt one, I mean.
B: And maybe I wouldn't even necessarily call it technical debt, just because a lot of it is going to be documentation, although some of it is the monitoring stuff.
A: Yeah, I think that makes a lot of sense, so we could just pull in anything that we think of, like review this dashboard or write this document. We've got a few things on the board already, so I'm kind of assuming we will focus on getting all the Git stuff, Git SSH and Git HTTPS, completed. So if there are issues outstanding for those, it would be great to get them onto the Delivery board.
A: But following that, we've got the issue around ensuring we can put zonal clusters into maintenance. I hope that's possible already.
A: This one's a bit more of a placeholder, but: document how to debug the multi-cluster setup, which I think ties in a bit more with what you were saying, Skarbek.
A: A slightly bigger thing, but anyway, we can start there. Jarv, do you think it'd be worth doing a retrospective readiness review for the multi-cluster?
B: Well, we never did a readiness review specific to multi-cluster, but yeah, maybe a retrospective on how it went.
B: Yeah, then let's just jump to documentation. I think the challenge there is just figuring out where to put it; we have too many markdown documents as it is in the runbooks docs directory under uncategorized.
B: So I think maybe we should try to take another look at all those documents and consolidate them a bit as part of this. I just feel weird, like adding another document is getting to be extreme at this point.
A: But I do think a readiness review would also still be useful, because it definitely gets people to review it. And I'd say, even though there's lots of documentation, people expect to be able to find a readiness review, and they live in a certain place, so I think it could be a nice entry point into other documentation.
B: But to me, readiness reviews are for the moment; they're not something that we maintain, we don't go back and update them. I'd rather just do documentation for the architecture of the multi-cluster setup and troubleshooting, and also keep it up to date. Okay.
A: I mean, yeah, that also sounds like a great thing to do. Do you think you'll get the same level of review? Because the difference to me is that documentation is kind of telling everyone, here's how it is, whereas a readiness review is a little bit more about inviting critique: here's how we have it set up.
C: I hate logging right now. Jarv showed his screen for a split second and we saw nothing but new lines, because of the way we're tailing our logs and the state of our networks, and I realize the distribution team has this in their backlog to work on. Is there any way we could try to figure out how to...?
A: It's really put me to work these last few weeks, but if you have a suggestion, no problem at all, whether it's contributing to stuff or if there's something in progress. If there is an issue that captures what you'd like, feel free to raise it and we'll see what we can do with it.
A: Awesome, and I'd say we will also be able to do the Helm upgrade alongside all this stuff.
A: We should definitely review what's on the issue and see where we're at. Do you think it'll be painful?
A: Okay, we'll review that stuff. But cool, shall we take a quick look at the board and see what we've got?
A: So at the moment we've got... So the ones related to... well, actually, this enable Action Cable issue, Jarv, is that one that we actually want to keep in progress, or do you want to separate it out and come back to it?
A: Cool, okay, so let's go back. You've got your readiness review stuff and the investigating logging stuff catch-all going on, and Jarv, you've got investigating the memory profile for pods, you've got an intermittent error, and then zonal deployments, and you've got the Workhorse stuff.
A: Is there anything else? Actually, I think that's all we've got. So do we have any other issues from Git HTTPS or Git SSH that we want to pull into this board?
C: Yeah, everything inside of the epic that I'm working on should be on this board. If it's missing the label I'll go back through and add it.
A: Awesome. And if there's anything on there that you're not yet working on, feel free to stick the ready label on it, so that it's here and you can just pull it in when it's ready.
C: Right, I've got one question for Jarv that's not necessarily part of this meeting. We can do it in the meeting if we want, and Jarv, if you could hang back for five minutes, I'd appreciate it.
C: Yeah, Kubernetes doesn't even show up in here, so I don't even know what's inside of that index.
B: I think what I would do is probably delete the indexes, and we can see if they are recreated, but I think they're just junk. I don't know what created them, though; that's really strange. And if you see logs going into pubsub-shell in gprd now, do you see your Kubernetes logs?
B: Okay, so that's definitely a problem. I'll need to take a look, because we should at least have something, right?
B: Yeah, so do you see logs for staging?
C: But for pre I did see logs, like I expected to, because that's where I've been doing most of my testing so far.
C: ...together, because I have no clue what to look at. Marin, we ended the meeting now; I just asked Jarv for some assistance on something, so I don't know if...
B: Yeah, but first of all, since you want to know where to look for this stuff: it's in the fluentd-elasticsearch chart files. There's this values file, and then what we do is...
B: We have these index names, and then this name corresponds to, it creates, a glob. It's actually *gitlab_gitlab-workhorse*; we put the star before and after, and it looks in /var/log/containers/ for that name, and then it goes into this index.
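To make the glob concrete, here's a small illustration; the example filename is invented, though it follows the usual /var/log/containers/<pod>_<namespace>_<container>-<id>.log naming used on Kubernetes nodes.

```python
import fnmatch

# The glob described above: a star before and after the name, anchored under
# /var/log/containers/. Files matching it get routed to the workhorse index.
pattern = "/var/log/containers/*gitlab_gitlab-workhorse*"

# Hypothetical container log file for a workhorse pod (made up for this example).
example = (
    "/var/log/containers/"
    "gitlab-workhorse-abc123_gitlab_gitlab-workhorse-0123456789abcdef.log"
)

print(fnmatch.fnmatch(example, pattern))  # True
```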
B: Yeah, I think I probably merged mine first, and this came in after, or something. Okay.
C: All right, I'll get a request in to fix this, then. So when I fix this, do I need to manually delete these indexes? Is that going to cause an issue at all?
B: You probably could. I mean, I think they'll eventually be deleted on their own; you could check with the observability folks, but I think it doesn't hurt to delete them either. Okay.
B: Cool. Marin, since you're joining a bit late, just to give you a quick update on Git HTTPS: we currently have around 10% of the traffic going into the Kubernetes cluster. We saw that there was a problem with Thanos getting the metrics in for the zonal clusters; that's fixed now, and I'm going to be slowly ramping up the traffic a little bit more today, but the expectation is that this is not going to complete tomorrow.
B: Okay, the way it works now is that we have both the cluster and all of the virtual machines in the HTTPS backend, and then we just keep increasing the weight of the GKE cluster slowly over time, in intervals, and that allows us to slowly move traffic over.
B: Compared to the VM fleet, on the VMs we're seeing that the 95th percentile is very spiky; on Kubernetes it's very flat, and it's good. So that sounds good, but it's also perplexing, although I do think that the Git fleet is a tad under-provisioned.
B: The VMs are sort of doing both Git HTTPS and Git SSH, so we were thinking about taking some Git VMs and just putting Git HTTPS on them, to have a point of comparison. This could be something we could do.
D: I mean, I support doing this a bit more carefully, because there's a huge possibility of an outage. What kind of confidence do you need to get in order to, for example, jump from 10% of traffic to 25% and then to 50%, instead of 10, 15, 20? Apart from that spikiness that you're not sure about, what else?
B: And also, I mean, I think once we have the metrics going to Thanos properly; I was losing some visibility because of that and was just looking at logs, so we'll see. My target was to have 50% of the traffic in today; it looks like that might happen tomorrow, and then we'll let that sit over the weekend and go to 100% early next week.
D: I have a suggestion, you don't have to take it, so you don't feel isolated in this, and that means both of you, you and... well, both Johns.
D: Like chat things through with multiple people, just to see different perspectives, and you also kind of roll them into the rollout, so maybe someone tells you something that you're not seeing, or maybe someone gives you more confidence or less confidence.
D: I don't expect you to roll out all of the traffic, that's up to you, but at the same time it would be nice to also involve more than just the two of you, so you can get a few more questions.
B: So, fun fact: we're deploying Workhorse from master right now to the Kubernetes cluster, which is different from what we're deploying to the VMs.
C: I thought we fixed that with that chart change. What happened?
B: I don't know what happened; I don't think this was ever working. When we tag CNG, it picks up the latest version of Workhorse. So we're using the tagged CNG image, but it's actually the latest version of Workhorse that was available at the time CNG was tagged.
D: There is an issue, they have created one, and there is a desire to fix it. It's more about the order of it: do we stop the world now to fix this, or take on some risk? We decided to talk about what the risk is, and we're deciding to take a bit of risk, but then the two corrective actions as we roll things out are that, and also shutting off traffic between clusters.
B: Yeah, I would say almost certainly, yeah. I mean...
C: I guess I could look at the latest auto-deploy and see if the version of Shell changed.
B: Just SSH to a VM, go to your container, run the binary with the version flag and see if it's the same.
C: I guess I'll find that out, and I'll create a new issue.
D: Look at that, look at that, it is, wow. Okay, we did rewrite it in Go anyway, so the VM is not going to be a good test; you need to test the image.
D: You need to pull an image, pull a tag of the image, and...
D: Cool, all right, thanks Jarv for the update, good progress. I do like that we are being a bit careful there as well. So thanks for doing that.