From YouTube: 2021-07-14 GitLab.com k8s migration EMEA
C
Awesome. So, I'm not sure, Jarv may jump in, but here we go. Awesome, so, Scuba, do you want to kick us off?
A
I have those tabs open. So, Consul: we're missing dashboards, and this is a relatively critical service. Any service that requires access to our database, our Postgres database, relies on Consul for service discovery, for determining who the master is and who the secondaries for the database are. That way we send writes and reads to the correct database systems. So Consul is pretty decently important, and the fact that we don't have dashboards for it is kind of concerning, so I decided that we should finally do something about it. So I have a merge request.
A
So there's something that I'm hoping Andrew could help me address in the merge request, and there's also certain data that just doesn't make sense to have; I'll get to that in a second. But we do have data coming in from all of our agents that are running inside of Kubernetes. So we've got how much CPU we're consuming, which is kind of cool, and also the amount of memory that we're consuming.
A
It is very interesting that we appear to be using a lot of CPU, but we need to go back and account for how many nodes we're running inside of production, and we run a lot of nodes. So while we only request a fraction of a CPU, it adds up, depending on how many nodes we're actually running inside of our clusters.
A
The one thing that I do have missing in this merge request is our SLI-related detail, because currently I don't know what metrics are important for determining this information. Consul is not a service where you have a request rate that's important; it's actually whether the service is available, but at that point you're monitoring the individual service behind Consul. So it's going to be more like: is the Consul cluster itself healthy? And I don't know which metrics actually give me that information.
A
I tried; I was very unsuccessful at troubleshooting, not a stack trace, but, goodness, troubleshooting a compilation failure in jsonnet is not easy for me yet. So again, there's stuff like RPS, which doesn't make sense. We obviously don't have data for saturation; we have no data for that, which we kind of should.
A
I think the only thing that we see is this open file descriptors metric, which is specific to the nodes, not Kubernetes. But we do have node-level metrics, because we do have five servers as virtual machines, and then if we go down we get the same kind of stuff I kind of showcased already, where we get our Kubernetes-related information as well.
A
And then my merge request also includes some changes to the service definition, so stuff like the CPU utilization, we'll start gathering that after that merge request gets pushed into place. So we have dashboards coming; it's an open merge request.
C
Should we have this on our roadmap, to be migrating stuff? You mentioned that we're running some VMs. Is this something that at some point should all be in Kubernetes?
A
In the future, yes. I think Consul is a decent service to have in Kubernetes, just because of its clustering capabilities, but I have not looked into it myself yet, just due to the fact that we've got enough work to do. So I think it should be on the roadmap for the future; when we do it, I don't know. Okay.
D
Starbuck, regarding your question about detecting the healthy state of the cluster itself: maybe you should take a look at the raft metrics, so basically consul_raft-something. You should be able to see the leader and the peers and make some assumptions based on those numbers, because you know how we configure the cluster, so the majority model and things like that. It kind of gives you an idea, because, on the other aspect, an unhealthy...
D
...peer should be detected as a service, if I remember correctly, because basically every Consul node is part of the service catalog, and if it's unhealthy, by its own definition it just appears there as unhealthy. But at the cluster level, knowing if the cluster is working depends on the presence of the right number of masters... leaders, actually, yeah. Just leaders, right, yeah.
A
What I couldn't figure out is what happens if one of the nodes dies; I couldn't figure that part out, because I would signal, hey, that's an unhealthy cluster, despite it working. It just means there's a problem that we should investigate: let's alert somebody, because we don't want Consul to completely go down, that kind of thing.
A
So that's something I was actually confused about, because there is a leader metric of some sort, but when I queried the cluster, or when I queried the metrics, all of them showed up as a leader, and I'm like, that doesn't make sense to me. I thought only one was supposed to show up as a leader, so I think I was either looking at something wrong or I need to refine the queries.
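For reference, a minimal sketch of what alerting on those raft numbers could look like, assuming consul_exporter-style metric names; in that exporter, consul_raft_leader is 1 on every node that currently sees a leader, which would explain all of them showing up as leaders. The names and thresholds below are illustrative, not the queries from the merge request.

# Hedged sketch: Prometheus alerting rules for Consul cluster health, assuming
# consul_exporter-style metrics (consul_raft_peers, consul_raft_leader).
groups:
  - name: consul-cluster-health
    rules:
      - alert: ConsulRaftQuorumLost
        # With 5 Consul servers, fewer than 3 raft peers means quorum is lost.
        expr: min(consul_raft_peers) < 3
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Consul raft cluster has lost quorum ({{ $value }} peers)"
      - alert: ConsulServerMissing
        # Any missing server is worth investigating even if the cluster still works.
        expr: max(consul_raft_peers) < 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Consul raft cluster is running with fewer servers than expected"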
A
But at this point I'll save that for a future iteration; I'm just going to concentrate on this merge request, because this is better than anything else we have, which is zero, currently. So yeah, so yeah. Okay. So let's look at nginx real quick, because that's the other one that is merged, and I know those dashboards are in a nice, fun...
A
...state. Capacity: this one takes forever to load, but we have our nginx controllers and we've got our container details. So we have the CPU usage of all the containers, and their standard deviations, and the same for memory as well, and we've also got the waiting reasons and terminations. So we can see that we're scaling on a regular basis.
A
That's kind of cool. Same deal for the actual deployments, so we know how much CPU, memory and network traffic is happening through each one of our individual clusters.
A
We do have error ratios. We don't have SLIs attached to these yet, because our errors are kind of high, which is kind of surprising, but I have not really dug into this very much yet; I'm just trying to get data out there, available, so that we can look into it. I did find some interesting measures I thought were important, so I went ahead and added dashboards for those items: just total connections, and connections by state. That was actually fun during one incident.
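As a point of reference, the connections-by-state and error-ratio panels can be driven by recording rules along these lines; this is a sketch assuming the stock ingress-nginx controller metric names, not the exact queries in the merge request, so names should be checked against what the controllers actually export.

# Hedged sketch: recording rules for connection-state and error-ratio panels,
# assuming community ingress-nginx metric names.
groups:
  - name: nginx-ingress-overview
    rules:
      - record: nginx_ingress:connections:sum_by_state
        # Current connections broken down by state (active, reading, writing, waiting).
        expr: sum by (state) (nginx_ingress_controller_nginx_process_connections)
      - record: nginx_ingress:requests:error_ratio_5m
        # Share of requests answered with a 5xx over the last 5 minutes.
        expr: >
          sum(rate(nginx_ingress_controller_requests{status=~"5.."}[5m]))
          /
          sum(rate(nginx_ingress_controller_requests[5m]))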
A
Cool. So we can see how many active replicas are operating across our cluster, which is cool, so we've got some pretty cool stuff. Oh, the other thing I wanted to show was, inside the detail view we've got this nifty link here that goes to Stackdriver, because this is where our error logs are being sent to. We decided to just ignore our access log details, so you can see we're kind of dominated by the fact that we're buffering temporary files; this is normal behavior.
A
So that we should be able to get down to the very important stuff. Okay, don't filter that out; let's just get rid of this. Do I need... no, just not that.
A
I don't quickly see it, but another item that we see a lot is the fact that we don't have TLS configured, which is a perfectly fine error that we can ignore. But, oh, here it is: service web doesn't exist. So we're getting error logs, which is our primary goal of increasing our nginx observability. So I'm happy with where that stands as of right now.
C
Awesome. Did you just go with the error logs in the end for nginx?
A
Yeah. So for access logs there are two problems that I see that are leading towards us not wanting to try to consume those, or I guess a few things. One, they're going to be kind of redundant with our HAProxy logs. The only difference that we're going to see is which backend takes care of which request: HAProxy provides that to the nginx ingress, which is always going to be the same across all endpoints, and nginx is going to be like:
A
Oh, this request is going to go to the web deployment, or this request is going to go to the API deployment. That's about the only important information I see out of that data.
A
Another item that's problematic is that we don't currently have a method to filter out secret data that gets logged by nginx, so because of that, that kind of puts us in a higher-risk situation of exposing data to the wrong engineers.
A
That's kind of lower priority, because only certain people have access to the production data anyway in the GCP console, but I don't want to deal with that from a security perspective in general, so I'm just like, screw it. And two, there's just going to be a large volume of data. Yeah.
B
A
Same thing for nginx. So because of the other two items I don't deem the cost of it worthwhile, and plus we've been dealing with this for the last two-plus years, so why bother? So I'm going to leave it that way. If we have an incident where looking at nginx logs is more important, I guess we'll tackle it then, but at this point I updated our runbook to comment that we're not doing this because of x, y, z.
C
Awesome, yep, sounds good. Nice, great, great progress. Good to have visibility of both those things, so, awesome.
A
C
Yes, exactly, that's awesome, really good, yeah. And deployments have been super smooth today, so that's a huge milestone.
A
So, yeah, web traffic is definitely going to our Kubernetes nodes, which I'm pretty stoked about, so, awesome. I think the next step that we need to accomplish is, there's a configuration file that Graham did not yet audit, so I'm going to try to do that today, and then we can start knocking out some of the other issues, like the readiness reviews and stuff that are currently open.
C
I'm not going to say, this is not in terms of timing, like, I'm not going to apologize, but I mean in terms of, like, do you want to get, like, are we going to try and get a small bit of traffic on canary and then get the readiness reviews signed off? Or do you want to get all those things wrapped up and approved before we put any traffic on canary?
A
C
We'll need to do that as well as we go along, because there is a reasonable chance that we'll get new Chef changes coming in that won't be inc... like, if anyone changes stuff in the next few weeks, it's not guaranteed that they'll do it in both places.
C
I know Graham just mentioned that he saw one that was only in Chef, so that was the thing, I think, that made him aware that we will need to watch for this, so he started sharing that out. But we might want to just, it's probably fine in canary, but at some point before we go fully into production we should just do another quick audit.
A
The only thing that I know of that recently was made as a change in Chef that we cannot take over to Kubernetes is due to a chart configuration; it's just not available in our Helm chart.
C
Okay, super, okay. Hopefully it's just that one. Great, okay, sounds good, awesome, that sounds great. And one thing I was, like, at some point, so, as we've kind of been talking about Q3 ideas and things, Pages is, like, the next big, well, it's not big, it's hopefully small, but the next stateless service.
C
I think we should just get started on that whenever we're comfortable managing the pieces. So it doesn't have to be, like, right now, but also I don't know if we necessarily have to wait for all of web to migrate. If we feel like we've got the pieces all moving along and we could move it in pre or staging, then let's get that started when we can.
C
Yeah, cool, sounds good, sounds good. Great, great progress, great progress.
C
Well, and you as well, so, like, go team work. So I'm loving this time zone handing off; it's working out really well.
C
It's really good, awesome. So I had a couple of discussion points. The main one is really about Q3 OKRs. I was wondering, kind of based off yesterday, should we try and do something to reduce the conflicts between auto-deploys and k8s-workloads changes? Like, would that make it easier to work on the clusters?
A
Yes. I'm more concerned about auto-deploys being blocked, and Graham has an idea about this, and I'm pretty sure he's got an issue logged about this as well, yeah. So I remember the proposal was to try to make auto-deploy-specific pipelines simply not check for configuration changes when upgrading the cluster, which should be doable, because right now we query our single source of truth, which is currently Chef, and our secrets, to make those necessary changes. But there should be a way we could say:
A
Hey, only do this one specific change to our clusters. Graham had an idea about this; we just need to find that issue and maybe prioritize it. I think it's either in the tech debt epic or it might...
D
A
We do use resource groups. The problem that we run into is that the diff checker might run while a configuration change may have been pushed into place, and if that happens we block auto-deploys, because auto-deploy saw a change that may not have been rolled out to that cluster yet, because it may have been blocked because the diff job was running from auto-deploy, for example.
A
So, yeah, we need to address that in some way, shape or form, and I think our best option is going to be determining how to figure out how k8s-workloads operates, or deploys things overall, which I know Graham has ideas for.
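For context, the resource groups mentioned here are GitLab CI's resource_group keyword: it serializes jobs that share a group so a diff and an apply never run concurrently, though, as described above, it does not by itself stop a diff from observing configuration that has not rolled out yet. A minimal sketch, with hypothetical job names and scripts:

# Sketch of serializing cluster-touching jobs with a resource_group in
# .gitlab-ci.yml; job names, scripts and the "production" group are illustrative.
diff:
  stage: check
  resource_group: production      # only one job holding this group runs at a time
  script:
    - ./bin/k8s-diff production   # hypothetical diff wrapper

apply:
  stage: deploy
  resource_group: production      # shares the group, so it queues behind any running diff
  script:
    - ./bin/k8s-apply production  # hypothetical apply wrapper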
C
Cool, okay, yeah, I think that makes sense. Cool, okay, so I'll add a comment on the OKR discussion issue, but I'm kind of thinking, in terms of, like, scaling-up stuff, that to me feels like it'd be a nice one to have removed, just as auto-deploys are moving faster now with bridge jobs; you get so much less opportunity to push these things out in between auto-deploys at the moment.
C
The last, well, incident aside, but over the last couple of weeks, with Graham deploying as well, we've been hitting five deploys a day, which is awesome, great for getting things out, but it's going to make it harder for you to, like, dodge these things, basically.
C
And such, exactly, so we'll have a lot less visibility of changes coming in, so, yeah, okay. So then the other one, which I'm going to mention on the issue, that we should think about is something Jarv was mentioning:
Should
we
should
we
try
and
reduce
our
dependencies
on
omnibus,
like
specifically,
post
deployment?
Migrations
could
be
a
great
one
right
following
web
fleet
migration.
C
Maybe that's a good time to actually try and work it out, because we know that registry needs this. They don't need it yet, but they know they'll need it in the future. We don't actually have a solution for applying post-deployment migrations safely on registry, and the reason, I believe, I can't remember the full thing, but I believe the reason is, is it still the fact that we can't control the order it applies in, is that right? Yeah.
D
Yep. Basically we need an operator, and no one is working on operators. So the ask is to re-implement the deployer in k8s-workloads, because basically, this is, so the baseline here is that the more we remove from the deployer, we end up in a situation where basically the box that runs the migration can simply be an image running as a job on the cluster. So we schedule a job and say: could you please run the regular migrations?
D
Then we kick in the Helm upgrade, and then, when it's done, we run another job which runs the post-deployment migrations. Which sounds like re-implementing the deployer in k8s-workloads, and the reason being is that we need an orchestrator, which usually is what an operator is made for, but we don't have one.
D
For us, it's easier to just do the workaround instead of building a full-fledged operator.
C
Yes, okay! Well, that's a good one that we should at least explore, like, whether that's an option. I mean, I guess that could be a solution, right? Somehow we need to find a solution for post-deployment migrations, and perhaps assets as well, and an operator of some variety, I guess, is going to be needed, right?
C
Yeah, so that...
D
This is not hard, right? It's just, I mean, you're changing the Ansible job with a Kubernetes job. You just schedule the job to run, and then we need two images. I think the images are already there, because the Helm chart is using those images. So it's just a matter of providing the environment variable to say, skip post-deployment migrations, or, run post-deployment migrations. So it's simple.
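A minimal sketch of the kind of one-off Job being described, assuming the chart's toolbox image and the SKIP_POST_DEPLOYMENT_MIGRATIONS flag that the Rails migration task understands; the image tag, namespace and command are placeholders to be checked against the actual chart rather than the real deployer config.

# Hedged sketch: run post-deployment migrations as a Kubernetes Job.
apiVersion: batch/v1
kind: Job
metadata:
  name: post-deployment-migrations
  namespace: gitlab
spec:
  backoffLimit: 0                # fail loudly rather than retrying migrations
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrations
          image: registry.gitlab.com/gitlab-org/build/cng/gitlab-toolbox-ee:<version>  # placeholder tag
          env:
            # Regular migrations already ran before the Helm upgrade, so this
            # run only needs to pick up the post-deployment migrations.
            - name: SKIP_POST_DEPLOYMENT_MIGRATIONS
              value: "false"
          command: ["gitlab-rake", "db:migrate"]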
D
It's more that, even when we talk about operators, an operator is a cluster concept, so within a cluster you may have an operator, but we have a multi-cluster deployment.
D
If we move away from Omnibus packages, we will increase our velocity in terms of how long it takes to deploy something, because we will remove one hour of building the images and things like that, and on top of that we may also have something interesting to explore here, which is decoupling Gitaly and the Rails deployments.
D
So there are tons of opportunities, but it doesn't really sound like we are going toward something which is streamlining the process in any way. We're just making custom, we're just removing a custom tool and building a new custom tool, which may or may not be easier. It may be faster to run, but not necessarily easier to implement and understand and work on.
D
Yeah, I mean, the point is that if we know that there is no Gitaly change, let me rephrase, if we know that there is no Gitaly API change, which means that it can run with the old Gitaly version, we can completely decouple things, and so we run the migrations in Kubernetes, the Kubernetes deployment plus the post-deployment migrations in Kubernetes, as a deployment, and then, when the Omnibus package is ready, we roll it out independently. But right now we can't do this, because we have the migrations happening on VMs.
D
Yeah, maybe something we can do in the meantime, for experimenting on this, to validate the approach, is that we remove post-deployment migrations from the deployer and we put them in k8s-workloads, so that we still have this: we still wait for Omnibus packages, and so we just run the deployment with the same steps that we are doing today, but we validate our ability to run post-deployment migrations using a Kubernetes job, just switching the job, because at that point in time we have both VMs and Kubernetes images.
C
Yeah, certainly, okay, cool, yeah, that sounds interesting. As I say, we know at some point we're going to need to solve this for registry; we didn't have a solution, but it's not... I thought that was a...
C
Yeah, we know we all need to solve this for registry; we don't have a good way right now, and I think that the concerns raised were around the way they'll roll back if it fails, or the order of things. So we can certainly work out details, but that sounds like it could be a good thing to explore.
C
Yeah, for sure, I'll add a comment onto the issue so we can discuss that. Cool. And then, is there any other stuff anyone wants to talk about on OKRs?
C
Cool, all right. And then, just to finish off, I was going to say, on that really interesting graph on the GitLab Shell memory leak: should we ping the team and get some developers to take a look at that? Like, I'm assuming that's not really just an infrastructure problem.
A
It could be there's something wrong with how I'm querying the data, or something. The second reason why I did not raise this issue is because I know at some point in time, I don't know the timeline, GitLab Shell is supposed to be changing how that service runs. So instead of it being, like, the SSH daemon that calls out to a binary, GitLab Shell itself is supposed to be the SSH daemon that's listening for requests, which is an entirely different way of running the GitLab Shell process entirely.
C
I'd suggest, once you get to the point where you're reasonably confident that data is right, just ping them on it, and you can pretty much just say exactly that, right? Like, hey, I know you're working on this stuff, this is just visibility of how things look. And then I think they can work out the timeline, and does it make sense to fix it, or are they just migrating?
A
I'm also not heavily concerned about this, because I don't see any bad things happening inside of Kubernetes; it's not killing these pods, and I have not seen people complaining about GitLab SSH. Unless there's another incident, I'm not...
C
Yeah, no, I think that makes sense. If I understood from Henry, if I heard correctly, it was the Sidekiq tuning that was causing the most pain for engineers on call, and the other two, registry and GitLab Shell, were just ones he saw in amongst the metrics, like the dashboards.
D
I have a question, Starbuck, about the GitLab Shell memory usage. So my question is: how do we schedule those pods? Are they long-running and they just get routed requests, or do they serve a number of requests and then get killed, because of the HPA just expanding and contracting the fleet?
A
Like, in this particular case we're just using the SSH daemon, and I forget the exact configuration, but theoretically each daemon will handle upwards of 200 requests before it starts denying clients; it's either 200 or 400, yeah. I don't know how many clients they actually serve, because the SSH daemon has no metrics in any way, shape or form.
D
So I was looking at the individual graphs that you linked, instead of just the trend, because the trend, obviously, is monotonically increasing, so it sounds like there's a memory leak. But on the other hand, what I was thinking is that you have a Go daemon, because I know the shell thing was written in Go, right? So you have something that is streaming data from one direction to the other, and we're talking about clones and pushes, so data-intensive operations.
D
So the way these things work, they very likely have a memory buffer that is moving information from one direction to the other. So I was thinking, because most of those things have a very short-lived metric, if you just click on some of them they're kind of very tiny, so I was wondering if we see it monotonically increasing because we have a peak: we spawn a pod, it starts serving, then the peak memory gets used, and then, basically, when it's done it gets killed.
D
I'm talking about, not the node exporter, the host exporter, sorry, the one that you can install on your machine, so that it runs on Linux and collects a set of metrics, and I'm quite sure that you can just say, I want detailed information about the process with this PID or this name or whatever, and it will get added to the exported metrics.
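What is being described here sounds like a per-process exporter; a sketch of the matching configuration such an exporter takes, assuming the commonly used process-exporter format, with illustrative process names for the sshd / gitlab-shell case rather than anything currently deployed:

# Hedged sketch of a process-exporter style config; exporter choice and
# process names are assumptions for illustration.
process_names:
  - name: sshd
    comm:
      - sshd                   # match the SSH daemon by executable name
  - name: gitlab-shell
    cmdline:
      - gitlab-shell           # regex matched against the full command line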
C
Cool. Is there anything else that we want to go through?
A
I have not... not something I really want to go through; it's more of just a question. I had a one-on-one with Marin earlier today and I was discussing what kind of stuff we want to see in the future, and we see that Gitaly is currently in discussion, but I already know that Gitaly is not going to be ready to be migrated to Kubernetes, just due to the current state of how it works right now.
A
I don't know enough about Praefect to provide an opinion about this, but I feel like, if it is a stateless service, we could spin it up, and I don't know what the expectations of Praefect deployments are supposed to be. But theoretically, if it's stateless, there should be no problem with moving it over to Kubernetes, and it is now the backing store for Gitaly, which sends this data to a virtual machine.
A
But later we could then start spinning up Gitaly pods, and then, so long as Praefect is working as it's supposed to, it doesn't matter how we've deployed Gitaly, whether it's just one or two pods that are sitting behind it. But as long as the replication is working, we should be able to have a seamless deploy with that.
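A purely illustrative sketch of what running Praefect as a stateless Deployment with a couple of replicas could look like; the image, port and config names are placeholders, and in practice the GitLab Helm chart's praefect sub-chart would be the starting point rather than a hand-written manifest.

# Hedged sketch: stateless Praefect Deployment in front of Gitaly VMs.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: praefect
  namespace: gitlab
spec:
  replicas: 2                       # multiple pods purely for redundancy
  selector:
    matchLabels:
      app: praefect
  template:
    metadata:
      labels:
        app: praefect
    spec:
      containers:
        - name: praefect
          image: registry.example.com/praefect:<version>   # placeholder image
          command: ["praefect", "-config", "/etc/praefect/config.toml"]
          ports:
            - containerPort: 2305   # Praefect's default listen port
          volumeMounts:
            - name: praefect-config
              mountPath: /etc/praefect
      volumes:
        - name: praefect-config
          configMap:
            name: praefect-config   # config pointing at the Gitaly storages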
C
I mean, it's something which might be worth us just trying to test somewhere. Like, you know, where could we spin up Praefect in Kubernetes and actually be confident that we can, like, you know, we want to do some deploys, we want to see scaling, we want to check, like, logging and those sorts of things. If we can work that out, then I think it would be a great one. Like, I'm almost at the stage where, actually, I wonder, we've got lots of kind of options people are talking about, like:
C
Should we do something with Redis? Should we do something with, like, Consul, HAProxy? They're almost all pieces which we should probably just test somewhere and see what works, what doesn't work, and what would be involved.
D
In the past we have used staging for this, because of the way that projects are routed to a specific storage provider: we can just deploy something and just route one or two projects to it that we start testing manually, and then we can extend this to QA or things like that.
C
I'm going to believe that's fine, we can do that, right? Like, if we started this, I mean, it is a bit of an end run around deployments, but it's something which, if we handle it within Delivery, we could manage that conflict and the set-up and tear-down. Do you want to, like, shall we try and do a test on Praefect and see if it works?
C
I'm not aware that it is, because we created the Gitaly one just the other month, when we started talking about Gitaly, okay, so I'm not aware of a Praefect one. No, but it would be a great one to get a discussion going on, because I know Jason, and perhaps other people in Distribution, have thoughts on Praefect, and I know Andrew's mentioned it before, that it might be a great time to migrate it before it actually gets tons of users.
A
I haven't gotten around to spinning up an issue yet, because I need to learn more, but yeah, if we can make that decision, I'm happy to test this out.
C
Super, yeah, I love that. Like, I mean, we're definitely at the stage where I think there's lots of pieces we want to get into Kubernetes, but none of them are going to be quite as straightforward and obvious as, like, the web fleet, where, you know, it's sort of just there already. So, yeah, I think we probably need to plan some tests and see what we can pick over.
A
It sounds like a natural place to put Praefect, because you need to run multiple versions of them, not versions, but multiple pods, at the same time for redundancy, and because it's stateless we don't have to worry about disk storage; it's just a matter of deploying the actual image. It feels like a no-brainer, but we need to learn more.
D
Learn, yeah. What's more interesting is the upgrade procedure, how the rollout will affect it, and, yeah, but yeah, I mean, it would be more or less the same things that just happened on VMs, so...
C
Yeah, for sure, yeah, cool, well, yeah, yeah. Absolutely, please go ahead and, like, start digging on that; let's see if we can get that into something where we could test it and see what it looks like. Fantastic, cool, great. Is there anything else anyone wants to bring up today?
C
Awesome, so, yeah, good that we've covered that. I often say to people that occasionally, if you watch these demo videos all the way to the end, you get the Easter egg. You don't always; I feel like some people feel like it's really cheating: after an hour of watching us talk about, like, dashboards and things, nothing happens, we just go away. But occasionally you get, like, gold.
C
Anyways, all right, well, super. Chat to you all. Yeah, it's not working super well, but it's like you...