From YouTube: SIG - Performance and scale 2021-07-01
Description
Meeting Notes: https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.8taxjc2uv4bg
A
Okay, all right. Everyone, welcome to SIG Scale. If you need to add yourself as an attendee, please do so in the meeting doc; the link is in the meeting chat.
A
Okay, so today I wanted to continue with some of the discussions we had last week, take some of the things we've talked about in the past, and talk a little more about the implementation: how we can actually accomplish some of these things. And, like I mentioned last week, we're going to move to weekly.
A
I also said this on the mailing list, and I'm sure you're all aware because you're here: we'll be doing weekly meetings on Thursday. Okay, so the first item on the list. This is the PR that Marcelo has been working on. Marcelo, do you want to talk a little bit about any of the progress you've been able to make on this one since last Thursday?
B
All right. So, unfortunately, I don't have too much of an update. I was busy with some other activities, and I'm updating the PR today. It was actually a little bit messy to update, because there was no way to... I just realized today that the master branch was changed to main, so things were not synchronized well. Anyway, I'm going to update it.
B
The idea is to remove all the parts that are collecting those other metrics and keep only the part that actually creates a bunch of VMIs, measures the time from creation to the running phase, and reports that.
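A minimal sketch of that kind of measurement, assuming a dynamic client-go client and that the VMI reports status.phase; the resource coordinates and the Running phase follow the kubevirt.io/v1 API, everything else is illustrative:

```go
package perf

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

var vmiGVR = schema.GroupVersionResource{
	Group: "kubevirt.io", Version: "v1", Resource: "virtualmachineinstances",
}

// timeToRunning watches a single VMI and returns how long it took from
// its creation until status.phase reported Running.
func timeToRunning(ctx context.Context, c dynamic.Interface, ns, name string) (time.Duration, error) {
	w, err := c.Resource(vmiGVR).Namespace(ns).Watch(ctx, metav1.ListOptions{
		FieldSelector: "metadata.name=" + name,
	})
	if err != nil {
		return 0, err
	}
	defer w.Stop()

	for ev := range w.ResultChan() {
		vmi, ok := ev.Object.(*unstructured.Unstructured)
		if !ok {
			continue
		}
		phase, _, _ := unstructured.NestedString(vmi.Object, "status", "phase")
		if phase == "Running" {
			return time.Since(vmi.GetCreationTimestamp().Time), nil
		}
	}
	return 0, fmt.Errorf("watch closed before %s/%s reached Running", ns, name)
}
```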
A
Okay, sounds good. All right, so it sounds like we don't need to talk any more about this one; I think it's understood. Okay, so let's go to the second bullet point. This is the issue that I created to measure the different performance metrics. We've already had some progress here: David did a nice job on this, and we already have that merged.
A
I want to talk about some of the remaining bullet points and see if we can discuss how to implement some of these, or whether you want to change, add, or remove any of them. So I copied the bullet points in here. The first one: the work queue length. I started looking at this, and I already started to write some code for it. It seems pretty straightforward.
A
So we have some sort of count, and then we decrement it every time an item completes. The idea is that this will give us, at any given time, the number of items in the queue, or how long the queue is, and we can monitor that in Prometheus.
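For reference, a minimal sketch of what that counter could look like, assuming a Go controller built on client-go's work queue and the Prometheus client library; the metric name and the hook points are illustrative, and the gauge is only approximate because Add deduplicates keys that are already queued:

```go
package controller

import (
	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/util/workqueue"
)

// Illustrative gauge tracking how many keys are currently sitting in the queue.
var queueLength = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "kubevirt_vmi_controller_queue_length",
	Help: "Number of keys currently waiting in the controller work queue.",
})

func init() { prometheus.MustRegister(queueLength) }

// enqueue increments the gauge whenever a key is added.
func enqueue(q workqueue.RateLimitingInterface, key string) {
	q.Add(key)
	queueLength.Inc()
}

// processNextItem decrements the gauge once the key has been pulled off.
func processNextItem(q workqueue.RateLimitingInterface, reconcile func(key string) error) bool {
	item, shutdown := q.Get()
	if shutdown {
		return false
	}
	queueLength.Dec()
	defer q.Done(item)

	if err := reconcile(item.(string)); err != nil {
		// A re-queue via AddRateLimited would eventually need the same increment.
		q.AddRateLimited(item)
		return true
	}
	q.Forget(item)
	return true
}
```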
A
What do people think about that? Is that the right approach? What do people think?
C
It sounds good; I mean, you'd pretty clearly see how much is queued. Not entirely sure, but if things really go south, you will see it in Prometheus at least.
A
Yeah, so what we've seen from our measurements, because we basically just did some experiments a while ago where we printed this to standard output and scraped the logs for it: in general, what I expect us to see is a count that goes high, stays high for a while, slowly goes down, and then very quickly descends.
A
So I'm curious to see how it shows up and whether we see the same behavior, because I guess what we would expect is a parabola, where it quickly increases and then quickly decreases, or even almost no queue at all, because the input speed equals the output speed. So we'll see, but that will give us some good information.
A
Yeah, so that's what I was thinking for number two, event callbacks and queue time, or maybe we can clarify this. So, Gavin, why don't you talk about what you're thinking?
D
I was thinking the service time of the queue: from when the event is delivered into the work queue to when we're done.
D
Yeah, just the queue: the queue service function being called, I guess, for any event.
A
Okay, so any call whatsoever. Wait, so this would be like when we do a re-queue, something like that. I'm trying to find the diagram.
A
So, for example, when we do, let's say, right here:
A
when we see a status change, we update the status, then we add ourselves back to the queue, and then we go through this loop again after we get picked up.
D
I think at the entry point to the work queue, whatever it's called, execute.
A
Okay, so we start when execute is called and we stop when the key is pulled off. So that's the window we'd be timing.
D
Yes, and then maintain an average across all keys. So time each individual key's execution and keep an average, outliers and all.
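A rough sketch of timing each key's execution as it's pulled off the queue, again assuming client-go's work queue and the Prometheus client; the histogram name and buckets are illustrative, and Prometheus gives the average via the _sum and _count series:

```go
package controller

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/util/workqueue"
)

// Illustrative histogram of the per-key service time (queue pop to Done).
var workDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "kubevirt_vmi_controller_work_duration_seconds",
	Help:    "Time spent servicing a single key popped from the work queue.",
	Buckets: prometheus.ExponentialBuckets(0.001, 2, 15),
})

func init() { prometheus.MustRegister(workDuration) }

func execute(q workqueue.RateLimitingInterface, reconcile func(key string) error) bool {
	item, shutdown := q.Get()
	if shutdown {
		return false
	}
	defer q.Done(item)

	start := time.Now()
	err := reconcile(item.(string))
	// Record how long this individual key took, outliers included.
	workDuration.Observe(time.Since(start).Seconds())

	if err != nil {
		q.AddRateLimited(item)
	} else {
		q.Forget(item)
	}
	return true
}
```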
A
Maybe I should change the title here, because I'm calling this "callbacks and queue time". So this could be, for example... okay, what this doesn't capture is the time spent in the...
D
As I said, that would be the service time, and then what you're referring to, the queuing time before you're picked to run, is also interesting, because we saw that with the rate limiters, where things were sitting in the queue for a long time because of the rate limiter. So we'd want to measure that too, to pick it up.
C
Not sure what you mean. You mean the error rate limiting or something?
D
The Kubernetes client, the default client, has a rate limiter, and you effectively spend a lot of time in the queue before you come out of it again. So to make that visible, I think you need to measure the time spent in the queue.
C
Yeah, my only concern is that, from that perspective, it's not clear whether you spend that time because of such a rate limiting thing, or whether it's enqueued there for a long time because your processing loop takes a long time.
A
Okay, so how should we tackle this? Because I could see this going a couple of ways. For instance, I could literally record the moment when a key gets added to the queue; that would be, I think, when we do this re-queue call or enqueue call, whatever it is.
A
Maybe we record the timestamp there.
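A back-of-the-envelope version of the "record a timestamp at enqueue, diff it when the key is picked up" idea; everything here is illustrative, and this is roughly what client-go's built-in queue latency metric already does:

```go
package controller

import (
	"sync"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/util/workqueue"
)

var queueWait = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name: "kubevirt_vmi_controller_queue_wait_seconds",
	Help: "Time a key spends in the work queue before it is picked up.",
})

var (
	mu         sync.Mutex
	enqueuedAt = map[string]time.Time{}
)

func init() { prometheus.MustRegister(queueWait) }

// enqueue remembers when the key was first added.
func enqueue(q workqueue.RateLimitingInterface, key string) {
	mu.Lock()
	if _, ok := enqueuedAt[key]; !ok {
		enqueuedAt[key] = time.Now()
	}
	mu.Unlock()
	q.Add(key)
}

// dequeue records how long the key sat in the queue before being picked up.
func dequeue(q workqueue.RateLimitingInterface) (string, bool) {
	item, shutdown := q.Get()
	if shutdown {
		return "", false
	}
	key := item.(string)
	mu.Lock()
	if t, ok := enqueuedAt[key]; ok {
		queueWait.Observe(time.Since(t).Seconds())
		delete(enqueuedAt, key)
	}
	mu.Unlock()
	return key, true
}
```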
C
The enqueue-after thing, yeah. If the object changes in the meantime it would still be processed, so it's not a fixed delay; it's more like, at worst, look at it again after this time.
C
Okay, I didn't look too much into the rate limiter, to be honest. I can, but I found it pretty straightforward to measure the actual processing time, and not how long it stays in the queue.
A
Well, that's... we want that, that's kind of what we want, but we also want to know, for an individual event, if we just updated the status of a VMI, how long from the moment we did the update to when it gets processed. That's...
C
Like our event processing... for me, I mean, you showed us some graphs and so on, but it was always a little bit unclear how you were getting this data, what exactly you were measuring, and where exactly the time was spent.
C
So I think, with the PR from David, we now have a much better way of looking at the VM startup times and phase transitions, and I think the direct next step... I mean, a lot of the others here make sense, and the zeroth one on the list makes sense, but at least for me, as a first structured data-gathering step, in addition I would really look into how long the controllers need to process the objects in the controller loop.
C
That's pretty much it, and then you can already narrow it down a lot, to whether the issues are basically in the business logic rather than in the infrastructure. And when you still can't find anything there, I guess the rate limiting and the queue length may become more interesting. But, I mean, you saw an increase in the queue count, for instance, and it was never clear to me whether we actually spend the time in the processing loop or whether things are really stuck in the queue for such a long time for non-obvious reasons.
C
Yeah, all I mean is: let's say you have five threads for your controller.
C
Then you implement this gauge and you measure the time in the queue, and then there are, let's say, I don't know, five VMs which for whatever reason are stuck in the processing loop. Then all the events will just stay there for ages before anything moves, and you will not really see anything with those metrics, whereas with the other approach you would pretty clearly see that suddenly a few objects take an insanely long time until they finally leave the processing part.
A
Yeah, like Kevin said, there's some rate limiting happening, which is fine; we can get around the rate limiter. But more than that, I'd like to understand why it's getting rate limited in the first place, why it has to do that. That's what I want to capture with these, because something is going on with the keys that are in the queue. Makes sense.
D
Yeah, when you create your Kubernetes client (I forget what it's called, the create-something-from-config call), if you just accept the defaults, you get something like 10 ops per second with a burst of 10, or something like that.
D
But you can specify a higher burst and a higher average rate, and when we increased that, things improved substantially.
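A small sketch of what bumping those limits looks like with client-go; the actual numbers are a separate discussion, 100/200 below are placeholders, and the stock defaults are on the order of 5 QPS with a burst of 10:

```go
package client

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func newClientset(kubeconfig string) (*kubernetes.Clientset, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return nil, err
	}
	// client-go's default client-side rate limiter is quite conservative;
	// requests above the limit queue up locally before they are ever sent.
	// Raising QPS and Burst lets the controller talk to the API server faster.
	cfg.QPS = 100
	cfg.Burst = 200
	return kubernetes.NewForConfig(cfg)
}
```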
A
Yeah, I mean, I think we'll get something from this either way. Even if we end up proving that it's a non-factor, that's fine; we can get rid of it. I think it would be good to know, just to get some insight into what's going on in this area. So, okay.
B
I think the work queue has some metrics that can be exposed, like the queue length and the queue timing. I just sent some links in the chat here in Zoom.
B
So I'm just wondering if we need to... you know, the depth, yeah. This is some metric; the other link actually shows it a little bit better, on this running process. So the queue depth, the queue latency, the work duration: I think those kinds of metrics.
B
Maybe we don't need to re-implement the thing; we can just expose them. And then the processing time of the key itself, as Roman mentioned, that's actually the one that's missing here, it doesn't have it, so it might be very interesting to spend more time on that one and expose those here.
A
Yeah, so we need to expose them... what do we need to do? I haven't looked at this, so do we just need to report them, are they already there, or do we just need to wire them up to whatever KubeVirt exports?
A
Okay, that looks great. So then we can look at wiring these up. I can take that one; I'm already sort of looking at this area anyway, so I can look at wiring them up. That's good. Okay, cool, yeah, thanks!
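"Wiring them up" with client-go typically means registering a workqueue.MetricsProvider before any queue is created. A rough sketch, with illustrative Prometheus metric names (this is not KubeVirt's actual code):

```go
package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/util/workqueue"
)

// Helpers that create and register one Prometheus metric per named queue.
func gauge(name, queue string) prometheus.Gauge {
	g := prometheus.NewGauge(prometheus.GaugeOpts{Name: name, ConstLabels: prometheus.Labels{"queue": queue}})
	prometheus.MustRegister(g)
	return g
}

func counter(name, queue string) prometheus.Counter {
	c := prometheus.NewCounter(prometheus.CounterOpts{Name: name, ConstLabels: prometheus.Labels{"queue": queue}})
	prometheus.MustRegister(c)
	return c
}

func histogram(name, queue string) prometheus.Histogram {
	h := prometheus.NewHistogram(prometheus.HistogramOpts{Name: name, ConstLabels: prometheus.Labels{"queue": queue}})
	prometheus.MustRegister(h)
	return h
}

// provider maps the work queue's built-in instrumentation points
// (depth, adds, queue latency, work duration, ...) onto Prometheus metrics.
type provider struct{}

func (provider) NewDepthMetric(q string) workqueue.GaugeMetric {
	return gauge("workqueue_depth", q)
}
func (provider) NewAddsMetric(q string) workqueue.CounterMetric {
	return counter("workqueue_adds_total", q)
}
func (provider) NewLatencyMetric(q string) workqueue.HistogramMetric {
	return histogram("workqueue_queue_duration_seconds", q)
}
func (provider) NewWorkDurationMetric(q string) workqueue.HistogramMetric {
	return histogram("workqueue_work_duration_seconds", q)
}
func (provider) NewUnfinishedWorkSecondsMetric(q string) workqueue.SettableGaugeMetric {
	return gauge("workqueue_unfinished_work_seconds", q)
}
func (provider) NewLongestRunningProcessorSecondsMetric(q string) workqueue.SettableGaugeMetric {
	return gauge("workqueue_longest_running_processor_seconds", q)
}
func (provider) NewRetriesMetric(q string) workqueue.CounterMetric {
	return counter("workqueue_retries_total", q)
}

func init() {
	// Must run before any work queue is constructed.
	workqueue.SetProvider(provider{})
}
```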
A
Okay, all right, we'll skip past these work queue ones, since I think that's covered. Let's talk about some others: latency between the virt-launcher pod and the VMI object. Oh, this is for the virt-launcher pod being ready and the VMI object.
C
Just to say, this one we monitor already, I think; let me check, but I believe we have it already.
A
Okay, that's good; less work to do, all right, good. We just need to find it then; it's just information, so we just wire it into a dashboard. Okay, great. So the next one: the latency between the virt-launcher pod being ready and the VMI object. So what about this one?
A
How could we do this? I think, to do this, we need a timestamp for when the virt-launcher pod becomes ready, and then we need to know when the VMI object goes to Running. We might already have this: we have the VMI object going to Running now with Dave's changes, so we just need a diff between the two. I think we must have this.
D
Is that not the Scheduled phase? The launcher pod being ready, is that not when we update the VMI to Scheduled?
A
Yeah, so we go from Scheduling... okay, so it needs to match the VMI Scheduled phase and virt-launcher ready. Yeah, you're right, we need those. So we have this one now with what Dave has changed, and this other one I don't know if we have. Does this get reported on the object?
A
Yeah, well, that's what I want to measure, because there is a gap here, and like I was talking about earlier, it's noticeable: there is actually time between these two, and for none of the reasons we mentioned before.
D
But I'm not sure if this is the one you mean, Ryan. I've seen, even today I saw VMIs where the pod was ready and running, and it took about a minute before the VMI was in the Scheduled status, and it took about a minute before the VMI got updated to the Running phase. And I don't know where that time went. Virt-handler does that update, but it waits for the domain to get going and so on.
D
But I don't know if that can take a minute.
C
Yeah, so once virt-launcher is ready, the VMI goes to Scheduled and, in addition, a label is added to the VMI which makes it visible to virt-handler. From that point onward, virt-handler can see it in its own queue and will do its part of the work and some additional hardware setup.
A
Okay, so there's our diff. We need this timestamp; we have this one from Dave's changes; we need this other one, but we could get it even if we don't have it already. This would be easy to get, because it's the moment we do that action, when the controller does the handoff, we could record it.
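A small sketch of that diff, assuming we read the launcher pod's Ready condition from its status and the VMI-side timestamp (the Scheduled transition from Dave's changes) is handed in; the function names are illustrative:

```go
package perf

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// launcherReadyTime returns when the virt-launcher pod's Ready condition
// last flipped to true.
func launcherReadyTime(pod *corev1.Pod) (metav1.Time, bool) {
	for _, c := range pod.Status.Conditions {
		if c.Type == corev1.PodReady && c.Status == corev1.ConditionTrue {
			return c.LastTransitionTime, true
		}
	}
	return metav1.Time{}, false
}

// readyToScheduledLatency is the gap discussed here: how long after the
// launcher pod became ready the VMI reached the Scheduled phase.
func readyToScheduledLatency(podReady, vmiScheduled metav1.Time) time.Duration {
	return vmiScheduled.Time.Sub(podReady.Time)
}
```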
C
You said yourself that when you increase the rate limits, pretty much all of the issues disappear, so I guess I would not focus on too many small details here. It seems to make sense to finally have insight into the queue, to also see whether the processing rates are fast and whether things are fine there, and to try to capture that first.
A
Yeah, I guess so. The point is that if QPS sort of is the solution... I was just kind of curious, because I don't know how that's going to scale; that's the concern. I'm wondering if we can find some insight into why it is that we're getting rate limited.
A
No, no, we're not going to find that with this. The idea with this one is that if, say, for example, a change was introduced that somehow caused an issue between these two things, so that we had some sort of delay, we would know it. We'd know that this change actually slowed the processing of the VMI relative to the virt-launcher. It's just another measurable that we can look at.
A
Yeah, okay, so then I think on this one we just need the virt-launcher ready time, and then we can take the diff from the Scheduled phase, and then we know, and we report it. Okay, I think that one makes sense; it's pretty straightforward. How about this one: latency between volume creation and the virt-launcher pod?
A
Volume creation in this case, what does it mean? So this would be the PVC, and does this actually make sense? So, with the PVC, the difference between the time it takes to create the PVC from your dynamic provisioner and whenever the launcher pod is actually going... but that's not really tied to the VMI; yeah, it's also external, okay.
A
External, yeah, okay, we'll skip this one. Okay, device plugin latency. This also sounds external, unless there's a way we can measure it. Is there a way we can measure how long it would take for something to attach?
C
That would be more in the kubelet, or... it's difficult. I mean, it also depends on which device plugins you're talking about; in KubeVirt we have multiple different kinds of device plugins, let me put it that way.
A
Okay, all right. Maybe that's something that could be implemented on the external plugin side, then, for both of these. Okay: Kubernetes API call latency and count, made by us.
B
David, there is a metric that the REST client already exposes for the Kubernetes calls and their latency.
C
Yeah, that would definitely work for the tests. In addition, David looked into wrapping the Kubernetes client with an instrumenter.
C
Okay, so you can, you know... what is it called, the HTTP... I forgot the name.
A
Yeah, I think it was also mentioned... was it Jaeger, the project that did some analysis here? Does anyone remember?
E
Jaeger itself does visualization, but they have their own protocol, or use OpenTelemetry, for the actual tracing inside the code. Like you said, spans: this is reconciliation, and you see how long reconciliation takes, and you can do another span inside that, saying this is building a template for whatever. It's less about the API calls specifically; you can instrument specific pieces of your code, but it can also transit across process boundaries: it can pass the context on through HTTP to other clients that also implement it.
B
Just to mention: there was, I would say last year, some big discussion in Kubernetes about enabling Jaeger or not, because Jaeger is for RPC. It's not really for asynchronous calls, and things can get nasty in Kubernetes, for example when you have...
B
I think it doesn't follow the same path, for example when PVCs are created externally, and then you don't have the root of the span going, you know, all the way to the creation. I'm just saying that maybe it's available, but we can also face some big challenges with Jaeger.
E
Yeah, the discussion in Kubernetes was: there is a KEP for adding tracing to the API server and the general control plane, and what's happening now, I think, is that it's slowly tracing only synchronous API calls; it will not add tracing to things like operator reconciliation loops. But that is for the Kubernetes part, because the challenge was that they wanted to trace everything that happens to a resource, and to do that they would have to save the trace context somehow on the resource, because asynchronous stuff happens.
E
For our case, though, adding tracing to our reconciliation loops and our API calls would be no problem, I would say, because we just output spans for every reconciliation loop. We can't link it to the API calls themselves, but we can still trace what's happening inside an operator and annotate it with the respective information, like metadata and such, just to see what's going on in the operator. It's not linked to the asynchronous stuff, and that shouldn't be a problem.
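A minimal sketch of what a span per reconciliation pass could look like with OpenTelemetry (an exporter such as Jaeger or OTLP would be configured once at process start, not shown here); the tracer and span names are illustrative:

```go
package controller

import (
	"context"

	"go.opentelemetry.io/otel"
)

var tracer = otel.Tracer("kubevirt.io/virt-controller")

func reconcileVMI(ctx context.Context, key string) error {
	// One span per reconciliation pass; it is not linked to the asynchronous
	// API machinery, it just shows where the time inside the loop goes.
	ctx, span := tracer.Start(ctx, "vmi-reconcile")
	defer span.End()

	// Child spans can wrap the interesting sub-steps.
	_, tmplSpan := tracer.Start(ctx, "render-launcher-template")
	// ... build the virt-launcher pod template ...
	tmplSpan.End()

	// ... rest of the reconcile logic ...
	return nil
}
```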
A
I wonder if there's some prior art here, because, like you were saying about the queue stuff, I wonder if there's anything about operators already instrumented with Jaeger or something that we can look at.
E
We just added two lines of code for OpenTracing where we were curious about more information, and the exporting was done by, in our case, Google Cloud, but you can do that with Jaeger or anything else. Okay, and...
E
I can look at the tracing part if I have time. I don't know when that will be, but yeah, I wouldn't mind seeing if we can add some tracing to the operator.
E
But I think Roman mentioned David's already looking at the round trip, or at least tracking the API calls, so that's something kind of different, I would say.
C
Yeah, this is the one thing, so that should be easy to extend so that you can also see, okay, we update the node a lot, or we update the clients a lot. You basically just look at the request: you just split the URL, and take it apart from there.
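One way to get that without waiting on anything upstream is to wrap the client's transport, roughly like this sketch; the metric name and labels are illustrative, and in practice you would bucket the URL path by resource rather than keep it raw, to avoid high-cardinality labels:

```go
package client

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/rest"
)

var apiCallDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name: "kubevirt_rest_client_request_duration_seconds",
	Help: "Latency of Kubernetes API calls made by this component.",
}, []string{"method", "path"})

func init() { prometheus.MustRegister(apiCallDuration) }

type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// instrument wraps the rest.Config transport so every outgoing API call
// is timed and labeled by method and request path.
func instrument(cfg *rest.Config) {
	cfg.WrapTransport = func(rt http.RoundTripper) http.RoundTripper {
		return roundTripperFunc(func(req *http.Request) (*http.Response, error) {
			start := time.Now()
			resp, err := rt.RoundTrip(req)
			apiCallDuration.WithLabelValues(req.Method, req.URL.Path).
				Observe(time.Since(start).Seconds())
			return resp, err
		})
	}
}
```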
C
Ryan, can you go back up to the topic from before? No, I mean up on the Google Doc.
A
Oh, I mean, what I'll do, yeah, they're there. What I'll do is: I want to put this into Grafana on our end, and I'll come back with some baselines for this, right, with our scale, and we can get some ideas. Because eventually, with all of these, like you were asking earlier, we want this to be something where we don't want code changes to ever regress this in the future.
A
We want the integrity to always stay intact, and so we want to have baselines for every single one of these. So eventually that's what we'll get to.
B
So, okay, you just mentioned Grafana. I think there was some discussion some time ago about us having some Grafana dashboards, and it would be nice, because we forget all these metrics, and if we can maintain some Grafana dashboard it will be easy to just check and see the metrics there.
A
I thought there was a public one... wasn't there talk about creating one, a public one, that would measure some, I don't know, periodic job or something?
C
Yes, I think that's what you mean, right?
C
Okay, yeah. Since we talked about that, I can give you an update from Federico, who was preparing that in the background. From his side, from the infra perspective, he has everything ready for that, or almost. If we would now create periodic jobs with what you enabled in kubevirtci, Marcelo, so that we can deploy Prometheus, and we label the jobs accordingly, the metrics would already be collected and show up in the Grafana dashboard.
C
What we do not have yet is direct access to Prometheus, so that developers can play around with the metrics themselves. We have the load balancer already, but the kubevirt.io domain is now owned by the CNCF, and since they own our domain it's a little bit tougher to get a DNS entry than before. So you can't actually reach it, because there's no DNS associated.
A
Okay, cool. So it's something we can leverage. Okay, all right, we have two more to go. So: VMI pod metrics. This is CPU and memory usage, open Go routines, GC times.
A
Okay, what's a good way we can measure this? I think there's some... what's it called, I just lost it... there's a project that Kubernetes uses to leverage for this.
C
Yeah, I think, when...
B
Many of the metrics that cAdvisor exports... when we install the Prometheus operator we can see that cAdvisor, you know, reads the cgroups directly, and I'm not sure if it will give us more, because the thing that it's not showing is the node metrics.
C
There's the node exporter too, but perhaps you meant the node exporter. But I think what we really want to get is the Go runtime metrics from the virt-launcher pods, and this seems to be what you want, Ryan, right?
C
We could talk about collecting the metrics with virt-handler and exposing them together with the others, like we do with the VM metrics.
E
And I think those Go metrics are actually more useful than what we get from cAdvisor, especially because we have this story going on where we want to make sure our VM overhead is small and correct, and I think those metrics help us see more of how we're doing there.
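For the Go runtime side, the Prometheus client already ships collectors for exactly these numbers (goroutine count, GC pause times, heap usage). A sketch of exposing them from a process such as virt-launcher, with the listen address and endpoint as placeholders, whether they end up scraped directly or re-exported through virt-handler:

```go
package monitoring

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// serveRuntimeMetrics exposes goroutine counts, GC timings, heap usage and
// basic process stats (CPU, RSS, file descriptors) on /metrics.
func serveRuntimeMetrics(addr string) error {
	reg := prometheus.NewRegistry()
	reg.MustRegister(collectors.NewGoCollector())
	reg.MustRegister(collectors.NewProcessCollector(collectors.ProcessCollectorOpts{}))

	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	return http.ListenAndServe(addr, mux)
}
```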
A
Yeah, definitely. Okay, that makes sense to me. Okay, what's the next one? Latency for virtual machine instances and virtual...
F
...machines. So this is API calls made to us.
A
So, like, when we're doing a get or a list of virtual machine instances, how long does that take? This is just to get an idea of...
A
Okay then. Okay, if you have a link, Kevin, that'd be great, yeah. Just to at least see what's there, and if there is something there that's just not hooked up, we can hook it up, or if it is hooked up, maybe we just need to...
A
Okay, great. Okay, that covers everything. I actually wanted to add one more thing, just so that everyone is on the same page. When everyone's doing development, with make cluster-up and make cluster-sync, right, and you're enabling Prometheus and all that stuff to do the testing, to work with stuff like this, how do people do it?
A
How are you testing with the dashboard? If you're doing it that way, developing with make cluster-up and cluster-sync, do you have to port-forward and all sorts of stuff to make it work, so you can watch the dashboard or watch Prometheus?
C
So I guess with make cluster-up, Marcelo included a Grafana dashboard there, which you can...
A
Well, let's just say, if I wanted to... say I was looking at developing this, right, and then, using make cluster-up and cluster-sync, I have the dashboard enabled, right? It sounds like everything's there, I have everything there. Is this what people are doing?
C
Yeah, just a minute.
E
Just for information, I added a metric that I just checked; it is, at least in my opinion... but I think it's a Kubernetes metric, so it should work for everybody.
C
It should be scriptable, so Leona can use it in scripts, and there is also a way to use a fixed port, but the disadvantage with the fixed port is that you easily end up with collisions, right.
A
So here's what I'm doing... probably, yes, that's true. So this is what I'm going to do, and then I'll try that. Okay, so that way... oh, and then we're close to something else, yeah.