From YouTube: SIG Instrumentation 20200107
Description
SIG Instrumentation January 7th 2020
A: Recording started. Welcome, everyone; today is January 7th, 2021. This is the SIG Instrumentation community meeting. We have a couple of items on the agenda. Elena, do you want to kick it off?
B: Okay, the first one on here. Oh great, now my internet connection's unstable; hopefully you can still hear me. Okay, so for this one I just wanted to mention: this is not the first sort of thing like this that I've seen, but basically a critical metrics regression happened. SIG Node got tagged; they didn't notice.

I guess David very kindly tried to approve the fixes and whatnot, but they just didn't go through, and they sat for like three months. We've had two releases where all of the machine metrics from cAdvisor have been totally broken, and no one tagged SIG Instrumentation.

And it's not really a priority for SIG Node, so, anyway, I just wanted to give people a heads-up. As soon as I noticed that was a thing, I reviewed and dealt with the PR posthaste, and I backported it to both of the affected versions. So it should be on its way to being... I think it's fixed already; it'll probably get released in the next patch releases for 1.19 and 1.20.
A: So I'm not sure that the thing that you described was incorrect. Actually, that sounds like it makes sense, right? Because component owners own their own metrics. For us, our duty is more about the shape and form, and about conforming to certain things. But if someone deletes the HTTP route to their metrics endpoint, yeah, their metrics are going to disappear.
B: But we're not doing anything like that right now. So I don't know if this is kind of a segue into: we really need to pick stable metrics and actually do some sort of testing on them to make sure that we are catching those regressions. But that's kind of the segue into the other topic that I have on... yeah.
C: I think this makes perfect sense. It's just that we don't have any stable metrics, so yeah, we'll...
A: I actually have somebody, I wanted to ping them right now, who wants to do it for API machinery.
B: So do we need to go and tell SIGs, like, "yo, you need to pick your stable metrics," or are we actually... well, this is kind of segueing into the next topic. So before we do that, just to wrap up on this one: it's fixed, yay. I think that we need some sort of strategy, because, fundamentally, when metrics break, people aren't like, "oh, you know, SIG Node is incompetent"; they're like, "Kubernetes metrics suck." So I think it...

It's definitely a concern of SIG Instrumentation, and I think that if people had tagged instrumentation, it probably would have been noticed at a triage meeting and dealt with posthaste, that kind of thing.
A: There's a couple of nuanced things about the node one specifically which make it a little bit harder to tie in with the stable things, mostly because they use the custom collectors, which are dynamic in nature, and which we explicitly excluded from being able to be marked as stable, because things like what probably happened here are more apt to happen.
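To make the distinction concrete, here is a minimal sketch (illustrative Python, not the actual Kubernetes or client_golang code) of why custom collectors resist stability guarantees: a declared metric's name and labels are fixed at registration time and can be checked statically, while a custom collector computes its descriptors at scrape time, so what it exposes can silently change or vanish.

```python
# Sketch only: a "declared" metric vs. a dynamic custom collector.
# All class and metric names here are illustrative assumptions.

class DeclaredGauge:
    """Name and labels are known up front and never change."""
    def __init__(self, name, labels):
        self.name = name
        self.labels = labels

    def describe(self):
        # Always the same descriptor -- tooling can verify this statically.
        return [(self.name, tuple(self.labels))]

class CustomCollector:
    """Descriptors depend on whatever the source reports at scrape time."""
    def __init__(self, discover):
        self.discover = discover  # callable returning {metric_name: labels}

    def describe(self):
        # Output can differ between scrapes; nothing is checkable before
        # the collector actually runs.
        return [(name, tuple(labels)) for name, labels in self.discover().items()]

declared = DeclaredGauge("node_cpu_usage_seconds_total", ["cpu"])

machine_state = {"machine_cpu_cores": ["node"]}
dynamic = CustomCollector(lambda: machine_state)

print(declared.describe())  # stable: always the same descriptor
print(dynamic.describe())   # whatever the source exposes right now

# If the source stops reporting (e.g. a registration bug upstream),
# the metrics simply vanish, with no static check to catch it:
machine_state.clear()
print(dynamic.describe())   # → []
```

This is the failure mode described above: the cAdvisor metrics disappeared without any declaration-time check being able to notice.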
C: We did, last time we discussed it, though, we said that there are not... so, the metrics that are there...

I don't think we said we necessarily need parity, but that what is there is not sufficient; that's what we said. I don't think we ever necessarily defined what completion would look like. I guess that would be the best action point. If we can have parity, amazing; I'm not sure if that's necessarily the goal.
B: I mean, the good news is that, since I switched jobs within the past month at Red Hat, I am now going to be significantly more involved in SIG Node and can help with that. Mostly I was just trying to suss out... Han, I think I agree with your assessment.

This is one of those specialized, weird cases where the metrics are totally critical and they're super special-cased, and we really need to figure out a strategy for what to do with them, but it's not necessarily a generalizable one. I think I agree with that, but I think nonetheless it will be good and valuable to talk about what we want to do for stable metrics, and I guess for KEPs in general. I didn't put a KEP review item on this agenda.

I figured we'd wait until the next meeting, but yeah, that's all I had for that one. I don't know if anyone has anything to add.
A: This is the 31st... so we should probably try to get a KEP for this by the 31st. It almost sounds like Elena is volunteering to write a KEP.
D: I did want to point out one thing, just if people are curious about the details of why the regression slipped through: cAdvisor has its own end-to-end testing that it runs in its presubmits, which checks things like the Prometheus metrics and all of the random JSON endpoints it has, but Kubernetes doesn't have tests for cAdvisor's specific endpoints, because, at least thus far, it's assumed that those have already been tested.

As part of the cAdvisor release process, it does have tests for the Summary API, which is usually enough to exercise the "is cAdvisor working in Kubernetes" question. But this case was just a failure to register a bunch of metrics, and so that was why it wasn't caught in any of our presubmits.
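The gap described here, a registration failure that no existing test exercises, could be caught by a check that scrapes the text-format metrics output and asserts the expected metric families are present. A hedged sketch follows (the metric names and the idea of such a presubmit are illustrative, not an actual Kubernetes test):

```python
# Sketch of a presubmit-style check: scrape Prometheus text-format output
# and flag expected metric families that are missing. The metric names
# below are examples, not a definitive list of what Kubernetes verifies.

def metric_families(exposition_text):
    """Return the set of metric family names in a text-format scrape."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        # the family name is everything before the first '{' or space
        names.add(line.split("{")[0].split(" ")[0])
    return names

def check_expected(exposition_text, expected):
    """Return the expected families missing from the scrape, sorted."""
    return sorted(set(expected) - metric_families(exposition_text))

scrape = """\
# HELP machine_cpu_cores Number of logical CPU cores.
# TYPE machine_cpu_cores gauge
machine_cpu_cores 8
container_cpu_usage_seconds_total{container="app"} 12.5
"""

# A non-empty result means a family silently disappeared -- exactly the
# failure mode that slipped past the Summary API tests.
print(check_expected(scrape, ["machine_cpu_cores", "machine_memory_bytes"]))
# → ['machine_memory_bytes']
```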
A: I mean, this is also why I kind of wanted that thing where... again, I'm going back to the thing that I argued with Frederic about, with the metric and the descriptors. Because, basically, if we could have a listing of all of the metrics that were registered, then it would be easy to just dump it into something and generate: "hey, look, these are the things, here's the metadata." I mean, you could also expose this through a debug endpoint or whatever.
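The idea being floated here can be sketched very simply: if every registered metric carried its descriptor metadata, a debug handler could just serialize the registry, and diffing that dump between releases would surface disappearing metrics immediately. This is an illustrative sketch of the proposal, not an existing Kubernetes endpoint; the endpoint path and field names are assumptions.

```python
import json

# Sketch: a registry that records metadata at registration time, plus a
# dump function standing in for a hypothetical debug endpoint.

REGISTRY = {}

def register(name, help_text, metric_type, stability):
    # In real code this bookkeeping would live inside the
    # instrumentation wrappers that components register through.
    REGISTRY[name] = {
        "help": help_text,
        "type": metric_type,
        "stability": stability,
    }

def debug_dump():
    """What a hypothetical /debug/metrics-metadata handler could return."""
    return json.dumps(REGISTRY, indent=2, sort_keys=True)

register("machine_cpu_cores", "Number of logical CPU cores.", "gauge", "alpha")
register("apiserver_request_total", "Counter of apiserver requests.", "counter", "stable")

print(debug_dump())
```

Comparing two such dumps (old release vs. new) reduces "did any metrics vanish?" to a set difference over the JSON keys.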
B: For Han: you mentioned a date for the enhancements freeze. I just checked everything in SIG Release; they have not released any dates yet, so...
A: Okay, I don't know, because I just got pinged by David Eads in the morning to look at something, and I asked him when he needed me to look at it, and he said maybe...

He said, "just look at it before KEP freeze," and I said, "what's KEP freeze?" and he said January 31st or so. Oh, he says he's guessing January 31st or so. So, maybe. Okay, okay, I was wrong, I just...
B: I was just curious, because I was looking to update the dates in our doc for the next release, and they haven't been published yet; otherwise I would have done it.
C: To wrap up this item, it sounds to me like we should finish up that KEP to get whatever we call feature parity in the kubelet metrics endpoint with the kubelet resource metrics endpoint. Then I think we can discuss, maybe, for these particular metrics...

We can even have a special strategy, if we can find something generalizable, to mark these as stable, because I think these do make sense to have stable. And then I think we can talk about other stable metrics for 1.21, potentially.
D: So we do test the resource metrics endpoint; it's just that... the thing to keep in mind is that the resource metrics endpoint is a very, very, very small subset of the metrics cAdvisor provides, right? Very basic CPU and memory. The metrics that we're missing are machine metrics, which include topology information and a really wide variety of stuff that probably doesn't belong in the resource metrics endpoint.
B: That's in their enhancements repo; the one that's pre-stability is the metrics overhaul.
C: But what Han said, I think, is interesting to at least think about. Maybe we can talk about that once we've fleshed out that KEP a little bit more, but I think I could agree to marking the ones that we have today as stable in the... yeah, right, we're going this way.
B: So, and I think... let's see, does the KEP metadata say this is... no, it's "implementable," not "implemented." Well, this is on my list of node KEPs that need attention, so... all 50 of them, or whatever. Okay, that makes sense to me. I think that we have a reasonable path forward there. So then my next question is: what do we do about stable metrics in 1.21? Are we picking them? Are we telling SIGs to pick them? SIGs have to... what's the approach here?
B: Okay, so this process is in the KEP; we need to tell them about it, and we probably need to send out some sort of communication to the project. Yes.
C: And I would expect, if we haven't already, I would expect this to be a document in the community repo under the developer docs.
A: They will require a second instrumentation approval because of the auto-generated stable metrics thing (and thank you, Merrick, for that). So yeah, so it will, yeah. Definitely we will have to approve.
C: On a similar note, I think we started a conversation at the end of last year about a couple of candidates that we feel should probably be proposed, or at least be reviewed to be proposed, eventually. I believe Han wanted to introduce a new, generic one that exposes storage object count, so that we don't have one that is specific to etcd. I think we could already add this; I mean, I don't think this needs a KEP or anything. Yeah.

Because a lot of people use that etcd one, I would add the new one in parallel, I would say. Yeah, yeah. Of course, no, I'm not... I'm...
A: I'm not going to delete the counts, yeah; that's not happening.
C: But then the other ones are, obviously, the apiserver metrics, and I would even say the scheduler metrics are probably a good candidate as well, like scheduling latency.

That's fine. As I said, I primarily want to pick some to review, so that we can make the changes that we would want to make, so that then, in the following release, we could mark them stable.
E: I just wanted to mention that there was also an idea bounced around about a Kubernetes reliability working group that would work on and introduce more SLOs. So, basically, as a comparison: mainly Kubernetes, but in particular the scalability team, proposed and maintained some official Kubernetes SLOs, and looking at their definitions, they're pretty... not precise in the definition of the metric, because they don't define the metric. And based on my personal contact with scalability, I guess this is defined like... they have some Prometheus instance that they run and define them on, but it's not public.
C: That makes sense. We actually have the kubernetes-mixin, and that already has a bunch of common Prometheus SLOs. So I think it would only make sense to have all of these definitions in a central place, and this is already one that is widely used in the community.
B: So, yeah.

I have been trying to look through to actually find the thing that talks about how to make a metric stable. It's split into three or four pieces, and the only one that talks about metric stability, which also talks about the static analysis, that one is marked as "implemented." So... but we didn't know.
C: I would say: let's introduce a process after we've done this a couple of times. Yeah, it seems a little premature at this point.
A: Let's... okay, let me just... I can create the storage object one pretty easily, so let me just create a storage object metric, and then I'll submit a PR, and then, in a separate PR, I will just upgrade it to stable.

Yeah, sure, sure, well, yeah, I will test the metric, but it will... I'm literally going to define it in exactly the same way that the etcd object counts metric is defined, and it will then...
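The plan above, a generic storage object count gauge added alongside the etcd-specific one and defined the same way, can be sketched as follows. This is an illustrative sketch, not the upstream implementation: the `GaugeVec` stand-in, the generic metric name, and the label name are all assumptions.

```python
# Sketch of adding a generic storage object count metric in parallel with
# the existing etcd-specific one, rather than deleting the old metric.

class GaugeVec:
    """Minimal gauge-with-labels stand-in for the real client library."""
    def __init__(self, name, help_text, label):
        self.name, self.help, self.label = name, help_text, label
        self.values = {}

    def set(self, label_value, value):
        self.values[label_value] = value

# existing etcd-specific metric, kept for compatibility
etcd_object_counts = GaugeVec(
    "etcd_object_counts",
    "Number of stored objects per resource.",
    "resource")

# new generic one, defined in exactly the same way but not etcd-specific
# (the name below is an assumed placeholder, not the final upstream name)
storage_object_counts = GaugeVec(
    "apiserver_storage_object_counts",
    "Number of stored objects per resource.",
    "resource")

# during the migration, both gauges are updated in parallel
for gauge in (etcd_object_counts, storage_object_counts):
    gauge.set("pods", 1342)

print(storage_object_counts.values)  # → {'pods': 1342}
```

Keeping both metrics in parallel matches the point made earlier in the discussion: a lot of people depend on the etcd one, so it stays, and only the new generic metric is a candidate for promotion to stable in a later PR.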
B: So, Han, can I ask you, since you're going to do the testing of this: once you finish your testing, would you send the project-wide communication to k-dev, saying, "we're doing this this release, please participate, here's how"?
A: About... oh, it's just... oh yeah, that we're setting it! No, no, I'm going to talk about it with API machinery. If the leads there are not okay with it, then I...

So, okay, then, yeah, okay, I will just... I will, yeah, talk to some people, make sure that they're okay with it, and then I will add the metric.
D: All right, I think that's everything that was on our agenda for today. Happy New Year, everyone, and you get three minutes back.