From YouTube: SIG Instrumentation 20210819
Description
SIG Instrumentation Bi-Weekly Meeting August 19th 2021
A: All right, it's August 19th. This is SIG Instrumentation, and as Han has just mentioned, we only have one item on the agenda, which is mostly a discussion around the revisited metric stability classes. Do you want to kick it off?
B: Sure. To update: Elena and I went to SIG Architecture and brought this up. Basically, people were on board. I think there's probably going to be a little bit of bikeshedding around the actual stability classes and the guarantees, but I think that's to be expected, so we should probably get to what the stability classes are actually going to mean semantically.
A: Well, were there any comments on the existing proposal? We had previously discussed internal/debug, alpha, beta, and stable, I think, right?
B: They liked the two additional stability classes, the internal (or development) one and beta. I also brought up the lagging question.
B: So basically, we also went to WG Reliability, and Davide suggested that we lag metric stability classes a release behind feature releases. The reason for this is that they don't actually start mandating metrics until a beta release, and even after a feature goes GA there's not enough information about usage; you don't get widespread usage until something is actually GA. So it doesn't make sense to GA a metric without knowing how the stuff is actually being used.
A: To what capacity, actually? I mean, if we go with the same naming as features, then people are going to have similar expectations for those names, right? But it sounds like we're expecting situations where we GA a feature with a beta metric, and we're going to have to remove or significantly change that metric, right?
A: That has the potential to confuse end users. They can expect the metric name to remain the same. Okay, if that's the guarantee that we're giving, I think that would be okay. I need to think about it a bit more, but this could be workable, at least as long as we have some pretty concrete rules for it. I think anything is really okay, as long as the expectation is not simply that it's exactly like features.
B: Well, the metric itself, so the metric with that name, will not disappear for N releases, even in beta. In beta, however, labels can be added or removed. Okay.
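The guarantees sketched above can be summarized in code. This is a minimal illustration of the proposal as discussed, not the actual API in `k8s.io/component-base/metrics`; the type and method names here are invented for the example.

```go
package main

import "fmt"

// StabilityClass models the proposed metric stability classes from the
// discussion. The guarantees encoded below follow the meeting's sketch:
// beta freezes the metric name but not the labels; stable freezes both.
type StabilityClass int

const (
	Internal StabilityClass = iota // developer-facing, no guarantees
	Alpha                          // may change or disappear at any time
	Beta                           // name kept for N releases; labels may change
	Stable                         // name and labels frozen until deprecation
)

// NameGuaranteed reports whether the metric name is guaranteed to persist
// across releases.
func (s StabilityClass) NameGuaranteed() bool { return s == Beta || s == Stable }

// LabelsGuaranteed reports whether the label set is also frozen.
func (s StabilityClass) LabelsGuaranteed() bool { return s == Stable }

func main() {
	for _, s := range []StabilityClass{Internal, Alpha, Beta, Stable} {
		fmt.Printf("class=%d name=%v labels=%v\n",
			s, s.NameGuaranteed(), s.LabelsGuaranteed())
	}
}
```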
A: So I think I'm okay with trying it exactly like this, for the one and only reason that people tend to already use beta features pretty heavily, and so I feel like by the time we get to that point the metrics are going to be exercised enough that there aren't, you know, wild leaks.
B: Also, you know, when people add metrics to a KEP, they don't have to be new metrics. They can use existing metrics as a way to measure their own feature, right? Like the apiserver request latencies, or duration seconds, or whatever, is a perfectly acceptable metric to measure your feature.

A: Are you talking about production readiness right now?

B: Yeah, production readiness mandates metrics for beta features, and production readiness is, in my mind, an extension of the KEP.
B: Yeah, I'm saying that we shouldn't necessarily expect a proliferation of metrics. People could be ahead of the curve: they could be using stable metrics for a beta feature, or stable metrics for an alpha feature, because they're using existing metrics.
B: I think at the outset they should be internal, because I'm not sure how people are going to use them. However, given the wide scope and nature of API Priority and Fairness, you're going to want to have some public, stable metrics for gauging how your requests are getting throttled or prioritized, or whatever. You're going to want some set of metrics for that, because people's clusters are heavily affected by that feature.
A: I don't see why this wouldn't be able to be connected back to request errors or latency.
B: It's going to affect latencies, definitely, like the duration-seconds distributions; they're going to be affected by it. But how they're being affected, you don't really know; you just know that something is happening. I mean, how do you tweak the settings you have? You have no data, really.

A: I agree, and...
B: That's not internal.
B: No, no, no, because the difference here is that internal, when we had discussed it, was more like a developer flow, someone who's developing Kubernetes. But in this case this feature affects end users and cluster admins, who would want it.
B: At a high level, that's the number you want, and you don't want it to be internal, because it's not an internal concept. It's literally what the thing is doing and how it's affecting your cluster, and you should be able to SLO off of this, because API Priority and Fairness, when it's being exercised, slowly degrades your cluster.
D: Yeah, I think what I was hoping to get at is that it seems like we're putting cluster operators in a bit of a weird position if we launch a GA feature without metrics that are stable. It almost feels like there should be some very small set of metrics that go stable at GA, and then lots of metrics that come afterwards, a lot of which are debug and will never go stable.
B: In this case you're going to want some set of metrics, I agree. People are going to be using request durations, which is a stable metric, to know what the latencies are. But the saturation metric is kind of a tricky one, and you're not going to know which dimensions it even makes sense to be measuring until people start using it. It's starting to be used now, and we're realizing a lot of weird things about it. So I think that's exactly why.
A: This is basically where we're describing a similar situation: right now you can use the proxy metric of request duration, but it would be good to figure out something that describes the situation more clearly.
A: I mean, you can still go through metric stability after a feature has gone GA. Yeah, for sure. That's what I'm saying; that's what you just described. Now we're seeing more widespread adoption of this feature, and we're realizing...
A: We need better insight into it, and so we create a new alpha metric, just as an example. Even if the feature went GA already, we could still experiment with these metrics. This is going to happen all the time: we're going to realize there are some more aspects that we would like to understand better, even if that's just an internal metric at first. Maybe one day we'll realize, okay, actually this is something that people want to SLO off of, so we shouldn't...
B: No, no, we have static analysis, so we can basically enforce beta metrics being promoted; we just have to decide the number of releases.
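The promotion rule being negotiated here (promote or deprecate after N releases, with deprecation buying one extra cycle) could be checked mechanically. A minimal sketch, assuming the two-release threshold floated later in the meeting; the function name and return strings are invented for illustration, not part of any real tooling:

```go
package main

import "fmt"

// requiredAction sketches the escape-hatch rule discussed in the meeting:
// once a beta metric has existed for maxBetaReleases, its owner must either
// promote it to stable or deprecate it, and deprecation buys exactly one
// more release before removal.
func requiredAction(releasesSinceBeta int, deprecated bool) string {
	const maxBetaReleases = 2 // the number floated in the discussion
	switch {
	case releasesSinceBeta < maxBetaReleases:
		return "keep as beta"
	case !deprecated:
		return "promote to stable or deprecate"
	case releasesSinceBeta == maxBetaReleases:
		return "deprecated: one more cycle"
	default:
		return "remove"
	}
}

func main() {
	fmt.Println(requiredAction(1, false)) // keep as beta
	fmt.Println(requiredAction(2, false)) // promote to stable or deprecate
	fmt.Println(requiredAction(2, true))  // deprecated: one more cycle
	fmt.Println(requiredAction(3, true))  // remove
}
```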
A: That's then up to Production Readiness to decide, right? Right, yeah. I think that's reasonable. It's really hard to put a finger on at what point people really start using a metric.
A: I want to say, in my experience, after four releases people definitely start using it, but I would say two releases is probably a reasonable number.
A: Three, maybe. Damien may have some idea about this as well, but with the kubernetes-mixin, I want to say we start using the metrics pretty much right away, when they're beta features, and then people slowly start adopting that. So yeah, I think two releases is a reasonable number for promotion to stable, because at that point the kubernetes-mixin will certainly have exercised them.
B: We can provide an escape hatch. We can say that after two releases you either have to deprecate your metric or you have to promote it, and deprecation just buys you another cycle.
B: I don't think so; I think it depends on the feature. I could see API Priority and Fairness probably using it, but a lot of other things, like scheduler stuff, are not super contentious: these are the metrics that we want for some of those features. I think some of the auth stuff is pretty simple too. You will know after two releases what metrics make sense.
A: Yeah, I think that sounds reasonable. Are there any other points that we had open?
B: Yeah, one more thing: we're going to be able to run static analysis against everything except the custom collectors, because those are dynamic, and there's a set of other metrics which are dynamic. So those automatically have to be internal.
A: Runtime resource metrics are going to move to the CRI stats endpoint anyway, and then we practically have a stable list there as well. Those are the only collector metrics I would be worried about, but it seems like we already have a solution on the horizon for resource metrics. So to me this sounds fairly reasonable, actually.
B: Yeah, it kind of sucks; you kind of want the list. I want to auto-generate all of the metrics. I would like to do it statically, but I think you have to do it at runtime because of the dynamic nature of some of the metrics.
B: You know, at registration time; I'm not talking about scraping the endpoint. We basically have a wrapper around the registry, so we know what gets registered.
A: A description, right? Yes, but there's a collect-descriptions method; I think it's called Describe or something, and it will return descriptions. But there's no contract that it has to return the exhaustive list of what it potentially knows, and there are collectors that dynamically return the descriptions.
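The problem described here can be sketched with a stripped-down stand-in for the Prometheus collector pattern. This is not the real `prometheus.Collector` interface (which sends `*prometheus.Desc` over a channel from `Describe` and pairs it with a `Collect` method); the simplified `Desc` and collector types below are invented for illustration:

```go
package main

import "fmt"

// Desc is a stripped-down stand-in for a metric description: just a name.
type Desc struct{ Name string }

// Collector mirrors the shape of a Describe-style interface: it sends the
// descriptions a collector *may* emit. As noted in the discussion, there is
// no contract that the list is exhaustive, which is why fully dynamic
// collectors defeat registration-time auditing.
type Collector interface {
	Describe(ch chan<- *Desc)
}

// staticCollector knows its metrics up front, so Describe is complete and
// the metrics can be audited for stability at registration time.
type staticCollector struct{ descs []*Desc }

func (c *staticCollector) Describe(ch chan<- *Desc) {
	for _, d := range c.descs {
		ch <- d
	}
}

// dynamicCollector discovers metric names only at collect time (for example
// from an external source), so Describe legitimately sends nothing.
type dynamicCollector struct{}

func (c *dynamicCollector) Describe(ch chan<- *Desc) {}

// auditable gathers whatever a collector declares at registration time.
func auditable(c Collector) []*Desc {
	ch := make(chan *Desc)
	go func() { c.Describe(ch); close(ch) }()
	var out []*Desc
	for d := range ch {
		out = append(out, d)
	}
	return out
}

func main() {
	s := &staticCollector{descs: []*Desc{{Name: "apiserver_request_total"}}}
	fmt.Println(len(auditable(s)), len(auditable(&dynamicCollector{})))
}
```

The auditor sees one description from the static collector and none from the dynamic one, which is why the latter category has to default to internal.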
B: Right, custom collectors. For those we're not going to be able to do anything about it, but for the set that is registered, this is basically as comprehensive as you can get.
C: Okay, but shouldn't the custom collectors implement the interface that we have, with the stability API and everything?
C: I was assuming they implement not the normal register interface, but the one you implemented in component-base. So I think for that one you can actually get the number of...
B: There's a loop or something; yeah, it's terrible. So those, I mean, will be permanently internal. One should not do that, and we will audit them in the future and make sure it doesn't happen again, but yeah, they exist, and we should probably try to deprecate those particular metrics.
B: The set of metrics that we can't determine at registration time, we should try to deprecate all of them, basically.
B: Yeah, cool. Then I guess we have agreement on most aspects of this. Well, that was a productive session, agreed. Cool. Okay, then I guess we can work offline on the KEP.