From YouTube: Kubernetes SIG Node 20210630
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A
Hello, everybody, and welcome to today's edition of the Node CI subgroup and triage session. It is Wednesday, June 30th, 2021, and we have one agenda item from Francesco, and I suspect more agenda items will probably trickle in. Do we have Francesco?
B
A
Yeah, I've been reviewing his PRs. I know that there's a bunch of different problems that we have on the node serial test, so that one looks like it got approved, and then here's another one helpfully fixing this test, and I think I looked at this. This one was funny. I was like, there's a sleep in this random bash command, what's going on there? Looks like the job is still failing, but I guess there's probably more asynchronous work to be done here.
C
C
A
And I guess, as far as the board goes, it looks like we have a new PR, but other than that, what do we want to take a look at today? We have a bunch of flakes. I think that the... let me take a look at kubernetes-dev.
A
So the release team has been sending out signal reports. I usually look at them, so you can see it's red, and we had a few signaled things in here that are, I guess, marked as in progress. Luckily, no new things, but there are, I think, five in here, and these should probably be...
A
I don't know if the issues currently have priorities on them, but if not, we might want to mark these. We've got five things, and I think they're gonna start marking these as flaky and excluding them if we don't fix them. That's the release team's lever. And so, let's see, we've got the startup probe one.
A
I have not even seen this one. It doesn't look like it's assigned to any... well, assigned to Francesco, apparently. So I guess we've got this one, which I'm unfamiliar with. This one looks like Matthias is on it. This one is me, I haven't looked at this one. And rtm. So I guess, do you want to go through them? Do you want to start with this one?
C
C
Something like that, and we had a similar issue under the same test, and the fix was just to assume that zero is a normal value for it. But probably it's not so.
A
C
A
I agree that it's probably not normal, and I think this is catching an actual symptom, because I know we've had some bugs in OpenShift reported along these lines, where cAdvisor just isn't returning stats sometimes, and I think that Harshal had a fix for that. I remember there being an OpenShift bug. Because David Porter's the cAdvisor maintainer, should we cc him as...
D
A
This has been flaking on master for a long time, and I know I've seen this, certainly in the latest OpenShift. I think we've had complaints from customers about metrics just disappearing, and because most of the software treats zeros as actual zeros, they're like, "There are these drops in my graphs, and this doesn't make any sense." So I think that this one probably should get some eyes.
C
A
Yeah, let me just bump this one to the top, to kind of put some sort of priority order on these things. I don't know that this one is at the point where it's critical-urgent, but I know that this is constantly an annoyance. I see it flake on my PRs, and I think that there's an actual underlying behavioral bug, so we may be able to link some of those if we go through a bunch of other bugs. But yeah, I think this is...
A
We should just try to keep making progress on this. So thanks for taking a look at it, and then hopefully we can maybe pull in David Porter or Harshal or other folks. Let me take a note for myself to make sure that I can maybe pull in some OpenShift bugs.
A
What is this URL? I'll try to remember to do that, just so we have them sort of all in one place. Okay, so I think that takes care of that one. And then for this one, so this one's apparently assigned to me. I haven't had a chance to look at this.
A
So, let's see what... oh, that looks maybe pretty frequent.
A
I don't know how much of that came through, but initially, when I looked at this, it was not flaking very often. It appears to be flaking a lot more often, so I haven't actually done a deep dive into this. Should I remove myself? Does anybody want to take a deep dive into this?
C
A
A
Well, so yeah, Mark, I think, might be on leave still, so we should... I'll paste this in.
D
A
D
The name of somebody who can work with.
A
A
A
A
Moving it into done, because I don't think there's anything else for us to do. If they want to send it back, then so be it. Okay, "Pods should run through the lifecycle of Pods and PodStatus."
F
Yeah, like a metronome, but yeah, strange.
A
A
A
Is this a TestGrid link? No, but I might be able to click on one of these and get the TestGrid.
F
F
It's, yeah.
A
F
One day, and you know...
A
I certainly trust you. I'm not gonna... I'm gonna up the priority, because this is a failing test and we're getting pings from SIG Release to look at this. I'm gonna, yeah.
C
F
A
Perfect. Okay, I have the next steps on here, so I think we should be good to go. Thanks. Okay, let's look at this one: "Pods should support pod readiness gates." I don't know anything about this feature, and apparently it is flaking, and here's a TestGrid link. Oh, that did not work.
A
A
A
This is super, super old. Looks like I have not seen this failure at all, so maybe let's check it in triage. Not loading. Close these.
A
A
Okay, so we actually have... and it's failing on CRI-O and containerd. So let me add this more recent triage.
A
A
Okay, hopefully that will be enough for Francesco to work with. Let me remove Derek and Dawn as...
A
A
A
Okay, where is the... what's the PR that we're waiting on? Oh, it's Clayton's pod lifecycle one, yeah. Okay, yes, that's fine. I think we can just keep waiting while we wait for that to land. I think we're currently waiting on land, how to give that the LGTM.
A
So, okay, no update for that one. That sounds good to me. We've looked at all of our tests. We've done our duty. So is there anything else in the board we want to go through today?
D
D
Make that's better.
A
Okay, then, probably a kind/failing-test and...
A
D
Let's merge, commit and still cool like.
A
A
He answers my Slack messages, amazing. So, let's see: end-to-end node, fix the device plugin test. This is a Francesco PR.
A
This one I should probably review, because I've been looking at a lot of... yeah, I already did a review on this one. So let me just assign myself.
C
We still have some races. We don't have a lot of them, but we still have some, just because of the recent introduction of the new test context flag to disable the automatic start of the kubelet. I just use it. I'll...
A
Throw important-longterm on here, and who is the right person to review this?
C
A
A
D
Yeah, I think one thing from bug triage: with the bug scrub, we found so many old issues where we need to have test coverage, which is great, and we have these issues on the board now. So whenever you have a chance, take a look. They weren't properly attributed, that's why we didn't find them before. The bug scrub really helped.
D
Editing the soak test, the soak test should be quite interesting. I think somebody already commented there that they want to take it, but they didn't know how to start, so I gave some pointers.
A
Great. Do we have anything else for CI-related stuff? Because if not, I would love to talk more about the bug scrub, because I think that it has unlocked and/or created some work for the triage second half of the meeting.
A
Okay, hearing nothing more on CI. So basically, we've now gone and scrubbed all the bugs, which is great. We started with like 450 bugs and we closed 130 of them. So we still have an absolutely outrageous number of bugs, but they're almost getting into manageable numbers. So a thing that I think we should consider doing, and I'm going to start...
A
I didn't have time quite yet, but what I'd like to start with, as some proofs of concept, is to create some node-specific bug management boards, so one for features and everything else, and one just for bugs, and ensure that we're looking at that stuff weekly. We're not doing that right now, obviously. And I'm not entirely sure yet how we necessarily want to categorize them, because they don't necessarily have the same workflow as we have for PRs, where you initially send the PR, and then you assign a reviewer, and then it gets reviewed, and then you have an approver, and so on. With issues, I think there's probably going to be more states, like needs to be triaged.
A
We're waiting on information, we've confirmed the bug, and it's in progress, that kind of thing. But I haven't really done any proofs of concept for that yet. So I'm gonna try to make some boards and play around with it, and I think probably we want to start with bugs and then move into our feature backlog, which is quite big. Is anybody interested in working with me on that?
D
I believe we need to keep the number of bugs manageable. Last time, almost a year ago, I suggested doing that, and the feedback from the community was that we are not ready to take new work. Yes.
F
A
A
Then things are all mostly up to date in terms of labeling, and so we should be able to pull them onto boards and not have to look at every single thing while we're doing it. And I know, unfortunately, we still don't have a lot of automation for auto-categorizing, you know, what columns on what boards things need to go to, but I think at least we're in the sort of state where we can start looking at incoming stuff every week, and that will be, I think, a big improvement too.
A
So I think having a bug board, plus an everything else...
A
A
But yeah, I'm looking forward to doing that. Hopefully moving forward.
A
A
Cool. Okay, other subjects? Anything else on bug scrub follow-up? I'm hoping to do that soon. It's really painful without automation, I gotta say. It's like, yes, I would love to click and drag 200 issues onto a board. That sounds like a great time.
A
There aren't any. So GitHub is currently announcing a bunch of stuff happening with their triage boards and whatnot, but as far as I can tell, there's a lot of new features that they're adding, but none of it solves the problems that we need solved. I've been talking to ContribEx about it, and there exist some bots out there, so we might end up asking ContribEx to set up a bot for us, but it's going to be a lot of manual configuration and work, and we'd have to keep the bot running.
A
A
A
Oh, where are we at with the... who's working on the node conformance stuff? Do we have anyone assigned to that?
A
Okay, that's, that's...
A
A
And then, similarly for the node feature stuff, is that also in the same boat?
D
A
A
B
G
Sure. Well, so this is the first part of the work in progress to remove these tags from the tests. First, I created the first PR, which basically duplicates the tags with the new syntax, and eventually we'll remove these tags. Right now, this PR is safe because it doesn't modify any behavior, and there are no tags that have the new...
A
Awesome. So how does this work exactly? We're adding the thing that just marks things as a node feature?
G
A
A
A
D
About this, we needed to take it in stages, and yeah. Oh.
A
D
A
D
So there's one topic about soak tests. I don't know if you saw these discussions. There is an issue that you found during the bug scrub. Somebody suggested doing soak tests to detect kubelet leakages or breakages.
G
D
Yeah, the test is easy. The only problem is how to detect the failures and leakages. So one idea, for me, is to use NPD to detect the problems, because NPD can find when the kubelet was restarted or such, and it has some metric collection stuff. But I don't know how easy it is to use during the test. I wonder if anybody has any other ideas for how to detect crashes and leakages.
A
Yeah, so for crashes, I'm not entirely sure. For leakage, a problem that we have right now is that the node tests aren't, or I should say, the kubelet's not being scalability tested. And is there anything in here about scale? Let me see if I can find this on the scalability repo.
A
A
D
Yeah, there's a link, and at the moment I checked it, it was like 15 days ago, and it didn't have much information. Maybe it doesn't know.
D
D
A
Look at that. I wonder if that was the... what date is this, six eighteen? I wonder.
E
A
Well, apparently there was something else. Yeah, so it looks like 1.21 has it. 1.20, I think, does not. So should we ask them if they want to add it for older branches, or is the assumption at this point that it won't matter, because we're hopefully not adding scale regressions in backports? But who knows.
A
A
A
Okay, well, I'm so happy to see that. Looks like we now have quite a long agenda compared to what we started with. Do you have anything else for today? I don't think I have anything.
D
A
I don't know of anything specifically. In general, when I've seen leaks debugged, it's all been manual, using the stuff that comes off the pprof endpoint for memory utilization.