From YouTube: Kubernetes SIG Node 20210610
Description
Meeting Agenda: https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A: So we had something on the agenda here for this item, but it is crossed out. Do you want to talk about this?
A: Yeah, so this is crossed off, but nobody has merged this. I did talk to Dims and he was hesitant to merge it, because his concern was that we might be losing tests by basically getting rid of this. I'm pretty sure we're not, because I went and looked at the test selectors, and the only difference that I could see was that this one does not select for, I think, serial tests; but from what I can tell it's not actually running anything else. It's just a duplicate.
A: You know, we have the eternal serial job, so we should probably talk about that. I don't know, Francesco, if you've seen the latest.
A: Yeah, so I guess as long as we don't mess up the test description to actually run it on the branches, and we exclude the eviction tests (and it seems to just be the eviction tests), everything runs fine within about an hour and 30 minutes. So I submitted a PR basically to split the two jobs, and then Odin was skeptical that that would fix it.
A: But I think that we should try to divide and conquer, because there were still definitely lots of failing serial node jobs. So I think we should try to fix those separately from the eviction jobs, which are slow.
E: I just want to go on record saying that, in general, I believe splitting the huge serial job is worthwhile per se, even if things were working well. I think that one huge serial job was too much.
A: I mean, this looks to me like it's failing constantly, although we can see, I think, from the PRs where it was run that we had some greens.
A: Yeah, so, I mean, I really want to unblock the periodic so we can get signal on which tests are failing. So if folks on this call don't have any concerns (Dims was happy to approve this one), the way that it is, it's not getting rid of any tests; we're just splitting them into different jobs. If folks think that's okay, then hopefully we can get this merged, and then we will have the eviction tests, I guess, running and timing out, and we can file that as a separate issue and someone can work on it. And then we will have the rest of the serial tests, which, according to Testgrid (this is from the PR job), look pretty flaky, but there are some of them that are passing. There are a few that have been failing, and they do look like they're some of the disruptive tests.
A: We have a much larger group, mostly. I think Odin linked this tab; this is the PR tab. This will not run continuously unless we go and manually trigger it on every single PR, and I don't think we should do that. I think that, you know, this is based on the two commits where we disabled the eviction tests, so for any other PR that doesn't do that, it's still going to keep failing. So we really need to split that out.
A: So mostly I was just confused by what he was saying, in terms of...
A: Do we need to let this soak? Because nothing's merged that would need to soak right now. There's nothing that's merged in kubernetes/kubernetes, and the only thing that's merged in test-infra is that we fixed it so that it's actually running against PRs and not just against master. I missed that when I did the copy-paste on the first one.
G: I took a look at these; yes, I have a vague idea.
A: So the TL;DR is that we have this node kubelet serial job, which is supposed to be running all of the serial tests for the kubelet, and it has been failing for months, but it has a bunch of really important tests. So we wanted to get this job working, and we started with trying to tweak the timeouts: previously it had a three-hour timeout, and we tried increasing it to seven.
A: It was still failing, so then Francesco found that it was the eviction tests that were timing out, so he submitted a PR to stop running those, and then it still timed out. But then we realized that the job for the pull request was not actually running on the pull request; it was just running against master. So we fixed that, and then we found that indeed, when we disabled the eviction tests, it was passing in a reasonable amount of time.
A: So I submitted a PR to basically split the two, so we can work, on the one hand, on getting the eviction tests fixed so they're not timing out (it sounds like they're timing out even after 24 hours, so there's something wrong with those tests), and, on the other, keep the rest of the tests in the same job, so that we can get some signal in terms of what's flaking and what needs to be fixed, and hopefully add them back to release-informing or maybe even release-blocking.
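
(Not part of the discussion, but to make the split being described concrete: a minimal Go sketch, assuming the two jobs would select tests with ginkgo-style focus/skip regexes. The test names and regexes below are illustrative only, not the actual test-infra configuration or the regexes used in the PR.)

```go
// Minimal sketch of the idea behind the split: one job focuses on the eviction
// tests, the other keeps the remaining serial tests, so no tests are lost.
// Test names and regexes are illustrative, not the real test-infra config.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	testNames := []string{
		"[sig-node] MemoryAllocatableEviction [Slow] [Serial] [Disruptive] ...",
		"[sig-node] Density [Serial] [Slow] create a batch of pods",
		"[sig-node] Restart [Serial] [Slow] [Disruptive] Kubelet restarts",
	}

	evictionFocus := regexp.MustCompile(`Eviction`) // hypothetical focus for the eviction-only job
	serialFocus := regexp.MustCompile(`\[Serial\]`) // focus for the remaining serial job

	for _, name := range testNames {
		switch {
		case evictionFocus.MatchString(name):
			fmt.Printf("eviction job:         %s\n", name)
		case serialFocus.MatchString(name):
			fmt.Printf("remaining serial job: %s\n", name)
		}
	}
}
```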
A: So yeah, I guess we'll leave it, to give Odin a chance to chime in on that PR. But I think that we should proceed with splitting, because if we don't do that, we're not going to get more signal.
A: I have no idea, so that would be, I guess... do we have an issue for that? Probably not. That's probably something we should file right now.
H: I think what was happening was that we actually did kill the pods, but there was a different asynchronous thread going on inside the kubelet that was getting the stats for which pods are running. So they were seeing that, you know, a pod that had already been deleted was still running after the pod was deleted, because of the asynchronicity of the stats they use to get which pods are running.
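
(As an aside, the kind of fix that usually follows from a race like this is to poll until the asynchronous stats catch up, rather than asserting right after the deletion. A rough sketch, assuming a hypothetical listRunningPodsFromStats helper; the real node e2e code may do this differently.)

```go
// Rough sketch (hypothetical helper name, not the actual node e2e code):
// because the kubelet's stats are collected asynchronously, a just-deleted pod
// can still show up as "running" for a while, so a test should poll until the
// stats catch up instead of asserting immediately after deletion.
package e2enode

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// listRunningPodsFromStats is a hypothetical helper standing in for whatever
// the test uses to read the kubelet's pod stats.
func listRunningPodsFromStats() ([]string, error) {
	// ... query the kubelet stats endpoint ...
	return nil, nil
}

// waitForPodGoneFromStats polls until the deleted pod no longer appears in the
// asynchronously collected stats, or the timeout expires.
func waitForPodGoneFromStats(podName string, timeout time.Duration) error {
	return wait.PollImmediate(2*time.Second, timeout, func() (bool, error) {
		pods, err := listRunningPodsFromStats()
		if err != nil {
			return false, err
		}
		for _, p := range pods {
			if p == podName {
				// Stats are still stale; keep waiting.
				return false, nil
			}
		}
		return true, nil
	})
}
```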
A: I thought I opened it. Let's see... I don't know if there's anything other than that one that we're missing. I'm going to assume that folks can assign themselves, and then I'll leave this in the in-progress column. Let's see if there are any flakes... no? Yay, okay.
A: So if somebody wants to just go in and triage, except that one, right now, that would be great. And I'm not sure what the state of things is, or if there's anything that we need updates on. Oh, we've got this "kubelet node conformance tests are broken" one that I'm working on, so I should maybe /assign this one.
A: Yeah, so I think that this will probably be handled by whoever is working on these two things; both of those I'm assigned to, and there is a PR up to just remove it.
A: So yeah, it seems like for this one Dims had some concerns that we might be losing tests by removing this job, but I'm pretty sure that we're not actually losing anything, because I did a search through. I believe we have this job, which is running fine (I believe it's the kubernetes node-kubelet one), and then we have the kubernetes node-kubelet-conformance one, which runs the exact same thing, only it does not skip serial tests, whereas this one skips serial tests; but I couldn't find anything marked node conformance.
A: Maybe I can pull it up in Testgrid. So here's the node kubelet conformance one; there's just some sort of infra issue where the job is not starting. And then what is the other job called?
A
So-
and
I
feel
I
mean
I
don't
know-
maybe
there's
one
of
those
eviction
tests-
that's
tagged
cereal,
that's
causing
this
one
to
fail,
but
that
I
don't
know
if
that
would
be
the
thing
causing
this,
then
we
can
look
at
one
of
the
test
runs.
Maybe.
A: Is there anybody that wants to pick this up? I mean, I put a PR up to remove the job. Does anybody want to confirm that they're not duplicates and see why this is a CI failure? Because I know that Dims indicated some interest in making the job work first and then confirming it's a duplicate before we get rid of it.
A: And Adidi, you're assigned to this one; is there anything that's been happening here? It looks like you did some research, yeah.
J: So actually, one or two months back, I guess Dims raised multiple PRs on cleaning up the containerd tests, and yes, I'm linked on the issue, and I have also raised one or two PRs. So now, if you look at the containerd board on Testgrid, it looks much cleaner.
A: And what exactly is the follow-up that we still need to do? I think that...
A: Okay, looks like we have an approach for that from yesterday, so that's...
A: That's okay. Does anybody feel passionate about this test and want to jump in? I know this has been a flake for quite some time.
A: Yes, well, don't we all? It would be good to at least get somebody looking at this now, because I know I've seen this flake.
A: But if nobody else can get to it, then certainly you can look at it later. Let's just... you can't do everything.
A: So this failure that Rob Kielty reported here, that's a duplicate of the other issue; it's not a timeout.
A: So perhaps, given that that summary is the same, I will just close that one as a duplicate of this one, because I think this one has a little bit more detail.
A
And
actually
it's
hard
to
say,
but
I
know
that
the
the
ci
signal
team
is
looking
at
this
issue
and
not
this.
A: So, "Pods should run through the lifecycle of Pods and PodStatus."
I: Yeah, I was looking into this yesterday, kind of did some digging, so basically there's still a test that is failing because of some timeout issue, and we need to get to the bottom of it. So I just put in some of my findings, for me to, you know, remember when I come back to it. Yes.
A: No, that's a great update. And yeah, I know with this one it's just that the pods for the tests initially are not starting, which, I don't know if that's something we can do anything about; that's more of an API machinery thing.
A: Then, I guess, this one looks like it's also being looked at by the CI signal team, so it'd be good if we had someone actively assigned to it. I'll unassign you, Francesco, just because you've got so much on your plate.
A: And I know you also have the issue of doom with the flakes on the probes.
A: I should probably take a look at that one today: the flaky test "Pods should support pod readiness gates."
A: "Potential race" from one of the winters. Adidi, you are assigned to this one: do you have any updates on it?
J: Yeah, give me a second. So on this one, if you go... I have mentioned it in the comments here. If you go down, next one, yeah. From the discussion, I found that what the lender is doing, that part we can remove; it won't have any...
J: I will create a PR to remove it, and then we can have more discussion on it, possibly.
A: This is more of a sort-of-topic thing, and yeah, I agree with this: we should not have node features and also features; it's just kind of all over the place, so I'm not sure.
A: Okay, last one. I think this is Matthias's favorite.
A: It flakes less; yeah, that's good news. I'm not sure how to fix this one. I know when I added the tests for changing the grace period, I did a thing where I ran the test on my local machine for like two hours, then I looked at the distribution of timings, and then I based the timeouts on that. I doubt anybody has done anything like that with this one, so I wonder if maybe that's the sort of thing that we just need to do here.
A: I went and ran it, I collected 100 data points, I did the quantile analysis on it, and then I added a little bit of room. Sometimes the problem with these things is that, because you know there are possible races and whatnot, the way we set the timeout thresholds is that we have to set them in such a way that we know for a fact we're getting the behavior we want isolated in the tests, and we're not accidentally timing out on something else.
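
(For reference, the timing-based approach described above, namely collect samples from repeated local runs, take a high quantile, and add headroom, might look roughly like this sketch. The 0.99 quantile and the 1.5x headroom factor are illustrative choices, not values anyone settled on in the meeting.)

```go
// Rough sketch of quantile-based timeout selection: collect timing samples from
// repeated local runs, take a high quantile, and add headroom so unrelated races
// don't masquerade as this timeout. Quantile and multiplier are illustrative.
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// quantile returns the q-th quantile (0 < q <= 1) of the given durations.
func quantile(samples []time.Duration, q float64) time.Duration {
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(math.Ceil(q*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	return sorted[idx]
}

func main() {
	// e.g. ~100 measured durations from running the test locally in a loop.
	samples := []time.Duration{40 * time.Second, 42 * time.Second, 55 * time.Second /* ... */}

	p99 := quantile(samples, 0.99)
	timeout := time.Duration(float64(p99) * 1.5) // headroom on top of the observed tail
	fmt.Printf("p99=%v suggested timeout=%v\n", p99, timeout)
}
```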
A: ...will necessarily make a big difference on this one, but I think possibly reducing the jitter, to ensure that it's... I mean, I guess it'll still just be some multiple of, like...
K: Yeah, so...
K: I need to look at the source code again, because at least the way the test is written now, we no longer have the issue of the sleeping at the beginning, the jitter that was interfering with the sleeping. So now, by the way it's written here, that's out of scope, but there might still be something in propagating the statuses, maybe from the container to the pod, or I don't know.
K: I have like two questions. So, is this meeting superseding the one on Wednesday, or do we have both?
A: So we'll never have both of them. The plan is that we will do this one once a month, on the second Thursday at this time, and we'll use the Wednesday time the rest of the month, because we weren't sure how much demand there was for this time. And the other issue is that I am quadruple-booked in this slot, so I can do this once a month, but I can't do it every week.
A: We won't have a Wednesday meeting: every second week of the month, every second Thursday of the month, we will have this time and not have the Wednesday meeting. There is a calendar invite that was sent out to the SIG Node test failures mailing list, so if you are subscribed to that list, then you will get the calendar invite, and you will know when the meetings are supposed to be, Sergey.
A: ...week, and we're basically at time, so we didn't have a chance to go through the rest of the PRs, but I think that's okay. Then we'll stop sharing. Anything else for our last two minutes today?
A: Bug smashing: so we're going to be working on the bug scrub. That will be in two weeks, two weeks from today, and my hope is that we can kick it off probably starting at about 1 UTC on the 24th, which for me will be, I think, the day before, kind of thing. So I'm happy to help kick that off.
A
We
need
volunteers,
so
I
haven't
checked
the
spreadsheet,
but
I
sent
out
a
spreadsheet
to
the
mailing
list
for
folks
to
sign
up
to
volunteer,
and
I
know
that
we
have
some
folks
on
this
call
in
various
time
zones.
So
I'm
hoping
that
I
will
see
you
there
and
that
we
can
have
some
folks
volunteer,
looks
like
some
people
have
signed
up.
A: So that's great. We need regional captains for Europe, the Middle East, and Africa (EMEA). We at least have lots of reviewers, approvers, and possible mentors signing up, but we need people to help coordinate, so I encourage you to sign up for that.
A
Oh
and
let
me
show
the
link
in
the
chat
it's
also
in
slack,
and
it's
also
sent
to
the
mailing
list
and
thank
you
adidi
and
paco
for
volunteering
for
apac.
That's
awesome
coordination.
Will
there
be
a
slacker
zoom?
I
am
trying
to
figure
that
out
right
now,
so
probably
we'll
just
use
the
oh
hello
bird.
I
have
a
bird
sitting
on
my
window.
A: Hopefully we will use the SIG Node Zoom, so this room, and we'll just be dropping in and out throughout the day. We don't have any other SIG Node meetings scheduled, so it's not like there's worry about a conflict. And then for Slack, I'm seeing if we're going to either use the main channel or get a special event channel; I've been talking to ContribEx and the Slack admins, and I haven't gotten a response from them yet. I'm also trying to get Triage Party set up, because apparently we have to run our own instance of it; it's not a hosted thing. So yeah, I'm still working on the details, but hopefully at some point we'll have that all figured out. Great, we're at time, so I will see everybody next week in a SIG Node meeting, perhaps, or on Slack sooner. Have a great rest of your day. Cheers everyone, cheers, bye.