From YouTube: Kubernetes SIG Node CI 20230927
Description
SIG Node CI weekly meeting. Agenda and notes: https://docs.google.com/document/d/1fb-ugvgdSVIkkuJ388_nhp2pBTy_4HEVg5848Xy7n5U/edit#heading=h.2v8vzknys4nk
GMT20230927-170826_Recording_2560x1600.mp4
A
Hello, everyone. Today is September 27th. Welcome to the SIG Node CI meeting. Going on to the agenda.
A
And also the last comment here. Apparently dims has already put up a summary here saying that most of the failures are unrelated, but we still want to look at this, right?
C
And prioritize looking over all the tests in TestGrid, just making sure that there aren't any new failures, because all the jobs were converted. I think they're all converted to use the new decorate, and I think most of them are working now. When I was looking I didn't see any failures that I think are caused by that, but there are a lot of tests. So that was mainly my point for the agenda.
A
Okay, sure. So we can start with signal. We'll...
D
Go forward, Kevin. Can you just give everybody an overview of what we did with the bootstrap file and what to look for, just for the recording and for everybody's awareness?
C
Okay, so basically there's been long-running work, I guess from SIG Testing, to try and deprecate this bootstrap.py script. It sounds like a lot of the SIGs, SIG Node included, have not really converted a lot of their jobs over, and so I think dims (I don't know his full name) converted all the jobs over, like, last week.
C
So now pretty much everything should be using decorate: true and no longer using the bootstrap script. So what that looks like is... can you all see my screen?
C
So what that looks like now is that there's this decorate: true config, and then usually there's some kind of command, a runner. The bootstrap.py script, I think, was an entry point, and the jobs were relying on it running as the entry point in all the containers, so now that it's deprecated you have to add a command. Usually the args didn't change, which I didn't know about. So this was kind of the conversion. We had one issue, I think, that I know was related to that, which was, let's see, I think this one, 1.25.
C
This one was not ready, and luckily the issue with it was actually a typo. If you look at the build logs, it wasn't actually cloning the repo needed to run this test. That was fixed, and that's why this one is working now. I didn't see any others. So that was at least some of the examples.
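The conversion described above (decorate: true, an explicit command, args left unchanged) could be checked with a small script along these lines. This is only a sketch under assumptions: the job dictionaries, job names, image names, and flag values are hypothetical, and the field layout (decorate, spec.containers[].command, spec.containers[].args) is assumed to mirror the parsed job YAML.

```python
from typing import Any, Dict, List


def conversion_problems(job: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for one job definition (hypothetical helper)."""
    problems = []
    name = job.get("name", "<unnamed>")

    # Converted jobs are expected to opt in with decorate: true.
    if job.get("decorate") is not True:
        problems.append(f"{name}: decorate is not true")

    for container in job.get("spec", {}).get("containers", []):
        command = container.get("command") or []
        args = container.get("args") or []
        # Without bootstrap.py as the entry point, the job needs an explicit command.
        if not command:
            problems.append(f"{name}: container has no explicit command")
        # A leftover reference to the old script suggests an incomplete conversion.
        if any("bootstrap.py" in part for part in command + args):
            problems.append(f"{name}: still references bootstrap.py")
    return problems


# Hypothetical job entries, shaped like parsed YAML.
converted = {
    "name": "example-node-e2e",
    "decorate": True,
    "spec": {"containers": [{"image": "example/image:latest",
                             "command": ["runner.sh"],
                             "args": ["--example-flag=value"]}]},
}
unconverted = {
    "name": "example-legacy-job",
    "spec": {"containers": [{"image": "example/image:latest",
                             "args": ["--example-flag=value"]}]},
}

print(conversion_problems(converted))    # []
print(conversion_problems(unconverted))
```

A check like this only catches structural leftovers. A typo such as the one mentioned above, where the repo was not cloned, would still have to be found in the build logs.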
C
And yeah, I wasn't too sure which tests are actually failing or not. I've just been looking at TestGrid lately trying to find some examples, and I think a lot of them are looking pretty good. I did notice, when I did some conversion of the presubmits, that there are a lot of periodics that did not have a corresponding way to run them off of a PR.
C
I think that's at least something on my end. I've just been looking at TestGrid trying to find ones that look like they were failing out of nowhere. I didn't see any, especially in release-blocking. I think there's only this one here, but this looks to be running all the tests. So, in short, I think we're mostly looking for tests that are failing outright.
C
Usually that can mean something's wrong with the setup, and that's at least what I've been trying to look for, along with any test that just started failing out of nowhere. And I do think, as dims pointed out, most of the failures are pre-existing. I was looking at the Fedora swap ones, and I think we actually have a PR to fix those, but that's unrelated to this.
A
So I have opened TestGrid. Moving on to SIG Release blocking: I saw one failure here, and this says, I think, the path is wrong somewhere. Let's wait for this to open.
A
This does look like it has been failing for a while. Did we create an issue last time? I think this has some connection-related issues.
D
Yeah, that's the intermittent error in the beginning. While the node is being created, it keeps retrying SSH and eventually it will succeed. So I think it's mistakenly been highlighted in the summary as the failing test. You probably need to click on the specific test that failed and see what's going on. I think it has been failing for a while now. We've been debating whether we need to increase the threshold, but we didn't want to do it without investigation.
E
A purple cell means that you triggered a test that runs three times and only one of the runs passed, so it's more like a flake. Some of the tests we need to run multiple times to ensure consistency.
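The idea of running a test several times and treating mixed outcomes as a flake can be sketched as follows. This is a simplified illustration, not how TestGrid computes cell colors; the function name and the three result labels are assumptions.

```python
from typing import Sequence


def classify_runs(results: Sequence[bool]) -> str:
    """Classify repeated runs of one test; True means that run passed."""
    if not results:
        raise ValueError("at least one run is required")
    passed = sum(1 for ok in results if ok)
    if passed == len(results):
        return "pass"
    if passed == 0:
        return "fail"
    # Mixed outcomes across identical runs point to a flake, not a hard failure.
    return "flaky"


# Three runs where only one passed, as in the example above.
print(classify_runs([False, True, False]))  # flaky
```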
A
I clicked on this kubetest.Node Tests entry, and it takes me to the two specific tests that have failed.
D
Yeah, I haven't seen this failing. I don't remember this one. I remember this grid always has this, like, 10-pods creation flaking.
A
I see. I think it's all different tests that are failing at different times, like if I try to...
E
I guess I can create an issue, if you will. This is a simple serial container. Okay, let me get into that.
A
Okay, thanks. So do we need more eyes on this?
D
So what's the latest on that? Is it using a multi-NUMA VM now?
F
Yeah, I think it is using a multi-NUMA VM. But this particular test is testing metrics, and this particular failure is related to keeping track of metrics when the Topology Manager fails a pod with an admission failure. We are trying to track that after a kubelet restart, and the expectation is that we would be able to keep count of the admission failure. But here it's complaining that the admission failure count should have been zero, whereas we legitimately see a pod failing at admission time.
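The expectation described above, that an admission failure shows up as a counter increase, can be sketched as a before/after comparison of two metric scrapes. This is an illustration under assumptions, not the actual e2e test: the metric name is assumed, the scrape text is made up, and only unlabeled samples are parsed.

```python
from typing import Dict

# Assumed metric name for illustration; check the kubelet's metrics output for the real one.
METRIC = "kubelet_topology_manager_admission_errors_total"


def parse_counters(metrics_text: str) -> Dict[str, float]:
    """Parse unlabeled samples from Prometheus-style text exposition."""
    counters = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) < 2 or "{" in parts[0]:
            continue  # labeled samples are out of scope for this sketch
        counters[parts[0]] = float(parts[1])
    return counters


def counter_delta(before: str, after: str, metric: str) -> float:
    """Return how much a counter grew between two scrapes (missing counts as 0)."""
    return parse_counters(after).get(metric, 0.0) - parse_counters(before).get(metric, 0.0)


# Hypothetical scrapes taken before and after a pod is rejected at admission.
before_scrape = """
# TYPE kubelet_topology_manager_admission_errors_total counter
kubelet_topology_manager_admission_errors_total 0
"""
after_scrape = """
# TYPE kubelet_topology_manager_admission_errors_total counter
kubelet_topology_manager_admission_errors_total 1
"""

delta = counter_delta(before_scrape, after_scrape, METRIC)
print(delta)  # 1.0
```

A failure like the one discussed would correspond to an assertion expecting a delta of zero while the scrape legitimately shows an increase.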
F
Could be, I need to look at that. I believe the idea was for the Topology Manager tests to actually run on multi-NUMA environments. We have tests where, if a multi-NUMA environment is not there, the tests are skipped. But my understanding is that we now have multi-NUMA enabled in our test infrastructure, so those tests should be running. We can definitely look into why those tests are getting skipped, whether multi-NUMA is the issue or it's something else.
A
And maybe you could also add the issue to the notes, and we can see if we can get more people to look at it if you get stuck.
C
These are not new. I think they've been going on for a while, but there is a PR up to fix them and the person is very active on it. I'll link to that. I think we are aware of these issues. Sorry, let me...
C
I'm just getting the PR that should fix them. Apparently it's related to some feature toggles being disabled or enabled, or something like that.
C
So this is, excuse me, the PR that is up to fix them, and I see that there are reviews as of two days ago.
C
And then I didn't look into why some of the swap jobs were failing and others were not, but I think the swap Fedora serial and the Ubuntu serial ones are all failing. Oh wait, now this looks more concerning.
B
Oh yeah, I wonder if some of those are due to the rework that Kevin did.
C
Yeah, that's what I'm wondering about, but I thought I reverted all of those. I guess this group should be aware that we ran into some problems there. There's another push in SIG Testing to try and migrate a lot of our jobs to, I guess, the community Prow cluster. You can do that by adding the cluster name. I forgot what it is.
C
We've run into some problems with the CRI-O periodics trying to navigate that. Most of the docs say you should just be able to add the cluster name and some resources, and it should work. For CRI-O, I have noticed some problems with the periodics, and we had to revert that change.
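The migration step described above, adding a cluster name and some resources to a job, might look like the following transformation on a parsed job definition. This is a sketch under assumptions: the cluster name is left as a placeholder because it is not given in the discussion, the job name and resource values are hypothetical, and the field names (cluster, per-container resources with requests and limits) are assumed to match the job config.

```python
import copy
from typing import Any, Dict

# Placeholder only; the real cluster name is not stated in the discussion.
CLUSTER_PLACEHOLDER = "<community-cluster-name>"


def migrate_job(job: Dict[str, Any], cluster: str,
                resources: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the job that targets `cluster` and declares resources."""
    migrated = copy.deepcopy(job)  # leave the original untouched so a revert is trivial
    migrated["cluster"] = cluster
    for container in migrated.get("spec", {}).get("containers", []):
        container["resources"] = copy.deepcopy(resources)
    return migrated


# Hypothetical periodic job and resource values.
example_job = {
    "name": "example-crio-periodic",
    "decorate": True,
    "spec": {"containers": [{"image": "example/image:latest",
                             "command": ["runner.sh"]}]},
}
example_resources = {
    "requests": {"cpu": "2", "memory": "4Gi"},
    "limits": {"cpu": "2", "memory": "4Gi"},
}

migrated_job = migrate_job(example_job, CLUSTER_PLACEHOLDER, example_resources)
print(migrated_job["cluster"])
```

Keeping the original definition unmodified reflects the situation above, where the change had to be reverted for some periodics.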
C
I will look into the swap Fedora ones, because that's another one where, if that is occurring, it's kind of a weird one: the SSH step actually doesn't succeed, and then the job will essentially time out. I think that looks to be pretty close to this one. I don't remember changing that one, though. I'll look into that.
A
This was also failing last time, I believe.
C
Yeah, 9/22. I would think this is a good option. I don't have the doc up; can we add that to the notes? Or I'll get the doc up and then I'll add that one. That's another one I think we want to be on the lookout for with the migration, because that seems to be a candidate for some problems.
A
Yeah, this has been failing again for a while now, so I'll just check whether this is the one related to what Mike mentioned. One pack of filed a request for... okay, and then PC2 shortcut, container eviction.
C
This is one of our, I think, most egregious failures. I think the eviction ones have been failing for a while. I am actually looking into those, and I'm a little stuck on why they're really failing. I think there's some flakiness with a lot of these; they're not consistent in the failures. But I am actually looking into some of those right now.
C
There is an issue for eviction. Well, I don't know, do we want to track individual tests? There are a lot of issues. I think some people have reported test failures generally. I don't know if I...
D
For sure. I can dig it up.
C
There's this one from, ouch, 2021 that I assigned myself to. This is this exact one, so I'll add some notes. Okay.
A
What about this one, node kubelet? This has also been failing for a while.
A
Okay, I'll add it to the notes so that I go ahead and create an issue, and then we can take it up on the basis of the comments or details there. I mean, that's all from the TestGrid perspective.
A
Do you want to do bug triage now, Sergey, because we are already 36 minutes into the meeting? Or should I go over the test dashboard?
A
Waiting on author. Okay, so: "e2e: set liveness probe timeoutSeconds for conformance test." This is a bug. According to the issue, a healthz call to a pod with a reused IP needs a few seconds, which may cause a test to fail with the default probe. Okay, "in my environment, this..."
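The change under discussion, giving a liveness probe an explicit timeoutSeconds instead of relying on the default of one second, can be sketched on a probe definition like this. The probe path, port, and the chosen timeout are hypothetical values for illustration and are not taken from the PR.

```python
from typing import Any, Dict


def with_probe_timeout(probe: Dict[str, Any], timeout_seconds: int) -> Dict[str, Any]:
    """Return a copy of the probe with an explicit timeoutSeconds."""
    if timeout_seconds < 1:
        raise ValueError("timeoutSeconds must be at least 1")
    updated = dict(probe)
    updated["timeoutSeconds"] = timeout_seconds
    return updated


# Hypothetical liveness probe that leaves timeoutSeconds unset (so the default applies).
default_probe = {
    "httpGet": {"path": "/healthz", "port": 8080},
    "initialDelaySeconds": 15,
    "failureThreshold": 1,
}

patched_probe = with_probe_timeout(default_probe, 5)
print(patched_probe["timeoutSeconds"])  # 5
```

With failureThreshold set to 1, a single slow response would fail the probe, which is why a first call that needs a few seconds matters here.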
D
Yeah, let's just close it. Can you just /close? Because the person sent a PR directly to a previous release branch, and you need to do master first and then an automated cherry-pick.
D
I think it'll be a PR to fix compatibility with 1.27. I'd suggest moving it to Archive.
D
Oh, the person reopened the same PR against master, I think.
D
Yeah, we will need to have a reviewer for that. It seems that they increased the timeout because in some environments CNI installation takes longer.
A
Okay, all right. Then I'm done with the tests. Over to you, Mike.
E
Okay, "memory manager: unexpected admission error." It provides the setup; they're using the Topology Manager policy with... no, no.
E
So it looks like Francesca is already working on this. We can move on to the actual action test.
E
So it looks like this is already being worked on, I guess. Interesting. Computers in that is also, to be honest...
E
Yeah, it looks like someone's working on this already. Okay, moving to the next one.
E
Okay, so I guess we're still trying to reproduce this. We want to keep it in needs-information as well.
E
Okay, "crictl shows the compressed size and it's inconsistent with docker images." It's probably a feature request.
E
I think we just missed it and it was still accepted.
D
Or we should have removed kind/bug, right?
E
That's right, let me...
D
And you can reply to the ml fund seal that, yeah, if he can send a PR, it will be appreciated.
C
Leave it open in case someone is searching for this one. But I do think that the other issue mentions a bad patch in a kernel.
E
Okay, "NoExecute: 20 seconds with an extra five-second delay when the Ready condition becomes Unknown."
D
I think we can accept it and mark it as priority important-soon. I'm not sure how far in the past we need to actually fix it.
D
I mean, there are a couple of breaks on newer OSes, so the question is... Do you know if COS will migrate to 6.1 at any moment? I don't know.