From YouTube: Kubernetes SIG Node CI 20230628
Description
SIG Node CI weekly meeting. Agenda and notes: https://docs.google.com/document/d/1fb-ugvgdSVIkkuJ388_nhp2pBTy_4HEVg5848Xy7n5U/edit#heading=h.2v8vzknys4nk
GMT20230628-170455_Recording_1792x1020.mp4
A
Oh, hello. Okay. What date is it? June 28, 2023. It's the SIG Node CI meeting. Welcome, everybody. Agenda items: I only have one agenda item today. I think it's not just this PR: there are many PRs going on right now cleaning up our node end-to-end tests. This is one of them, where we don't pull images from node end-to-end any longer. I mean, this is not underscore intense.
A
It's not like node/end-to-end or like end-to-end/node, and this is an effort to run these tests on EC2 on Amazon. So watch for PRs, and if you're interested you can follow a few people, like James, who is making a lot of changes, and I think somebody else has been making changes recently. This was one where we stopped pre-pulling images.
A
A
If you're interested in this specific item, I posted it here just for everybody to be on the lookout. I think I remember, very vaguely, that in the past lack of pre-pulling was causing test failures, especially for tests that are waiting for specific images to run at a specific time. Those tests may start flaking. So if you see any flakes, please remember that there was this change. James and I are watching for failures, and if there are failures, we will revisit this decision.
B
A
Questions on that? No. I wanted to double check. Swati, did you have... I know, as a follow-up from last week, ARM tests were fixed. We're now skipping some of the tests that were failing, before we fix them for ARM for real. And I'm wondering if you had any time to look at the topology manager and the CPU manager failures. The topology manager doesn't run any tests, and the CPU manager and the memory manager, they are failing.
C
A
We still have the CI test that was working, last I checked. So at least we have some coverage, but it would be nice to have periodics running all the time.
A
A
A
A
A
So we go to triage right now. If you have any more agenda items and you remember something, feel free to interrupt me. Otherwise, we'll just start the triage and stuff.
A
A
A
But I want to make sure a slow test can finish normally. So, yeah, they've pinged me. He has two proposals: one is this one, bump the timeout; another one is just to skip all slow tests in CI. I think skipping slow tests in CI is better. I will comment on that.
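To illustrate the second proposal: skipping slow tests in CI usually comes down to filtering test names by a tag. The following is only a minimal sketch in Python, assuming slow tests carry a `[Slow]` marker in their name. The `select_tests` helper and the test names are hypothetical and are not taken from the PR under discussion.

```python
import re

# Assumption: slow tests are marked with a "[Slow]" tag in their name.
SLOW_TAG = re.compile(r"\[Slow\]")


def select_tests(names, skip_slow=True):
    """Return the test names to run, optionally dropping those tagged [Slow]."""
    if not skip_slow:
        return list(names)
    return [name for name in names if not SLOW_TAG.search(name)]


if __name__ == "__main__":
    # Hypothetical test names, for illustration only.
    candidates = [
        "restarts the container [Slow]",
        "reports node status",
    ]
    print(select_tests(candidates))
```

Bumping the timeout instead would keep the slow tests in the run at the cost of a longer job, which is the trade-off raised in the discussion.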
A
Okay. Just for information, this PR just added a slow inner skip, so it would be... I mean, we don't want to slow down CI.
A
Oh, this is what was mentioned yesterday. I think we can discuss next week. I think they say that it'll come next week.
A
Yeah, a bunch of changes, so they're really hard to test. I don't know how they... okay. So there are some metrics.
A
A
Okay, now take a look. I think it needs to go to review required.
A
B
A
Yeah, this is what I mentioned yesterday. Yesterday James was making changes to resolve the host instead of using the node name.
B
A
D
A
Direct controller. I think it's another code change.
A
Yeah, and just a README on the test side. Oh no, not just a README, a driver.
A
A
E
B
D
Yeah, I do tests for the CRI stats KEP.
A
D
Yeah, I think everyone... I think we're all pretty much... I'll need to find an approver, but I'd say they're, increasing death, preliminary people, first, yeah.
D
D
A
A
B
B
A
B
A
The population of running boards, Francisco.
B
A
What you brought into the yesterday.
A
In place, like add Windows for in-place upgrade, so product feature.
B
A
The link is here.
A
Yeah, I forgot to put area tests there. Okay.
B
A
B
A
A
And we cross-link to each other.
C
The issue title is different. I wonder if it's because a certain set of tests will run and then, eventually, under the hood, you know, the jobs corresponding to the serial tests were failing, and they were the same for both issues. It could be that, yeah.
A
B
A
Yeah, I will do. With that, we are done with the test portion of it. Next week is short in the U.S. I wonder how much progress we will make, hopefully some progress, but then we will have two weeks left. All right, after next week it will be just one week for code freeze and then a little bit of time for test freeze. So if you want something to be done in this release, please start sending PRs and we'll find reviewers.
A
I want to go to bug triage right now, or, if there is anything related to tests, please speak up.
A
Okay. Kind of: fail to get pod log when Docker uses the local driver.
A
D
A
A
Yeah, I agree with Paco. It's really hard to understand what's going on, given this information. Swati, do we have any document suggesting how to check if everything was allocated properly, like it's indeed one full CPU?
C
C
I think we need more information on how to evaluate: how do you determine, you know, hardware interrupts and the percentage and all that that's mentioned? In addition to that, I think we don't have explicit expectations set as to, you know, how hardware interrupts should be... like, sorry, we don't have anything pertaining to that. Maybe the other line. We can add some tests related to that, if we agree that that's something that we want to do.
C
And one more thing that caught my attention was that the reporter of the issue mentioned Calico. Was it just a mention about the stack, that Calico was used? Or, you know, did using Calico impact that? If that is the case, then the problem lies in CNI space, you know, or in this specific CNI plugin, as opposed to the CPU manager.
B
A
B
D
It seems like this is actually a failure of the network plugin, and I don't know what... yeah, it might be really within that scope, because if you look at the error, it's like the network plugin is failing to create pods because it's not able to authenticate. So it seems like, you know... I'm not sure which CNI that would be.
C
C
D
Oh cool, yeah. So then I think they'd have to pick it up with the Calico developers.
A
A
Okay, so they said that 99% of memory is already used, but before that it said that 94% is used.
A
A
Okay, we are out of this box. We have 15 more minutes to take a look at the recent news, information updates, just in case something happened. Okay, this one we just looked at.
A
B
A
B
A
Yeah, so the issue is that when graceful termination happens, there is not enough time to remove endpoints, which eliminates imports. So we give a little bit of time for pods to terminate, and then there is no time left for endpoints to be cleaned up.
A
I don't know what the right fix will be. Maybe we need to update our graceful termination logic to leave a little bit of time after everything has already started terminating, because this is not ideal. There is always this race between endpoints removal and pod removal.
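One way to picture the idea of leaving time at the end is to split the total shutdown budget so that a fixed slice is reserved for endpoint cleanup after pods have been told to terminate. The following is only a minimal sketch in Python. The function name, parameters, and numbers are hypothetical, and this is not the kubelet's actual graceful termination logic.

```python
def split_shutdown_budget(total_seconds, cleanup_reserve_seconds):
    """Split a shutdown budget into (pod_termination, endpoint_cleanup) seconds.

    The reserve is capped at the total, so the two parts always sum to the total.
    """
    if total_seconds < 0 or cleanup_reserve_seconds < 0:
        raise ValueError("durations must be non-negative")
    reserve = min(cleanup_reserve_seconds, total_seconds)
    return total_seconds - reserve, reserve


if __name__ == "__main__":
    # Hypothetical numbers: 30 s in total, 5 s kept back for endpoint cleanup.
    pod_time, cleanup_time = split_shutdown_budget(30, 5)
    print(f"pods: {pod_time}s, endpoint cleanup: {cleanup_time}s")
```

A fixed reserve narrows the window described above but does not remove the race, since endpoint removal can still take longer than the reserved slice.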
A
B
B
A
A
A
A
Yeah, we see sometimes, like, this health check is just a regular ping to the kubelet, and sometimes it stops working. So yeah, and we need... I mean, they just shared that the kubelet was restarted, but we don't know the reason.
B
A
Yeah, I think it's related to this issue with a static pod being scheduled after a regular pod was created, yeah.
A
Okay, then, let's finish five minutes early. Thank you, everybody, for coming. Oh, by the way, before I forget, I see two new people here. Do you want to introduce yourself? I can stop recording if you don't feel comfortable introducing yourself on the recording.
A
E
Hi, I'm Matt. I've been working in the test meetings for a few months, just looking for a way to contribute, and I heard about this one and thought I would join the same. So I'm just trying to learn the lay of the land.
A
Welcome. Do you feel it was a useful introduction for you? Like, did you find anything that you can... yeah.
A
If not, it's fine. I don't want to force anybody to speak up. Okay! Thank you, everybody. Have a good rest of your day. Bye.