From YouTube: Kubernetes SIG Node CI 20230614
Description
SIG Node CI weekly meeting. Agenda and notes: https://docs.google.com/document/d/1fb-ugvgdSVIkkuJ388_nhp2pBTy_4HEVg5848Xy7n5U/edit#heading=h.2v8vzknys4nk
GMT20230614-170440_Recording_2078x1340.mp4
A
First, our tests are now running, yay. Thanks to everybody who's involved; I know many people were working on that. It's a subset of tests right now, not all of them, but it's a good enough starting point. Now we can add more and make sure that we validate arm64 upstream, so everybody using arm64 builds, whether selling them or installing them, can be more confident that it's working. I know that some tests are failing. Francesca, do you think it may be related to the device plugin, like this device plugin thing?
B
A
It'll be nice, yeah.
A
We have a bug for that. I think we have an umbrella bug tracking arm64 tests in general, so far we've tracked it there. But if it's very specific to this test, we can create a separate bug.
A
B
A
Also for multi-NUMA, we had CI, we added new jobs, but unfortunately they are all failing. Like, this one is not failing, this one is not running any tests, so it seems that some definition isn't correct, but then these two are failing. This configuration, I'm not sure if it's right, yeah.
C
B
If there is, I think, yeah, investigation is due, but let's try to first fix the CRI, because all the resource managers will depend on that and it's a common cause of possible issues. And there is a test-infra issue filed by Parkway, if I'm not mistaken, about updating the base image, which is too old, and that should be relevant.
A
I think the next item is about it, right? Oh, it wasn't filed by me, but no, no, it's different.
A
Okay, Mike, you've been looking at this cos 93, right?
D
A
Now, this one was filed by me, but somebody... thank you.
E
Yeah, I added it. So actually, the DRA-related tests are failing in this one; after it was introduced it has never passed, since last week. So I think these are the same set of tests that are failing in the arm64 one also. Just now I saw it, so like the DRA plugin is not able to register with the kubelet, some sort of error like that.
A
E
B
A
Thank you, okay, thank you. Are you also looking at that item, or are you just asking?
A
And lastly, the perma-fail issue, I just wanted to check, so we got some arm fixes. We...
A
Yes, most of the focus was made on arm. We need to keep going, and this one Mike will send today. So hopefully we can address like two of them, and next address the other ones.
A
I have quite a few issues. I already cleaned up some of them; let's go through the rest.
A
Okay, yeah, we looked at it last time. Some tests are not running as fast as expected; they're running like 0.07 to 0.072 seconds slower than expected.
A
Still no takers, I guess. Okay, let's keep it for another week and then we'll try to find more people.
F
It may be related: when we first started running some e2e node tests on EKS.
F
There were several issues related to GOMAXPROCS. The nodes actually had more CPUs, and GOMAXPROCS detects the number of CPUs on the node, so it runs more goroutines, more stuff in parallel. But then it's still CPU-limited to like one. So it ends up basically trying to run a whole bunch of tests in parallel, and they end up taking longer than expected.
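A minimal Go sketch of the mismatch described above: GOMAXPROCS defaults to the number of CPUs visible on the node, so a test binary whose cgroup allows roughly one CPU of quota still schedules many goroutines in parallel. Capping it, either manually as below or with a helper such as go.uber.org/automaxprocs, avoids the oversubscription. The one-CPU quota value is an assumption for illustration, not the actual CI configuration.

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// By default GOMAXPROCS equals the CPU count of the node, e.g. 16,
	// even when the pod's CFS quota limits it to about one CPU.
	fmt.Println("default GOMAXPROCS:", runtime.GOMAXPROCS(0))

	// Assume a one-CPU quota read from the cgroup (illustrative constant here).
	cpuQuota := 1
	runtime.GOMAXPROCS(cpuQuota)
	fmt.Println("capped GOMAXPROCS:", runtime.GOMAXPROCS(0))
}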
A
G
To move as many jobs as possible without dependencies over to the EKS cluster, one of them coming from that referenced issue.
G
Dependencies, so if any, like, internal authorization is required for those pushes, we should keep those out of this issue. Does that make sense?
A
G
They're failing because the EKS cluster requires resource quotas.
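A hedged sketch of what creating such a quota could look like with client-go; the namespace, quota name, and hard limits are illustrative assumptions, not the actual EKS CI configuration.

package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Hypothetical quota for the CI namespace; values are placeholders.
	quota := &corev1.ResourceQuota{
		ObjectMeta: metav1.ObjectMeta{Name: "ci-quota", Namespace: "test-pods"},
		Spec: corev1.ResourceQuotaSpec{
			Hard: corev1.ResourceList{
				corev1.ResourceRequestsCPU:    resource.MustParse("8"),
				corev1.ResourceRequestsMemory: resource.MustParse("32Gi"),
				corev1.ResourcePods:           resource.MustParse("30"),
			},
		},
	}
	if _, err := client.CoreV1().ResourceQuotas("test-pods").Create(context.TODO(), quota, metav1.CreateOptions{}); err != nil {
		panic(err)
	}
}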
A
I will move it to the original one, but it will likely go in as soon as it's approved. But it's already approved.
C
F
A
Yeah, it is, but yeah, I wonder what is going on. Yeah, I wonder why James is running e2 tests.
D
F
A
Okay, so this is still failing. I guess, I mean, all other tests are not failing, so I believe it is a flake.
A
Yeah, we, on Google, we don't use e2 as well; we want the n-series to be used, so it probably will be just good to go. I just want to understand the context here better.
A
And this one we just looked at, and we mentioned Ed, who added this test, so maybe that can help.
A
Yeah, oh yeah, I remember now. So it seems that during the bad state, at least, containers just keep accumulating and they never get terminated. The suggestion is, can we at least have a timeout and kill them?
A
Otherwise we run into this failure.
A
Is anybody interested in taking a look at how to add a timeout?
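A minimal sketch of what the suggested timeout could look like in a Go helper: poll for leftover containers and give up once a deadline expires, so they can be killed and cleaned up instead of accumulating. countRunningContainers is a hypothetical helper and the durations are illustrative; this is not the existing test code.

package main

import (
	"context"
	"fmt"
	"time"
)

// countRunningContainers is a hypothetical helper; in a real test it would
// query the container runtime (for example over CRI) for leftover containers.
func countRunningContainers(ctx context.Context) (int, error) {
	return 0, nil
}

// waitForContainersGone polls until no containers remain, or returns an error
// once the timeout expires so the caller can kill them and fail cleanly.
func waitForContainersGone(parent context.Context, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	for {
		n, err := countRunningContainers(ctx)
		if err != nil {
			return err
		}
		if n == 0 {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("containers still running after %s: %w", timeout, ctx.Err())
		case <-ticker.C:
		}
	}
}

func main() {
	if err := waitForContainersGone(context.Background(), 5*time.Minute); err != nil {
		fmt.Println("giving up:", err)
	}
}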
A
So I'm going to double-check that this to-do is still there. I am pretty sure it is, but let's be on the safe side.
E
A
And since it's about container start, maybe we need to associate it with promoting this to beta as well. Maybe I'll mark this as triaged, no?
B
Probably once we agree on the best way to fix it, which is still being discussed.
A
C
A
Yeah, since it's an issue with an alpha feature, I will put important-longterm, not critical, and yeah, it seems to be known.
A
Yeah, it's the half-hour mark, so we have a little bit more time. I suggest we just go through a little bit of the needs-information ones. Maybe we can, yeah; at some point we need to clean it up.
A
Okay, David, he was looking into that. So David said that he cannot update.
G
B
Just for context, because I'm looking into similar stuff. I don't know how GCA works, but I know that when the kubelet is restarting, so initializing, if admission fails, it kills the container intentionally. This is the bug I'm looking at, the one I mentioned previously. So, a hypothesis, I'm really just brainstorming now: if GCA runs an admission and that admission fails for whatever reason, then the kubelet will terminate running pods. I'm not sure if this is relevant or what is happening here, but just giving an idea.
B
A
Anything special? So you're talking about this issue about the device, like any pod with this device being... yes?
B
Yes, yes, that could be the common factor, but I'm really just looking at this comment now, so it's really guesswork: when the kubelet restarts, if any admission fails for any reason, then yes, it intentionally kills the pod, which could be surprising. I was a bit surprised when I learned it myself, so that is the reason why I'm mentioning it here. Could be, maybe not, but it's...
E
G
A
That seems to be a regression. Okay.
A
Okay, so it's in-place scaling related, and we asked to check cgroups. I remember, yeah, cgroups.
A
The issue is that it wasn't applied, like the CPU limit, 200 milli, and here is Zach, and...
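A small sketch of the arithmetic behind the cgroup check mentioned above: a 200m CPU limit should surface as a CFS quota of 20000us per 100000us period (cpu.max "20000 100000" on cgroup v2, or cpu.cfs_quota_us=20000 with cpu.cfs_period_us=100000 on v1). The helper mirrors the kubelet's milliCPU-to-quota conversion in spirit; the names and constants here are illustrative.

package main

import "fmt"

const (
	defaultCFSPeriodUs = 100000 // default CFS period: 100ms
	milliCPUPerCPU     = 1000
)

// milliCPUToQuota converts a CPU limit in millicores into the CFS quota
// (microseconds of CPU time allowed per period).
func milliCPUToQuota(milliCPU, periodUs int64) int64 {
	return milliCPU * periodUs / milliCPUPerCPU
}

func main() {
	quota := milliCPUToQuota(200, defaultCFSPeriodUs)
	// Expected output for a 200m limit: "cpu.max: 20000 100000".
	fmt.Printf("cpu.max: %d %d\n", quota, defaultCFSPeriodUs)
}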