From YouTube: Kubernetes SIG Node 20210428
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A
Hello, it's April 28, 2021. This is the SIG Node CI group and we are all here to check the status. We already went through the first two or three items. I hope this PR will fix most of the serial jobs, and then we can see the actual failures. Let's wait for progress there. I pinged approvers yesterday and some of the PRs already got approved, but we still have a few left. Okay, so you're up, Francesco, for the next topic.
B
Yeah, so this is kind of related to the serial failures, because I was investigating some tests, the end-to-end CPU manager tests. Long story short, there is a key thing I would like to address: when we run end-to-end node tests, the kubelet uses the default system directory, /var/lib/kubelet, to hold the state of the managers, for example the CPU manager and the memory manager. This means we have hidden state and, what's worse, hidden state shared between tests, which is something I would rather avoid.
B
But first I'd like to check. I will be checking with git log and asking in the channel, but first of all I will just start from here and ask: do we agree that we should try to avoid this hidden global state? And maybe someone in this meeting already knows why we do it this way; it could be just a historical artifact. And, in general, are there objections or concerns about moving away from this global hidden state?
B
Yeah. We have, for example, the CPU manager state, the memory manager state, and all the state files for the resource managers which are being tested in the end-to-end node suite, but in general the state of the kubelet ends up here. I don't want to take too much time, but cleaning up this state directory between individual end-to-end tests, first of all, defeats the purpose of having shared state if you're going to reset it anyway.
B
Why have a global shared state then? And second, this is actually harder because of how the end-to-end tests are run, so you really need some careful coordination between all the moving parts. It just seems easier to give each end-to-end test its own private state directory, so each end-to-end test has its own private state and you have isolation among tests. So unless there are objections, I'll just keep investigating this alongside all the other things.
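(A minimal sketch of the per-test isolation proposed here, assuming the test harness launches the kubelet itself and can pass it flags; the package and helper name are hypothetical, not the real e2e framework API.)

```go
// Hypothetical sketch: give each end-to-end node test its own kubelet
// state directory instead of the shared /var/lib/kubelet default.
package e2esketch

import (
	"os"
	"os/exec"
)

// startKubeletForTest launches a kubelet whose state files
// (cpu_manager_state, memory_manager_state, ...) live in a throwaway
// per-test directory, via the kubelet's --root-dir flag.
func startKubeletForTest(kubeletBin string, extraArgs ...string) (*exec.Cmd, string, error) {
	stateDir, err := os.MkdirTemp("", "kubelet-e2e-state-")
	if err != nil {
		return nil, "", err
	}
	args := append([]string{"--root-dir=" + stateDir}, extraArgs...)
	cmd := exec.Command(kubeletBin, args...)
	if err := cmd.Start(); err != nil {
		os.RemoveAll(stateDir)
		return nil, "", err
	}
	// The caller stops the kubelet and removes stateDir when the test
	// ends, so no manager state leaks into the next test.
	return cmd, stateDir, nil
}
```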
A
Yeah, I would be interested to learn more about the state, but from your description of it, you're right: between executions we definitely need to clean it up, or at least not share it.
B
Okay, I will keep investigating, asking the people who may have brought in the end-to-end tests, asking in the channel, and keep digging. It will also make the tests more reliable; for example, in other suites, in OpenShift, I'm pretty sure we delete the state file between runs, which is pretty much my point. So yeah, I'll keep this going.
C
Like, for the CPU manager and the memory manager, I know that we delete the state file anyway after each phase, so I'm pretty sure we already do it.
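(For reference, a sketch of the kind of between-phase cleanup being described, using the kubelet's actual state file names under its root directory; the function itself is illustrative.)

```go
package e2esketch

import (
	"os"
	"path/filepath"
)

// cleanupManagerState removes the resource-manager checkpoint files
// between test phases so one phase cannot see another's state.
func cleanupManagerState(kubeletRootDir string) error {
	for _, name := range []string{"cpu_manager_state", "memory_manager_state"} {
		err := os.Remove(filepath.Join(kubeletRootDir, name))
		if err != nil && !os.IsNotExist(err) {
			return err
		}
	}
	return nil
}
```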
D
I looked to see whether this is just an OpenShift issue or whether it's upstream as well, and from what I can tell it's a runc issue, so it should affect both upstream Kubernetes and OpenShift. But it has been difficult to track down, in part because there really aren't any resource utilization tests for the kubelet right now.
D
I asked SIG Scalability and they just said it doesn't exist: the API server and a bunch of other components have this, but not the kubelet. So I have filed this issue asking, maybe we can do this? It would still probably help for regression analysis.
D
We'd have to go and pull pprof profiles and that kind of thing, but it would at least make it a little bit easier to figure out where to bisect from and where a regression started, and that kind of thing. Because in this case there were some runc changes, but I still don't have a smoking gun; I've been trying to get a pprof out of a local-up-cluster setup, but we were certainly seeing it in OpenShift. I think part of the problem is that a CPU regression is really hard to reproduce on a local machine, whereas that's much easier on a large cluster that's actually running some load.
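(For anyone trying the same, one known way to pull such a profile is through the API server's node proxy, since the kubelet exposes /debug/pprof when profiling handlers are enabled; the node name below is a placeholder, and a running `kubectl proxy` plus appropriate node-proxy RBAC are assumed.)

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// Fetches a 30-second kubelet CPU profile via `kubectl proxy` running
// on its default address. Inspect the output with `go tool pprof`.
func main() {
	node := "example-node" // placeholder: substitute a real node name
	url := fmt.Sprintf(
		"http://127.0.0.1:8001/api/v1/nodes/%s/proxy/debug/pprof/profile?seconds=30", node)
	resp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, err := os.Create("kubelet-cpu.pprof")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	if _, err := io.Copy(out, resp.Body); err != nil {
		panic(err)
	}
}
```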
D
There was clearly a CPU regression in the API server as of whatever that date was, at, I think, the highest percentile.
D
I mean, I don't know, I'm not worried about the API server; that's not our problem. But I was asking where I can find this for the kubelet, and it turns out it doesn't exist.
D
So this does not exist for the kubelet, but if you click on the dropdown where it says kube-apiserver, there are a lot of different components being monitored here, but not the kubelet. So I said we should also have the kubelet, and they said they would love that, but they don't have anybody to implement it, so please file an issue.
D
And if you haven't seen these dashboards previously, this is perf-dash.k8s.io, and you can view all these various stats that we collect. There are also some cluster SLOs that get collected here, and different jobs and whatnot. So you can see things like that API server CPU regression I found, which was on 1.20, but there's other stuff.
A
Okay. First of all, I'm interested myself, because performance bugs are really hard: you need to reproduce them and you need a stable environment. But then, if somebody takes this issue, where would they get started?
D
Yeah, so luckily this is the perf-tests repo, which is separate from k/k. There's this thing called ClusterLoader2, and I would assume there's probably already a pattern for all this other stuff that's scraped; it's just not set up for the kubelet. I think the perf-tests are mostly maintained by Googlers.
D
So if someone at Google wanted to pick this up, they would probably have much better luck finding people internally to help than, for example, I would, because I think a lot of that team is at Google, and a lot of them are also in Europe; a lot of them are in the Warsaw office, I think.
D
They also have an on-call in their Slack, which is, I think, a best-effort on-call rotation, and they respond to people during, I think, Europe business hours, so it does not really overlap with me being on Pacific time, but there are folks there.
A
Yeah, we can chat with the SIG Scalability folks; maybe they have somebody to help. But you said you already spoke with them.
D
Well, I just added a question in the Slack channel and they said they would be interested in somebody implementing this, and so I filed this issue based on what they said.
A
Okay, yeah. I'm just afraid that whenever we add some tests, we need to make sure we actually start looking at them; otherwise it's just work that will degrade over time.
C
Good, thank you. In general, you can just make it required in the lane. I know that one of the scale tests is required, probably for the API server, and if you have a regression and it fails for some reason, you just don't pass CI.
D
Yeah. So we had, I don't know if you were there that week, Ben Elder came from SIG Testing to chat with us about pre-submit tests, because we needed to add one for CRI-O, since there were a lot of things getting broken on CRI-O and then having to get retroactively fixed.
D
And so I know that, basically, there are a lot of issues now, especially with having removed Bazel, in terms of pre-submits being pretty expensive, because they run on every single PR: every single one of those jobs runs on every single PR. And so I know that we're basically trying to move away from having more pre-submits to having fewer pre-submits, and instead running more periodics, because the periodics, I think, are much cheaper to run.
D
And as long as we're responding to the failures we're seeing there, it's somewhat of a better way to catch things, because, you know, somebody has a typo, for example, and their thing doesn't build, and then all the pre-submits fail, so we don't get a lot of good signal in terms of where tests are actually flaking. So, anyway, I have been told that we should try to move away from pre-submits and move more towards post-submits, periodics, and all of that stuff.
D
Yeah, I don't know what jobs are populating it, I don't know what infrastructure it's running on, I don't know who owns this dashboard. I would strongly prefer not to have a separate node dashboard, and that we instead hook into all of the rest of the scalability test dashboards.
D
Because, yeah, that one is basically unmaintained; I don't know that it would be worth setting it up as a separate thing.
A
Okay, yeah, so let's take a look at triage.
A
Okay, this is the sheet, not assigned.
A
Great, okay, it doesn't need anything, perfect. I think we can cut it short today for testing, unless anybody has any more topics. Any questions?
A
Okay, let's check the calendar as well, yeah. Do we want to go through the feature board? I would just cut it short today.
D
I'm happy to either do a quick run through the board or cut it short. I've been a little bit behind.
A
Yeah, let's cut it short and try to do as much offline as possible. I did some yesterday, but not too many.
A
Okay, thank you, everybody.