From YouTube: Kubernetes SIG Node 20210511
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A
All right, so welcome everyone to the May 11th SIG Node meeting. We have a few items on today's agenda along with our regular order of business. Sergey, did you want to — I don't know if you're available — do you want to give any updates on where we are with incoming and upcoming code, I guess especially in light of probably focusing on KEPs?
B
Yeah
I
first
became
I
don't
have
much
update.
I
pasted
the
table
and
I
was
super
under
the
water
last
week,
so
if
somebody
else
can
say
something
what's
happening
in
regards
to
signals,
I
mean
I
think
this
table
is
a
great
way
for
me
to
just
review
what
happened.
So
I
will
do
that.
A
Sure, yeah. I would say at a macro level, I know for myself I've been focusing on helping people shepherd their KEPs to the upcoming deadline — I guess the 13th. And so maybe, in the spirit of just giving an overall update for that, I don't know if Mrunal and Elana wanted to talk through the items that we were tracking in the shared Google doc; at least we could.
C
Yeah, I mean, from the enhancements team's perspective, they're tracking in the KEP spreadsheet for the release, rather than, like, our doc. So after last week's meeting, everything that we agreed upon that we wanted to do for this release I put into the spreadsheet, and anything that didn't make it — because it was red, or there wasn't a PR, or anything like that — I did not add to the spreadsheet. So I can —
C
I don't have screen sharing, but I can at least give you sort of the stats. According to the release team, they are currently tracking 22 enhancements for SIG Node right now, and one has merged; the rest are at risk. The deadline is this Thursday, at which point things need to be merged — like, you know, meet all of the criteria — in order to be considered for inclusion in the release, and for anything that misses that deadline an exception would need to be filed.
A
Yeah
so
at
least
for
myself,
when
going
through
this
this
morning
and
will
continue
this
afternoon,
I
think
these
set
qualified
domain
name
kept
graduating
at
the
ga.
That
was
tags,
so
I
think
across
voytech
and
myself
that
was
good.
The
setcomp
as
default
work
that
tim,
eau,
claire
and
sasha
were
pushing.
I
I
had
to
prove
that
as
well,
that
looked
pretty
good.
A
I
know
marinal
had
a
cap
open
just
before
this
call
on
priority
ordering
of
shutdown
among
priority
classes.
I
think
I
had
one
minor
knit
on
that,
but
that
also
looked
good
and
then
I
know
alana
had
a
number
of
comments
on
your
swap
one,
but
I
think
we're
getting
there.
A
Yeah, you should be good to go now.
D
We
need,
I
didn't
miss.
The
basement.
Understanding
is,
should
be
okay
for
this
one,
that's
the
stuff,
I
think,
can
we
maybe
we'll
discuss?
Oh
by
the
way.
Another
thing
debug
container
also,
I
think,
should
be
okay,
it's
more
it's
from
like
the
cap
owner
how
much
his
time
committed
for
that
one.
But
I
think
the
car
provided
basically
is
ready.
A
All
right
cool,
so
alana
do
you
have
this.
C
Yes,
this
is
a
spreadsheet
here,
so
these
are
everything
that
the
release
team
is
considering
at
risk
right
now,
so
here's
the
one
that
they
have
said
is
tracked,
which
I
guess
has
been
merged
and
all
that
other
than
that
one
which
has
been
merged.
I
would
not
trust
what
it
says
in
this
pr
status
column
right
now,
because
I
know,
for
example,
this
one
says
prr
approved,
but
it's
not
so,
but
we're
hopefully
going
to
use
this
once.
C
This
is
all
filled
out
in
this
column
to
be
able
to
track
whether
or
not
prr
is
done
and
when
all
of
the
prs
have
merged
as
an
update
for
caps
and
they
meet
all
the
criteria,
then
you
know
this
will
go
from
at
risk
to
being
tracked.
C
These
are
the
enhancements
contacts
for
each
thing.
So
this
is
the
person
on
the
in
on
the
release
team,
who's
tracking
that
enhancement
and
is
making
sure
that
you've
met
all
of
the
criteria,
and
I
think
almost
all
of
them
are
something
graduating.
We
have
two
deprecations
and
it's
kind
of
a
mix
across
the
board
between
alpha
beta
and
I
think,
there's
only
a
few
things
going
to
stable,
so
yeah.
A
The
thing
I'm
confused
on
is
reconciling
don's
feedback
and
I
guess
seth's
concerns
and
the
author's
concern
on
container
notifier
and
now
that
that's
marked
as
tracked.
That's.
A
Yeah, I know much of my afternoon is going to be spent reading the rest of the KEPs here. So hopefully we can — yeah.
D
Yes — I'm so sorry I brought this back. We couldn't find the people on this one, but we do definitely want to find the people as soon as possible; for this release there's no one — everyone is overloaded. Okay, okay, yeah.
C
I'm
really
not
sure
why
the
release
team
called
it
track,
maybe
because
tim
hawkin
put
an
lgtm
and
an
approve
on
it,
but
there's
no
prr
done
here,
so
I'm
not
sure
and
and
also
tim
is
not
approver
for
sake
node,
so
I
mean
we
can
go
talk
to
the
enhancements
team
and
tell
them
that,
but
this
isn't
merged.
So
I'm
not
sure
why
that
would
be
called
tracked.
D
I
think
they
just
misunderstand
team
only
just
because
in
the
past
we
have
like
a
back
force
like
the
even
direct
high
involved
several
meeting
with
team.
It's
more
is
from
the
scope
and
the
content
notifier
scope,
because
that
could
be
huge
and
then
api
how
we
are
represent.
So
I
think
the
teams
also
have
some
compromise
this
time
and
okay
with
the
lyric
down
scope
and
also
okay,
with
the
current
api,
similar
like
the
vpa
right
so
back,
we
agree
about
the
design
scope,
but
there's
the
back
force
on
the
api.
D
So
so
that's
kind
of
he
only
approved
on
the
he's
on
the
api
level.
So
that's
different.
C
Yeah,
because
I'm
just
looking
at
like
the
this
pr
column,
I
don't
think
it's
accurate
because
I
have,
I
think
I
was
very
close
to
pr
approving
on
this
one,
but
I
did
not
yet
approve
it.
I
know
that
this
one
has
not
yet
been
approved
for
me
and
so
on
and
so
forth,
so
someone's
probably
enough
to
go
through
and
update
this,
but
at
least
you
know
this
is
this
is
the
list
of
what
the
enhancements
team
is
tracking
and
the
issue
numbers
and
whatnot?
C
A
Yeah, so at least for my part, I think I've gotten through a number of these KEPs this morning; I'll get through the latter half this afternoon, and I hope to have any comments needed for those in by end of day today. Hopefully that won't be too bad. And anyone else in the community that wants to do that — you know, please let your comments be known. Thanks, Elana, for the pointer to this.
F
Not much — the only item that I was concerned about is whether we can identify a code reviewer in the June–July time frame. David Ashpole, and even Jordan Liggitt and Tim to a certain extent, looked quite closely at the code before Tim figured we probably want to remove the allocated resources from the API — from the pod spec — and move them to checkpointing in the kubelet. So my plan is to take that code, which has been pretty well reviewed, and move it — lift and shift, surgically, for most parts of it — to the latest code base.
F
So
the
minimal
review
would
be
required
for
the
for
the
status
updates
and
api.
The
pod
spec
update
and
the
container
the
sync
the
sync
pod
loop.
F
The
new
thing
would
be
your
checkpointing
in
the
kublet
that
will
be
new
code
and
I'm
wondering
if
there
is
a
reviewer
available
to
look
at
that
closely.
So
we
ensure
that
this
goes
in
with
high
quality.
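[For reference, a minimal sketch of the general shape such a kubelet-side checkpoint could take — the type name, helpers, and JSON-on-disk layout here are illustrative assumptions, not the KEP's actual design:]

```go
// Sketch: persist per-container allocated resources keyed by pod UID, so the
// kubelet could recover its resize decisions across restarts. All names and
// the file layout are illustrative only.
package checkpoint

import (
	"encoding/json"
	"os"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
)

// allocatedResourcesCheckpoint maps pod UID -> container name -> the resources
// the kubelet has actually allocated, which may lag the pod spec mid-resize.
type allocatedResourcesCheckpoint struct {
	Entries map[types.UID]map[string]v1.ResourceList `json:"entries"`
}

func save(path string, cp *allocatedResourcesCheckpoint) error {
	data, err := json.Marshal(cp)
	if err != nil {
		return err
	}
	// A real implementation would write atomically (temp file + rename).
	return os.WriteFile(path, data, 0o600)
}

func load(path string) (*allocatedResourcesCheckpoint, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	cp := &allocatedResourcesCheckpoint{
		Entries: map[types.UID]map[string]v1.ResourceList{},
	}
	if err := json.Unmarshal(data, cp); err != nil {
		return nil, err
	}
	return cp, nil
}
```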
F
Yeah — I think he was there for parts of it; at least the CRI part he looked at, and we made some changes to the KEP, so that's great. So if we can get him — I'm going to plan on starting; I'm going to start working on the code. What remains is to look at the PRR section. I added that and moved things around a little bit to line up with the latest templates. So I believe Derek and you can look at it, and Elana — Elana has already looked at it.
F
I
think
the
first
pass
and
for
the
most
part
correct
me
if
I'm
wrong,
but
you
think
it's
close.
So
we
need
three
of
you
to
sign
off
and
then
the
cap
should
be
official.
C
A
All right, awesome. The only comment is — I don't think the checkpoint stuff has evolved deeply since Lantao had looked at the area, but I'm happy to help out there as well. So —
G
Hey folks, just real quick — I'll just paste a PR. This is a candidate fix to address — it's actually documented in code — known races between setting up the container manager and getting node status. Sometimes the ephemeral storage allocatable data is not yet around, and so I've been spending a lot of time in the last couple of weeks figuring out a way to surgically improve this, to minimize the race condition without refactoring everything, because a lot of it is pretty hairy there.
G
That's exactly right, and it's actually known — like, there are code comments that describe cAdvisor's sort of role in the various flows. So there's some serialization that happens: cAdvisor comes sort of next, after the initial container manager instantiation, which means that we can't do a root fs query quite yet. In most cases, it seems the first node status request comes after cAdvisor has started and the container manager is set — I'm already at a super low level here.
A
Yeah,
so
just
trying
to
think
through,
like
what's
the
actual
impact
of
this
race,
because
last
release,
or
so
we
had
a
correctness
issue
in
the
cubelet,
where,
for
example,
it
might
have
been
able
to
list
watch
pods
but
had
not
yet
been
able
to
list,
watch
the
node
resource
itself
and
then,
as
a
consequence,
there
was
an
issue
where
the
cubelet
could
launch
a
pod
that
may
not
have
been
actually
feasible
to
schedule
on
that
local
node,
and
then
we
had
done
a
number
of
fixes
to
try
to
get
right
in
the
first
fix,
resulted
in
kind
of
a
it
was
more
correct,
but
then
was
slower
and
then
impacted
cube,
adm
startup
times,
and
then
we
iterated
together
to
try
to
get
a
faster
fix.
A
That
was
also
correct.
What
I'm
wondering
here
is
like
this
seems
like
it
will
more
greatly
impact
cuba
dm
startup
times,
and
I'm
wondering
if,
if
there's
a
budget
that
this
use
case
is
being
driven
from
that
says,
you
know
how
quickly
we
expect
all
these
things
to.
G
Yeah,
that's
a
that's
a
great
call.
I
think
all
that
is
actually
open.
So
the
pr
right
now
has
some
retry
functionality
under
the
with
a
timeout.
So
we
can
drive
the
timeout
super
low.
If
we're
concerned
about
that.
But
to
answer
your
first
question
really
the
repro,
the
canonical
repro
is
the
cube
adm
configuration
that
includes
local
lcd,
so
you're,
delivering
an
fcd
pod
via
etsy
kubernetes,
manifests
and
since
120,
that
includes
ephemeral,
storage
requirements
and
for
most
qmedium
joins.
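[For reference, a minimal sketch of the retry-with-a-timeout idea described here — the helper name, interface, interval, and timeout knob are illustrative assumptions, not the actual PR:]

```go
// Sketch: poll cAdvisor for root filesystem info with a bounded timeout, so
// node status setters don't report zero allocatable ephemeral storage while
// cAdvisor is still warming up. Interval and timeout values are illustrative.
package nodestatus

import (
	"fmt"
	"time"

	cadvisorapiv2 "github.com/google/cadvisor/info/v2"
	"k8s.io/apimachinery/pkg/util/wait"
)

// rootFsProvider abstracts the kubelet's cAdvisor handle for this sketch.
type rootFsProvider interface {
	RootFsInfo() (cadvisorapiv2.FsInfo, error)
}

func rootFsInfoWithRetry(c rootFsProvider, timeout time.Duration) (cadvisorapiv2.FsInfo, error) {
	var info cadvisorapiv2.FsInfo
	err := wait.PollImmediate(100*time.Millisecond, timeout, func() (bool, error) {
		var err error
		info, err = c.RootFsInfo()
		if err != nil {
			// cAdvisor may not have finished its first housekeeping pass;
			// keep polling until the timeout expires.
			return false, nil
		}
		return true, nil
	})
	if err != nil {
		return info, fmt.Errorf("timed out waiting for cAdvisor root fs info: %w", err)
	}
	return info, nil
}
```

[As G notes, the timeout can be driven very low if kubeadm startup latency is the concern.]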
A
I can read up on the issue — I was just mostly concerned about correctness fixes that then introduce latency, which then resulted in a lot of work. And on this particular issue, I don't know if we have a better, faster path to get more time squeezed out. So anyway, I was just curious on that one; I'll read it.
G
Yeah,
no,
I
think
it's
a
great
call.
I
think
that
should
be
totally
under
consideration.
I
mean
I
plan
to
actually
go
after
this
go
to
cube
adm
and
maybe
advocate
that
we
roll
back
the
ephemeral
storage
requirement,
but
I
feel
like
it's
the
best
thing
to
do
due
diligence
here,
because
there
is
a
race,
and
so
if
we
can
identify
that
race
and
make
a
decision
on
what
we
want
to
do
about
it,
that's
really
the
foundational
thing.
G
So
I
will,
I
think
the
pr
I
have
in
place
is
totally
open
to
to
even
not
retrying.
We
can
just
do
it
just
in
time.
If
we
don't
have
allocatable
storage,
we
can
try
once
to
see
if
c
advisor
has
started.
I
mean
it's
it's
hard
to
explain
the
stuff
in
the
abstract,
get
to
kind
of
know
the
code,
but
that
would
that
would
mitigate
any
potential
like
20.
Second
30.
Second,
warm-up
delays,
waiting
for
allocatable
storage,
on,
like
the
nominal
cubitium
case,
even
when
ephemeral
storage
isn't
a
factor
yeah
that
makes.
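[The just-in-time variant, under the same illustrative assumptions as the earlier sketch — a single attempt at the point of use rather than a poll loop:]

```go
// Sketch: single just-in-time attempt. If allocatable ephemeral storage has
// never been gathered, ask cAdvisor once at the point of use and report
// "unknown" rather than blocking startup; the next status update can retry.
func allocatableEphemeralStorage(c rootFsProvider) (capacityBytes uint64, ok bool) {
	info, err := c.RootFsInfo()
	if err != nil {
		// cAdvisor has not started yet.
		return 0, false
	}
	return info.Capacity, true
}
```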
H
Yeah, I have some comments — I'm Aldo from SIG Scheduling. I was looking at this issue because I saw it in the scheduler: sometimes in the log we also get ephemeral storage zero for some time, but that's fine, because the scheduler will retry, right? So in that scenario it wouldn't be an issue — you wouldn't get pods in the kubelet that require ephemeral storage.
H
Then
at
that
point
you
have
a
pod
that
should
should
work
in
that
node
and
then,
after
the
restart
it
it
fails
and-
and
the
issue
is
that
we
have
remaining
parts
or
fail
pods
that
don't
get
don't
don't
get
a
garbage
collected.
So
that's
pretty
much
the
issue
and
that
you
just
get
those
spots
that
need
to
be
recreated
by
some
other
controller
right.
So
my
my
suggestion
or
my
question
would
be:
is
it
possible
to
leave
any
pods
or
like
pots
that
don't
require
ephemeral,
storage
unaffected
by
the
delays?
A
Yeah,
so
for
each
of
those
paths
we
have
to
work
our
way
through.
On
that
I
mean
all
things
are
good,
so
it's
probably
possible
on
the
static
pod
issue,
like
we
had
to
delay
a
start
of
a
static
pod
until
we
had
known
that,
we
had
in
the
past,
been
able
to
list
watch
the
node
from
the
api
server
so
that
there
is
some
parts
of
the
code
we've
added
gating.
A
But
I
think
if,
if
you
had
a
test
case
for
the
static
pod
scenario,
that
was
like
consuming
ephemeral
storage
and
we
showed
it
failing-
that's
a
great
way
of
us,
then
getting
it
fixed
right
so
that,
if,
if
that's
impacting
anyone
right
now
and
even
the
cubadm
community,
we
can
better
enriching
our
test.
Cases
is
a
nice
way
of
ensuring
that
we
get
it
working.
A
I
guess
if
it
did
regress,
but
the
actual
like
where,
in
that
sync
loop
or
in
the
pod,
at
mission
check
on
the
cubic
side
the
handle
they
re-cue
it
for
later.
I'd
have
to
think
through.
You
know,
with
someone
else
together
on
the
best
place
to
put
that.
G
Yeah,
the
problem
I'll
have
a
brief
comment
during
that,
but
the
challenge
there
is.
That
was
my
first
approach
as
well.
But
if
you
add
retry
tolerance,
there
you're
you're
not
really
actually
guaranteed,
depending
on
what
kind
of
race
conditions
area
you're
in
that
that
it's
ever
going
to
succeed
under
the
hood.
G
So
I
I
felt
like
the
best
most
surgical
change
would
be
to
at
the
point
where
you're
actually
gathering
data
to
detect
when
ephemeral
storage
has
not
ever
been
gathered,
which
is
the
particular
race
at
the
start
of
the
container
manager
and
wait
on
block
on
c
advisor
for
a
small
amount
of
time.
Trusting
that
in
most
cases
it's
going
to
come
up
quickly
after
and
you'll
be
able
to
get
that
data.
A
Zero,
maybe
I
don't
know
jack,
even
if
you
had
a
look
at
your
pr,
if
you
had
an
ede
in
needy
node
that
just
simulated
starting
a
static
pod
with
the
thermal
storage,
that's
probably
even
if
you
don't,
the
cube,
adm
community
doesn't
leverage
it.
It
probably
at
least
gets
us
on
a
path
to
know
that
we're
not
going
to
have
issues
where
static
pods
would
be
consuming.
It.
G
Yeah,
that's
literally
what
I've
been
running
h,
a
at
a
control,
plane,
cube,
adm,
build
cluster,
so
yeah.
G
A
Yeah
yeah
all
right
cool,
all
right,
well
I'll,
look
at
the
pr
and
go
from
there
jack
thanks
a
lot.
Thanks
tim.
I
know
we
talked
about
the
psp
release
last
week.
Were
there
new
discussion
points
we
want
to
talk
to.
C
I had a quick thing, which was: I have gone through and updated the spreadsheet, and everything for which I could find a PRR reviewer I put into the spreadsheet — at least half of them don't even have someone assigned. This is mandatory for your KEP to merge. So some of them I've gone and commented on and said please assign one; but if I haven't, or the release team already has, please make sure that you do that.
C
I have a quick question — give me a moment. I'm trying to get a sense of what the current status of syscalls is, and of marking new syscalls as safe. My reading of the recent enhancement proposal is that it was largely stabilizing, or, like, documenting the existing allowed-and-safe syscalls mechanism. Is there anything sort of on the roadmap for adding new syscalls, or where should I be asking about how to add new syscalls that should be safe?
A
It's possible that new things have become safe that were unsafe at that moment in time. So if you have a more updated set that you can demonstrate are safe, and we know the kernel versions at which they're safe to be exercised —
A
Yeah, so I think before you even jump into a KEP or code change, just share what it sounds like you may have discovered in your —
D
You
can
also
look
at
the
kubernetes
kubernetes,
have
the
system,
validation
test
and
the
link
against
that
test,
and
then
say
what
do
we
put
there?
If,
if
I
wrote
her
a
long
time
ago,
we
basically
just
3.18
up
and
about,
but
we
should
refresh
that
one,
we
totally
okay
to
refresh
that
whoa,
because
that's
the
couple
years
ago
we
have
that
for
kubernetes
yeah.
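[For reference, the check D describes lives in the system validation code (e.g. kernel_validator.go, mentioned below); here is a plain illustration of comparing the running kernel against a version floor like the 3.18 one mentioned — an assumption-laden sketch, not the real system-validators API:]

```go
// Sketch: report whether a kernel release string (e.g. "5.4.0-80-generic")
// meets a minimum major.minor floor, in the spirit of the kernel validation
// discussed above. Illustrative only; not the actual validator code.
package kernelcheck

import "fmt"

func kernelAtLeast(release string, major, minor int) bool {
	var maj, min int
	// Sscanf stops at the first non-matching input, so suffixes like
	// ".0-80-generic" are ignored after the major.minor prefix.
	if n, err := fmt.Sscanf(release, "%d.%d", &maj, &min); err != nil || n != 2 {
		return false
	}
	return maj > major || (maj == major && min >= minor)
}
```

[For example, kernelAtLeast("5.4.0-80-generic", 3, 18) would return true against the old 3.18 floor.]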
A
Yeah, yeah — so there's, like, a kernel_validator.go that might be what I want to check out. Thank you. All right, all the best, and we look forward to hearing about that next time. Talk later — bye, everyone; bye, folks.