From YouTube: Kubernetes SIG Node 20230411
Description
SIG Node weekly meeting. Agenda and notes: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.adoto8roitwq
GMT20230411-170451_Recording_1522x928.mp4
A
Hi everyone, welcome to the SIG Node weekly meeting on April 11, 2023. We have a couple of topics on the agenda. I see Dawn already responded to Kevin's request for approval on the DRA KEP, so we can quickly move on to the second topic, which is the n-3 skew by Jordan and Derek.
B
Yeah, Jordan, do you want to kick off the discussion on this? Yeah.
A
I just gave you host, Jordan. Yeah, great, I think this works.
D
Yeah, so I wanted to kick off a discussion. I'm kind of making the rounds of the various SIGs that touch things on the node or care about skew policies, so I talked with Cluster Lifecycle last week and SIG Arch last week, and then Node and Network this week. Node and Network are the two SIGs that actually own node components, the kubelet and kube-proxy.
D
So I care a lot about what folks here think about this, but the too-long-didn't-read version of this is: it would be really great if the oldest node that we support and the newest control plane we support work together.
D
That's the goal, and that was actually the goal of the current skew policy, which says n-2 nodes support current control planes. But when we moved to a yearly support period, I guess a year or two ago, we realized that users actually need a couple of months of overlap after we release a new version for them to qualify and upgrade to it. And so, if we release three minor versions in a year, we actually support the oldest minor version...
D
...for a couple of months after we cut a new minor version, so there's roughly a 14-month support window, and that two-month period is intended to let users qualify and upgrade. And so, if we strictly support nodes that are two versions older, then in order for users to stay within the supported skew, they actually have to upgrade their nodes twice if they want to leave their node pools at the oldest supported version and then jump them to the newest supported version.
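To illustrate the arithmetic Jordan describes, here is a minimal sketch of the skew check in Go. The helper and the hard-coded version numbers are illustrative assumptions, not code from Kubernetes or the KEP.

```go
package main

import "fmt"

// kubeletSupported reports whether a kubelet minor version is within the
// allowed skew of a control-plane minor version (both on the 1.y series).
func kubeletSupported(controlPlaneMinor, kubeletMinor, maxSkew int) bool {
	return kubeletMinor <= controlPlaneMinor && controlPlaneMinor-kubeletMinor <= maxSkew
}

func main() {
	// With n-2 skew, a 1.25 node pool falls out of support as soon as the
	// control plane moves to 1.28, so it must be upgraded twice
	// (1.25 -> 1.26/1.27, then -> 1.28). With n-3 skew, one upgrade suffices.
	for _, skew := range []int{2, 3} {
		fmt.Printf("control plane 1.28, node 1.25, n-%d skew: supported=%v\n",
			skew, kubeletSupported(28, 25, skew))
	}
}
```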
D
Talking with Derek and other folks, it's pretty clear that node upgrades in particular are way more disruptive to users' workloads. For control plane upgrades you normally have one or three control plane members, and user workloads don't have to take dependencies on control planes. So if the user workloads are running pods that don't actually care about the kube-apiserver, they're happy to keep running even while the control plane's upgrading.
D
Node minor-version upgrades, though, require draining or spinning up new nodes, so every workload in the cluster is going to have to get recreated or restarted in the process of doing a minor node upgrade. Making people do two of those, especially when you might have thousands of nodes in a cluster, is actually way more disruptive. So the goal was to see if we could let people just do a single node pool upgrade to get from the oldest version to the newest version.
D
So that was the goal. The next thing we looked at was what it would actually cost us to do this, and there were three types of changes that I looked at. This is where I would like feedback from folks in SIG Node.
D
Let me know if we've missed types of work or types of changes that would be impacted by expanding to one more version. The three types of changes we looked at were: how fast we can roll out new features, how quickly we can drop support for old, deprecated, no-longer-supported features, and then the third category was REST API changes. The third category is actually in really good shape; ever since around 1.19, I think, all the things that the kubelet and kube-proxy use are stable, they're at v1 level.
B
Jordan, maybe just one other comment that I realized we didn't capture in our discussion: the alternative is that we could, as a community, just say we'll do more to support in-place updates of kubelets. I think it's worthwhile to maybe put some language in the KEP to explain why we wouldn't necessarily recommend that, because typically that would impact the kubelet's ability to adapt to operating system updates.
B
So, for example, doing a cgroup v1 to v2 migration in place is not really a thing we can do. So maybe just some language in here that calls out that the kubelet could have chosen to do in-place updates.
B
But we as a community probably feel that that is a bad idea, because it would inhibit our ability to actually keep up with the pace of operating system innovation. So far it's been a benefit that we always recommend users drain their nodes before doing that maintenance. But I could see some folks pushing back on this and saying, well, why doesn't the SIG just say you can change the kubelet binary, you don't even need to do a drain? I don't think it's a good idea for the SIG to take on that posture, but I just want to call it out.
D
Just muted, Dawn? Yep, I think you were; feel free to unmute and interrupt me at any time if you have something you want to say. In terms of what work we would actually take on to achieve the goal of upgrading a node pool from the oldest version to the newest version with as little disruption as possible: I agree that supporting in-place upgrades across minor versions is probably way more work and therefore way less likely to actually get done.
D
I really like incremental improvements that give us a lot of bang for the buck, and so I did want to jump to some of the analysis. I actually looked through those types of changes over the past couple of years to see how many new features we actually delayed until the oldest node supported the feature, and there were actually very few of them.
D
Typically, we roll out new features and we just say: if you want to use the new feature, you have to upgrade your nodes to a version that supports that feature, which I think is probably pretty reasonable. You want to use a new feature, you have to upgrade to a version that has that feature.
D
There may be user-experience things we could improve there, to make it more obvious when a user tries to use a feature and their nodes aren't new enough: make it fail in nicer ways, or tell them earlier in the process. We can make user-experience improvements, but generally we don't wait for all supported skewed nodes to support a feature before we say you can enable this and use it on newer nodes.
D
Most of the time, the only times we will delay is when it's a security issue. I think Pod Security Standards waited to relax requirements on Windows nodes, or Windows pods, until we were sure that all kubelets would honor the pod OS field.
D
More common was dropping deprecated functionality from the control plane once the n-2 node didn't need it. There were a few instances in SIG Storage around dropping in-tree volume plugins once the n-2 kubelet was guaranteed to be using CSI migration.
D
But again, the cost of just letting old code hang around for one more release and then dropping it is actually pretty low. At least that's the feedback I've gotten from SIG Node; I still have to talk to some of the other SIGs. But if we can make users' lives better by letting them upgrade their nodes, at the cost of ignoring a deprecated package for one more release before we delete it, that doesn't seem like a terrible trade-off. It's not impacting velocity of new features.
D
So then this was just showing homework. I went back two years, to 1.22 I guess, and looked at enablement of new features, removal of deprecated stuff, and then removal of beta APIs, and tried to see which of these would have caused problems with n-3 skew. There was only one example of a new feature that I could find; sorry, two examples, both of which were SIG Auth, so maybe I'm shooting my own SIG in the foot.
D
What I'm looking for from this SIG is sort of a gut reaction to this support-the-oldest-node-against-the-newest-control-plane goal, and then pointing out anything that we missed in this analysis in terms of types of work or types of features. If there were features that we waited to roll out until the oldest node supported them that I didn't have in this list, that would be helpful to know. And then some of the alternatives or other ways of accomplishing this goal, like what Derek pointed out. So I'll stop talking there and let other people talk.
B
My response, Jordan, is that there's just widespread agreement that this is the right thing to do, and there was an oversight when we changed the project support policy, I think.
B
Do people find the language in the KEP clear about how to handle new features? I'm trying to think about features that are in flight right now from SIG Node.
B
For in-place pod resource resizing, is the language in the KEP here clear about how we would choose to enable that, potentially in the future, on by default on the API server side? Or is it unclear? Or, if there's a particular resource that the kubelet is not yet tracking but we've thought about tracking: I think Rinaldi, you and I, and Sergey and Dawn had a conversation about file descriptors, for example.
B
To me, the key thing is that we are comfortable with the language that's in the KEP, to understand when and how we choose to allow a feature to go on and off by default, or to give appropriate guidance to those who come to the SIG on the time frame they might be looking at for that feature to be on by default. If the language here is clear, that's good. If it's not, then that's probably the best thing we could gather together on as a community, to make sure that we give proper guidance going forward.
D
If there are specific questions around how this behaves on older nodes, that might be a good sample question to put there as a prompt. I will note that we already have n-1 and n-2 support, so hopefully people are already asking these questions and already thinking about how this rolls out, and hopefully the only impact of this proposal would be on people who are actually waiting until the n-2 nodes have a feature enabled to turn something on in the control plane.
D
So my sense is that it's not actually impacting most features. Most features enable in a way where the feature just won't work with an n-1 or n-2 node, and we would just tell the user: if you want to use this feature, you have to upgrade to a newer node.
D
I also included a link. Where was it? Here we go. One of the ways that we enable features today is that we just default them on in a given release, and then, as that release propagates back into skewed nodes, more and more releases support the feature. That's not actually a really terrific way to roll out features.
D
It kind of is, in some ways, but it's the worst of both worlds: it's slow to make progress, because there's roughly a four-month gap between each release, and it also leaves clusters that have skewed nodes in a state where maybe the feature is allowed at the control plane but doesn't work with older nodes. So the current state is actually something that could be improved, and I linked to a KEP.
D
Daniel has that KEP in progress, which is maybe trying to improve the way we toggle feature flags on, so that instead of just being tied to a release, it can actually be more cluster-aware. So, if you had a cluster where all the nodes and the API server are on the newest version, then great, the feature enables. But if you still have nodes on an older version that didn't support a feature, maybe we wouldn't default the feature on; maybe we would wait until your nodes supported it.
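A rough sketch of that cluster-aware idea, as Jordan summarizes it, is shown below. This is an illustrative assumption of how such a check might look, not the design in Daniel's KEP; the function names and versions are hypothetical.

```go
package main

import "fmt"

// minimumKubeletMinor returns the lowest kubelet minor version among the
// cluster's nodes (each value is the "y" in a 1.y version).
func minimumKubeletMinor(nodeMinors []int) int {
	min := nodeMinors[0]
	for _, m := range nodeMinors[1:] {
		if m < min {
			min = m
		}
	}
	return min
}

// enableByClusterState defaults the feature on only when the oldest node
// already understands it, rather than purely by control-plane release.
func enableByClusterState(featureMinMinor int, nodeMinors []int) bool {
	return len(nodeMinors) > 0 && minimumKubeletMinor(nodeMinors) >= featureMinMinor
}

func main() {
	nodes := []int{25, 27, 28}                   // skewed node pool: 1.25, 1.27, 1.28
	fmt.Println(enableByClusterState(26, nodes)) // false: a 1.25 node remains
	fmt.Println(enableByClusterState(25, nodes)) // true: every node supports it
}
```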
D
So I think there's room for improvement in how we do feature rollouts. I think that's orthogonal to whether we have n-2 or n-3 support, but for those who are interested, I would definitely encourage reading Daniel's KEP and weighing in with an eye towards skewed node and control plane rollouts.
D
Okay, that was all I had. If there are no more questions here, feel free to read the n-3 KEP and the questions there, or ping me if there are things I missed or you want to see, and we'll try to get it updated.
D
So, by default, if we don't make any changes in the control plane in a new release, then it has just as good support for n-3 nodes as for n-2 nodes. Ideally, if it's not disrupting plans that SIGs have in place, I would like to see the next version of the control plane, 1.28, support back three versions: keep as good support for 1.25 nodes as 1.27 had. I tried to do some forward-looking analysis to see what plans SIGs had, and SIG Storage was the only one that I could find. So yeah, I would like 1.28 control planes to support 1.25 nodes just as well as 1.27 control planes did. That's my goal.
A
Thanks. All right, thanks, Jordan and Derek. Folks on the call, please take a look at the KEP and chime in if you have any thoughts. Okay.
C
Sorry, I just had a comment that for the in-place pod resource update, I'd have to take a closer look at the KEP, but from what I can tell, maybe it will need some additional changes to ensure that if you request a resize on a pod that's running on a node that doesn't have this feature at all, then we reject that request early in the API server. Jordan, does that sound about right?
D
Maybe. Again, we already have the possibility of skew, right? You could be running on a node that's one or two versions older. So...
D
All right, so yeah, let's sync up and look at what the in-place design was going to propose for skew handling and see if this adjusts anything. Yeah, sounds...
B
Good. My read was that it would delay turning that feature on in the control plane by default by one release, yeah, but it wouldn't preclude the ability of others to use that feature in clusters that they knew, in their local deployment posture, were at a satisfactory level. So to me it just changed when it defaulted on. But yeah, if you and Vinay can sync up on that, that's kind of how I read our language.
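For the early API-server rejection raised just above, a minimal sketch of what such a check could look like follows. The version threshold, helper, and error text are hypothetical placeholders, not the actual KEP or apiserver validation code.

```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative assumption: first minor version whose kubelet handles resize.
const resizeMinKubeletMinor = 27

// validateResize rejects the request before it ever reaches the node when the
// pod's node runs a kubelet too old to perform in-place resize.
func validateResize(nodeKubeletMinor int) error {
	if nodeKubeletMinor < resizeMinKubeletMinor {
		return errors.New("node's kubelet is too old for in-place pod resize")
	}
	return nil
}

func main() {
	fmt.Println(validateResize(25)) // rejected: kubelet too old
	fmt.Println(validateResize(28)) // <nil>: allowed
}
```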
E
So we're still making good progress on updates. We've had some other work come out in the last week where we wanted to be, but we're also waiting, looking for feedback from one of the main members. I think we had Dawn assigned to this, but Dawn hasn't made a lot of the calls, so I'm looking for help, right.
B
Is this (apologies, I'm coming back from being out for a week) the same overall plugin design we were discussing in the past? To my knowledge, the issues that we were encountering were around bootstrapping.
B
The only thing I wasn't aware of is whether we had a satisfactory resolution to the bootstrapping challenges, or if there was a proposal to update them. I'll...
B
...look to read through the link here, but I think I would kind of plus-one Renault's feeling of maybe having folks give an update on the latest state of the discussions for those who haven't been able to attend, because otherwise, yeah, I was under the impression that we were still blocked on the core bootstrapping problem.
A
So maybe we can schedule it for the week after KubeCon. I think that brings us to the end of the agenda, and I think next week is KubeCon. So do we want to cancel the call or keep it?
A
Yeah, all right. Meanwhile, folks, reach out on SIG Node if you have anything that you need. Thanks for joining, see you all in a couple of weeks. Bye now.