From YouTube: Kubernetes SIG Node 20210928
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
B: No, no, that's fine! I was just — I guess it's the September 28th SIG meeting. We have a few items on the agenda. Some I'm not sure we'll get to full resolution on today, but at least we can get awareness. I guess, Sergey, would you be kind enough to run through today's agenda?
C: Yeah, absolutely. Hello, everybody. I think we can start as usual with a formal introduction: if you missed the last two weeks and don't know what was happening in terms of PRs and the work going on, you can click all these links for the created PRs and the closed and merged PRs.

C: I went through that, and there is nothing stuck along the way that needs to be picked up, and we're doing a very good job updating PRs — replying to PRs; 146 is an unusually high number, so we're doing great. I also wanted to remind everybody that the soft code freeze we announced before is in two weeks, and one of those weeks is KubeCon.
D: For the soft code freeze, I think I sent an email about this, so you should have it in your inbox on the SIG Node mailing list. I just wanted to add: this is the first time we've done this. We were hoping to get some features merged earlier. I know that everybody is very busy, but one thing we really want to avoid is everybody not having their PRs ready until the last week of code freeze, given how long the development cycle is.
D: So even if we don't get everything that we want merged by then — we've got a bunch of beta things it would be cool to graduate by that point — please have a PR ready for review by that point, so we don't spend all of the last week of code freeze just doing that. Also, I suspect that the "148 updated" count might have been somewhat my fault, since I got through a very long GitHub backlog this week.
C: Yeah, okay, let's go into the agenda. I think the first item is Skylar's, on enabling this for static pods.

E: Basically, she's implemented the pinned-images PR where, in CRI, you can specify an image as pinned and the kubelet does not try to garbage-collect it. So she wanted to quickly check whether we can get it in, or whether we need to open an enhancement for that.
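For readers following along, here is a minimal sketch of the garbage-collection behavior E describes. The types are simplified stand-ins for the CRI Image message (which carries a pinned flag), not the kubelet's actual implementation:

```go
// Sketch: during image GC, images the CRI runtime reports as pinned
// are never candidates for removal. Simplified, illustrative types.
package main

import "fmt"

type image struct {
	ID     string
	Pinned bool // reported by the container runtime over CRI
}

// imagesToGC returns the images garbage collection may reclaim:
// anything pinned or still in use is skipped.
func imagesToGC(all []image, inUse map[string]bool) []image {
	var candidates []image
	for _, img := range all {
		if img.Pinned || inUse[img.ID] {
			continue
		}
		candidates = append(candidates, img)
	}
	return candidates
}

func main() {
	imgs := []image{{ID: "pause", Pinned: true}, {ID: "old-app"}}
	fmt.Println(imagesToGC(imgs, map[string]bool{})) // [{old-app false}]
}
```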
A: They want to enable this — expand the scope to static pods. He or she — I don't know, they didn't make that clear — but from the agenda it looks like they want to expand it. I think we do have concerns about the things we talked about last time, when we discussed the pinned image; once we expand this to static pods, then we need a way to handle it.

A: So if it's a static pod, are we going to think about the current state? What about a static pod with a different version — even the same pod but a different image, a different image version? Which one should be evicted? All those kinds of complexities need to be considered here.
B: So, let's get feedback on — I guess what I was trying to figure out was: the KEP linked off the PR, which was 2694, basically says that the runtime decides which images are pinned, and I thought the PR that was presented had just implemented that. Is there a place where this was tied back to static pods?
A: The agenda item's author actually asked: can we handle static pods in this one, or do we need to open an enhancement for it? I do think we should open an enhancement — that's more complicated than the current, really small pinned-image change we discussed. At that time, actually, we didn't even mention static pods.
A: I also went through that PR. On its own, I think it should be mostly okay, but I do have a concern about people maybe abusing it; then we end up binding the node and we don't know how to recover, right? Today we just force-delete once we have run out of disk. So what's the rescue if we have those kinds of problems? This is why we need to discuss it anyway.
A: I will comment on this request, since she cannot attend, and ask for the enhancement request.

C: So we have action items on that. And I'm a little bit surprised that it was added into an existing KEP. On the CRI side, I don't know if it's only in v1 or whether it's backported to the current version of CRI.
H: So basically, what was going on is that I had made a pull request that enabled pods to be pinned in order for them not to be removed at garbage collection, and there wasn't —

H: So we needed it, probably, to be put through the process of adding a feature, because pinning the pods is a feature, and that's the plan.
H: Maybe I wrote it totally wrong, because I'm going from advice that Bernal gave me about bringing this up at this meeting, so I probably totally messed up how to write it.
A: I want to get back to this one. As Derek already mentioned, we already agreed when we discussed the pinned image in the past — so why are static pods being brought up at this time? Let's just forget about static pods and keep this to the narrow scope we agreed on; if we go to static pods, we need a more complicated policy. So let's just make sure the PR stays narrowly scoped. I did look at the PR, and I also didn't see that expansion in it.
A: So let's just make it clear: if we want static-pod support, come up with a new policy and enhancement — it cannot piggyback on the previous one. But the PR as it stands is actually aligned with the KEP, I think, if —
C: Okay, then, let's go on. There was a standing item about dynamic config. I replied in a comment, or mentioned — I think Parker cannot be here; that's why it's carried over. So, a reminder: if you are affected by dynamic kubelet config and you have a strong case to not delete it in 1.24, please step forward.
I: Yes, I just wanted to mention that it's ready for further review and approval. I think we — Ronald and I — are at the point where we think it's ready. We trimmed it down to just the core use case and tried to remove everything which is not directly related to the KEP.
A: Yes, I saw that — I also sent you a message — and we are going to review it; we are going to be the reviewer and also the approver on this one.
J: So I sent — I started a thread on the mailing list this morning. I also included sig-apps and put this topic on the agenda for the next sig-apps meeting, because it sort of touches apps and touches node.

J: Eviction currently sometimes prevents eviction of pods that are not ready and sometimes does not, and sometimes preventing a not-ready pod from being evicted will make it impossible to drain a node — there's an open bug for that. So at least some people are negatively impacted by that and expect eviction not to have opinions about pods that aren't ready, but other people are relying on eviction blocking the deletion of not-ready pods for other reasons.
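For background on the mechanism under discussion: a voluntary eviction is a request against the pod's eviction subresource, and a PodDisruptionBudget can cause the API server to refuse it. A minimal client-go sketch, with client construction omitted and placeholder names:

```go
// Sketch: requesting a voluntary eviction via the policy/v1 eviction
// subresource. If granting it would violate a PodDisruptionBudget,
// the API server refuses (HTTP 429) — the "blocking" behavior above.
package sketch

import (
	"context"
	"fmt"

	policyv1 "k8s.io/api/policy/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func tryEvict(ctx context.Context, cs kubernetes.Interface, ns, pod string) error {
	err := cs.PolicyV1().Evictions(ns).Evict(ctx, &policyv1.Eviction{
		ObjectMeta: metav1.ObjectMeta{Name: pod, Namespace: ns},
	})
	if apierrors.IsTooManyRequests(err) {
		// Refused by a PDB: too few healthy pods would remain.
		fmt.Println("eviction blocked, retry later")
	}
	return err
}
```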
J: I think Michael mentioned that in the pull request and then in the thread. So this has come up before, and it's kind of been unclear what the —

J: — regardless of what the PDB says. And recently — "recently" as in about a year ago — we started allowing deletion of a not-ready pod if the PDB status says there are enough healthy pods. So there's a mismatch between what the API server allows and what the disruption controller considers to be a healthy pod, and I think what we have today doesn't really make sense, and I'd like to try to figure out how to resolve this. So, the questions that I asked in the sig-node thread — there were three questions.
J: The second question was: does it make sense for eviction to block deletion of pods that are already disrupted? And the third question was trying to get feedback from people who are relying on the current behavior, and on how they handle some of the races and the lack of guarantees around the current behavior.
B: So, Jordan, one question I had: in some of the dialogue on the issue, there was special behavior documented around how you can prevent any voluntary disruption by setting, I guess, the PDB to zero. Is there any express desire to change that behavior, or is that independent of the PR that was linked?

J: I don't think there's a desire to change that behavior. I think the question is what counts as disruption, and so that was my first question: if a pod is not ready, is it already disrupted?
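The "PDB set to zero" pattern B refers to can be expressed as below — a minimal sketch; the namespace and labels are illustrative placeholders:

```go
// Sketch: with maxUnavailable set to 0, the eviction API will refuse
// every voluntary eviction of the selected pods.
package sketch

import (
	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func blockAllVoluntaryDisruption() *policyv1.PodDisruptionBudget {
	zero := intstr.FromInt(0)
	return &policyv1.PodDisruptionBudget{
		ObjectMeta: metav1.ObjectMeta{Name: "block-evictions", Namespace: "demo"},
		Spec: policyv1.PodDisruptionBudgetSpec{
			MaxUnavailable: &zero, // no voluntary disruptions allowed
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": "etcd-canary"},
			},
		},
	}
}
```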
B: Yeah, so I will admit that some of my confusion this morning, when trying to speak to both David and Michael on the topic, was with the terminology we sometimes use: "ready" versus "available", with the PDB referring to "available", where "available" only had meaning on ReplicaSets but not on pods themselves.
B: What I was trying to figure out was whether there is an unspoken use case desired for "don't disrupt the scheduled, non-terminal pod" — that's what I feel is coming up in the existing dialogue on the issue. And so, to me, I wondered whether the API is as documented or the API is as implemented, and whether we want to only tighten API behavior when the as-implemented behavior can still be restored, maybe with an undefined knob.
A: I partially get what you're saying, but I want to first get back to Jordan. Yes, yeah, it is true; we have discussed this many times. I haven't looked at that issue and the PR in detail, but it's not a surprise to me — we have had this kind of problem with the PDB in the past. For me, that's the cluster-level will for that workload or service, right? This is why I think a lot of terminology is being misused — for example, "eviction". When we first designed eviction —
A: — actually, it's a local decision, an optimization; preemption is actually the cluster-level counterpart of things like eviction. When I decide to evict, it's: I'm going to relieve the node's resource pressure. Or, when I want to drain the node, I want to evacuate all those things and then claim that the node is ready to be removed, properly removed. But now it's kind of mixed all over the place. And on that particular PDB issue — we had this kind of problem with ReplicaSets initially.
A: I also had a long discussion — not even within sig-apps — I had a long discussion with the API machinery team.
A: So, on this kind of thing, I really think the not-ready pod is disrupted, but that's not our decision. The decision belongs to the PDB, as the controller — it's their decision, right? It's not our decision. On the node side, we already mark the pod as running, ready, or not ready — and you have the ready-plus states in some cases — so basically the PDB acts based on that.
A: They should define the policy for the service's availability and decide how they are going to delete this pod, right? In the end, it is just: what is the policy they want? And Kubernetes cannot make that decision, because we could see that this pod, at this given time, due to all those problems, due to all those available-resource issues, is in crash loop — so we mark it as not ready, but it is admitted into Kubernetes.
A: We only run it: once it succeeds, it has run once, and its running state, based on the restart policy, determines whether we keep running it or terminate it — all those things. It is at the cluster level that the controller decides the service-level availability, or whatever availability they decide, and what the next step is. So that's the kind of thing we agreed upon at the early stage of Kubernetes.
D: So, jumping in — not on a terminology thing, but on a concern I would have, Jordan, if we change the eviction logic to be able to evict not-ready pods. One of the things that I've seen — there have been a number of different PRs.
D: I think there are a lot of cases where "not ready" is not necessarily a good enough signal, I think, to evict a pod on its own. I would say that this has to be an app-level consideration, because there's not enough context available in the kubelet to know.
F: I completely agree with that, and I would like to say that "pod disruption budget" was the wrong name for this; it should be called "application disruption budget". If you're thinking about disrupting a pod in the context of an application, that's different from thinking about disrupting an individual pod, and my premise is: the eviction API should take no action unless it can know for certain that the action it takes will not result in an unhealthy application.
F: If the application is already unhealthy — because all the pods are in crash loop, or what have you — it should take no action, kind of like a circuit breaker failing open: you requested this, I'm in an unknown state, I can't proceed. That will cover these cases where readiness is flapping because the kubelet restarted, or the kubelet might be going unreachable temporarily, or the pod was restarted.
F: For whatever reason, readiness can be quite transient, and I think this is primarily targeted at automated behavior — trying to drain a node and then remove it from the cluster in an automated way. Obviously, if you're sitting at the keyboard and drain fails, it's not a big deal; you can work through that quite quickly. But if you want a system that is constantly maintaining itself, you need this logic to account for lots of different scenarios.
F: — an unknown state, because we don't really have an application-level signal. Not all applications are behind Services, so just because the pod is labeled as not ready doesn't mean it's not doing useful work — especially if the node is unreachable — and it also doesn't mean that there won't be data loss if we were to remove that pod. That's the point I made on the list regarding our use of etcd in OpenShift: we use these canary pods because we run etcd as a static pod.
J: There are a few things that I'm trying to reconcile. The first is that we already allow deletion of not-ready pods in some cases, so the idea that eviction can guarantee —

F: But they do absolutely prevent voluntary disruptions. So there's always going to be a race between involuntary and voluntary, right? If I want to evict a pod and then all of —
F: — a different pod is impacted, there's nothing we can do to prevent that, because that crash could happen right before or right after; that's just what's going to happen. But if we have a reasonable level of certainty that the data we do have is accurate, we should be able to make good decisions with that data.
J: Yeah, I think the exceptional cases would be: a node crashes, and then another node crashes, and then a controller goes bananas and is deleting everything. Those exceptional cases are exactly the types of cases where we don't actually have great confidence that the data in the PDB is accurate — like, if the disruption controller isn't keeping up with updating the status, and we're like —
J: — "I had three healthy things and I only wanted two, and all of these are not ready, so cool: delete, delete, delete." That happens today. So I'm trying to reconcile this use of PDBs with the current behavior we have, I think.
J: The other thing is that for the controller to use readiness as a signal for the status count of healthy pods, but for the API server not to use it, seems confusing to me.
A: Jordan, I totally agree with you about the expected events. The problem is — and I also agree with Michael here — the unexpected events. At the node level, Kubernetes cannot predict them, right? Kubernetes doesn't even know whether it will itself crash together with the node. So all it can do is generate the data: when the node is available, generate availability data, and then there is a way at the cluster level to detect the network partition, to detect the crash loop.
A: So those kinds of information should be collected — which is kind of what we have, like the node timestamp, right? Sending the status timestamp.
A: This is what we are trying to do. Then the PDB or another controller makes a decision based on that; they cannot base it just on the pod readiness or a single status. They need to take other status into account in the controller. You can see this kind of debate in the Job controller and also in ReplicaSet — it has been debated many, many times. Whereas at the node level, what I can do is just manage the node-level devices, resources, and everything.

A: So we make our own decisions on those, but we give the data back — the feedback loop is pretty important here.
J: Like, we tell people to use drain as a tool for managing nodes, but it's pretty trivial to deadlock it in a way where you look at the node — no, not the node — you look at the pods, and they're not healthy and they're not available, and until they're deleted the replacements can't be spawned.
F: We have resolved the deadlock: if you are only attempting to evict an unready pod and you have enough ready pods covered by that PDB — eviction checks that today. So if you lose one node and there's only one replica, right, and we have properly tuned, highly available workloads, that node can be successfully drained today, no problem. And I think the concern is, okay —
F
We
should
definitely
close
that
gap.
That
seems
like
we
could
put
just
like
something
in
a
status
field
like
like
just
something
that
needs
to
be
popped,
so
the
disruption
controller
does
the
popping
and
the
eviction
does
the
pushing.
So
the
eviction
won't
be
able
to
complete
that
request
unless
that
field
is
empty
and
that
field
is
only
emptied
by
the
destruction,
controller,
etc.
That's
one
idea,
but
definitely
there's
no
deadlock
today.
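A minimal sketch of the push/pop handshake F is proposing — the field and function names here are hypothetical, not an existing Kubernetes API:

```go
// Hypothetical sketch of the proposal above: eviction "pushes" a
// pending disruption into a status field, and only the disruption
// controller may "pop" (clear) it after re-validating health.
package sketch

type hypotheticalPDBStatus struct {
	// PendingEviction is non-empty while an eviction is in flight;
	// further evictions are rejected until the disruption controller
	// clears it.
	PendingEviction string
}

// admitEviction is what the API server would run on an eviction
// request: reject if an earlier eviction is still unacknowledged.
func admitEviction(status *hypotheticalPDBStatus, podName string) bool {
	if status.PendingEviction != "" {
		return false // previous eviction not yet acknowledged
	}
	status.PendingEviction = podName // the "push"
	return true
}

// acknowledgeDisruption is the controller's "pop", done once it has
// observed the disruption and recomputed how many pods are healthy.
func acknowledgeDisruption(status *hypotheticalPDBStatus) {
	status.PendingEviction = ""
}
```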
F: Then there's a very large time period where pods can be running and the API status can be incorrect; and because it's unready in the API, we're saying we're going to delete this and check nothing else. I think that's the wrong decision.
A: I think, most of the time when it is not ready — I realize our intent here: it is "not ready, but not gone", right? It could mean the kubelet restarts it immediately and later it's back to ready, right? Or it could be a crash loop with backoff, where Kubernetes will not restart it for a long time — and that state is totally disrupted, to me, if we are talking about application-level availability.
A: So I think that's the missing signal for the PDB: they only look at readiness alone, without more detail — like how long this has been crashing since the initial crash, how many restarts, how many retries, the restart counter.
D: Yeah, there's some chat happening in the text chat, and I just wanted to address one of the things written there, because I mentioned we have this bug. There is another bug that was filed as a bug — I don't actually think it's a bug, but there was a bug filed against us talking about —
D: Well, you know, if you restart a kubelet, then pods show as not ready. And that's the sort of thing we need to consider here: "not ready" doesn't necessarily mean there's anything wrong with the pod. I don't think we can always assume that; we need more context in order to be able to evaluate whether something is actually disrupted or not. Because it would be a very big change in expectations for users to change the default to "oh yeah —"
D
We
just
assume
the
pods
are
ready,
unless
proven
otherwise
to
me
that
doesn't
seem
like
a
safe
default.
We
should
assume
that
the
pods
are
not
ready
until
proven.
Otherwise
I
think
which
is
the
current
behavior,
so
that
was
filed
as
a
bug
against
us.
So
david
had
a
comment
like
that
seems
like
that
bug
that
we
should
fix.
I
don't
think
we
should
fix
it,
because
I
don't
actually
think
it's
a
bug.
I
think
it's
just
a
mismatch
in
expectations.
D: Similarly, with this bug, a lot of the folks on the issue are saying things like, "well, but a pod that's crash-looping is already dead — you know, I consider it terminated." Well, the kubelet doesn't consider it terminated, and there could be any number of reasons. If it's crash-looping, sure, that's probably disrupted; but if we're just looking at "not ready", there could be any number of reasons why it's not ready.
D: It could just be that there was some issue with — you know, the readiness probe flaked or something like that. So I don't want people to conclude, "okay, it's not ready, definitely disrupted." I don't think that's true, and I don't think that's a bug.
A: They basically have one problem in particular: okay, the pod crashes, and then we keep it not ready, and the restart count keeps ascending; but the ReplicaSet looks only at "not ready", and then it keeps creating new pods, and keeps creating pods, but it doesn't remove the previous ones.
A: So in the end, with replicas equal to three, they end up with quite a few — like, more than ten — and tons of them are left behind in the not-ready state, taking up the node's resources. So with that logic, yeah, we need to tighten — I just read that off the direct chat — we need to tighten up the behavior here and make it more clear. So yeah, I agree.
B: What I was trying to figure out was: do we actually feel safe making this change in the community without allowing the ability to fall back to the prior behavior? This seems like a tightening without an ability to loosen, which feels risky for a GA API. That's why I was trying to find a way to say: I can have scheduled-but-not-yet-terminal pods be treated differently than scheduled-but-not-yet-ready pods, if that makes sense.
B: I mean, the tension here is: do we want to treat not-ready pods as disrupted or not? So, basically, having an option that allows you to say "I don't care about the readiness state; I only care whether the pod had ever started, or was in the process of starting" — that seems like a use case that might have been missed in the PDB discussion.
B: On some of the other things that you raised, Jordan — I think I was the one who said PDBs should ignore terminal pods; at least we all agreed that one was safe. But for this one, I think there are reasonable ways to —
B: — right now, I don't think the OpenShift use case that Michael is communicating is a disastrous product posture in any way, and I don't actually feel like the present behavior, or even the updated behavior, would have a material impact. Because the issue, as we've talked through with Michael, is what happens if you have lost two nodes and have already had a quorum failure — and there are some issues with that independent of this capability.
B: To me, it's more like: can reasonable people be depending on this behavior, and is it right to tighten that behavior unexpectedly on them, versus finding an alternate path that both supports tightening for correctness and captures the gray area in between right now?
J: I would actually consider the current state to be sort of the worst of both worlds, because drain is vulnerable to deadlocks, and depending on it to handle not-ready pods in a way that doesn't deadlock is vulnerable to, like, the controller not running during the drain.
B: The StatefulSet pod won't be deleted until the node has said "I have deleted it" — and that is the pod you're most likely to put a PDB in front of anyway.
J: Yeah — so, like, I'm not particularly attached to a particular outcome. I just want: if someone is expecting this to be safe for them to use with drain, I want that expectation to be met; and if someone is expecting PDBs to keep disastrous things from happening, I want that expectation to be met. And I don't think either of those is completely true right now, and I —
F: I don't see why this needs to be part of eviction. If your organization — you know, whoever; not you specifically, but any organization — decides "I don't care about unready pods; delete them", I mean, that is a trivial line to write, right? Step one: delete anything that's not ready. Step two: drain. Or reverse them, if you feel so inclined. I don't see why we would need to put this in eviction specifically.
J
I
think
the
mismatch
between
the
controller
and
the
server
is
at
best
confusing
and
at
worst
opens
the
door
to
like
mismatches
and
guaranteed
behavior.
J: I just want to see it be coherent. If that means another option on the PDB to say what you do with these things, then the controller can honor it and the server can honor it. But the mismatch of the two is pretty confusing; it makes the admission code very difficult to reason about.
B: Yeah, just out of curiosity — I'm trying to think of other edge cases that might come up around PDBs with this — do you feel, Jordan, that there's an expectation mismatch between how a ReplicaSet defines "ready" versus how it defines "available"? And do we want the PDB controller to maybe align how it views "available" with the way the ReplicaSet views "available" — which is "ready for some minimum period of seconds" — so that at least you avoid flapping?
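For context on the ready-versus-available distinction B raises, a minimal sketch with illustrative values: a ReplicaSet or Deployment counts a pod as available only after it has been Ready for minReadySeconds, which damps readiness flapping.

```go
// Sketch: a pod counts toward status.availableReplicas only after it
// has stayed Ready for MinReadySeconds (values are illustrative).
package sketch

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func exampleDeploymentSpec() appsv1.DeploymentSpec {
	replicas := int32(3)
	return appsv1.DeploymentSpec{
		Replicas: &replicas,
		// Ready for 30 consecutive seconds before counting as available.
		MinReadySeconds: 30,
		Selector: &metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "demo"},
		},
	}
}
```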
J: We're thinking of these pods as being in only two categories — disrupted or not, safe to evict or not — and it might be that there are three categories. There are the ones that the controller considers countable as currently healthy — and maybe that's actually what we have today, implicitly, because the controller considers readiness and the API server doesn't. So maybe we do have three categories of pods; it's just never been formalized. "Safe to evict unconditionally", or "safe to delete unconditionally" —
J: — is one category. A perfectly ready, contributing-member-of-society pod is another category. And then there's a middle area where it's like: we don't want to just delete these things — they might be doing work — but they're certainly not healthy; they're not ready, they're not being routed to by Services, we don't know what their status is. That just hasn't been formalized, and so a lot of the code is basically these binary, boolean, good-or-bad —
F: — functions. There's also a fourth category: canary pods. Specifically, we point out in the documentation: schedule a pod, put a PDB with allowed disruptions zero on it, and that is basically a proxy for "you cannot drain this node successfully". That could also represent a static pod, or just be part of an administrator procedure saying nobody is allowed to get rid of this one particular — this class of nodes — until they contact me.
J: I think the current state is problematic and confusing, and people are relying on PDBs for things that PDBs can't actually guarantee. So if a change is needed in PDBs, or in the API server, or in the controller, that's fine. I don't think our current state is "this is pretty much fine". And if someone feels like pushing a big change here that would make it better — there are bugs around deadlocks, and there are latent bugs in the API server implementation that make what some people are apparently depending on unsafe. So I —
D: We've been discussing this for a while, and we have a couple more items on the agenda. In order to make sure that they have enough time, we can maybe table this discussion to next week and move on to those.
C: I wanted to just say one thing about it — I can present next time — but what I wanted to say is: I think it may be a good idea to send a questionnaire to users to understand what is preventing them from migrating off dockershim. I put a questionnaire at the end of the document, so if you have an opinion on what questions to ask, please comment there, and next week I can present more.
K: Okay, so, a short introduction, because I'm new to this group: I've been doing most of my work in SIG Storage. Recently I started looking at the structured logging effort, and I'm helping out there. I'm also maintaining some of the upstream projects together with Tim Hockin, and I noticed while doing that that a few things have been pending in the SIG Node area. It starts with the kubelet flags that refer to config files, or that have a corresponding entry in the kubelet config file.
K: I think the deprecation remark is now several years old, and I was just wondering whether there is still a plan in place to actually remove the parameters — because eventually I might add one more to the list that will basically be deprecated from the beginning — and I was wondering what the status is here. Does anyone know?
C: So, we're clearly working on some flags migration; there are a couple of PRs currently in flight. I don't know what the status of that process is. I think at some point there was a single issue tracking all the deprecations — they were separate issues, and we created a single issue tracking the whole migration.
D: We're discovering, I think, sort of piecemeal at this point, that there are some things that have flags lacking a corresponding kubelet configuration value. So, as we've seen this come up, we've been filing issues to add them to the kubelet configuration, because those probably all need to be done individually as separate changes.
D: If they're already in the kubelet configuration, there's no API change and review required; but if it isn't there already, then we do need to do that. Hence the sort of split: we have a mega-issue for the deprecations, versus "oh, there are these things we need to add", where each has to go through review separately. So I know that that is ongoing, because I keep triaging new issues like "oh, this thing is missing; it's only available as a flag". So, yeah.
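To illustrate the flag-versus-config duplication being discussed, a sketch of the same setting in both forms; the eviction threshold is just an illustrative choice:

```go
// Sketch: one setting expressed as a (deprecated) kubelet command-line
// flag and as a field of the typed KubeletConfiguration object.
package sketch

import (
	kubeletconfigv1beta1 "k8s.io/kubelet/config/v1beta1"
)

// Flag form (deprecated in favor of the config file):
//
//	kubelet --eviction-hard=memory.available<100Mi
//
// Config-file form, as the typed object the kubelet deserializes:
func evictionConfig() kubeletconfigv1beta1.KubeletConfiguration {
	return kubeletconfigv1beta1.KubeletConfiguration{
		EvictionHard: map[string]string{"memory.available": "100Mi"},
	}
}
```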
K: No, I was thinking of the things that already have a config entry and still have command-line flags — so both.
D: From an operations perspective, it's very disruptive to remove command-line flags without some sort of deprecation period, so we'd have to follow the standard deprecation cycle. I'm trying to find this in the minutes, because this was previously raised in SIG Node — I think a couple of months ago — discussing whether we should put resources into this deprecation and whatnot, and I think the ultimate conclusion involved the component-standard working group.
D: I mean, I think that's now dissolved, because there was no active leadership there — that working group had lost momentum, and other SIGs weren't actually doing migrations. So, given everything else that's in the air at SIG Node, we didn't want to prioritize doing a refactor that nobody else was doing.
K: Yeah, as far as I can tell, coming at this as an outsider, the flags were already marked as deprecated in the command line at least four years ago. I don't know whether that officially started the deprecation period — that was probably a different discussion — but anyway, it's not that important. My question was mostly around what I do about the new things, so let's continue with that part.
K: So, my take is that this logging part is still alpha, because that's what the type says, and it's just missing a comment — so users do not necessarily see that they are using something that is actually still alpha. I was wondering whether that is also the opinion of everyone else here in the group, because if it is, then I can create a PR that just adds a comment to the documentation saying that the logging field — the logging part of the configuration — is alpha.
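For context, a sketch of the logging section K refers to (assuming the types as they stood at the time): the kubelet's v1beta1 KubeletConfiguration embeds a LoggingConfiguration from component-base, whose own type was still alpha even though the surrounding config API is beta.

```go
// Sketch: selecting the JSON log format via the embedded
// component-base logging struct inside the beta kubelet config —
// the alpha-inside-beta mismatch being discussed.
package sketch

import (
	logconfigv1alpha1 "k8s.io/component-base/config/v1alpha1"
	kubeletconfigv1beta1 "k8s.io/kubelet/config/v1beta1"
)

func jsonLogging() kubeletconfigv1beta1.KubeletConfiguration {
	return kubeletconfigv1beta1.KubeletConfiguration{
		Logging: logconfigv1alpha1.LoggingConfiguration{
			Format: "json", // default is "text"
		},
	}
}
```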
J: I think I actually made a comment about this when it went in — maybe at the time it was part of an alpha feature. From my perspective, if there's a struct that's used in a beta config API, it should be at beta-level stability. So I would encourage creating a beta package under component-base config, having the logging structs there, and having people reference that.
K: I agree that it's sub-optimal, but this is what we currently have, and I'm not sure — I need to talk with the folks in the structured logging working group about whether they are ready to commit to a beta version of the logging configuration. That is a bigger question, yeah.
D: Surely you've seen it — there's a KEP that was reviewed and approved this cycle for the deprecation of most — I —
D: Yeah, I reviewed that KEP. I mean, I know that there was talk of adding a new field and a new command-line flag; I don't know — it hasn't gone through API review yet, I don't think. It looked quite straightforward to me as a KEP reviewer, so I wasn't a stickler. Well —
D: Yeah, and I think if that doesn't get graduated by next release, it should be ripped out, because it's alpha and it's been sitting in alpha for like four releases.
K: Okay, so that puts it a bit into perspective. So let's talk about the things that I'm currently planning on deprecating. The klog flags are certainly one of them. I'm not even sure whether that affects the kubelet, or even needs approval from SIG Node, particularly because it's mostly just in component-base, and I think the kubelet will just inherit it without any changes — so that might be fine.
K: The more interesting one is around replacing support for different output streams; that is part of a KEP. We have that feature currently for plain text: you can configure klog to write to different files and then process those files with different priorities. One of the agreements, as part of adopting the KEP for deprecating the klog flags, was that a similar feature should be possible for JSON output — and long-term also, of course, for traditional plain text, perhaps.
K: So I guess that will need an API review. We can talk about it more when it's ready. I'm currently in the process, with Tim Hockin, of actually rewriting some of the command-line parsing, and once that PR is in, we can come back to that as well. That's all I wanted to mention — thanks.