From YouTube: Kubernetes SIG Node 20200811
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
B
As a highlight, I didn't get a chance to check it this morning, at least for PRs pending approval. I think yesterday, when I was looking at this, we were kind of mid-teens and I was working my way through there. I think, for the most part, everyone that was pending a cherry pick has been pulled back, and many of the other ones that were pending approval weren't really necessarily specific to just SIG Node responsibilities, so a number of them actually had other approvers apply their approval, so they're approved.

B
So I think, for the most part, the needs-approval queue is in decent shape for where we could do our work that was not intersectional with, say, SIG Windows or API Machinery. For the other queues, I did not have time earlier this week to pay them too much attention beyond what I looked at through the approval queue, so maybe David or Seth, you had a chance to review one.

C
So a lot of the PRs in node that don't currently have assignees actually have a number of the people who are on this call already reviewing them. So if you are reviewing PRs and you'd like to follow through with the review, feel free to assign yourself. That way we know that we don't need to look for someone else, and I've been going through some of those PRs and trying to assign the people who have been working on them.

A
Yeah, there are also some PRs assigned to people who may not be involved with SIG Node anymore, so I did pay extra attention to those PRs that are only assigned to, like, old members. There are a couple of those. Maybe from now on we have to pay extra attention to those, because we think they have an assignee, but actually that person is not actively on it.
D
From last week: how many created PRs, how many closed PRs? Is it useful, or is it not needed?

A
I think it's really useful and we can maintain it over time. Maybe there are some legacy things sitting there, but this one can actually help us see which of these things get an early status update. The recent ones, I think, will be reflected here, right? So there are the updated ones, and then we can pay extra attention to the updated ones that people still care about, so we can pick up from here over time. Hopefully we can cover the majority of those queues.

B
Yeah, one thing that I did... I don't know if anyone from SIG Windows is in today's call, but there are about five or so kubelet-oriented changes specific to Windows where it wasn't clear to me whether they actually had consensus in the SIG. One of the ones that I was at least trying to give some priority attention in review was the one stripping unnecessary security context on Windows, just because I know our own users are hitting that as well.
B
So is anyone from SIG Windows on the call today that maybe could summarize whether there's consensus on the approach for this? Because the PR looked fine, it just wasn't clear that it was an accepted path forward for that SIG. And if not, then maybe, given the queue I see here, we don't have a great way of distinguishing when SIG Windows is happy with something versus SIG Node right now on some of these items.

A
So Derek, in the past, a while back, I went to SIG Windows and asked them to have someone represent SIG Windows and join SIG Node, because it's kind of a joint product, and owning it jointly across projects was weird. That's why Patrick joined us, and then Patrick left, and so that's why. Maybe I have to go to SIG Windows again and ask them explicitly to find someone to represent SIG Windows and attend this meeting. They could take a turn.

A
They could do something, but it is true, because they have been pinging me, and the problem is I don't have enough context for those PRs. There are certain things that are controversial between the different PRs, and so that's why, hopefully, in SIG Node we could help, and also, I think, be more transparent here and have them tell us the planning and what the goal is. Otherwise it's hard for us. So that's why.
B
Yeah, I was so used to Patrick being here for many, many, many months that I didn't actually realize he might have been absent, so I was hoping he was still here. So yeah, if we can find a better ambassador from SIG Windows, well, not better, Patrick was great, but a present ambassador from the SIG, that would be awesome, and I'm happy to follow up on that myself, Dawn. So I can try to see what we can do on that one.

A
Yeah, so last time I went to them and asked, and that's why Patrick came to represent us. He represented both Microsoft, from the kernel team, as a representative of Windows containers, and also Kubernetes SIG Windows. So that's good coverage.

B
Okay, cool. Yeah, because it's hard, when you look at the PR data, to have a clear split on some of these things. Anyway, we'll follow up on that. That's probably the best summary of what I see.

A
Okay, so let's move to the next one. Rodrigo, do you want to talk about the sidecar containers? Yeah.
F
Yeah, sorry, hello. Okay, yeah, sorry. I wanted to do a friendly ping on the open PR. I know the 1.19 release is coming and KubeCon is coming, so I guess there won't be much activity, but yeah, I wanted to know if, on the other side, maybe before KubeCon, there was a chance to get a review, or, yeah, just to know what to expect.

B
Yeah, I'll review this this afternoon, so thanks for the call-out. And I don't think that, just because 1.19 is held up, we can't iterate on the updates to the KEP. So I will follow up on that this afternoon.

F
Oh okay, perfect. Thank you very much.

A
I think Seth and also Sergey already reviewed, right? So it's just... I think we did assign it to several people when we talked about this at the beginning of this quarter, so there's Seth, and there's Sergey, and then there's Eric, if I remember correctly, assigned to this.
D
Yeah, my biggest comment is about init containers. Can we, like, talk to the Istio team? It seems that having init containers run after sidecar containers are initialized is very important for certain scenarios, and my question is basically: what would change in the entire KEP if you just start sidecar containers before init containers?

F
So if we create a new definition of sidecar containers that start before init containers, not all sidecar containers that are possible now... you might need another init container for the sidecar containers, or the semantics can just start to get tricky. And so I'm open to that, if it's better for the long run.

F
I think we should do that, but I think that is quite a different problem from the one that we currently know. Like, sidecar containers as we know them today start after init containers, and there are a bunch of users doing that, and those are the problems that I think might be easier to start solving. But yeah, I want more opinions, of course.
B
So, Rodrigo, the one help I could ask for, if you could do this after the call: the KEP right now is under SIG Apps, but I don't really see SIG Apps as being, like, a long-term stakeholder here. If you could just move it under the SIG Node directory, that would probably ensure that we don't hit any other approval headaches. Sure, I know SIG Apps historically brought the use case forward, but it seems like, from an implementation ownership perspective, this is pretty clearly part of node.

F
But I mean, in the current open PR? That might take long to merge, or maybe it's just another small PR with that change.

B
If you want to have a PR to move it into node and out of that, and then do a separate one, that's fine, or you can do it on this existing one. It just seems like it's bucketed in the wrong SIG.

A
That's right, Derek. Since we already have the process based on the label, right, which is sig/node, that's easy for us to use for tracking, since the majority of the work asks this team to do the review: design review and implementation review, all those kinds of things. So it's maybe just trying to reflect the facts here; otherwise we may overlook this one.
G
Yeah, so hi everyone, I'm Swati. I'm here with Alexey and Francesco, and I just wanted to provide a quick status update on the topology-aware scheduler work. We've been talking a bit about it, so I thought it would be good to provide a status update and show a demo, so people know what's going on there. I have a few slides.

G
Can you see the screen? Yes? Okay, so yeah, basically the status update is that we've been working on these two components, one of them being the topology-aware scheduler plugin; the other one is the resource topology exporter. We have KEPs and implementations of both of these components.

G
As part of the resource topology exporter, we initially prototyped it using the container runtime interface; we've enabled it for containerd and CRI-O. This required no kubelet changes, but when we started looking into the Pod Resources API as a way of gathering resource information, we identified that there were some gaps that needed to be fixed, and we proposed a KEP and code corresponding to these as well. Alexey, if you want to talk through this one; he basically authored the KEP and implementation of this.
H
Yes, hello everyone. I have a couple of words about it. I think the changes in Pod Resources arose from...

G
Yeah, so basically the Pod Resources API, as it stands today, exposes information about devices, and what this KEP essentially does is enable and provide information about the CPUs, and the topology information corresponding to the devices. So that's the summary of this KEP. And then I'll be showing a demo; I'll probably hold that for the time being, but first talk about these two items which are currently in progress. So Derek pointed out that we should look for ways to merge the resource topology exporter into node feature discovery.
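For readers who want to poke at this, here is a minimal Go sketch (not part of the meeting material) of reading the kubelet Pod Resources API over its local socket, assuming the v1alpha1 List endpoint that existed at the time; the import path, socket location, and response fields may differ across Kubernetes versions.

```go
package main

import (
	"context"
	"fmt"
	"net"

	"google.golang.org/grpc"
	podresourcesapi "k8s.io/kubelet/pkg/apis/podresources/v1alpha1"
)

func main() {
	// Default socket path; deployments may mount it elsewhere.
	socket := "/var/lib/kubelet/pod-resources/kubelet.sock"
	conn, err := grpc.Dial(socket,
		grpc.WithInsecure(),
		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
			return (&net.Dialer{}).DialContext(ctx, "unix", addr)
		}))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	client := podresourcesapi.NewPodResourcesListerClient(conn)
	resp, err := client.List(context.Background(), &podresourcesapi.ListPodResourcesRequest{})
	if err != nil {
		panic(err)
	}
	// As discussed above, List currently reports only the devices already
	// allocated to each container; the KEP adds CPU and NUMA topology detail.
	for _, pod := range resp.GetPodResources() {
		for _, c := range pod.GetContainers() {
			fmt.Printf("%s/%s container %s devices: %v\n",
				pod.GetNamespace(), pod.GetName(), c.GetName(), c.GetDevices())
		}
	}
}
```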
G
So discussions are currently in progress with the NFD maintainers. It seems positive at this point in time, and we're having discussions related to how we can go about it, basically the design discussions. So we have an issue created for this and a proposal doc. And then, in relation to the Pod Resources API, again we have an item in progress which Francesco, who is on the call, is working on.

G
We are proposing that we expose a watch endpoint in the Pod Resources API; we have a KEP for that, and that would enable us to make the resource topology exporter more event-based, as opposed to the current design, where it is polling. So that's the idea. So basically a lot of items are in progress, but all of these kind of help in enabling the topology-aware scheduler.

B
Okay, so if I recall, the reason... the earlier iteration of the KEP, before I knew that y'all started looking at node feature discovery, was wanting to basically consolidate, like, the security model for all of these components. Can you refresh my memory, if that's okay, or maybe for those who haven't tracked it?
G
So, as far as I understand, for the Pod Resources API endpoint, I understand that it's exposed per node, and you can gather the information from a specific endpoint; it's a socket file. In relation to the security aspects of it, I think I would need to talk to someone else who has more expertise in this area. I'd probably need to look into that.

B
On the security aspects, yeah, I'm happy to follow up on that. I was just trying to make sure I knew which docs... So, like, node feature discovery, to my knowledge, didn't deploy a new serving daemon per node. It just propagated the state of what was discovered on the node back to the API server, but it didn't have a new serving endpoint. So what I was trying to make sure, if my understanding is accurate, is that you all are proposing a new per-node serving endpoint, which I'm just thinking through for deployers who roll this out: an endpoint for the kubelet, as well as for this additional data.
K
Just a note about what Derek just meant. Hey, this is Francesco. NFD has TLS, but we haven't implemented cert rotation. So this is one thing that we would need to implement for NFD, but there is TLS for the gRPC endpoint, which is already there in NFD.

B
No, but is that, Francesco, on the serving side for NFD? Like, my recollection with NFD was that you had a per-node daemon that ran and then sent that information back up to the API server.

B
There was no serving port opened to the NFD daemon per worker, and so there wasn't any need to do anything other than handle client certs back to the API server; there wasn't necessarily a serving cert problem. And so all I was asking about was whether the proposal here requires additional cert management for serving per node, because obviously the kubelet serves an endpoint, 10250 or whatever, and managing certs for that serving endpoint is often a challenge in adoption for deployments, and so I was hoping we could get to a spot that didn't require any new serving endpoints per worker, because of the overhead of managing those certs.
K
So maybe we should think about reusing the gRPC endpoint we already have in NFD, but I think this is also a discussion with Markus, who is the lead on NFD. But thanks for pointing this out, yeah.

B
I guess, in general, whether you're proxying or doing something else, thinking about adoption of this stuff, it's much easier if we can avoid needing to serve endpoints on public ports per worker node in a cluster. And maybe, Dawn, you would agree with that as a general principle, but I'm just kind of on the lookout for things that introduce new ports needing to be exposed per node, and what serving certs they might need to then have rotated. And this is also just a general challenge.

B
We talked about cAdvisor and stuff in the past; like, there's a lot of operational benefit from the fact that the kubelet fronted it for a long time, and as we proliferate daemons out, it gets to be a pain.
G
Cool, thanks.

I
One question, which I asked at the end of my PR and in one of the KEP proposals, but I didn't get an answer: with the changes to the Pod Resources API, we will show only the devices which are allocated to a pod, but it doesn't give the information about what resources are available on a node.

G
So, at this point in time, we're kind of at the PoC stage. We are enabling the SR-IOV device plugin, and in that case we pass back the PCI... we pass a config file which gives information about the devices that have been enabled in the cluster. So that's the current stage, but what you're saying is correct: we need to look into how we gather information about all the devices that are available in the cluster.

H
Yes, but we can get allocatable resources just from the kubelet. The kubelet exports allocatable resources for the cluster onto the API.

H
Yes, but we can also export it in the same way as the kubelet did. But yes, of course, we don't know the configuration of all the device plugins.

I
So that comes to my question about the changes to the Pod Resources API: you are not really exposing the actual topology information about what devices are announced by device plugins; you only see, after the fact, the allocation, if some of them, as I said, are already allocated to something.
G
I guess this information is available within the kubelet, but the information about available devices itself would have to be exposed somehow. We haven't explored that yet, because we decided that, for the time being, we would just focus on a single device plugin, get an end-to-end working solution, and then maybe generalize it across various devices.

H
Now, Alexander is right: we need to know the exact resource names, and that resource name list usually comes from, for example, the SR-IOV device plugin config, or another device plugin's config; it depends.

G
And that's the current implementation. So we have it based on a config, and we're gathering information on all the devices that are exposed in the cluster, and then, based on that, we evaluate what is available and, you know, kind of subtract the already allocated ones. So the scheduler gets information about what is available now on a per-NUMA-node basis.

G
So before we dive into the demo video itself, I just want to give an overview of the environment that this demo showcases. We have two worker nodes in the cluster. Each node has 80 CPUs and 10 SR-IOV devices configured, and they're distributed across two NUMA nodes. So, as you see in this, we have 40 CPUs on each NUMA node and five devices on each NUMA node.
G
So, just so, when I run the demo there will be certain workloads already running in the cluster. These are the workloads that are running: I have a pod which is requesting five instances of the SR-IOV device and five CPUs, that is over here, and then I have a workload that is requesting two CPUs and two SR-IOV devices, and three CPUs and three SR-IOV devices, which you can see over here.

G
So from the scheduler's point of view, like kube-scheduler's point of view, both nodes appear to be exactly the same, because we have five SR-IOV devices available on both nodes and 75 CPUs. But when we look at the NUMA side of things, the picture is completely different. So if a request comes in for a pod which is something like this, where we're requesting four SR-IOV devices, this information becomes really valuable, because kube-scheduler could place it on either of the nodes.

G
But in a cluster where the single-NUMA-node policy for topology manager is enabled, if the scheduler places it on this node, the pod would end up with a topology affinity error. So the topology-aware scheduler plugin uses the information that we have gathered, which is more granular, to place it on this node, and in turn it gets placed on the first node. So basically we'd have something like this. So that's what I'm going to be demonstrating in the demo. Maybe let's go to the demo.
G
Okay, so I have here, showing that I have two worker nodes. There are three virtual masters as well on this cluster. The SR-IOV network operator has been deployed on this cluster, and that will basically show information about the allocatable devices.

G
So, as you see here, we have 10 instances of the SR-IOV resource on the first worker node, and the second worker node has 10 SR-IOV instances again. And now I will be showing, let me just go back here a bit, I'll be showing here the three workloads already running in the cluster, which you can see on the left as well, and then I'm showcasing here, sorry, I'm showcasing here the nodes that they're allocated on. So, for simplicity...

G
What I did in this cluster was that the PCI addresses that end in an even number are all on NUMA node zero. So here you can see that the PCI addresses of all the devices that have been allocated to this pod are on NUMA node zero, as you can see over here. And then for sample pod 2 you'd see this is again on NUMA node 0, which is over here, and then this is sample pod 3, which is on NUMA node 1.
G
So now I'll show the CRDs. In this case we have the NodeResourceTopology CRD in this cluster. When we show the instances, you see that there are no instances corresponding to the CRD, and then, when we deploy the resource topology exporter, those CRDs are populated.
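For readers following along without the slides, the CR instances being shown carry roughly this kind of per-NUMA-zone allocatable data. The Go sketch below is a simplified, hypothetical shape for illustration only; the actual NodeResourceTopology CRD in the out-of-tree repo defines its own, richer schema.

```go
// Hypothetical, simplified shape of the per-node topology CR discussed here.
// The real NodeResourceTopology API may use different field names and types.
type NUMAZone struct {
	Name      string            // e.g. "numa-node-0"
	Resources map[string]string // resource name -> allocatable, e.g. "cpu": "35"
}

type NodeResourceTopology struct {
	NodeName         string
	TopologyPolicies []string   // e.g. ["single-numa-node"], mirrored from the kubelet config
	Zones            []NUMAZone // one entry per NUMA node
}
```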
G
So here we are deploying the resource topology exporter. Because it's a DaemonSet, we have an instance for both nodes, and then we have the CRD instances populated corresponding to both nodes. Now, this command is going to show you, basically, the information corresponding to a specific node. So here you can see there are 35 CPUs available on this node and zero SR-IOV devices, which corresponds to this information over here, and then for the other node...

G
So now we deploy the topology-aware scheduler. We deploy it as a separate scheduler itself, in the kube-system namespace.
G
So this is the manifest file, and you see here that I have the scheduler name specified as "my-scheduler" to indicate that we need to use another scheduler in this case. Ideally, once we have the topology-aware scheduler plugin merged into the mainstream scheduler code, we wouldn't need to do this, but this is how I've done it for demonstration purposes. And now...
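As an aside, and not taken verbatim from the demo manifest, pointing a workload at the out-of-tree scheduler only requires setting schedulerName in the pod spec. A minimal Go sketch, with an illustrative scheduler name and a placeholder image:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	pod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "sample-pod"},
		Spec: corev1.PodSpec{
			// "my-scheduler" is the illustrative name used in the demo; any pod
			// without this field keeps using the default kube-scheduler.
			SchedulerName: "my-scheduler",
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "registry.example.com/app:latest", // placeholder image
			}},
		},
	}
	fmt.Println(pod.Spec.SchedulerName)
}
```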
G
Like, yeah, depending on whether you have the single-NUMA-node policy specified on certain nodes, that's only when the topology-aware scheduler plugin would kick in, and it will try to schedule pods based on that.

B
Okay, so the expectation would be that both the scheduler plugin and then the resource that's coming from NFD, or potentially coming from the exporter, that gives the associated topology policy on that node, they'd work in concert, just out of tree.

G
Yeah, and we would like to make it more configurable as well, so if someone doesn't want this plugin to be enabled, they could just simply disable the plugin. My understanding is that the scheduler config itself allows you to enable and disable plugins on your cluster.

K
So it would be really mandatory to be able to disable this scheduler, so that other schedulers can onboard and use the same CRDs, the CRs, and the information that we are exposing, either via NFD or other entities. I have several customers, really, from the HPC space, who want to enable schedulers that need this information.
A
Because of the different use cases: this one definitely gives us a boost of some performance, and this is sensitive only for some device-sensitive workloads. But there is also a trade-off: if that worker node is not that performance sensitive, maybe the pod is a batch workload, or even something that seems latency sensitive but actually isn't and depends on being a memory-intensive workload, in those cases this actually will hurt the utilization.

A
So this is why we want this to be extensible and out of the default tree, and so then, basically, at the cluster level, or maybe at a per-node-pool level, you could enable this kind of scheduling behavior. I think this is what we have tried to push so far.

G
Okay, so just to wrap up on this: we have the scheduler deployed, and when we have a specific pod being placed by that scheduler, we just create that pod, and here you can see that this pod gets created on the zeroth worker, which is this one, as expected.
A
Swati, we're running out of time; otherwise we would ask whether other people have more questions, so we can follow up offline on this one. Thank you for coming to give the status update and the great demo. If you have more questions, please talk to Swati, Alexey and Francesco, and let's move to the next topic. Is that okay? Thanks, thanks. Next one, Seth: do you want to open the discussion on that? Yeah.
L
I'm not sure if we talked about it last week; I think it was a different issue last week, but this one was interesting and I wanted to get some feedback. So when we do enforce node allocatable today, that puts a memory limit on the kubepods cgroup and sets the shares, and this is where it gets tricky: shares is a ratio, right?

L
It's a relative number, but we deal with CPUs in terms of millicores. So we keep them on the same scale so that they continue to make sense. But when you set the system reservation to be, like... we have customers that will set, like, a four or six core reservation on a 96-core machine, right, really big.

L
And so what will happen is, because you're not doing enforcement on the system reserved... so you can do "pods", "system-reserved", "kube-reserved", right. If you don't include "system-reserved" in enforce-node-allocatable, then the shares are not set on the system cgroup, and so even though you reserved, say, 10 cores, under contention the system slice, or, I'm speaking in systemd terms here, but the system cgroup, does not effectively have access to the reserved CPU count, because its shares isn't set. And that throws off the calculation system-wide, where the kubepods slice has the allocatable number of CPUs and the system slice always has 1024, because we don't set it. But if you add those two numbers together, it doesn't add up to the total number of millicores on the system.
B
So what do we see as the recommendation? I mean, today, if you had set enforce system-reserved... so, like, by default we say enforcement of node allocatable would be "pods", and so we write what's on the kubepods cgroup, but then you're right that nothing gets reflected on the system slice for CPU. That's always wrong.

B
So, like, I think the tension, or the vision I had when we first did this, was that we didn't want to cap memory, because memory was always going to be probably wrong and we wanted to allow some burst on the system. But reflecting on this now, especially with larger boxes, where you're not running four vCPUs but, in your case, you said 96, the imbalance on that ratio can get really...
L
...and now, like, three CPUs are unaccounted for in the shares.

L
Well, we can't really do that, because this is where the tricky part comes in. So if you set "system-reserved" as a key on enforce-node-allocatable, that makes it so that the user has to provide the system cgroup name as a parameter to the kubelet. In systemd's case that would be system.slice, but it makes that required at that point.
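For context, the coupling Seth is describing lives in the kubelet configuration: once "system-reserved" is added to the enforcement list, the kubelet also needs the system cgroup named explicitly. A minimal sketch using the v1beta1 KubeletConfiguration types; the values here are purely illustrative, not a recommendation:

```go
package main

import (
	"fmt"

	kubeletconfig "k8s.io/kubelet/config/v1beta1"
)

func main() {
	cfg := kubeletconfig.KubeletConfiguration{
		// Default is just "pods"; adding "system-reserved" makes the kubelet
		// write enforcement onto the system cgroup, which must then be named.
		EnforceNodeAllocatable: []string{"pods", "system-reserved"},
		SystemReserved:         map[string]string{"cpu": "4", "memory": "2Gi"},
		SystemReservedCgroup:   "/system.slice", // required once system-reserved is enforced
	}
	fmt.Printf("%+v\n", cfg)
}
```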
A
That's the initial design: it's separable, yeah. The system-reserved, initially, where this came from, is for any daemon not managed by Kubernetes, including the kernel, and then the kube-reserved is actually for all those daemons like the kubelet, the container runtime, and the other system components, kube-proxy back then; that's for the kube side, the daemons or DaemonSets Kubernetes provides, or whatever. That's kind of the original thinking, just to share here.

L
The only way I can see out of it is to have people start setting the "system-reserved" key for enforce-node-allocatable, and then setting CPU shares but not setting memory limit in bytes. When you do that, I think that is the behavior that most people are expecting. But if the existing behavior is to set memory limit in bytes, and for some reason people want their system daemons killed if they go over...

L
...that limit, which I don't think is the desirable outcome for anyone, but it could possibly be the current behavior and what people expect, then I'm not really sure. My understanding is that we do set memory limit in bytes on the system cgroup if you pass node allocatable... if you...
B
Enforcing things that are, I'm looking for a word, yeah, compressible, right: we should have a way of expressing whether to enforce compressible versus incompressible resources, and we have that globbed together into one concept right now, and that's probably a mistake.

L
As for a proposed fix that I wanted to draw attention to: I don't currently have one, because I don't have a solution that is transparent to the end user. It's going to change the behavior of something; it's just a question of which behavioral change is going to be the least disruptive and most aligned with what users expect, I guess.
L
For our Burstable QoS tier, shares allow... so in the pod spec, when you make a request, that maps to CPU shares; when you set a limit on CPU, that maps to a CFS quota. And shares allow a pod to burst beyond what it has requested, as long as there is no contention on the machine, but the shares are there to enforce fairness when there is contention.
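For reference, the request-to-shares and limit-to-quota mapping Seth describes is roughly the arithmetic the kubelet itself uses; a minimal Go sketch of that conversion (shares are relative weights, quota is absolute time per CFS period):

```go
package main

import "fmt"

const (
	sharesPerCPU = 1024   // cpu.shares granted per whole CPU requested
	quotaPeriod  = 100000 // default CFS period in microseconds
	minShares    = 2      // kernel minimum for cpu.shares
)

// milliCPUToShares: a CPU request in millicores becomes a relative weight.
func milliCPUToShares(milliCPU int64) int64 {
	if milliCPU == 0 {
		return minShares
	}
	shares := (milliCPU * sharesPerCPU) / 1000
	if shares < minShares {
		return minShares
	}
	return shares
}

// milliCPUToQuota: a CPU limit in millicores becomes an absolute cfs_quota_us.
func milliCPUToQuota(milliCPU, period int64) int64 {
	if milliCPU == 0 {
		return 0
	}
	return (milliCPU * period) / 1000
}

func main() {
	// Example: request 500m, limit 2 CPUs.
	fmt.Println(milliCPUToShares(500))          // 512 shares (relative)
	fmt.Println(milliCPUToQuota(2000, quotaPeriod)) // 200000us quota per 100000us period
}
```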
E
Yeah, I get the burstable part, that they can do more. I don't know, I've always been against CPU shares because of that: if you look at it from latency profiles, depending on whether you're looking at a workload in isolation, and depending on how much is run on a single machine, you get really good latencies and stuff, even for burstable tasks, and then the machine gets bin-packed, and now, as a batch user, my things are taking five times longer, and that's the sole reason why CPU shares, like...

M
So, well, also, there was a very bad kernel bug with CFS quotas, which was only fixed late last year. I can link to some issues, and there are some, well, even right now...

A
...questions; it is not fixed, some corner cases still have some problems there. So that's the problem, so we ended up having special treatment for different kinds of workloads. So by default we still didn't enable the CPU quota, the limit.
B
So yeah, I think, even on cgroups v2 hosts, right, and Renaud and Giuseppe, I know you guys are looking at this, we would still want to set CPU weight on the system slice, or system or kube slice, whatever people have their cgroup taxonomy set up as. That's to share appropriately between end-user pods and system services. So, like, I guess I don't have a great answer, Seth. I don't, because I feel like, if it were me... I don't think anyone in the world is setting enforce-node-allocatable in a way where they understand what it's doing, and I feel like I would improve the world if I did set shares on it, because it would match the expectation they probably have.

B
That's my first thought, because, taken to the extreme, you might want to restrict the amount of PIDs or tasks that you portion out, and the same with memory, and I personally probably would never run a kubelet worker where I enforced compressible or incompressible resources on the system daemons. But maybe others have a better understanding of the workload where they would. But that's probably the best pitch I would have, that we could probably queue for 1.20 and, like, make the world a lot better.
L
That's a fair perspective too. Actually, I like that. Yeah, I agree with that, and really the only hitch with that is that we can't assume what the system cgroup name is, right, unless you pass the "system-reserved" key into enforce-node-allocatable, which forces the user to provide that to us. So the only way I see forward is that users will have to set the "system-reserved" key for enforce-node-allocatable if they want the shares set on their system slice properly, because that forces them to provide us with the...

B
I'm sorry, Seth, couldn't we just find all peers of the kubepods slice and set it appropriately from that? Like, if you only find one peer, then you know what to set it to. If you find more than one, that may be an issue. I think, if you find just one, you could automatically know the right balance too, right: find all siblings of kubepods; if n equals one, you know to set shares to the alternate value.
A
Let me cut this a bit short, because we also have some other topics here, sorry, and we don't have the best answer here. Can we carry that one over? Because, like, there's no one-size-fits-all node allocatable config, and it's different if you are thinking about the technical use cases that are really sensitive to every single thing, and then you are running, alongside those, even machine learning workloads. So we'll carry that on as a separate topic later. Can we move to the next one? So...
N
Yeah, okay. So just to give a brief overview: my proposal is really just that we reopen this Kubernetes enhancement proposal for node readiness gates, which basically just adds a declarative API for defining a set of pods which all must be ready on the node before the node is considered ready. And I'm coming over from Istio.

N
I work on the CNI, or Istio CNI, over there, and our use case is kind of odd, but it is, I think, kind of representative of this class of problem. So, to give you an overview of what we're doing: Istio CNI is basically a binary which is installed on the node and is called just like a normal CNI binary is. It's installed by a DaemonSet which, you know, runs the installer and then adds the binary and also updates the configuration.

N
However, unlike most CNI plugins, it doesn't cover the entire networking stack. The only thing it does is actually exec into the pod namespace, for a given pod's network namespace, and execute some iptables rules to set up port capture.

N
So the issue here is that, since it's not a true CNI, there's nothing that actually stops a pod that is scheduled before that CNI installer or plugin is installed and chained to whatever your main CNI plugin is from being scheduled and starting up successfully without those iptables rules being applied. Which means that, if you have a workload that depends on running Istio, you will run into a problem where you have this pod created, but it can't actually execute, you know.
N
So it's basically just sitting there, dead in the water, and consuming resources on your computer. So we call this the CNI race condition over in Istio, and the problem here is that, because of the way this is implemented, we cannot actually repair these broken pods in place, because CNI is only called during initialization of a pod network namespace.

N
And so, while we do have mitigation measures, basically a container we add, which detects when the pod is broken by just running a couple of loopback commands to see whether or not the iptables rules are in place, if it's broken, the only thing we can do is delete the pod and let it reschedule itself, and that, you know, creates a lot of noise on the cluster.

N
You know, if a bunch of pods get scheduled because the cluster is in high contention, you may see pods get rescheduled, you know, 50 times each, for a large number of pods on a cluster. That creates a lot of noise in your metrics and makes it look like something horrible is happening, when really it's just trying to repair itself until the CNI is installed. So the CNI is a requirement for a lot of customers of Istio.
N
The CNI is a requirement for a lot of customers, because it allows people to actually run Istio without having to, you know, break their pod security policies and give the workload pods CAP_NET_ADMIN, which is required in order to run iptables inside of a pod network namespace from a workload pod.

N
The problem here is that the node itself, or the kubelet, has all of the information it needs to make the determination of whether or not it's ready. So if we had a declarative API where we could state the conditions in which that's true, and let the kubelet manage it, it has the ability to stop things from scheduling on it without any kind of lag.

N
Whereas, you know, the problem with running it as a taint controller, which is running outside of the kubelet, is that, you know, if a node comes up and reports itself as ready, and there are a large number of pods sitting there waiting to be scheduled, they can all be scheduled onto the node before that taint gets applied. And if that happens, you'll have a large number of these pods sitting there in an unusable state, consuming resources.
N
And, you know, if you don't have the repair controller enabled, because it makes too much noise in the cluster or whatever, they'll just sit there and, you know, be completely useless, which breaks your monitoring, which breaks, you know, autoscalers, and they won't ever be descheduled, because there's no mechanism for you to actually go clean them up without something that just deletes the pods.

N
The other aspect of that: you know, the recommendation was for this taint controller to also be coupled with the register-with-taints option in the kubelet, and the problem with doing that is that it's not always available, and even when it is available, a lot of times the team that is managing Istio at a large company does not actually have control over the arguments that are passed to the kubelet.

N
So, if you have complete control over the system, and it's one that allows you, you know, to actually set the register-with-taints option, yeah, you do have a workaround there. But, again, it requires you coordinating two components, you know, one inside of the cluster and one in kind of meta-configuration space, and, you know, it's just kind of not as clean as just having a declarative API where we can state...
N
You know, this is the set of, you know, label selectors and namespaces and node selectors, where a node with this node selector is not valid unless it has a pod, you know, matching this label inside of this namespace running on it. And so that's a basic summary of kind of the Istio CNI use case, which, again, I'm giving specifically as Istio CNI, but it generalizes.

N
I think it generalizes well to this class of problem, and it's something I think that we really do need to have implemented in some form.
A
I just want to summarize, in general: actually, we've talked about what, and after what kind of conditions, a given node is really ready to take the user workload. We've had those discussions in the past, and in the initial Kubernetes design we expressly said, like, the node management is out of the Kubernetes scope. So that's why, initially, I put those things in the init script.

A
I think that addressed a lot of problems in the past, because Kubernetes is mostly built on top of a cloud provider, or maybe used on a private, on-prem cluster run by each company for their own use, and so they could control the node initialization time. So then they would have the init script, and they could ensure this node, through the cloud provider and at node initialization time, and then they can say...

A
...oh, this node is tainted, and the gate is that several statuses are ready. And even when we introduced DaemonSets, the problem is that it's still a DaemonSet, and those also have risks. So that's why we always first need that control. So Istio's use case is a little bit different, but it should be common: when I want to join a node, how am I going to ensure, when I join the node...
A
...that I have, like, a clear, what's the word, a declarative state for how we're going to say this node is ready to serve. Because "this node is ready" could be, like, this node has booted up and is alive, and then this node is ready. Or it could actually be: okay, my node has some certain role, and that certain role requires certain functionality on board.

A
The functionality, for example, could be that the daemon set is running and the device plugin is running, and then I can claim I'm ready. So there's also, like, when we have SIG Cluster Lifecycle and we started talking about the Cluster API. I talked to an engineer working on that one. I said we do have this need... we do not, sorry, we need a script.

A
We also have, like, those things, like the initialization for the node at startup: how are we going to solve that problem in Cluster API? I didn't see them solve that problem, because I think that's, kind of, literally, the node comes up, and they mention the machine lifecycle, and when the machine comes up alive and then creates the node object and registers, there should be some state, and I can say: oh, I want to initialize this node, and for its ready state, but I have to see that there.

A
So I think this is kind of a common issue, but I guess every provider already has something for it, because, given the history, the original adopters have their own solutions. But I still think maybe it's a good time for us to revisit things, since we changed the Kubernetes scope in many ways; like, at least we have the Cluster API. So that's why.
N
So one additional thing I would add to that is that one of the problems with relying on the init scripts, and on things that are actually inside of kind of the meta config plane or meta configuration space of Kubernetes, like, you know, the kubelet arguments and stuff like that, is that, if you start relying on that, you're essentially tying your deployment of application-level resources, which may change the requirements...

N
...for, you know, what is needed for a node to be ready for a given node pool, and you're tying that to the configuration of the entire cluster as a whole, right. So, you know, if you wanted to push a change or something like that, you would potentially have to roll the entire node pool over with the new configuration settings, as opposed to just kind of configuring that inside of, you know, KRM and just applying that KRM...

N
...just like you would any kind of custom resource, redefining what it means to be ready, and then rolling your application over, you know, so that it is now reflecting the new state. It just drastically complicates upgrades and deployments, that kind of process.
A
Yeah, just, that's the legacy vision, because, due to the scope of Kubernetes, that's why we cannot... Derek, do you want to comment?

B
I guess I don't... I think it's a universal problem, I don't know. Or is the pitch here that we're going to bring back this KEP? Because I'd be interested in sharing challenges we have as well, that I don't think, honestly, are met by this KEP either. But what's the goal, I guess: do we want to revisit this problem, or are we going to say that it's, like, the infrastructure provider's challenge to work through?
N
On my part, my recommendation was just using that as a starting point, and if we need to remap that, you know, to meet additional requirements that come up, we could probably do so. But the issue is that, at the moment, there's kind of no proposal out there to solve anything in this class of problem.

N
As far as I can tell, this is the only thing that I saw in motion regarding that, and it was closed with the taint controller being kind of the recommended alternative, and that doesn't really meet all the requirements that there are. And so, you know, either reopening this, or opening a new one that kind of uses this, or is kind of a subset of it.
B
Yeah, so I'd have to revisit my understanding of what that KEP had been, but my, like, lived experience with this is that when a node is considered ready seems to vary based on both the vendor's choice that this node is now ready, and then the customer's desire for what needs to be on that node before they also then run other workloads, so that you end up with, like, a ring-type situation, where, like...

B
...if your provider of Kubernetes doesn't provide, say, log forwarding, but your organization demands that log forwarding is deployed to that cluster, and then you use Kubernetes itself to deploy fluentd down to that cluster, you end up with these security rings, or node readiness rings, which I know is kind of what the gate proposal was talking about. But it's hard to get to, like, one true solution. But I definitely can empathize that it's a problem.

B
I'd have to go back and think through, like, variations that people had done with tainting. The problem with taints is that, then, you need to have those system services tolerate all taints, which also then becomes a problem. So I can definitely agree that there is a challenge here.

B
It's just, it's hard to come up with one true answer, because one user might say, oh, sysdig or fluentd or insert-random-thing must be on this node before my workloads are ready to be supported, and another user might have a whole different list, and I haven't, like, seen something that solves it universally; there are a lot of proposals that just solve it for, like, that one user's group. And maybe my recollection of this KEP is inaccurate, but that's my experience.
A
I think we all agree about the problem, and, that said, the solution could be different, Derek. I think about the proposal... the core of the proposal is actually still per-cluster-based, per-provider-based.

A
So Istio would connect to that: based on the provider, you define a set of conditions, which are driven by the DaemonSet or whatever, and they have to give the signal whether it is ready or alive, and then the node can claim it is ready. So, based on that, you could be flexible based on the node pool, oh yeah, so I think that could work, and it could be...

A
...per cluster, and it could even be flexible, like per node, if you drive that at the machine level.

A
But I'm not sure the KEP itself captures all those details yet, because I forget the details, but I remember we discussed this with the people behind it before proposing it to the SIG. From the top of my mind, if I remember correctly, it didn't really solve some use cases, because there's the cluster level of admin items, and then there are some other people, like what you just mentioned, a universal thing; they may have, like, a subgroup of the nodes...

A
...where they have certain admin requirements, and how they are going to share those duties, and which one overrides which one. So there is some complexity there. But I think, at that time, I did think it was a little bit over-complicated as a set of use cases, so we didn't provide the solution. That's why, in the end, most of the vendors had already worked around the issue, so in the end...

A
...we didn't really push that proposal much harder, because the remaining cases were over-complications. But based on my understanding of these use cases today, I think that's enough to address the problem, though we should look into more detail. So I'm just sharing my experience and my memory here.
N
I can link you to the race condition that we've had previously and the other mitigation efforts we have in place, if that's something you're interested in, or I can write a new doc, if you would prefer that, which specifically goes into what the requirements are, or, like, what we would request, you know, as far as kubelet or node support. Which would you prefer?

B
I guess either, whatever is convenient to you. I guess the part I question is whether you can get down to a single signal versus many signals, and what I was trying to raise was that, like, a vendor working with the existing setup might say the node is ready at a different point than the vendor plus the customer, who says user workloads are only ready when the two are there. And so all that stands out to me is that, in the KEP that's linked, it's "ideally provide a single signal", and I think that's a premise I still question, versus having...
N
So I haven't added anything to the KEP, but my proposal would be more along the lines of this: what you do is we add a declarative API which basically states...

N
...it's a custom resource which defines, inside of it, a tuple, which is the name of the resource, the namespace that a DaemonSet or Deployment would be in, a node selector, which you can use to define a node pool and which is optional, so you can just remove it if you want it to be cluster-wide, and then a label selector for pods. And what it does is it specifies that a particular pod, that exists inside of that namespace, has to be ready, so that it can be...
N
You can't have people just randomly inject whatever workload exists on every node, or on a given node, before that node is considered ready. And so, you know, for example, Istio would have one of these, and then you'd define another one for the use case of sysdig or whatever, or you'd have another one for fluentd, and the node does not ever flip to ready and start accepting schedulable pods until all of those conditions are met. Because, you know, you can still have pods scheduled onto it that ignore the node readiness, you know, via tolerations or whatever, but until the node contains one pod meeting every one of those conditions, and all of those pods are in the ready state, it would just prevent the node from being considered schedulable.
N
That was my thought. So the signal is basically: does the pod exist, and is it ready? And, you know, people that wanted to take advantage of this may have to convert whatever their workload is into something that just sits there and waits in a ready state.
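To make the shape of that proposal concrete, here is a purely hypothetical Go sketch of the kind of declarative rule being described; every type and field name below is invented for illustration, since no updated KEP text exists yet.

```go
// Hypothetical only: "nodes matching NodeSelector are not schedulable until a
// Ready pod matching PodSelector exists in Namespace on that node."
type NodeReadinessGateSpec struct {
	// Optional: restrict the rule to a node pool; empty means cluster-wide.
	NodeSelector map[string]string
	// Namespace the required workload (e.g. the istio-cni DaemonSet) runs in.
	Namespace string
	// Label selector identifying the required pod on each node.
	PodSelector map[string]string
}

// A cluster might carry several such rules (Istio CNI, log forwarding, etc.);
// the kubelet would hold the node back until every rule is satisfied.
```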
B
Yeah, I guess we don't need this... I can't figure out where that breaks down, and so, either way, I think the problem makes sense. Whether it meets every scenario, I haven't given it enough thought, but I'll read the Istio one, Dawn, if you could open up access.

A
Yeah, okay, I'll link it back, I will. So, Derek, we don't need to solve this problem the first time. I just want to reintroduce the problem, because we did have this problem in the past, and we worked around it and also asked each vendor to solve it. But more and more, services are built on top of Kubernetes.
A
So they may not have the luxury of forcing the vendor, this kind of thing, so maybe we, from the Kubernetes side, can address this problem. So maybe it's time to open that issue, and we can discuss more, and it's not necessarily about what the implementation or the solution is. I just want us to understand it's also a real common issue for many users providing a service on top of Kubernetes.

B
The struggle I have is, like: should we just give up on the Ready condition? Is it a bad premise to begin with, and should we promote, instead, other more specific conditions? And so it's kind of like: are we truly locked into what we originally had, or is there just another thing that we can think through? And that's what I haven't come to grips with in my head.
A
We can carry on discussing, and I think I can see that both have pros and cons. One way maybe surfaces complexity to the entire ecosystem; the other one maybe just pushes that complexity down to the node, but how are we going to be able to do that? And also because, without that fine-grained readiness, or whatever the condition is... and maybe it's not the best way; the one-size-fits-all solution is not the best way. So we can...

A
We've really run out of time, and, Sergey, I notice that you asked one question, and that question is pretty straightforward; you already put it there, and we can see whether people are against what you ask, and maybe then we can talk, and then we can think about having, like, a Zoom meeting. Otherwise, maybe we just keep to today's format.
B
I'm pretty sure we can overcome this, so I'll follow up with ContribEx to see what the options are there, but I feel like we can. We can work out the credentials.

A
That's great. To wrap up: follow up with SIG Windows to make sure they have an active representative in SIG Node, and Derek is going to follow up with SIG ContribEx and figure out a solution for those kinds of things. And then we can carry on more discussion on the topology-aware scheduling topic, and also this node readiness signal, and also the system reserved, because that has always been an open issue. We need to move forward and figure out whether there's a more generic or better solution.