From YouTube: SIG Node Resource Management WG, 2022/03/28
Description
Meeting notes and Agenda:
https://docs.google.com/document/d/1ALxPqeHbEc0QOIzJ3rWWPpwRMRlYDzCv0mu2mR4odR8/edit#
A
All right, so welcome to this series for the pluggable resource management, the container compute interface driver extensions. I just wanted to give a short update today on where we are with the KEP definition.
A
So, basically, in the last days we were updating the KEP definition based on some of the feedback, and we got some suggestions back, so now this is on a branch. Most probably we will push it to our master today, and it will be ready for another pass from the reviewers when you have time. But just to go through some of the key things we changed: the summary is more or less similar to what we had before, and the motivation also remains largely untouched. The new stuff, or rather the changes, start from the compute specification options. As suggested by the group, we basically decided to pick one option: we had listed three options, with one leading option based on dynamic resource allocation claims.
A
So in the example we first show how you can use the attribute-based API through the claim mechanism, which actually changes nothing in the classical DRA specification type.
A
So basically we give some definition of possible attributes. We are thinking of having a core list which specifies how many cores you want to request; that can be a static number of cores, or it can be a range, though most probably Alpha will not support ranges. Ranges go a little bit in the direction of burstable quality of service.
A
So
if
you
want
to
spawn
a
container
which
can
burst
between
one
and
four
cores,
we
can
pick
basically
a
CPU
set
with
four
cores
and
and
cap
shares
which
allow
it
the
the
container
to
burst
between
one
and
four
but
yeah.
This
is
currently
out
of
scope
for
Alpha
this
the
ranges
we
will
stick
just
to
fix
the
number
of
of
requested
course.
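To make the burstable example concrete, here is a minimal sketch of the cgroup-level values such a one-to-four-core range could translate to; the struct and field names are purely illustrative, not part of the KEP:

```go
// burstableAssignment illustrates "burst between one and four cores":
// the container is pinned to a four-core cpuset, but its CPU weight is
// sized for roughly one core, so it only bursts when the cores are idle.
type burstableAssignment struct {
	CPUSet    string // e.g. "0-3": the four cores the container may run on
	CPUShares uint64 // 1024 is roughly the weight of one core in cgroup v1 terms
}

var example = burstableAssignment{CPUSet: "0-3", CPUShares: 1024}
```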
A
Then, for each kind of entry we have corresponding attributes. So you can have device affinity attributes, which are basically requiring some device affinity, trying to get some device affinity, or not requiring it. It is similar for memory affinity: "bind" is basically similar to single-NUMA, and "interleaved" is similar to some sort of NUMA-spread kind of semantics.
A
Later we will introduce attributes for huge pages and such, which will be needed for a better specification in terms of memory. Then we also have some CPU attributes which control isolation and sibling scheduling: basically, exclusive and shared isolation levels. With "exclusive" you get exclusive cores, similar to what the static CPU manager does with guaranteed quality of service; "shared" basically means you get a CPU set which can be shared with other pods' containers. Then there is core sibling required and denied; this means the following.
A
Basically,
if
you
want-
or
if
you
have
a,
if
you
are
requesting
for
certain
amount,
of
course,
they
will
try
to
use
the
logical
course
on
the
same
physical
core.
So
simply
required.
Is
this
option
core
sibling
denied
this?
If
you
have?
Basically,
if
you
want
to
take
one
of
the
logical
course
and
block
the
other
logical
core
from
from
being
used
from
other
thoughts,
so
this
is
possibly
denied.
A
There are some applications which want to take the full physical core for themselves, so that's this case. Then there is also an option, "preferred", which tries to get logical cores which are on the same physical core, but this is not a must.
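Taken together, the attributes described above could be modeled roughly as follows; this is a hedged sketch, and all type and field names are assumptions rather than the KEP's actual API:

```go
// Hypothetical claim parameters combining the attributes discussed:
// core count, device/memory affinity, isolation level, and sibling policy.
type CPUIsolation string

const (
	IsolationExclusive CPUIsolation = "exclusive" // like static CPU manager with guaranteed QoS
	IsolationShared    CPUIsolation = "shared"    // cpuset may overlap with other pods
)

type SiblingPolicy string

const (
	SiblingRequired  SiblingPolicy = "required"  // logical cores must share a physical core
	SiblingDenied    SiblingPolicy = "denied"    // block the sibling logical core from other pods
	SiblingPreferred SiblingPolicy = "preferred" // best effort, not a must
)

type MemoryAffinity string

const (
	MemoryBind        MemoryAffinity = "bind"        // single-NUMA semantics
	MemoryInterleaved MemoryAffinity = "interleaved" // NUMA-spread semantics
)

type ComputeClaimParameters struct {
	Cores          int    // fixed core count; ranges are out of scope for Alpha
	DeviceAffinity string // "required", "preferred", or "none"
	MemoryAffinity MemoryAffinity
	Isolation      CPUIsolation
	CoreSibling    SiblingPolicy
}
```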
A
This is a little addition to the KEP. Further additions follow in the architecture section. We removed a little bit there: before, we were mentioning that we were not sure how to associate pods and drivers.
A
Now
we
mentioned
that
we
will
use
the
resource
class
at
the
array,
for
that
and
CCI
resource
manager
basically
will
be
using
another
path
for
the
registration
of
drivers,
Warren
CCI,
which
is
not
to
have
conflicts
with
jury,
basically,
which
is
running
under
Warren
dra.
A
Right, so basically, after that you are creating the entry points to the plugins in /var/run, the DRA one; was it like that?
B
That's under... all that's in there is the CDI specs. Everything with the plugin, the kubelet registration, and the sockets between the kubelet goes back and forth. The plugins all happen in the standard plugin directories, which are at /var/run/kubelet, or I forget what the exact path is, but it's something underneath there.
B
Yeah, I mean, there's two sockets that are created. One is for the kubelet's connection to the plugin, and then the other is from the plugin to the kubelet, and there's two separate directories. One directory is called "plugins", where you create a driver-specific directory for each of your individual drivers, and then for the reverse connection it's under the "plugins_registry" directory, which is how, you know, CSI plugins and all other plugins register to the kubelet, yeah.
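For reference, here is a minimal sketch of the registration flow described above, using the kubelet's pluginregistration v1 gRPC API; the "CCIPlugin" type string, the driver name, and the socket paths are assumptions for a hypothetical CCI driver:

```go
package main

import (
	"context"
	"net"

	"google.golang.org/grpc"
	registerapi "k8s.io/kubelet/pkg/apis/pluginregistration/v1"
)

type registrationServer struct {
	endpoint string // socket the kubelet will call the driver back on
}

// GetInfo is called by the kubelet when it discovers the registration socket.
func (s *registrationServer) GetInfo(ctx context.Context, req *registerapi.InfoRequest) (*registerapi.PluginInfo, error) {
	return &registerapi.PluginInfo{
		Type:              "CCIPlugin", // hypothetical type; not an existing kubelet plugin type
		Name:              "cci.example.com",
		Endpoint:          s.endpoint,
		SupportedVersions: []string{"v1alpha1"},
	}, nil
}

// NotifyRegistrationStatus tells the driver whether registration succeeded.
func (s *registrationServer) NotifyRegistrationStatus(ctx context.Context, status *registerapi.RegistrationStatus) (*registerapi.RegistrationStatusResponse, error) {
	return &registerapi.RegistrationStatusResponse{}, nil
}

func main() {
	// The kubelet watches this directory for new registration sockets.
	lis, err := net.Listen("unix", "/var/lib/kubelet/plugins_registry/cci.example.com-reg.sock")
	if err != nil {
		panic(err)
	}
	srv := grpc.NewServer()
	registerapi.RegisterRegistrationServer(srv, &registrationServer{
		endpoint: "/var/lib/kubelet/plugins/cci.example.com/plugin.sock",
	})
	_ = srv.Serve(lis)
}
```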
A
I remember in my prototype I basically did a unique location for the CCI stuff, so that it doesn't get mixed with the DRA. Maybe I mixed up the paths here; I will double-check that, it might be wrong. But the point here is: I have a unique path to the socket, not using the DRA kind of sockets but having a CCI socket for the registration, so that those two are not mixed, more or less.
A
I will correct that; I will double-check it and make it correct. The other kind of addition was on the scheduler side. I was looking a little bit at the Kubernetes scheduler. The kubelet provides this so-called node listener architecture; the calls are here, yeah, which would be interesting. Just let me find the right spot in the editor. There is a pod resources server, part of the kubelet, which is responsible for exposing available CPUs, available devices, and available memory to the scheduler; the scheduler queries each node for those values through this listener, more or less. So I added some clarification that we have to provide correct information about what the allocatable CPUs and allocatable memory are, at least for the long-term beta release, where we want to coexist with the static kind of CPU management and stuff like that.
A
So
basically
those
we
have
two
ads
a
functionality
which,
where
the
CPU
provider
get
allocatable
CPUs
is,
is
calling
CCI
manager
to
to
get
the
allocated.
Those
CPUs
and
memory
before
this
was
calling
basically
CPU
manager
that
that
would
be
one
one
thing
we
we
have
to
to
enable
for
CCI
so
that
scheduler
knows
what
are
the
allocate
to
those
CPUs
allocatable
memory.
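As an illustration of the node listener being discussed, here is a hedged sketch of a client calling the kubelet pod resources API's GetAllocatableResources, the call whose answers the CCI manager would have to keep accurate; the socket path is the conventional one, and error handling is trimmed:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	podresourcesv1 "k8s.io/kubelet/pkg/apis/podresources/v1"
)

func main() {
	conn, err := grpc.Dial("unix:///var/lib/kubelet/pod-resources/kubelet.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	client := podresourcesv1.NewPodResourcesListerClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// This is the call that would have to be answered by the CCI manager
	// instead of the CPU manager, so the reported CPUs and memory stay correct.
	resp, err := client.GetAllocatableResources(ctx, &podresourcesv1.AllocatableResourcesRequest{})
	if err != nil {
		panic(err)
	}
	fmt.Println("allocatable CPU IDs:", resp.CpuIds)
	fmt.Println("allocatable memory:", resp.Memory)
}
```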
A
Yeah, so I mentioned in the KEP the component name, pod resources server. I don't know if this is sufficient; I can also mention the functions.
A
Right, this was a small addition, this paragraph. Then the other kind of addition was the checkpointing. Yes: as the CCI manager becomes the component responsible for CPU management, it has to provide checkpointing. So we currently have save and load functions, which will basically save the state to the store. And yeah, we can do it similarly to the CPU manager: we can have a CCI manager state.
A
Basically, this is saved, and loaded when we need it. That is another small change, so that we can get checkpointing in. Then we were discussing...
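A minimal sketch of what such a checkpoint could look like, modeled loosely on the kubelet's cpu_manager_state file; the CCIManagerState type, the field names, and the file layout are assumptions, not the KEP's actual format:

```go
package main

import (
	"encoding/json"
	"os"
)

// CCIManagerState is a hypothetical analogue of the CPU manager state:
// which CPUs each container was assigned, plus the default shared pool.
type CCIManagerState struct {
	DefaultCPUSet string            `json:"defaultCpuSet"` // shared pool, e.g. "0-1"
	Assignments   map[string]string `json:"entries"`       // container ID -> exclusive cpuset
}

// Save writes the state to disk so it survives a kubelet restart.
func (s *CCIManagerState) Save(path string) error {
	data, err := json.Marshal(s)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o600)
}

// Load restores the state on startup.
func Load(path string) (*CCIManagerState, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var s CCIManagerState
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}
```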
A
Yeah, okay. So, going back to the KEP: the next kind of proposal was whether we can make the CCI manager component completely self-manageable, without the need for any DRA kubelet drivers or kubelet plugins. There was a suggestion that we could basically handle the claim logic, which is used to reserve resources for a given claim and to free them, inside the CCI drivers, and I was thinking of a first iteration of a possible interface.
A
We could do that by taking our admit and remove-container-resources functions and adding more or less the claim parameters to them. I have a little bit better view on that; you can see it nicely in the gRPC definition. So this is very similar to NodePrepareResource.
A
If you look at NodePrepareResource of the DRA, it defines those four fields, and we can pull them into the admit kind of request; this can help us more or less keep track of the incoming resource claims and do reservations. We will also use the resource handle as a good way to pass the arguments. Basically, before we had the CCI spec there; now we can just pass the resource handle, and that would be sufficient. And then, similarly, we use the remove-resources request basically to free claims, and...
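A hedged sketch of the interface being described: the four claim fields mirror DRA's NodePrepareResourceRequest (namespace, claim UID, claim name, resource handle), while the request and interface names and the extra pod/container context are assumptions:

```go
// Hypothetical CCI admit/remove interface built around the DRA claim fields.
type AdmitContainerResourcesRequest struct {
	// The four fields carried over from DRA's NodePrepareResourceRequest:
	Namespace      string // namespace of the ResourceClaim
	ClaimUID       string // UID of the ResourceClaim
	ClaimName      string // name of the ResourceClaim
	ResourceHandle string // opaque data stored by the controller at allocation time
	// Additional context a compute driver needs (assumed):
	PodUID        string
	ContainerName string
}

type RemoveContainerResourcesRequest struct {
	ClaimUID string // identifies the reservation to free
	PodUID   string
}

// CCIDriver is the hypothetical plugin-side service the kubelet would call.
type CCIDriver interface {
	AdmitContainerResources(req *AdmitContainerResourcesRequest) error
	RemoveContainerResources(req *RemoveContainerResourcesRequest) error
}
```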
B
It's the same DRA architecture we have today. It's just that, instead of what we have currently defined as the kubelet plugin for DRA, which returns a set of CDI devices, you guys can write your own variant of a kubelet plugin that has its own gRPC API, which is what you're presenting here, and potentially a different set of, you know, calls that are made back and forth at different points in the container lifecycle, in order to support CCI devices instead of CDI devices.
A
That
that's
the
point
we
we
call
them
a
little
bit
differently,
the
functions
we
we
and
we
have
a
little
bit
more
information
as
we
need
some.
Some
traditional
information
for
the
containers
and
Bots,
but
basically
we
have
the
admission
function
is
very
similar
to
more
hold
some
of
the
information.
What
not
prepare
resource
was
having
before
and
our
remove
container
resource.
Basically,
it's
we.
We
can
maybe
rename
them
to
admit
resource
or
something,
but.
A
Yep. That's the main change we made in the KEP so far. I think, other than that, we tried to address a lot of the other issues. Just at the end of the... oh yeah, there is one final thing, in the alternatives section at the end.
A
We have these kinds of annotation-based approaches that were suggested; we shortly describe them there.
A
Right, so yeah, we will merge that to master, to our kind of master, today most probably, and then, for all people interested: if you can take a look, we will send a link, and give us some feedback.
D
I have a question, if possible. (Sure.) Sorry if it's already answered; I was on vacation. So last time, at the last meeting I participated in, I asked a question about the other pods, the ones that are not referencing any claims and just request, like, some amount of CPUs and memory, right? And you explained that those pods could also be scheduled to the same node where this CCI driver runs. Is that correct?
D
Yeah. And how, in this situation, can we avoid, like, double accounting? I mean, on the one hand the CCI driver would maintain allocations of CPU and memory, like, in future, and it would also be possible, like, if those pods that are not referencing...
A
Basically, after making a call to the admission, we get the resulting resource set for the pod which was handled by the driver, and we put it in the store, keyed, let's say, with some container ID, together with the resource set. And later, let's say you have another pod which was assigned to that node by the scheduler. To get it assigned correctly to that node, we still have to consider the available resources.
A
This was the point I mentioned at the beginning: for correct functioning of the scheduler, the available resources have to be propagated down to the node listeners, or that class has to be available there. And basically, to ensure no double accounting, the resource store has to be used. So when new containers come in, they will be asking the resource store for available CPUs, and the resource store knows at any time which CPUs are available, more or less.
A
It will maintain a view of the available resources which will be correct at any time for us, but...
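Here is a minimal sketch of the resource-store idea just described, simplified to a plain CPU count rather than real cpusets; all names are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// ResourceStore records every admitted reservation so that admission of
// new containers sees a consistent view and nothing is double-accounted.
type ResourceStore struct {
	mu          sync.Mutex
	allocatable int64            // total CPUs on the node
	reserved    map[string]int64 // container ID -> reserved CPUs
}

func NewResourceStore(allocatable int64) *ResourceStore {
	return &ResourceStore{allocatable: allocatable, reserved: map[string]int64{}}
}

func (s *ResourceStore) used() int64 {
	var used int64
	for _, n := range s.reserved {
		used += n
	}
	return used
}

// Available is what would be reported up through the node listener.
func (s *ResourceStore) Available() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.allocatable - s.used()
}

// Reserve fails if the reservation would exceed the node's capacity.
func (s *ResourceStore) Reserve(containerID string, cpus int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.used()+cpus > s.allocatable {
		return fmt.Errorf("insufficient CPUs: want %d, have %d", cpus, s.allocatable-s.used())
	}
	s.reserved[containerID] = cpus
	return nil
}

// Free releases a reservation when the claim is removed.
func (s *ResourceStore) Free(containerID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.reserved, containerID)
}
```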
D
That would happen on the kubelet's side, right? Isn't it already too late then? So I was kind of thinking that some changes should be done on the scheduler side, to avoid this kind of situation where we actually schedule a pod that actually cannot be served; that's the problem.
A
To avoid that, basically, we had a small kind of verification. Let me find it again... yeah, here it is; I will show the code in a second. But basically there is a so-called pod resources server inside the kubelet, which is this kind of node listener. So the... yeah.
A
This is a code snippet of the pod resources server in the kubelet, and what the scheduler does is call this get-allocatable-resources for each node, and...
C
Another point: this is at least the case for the NUMA-aware scheduling solution we were working on, and this was the recommended solution. But in any case, as best as I know, and I'm not sure this changed with the DRA, I think no, it didn't, the scheduler is never, ever calling the node directly; it never communicates with nodes. And I'm...
C
The thing is, if I understand the concern correctly, and please correct me if I'm wrong: this is exactly the point. If you provide this data but you don't change the scheduler, the scheduler is not aware of this data, so we are back to square one, and I'm not sure this actually answers the question.
C
Okay, so the default scheduler doesn't need this data. The scheduler plugin which enables the topology-aware scheduling actually needs to consume this data, have the intermediate representation, actually the API object to represent this data, and then consume it. But all of those steps are explicit. So if, and I really mean "if", because I'm not up to date on the changes to the KEP, so what I'm saying could be obsolete already...
A
In any case, in terms of DRA scheduling, which is covered by the controllers, this is not really relevant. This becomes relevant as soon as you turn on static CPU management.
A
So if you have static CPU management and there are standard pods coming in with some guaranteed quality of service, that is when this actually becomes important. But in an environment where you don't have pods with static, with guaranteed quality of service and stuff like that, all the scheduling, at least for the pods with claims, will be covered by the controllers in the DRA.
A
So
one
one
of
the
kind
of
requirement.
What
we
will
have
is
for
Alpha
phase,
our
kind
of
pots,
which
will
get
claims,
don't
get
or
they
will
be
using
CPU
management,
Norm
and
additionally
use.
Actually,
we
would
avoid
specifying
request
limits,
request
the
CPU
request,
limits
and
and
and
yeah,
basically,
usually
in
the
container
spec.
You
have
request
limits
and
yeah,
as
as
the
the
whole
specification
of
how
many
cores
you
want
and
stuff
like
that
is
happening
through
the
resource
claim.
A
So we would require that requests and limits are basically left out, not included in the pod spec, and in that case scheduling is handled by the DRA controllers. So...
A
...make sure it's okay? Okay. This is maybe a nice simplification for the Alpha version. Later, when we have the static kind of, the guaranteed quality of service, we have to come back to this point and take a look exactly at the allocatable CPUs and allocatable memory, and whether we need to somehow propagate the data further to the scheduler. But for Alpha I think we are fine, as long as our pods do not have requests and limits; and in any case we are not using the static policy so far.
D
But even in this case, some pods would be scheduled to the same node, and they will anyway consume some CPU.
A
We have, similar to the static CPU manager, a distinction between kind of the exclusive pool and the shared pool, and yeah, so basically those pods we can put in the shared pool.
D
And this CCI driver would not actually work with those, with that pool, with the shared pool?
A
It can also work with the shared pool. If I switch to the KEP: we have basically an isolation level called "shared". So if you want to put certain applications on the shared pool, you can do that with that flag. But in that case we don't care if they overlap, because they are shared; it's known by contract that they can overlap.
A
Yeah, this was an addition we did to the KEP, a little bit more specification of the attribute-based stuff and of what the claim kind of integration can look like. That was the addition.
D
What I would like to ask about is the situation when we have a CCI driver and some DRA drivers. How would they... I assume that they would just be, like, filtering claims by driver name or something like that, right? So as to understand which claims should actually be served by a certain plugin, which...
A
By driver name, yeah. And, as we were discussing at the beginning, we can create a unique socket for the CCI kind of registration.
D
Like, all those plugins, they will be filtering all pods referencing any claim, so that kind of concerns me in terms of performance. So, like, if... if...
B
I don't follow. So, in the same way that, you know, standard DRA works: I have my controller, and my controller allocates resources however it knows how to allocate resources, based on being called out to by the scheduler, right? None of that changes in this world. Yeah, yeah.
B
It knows how to route the request to the standard DRA path or the CCI path, based on the driver name.
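A tiny sketch of the driver-name dispatch just described, reusing the hypothetical CCIDriver interface from the earlier sketch; everything here is illustrative:

```go
// routeClaim decides whether a claim is served by a registered CCI driver;
// if not, the claim falls through to the standard DRA plugin path.
func routeClaim(driverName string, cciDrivers map[string]CCIDriver) (CCIDriver, bool) {
	d, ok := cciDrivers[driverName] // e.g. "cci.example.com"
	return d, ok
}
```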
D
Yeah, but wouldn't it, like, raise some performance issues? Because, like, for CDI, for these DRA devices, the amount of pods referencing those claims is kind of, like, not that big, unlike in this case with CPU and memory. So, I mean...
D
That is the concern, yeah. And, like, this code path would be called much, much more frequently than now, so it potentially can create some performance issues.
B
Yeah, I mean, that was one of the initial arguments for keeping the CPU manager in the kubelet to begin with. But I don't see there being much more overhead doing this than if we were to, you know, turn the CPU manager into a plug-in via any architecture, right? The minute you decide to have this CPU allocation be done by a plug-in... and so that's the way we're going to do it going forward.
D
Yes. So basically, in the case of, like, many claims, a much bigger amount than for the devices. So basically, like, let's imagine that every pod scheduled to the node would be referencing some claim, because it wants to use some CPUs and memory; in this case the amount of queries to the API server, because we need to, like, for each...
D
Okay, so this should be somehow captured.
A
We can just mention it, basically: processing the claim will still require the connection, or, within the kubelet, you still need to contact the Kubernetes API server for that. Until that is optimized away, it's, yeah, it's a limitation which is there, yeah.
B
But it slows down everyone else, if...
B
Yeah, we'll see. I mean, by design, pod admission is not and cannot be parallelized, but pod creation, and starting a container and running it, can be, in theory, although it's not at the moment. Yes.
A
Yeah, we will mention it in known limitations, something like that.