From YouTube: Kubernetes SIG Arch - KEP Reading Club 20220905
Description
KEPs discussed:
- Dynamic Resource Allocation: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation
A: Hi, welcome to this month's KEP reading club session. As always, this meeting, like all of the community's meetings, follows the community's and CNCF code of conduct, which boils down to: be excellent to each other. This meeting is also being recorded and will be posted online for future reference, so please act accordingly.
A: That being said, we have one KEP on the agenda today, which is the Dynamic Resource Allocation KEP. We also have the author of the KEP with us on the call, Patrick; thank you for joining. Let's start with a 20-minute read time if possible. It's not a hard limit, so take as much time as you need after that as well.
A: This is a one-hour call and we have this one KEP with us, so let's try to get through it in a nice manner, and hopefully we learn a few new interesting things. I will paste the link to the agenda in the chat, in case anyone wants to follow along and doesn't have the link with them. That's the agenda.
A: Also, just a disclaimer: considering this is a pretty big one, it's fine if we don't get through all of it this session. We can get through however much we can, discuss that, and then take up the remainder async over Slack; that's also a viable and totally okay option.
A: Okay, so I will start a 20-minute timer, and then we can extend as needed depending on how things go.
A: Okay, starting the timer in three, two, one.
A: There are about four minutes remaining. Considering it's a pretty lengthy KEP, why don't we maybe stop at Design Details, discuss up to there, and answer questions while the context is fresh in our minds? Or would folks prefer to go through the entire thing and then discuss it all at once?
A: Okay, that's the 20-minute mark. I think we can start discussing questions now, and then, after five to ten minutes of discussion, proceed with the remainder of the KEP. So, up to this point, does anyone have any questions to get started with?
C: Right, but also a question related to it, because it's a whole-word typo. In the custom parameters implementation definition, in the first paragraph, it says that for ResourceClass the object must be cluster-scoped, and then for ResourceClaim it must be in the same namespace as the ResourceClaim, and thus the pod.
B: Yeah, but even that is useful feedback. Sometimes it's hard, for me at least, being a German speaker with a penchant for endless sentences and long words; it's particularly hard to keep the language simple. It's very tempting to have long sentences in a KEP, but then it just becomes harder to read. So it's very useful to keep it simple, avoid fancy words, and make sure that the language doesn't get in the way of understanding what it's about.
B: So in that sense, this KEP had to be that detailed, and we spent a lot of time going back and forth over exactly these implementation details to make the description clear, to discuss corner cases, and, in the end, have something where we all agree: yes, this can work the way it's specified, with no open questions. Well, "no open questions" is a bit too strong, but with no unknowns left. At that point we knew that this would work. We also knew that there were alternatives that we were still discussing, and that's a bit unusual in this KEP.
C: Well, I'm looking at the index; they have a table of contents there. So: summary, motivation, proposal, and so on.
C: But if it's something smaller, not so complex or complicated, would there be, for example, just a motivation and a proposal? Is that sufficient to start a discussion?
B: What usually happens then is that you go to a SIG. The SIG looks at the problem statement, basically your motivation and what you are trying to achieve, and then decides: yes, this is something that the SIG wants to address. Then perhaps you can get a KEP merged as provisional, with just the motivation and proposal sections filled out, or perhaps partially filled out.
B: If you don't even know how to do it yet, you might not be able to answer or fill in all of these details, like risks and mitigations. That, in my opinion or experience, partly depends on the actual solution before you can answer those parts. But then the SIG might decide: yes, this is worthwhile, let's work on this together, and then you can add more details and write a more complete KEP later on.
B: I myself tend to try to have a more complete understanding of the problem space first, but that's also a risk I'm taking. I did invest quite a bit of time trying to come up with a technical proposal that actually worked. I think I even had a prototype at that point, or I had at least explored some of the technical aspects, and then I wrote down the technical side and started to circulate this KEP a bit more broadly.
A: So I had a question. You mentioned that most of the context needed by the new APIs is present in the object itself, but any additional context needed during pod scheduling may be referenced in a PodScheduling object. So I'm assuming PodScheduling is a new object that is being introduced in this KEP?
A: The part I wasn't able to understand was why a new object is being introduced. Maybe I didn't get to that part yet, but is there a reason for introducing it?
B: Yeah, it wasn't in the original design. Originally, all of the fields for what or where a specific claim could be satisfied were all inside the ResourceClaim status.
B: The other reason was that a PodScheduling object is one level above resource claims, so it can describe multiple resource claims being scheduled or allocated together in a logically consistent way, which wasn't possible with the earlier proposal, where one had to look at all resource claims to figure out whether they were going to be allocated for the same pod. That also opens the door, potentially, for future extensions of the scheduling mechanism where different drivers perhaps even look at the resource claims from other drivers that happen to be needed by the same pod, to make some kind of holistic decision. And that is all easier to do when it's one object that the drivers and the scheduler look at to determine where to do the allocation. There will actually be a KEP update going even further than what is in the current KEP.
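For context, a PodScheduling object as described in the KEP might look roughly like the following. This is a hedged sketch: the API group, version, and field names (`selectedNode`, `potentialNodes`, `unsuitableNodes`) are paraphrased from the KEP draft and may differ from the final API.

```yaml
apiVersion: resource.k8s.io/v1alpha1
kind: PodScheduling
metadata:
  name: my-pod            # same name and namespace as the pod it belongs to
  namespace: team-a
spec:
  selectedNode: worker-2  # node the scheduler is currently trying to use
  potentialNodes:         # candidate nodes the drivers should evaluate
  - worker-1
  - worker-2
status:
  resourceClaims:
  - name: gpu             # one entry per resource claim used by the pod
    unsuitableNodes:      # nodes where this claim cannot be allocated
    - worker-1
```

Because all claims of the pod are negotiated through this single object, the scheduler and the drivers can converge on one node that works for every claim at once.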
A: Okay, any other questions? Otherwise we can probably continue. We have about 15 minutes remaining. If we aren't able to get through the entire KEP, we can always take a second lap.
E: Yeah, so, the resource class, right, I mean the driver. So basically it's a way to specify to the driver what resources we need. To take a very crude example: let's say you have an accelerator, and you want a specific set of, you know, processing units within that accelerator. What's the interface that you use to talk to the driver? I can see in the ResourceClass type...
E: You have something called ResourceClassParametersReference; is that the one? I mean, different drivers will have different architectures. Are you standardizing it, or is it like a void pointer in C? How does that interface work between the driver and...
B: Others have tried that before, and you quickly run into problems trying to identify the common parameters for, say, a GPU. That's just, in my opinion, an impossible task. Perhaps some standard will emerge later on, but right now, at the level of this KEP, these parameters are defined by the resource drivers. The in-tree API just has these parameter references, and what they reference is validated and used only by the resource driver.
B: The expectation is that this will be a CRD, so you create a driver-specific API through a CRD that defines which parameters the driver accepts for a ResourceClaim and for a ResourceClass, and these can be different. It's intentionally separated so that a cluster admin has a way to specify parameters that a normal user can't specify. That's the rationale for having two parameter references: one for the cluster admin and one for the user.
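A rough sketch of what that separation could look like, using a hypothetical vendor CRD. The `gpu.example.com` group, the parameter kinds, and all names here are illustrative assumptions; the `resource.k8s.io/v1alpha1` fields follow the KEP's proposal and may differ in the final API.

```yaml
# Hypothetical vendor CRD instance holding admin-only parameters
# (cluster-scoped, like the ResourceClass that references it)
apiVersion: gpu.example.com/v1
kind: DeviceClassParameters
metadata:
  name: acme-gpu-shared
spec:
  sharing: enabled
---
# ResourceClass created by the cluster admin
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClass
metadata:
  name: acme-gpu
driverName: gpu.example.com    # which resource driver handles claims of this class
parametersRef:                 # cluster-scoped object with admin parameters
  apiGroup: gpu.example.com
  kind: DeviceClassParameters
  name: acme-gpu-shared
---
# ResourceClaim created by a normal user, with their own namespaced parameters
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaim
metadata:
  name: my-gpu
  namespace: team-a
spec:
  resourceClassName: acme-gpu
  parametersRef:               # must live in the same namespace as the claim
    apiGroup: gpu.example.com
    kind: DeviceClaimParameters
    name: my-gpu-params
```

Kubernetes itself never interprets the referenced objects; only the named resource driver validates and consumes them.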
E: I'm not sure whether I got to it yet, but can you provide an example of some sort? If it's already there, you can ignore this. I mean a small example where you actually show: this is how the driver would define it, and this is how you would use it.
B: Well, it's under User Stories, in a way; it starts with...
B: There's no detailed description of what those types are, but it's kind of supposed to be intuitive; just from reading the examples you're supposed to get a good feeling for it. But these examples are always a bit tricky: if you write an example and it raises more questions than it answers, you end up explaining the design and a lot of details very early in the KEP. So I found that part a bit hard to write.
A: So I want to make sure I got the gist of this discussion. If you are a vendor who wants to support this API, your implementation basically has to satisfy the defined gRPC interface.
A: That implementation is specific to the vendor. And then the cluster operator or cluster administrator, whoever wants to make use of this feature, can define a CRD, as you mentioned, or as mentioned in the user stories, that references this device, or basically uses the implementation of the device that satisfies the gRPC interface. Is that right?
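For reference, the driver-facing gRPC interface discussed here is, in the KEP, a service that the kubelet calls on the vendor's node plugin. The sketch below is paraphrased from memory of the KEP draft; the actual service, message, and field names may differ, so treat it as illustrative only.

```proto
// Sketch of the DRA kubelet plugin gRPC service as proposed in the KEP.
// Names are paraphrased and may not match the final API.
service Node {
  // Called before a pod using the claim starts on the node; the driver
  // prepares the device and returns CDI device IDs for the runtime.
  rpc NodePrepareResource(NodePrepareResourceRequest)
      returns (NodePrepareResourceResponse) {}

  // Called after the last pod using the claim has stopped on the node.
  rpc NodeUnprepareResource(NodeUnprepareResourceRequest)
      returns (NodeUnprepareResourceResponse) {}
}

message NodePrepareResourceRequest {
  string namespace = 1;        // namespace of the ResourceClaim
  string claim_uid = 2;
  string claim_name = 3;
  string resource_handle = 4;  // opaque data stored by the driver at allocation time
}

message NodePrepareResourceResponse {
  repeated string cdi_devices = 1;  // CDI device IDs to inject into the container
}

message NodeUnprepareResourceRequest {
  string namespace = 1;
  string claim_uid = 2;
  string claim_name = 3;
  string resource_handle = 4;
}

message NodeUnprepareResourceResponse {}
```

The vendor implements this service plus a control-plane controller; everything about what the parameters mean stays on the vendor's side.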
B: So the expectation is that a resource driver vendor will provide install instructions, a YAML file, or perhaps an operator that installs the resource driver. Then a cluster that doesn't have this driver or this device support can install the driver based on those instructions, and when it runs, there will be a new type for parameters and a new ResourceClass, and users can start taking advantage of that new feature in the cluster.
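To make the user side concrete, a pod consuming such a claim might look like the following. This is a sketch based on the KEP's proposed pod API (`resourceClaims` in the pod spec, `resources.claims` in the container); names like `my-gpu` are assumptions carried over from earlier in the discussion, and exact field names may differ in the final implementation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference
  namespace: team-a
spec:
  resourceClaims:
  - name: gpu                      # pod-local name, referenced by containers below
    source:
      resourceClaimName: my-gpu    # existing ResourceClaim in the same namespace
  containers:
  - name: main
    image: registry.example.com/inference:latest
    resources:
      claims:
      - name: gpu                  # this container gets access to the claimed device
```

The scheduler then coordinates with the driver to pick a node where the claim can be allocated before binding the pod.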
A: Okay, we have about 10 minutes. We can take five more minutes to resume from wherever we were, and then five minutes for any last questions, and then we can call it. Thanks for sticking around, by the way.
A: I had one question, but before I ask that: Patrick, considering this is a pretty big effort across SIGs, is there a tracking issue in k/k for this yet, or some other place where folks can follow along, offer help, or find any low-hanging fruit that you may need help with, anything like that?
B: Yeah, I'm aware of that one, but that's not actually getting that much discussion in day-to-day business. We have a core team of people who are actively working on this, from Intel and NVIDIA.
B: We are kind of reluctant at this point to pull in more people, because it probably wouldn't help. I'm covering the core work on the scheduler, and Ed Bartosz from Intel is covering the kubelet part, and that's pretty much sufficient to get the implementation done.
B: So I think the current work is pretty well covered by people. We started setting up a new channel on Slack, #dra, for dynamic resource allocation, and then SIG Node pointed out that they think this fragments the discussion too much; they preferred to have all of the discussion around DRA happening on the sig-node channel. So we are refocusing: whenever we have something that we want to share, we are now using the sig-node channel, until people get bored or annoyed by us doubling the volume of that channel.
B: But for now it's getting discussed on sig-node, though you also need to read through all of the other things that are getting discussed there. That's probably the place to stay up to date and where we post announcements, like this prototype: I have a prototype PR pending with this work.
B: It's a pull request against Kubernetes where I already raised some design questions, where I hope to get feedback from core API reviewers on the best way of doing certain implementation details around the API server. That PR will probably see most of the discussion for merging the code in the 1.26 timeframe.
A: Got it, yep, thanks, Patrick. So one question I had was: in case the allocation mode is immediate, and basically either pods aren't getting created that request this resource, or pods are getting created but for some reason are unscheduled...
A: ...what happens to the resource in case it's immediate and no pods are requesting it for a period? Is there a case where the resource is allocated but not being used? How is that handled?
B: It will just stay in allocated mode. So immediate mode basically means the user is in control of the lifecycle.
B: This deallocate thing that you mentioned is also not being done for resource claims with immediate allocation. So the idea really is that the lifecycle is very simple: the resource claim gets created, it gets allocated, and it remains allocated until it gets deleted. Any additional logic, like "this resource claim hasn't been used in a while," would be owned by the user or perhaps some higher-level controller.
B: An operator could create such resource claims, for example, yes. The deallocation happens for delayed allocation, and that is because the scheduler then has the task of getting a resource claim allocated, and not just one, but potentially multiple resource claims, for a certain pod, and it doesn't stop until that pod is ready to run. That means, if it detects a situation where pod scheduling can't continue, the only way out of that situation is to deallocate.
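The two lifecycles being contrasted are selected per claim via the allocation mode. A hedged sketch, with the mode values as proposed in the KEP (spelling may differ in the final API), and `acme-gpu` as a hypothetical class name:

```yaml
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaim
metadata:
  name: always-on-gpu
  namespace: team-a
spec:
  resourceClassName: acme-gpu
  allocationMode: Immediate    # allocated as soon as the claim is created;
                               # stays allocated until the claim is deleted,
                               # even if no pod ever uses it
---
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaim
metadata:
  name: on-demand-gpu
  namespace: team-a
spec:
  resourceClassName: acme-gpu
  allocationMode: WaitForFirstConsumer    # allocated only when a pod using the
                                          # claim is scheduled; the scheduler may
                                          # deallocate it if scheduling gets stuck
```

Immediate mode trades possible idle allocation for predictable availability; delayed mode lets the scheduler pick an allocation that fits the pod's node.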
A: Got it, yeah, that makes sense. I think we are out of time, but thank you so much for joining in and answering all of our questions.
A: If anyone has any additional questions, I will start a thread on the sig-architecture channel on the Kubernetes Slack, so please feel free to chime in there, and Patrick can maybe take a look whenever he has the time for it. Thanks again, everyone, for joining, and have a nice day. Bye, see you next month, bye.