From YouTube: Kubernetes SIG Node 20230926
Description
SIG Node weekly meeting. Agenda and notes: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.adoto8roitwq
A
Hello, hello. Today is September 26, 2023, and this is the SIG Node weekly meeting. Welcome, everybody. We have a very long agenda today, so let's jump right into it. First is Filip.
B
I'd like to ask a question, or ask for your opinion. Today, the state of node draining is that we have this top-down approach, where you usually run kubectl drain, or you may have some other tooling.
B
For example, some node autoscaler or similar, so you can have other options as well. But we have this top-down approach where we ask all the pods to be evicted, without any regard for the underlying application.
B
So we just hope that the pods get evicted and we can complete the drain successfully and remove all the pods. Unfortunately, this does not work in every case. We also have PDBs, and some applications may wish to stop the drain and block it under some circumstances, while other applications live with the disruption but are not happy about it, which you can also see in the motivation section.
B
There are linked issues where people have problems: for example, their single-replica application needs to cope with the disruption, and since they don't want to block the drain for the administrator, they are not using PDBs and their application gets disrupted. And they do not wish to run highly available applications with multiple replicas. Then there are applications that do not want to be disrupted at all.
B
For those, the administrator has to dig down into the application to see whether it can be disrupted even though a PDB is specified, which is a manual process, and we would like to avoid that. So with this in mind, we had some discussions in the background, and we are proposing a new process for kubectl drain, or any type of drain. We would like it to be more declarative and to give more control to the application.
B
So we are proposing to create a new API called NodeMaintenance, which would target a node. For now we have just this simple API, where you specify the node which you want to drain and specify a reason; I will talk about the status later. You start the drain as usual: you mark the node unschedulable, and then you create the NodeMaintenance object. We would like to implement this in kubectl drain first, and other drainers can implement it as well.
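A minimal sketch of what the proposed NodeMaintenance object could look like as Go API types; every field name here is inferred from the discussion, not taken from the final KEP:

```go
package example

// Hypothetical NodeMaintenance API, inferred from the discussion above.
// Field names and the status shape are assumptions, not the final design.
type NodeMaintenanceSpec struct {
	// NodeName selects the node to be drained.
	NodeName string `json:"nodeName"`
	// Reason records why the maintenance was requested.
	Reason string `json:"reason,omitempty"`
}

type NodeMaintenanceStatus struct {
	// PendingPods is the number of pods still targeted by this maintenance.
	PendingPods int32 `json:"pendingPods,omitempty"`
}
```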
B
By creating this object, it will get picked up by the node maintenance controller, which will, according to the maintenance object, select the pods which are on the node.
B
It will set a new condition called TargetedByNodeMaintenance on each pod on this node, and then it is up to the application. We would like to start with the deployment controller first, and the application can decide how to delete these pods from the node or how to move them. In the case of the deployment controller, we have the maxSurge field.
B
The controller could actually create new pods on another node, and once the surge pods become available, the ReplicaSet would delete the pods that are on the maintained node. The advantage is that we honor the wish of the application, which specifies in maxSurge how many surge pods we want to create.
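For reference, this reuses the existing apps/v1 rolling-update semantics; a strategy like the following lets the controller surge one replacement pod before removing the old one, which is what makes a zero-downtime move possible even for a single replica (standard apps/v1 types, not part of the proposal itself):

```go
package example

import (
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// Surge one extra pod and never drop below the desired replica count.
var (
	one  = intstr.FromInt(1)
	zero = intstr.FromInt(0)

	strategy = appsv1.DeploymentStrategy{
		Type: appsv1.RollingUpdateDeploymentStrategyType,
		RollingUpdate: &appsv1.RollingUpdateDeployment{
			MaxSurge:       &one,
			MaxUnavailable: &zero,
		},
	}
)
```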
B
So this can work also for single-replica applications, for example. Once that is done, kubectl drain would work as normal, but this time, if we see support for the NodeMaintenance objects in the cluster, we will try not to evict the pods whose applications support the surge, and we will evict or delete the remaining pods as usual.
B
So this is the proposal, and one of the advantages is that we can add other features in the future. For example, in the future this node maintenance controller could also select pods of DaemonSets, or select pods according to some priority schedule or priority list, and it could gradually remove the pods according to the user's wishes.
B
We would expect that the pods, or the controllers of the pods, would resolve this without our interaction. I suppose later we can add some flags to modify this behavior, but it would be up to the application first, so the drain would wait for it.
B
Okay, so this loop runs in parallel with that loop; they are not dependent on each other. We also have the status, which should be resolved by the node maintenance controller, which would track the targeted pods which have the condition, and the admin could actually query the NodeMaintenance object and see how many pods remain until the drain is complete.
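A rough sketch of how such a query could be answered, assuming pods carry the condition named in the discussion (the condition type string is an assumption, not a final name):

```go
package example

import corev1 "k8s.io/api/core/v1"

// pendingPods counts pods that still carry the maintenance condition,
// i.e. pods the drain is still waiting on.
func pendingPods(pods []corev1.Pod) int {
	n := 0
	for _, p := range pods {
		for _, c := range p.Status.Conditions {
			if string(c.Type) == "TargetedByNodeMaintenance" && c.Status == corev1.ConditionTrue {
				n++
			}
		}
	}
	return n
}
```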
B
We also thought about introducing this API into the Node object itself, which could also be done. But since we see the possibility of extending this API in the future and of moving the kubectl drain logic into the controller, it seemed best to us to create a new object and not put this into the Node object.
B
So I can ask: what are your opinions on this proposal, and is SIG Node okay with this approach?
C
I had a quick question: does each controller need to integrate with this? For example, let's say I'm building my own controller that manages pods, not a Deployment or some built-in controller like that. How would I integrate with this? Do I need to integrate with it? And what if I'm just creating bare pods or something like that?
B
You do not need to integrate with this, but then you will not get the benefits of it.
B
We are also proposing to create a surge API, which could have this shape or another; it has not been discussed yet. This could mean that other CRDs could implement this API and advertise whether they support the surge, and they could actually migrate or delete the pods themselves.
C
But if the controller did not integrate with that, would it just perform like it does today, where it just uses the termination grace period and kills the pods there?
B
It would behave the same as today. That means kubectl drain would try to evict the pods, and if they are covered by a PDB it would not evict them.
B
We are also proposing to make small changes to the disruption controller and to change the counting for the PDBs, so that for applications which support the surge, the PDB could also cover the surge pods, which is not possible today.
G
I have a couple of questions. What's the proposed RBAC model for creating NodeMaintenance resources? And is there anything you're proposing to do to expand this API for things like: only this number of concurrent nodes can be under upgrade, or only nodes in a single zone can be upgraded, no parallel zone upgrades, no parallel zone maintenance? The RBAC question is probably my first question, and then the second one comes afterwards.
B
Yeah. For the RBAC model, I'm not sure from which direction you are asking, but RBAC for the NodeMaintenance could, and should, be accessible only to the administrator. So that covers the creation and the deletion of the NodeMaintenance object. And since we are putting the maintenance condition on each pod, only the controllers which want to support the surge have to check their pods.
B
So they do not need to watch for the NodeMaintenance. But this also means that other processes or controllers that want to detect the general maintenance can get RBAC access to NodeMaintenance, observe it, and react to it.
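For illustration only, admin-only access could be expressed as a dedicated ClusterRole; the API group and resource name below are guesses, since the API shape is still under discussion:

```go
package example

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Hypothetical ClusterRole granting full control over NodeMaintenance
// objects, intended to be bound only to cluster administrators.
var nodeMaintenanceAdmin = rbacv1.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: "node-maintenance-admin"},
	Rules: []rbacv1.PolicyRule{{
		APIGroups: []string{"node.k8s.io"}, // assumed group, not final
		Resources: []string{"nodemaintenances"},
		Verbs:     []string{"create", "delete", "get", "list", "watch"},
	}},
}
```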
G
Yes. Today, if I want to call kubectl drain on a node, I need privileges to get that node resource, change the unschedulable attribute of that node resource, and then I need privileges to delete all pods, probably independent of the node. To my knowledge, we have no way right now at an RBAC level to restrict my delete-pods rights to just the pods on that node and that node only. I'm curious.
G
Then, having privileges to create a NodeMaintenance record is kind of an indirect privilege-escalation action today. So would we map that traditionally: if you can create a NodeMaintenance resource, it means you should have read rights on nodes and delete rights on all pods in the cluster? Or is there anything we could do, if we were to pursue this, to make it so you could restrict your rights to that which is resident on that node and no other?
B
Yeah, that's a very interesting question. Today, since all of the actions are done by kubectl drain, you need to have the privileges to actually evict or delete pods and to mark the node unschedulable. So we would have to explore that direction, if we could, because we could also move the kubectl drain logic into the controller.
G
I start to worry about privilege-escalation flows, or the disruptions one can cause. So the thing that would be really appealing to me personally about this is if I could restrict the set of rights you need to give out to actors in order to do maintenance, even on a node-by-node basis.
G
I'm just wondering if we can explore some of that before making a final decision on this proposal, because right now I could see maybe some unintended consequences.
B
I just want to add that it seems nobody had any big objections against this API. So I would like to ask people to do some reviews if they have time, and we will also be bringing this to other SIGs, SIG Apps, SIG CLI and SIG Scheduling, to ask for feedback. So this is work in progress and there is still a way to go. Thank you.
G
I think my objection is a little stronger, in the sense that maybe a follow-on activity is to see, if we introduce this API, what is the least restrictive RBAC we can do, and that would be an argument for this API. I worry about the presence of this API causing very bad potential actions in clusters.
G
If I can't restrict the creation of these NodeMaintenance resources, or what happens if they're acted on concurrently... So maybe one thing to think through is: could you do some activity to intersect this API with the NodeRestriction plugin that we have today, or just generally around the idea of treating this as a way of doing scoped RBAC interaction?
A
Another thing is whether a node can put itself into maintenance, for instance for graceful termination; it can be implemented for spot instances and such. So if graceful termination is configured, do you want to follow the same approach and not have the node put itself into maintenance? But then it would be the human's care, and not something the node can do itself. Yeah.
A
Okay, as I said, let's move to the next topic. Filip, you can stop sharing. Adrian, yeah.
H
Yeah, hi. Since checkpoint support, forensic container checkpointing, was introduced, I have talked to a couple of people who are using it, and one of the main questions was what happens if there are too many checkpoints and they are filling up the disk. So based on this discussion, I made the pull request linked here in the document, and during...
H
...the review of the pull request, there was a comment from Mike to bring this to SIG Node. So basically I just want to let everybody know that I'm working on it, and Mike suggested some implementations different from what I did here. But before we continue discussing it on the pull request...
H
...I guess we don't need to discuss it in detail here, seeing how full the agenda is. But if anyone wants to look at it, please take a look at the pull request and maybe give some feedback on whether and how this should be implemented.
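The kind of retention policy under discussion could be as simple as keeping the N newest checkpoint archives; this is a sketch of the idea only, not what the linked pull request implements, and the directory layout and file naming are assumptions:

```go
package checkpointgc

import (
	"os"
	"path/filepath"
	"sort"
)

// pruneCheckpoints keeps only the `keep` newest checkpoint archives in dir
// and removes the rest. Files that cannot be stat'ed are skipped.
func pruneCheckpoints(dir string, keep int) error {
	paths, err := filepath.Glob(filepath.Join(dir, "checkpoint-*.tar"))
	if err != nil {
		return err
	}
	type entry struct {
		path string
		mod  int64
	}
	var entries []entry
	for _, p := range paths {
		fi, err := os.Stat(p)
		if err != nil {
			continue
		}
		entries = append(entries, entry{p, fi.ModTime().UnixNano()})
	}
	// Sort newest first, so everything past index keep-1 gets deleted.
	sort.Slice(entries, func(i, j int) bool { return entries[i].mod > entries[j].mod })
	for i := keep; i < len(entries); i++ {
		if err := os.Remove(entries[i].path); err != nil {
			return err
		}
	}
	return nil
}
```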
F
I think maybe a couple of minutes of discussion may be worth it. So, Mike, I think the main point of contention was whether this should be done automatically, or whether it's the responsibility of an operator, or an external entity that is actually triggering the checkpoint, to move things away immediately.
I
Right, right. The original KEP was restricted to the forensics cases, but obviously the API can be used for additional cases, and clearly people were doing additional cases with additional tools outside of our purview. I didn't know if you wanted to reopen the KEP and start talking about ways to manage these.
I
For example, the fix Adrian is putting in is, I think, more isolated: you're making a bunch of backups and you only want there to be 10, for example. But right off the top, you could see a policy that would allow for a backup every month, a backup every week, right? What kind of backup system are we enabling? Do we want to have policies?
I
Do we want to talk about that? Or, as Mrunal said, do we want to just give up and let somebody else inject plugins, or some other way to decide how to collect these additional checkpoints? And where should, how should we name them? What should the path look like for all the checkpoints? You know, we reuse the pod IDs.
I
Unfortunately, if you make a path with pod IDs, you might be landing people in directories, or putting listeners in directories, where they maybe don't want to be. It just seemed like we were going into some of the use cases that had stopped us from extending the KEP to cover them.
I
So I wanted to signal people, so they know, and if they're going to be involved, they probably want to take a look at what this is. Because it reads sort of like how we handle logs today, where you have 10 and then you start over. And that's fine, I think, for certain situations, but I'm not sure that is the right approach for checkpointing containers, and the memory that's stored in those containers, for uses outside of forensics.
H
I'm not sure myself. I just talked to some people, and this was something they always bring up, so I thought I'll try to bring a solution to that. But yeah, you're right, and I don't know if that's the correct thing or if we should do something else.
I
We've been running into this kind of issue quite a lot, Adrian, which is, you know, running out of storage space and overuse of storage, and this is a heavy-storage use case, right? It might be more apt for us to be able to point to another drive where we're storing these checkpoints, and maybe that's a secured drive, for example. That's the kind of thing I would expect to come out of the KEP discussion. Yeah.
A
Yeah, in general I feel that we are lacking these housekeeping agents on a node. We have a health checker that monitors whether the kubelet is still alive and restarts it, we have this checkpoints problem, and now we're trying to introduce log collection through the kubelet. There is nothing else guaranteed to be running on a node that we can assume can collect logs, even logs unrelated to Kubernetes itself.
A
So we have a KEP that collects logs through the kubelet. I feel that we maybe put too much on the kubelet's shoulders, but at the same time we don't have anything else guaranteed to be running on a node. So maybe, if we can extend NPD, for instance, to take actions rather than having just reader functionality, we could put this there.
D
But it also depends, because the checkpoints are created on a single node, and it's not necessarily the case that all nodes in the cluster will be the same. So, for example, if a pod that is consuming a large amount of memory is being checkpointed several times, this may fill the space on one node but not on another. So there are these cases as well.
A
Okay, so I think the action item here is to continue the discussion on the PR, but I don't think there is any agreement right now on the right direction to go. Okay, I don't see... okay! Thank you, thank you. I don't see Kevin, and Nicole... I think the request is just to approve a couple of PRs on enhancements to move CDI forward.
A
There's also a link to some planning document; I'm not sure what this is about. Oh okay, it's planning, okay, that makes sense. So thank you. Now, Markus.
J
Yes, we have 25 minutes. I don't want to take too much of the time, so we can get to the end of the agenda, but here is a quick introduction, kind of a heads-up, to get more thoughts and initial comments about the KEP that I've been preparing with Zvonko from NVIDIA. It's about passing down resource information to CRI.
J
This KEP is about much better visibility of all pod resources, early on, for the CRI runtime, and we have two separate goals in mind. The first one would be passing down all resources of all containers already at pod sandbox creation time.
J
The primary usage scenario for this goal is VM-based or confidential container applications, where the VM is prepared at pod sandbox creation, and for that reason the VM-based runtime would really need information about all the resources, like devices, mounts and everything else, so that it can prepare the VM correctly.
J
And then there is another goal, which would be passing down the Kubernetes resource requests and limits to the runtime as informational data, in a sort of unobfuscated form.
J
I could quickly show the, okay, if you can give me the sharing rights, I'll just show the current content there.
J
Yeah, so this is the KEP in its current state, a work in progress, but basically the summary is about the two changes that we would like to make: pass down all resources at pod sandbox creation, and then, second, the resource requests and limits in the container config, as information. The motivation lists two different usage scenarios.
J
The first one is the VM-based and confidential container runtimes, and the other class of usage scenarios is custom resource placement or optimization on the runtime side. And the goals, well, there are these two things; the non-goals currently are that we don't want to change any Kubernetes resource management or change existing behavior of CRI. Then there are some user stories, and then the CRI API changes; I'll quickly show what we currently have.
J
In this proposal there would be a new field called PodResourceConfig, and it would include resource information for all init containers and containers. For each of these, it would basically list the name, and then the resource requests and limits, mounts, devices and CDI devices. And then we have some content about the implementation. Basically, the kubelet would be the only component affected by this, and some refactoring work would be needed.
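A sketch of the proposed addition, reconstructed from the discussion; the names mirror what was presented but are not final, and in the real CRI these would be protobuf messages rather than the Go structs shown here:

```go
package example

// PodResourceConfig would be sent at pod sandbox creation time so the
// runtime sees the resources of every container up front.
type PodResourceConfig struct {
	InitContainers []ContainerResourceConfig
	Containers     []ContainerResourceConfig
}

// ContainerResourceConfig carries per-container resource information.
type ContainerResourceConfig struct {
	Name       string            // container name within the pod
	Requests   map[string]string // e.g. "cpu": "500m", "memory": "1Gi"
	Limits     map[string]string
	Mounts     []string // volume mounts prepared for the container
	Devices    []string // host device paths
	CDIDevices []string // fully qualified CDI device names
}
```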
J
It affects some aspects of how the resources are prepared; some of that has to happen even before pod sandbox creation. But anyway, that's the KEP in its current form. We are working on it and hopefully will be able to show it in a demo in the upcoming weeks, so you can get a practical view of how it would and could work and what the benefits are in practical usage scenarios.
F
I think one thing to make sure of here is: how confident are we that we are covering all possible information we are sending here, or will we have to keep updating it? And maybe we can reuse some of the existing structures. One more thing: what is the overhead of sending all of this over CRI all the time? Should it be made optional based on the runtime class, or should it be done for runc-based runtimes as well?
J
Yeah, yeah, good questions. I have a proof-of-concept implementation on the kubelet side, and there is a bit of uncertainty about whether we can pass everything down. There are some corner cases, mounts that get added late, like the pod error log mounts, and there was some other case, so that needs to be sorted out. But generally it looks like everything should be there, because the CDI devices, the volumes, everything is prepared in the managers.
J
Go ahead. So yeah, Sergey was asking whether there would be any immediate use cases this could help.
K
Yeah, okay, right. So the immediate use case is Kata containers and confidential containers. Just as a heads-up: confidential containers, which is based on Kata, creates a VM, and in this VM the pod, or the container, is run.
K
For now, what we're doing is hot-plugging all the devices that are coming in, not at sandbox creation time but at container creation time. So we know: okay, there is a CDI device coming, there is a PCI, a vfio, device coming into the container, and we mount all of those things after the VM is created, in the create-container phase. But for cold plug, or for direct attachment of devices like, let's say, GPUs, DPUs or other PCI devices...
K
...we need to create a correct PCI topology inside the VM. To have all this information correctly available at sandbox creation time, we need to pass through all the devices that are allocated for the pod or the container. Right now, the kubelet accumulates memory and CPUs and creates two CRI annotations to pass those through to, let's say, containerd or the CRI interface.
K
So the immediate use case is correct sizing of the VM at sandbox creation time, and not at container creation time, so that when the container comes in, the VM is already sized, with all the PCI topology set up. I mean, if you need to plug in a PCI Express device, you need to have a PCIe root port or a PCI switch virtualized inside the VM. So we need to know up front what we're dealing with. And in the confidential containers use case...
K
...we don't want to do hot-plugging, because it creates an additional attack surface if we are reconfiguring the VM during runtime, which is what we want to prevent. That's why we're saying: okay, we want everything directly attached at sandbox creation time, and that's where we need all the information about what's been allocated, all the resources that are in the YAML, in the pod.
K
Today, what we're doing, what I'm doing, is creating CDI annotations, and we have a fork of containerd. This forked containerd reads the CDI annotations and creates the right PCI Express topology at sandbox creation time. But we would like to see this coming officially from Kubernetes, from the CRI, and not from a hacked containerd. So that's how we're doing it right now, with CDI annotations, but this also only works if we have CDI devices.
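To make the current workaround concrete: CDI annotations attach fully qualified device names under the cdi.k8s.io/ key prefix, roughly as below; the key suffix and device names are made-up examples:

```go
package example

// CDI annotations map a "cdi.k8s.io/<name>" key to a comma-separated list
// of fully qualified device names. The forked containerd described above
// reads these at sandbox creation time to build the PCIe topology.
var cdiAnnotations = map[string]string{
	"cdi.k8s.io/gpu-devices": "nvidia.com/gpu=gpu0,nvidia.com/gpu=gpu1",
}
```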
K
But we want to extend this interface so that we can use all the resources that are allocated for the pod, be it storage, be it PCI Express devices, be it vfio devices, or something else that we are passing through, so that we can size the VM correctly. And one restriction we are maybe going to make in the KEP, for the beginning, is that we are only...
K
So, to get to Mrunal's question: whether the interface is going to be extended or not depends on whether we can say, as a community, that all the devices that are going to be allocated can be described as CDI. If so, then I don't think we need to extend the interface, because if we understand CDI, we can pass through all the CDI devices.
K
But if CDI is not the right tool to describe all the resources, or all the devices, that are going to be passed to the container, then we might need to extend the interface. So what we're currently doing is looking at all the use cases that we can find, and people are commenting on various PRs in Kata, in confidential containers, and on the KEP.
L
And I would actually like to add a few more things. Right now, most of the Kata implementation, and similar VM-based implementations, are based on the idea of hot plug. But down the line, when TDX VMs become more mainstream, the problem is that those VMs are static after creation, so you cannot really hot-plug anything afterwards.
M
Okay, sorry about that, yeah. So I just have two KEPs. As part of, I think, the 1.28 retro, we talked about trying to get reviews earlier in the KEP cycle. So I have, excuse me, the PodReadyToStartContainers condition KEP.
M
The beta promotion is ready for review and approval. And then I also have the split image filesystem one. I know there was some interest, especially, I think, from Don and others, around support for this for their use case, and we're hoping to get some more feedback on that.
M
We
have
one
use
case
in
mind,
but
we
would
be
interested
to
hear
about
others
and
part
of
that
cup,
so
I'm
hoping
to
get
feedback
on
that
early,
but
otherwise
both
of
those
cups
are
open
and
ready
for
reviews.
M
And I think that's really it.
A
Thank you, yeah. Also, in the planning document we decided on the reviewers, so you can ping those people as well. And finally, we have Rob and Katarzyna; do we have anything over here?
A
And we need to just, not toss a coin, but commit to something all together.
A
Thank you. Right now, I think we have reached the end of the agenda, unless there's anything else people want to talk about.