From YouTube: Kubernetes Resource Management WG 20180117
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A: All right, well, welcome everyone to the January 17th meeting of the Resource Management Workgroup. I apologize for not being able to hold last week's meeting. Just as a reminder, we are going to move to a bi-weekly cadence after this meeting, so you will see updates to the invites to reflect that later. Today we have a number of items on the agenda. Before we turn to the agenda items, I just want to call out: are there any particular features or topics not captured here that we want to track for Kubernetes 1.10 that need to be discussed?
A: I can't hear whoever that was yet, but if so, feel free to speak up about those, so we can give proper priority to the near-term items. With that, if there are no immediate... I can hear you, Jeremy. If there are no immediate concerns, we can switch to the agenda. So, first on the agenda were questions around Allocate RPC calls.
B: So in this PR, what we are doing now is making the Allocate call to the device plugin at each container creation; previously it was made once for the pod, for all of its containers. So the main point of argument or discussion here is that there is an opinion from some folks that, even on a container restart, we should not use the cached state from within the kubelet, and we should instead make an Allocate call for each container.
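For context, here is a minimal sketch of the per-container Allocate shape under discussion, written as Go types that mirror the layout the device plugin API took in v1beta1. This is a simplified sketch, not the exact generated gRPC code; whether the call happens once at pod admission or at each container start is exactly the point being debated.

```go
package deviceplugin

import "context"

// AllocateRequest carries one entry per container, rather than a
// single pod-wide request as in the earlier design.
type AllocateRequest struct {
	ContainerRequests []*ContainerAllocateRequest
}

// ContainerAllocateRequest lists the device IDs the kubelet has
// assigned to one container.
type ContainerAllocateRequest struct {
	DevicesIDs []string
}

// AllocateResponse returns per-container runtime configuration
// (env vars, mounts, device nodes), in the same order as the request.
type AllocateResponse struct {
	ContainerResponses []*ContainerAllocateResponse
}

type ContainerAllocateResponse struct {
	Envs    map[string]string
	Mounts  []*Mount
	Devices []*DeviceSpec
}

type Mount struct {
	ContainerPath string
	HostPath      string
	ReadOnly      bool
}

type DeviceSpec struct {
	ContainerPath string
	HostPath      string
	Permissions   string
}

// DevicePluginServer is the interface a plugin implements.
type DevicePluginServer interface {
	Allocate(ctx context.Context, req *AllocateRequest) (*AllocateResponse, error)
}
```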
D: I agree with the maintainers' opinion that we want to use cached state to handle a potential device plugin failure. I think it has been a philosophy that a Kubernetes control plane failure shouldn't affect pods running in the normal pattern once a pod is allocated to a node, and I think the SIG Node team has really put a lot of effort into making sure the kubelet can be restarted or replaced, and we should do the same for the device plugin.
E: Sort of the argument that I have for running Allocate at the beginning of each container start is that, for example, in the case of NVIDIA GPUs, we would be able to clean up if the container, when it restarted, put the GPU in a bad state or left, for example, memory leaks, etc. In terms of infrastructure, I think it's a better model if we're able to scrub the GPU memory. Now, in terms of caching, I'm not completely against caching.
B: Am I correct that there are two arguments: one is the improved restart time, and the second argument, the primary one, is that an already-running container should not depend on the device plugin? Let's say a container which was already running somehow restarts; if the device plugin is not there, the restart would fail if we make an Allocate call each time. So that is the primary reason.
B: On another point, can I just come in? So, another point is that if there is a container failure and that's causing a problem, so that the pod has become faulty, then when this pod is relaunched, I think the devices will get reset on pod recreation. I'm saying that if a container keeps failing in a pod, the user just deletes that pod and recreates the pod, right? Yeah, right.
B: So, as I said earlier, deletion and pod recreation will fix this; it will reset things. I mean, it's not that there is no way out: on a container restart alone the devices are not getting reset, but if a pod gets deleted and the pod is recreated, then, device-wise, this will reset it, right? But...
E: I don't see... I mean, that seems like a very weird model. If your container were in continuous restarts in a CPU environment and you got weird errors because Linux didn't do its job properly, are you expected to... I mean, can you see the parallel I'm trying to draw? Are you expected to have the user intervene to remove the pod and recreate it because weird errors happened? I mean, it's the role of the infrastructure, or Linux in this case, so...
E: If you're running GPUs, or if you're running devices, it makes sense that your device plugin, or your device runtime, because it's running device-specific operations to set up your device, will be as important as your CRI. If you're not running devices, then it's not as important, because you're not using the device. But...
D: That's an option, I think, and this already requires the NVIDIA-specific container runtime. But again, I feel like multi-tenancy has not been a Kubernetes focus from the very beginning. I know people are working on that, but I still think we are quite far away from firmly supporting it, and in the meantime I think handling security at the pod level is perhaps good enough.
E: Why is it that we don't want to have the same behavior as the CRI? Is security not something that's important to us or to the Kubernetes community? I mean, the behavior seems sane; it's the exact same behavior as the CRI, and in general it seems pretty simple to say that the device plugin, or at least how we decided it last year and how we implemented it in the device plugin system, is there to run device-specific operations for container-level isolation.
E: It's not a tenancy discussion, it's a GPU discussion. What's being suggested here is basically a different model than what we've been implementing for the better part of two years now, and I don't understand why this strategy change is actually happening now and wasn't mentioned a year ago.
D: Actually, I don't really think this is a tenancy issue, but I want to better understand why we need things like container-level security isolation here: for example, why we have to reset a device even though we just reallocate the device within the same pod, only to a different container. I think if we can just reset the device, that is, make the Allocate call when we reallocate a device to a different pod, that would be a good enough model for the container case.
E: It's not just a security issue, it's a reproducibility issue. At least the way we understand it, containers are the lowest level. Your container should always run with the same state; you should always be able to run your container with the same state. That's why we think isolation belongs at the container level, and that's what we're doing elsewhere in Docker and Docker Swarm. That's what we're doing in other systems too, and I'm not exactly sure why it should be pod-level in Kubernetes.
D: On restarting a pod: I think restarting the pod and its containers is a very common use case, and if every time we restart a container we need to make this Allocate call, that introduces a potential failure point, because the device plugin has to be available at that moment. I think this makes the failure-handling scenario very complicated, and also the device plugin will most likely be deployed as a DaemonSet, and the way its lifecycle is handled...
E: I mean, on the CRI side, just to answer that other argument about restarts: I think that if you restart and your container just keeps crashing because we couldn't clean the state, or all the memory wasn't freed, then how does it help that your container just restarted but crashes again? But...
E: GPUs are not at the same state as CPUs; we have hardware issues. For example, if you have two processes using the same GPU and one process faults on the GPU, the other process will crash too. We're still pinning down hardware isolation; we're still building the hardware and kernel layers. But that's the state GPUs are at right now.
E: Say your process crashes inside your... you have a container that uses a GPU, and your process claims memory; if it's TensorFlow, for example, it's going to claim the better part of the whole memory of the GPU, or at least, if not the whole memory, it usually claims a lot of the GPU memory. Then it crashes, and we couldn't clean up at that point.
E
If
you
continue
restarts,
then
just
tension
flow
is
going
to
try
to
reclaim
memory
that
isn't
available
and
and
and
then
it
crashes
again
and
restart
crashes,
again,
restart
crashes
again
and
I
think
it's
probably
better
to
just
have
the
same
model
s
your
eye.
That
says,
if
you're
not
able
to
have
your
infrastructure
clean
up
and
make
sure
your
GPU
is
available,
then
just
exactly
as
your
eye
and
just
return
your
neighbor
and
maybe
try
again
later
so.
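As a rough illustration of the fail-fast model being argued for here: on each Allocate call, try to return the device to a clean state and report an error (so the kubelet can retry later) rather than hand out a bad device. This is a sketch against the types above; scrubGPU and isHealthy stand in for hypothetical vendor-specific calls.

```go
package deviceplugin

import (
	"context"
	"fmt"
)

type plugin struct{}

// scrubGPU and isHealthy are hypothetical placeholders for
// vendor-specific reset and health-probe operations.
func scrubGPU(id string) error { return nil }
func isHealthy(id string) bool { return true }

// Allocate scrubs each requested device before handing it out and
// surfaces an error instead of allocating a device in a bad state.
func (p *plugin) Allocate(ctx context.Context, req *AllocateRequest) (*AllocateResponse, error) {
	resp := &AllocateResponse{}
	for _, cr := range req.ContainerRequests {
		for _, id := range cr.DevicesIDs {
			if err := scrubGPU(id); err != nil {
				return nil, fmt.Errorf("scrubbing device %s: %v", id, err)
			}
			if !isHealthy(id) {
				// Mirror the CRI model: report unavailability rather
				// than letting the workload crash-loop on a bad device.
				return nil, fmt.Errorf("device %s unhealthy, retry later", id)
			}
		}
		resp.ContainerResponses = append(resp.ContainerResponses, &ContainerAllocateResponse{})
	}
	return resp, nil
}
```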
E: The next one should be pretty straightforward. The next item on the agenda is the PR that adds annotations to the device plugin API. I think it's pretty straightforward: the idea is that we would like to be able to add annotations, CRI annotations, on a container. The prototypical use case is that for CRI-O, with annotations, you can call a hook, and in that case that would allow us to support the NVIDIA runtime.
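At the API level the change is small: the per-container allocate response grows a pass-through annotations map (this is the shape the field took in the v1beta1 API). The kubelet does not interpret the values; it forwards them over the CRI, where a runtime such as CRI-O can match on a key to trigger an OCI prestart hook. A standalone sketch, with the other response fields elided and a made-up annotation key:

```go
package annotations

// ContainerAllocateResponse, reduced to the new field: a pass-through
// Annotations map alongside the existing envs, mounts, and devices.
type ContainerAllocateResponse struct {
	Annotations map[string]string
}

// A CRI-O hooks configuration could match on an annotation key like
// this (the key is hypothetical) to run the NVIDIA prestart hook.
var example = ContainerAllocateResponse{
	Annotations: map[string]string{
		"example.vendor.com/prestart-hook": "nvidia",
	},
}
```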
D: I'm OK with extending the device plugin API to add container annotation support, but I also feel that, in the longer term, it's not a good way to invoke the runC hook. So, as I said, if you think this is just to unblock your use case, then perhaps we should also write a high-level document that describes the use case, why we want to use the runC prestart hook, and circulate that document with SIG Node and the CRI people, in the hope that the prestart hook can be supported at the CRI level.
E: It's just that we've been talking about that with Vish since, I think, November, and he mentioned that CRI support was going to take some time, and I agree it might take some time, so I think annotations are a first step. It would prove that CRI-O works with the NVIDIA runtime (we tested that internally), and it would basically be able to support our use case until we actually standardize runC prestart hooks into the CRI.
E: I mean, basically, I think the idea of bringing that up in this meeting was: if there is anyone that has any concerns about adding annotations to the device plugin API, this would be the place to talk about it. If not, we're just looking for, basically, people saying "looks good to me." What's...
G: That's one use case. You could also use the annotation for indicating which resource is needed in a particular FPGA, for example. So, as we discussed in previous meetings, we could have an admission controller which annotates the pod spec, and those annotations could then be used with the device plugin.
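A minimal sketch of the flow G describes, assuming a hypothetical admission step and annotation key; a real implementation would be a mutating admission webhook with its own key scheme.

```go
package main

import corev1 "k8s.io/api/core/v1"

// annotateForFPGA records, at admission time, which FPGA function a pod
// needs, so the device plugin (or runtime) can program the device
// accordingly when the pod is allocated. Both the function argument and
// the annotation key are hypothetical.
func annotateForFPGA(pod *corev1.Pod, function string) {
	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	pod.Annotations["fpga.example.com/function"] = function
}
```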
E: No, it doesn't sound like... it sounds like a sane idea to send annotations to the device plugin, but I'm still new to the idea; I'm still trying to understand the use case as you were saying. I mean, can you formalize that into an issue, or even just a paragraph?
G: Yeah, we can do it.
A: We would like to believe that device plugins don't need to do anything container-runtime specific, but that's not always going to be the case, and I think the proposal you have here is reasonably fair. I have no objections to it, other than I think there was some commentary on how you graduate these annotations. I guess, to me, that's really outside the concern of the kubelet per se, and more a concern of the device plugin and how it chooses to integrate with the runtime. So, yes.
D: And I also think that if we add this to the API, we can already use it for early experiments. Also, even for passing annotations through the CRI, the comment explicitly mentions that annotations are not, hopefully, going to affect the container runtime.
A: I think it's outside of Kubernetes' purview to tell container runtimes how they should or shouldn't use annotations. To me, it just seems like this is another useful vehicle: the kubelet doesn't need to know what's inside the envelope, but the envelope is allowed to be sent, right? And this seems, you know, completely reasonable to me in that regard. So I can comment on the PR with a +1 on this; I honestly think it's none of the business of the kubelet to know what's inside that envelope.
D: I think there is already a PR up, and looking at it, it seems fine; it basically just follows the same model as other resources like hugepages. And because we don't allow overcommit, we currently only support requests where, for a given resource name, the requests equal the limits, and if we keep the changes consistent with that, it should just work.
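For reference, on the requesting side a device-plugin resource is consumed like other extended resources: integer quantities only, and requests must equal limits (so specifying limits alone is enough). The resource name below is a placeholder.

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// A container requesting one device-plugin-managed extended resource.
// For extended resources, requests must equal limits, so setting Limits
// implies the matching request.
var container = corev1.Container{
	Name:  "device-user",
	Image: "example/image",
	Resources: corev1.ResourceRequirements{
		Limits: corev1.ResourceList{
			"example.com/device": resource.MustParse("1"),
		},
	},
}
```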
D: Even the API extensions we are discussing are compatible changes, so I don't think they will affect much, but I do think there are some changes I'm hoping we can communicate about for graduating the device plugin API to beta, like switching to a general kubelet plugin discovery model that uses probes instead of registration. I think that would be a major one. I also want to sync up on the state of the other device plugin implementations we're tracking, other than GPUs. I don't know whether we have made any progress on, say, the Solarflare high-performance NIC, and also the FPGA folks here, if they are working on this. Could you guys just brief us on the progress on those device plugins?
D: I don't think this is a hard blocker, but it's definitely something that would be nice to have. The other part was handling kubelet restarts, which I think is a nice thing to have, but I also don't think it's really a hard blocker, though people may have other opinions, you know.
A: Do you folks mind if we just spend another minute talking about that Allocate topic a little bit? (Sure.) So, today, my understanding is that we call Allocate during kubelet admission of a pod, correct? And so, at a 10,000-foot view: we've talked in the long term about wanting to be able to support, you know, locality-based scheduling concerns, where I want the CPU I get pinned to to be the one closest to my GPU. What would be the impact of calling Allocate on every container start, versus letting us have some centralized planning step during kubelet admission that says: this is where we define your CPU, this is where we define your GPU, and so on?
E: My general idea is that GPUs are not going to move; or at least, currently, we don't even support hot-plugging GPUs, because with certain motherboards that are out in the field, we've seen boards that might have a few problems and just don't support hot-plugging your GPU at the hardware level. So the general idea was that it would probably be a separate call, another call instead of Allocate, that would just return GPU affinity to a CPU. I'm still thinking about it, but in general it would be that, or just, like, a matrix that says this.
B: Yeah, and the kubelet may have information about the locality in the device structures in its local cache, because with ListAndWatch the device plugin will keep advertising device ID attributes to it. And when, let's say, a NUMA manager tries to get the topology details from the device plugin manager, it can respond back from its cache.
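A sketch of the caching idea B describes. In the v1beta1 API a Device carried only an ID and a health state; the Attributes map below is a hypothetical extension for topology hints, and deviceCache stands in for the kubelet device manager's state.

```go
package deviceplugin

// Device is what ListAndWatch advertises to the kubelet. ID and Health
// exist today; Attributes is a hypothetical extension carrying topology
// hints such as the NUMA node or PCIe root the device hangs off.
type Device struct {
	ID         string
	Health     string            // "Healthy" or "Unhealthy"
	Attributes map[string]string // hypothetical, e.g. {"numaNode": "0"}
}

// deviceCache mirrors the kubelet-side cache described above: a NUMA
// manager can query topology details without a round trip to the plugin.
type deviceCache map[string]Device

func (c deviceCache) NUMANode(deviceID string) (string, bool) {
	d, ok := c[deviceID]
	if !ok {
		return "", false
	}
	node, ok := d.Attributes["numaNode"]
	return node, ok
}
```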
A: I just want to make sure that I wasn't... I'm very easily confused, so I wanted to make sure that I was not confused: that if we call Allocate more than once over the life of a container, it has little to zero impact on whether or not we can support locality-based decisions at admission on which device and which... which...
D: I see. Currently the model, I think, is that the scheduler and the kubelet admission handler will take the device properties into account. But I think that, currently, whether we make a single or multiple Allocate calls is most relevant to whether we're going to reset the device, because the device plugin may want to use this call to reset the device. I think the kubelet currently decides which devices to allocate to a particular container, and I don't think that pathway will change.
F: A comment, if I may: currently we are talking about GPUs and similar devices, which are not hot-pluggable; please keep that in mind. But there are some accelerators which can be used over USB, and USB devices can be plugged and unplugged at any moment, so they are enumerated a bit differently. We need to keep that in mind, yeah.
D: Those are definitely use cases we should support, and the device plugin does support them; I agree with that. But how to better support dynamic resource provisioning is, in my understanding, perhaps outside the scope of the device plugin; maybe it fits better into the resource class discussion.
A: I guess we've got five minutes left. If there are other topics people want to discuss quickly, now is the time to raise them; otherwise we can adjourn. I'll post the recording and just remind everyone again that we're gonna move to a biweekly cadence, and I guess the takeaway here was that, you know, the first topic, the one on Allocate, will go back to SIG Node and we'll have a chance to digest it.
E: Actually, just a quick question on that: what are the approval requirements for the second PR?
A: Right. I mean, Renaud, I will go and, you know, if what we discussed here maps to what I see when I look at the PR, I will express my approval, and then give Dawn a moment to yea or nay; otherwise I'll just, you know, tag it for moving ahead. Okay! Well, thanks, everyone, and I will talk to you on Slack and then in two weeks.