From YouTube: Kubernetes Resource Management WG 20170711
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
C
So, as we know, resource efficiency is a concern when we have primary and best-effort jobs which need to be co-located on the same machine: how can we best utilize the resources by sharing them across both categories of applications without affecting the SLAs of the primary applications? We have tried to address this concern by introducing some logic within different components of the Kubernetes system. This is the overall architecture diagram that we have.
C
We have the slave node where the kubelet runs, and we have the master node where the scheduler and API server run. We introduced two new components within the kubelet. Currently they are part of the kubelet code; they are essentially goroutines, and one of them is called the resource estimator. The purpose of the estimator is to report back what the available and reclaimable resources are.
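A minimal sketch of the estimator loop described here: a goroutine that periodically samples node metrics and reports how much capacity is reclaimable. All names (Estimate, sampleAndEstimate, runEstimator) are invented for illustration; this is not the POC's actual code, and the metrics pipeline is stubbed out.

```go
package main

import (
	"fmt"
	"time"
)

// Estimate is the estimator's report of borrowable capacity on this node.
type Estimate struct {
	ReclaimableCPUMillis int64
	ReclaimableMemBytes  int64
}

// sampleAndEstimate stands in for the real metrics pipeline (e.g. cAdvisor
// stats plus the smoothing/forecasting mentioned later in the talk).
func sampleAndEstimate() Estimate {
	return Estimate{ReclaimableCPUMillis: 1750, ReclaimableMemBytes: 2 << 30}
}

// runEstimator periodically reports estimates to whoever publishes them
// (e.g. a component that updates node status for the scheduler to read).
func runEstimator(interval time.Duration, out chan<- Estimate, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			out <- sampleAndEstimate()
		case <-stop:
			return
		}
	}
}

func main() {
	out := make(chan Estimate)
	stop := make(chan struct{})
	go runEstimator(1*time.Second, out, stop)
	fmt.Printf("%+v\n", <-out) // one sample report
	close(stop)
}
```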
C
What this will achieve is that a scheduler which tries to schedule a best-effort pod that would like to use the reclaimable resources will be able to make a better decision about whether to place that pod, which can accept reclaimable resources, on a particular node or not. So the scheduler will have this information about all the different types of resources on node X, and then it will try to schedule a best-effort pod onto the reclaimable resources when necessary. A best-effort pod is a pod which does not require any guarantee to run.
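A hedged sketch of the scheduling decision just described: place a best-effort pod on a node only if the pod opted in to reclaimable resources and the node's estimator reports enough of them. The field and function names here are assumptions, not the POC's API.

```go
package main

import "fmt"

type Pod struct {
	Name               string
	AcceptsReclaimable bool  // opt-in flag (see the API-change discussion later)
	EstimatedCPU       int64 // rough CPU the pod is expected to need (millicores)
}

type Node struct {
	Name           string
	ReclaimableCPU int64 // reported by the node's resource estimator
}

// fitsOnReclaimable is the predicate described in the talk: the scheduler
// only counts reclaimable capacity for pods that agreed to run on it.
func fitsOnReclaimable(p Pod, n Node) bool {
	return p.AcceptsReclaimable && n.ReclaimableCPU >= p.EstimatedCPU
}

func main() {
	p := Pod{Name: "batch-job", AcceptsReclaimable: true, EstimatedCPU: 500}
	n := Node{Name: "node-x", ReclaimableCPU: 1200}
	fmt.Println(fitsOnReclaimable(p, n)) // true
}
```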
A
Again, I want to make sure we have the same terminology when we're talking about best-effort. Traditionally, when I hear best-effort pod, I think of a pod that makes no resource request or resource limit. It sounded like you said you also included a pod that has no limit but may have a request; am I mistaken?
C
No.
C
So it will try to kill a best-effort pod, or it will try to freeze it. I will explain why we need a freeze action, but in the case of memory, if you want to release the resources, we typically kill a best-effort pod so that the memory is given back to the regular pods.
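One plausible implementation of the "freeze" action mentioned here, assuming a cgroup v1 freezer hierarchy; the talk does not show the POC's actual mechanism, and the cgroup path below is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const freezerRoot = "/sys/fs/cgroup/freezer" // cgroup v1 freezer mount

// setFrozen writes FROZEN or THAWED to a pod-level freezer cgroup, pausing
// or resuming every process in the pod without killing it. Freezing frees
// no memory, which is why memory pressure is handled by killing instead.
func setFrozen(podCgroup string, freeze bool) error {
	state := "THAWED"
	if freeze {
		state = "FROZEN"
	}
	path := filepath.Join(freezerRoot, podCgroup, "freezer.state")
	return os.WriteFile(path, []byte(state), 0644)
}

func main() {
	// Freeze a hypothetical best-effort pod's cgroup, e.g. when a primary
	// pod's usage spikes; a corrective action later thaws it.
	if err := setFrozen("kubepods/besteffort/pod1234", true); err != nil {
		fmt.Fprintln(os.Stderr, "freeze failed:", err)
	}
}
```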
So this is the overall interaction diagram that we have; I will go into the details of each of these components in the later slides. And this is the scheduling example.
C
Say we have pod 1 and pod 2; what we have is typically three sections here. The used resources is, say, what a guaranteed pod requested: some X amount of CPU. It might not be using the whole X amount of CPU; it might be using X/2 or X/3 of it. We also have some headroom, which we use as a buffer, and whatever is left is what we call reclaimable.
C
This reclaimable resource can typically be shared with some other best-effort pod to run, and similarly there is another pod, which will again have a similar structure. If you see the blue section highlighted here, this is the total reclaimable resource available across pod 1 and pod 2, which can be used, or shared, by some other best-effort pod. Okay, so.
C
So if a best-effort pod, say pod 3, wants to be scheduled and it is okay to use these reclaimable resources, it gets scheduled onto this node, and state C shows that this particular block is now replaced by pod 3, which has some headroom for itself, and this was the original available resource that we had. So this is the kind of change in resource allocation.
C
That will happen when a best-effort pod is okay to accept reclaimable resources. The next piece I talked about is pod eviction. Say the usage of pod 2 increases, and we decide that whatever resources were taken from pod 2 have to be given back. In that case, this whole pod 3 will have to be evicted or taken out so that those resources are given back to pod 2.
A
In the kubelet we already have some support for evicting pods when memory, when available memory, runs scarce. Is there a reason why we couldn't make the kubelet itself just more intelligent about how often or when it makes that eviction decision, versus having this external QoS controller do it?
C
Our work was actually inspired by the Heracles paper from Christos Kozyrakis, and also by a project called Mesos Serenity. In Mesos they have a project called Serenity where they developed this controller-pipeline kind of thing, which I will describe later, where you have pluggable controllers for different types of resources, and each controller will monitor that particular resource and then try to take some action.
F
Okay, I can also speak to that. We actually developed Serenity and got the APIs into Mesos, which is where the concepts derive from. The controllers are very similar to the eviction manager, except that there's more than one: there are controllers for each resource vector that you want to protect, for example for cache, for power, for networking. Those are all things we could probably eventually bring into the kubelet.
F
You're measuring headroom and egress for the different classes of containers, you know, guaranteed versus best effort. The resource estimator basically was a soft measure for how many best-effort pods could be allowed onto the machine. So I guess that number could also be estimated by the eviction manager, right? You can reduce it to zero and say no more best effort; that's fine, through the node conditions, yeah.
G
But the thing is, you never really know how much they're going to use; that's the challenge. And so one of the things we've talked about in the past, that we unfortunately haven't implemented yet, is that the scheduler would be aware of usage when doing best-effort scheduling for best-effort pods.
A
The other comment I would have here: it also reminds me of something we discussed, either earlier or somewhere up the stack, where you could reserve resources across QoS classes, and I'm just wondering if this is essentially solving a similar need. But yeah, I kind of like this, where it appears to be: how do I move a best-effort pod to the node that is least utilized first, by having that utilization-aware knowledge when doing the scheduling, instead of just a less careful selector.
B
Just a quick comment: this POC effort we did was around six or seven months ago, and after that there's been a lot of work done. So one of the things we wanted to find out as well, reaching out to you folks and the scheduling SIG folks: where is the overlap, and what exactly can we do?
B
I mean, as far as the work which we have done: how do we work side by side with the eviction manager, the estimation manager, and the components which you folks have developed? So just to give you a summary as well, this work was done about six to seven months ago, and I know a lot of work has gone on since. That's why we wanted to sync up and adjust.
G
Let me also say, in terms of terminology: we use "preemption" to refer to actions where we evict a pod for reasons other than resource starvation across the node. So for the actions here, "eviction" is probably the better term, maybe to avoid confusion, since I can hear you using "preemption".
D
Another note about the original paper and also the Serenity work: eviction looks a lot like the eviction manager, but really that was kind of the most drastic action; the term we used before was "corrections". The idea was, you know, if you realize, through the high-priority application's performance metrics, that the application you care about is suffering, then you correct it, and the worst thing that you can do is kick the work off the box.
H
And I will say that in the CPU pinning stuff we're looking at adding, we're hitting the same kind of issue, right? Basically, we can allocate all the CPUs on the box to guaranteed pods, and then we actually can't satisfy best-effort pods, and we need some way to signal to the scheduler to say: hey, don't schedule best-effort pods to us. So I think there's kind of a generic need for some signal, you know, a CPU pressure or a best-effort pressure or something, some node condition you can apply.
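A hedged sketch of the kind of signal the speaker is suggesting, shaped like a Kubernetes node condition. "BestEffortPressure" is a hypothetical condition type invented for this example; the real API has conditions like MemoryPressure and DiskPressure, not this one.

```go
package main

import (
	"fmt"
	"time"
)

type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

// NodeCondition mirrors the shape of k8s.io/api/core/v1.NodeCondition.
type NodeCondition struct {
	Type               string
	Status             ConditionStatus
	LastTransitionTime time.Time
	Reason             string
	Message            string
}

func main() {
	cond := NodeCondition{
		Type:               "BestEffortPressure", // hypothetical condition type
		Status:             ConditionTrue,
		LastTransitionTime: time.Now(),
		Reason:             "AllCPUsPinned",
		Message:            "all allocatable CPUs are pinned to guaranteed pods",
	}
	// The scheduler would skip nodes reporting this condition when placing
	// best-effort pods, analogous to how MemoryPressure is handled today.
	fmt.Printf("%+v\n", cond)
}
```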
B
So, very much as was mentioned, you know, we used the whole pipeline design. The pipeline controllers were the network controller and the memory controller; we didn't really do the CPU controller, but that's one other thing we could do as well. Yeah, you can keep going.
C
Okay, so the API object changes that would be required in order to achieve this. The first is: say an application doesn't want to share its reclaimable resources with any other pod; it would have the capability to say so. By introducing a new flag, say "offer reclaimable resources: false", it would indicate that that particular application is not willing to share its reclaimable resources, so the scheduler would basically not try to schedule any best-effort pods by taking away resources from that particular deployment or application. I think when we presented this to the scheduling SIG, one point that was brought up was that we could not simply take away the reclaimable resources from a primary pod just like that; it should be based on some priority or some logic. So we thought we could introduce this additional field, where the user could provide some option to say whether sharing is okay or not. So this is the first API change.
C
The point that was brought up was that some applications have a very, very strict latency requirement, maybe two seconds or four seconds, so it would be very difficult to monitor the resources, predict, estimate, schedule, and all that within that time frame. Some applications would really not be willing to share resources for even a fraction of a second; in those scenarios, those applications would not be happy to share resources. So that was a discussion that went along in that sense.
C
The second API change would be: if a best-effort pod wants to accept reclaimable resources, it would set a flag accordingly in its pod spec. Ideally, as we discussed, it would not specify any requests or limits, but it would simply say that it is okay to accept reclaimable resources. So the scheduler...
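A hedged sketch of the two proposed pod-spec fields, written as Kubernetes-style API types. The field names (OfferReclaimableResources, AcceptReclaimableResources) are illustrative guesses at the POC's flags, not real Kubernetes API fields.

```go
package main

import "fmt"

type PodSpec struct {
	// First API change: a primary pod can refuse to lend out the slack
	// between its requests and its actual usage.
	OfferReclaimableResources *bool `json:"offerReclaimableResources,omitempty"`

	// Second API change: a best-effort pod (no requests/limits) can opt in
	// to being scheduled onto reclaimable capacity, knowing it may be
	// evicted or frozen when the primary pod needs the resources back.
	AcceptReclaimableResources *bool `json:"acceptReclaimableResources,omitempty"`
}

func main() {
	f := false
	// A latency-sensitive primary pod opts out of sharing entirely.
	latencySensitive := PodSpec{OfferReclaimableResources: &f}
	fmt.Printf("offer: %v\n", *latencySensitive.OfferReclaimableResources)
}
```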
C
So this is the resource estimator; this is just a flowchart of how it operates. It initializes with some configuration, then does the metrics acquisition, acquiring metrics using cAdvisor. It has some smoothing and forecasting methods, and it tries to calculate basically what reclaimable resources are available on this node. Ideally, it would be reporting, for that particular node, the amount of resources like CPU and memory that can be reclaimed.
C
This can also be further improved by introducing some kind of confidence level, or for how much time we are really sure that these resources can be reclaimed, but we did not include that in our first pass of the implementation. We just offered a static, point-in-time calculation of the amount of reclaimable resources on a particular node. Ideally, though, this estimation could be further improved with additional parameters.
C
You take the total allocatable, subtract what is actually used, including by the best-effort apps, consider some buffer for the headroom, and then subtract the resources requested by the best-effort apps. This slide is just a pictorial representation of that.
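The calculation just described, as a minimal sketch: reclaimable = allocatable - used - headroom - best-effort requested. The function name and the clamp-to-zero policy are assumptions.

```go
package main

import "fmt"

// reclaimable returns the CPU (millicores) a node could still lend to
// additional best-effort pods, per the formula in the talk.
func reclaimable(allocatable, used, headroom, bestEffortRequested int64) int64 {
	r := allocatable - used - headroom - bestEffortRequested
	if r < 0 {
		return 0 // never report negative reclaimable capacity
	}
	return r
}

func main() {
	// 4 CPUs allocatable, 1.5 in use, 0.5 kept as headroom,
	// 0.25 already promised to best-effort pods on this node.
	fmt.Printf("reclaimable: %dm\n", reclaimable(4000, 1500, 500, 250)) // 1750m
}
```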
C
The oversubscription controller, which sits on the master: its job is basically to taint a particular node based on the amount of excess, or reclaimable, resources. Say, if the amount of reclaimable resources is greater than some threshold, which can be configured as a startup parameter or something, then we enable the feature on the node.
C
The scheduler will then be able to schedule best-effort pods on the node based on this taint, if it says the feature is enabled; and if the reclaimable amount is less than the threshold, it will do the opposite: it will not allow any more best-effort pods to be fitted onto that node. Enabling and disabling the feature can be done by applying taints, and the threshold can be passed as a parameter. Does that help clarify?
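A hedged sketch of the oversubscription controller's threshold logic. The taint key is hypothetical, and whether the POC tainted nodes when the feature was enabled or disabled is an assumption; this version follows standard Kubernetes taint semantics, where a NoSchedule taint repels pods that lack a matching toleration (here, the opt-in best-effort pods).

```go
package main

import "fmt"

type Taint struct {
	Key, Value, Effect string
}

type Node struct {
	Name           string
	ReclaimableCPU int64 // from the node's resource estimator
	Taints         []Taint
}

// reconcile disables oversubscription on nodes whose reclaimable capacity
// has dropped below the configured threshold, and enables it otherwise.
func reconcile(n *Node, thresholdMillis int64) {
	const key = "example.com/reclaimable-capacity-low" // hypothetical key
	n.Taints = nil
	if n.ReclaimableCPU < thresholdMillis {
		// Below threshold: repel further best-effort pods from this node.
		n.Taints = append(n.Taints, Taint{Key: key, Value: "true", Effect: "NoSchedule"})
	}
}

func main() {
	n := Node{Name: "node-x", ReclaimableCPU: 750}
	reconcile(&n, 1000) // threshold passed as a startup parameter
	fmt.Printf("%+v\n", n.Taints)
}
```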
F
Sometimes the utilization does not have a correlation with how much a workload is hurting, so it's a really complicated topic, and the computation we had before in terms of reclaimable resources was only in terms of CPU time. One of the conclusions from that work is really that it's hard to gauge: you have to, you know, look at the performance, and it's almost like you open the door a bit, close it a bit, to allow work to come in, yeah.
A
That's the thing I was mostly addressing; I wanted to finish the point here, which is that the realistic difference between a burstable pod with a very low request and a best-effort pod is like a mirage: they're very similar. And so I've been just sitting here thinking about whether trying to do this with the QoS or pod class alone is really people's problem. When I think about the types of workloads that I know our users are scheduling onto, say, the OpenShift product, more often than not people are actually setting requests and limits; it's just that the gap between those two vectors is very wide, like there's a very huge possible constraint. It could be argued that their request is just so unrealistically low that it's almost by nature best effort, but we don't actually allow it to be treated as best effort.
C
When we did this work, it was part of a technology project actually, and we were also trying to see how we can schedule short-running jobs, which require a very small amount of execution time, and we were experimenting to see whether we could get good resource utilization by sharing the resources. So we primarily targeted memory, and we also did some work on network bandwidth.
B
The scope of this effort was more academic, in the sense that, you know, we wanted to use the Heracles ideas and understand how we could implement a similar thing with Kubernetes, like the Mesos folks did when they tapped into the Serenity project. So from that perspective, yeah, you can see that we never even built the CPU controller; the focus was more on memory and network at that time.
B
So essentially we built the whole design of the resource reclamation thing, and obviously it's a POC effort; it's not something in production at this stage. So we wanted to reach out to the community and see how we go forward: can we discuss this POC as a baseline and build upon it, or is there some overlap already in the community so that we can work together?
C
Okay, yeah. So the next slide talks about the continuous controller pipeline where, as I mentioned, we have a bunch of controllers. What each controller essentially does is build an action list. An action would typically say: I want to kill a pod, freeze a pod, or unfreeze a pod. Every controller monitors the usage of the primary pods and then builds the list of actions, that is, which secondary or best-effort pods are to be killed or frozen. A networking controller, for example, would typically inspect the network bandwidth usage of a primary pod and then try to freeze a best-effort pod if required. We did not do the shared-resource (CPU cache) controller. An SLA controller was again implemented with a beta custom probe agent, which would basically monitor the latency of the application and then take some actions based on that. The action executor was the place where all the actions get executed, like actually killing a pod or signaling and freezing a pod. And a corrective action was: if we later observe that the primary pods are no longer at their peak, we can go ahead and undo whatever we did; for a pod we had frozen in the case of network bandwidth, we unfreeze it so that it can start functioning and consuming network bandwidth as it was doing previously, before getting frozen. So this is the overall controller pipeline, and the comments from the scheduler SIG I have just noted down here.
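A hedged sketch of the Serenity-style pipeline just described: pluggable per-resource controllers each emit a list of actions, and a single executor applies them. The interface and type names are assumptions, not the POC's code.

```go
package main

import "fmt"

type ActionKind int

const (
	KillPod ActionKind = iota
	FreezePod
	UnfreezePod
)

type Action struct {
	Kind ActionKind
	Pod  string // best-effort pod the action targets
}

// Controller is one stage of the pipeline (memory, network, SLA, ...).
// It watches its resource vector on the primary pods and proposes actions.
type Controller interface {
	Name() string
	Plan() []Action
}

// networkController freezes best-effort pods when a primary pod's
// bandwidth usage crosses a threshold (monitoring stubbed out here).
type networkController struct{}

func (networkController) Name() string { return "network" }
func (networkController) Plan() []Action {
	return []Action{{Kind: FreezePod, Pod: "be-pod-1"}}
}

func main() {
	pipeline := []Controller{networkController{}}
	for _, c := range pipeline {
		for _, a := range c.Plan() {
			// The action executor would actually kill/freeze/unfreeze here.
			fmt.Printf("[%s] action %d on %s\n", c.Name(), a.Kind, a.Pod)
		}
	}
}
```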
C
One question was that the kubelet should not contain the QoS controller code; it should be deployed as a DaemonSet pod so that we keep changes to the kubelet minimal. This was one recommendation we got. The second, which we already talked about, was that we cannot simply take away resources from primary jobs. The third one was why a CPU controller was not included: because CPU is managed by the external component, cgroups, and by setting CPU shares,
Kubernetes is able to arrange for the CPU resources, so we did not handle a CPU controller that much. But we had to do some work on core pinning, which was not completed at that time. We were also asked for some statistics from before and after applying the QoS controller. If you see on the right-hand side, we have a graph with respect to CPU and with respect to memory.
C
This chart compares average CPU and memory utilization between what we call our internal orchestration and scheduling system and open-source Kubernetes, when they run with full load. We did this with four different combinations of workloads, mixing batch data jobs and enterprise apps, and the observed average CPU utilization with open-source Kubernetes was 25%; with our resource oversubscription enabled, it increased to over 76%. Similarly, the memory utilization shot up from 31% to 84%.
C
As future work, we wanted to enable the CPU controller; make the network bandwidth controller more mature and improve its functioning; improve the prediction algorithms used for resource usage (we try to predict what the resource usage of a primary pod will be in the next ten or twenty seconds, for performing an eviction or, you know, freezing a secondary pod); and also improve upon the resource estimation techniques that we saw earlier. So yeah, that's it.
B
Rohit, to your question: what exactly is the intent? The intent is, first of all, to find out if there is overlap, if something has already been done; that's one thing we wanted your feedback on. And the second thing is: does it really make sense? I think David mentioned that we could start off with incubation, and then, you know, people start experimenting with it and validate it; maybe the incubation fails, and that's fine.
A
The thing I wonder, when you talk about getting an incubator, is that incubators come with prereqs, right, and I think you've encountered this before with some of the other things that we looked at incubating. So to actually incubate this, what are the implied, or actually unspoken, prereqs? For one, it seems like we have no way of incubating API changes, right?
A
Maybe you could elaborate on how you would deal with that. If we said yes, let's incubate this, what is it that the core project is actually agreeing to support to enable your incubation? I worry that it may be more than we can take right now. But then, on this individual roadmap, there are things that do interest me, and probably interest others as well; like, I think having some prediction algorithm for resource usage, or general resource estimation techniques, would be helpful for improving some existing code.
B
No, no, I mean, that's what this discussion is for. So is that an issue, though? Let's say we just want to pass the whole thing, the whole code base, into incubation; that's what I thought would be the very simplest thing. But let's park it, and then we can keep slicing and dicing and see what we can do, yeah.
G
The way the incubation process works today, incubation is more about saying there is this long-term project which is very valid, this is the right direction, and there is general buy-in from the community on going in that direction. At that point we start an incubation process, make sure the project achieves the features that were written down, and then move it into the core ecosystem.
G
Here it feels like this is basically touching the whole system, and there may be many different intersections with existing components of the system, and we need to talk through all of that in detail. I feel like, instead of approaching it from "hey, here's the code base, this is awesome, let's all make some use of it", let's first talk about the goals and try to prioritize them, and then...
B
Exactly, I think that's exactly what we should do, and that's the reason we sent out the detailed document as well. Maybe we can use that as the base document, and then we can discuss all those kinds of things in the document itself. Or what do you propose: should we have a follow-on meeting and go over it, or what exactly should we do?
G
I'm actually going to say what I feel we are really working towards, and please correct me, everybody else on the call, if I'm misinterpreting our goals. I thought we've prioritized resource management in this order. The first one is having very simple scheduling with basic priority mechanisms and simple oversubscription, the way we have burstable and best effort, and having a really working quota model, so we can have a deterministic system.
G
Then the next level would be providing performance guarantees, or SLOs, on top of that deterministic system. And the third level, or the third priority, would be improving utilization. That would be smarter resource estimation techniques, or basically including overcommit, right: instead of just doing overcommit by request for memory, opportunistically take some more. And nowhere along the way have we prioritized performance guarantees for best-effort pods; we don't even have a viable use case for best effort.
G
Eventually maybe batch workloads might start embracing best effort, but even that is not here yet. And on top of that we are missing some critical features, like doing best-effort scheduling in the scheduler; it just blindly places best-effort pods right now. We are also missing vertical pod autoscaling. Once vertical pod autoscaling is available...
G
As a community we talked about this quite a bit, and it came out of the discussion that the kubelet would be the one managing CPU and memory, including overcommitting it, and even the resource estimation aspects of it could be dealt with by the kubelet itself, because that's necessary for vertical pod autoscaling anyway.
B
So from my perspective, the scope for this work is the resource reclamation thing. I understand what you are saying: you are trying to mitigate that, you know, by doing vertical scaling, vertical pod autoscaling and all that. So is there a place for an incubation for the overarching resource utilization gains using resource reclamation? Because this is exactly what we have done, and we showed the numbers as well.
A
I guess my struggle a little bit is that it's very focused on just running one class of jobs, best-effort jobs, and I can only speak for the users and customers I have insight into, where today best-effort jobs are not really commonplace, and I feel like there'd be a lot of other prerequisites wanted in place before they would become common. We've done some stuff in the last six months around enforcement of node allocatable for CPU and memory that can maybe make the appeal of running best-effort jobs more practical.
A
But then there are still other things that best-effort jobs can cause havoc on, such that the benefits of CPU or memory reclamation are outweighed by the other things they can do to destroy your nodes. So for myself at least, the challenge of making more CPU available for scavenger jobs to complete faster is not a high priority. For me at least; I don't know about others, but that's where I am at this point right now, I guess.
G
But the thing that that doesn't solve is trying to provide performance guarantees for best effort, because at schedule time you don't know how many resources best-effort pods can get; you're just trying to do best-effort scheduling, and then, if you want to guarantee performance for them, that's when the resource estimation would matter. But what Derek is saying, my understanding, is that there is really no clear use case for best-effort pods here.
A
Was it a very particular workload that was running as a best-effort job? Like, what did your best-effort job do: compute prime numbers, or do something more? You know, did it consume disk? What in particular did it do? What class of best-effort job was safe to use and actually saw this benefit?
A
Yeah, I'm sorry, I should have backed that up, but I understood what was being presented far better than what was being discussed, and I kind of echo that sentiment, in that I don't know if just saying yes and creating an incubator is the right answer, because the scope is much larger. I am interested in seeing if we can tease out parts of the solution and maybe grow those without having to take the entire thing, and so we should have a follow-up on that.
B
Yeah, I think these are very good comments and feedback. It would be great if you folks can provide that feedback in the document itself; then we have everything in one place, and we can definitely get back to you folks. And if you want to do a follow-on meeting, we can do that as well, to talk in more detail about certain things.
A
I think the feedback so far was very broad; I think we can get some more specific feedback like we discussed here, which is: what is the workload you're running that actually benefited from the reclamation? Kind of tease that out, because right now it wasn't clear. That said, I do think we should time-box this, and I'm sorry, Nicholas, I know that you had another agenda item and we're 20 minutes over. Is it worth discussing a preview of your topic in the next 10 minutes?
D
Yeah, sure, we could do just a quick taster and then maybe bring it up at next week's meeting. That sounds good. So yeah, I guess the goal was to try to get some consensus on whether people think that it's a problem. So, just backing up: right now we've got a few components that have been implemented, and some others that are planned, that all make policy decisions that relate to NUMA.
D
Specifically, we're looking at the container network interface plugins; the CPU manager, which is making decisions about core pinning; the device manager (I'm not sure if that's exactly what it's going to be called in the POC, but essentially the component inside the kubelet that makes the concrete device bindings); and also the hugepages cgroup controller settings. I guess the TL;DR for describing the problem is that they're all making independent policy decisions about these specific bindings, and there's no centralized way to unify that affinity, so we could end up
D
You
know
straddling
sockets
and
a
bunch
of
really
bad
ways.
That
kind
of
you
know
limits
the
usefulness
of
of
actually
all
those
components
that
are
that
are
trying
to
increase
performance
by
by
assuring
some
sort
of
Numa
affinity.
If
that
makes
sense,
and
so
specifically,
you
know
you
could
be
pinned
to
a
core
and
then
assign
huge
pages
on
another
socket
or
a
nick
on
another
socket
or
you
know,
if
you're
connecting
to
your
you
know
accelerator
Hardware
over
PCIe,
you
want
to
be
in
the
same
socket
that
that
PCI
switch
is
attached.
G
What you are describing is probably a future bug. I'm going to bring this up: one of the reasons we're trying to unify all of this logic underneath the container manager is to avoid that situation. So we should come up with a path, with a practical modular design inside the container manager, that lets us do CPU and memory assignment along with device assignment in a unified fashion.
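A hedged sketch of the unified-affinity idea being discussed: each allocator (CPU manager, device manager, hugepages, NIC) reports which NUMA sockets it could satisfy a container from, and a central policy intersects them so everything lands on the same socket when possible. This is an illustration of the concept, not the eventual kubelet API; all names are assumptions.

```go
package main

import "fmt"

// SocketMask is the set of NUMA socket IDs an allocator can use.
type SocketMask map[int]bool

// HintProvider would be implemented by the CPU manager, device manager, etc.
type HintProvider interface {
	PossibleSockets(container string) SocketMask
}

// intersect finds sockets acceptable to every provider, so a container's
// cores, hugepages, and devices can all land on one socket when possible.
func intersect(masks []SocketMask) SocketMask {
	out := SocketMask{}
	if len(masks) == 0 {
		return out
	}
	for s := range masks[0] {
		ok := true
		for _, m := range masks[1:] {
			if !m[s] {
				ok = false
				break
			}
		}
		if ok {
			out[s] = true
		}
	}
	return out
}

func main() {
	cpu := SocketMask{0: true, 1: true} // CPU manager could pin on either socket
	nic := SocketMask{1: true}          // the fast NIC hangs off socket 1
	fmt.Println(intersect([]SocketMask{cpu, nic})) // map[1:true]
}
```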
F
We could do a small spike to visit it, or take a look at maybe three or four different approaches, but the biggest thing for us, I think, was just to get a feel for people's sense of urgency, which is like starting maybe a feature request, so that, you know, in Kubernetes 1.11 or something, you have NUMA being a line item.
F
I think one aspect of this is also to have this in mind when we design the current components, so we don't shoot ourselves in the foot later on. I think that is maybe why it would be worthwhile just doing a thought experiment on it now, yeah.
G
Right. So I've been thinking for a while: we don't have a kubelet architecture doc describing the different components, which might be good to have. You might want to have one for the container manager that describes what the different modules are, what each is clearly responsible for, and how they interact with each other.
A
Yeah, one thing I was thinking, and I know similar SIGs do this: maybe we can dedicate time just to doing design reviews for the proposals that might be out for 1.8, if not next week then beyond that, and just ensure that maybe we alternate meetings to discuss the designs before talking about these feature issues. I know Federation was doing that at one time, and other SIGs are doing something similar. We could try.