From YouTube: Kubernetes SIG Scheduling Weekly Meeting for 20220310
A
All right, hi everyone, thanks for joining in. As you may all know, this meeting is recorded and will be uploaded to YouTube, so please adhere to the Kubernetes guidelines. We've got three issues. I added the first one because it's relatively urgent: a scalability bug that I think is worth quickly discussing because of the code freeze coming in the next couple of weeks, so we might want to fix it before that. So the issue is this.
B
Yes, it's a little small, but maybe you can reduce the size of your window. Yeah.
A
So in the scalability tests, they're testing, I think, on some 5000 nodes, but there are a lot of DaemonSets being deployed in that test, and the thing they noticed is that regular pods get scheduled quite fast; they can get upwards of 300 pods per second, which is great. But for DaemonSet pods they are getting below 100 pods per second scheduling throughput, and the issue is most likely because of this, if you can see here.
A
So for a normal pod, we don't really need to evaluate all 5000 nodes; we just try to find 500 eligible ones and score them. But for DaemonSet pods there is only one feasible node in the cluster.
A
The attempt to find 500 is basically going to fail, so we have to go through all 5000 nodes and eventually end up with the one anyway. Compared to a normal pod, that is a lot more expensive. It's not the scoring phase itself that is more expensive.
A
It's the fact that we have to go through all 5000 nodes just to pick the one node that is eligible. In an ideal situation the scheduler would be aware that this is a DaemonSet pod and would just pick the node and move on, rather than actually trying to find it by evaluating the pod's node affinity.
A
So, to give you some background, the way DaemonSets work is that the DaemonSet controller creates the pod and injects an affinity to the specific node where the pod should be scheduled, like on node x or node y, the exact node where the DaemonSet pod should go. So the scheduler is not aware that the pod is a DaemonSet pod; it just applies the node affinity rules to it, and that's basically the issue. Do you have any questions?
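For reference, the injected affinity looks roughly like the following; this is a minimal Go sketch using the k8s.io/api/core/v1 types (the helper name is illustrative, not the actual DaemonSet controller code):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// daemonSetNodeAffinity builds the kind of node affinity the DaemonSet
// controller injects into each pod it creates: a required matchFields
// term on metadata.name, pinning the pod to exactly one node.
func daemonSetNodeAffinity(nodeName string) *corev1.Affinity {
	return &corev1.Affinity{
		NodeAffinity: &corev1.NodeAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
				NodeSelectorTerms: []corev1.NodeSelectorTerm{{
					// MatchFields (as opposed to MatchExpressions) selects on
					// node fields; metadata.name is unique per node.
					MatchFields: []corev1.NodeSelectorRequirement{{
						Key:      "metadata.name",
						Operator: corev1.NodeSelectorOpIn,
						Values:   []string{nodeName},
					}},
				}},
			},
		},
	}
}

func main() {
	fmt.Printf("%+v\n", daemonSetNodeAffinity("node-x"))
}
```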
C
Thanks for raising this issue; it's a very interesting symptom. I definitely think we should add some specific logic in the node affinity filter, because, if I remember correctly, the DaemonSet controller injects a very special field, called metadata.name or something like that, and right now that is exclusively used by DaemonSet pods.
A
Yeah. What we have proposed here, and it's not special to DaemonSets, is basically that in node affinity you can do selection based on labels or on fields, right? So you have two types of node selection.
A
If you look at the node selector spec, it's not really anything special to DaemonSets, but it should be reliable to basically say: if we find this affinity on metadata.name...
A
We can assume that there is only one node with a matching metadata name, right? We can't do that for labels; labels are more like free text. But for metadata.name this is already enforced by the API. Yes.
A
Mapping to the node, yes. So this should be reliable, and we can apply it as a pre-filter in the node affinity plugin. The other thing we could potentially explore is the DaemonSet controller setting the nominated node name, since we already optimize for that; I think we've done that in the past couple of releases, right? I don't think that would work, because...
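A minimal sketch of the pre-filter shortcut being discussed, assuming the pod carries the DaemonSet-style matchFields term; the function name and placement are illustrative, not the actual NodeAffinity plugin code:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// singleNodeFromAffinity returns the one node name a pod is pinned to via a
// required matchFields term on metadata.name (the shape the DaemonSet
// controller injects), or "" if there is no such term. A pre-filter could use
// this to hand the framework a single-node candidate set instead of running
// the filter over all 5000 nodes.
func singleNodeFromAffinity(pod *corev1.Pod) string {
	aff := pod.Spec.Affinity
	if aff == nil || aff.NodeAffinity == nil ||
		aff.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil {
		return ""
	}
	terms := aff.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms
	// Terms are ORed, so the shortcut is only safe when a single term pins
	// the pod to a single node; that lone term is the DaemonSet case.
	if len(terms) != 1 {
		return ""
	}
	for _, req := range terms[0].MatchFields {
		if req.Key == "metadata.name" &&
			req.Operator == corev1.NodeSelectorOpIn &&
			len(req.Values) == 1 {
			return req.Values[0]
		}
	}
	return ""
}

func main() {
	pod := &corev1.Pod{Spec: corev1.PodSpec{Affinity: &corev1.Affinity{
		NodeAffinity: &corev1.NodeAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
				NodeSelectorTerms: []corev1.NodeSelectorTerm{{
					MatchFields: []corev1.NodeSelectorRequirement{{
						Key:      "metadata.name",
						Operator: corev1.NodeSelectorOpIn,
						Values:   []string{"node-x"},
					}},
				}},
			},
		},
	}}}
	fmt.Println(singleNodeFromAffinity(pod)) // prints "node-x"
}
```

Because metadata.name is unique, that one node can go straight to filtering and scoring on its own.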
B
In the next cycle, you mean? Yeah, in the next cycle. Yes, we try not to delete more pods until all the ones that are marked for deletion are deleted. Sure, but then there is nothing, so, right. But there might be another pod that preempted a pod, and then you're only preempting that one; you're going to wait for that to finish before you start preempting more, right?
A
I guess in general this is a hacky situation that, yes, is not really the cleanest.
A
The other issue here is that this is one specific case for DaemonSet pods, but in general, when there are a lot of pods getting scheduled, we may not really need to always try to find 500 nodes, or whatever the number is at that scale. So, to make the scheduler more efficient, I was suggesting here as well that we could adapt the number of nodes that we need to find for scoring based on the length of the pod queue.
A
So if we have a large queue, the scheduler figures out: I have a lot of work to do, let me not try to be extremely optimal and find the best node possible; I will just try fewer nodes, basically, so that I make progress and get through the queue faster.
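A rough sketch of that adaptation, with hypothetical names and thresholds (this is the idea being floated, not actual scheduler code):

```go
package main

import "fmt"

// adaptiveNodesToScore shrinks the number of feasible nodes the scheduler
// looks for as the pending-pod queue grows, trading placement optimality
// for throughput when there is a backlog. Thresholds are made up.
func adaptiveNodesToScore(baseNodesToFind, queueLen int) int {
	const (
		busyQueue = 1000 // above this, be far less picky
		minNodes  = 10   // never score fewer than this many nodes
	)
	n := baseNodesToFind
	switch {
	case queueLen > busyQueue:
		n = baseNodesToFind / 10
	case queueLen > busyQueue/10:
		n = baseNodesToFind / 2
	}
	if n < minNodes {
		n = minNodes
	}
	return n
}

func main() {
	fmt.Println(adaptiveNodesToScore(500, 50))   // calm queue: 500
	fmt.Println(adaptiveNodesToScore(500, 5000)) // busy queue: 50
}
```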
C
Aggressive, yeah. Okay, here's another concern, I think from some serverless users: right now we evaluate the nodes according to the percentageOfNodesToScore field, right? But internally, the serverless users want to do the bin packing in a very fixed manner. I mean, internally we sort of do this kind of random node selection, right, in a round-robin way or whatever.
A
But that should be easy to solve; I don't even think we need to continue from exactly where the previous round stopped. You can just select a random starting point in the set of nodes and start from there. And the question here is how many nodes I should look for from that point on: whether I should look for 500, or whether maybe 10 is enough since I have a ton of work to do. Does that answer your concern?
C
No, I don't mean it's directly related to your proposal to evaluate the nodes according to the name; it's just a related concern from users about the behavior that we evaluate the nodes in a random way. For now that's the default and you cannot change it; it's totally random if you're running a large cluster, right?
A
Okay, please comment on the issue if you have any questions about it or if you have an opinion, or, if you want, you can speak up now.
A
Okay, we've got a couple more topics here; I just want to give them enough time. Mike?
D
...whether this is a good set of goals for the project or not. The basic summary is that I'm proposing that we fundamentally refactor the descheduler repo as it is into more of a framework that lets people build their own deschedulers, something that's similar to the scheduler framework, or really exactly like it.
D
From a concept standpoint, this idea of a descheduler framework has been kicked around for a while, and I think now is a good time to start organizing it a bit better, because we have a lot of different proposals and bigger projects that I've tried to outline in this doc, in the ways that they relate to the idea of a descheduler framework.
D
The project got to that point, but it's starting to hit some scaling issues with its growth, and we're reaching a point where not only do we not have enough bandwidth to really effectively review and merge new proposals and changes, but some of them are just not feasible without conflicting with things we have already built into the project as assumptions.
D
Our solution to that so far has been to add new feature gates, flags, and settings to our API, and this is also causing a scaling issue where our API keeps growing and never reaches any kind of stable state where people can start reliably trusting it and using it as a stable project.
D
The goal here is basically to shift feature development out of the hands of the maintainers of the descheduler project and to enable, but also encourage, third-party developers to do it, so that for people who want really custom pod eviction logic, it can be idiomatic for them to build that themselves. That frees up our maintainers to work on building a stable platform that other people can reliably build on.
D
So that's basically the summary of what I'm talking about here. I really just want to get a lot of feedback from anyone that's interested in how this would work and what it would look like. I tried to put a couple of questions at the bottom of the document that I posted to get the conversation started: what would the descheduler framework look like? How do the existing features fit into it?
D
That's really what I'm looking for: input at this point. When we get enough input, and I'm imagining that this will take a little while to sort out and organize, I would like to start on some concrete work for this in the next release, like the Kubernetes 1.25 timeline, if we decide to go ahead with this kind of approach. But as part of the discussion I'm also open to any completely alternative ideas.
D
My main goal here is just to alleviate some of the pain points that we've had and really steer the descheduler project towards a more stable platform for these kinds of eviction controllers. So I'm open to feedback; I'm just putting this out there right now, and we have an issue on the GitHub repo. I'm sure there are probably people on this call who have had pull requests to the descheduler that have sat for a long time, or gone through a lot of review, or gotten blocked because of conflicts with everything.
D
So my desire here is to alleviate that and get a better workflow for the project. If anyone has the time and is interested, please leave some feedback, answer some of the questions that we have in there, and work on this proposal. I want this to be a collaborative proposal from the people that are invested in it.
D
So yeah, that's basically it. I don't want to discuss any of these questions right now; I want to let people have some time to think about them and work on them offline. Maybe in the next meeting, or, if this is a big enough task, maybe this becomes its own regular meeting among the people working on it, I don't know. That's what I'm looking for from people. So that's it in a nutshell.
D
If you have the time, please take a look at this and feel free to help out; we're looking for people who can contribute.
B
Sorry, is the objective primarily to be able to review better and have some separation of concerns within the code, or is it also about starting a new descheduler plugins repository that is separate from the descheduler? Yeah.
D
I think that's one of the open questions that I would really like to get some feedback from people on. I don't want to bias the discussion too much, but my thinking is to alleviate the reviews, like your first point there: make reviews easier for the descheduler and give our maintainers the time to work on building a more stable component that can eventually graduate to a v1 API.
D
It could follow the same pattern as scheduler-plugins, where we have a more open process for people to contribute strategies and things like that, but that's part of the design that we'll have to figure out and see if people are interested in.
C
A question worth exploring is whether you want to do it the same way that the scheduling framework and scheduler plugins are wired: right now we have to recompile the whole binary, right? In the case of the descheduler I'm not sure, because it's not that central a component; maybe you can leverage some other interaction mechanism, like gRPC or something else, so you don't need to recompile the whole binary.
D
Yeah, I think that's a good idea and also part of what I'm looking for, because the scheduler framework has really been pretty successful to this point, and I'm hoping there are maybe some things we can learn from it to apply to this, points like that, where making it so you don't have to recompile a new descheduler would be useful.
F
Are there any thoughts on adding descheduling to the inbuilt scheduler, so that after scheduling, if certain constraints are no longer being maintained, there's some kind of descheduling that can happen in the inbuilt scheduler itself?
D
That's been asked a lot; people have requested it a lot, but there isn't any plan for that right now, and I can tell it's...
A
There are similar questions about autoscaling as well: the autoscaler is also a separate controller, not, as you might have expected from some other systems, where scheduling, descheduling, and scaling are all a single component.
F
Yeah, I think it's a good point. I mean, yeah, I agree; I think it's essentially more of a scheduling decision from a scheduler point of view versus rearranging pods. I don't know if the descheduler is the right place. The main issue I see is probably the fact that you need to write two different configurations for scheduling and descheduling, and maybe there is scope for one single spec that can basically do both the schedule-time and the run-time configuration.
D
I don't want to take up too much time with the descheduler, because I think there was one more thing on the agenda, but just to wrap that part up: I think this could potentially relate to the framework idea, because...
D
If this is an easy enough and portable framework, maybe we do revisit that, but I think the idea of merging the descheduler into the scheduler as a whole is its own pretty big topic that I don't want to take up too much time discussing; we could definitely open another issue offline to start the discussion about that.
A
Thank you, Mike. I'm happy to stay for another 15 minutes if you want to present the video anyway.
G
Right, so, hello everyone; I'm not sure how much time I have, but I'll try to be quick. This is a proposal for a scheduler plugin that deals with overcommitment. Chen is also on the call and can chime in any time; I believe you have to go in a couple of minutes. In any case, why do we need to look at overcommitment? As we all know, the scheduler just places pods based on what they request, and that's what's guaranteed by the Kubernetes scheduler.
G
You're guaranteed to get your requests, not your limits. A limit is a nicety that a container could use and grow into, and spike load into, but there's no guarantee from the scheduler that you're going to get that many resources, which makes a lot of sense. The problem is: if the scheduler is not aware of this overcommitment and not aware of the limits of these pods, then they could get scheduled on one node, for example, and then they compete with each other, leading to performance degradation because of CPU throttling.
G
You could even have OOM kinds of things if the resource is memory. So this particular plugin is proposed to look at limits in addition to requests, and to consider them, not guarantee them, but consider the limits, so that you place the burstable and best-effort pods on nodes where they can grow.
G
Right, so, as I said, burstable and best effort: that's what this plugin is targeting. For the guaranteed ones there's nothing else you can do, because this is about limits, right? So there are these containers that are burstable or best effort; they are telling us that they may grow beyond what they requested. So it would be nice if the scheduler could consider that and give them room to grow, and that room should not be congested.
G
So that's basically the idea; it's a very simple idea. If we move on to the next slide: what this plugin is going to do is evaluate two risk factors, and these two risk factors make up a risk value that is evaluated into a score. So as such it is a score plugin; it will score the nodes based on the risk, and the risk has two factors, what I call the limit risk and the load risk.
G
The limit risk is based on the values of the requests and limits of all the pods in the cluster, as well as the pod that is to be scheduled, and the load risk has to do with actual load measurements. Luckily, in the Trimaran plugins we already have two plugins that use a load watcher, and the load watcher provides, through Prometheus and other providers, data about the actual load on those nodes.
G
So, next slide, please. The suggestion is to add this plugin as a third plugin to the Trimaran family, because they all use the same load watcher that provides the loads. As such, this is a load-aware plugin, as well as limit-aware. How does it differ from the other two, at a very high level? The Trimaran plugins are all load-aware, so they all do something with the load. The first one looks at the average and basically considers all pods.
G
The second one looks at not only the average but also the variation in the load, and the proposed one, Trimaran 3, would also look at the limits in addition to the load. And as far as load goes, it looks at more than just the average and the variation: it has to compute the tail of a distribution, which is the probability that, if you place a pod on a particular node, it will compete with others.
G
Okay, next one, please. So I show here two use cases: one that illustrates the limit risk, the other the load risk. As far as limit is concerned, let's imagine you have the two nodes in this picture. Node one on the left has two pods, a and b; node two on the right has two pods, d and e, and they have these requests.
G
Both nodes have the same total request values, so a default scheduler would pick either one to place the new pod, pod x. Now, if you consider limits, you can see that on node one, if you add up the limits of pods a and b, they go beyond allocatable, beyond capacity, whereas on node two they don't. So, based on that, it's probably better to place the new pod x on node two.
G
Next slide, please: how the risk itself is computed. This is a very simple formula that says: I'm going to add up the allocated, which is the requests on the node, and I'm going to add up all the limits of all the pods on the node. Then I evaluate something called the excess, which is the gap between the allocated and the total limit, including the pod to be scheduled, and the allowed, which is the available room on the node, and I do this for all resources.
G
So
if
cpu
memory-
let's
say
so
I'll,
do
this
for
cpu
I'll
do
this
for
memory
and
then
this
risk
limit
is
is
nothing
other
than
a
value
between
0
and
1.
That
says,
if
the
risk
is
0,
then
then
there's
no
risk,
and
that
would
happen
if,
for
example,
excess
is
the
same
as
allowed,
then
there's
no
risk
of
of
going
beyond
what
you
want.
You're
gonna
get
what
you
want
if
you
wanted
to
and
the
risk
of
one
when
you
really
don't
have
room
at
all
to
grow.
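One plausible reading of the formula as described (hedged; the exact definition is in the plugin proposal), per resource r on a node:

$$\mathrm{excess}_r = \sum_{p \in \text{pods}} \mathrm{limit}_{p,r} - \mathrm{allocated}_r, \qquad \mathrm{allowed}_r = \mathrm{allocatable}_r - \mathrm{allocated}_r$$

$$\mathrm{risk}^{\mathrm{limit}}_r = \min\Big(1,\ \max\Big(0,\ \frac{\mathrm{excess}_r - \mathrm{allowed}_r}{\mathrm{excess}_r}\Big)\Big)$$

defined for positive excess (with no excess there is no overcommitment and the risk is 0), so that the risk is 0 when excess equals allowed (total limits fit within allocatable) and 1 when there is no room left to grow. With hypothetical numbers matching the earlier two-node picture (allocatable 4 CPU, requests 2 CPU on both nodes): node one with total limits of 5 CPU gives excess 3, allowed 2, risk 1/3; node two with total limits of 4 CPU gives excess 2, equal to allowed, risk 0.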
G
So that's the formula to compute the limit risk; that's the first factor. The next slide shows use case two, which is about the load, the actual load.
G
So here is the same arrangement of node one and node two again, but on node one, if you look at the load distribution there, there is some chance that the load is between two and four (four is the capacity of both nodes), whereas on node two, and it's an extreme case, all of the load is between two and four.
G
So if I place a new pod x on node two, it's really going to be competing with the others, more likely than on node one. So in this case I would favor node one based on usage. Okay, so that's the idea; move on to the next slide. So how do we compute the risk of competing with this load? We look at the distribution.
G
We look at the probability that the usage, or load, is beyond the allocated; we compute that and we say that's the risk, and again it's between 0 and 1. If all the load is below the allocated, then there's no risk at all; if all the load is above the allocated, then there is a lot of risk; and anywhere in between, okay.
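In symbols, under the same hedged reading, the load risk is the tail probability of the measured usage distribution beyond what is allocated:

$$\mathrm{risk}^{\mathrm{load}}_r = \Pr\big[\mathrm{usage}_r > \mathrm{allocated}_r\big]$$

which is 0 when the whole distribution sits below the allocated amount and 1 when it sits entirely above it.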
G
So, next slide, please: what do we do with these two risk factors? Very simply, we just take a weighted sum. We have the risk limit and the risk load, and we multiply by some weights; the default is a half each, so they contribute equally to the total risk. That's configurable. And so the total risk for that resource on that node is one value between zero and one.
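So the combination, as described, is a weighted sum per resource, with configurable weights defaulting to one half each:

$$\mathrm{risk}_r = w_{\mathrm{limit}}\,\mathrm{risk}^{\mathrm{limit}}_r + w_{\mathrm{load}}\,\mathrm{risk}^{\mathrm{load}}_r, \qquad w_{\mathrm{limit}} = w_{\mathrm{load}} = \tfrac{1}{2}\ \text{(default)}$$

and, as a score plugin, it would presumably map lower total risk to a higher node score.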
C
Yeah, thanks, Asser, for raising this; I think it's a very practical challenge in the real world. So I have one question. I think you mentioned two factors that serve as the input of this score plugin. In terms of the first factor, you look at the limits, and I think that depends on how correctly the user sets their limits. So what if they just set the request at, say, one CPU and the limit at a very large number, but they never reach that limit?
C
Does that mean the first factor will be totally useless, and then you have to rely entirely on the second factor, which looks at the real load of the node?
G
That's a good point, yeah. Of these two factors, the first one, as you noted, makes its decision based on what users specify as limits, and they could be wrong; that's when the second factor fixes that, by looking at the actual load. So on one hand you could say, well, why do I need the first factor? On the other hand, what if the load is the one that is not right?
G
Judging only by the actual usage may not be enough to assess what the limit is. In other words, if a container once in a while goes above its requests, then based on how much it goes above its requests I cannot say, okay, the limit is going to be the maximum of that, or twice the maximum, or something like that.
G
Using that as an estimate may not be accurate, because maybe that's what happened in the past, but in the future, all of a sudden, we might see a big spike from that container.
A
Right, I think this is what I understood: one factor is basically trying to take the future into account, and the other is doing things based on the past.
G
Right, one is based on the past measurements and the other one is based on potential, what's in the specs, yeah.
G
They don't have to use all three, of course. I'm abbreviating them as Trimaran one, two, and three, and in the table I've contrasted them. Typically one would choose only one, because you don't get any benefit from using all three. First of all, Trimaran three here doesn't do anything for the guaranteed pods.
G
With guaranteed, you're going to get what you're asking for, so I believe that, unless there is an indirect effect, it won't help or hurt the guaranteed ones. So what about burstable and best effort? What Trimaran one and two do is base their placement on the actual load on the nodes, whether the average or the average plus variation, but they don't look at limits at all.
G
So if a user is concerned about pods, or containers within the pods, that are going to grow beyond their requests, then definitely Trimaran 3 would be the right one to choose.
B
Yeah, I guess I wonder what the best way is to converge them: is it possible to have one plugin that you simply configure differently to behave differently, instead of having three plugins, just to simplify the user experience? Because someone that is not very familiar has to choose, and maybe they decide that using the three at the same time is best, and it might not be. If you just give them one with different configurations, it might be easier to use.
G
Yeah, I see what you mean; I think it's a good suggestion. My only feedback is that the common thing about these three Trimarans is that they use a load watcher; they use data about the measured metrics of resource usage on the nodes. That's the common thing, but they have different objectives.
G
Each one of them has a different objective; the only thing in common, and why they belong to the same family, is the fact that they are load-aware. I can imagine many more scheduler plugins are going to be proposed saying: oh, we also want to be load-aware, but we have this other objective in mind.
B
You can clarify the documentation, and maybe that's enough, yeah. But that's my only feedback.
C
Any other questions? Yeah, no question, but a suggestion: maybe Trimaran 3 can also apply to all pods, because the limits themselves are not accurate, right? It depends on how the users benchmark their workloads and set their limits.
G
The experiment would be something like: I would have a cluster with some pods that are running and going beyond their requests under heavy load, maybe once in a while, and then I have this new pod that is also going to have a lot of demand, and I would schedule it with the default scheduler and with Trimaran 3 and see, for example, how long it took to run under one scheduler versus the other.
A
I mean, this is not a scheduling latency or scheduling throughput improvement; this is an improvement in how many resources a container is able to burst into. So basically what you're saying is that you want to measure the performance of the workload, right, not the scheduler performance? Yes, yes. Did you do that?
A
And my follow-up question here is: if we are to suggest these plugins to customers, to users, what would we tell them? What types of workloads, example workloads? It's not enough to say, okay, I have this synthetic workload that basically maximizes the success of the plugin that I implemented and you try to match into it.
A
You know, you can always manufacture a workload that works well for the plugin, so it would be really, really interesting if you could have a real workload that shows it actually did better. And it's not just the application per se; it's also the mix of workloads running on the cluster that would contribute to whether this is actually going to be useful or not in the bigger scheme of things.
G
Right, yeah, I agree with you 100%. Let me add to that: I recently saw a scenario, I believe it was on SIG Scheduling somewhere, where someone was saying that for this kind of workload, when the pods start, the init container takes a lot of resources, and after that it's very normal. So there's a lot of demand in the beginning, and if you have a bunch of these all of a sudden, there's a surge in demand, in usage, right?
G
So I was thinking maybe this would help in this case, by just detecting that these pods have a limit that they grow into, so they're not going to be scheduled on the same node; they're going to be scheduled basically in a kind of round-robin (this is an extreme case) across the nodes in the cluster. That's why they get separated when they do their surges in the beginning, and after that they're okay. I don't know if I've clarified that.
C
For this kind of Trimaran 3, I think the correct procedure is: with a normal default scheduler with the default config, you load up the cluster with specific requests versus limits, and maybe put some CPU-intensive workloads there, and you can notice some OOMs or CPU throttling; and then with Trimaran 3 you can see this symptom disappear in a very consistent way. That could be a very good evaluation procedure to prove this plugin helps a lot, and it would be super convincing to others. But we do...
C
We are lacking this kind of simulation tooling, I think. Internally I'm building some kind of simulation to sort of fill the gap, but yeah, in some time maybe I can share it.
H
This is a really valid example, and we're seeing this problem a lot: not overloading the node, but, okay, I dedicated a very huge node for the Java workload, but after the startup time I don't need it; I don't need this huge node. So this would really benefit that scenario, and I believe we can use this example as a validation.
A
All right, thank you, guys; this is really interesting. I love the direction of this project and the fact that there is traction, and I was just trying to play devil's advocate and poke at the methodology. I'm really interested to see this deployed at some point in an actual, real-life cluster, and I'm wondering if this is something you are planning to deploy at IBM; I don't know if IBM actually uses Kubernetes in some form.
C
Okay, I think next time maybe you, Asser, or Chen, or Abdul from PayPal can demonstrate this kind of Trimaran end-to-end workflow: which metrics provider you choose, and then, with versus without Trimaran, how the cluster looks. That could be a super interesting demo.