From YouTube: Kubernetes SIG Scheduling Weekly Meeting for 20210520
A
Hi everyone, today is May 20, 2021. Welcome to this week's SIG Scheduling meeting. This meeting is being recorded and will be uploaded to the YouTube channel. Alright, I suppose you can all see my screen, right?
A
Okay, that's for your information. Secondly: last week Dave and I introduced the idea of adding extended resource support to the BalancedAllocation plugin, and during the review, both from me and Abdullah, we found that there are some design flaws in how extended resources are supported by the existing plugins, like LeastRequested and MostRequested.
A
So basically, the problem we found is that in a heterogeneous environment — which means some machines have a particular resource but some don't — we are not scoring the machines properly. For example, we have three nodes, and not all of them have a GPU.
A
Suppose a pod comes in. This is the current usage, in terms of requested resources against capacity, and in this case the pod doesn't request any GPU resource, right? We calculate the accumulated requested resources divided by the capacity to get the final score for each node. For LeastRequested, node one will score highest, node two will be second, and node three will have the lowest score.
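To make the flaw concrete, here is a minimal sketch of least-requested-style scoring (illustrative only, not the actual scheduler code; the node sizes are invented): each resource contributes `(capacity - requested) / capacity`, the per-resource fractions are averaged, and so a node whose GPUs nobody requested gets a perfect score on the GPU dimension and floats to the top.

```python
MAX_NODE_SCORE = 100

def least_requested_score(requested: dict, capacity: dict) -> float:
    """Average (capacity - requested) / capacity over every resource the
    node advertises, scaled to 0..100 -- higher means more free capacity."""
    fractions = [
        (capacity[r] - requested.get(r, 0)) / capacity[r] * MAX_NODE_SCORE
        for r in capacity
    ]
    return sum(fractions) / len(fractions)

# Node 1 advertises GPUs; node 2 does not. The incoming pod requests no
# GPU, yet node 1 ranks higher purely because its untouched GPU capacity
# contributes a perfect per-resource score.
gpu_node = least_requested_score({"cpu": 2, "memory": 4},
                                 {"cpu": 8, "memory": 16, "gpu": 8})
plain_node = least_requested_score({"cpu": 2, "memory": 4},
                                   {"cpu": 8, "memory": 16})
print(gpu_node > plain_node)  # True: the GPU node wins for a non-GPU pod
```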
A
This doesn't quite make sense, because the GPU is a scarce resource, right? If a pod doesn't request it — and in this case you can see how the other resources compare — it probably makes more sense to choose node three, so that we don't occupy the scarce resource. But this is not the current behavior. So we want to get a consensus on how, and whether, we should improve this, and you can leave any opinion on these issues. After we reach a consensus — say, okay, we need to adjust these algorithms —
A
then we can go into the algorithm details, like what kind of algorithm we should use to score the nodes more fairly. So basically, yeah, there's some discussion on which kind of algorithm we should choose, and also, as I mentioned, maybe we can run the different algorithms and show the differences in the results.
A
So that's the issue around reviewing BalancedAllocation and adding extended resource support to it. I think we may need to resolve this first, if it doesn't take a long time, since it blocks this issue: the problem exists in the extended resource support, and this proposal is intended to add extended resource support, so it basically depends on this.
B
Yeah — I agree that maybe we should improve it, but typically you wouldn't want to schedule a pod that doesn't request a GPU on a GPU node in the first place.
B
Usually — for example, on GKE — what we do is put a taint on these nodes by default, and only a pod that tolerates these nodes, which is typically a pod that requests a GPU, will get scheduled there. But the general point, I think, is still valid: we need to take care of this in a probably better way. I just don't think it's as problematic as you might think.
B
Right — those scarce resources are expensive; you don't want pods that won't use them hogging them in the first place. So a filter here is probably better in general, not just a preference.
A
The idea is to have a single plugin as the scoring plugin and make those behaviors options in the form of plugin arguments, as Abdullah proposed, so that we can reuse the existing NodeResourcesFit. Right now it is a filter plugin, but we can make a score variant of it as well, so that least-allocated, most-allocated, and the others become plugin arguments. This is good, but we also need to keep supporting v1beta1, which uses the separate scoring plugins as they are.
A
So once we migrate to v1beta2 — once we have the unified NodeResourcesFit plugin as our scoring plugin — we may need to handle the conversion as well, among other things. Thanks to Abdullah for doing a breakdown of the issues, so we can tackle these items one by one. Right now the first item is being worked on, and then the other items.
B
Because the default set of plugins right now is set outside the defaulting logic for component config, which is kind of problematic: we want it to be versioned, and if we want it to be properly versioned, we need to set the default set of plugins in the defaulting logic. That way, in v1beta1, for example, we don't change the behavior — we continue to use LeastAllocated — and in v1beta2 we change the behavior.
B
This is important because, if you remember, you can disable and enable plugins, right? Disabling a plugin removes it from the default set. So if, for example, a provider disables LeastAllocated, and LeastAllocated is no longer part of the default set, that's going to break their setup. We want a way for them to do this properly, and the proper way is to say: if you are using v1beta1, then you can disable LeastAllocated, because it exists there.
B
But if you are on v1beta2, you don't disable it — you change the configuration of the default plugin that already exists, which is the Fit plugin. So, yeah, I created another issue to remove the algorithm provider, and I think that's a good cleanup: it moves things toward really making all the defaulting logic for the scheduler more and more part of component config.
A
Okay, thanks Abdullah for working on this. The next item is from Talor. Do you want to take over and give a demo and an introduction of the scoring plugin proposal for the topology-aware scheduler?
D
I can. Yes — can you hear me?
D
Great, okay. I'll try to keep the mic a bit farther away, because it's better now.
D
Hi everyone, my name is Talor and I'm a member of the telco engineering team at Red Hat. As part of our ongoing effort to improve the way Kubernetes handles latency-sensitive workloads, I'm going to introduce you to the proposal for a NUMA-aware score plugin.
D
Okay. So this is today's agenda, and with no further ado, let me jump into the motivation section. Currently there is the NodeResourceTopology filter plugin.
D
Kubelet has more knowledge and more information about how the node's resources are spread among NUMA nodes, and the Kubernetes scheduler doesn't. So we would like to add this score plugin, which is NUMA-aware, in order to reduce that gap. Eventually we will end up with fewer of the issues I mentioned before — pods stuck in the Pending state — and in the long term it will allow more optimal utilization of the system resources.
A
Right — I just want to give some background. Topology-aware plugin support has been available in scheduler-plugins. In the background we pursued adding it upstream, but because the NodeResourceTopology API is not yet mature and is served as a CRD, adding CRD support upstream is against the community rules.
D
So this score plugin offers three different strategies for score calculation, which are basically imitations of already-existing in-tree Kubernetes plugins.
D
So we have the most-allocated strategy, which is basically a way to bin-pack as many pods as possible onto a given node.
D
So this is an example of the kind of problem the new score plugin is intended to solve. I'll try to very quickly provide a comparison between two plugins that have the same logic, but one of them is NUMA-aware and the other is not.
D
There is the Kubernetes in-tree MostAllocated score plugin, which tries to bin-pack as many pod requests as possible onto a given node, and we have the new score plugin configured with the most-allocated strategy. So they both try to bin-pack as many pod requests as possible onto a given node.
D
This is the current cluster. We have two hosts, each with two NUMA nodes: one host has four cores on each NUMA node, and the other has two cores on each NUMA node.
D
The brown cores are reserved cores and the blue ones are available for allocation. We have a set of three requests — three pods — to be deployed on this cluster. I've attached a link to a live demo, but due to time constraints I won't present it right now; I'll just run very quickly through the problem itself, and later you can take a look and see it in a real environment.
D
So, as I said, scenario one uses the in-tree MostAllocated score plugin, and we try to deploy pod number one on host one. Another important thing to mention here is that both nodes are configured with the single-numa-node topology manager policy.
D
For the people here who don't know what the single-numa-node policy is: it means the pod will be admitted only if all the CPUs it asks for come from the same NUMA node.
D
So this is a good example of a pod that can be accepted, because all of its cores, as you can see here, come from the same NUMA node. The first pod asks for three cores, and it will be deployed successfully.
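The admission rule just described can be sketched as a tiny check (a hypothetical helper for illustration, not kubelet code): under the single-numa-node policy, a request is admissible only if a single NUMA node can supply all of the exclusive CPUs by itself.

```python
def fits_single_numa(cpu_request: int, free_cpus_per_numa: list) -> bool:
    """single-numa-node policy: admit only if one NUMA node alone can
    satisfy the entire exclusive-CPU request."""
    return any(free >= cpu_request for free in free_cpus_per_numa)

# A host with two NUMA nodes of 4 free cores each: a 3-CPU pod fits,
# but a 5-CPU pod is rejected even though 8 cores are free in total.
print(fits_single_numa(3, [4, 4]))  # True
print(fits_single_numa(5, [4, 4]))  # False
```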
D
Okay — so this pod will get stuck in the Pending state. Let's go back and try to do the same with the new score plugin. Again, the same setup, the same cluster — everything remains the same; we just configured the pods to be deployed with this scheduler.
D
Request number one is exactly the same: pod one, three CPUs, on host one, NUMA node zero. Then here comes the key difference: pod number two will be deployed on host number two, because since this plugin is NUMA-aware, it says it's better to take two cores out of the two available at the NUMA level than two cores out of the three available at the NUMA level.
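That preference can be modelled as a most-allocated-style score per NUMA node — the fraction of that NUMA node's free CPUs the request would consume (an illustrative model, not the plugin's exact formula):

```python
def numa_bin_pack_score(cpu_request: int, free_cpus: int) -> float:
    """Score one NUMA node for an exclusive-CPU request: the fraction of
    its free CPUs the request would consume; 0 if it cannot fit at all."""
    if cpu_request > free_cpus:
        return 0.0
    return cpu_request / free_cpus

# Pod 2 asks for 2 exclusive CPUs. Host 2's NUMA node has 2 free cores,
# host 1's has 3: consuming 2 of 2 scores higher than 2 of 3, so the
# NUMA-aware plugin prefers host 2.
print(numa_bin_pack_score(2, 2) > numa_bin_pack_score(2, 3))  # True
```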
D
So this is a tighter fit than host one, NUMA node one, and of course the third request will eventually be deployed successfully as well.
D
So that's basically the explanation of the problem. Since I don't have much time, I'll skip the algorithm itself and just give you a quick look at the manifest and what it looks like. This is a standard scheduler configuration manifest.
A
Yeah — you mentioned that you do the scoring per NUMA node, right? So in the final normalized scoring — for example, for node one you just mentioned there are two candidate NUMA nodes — will you choose the higher of the NUMA node scores, or do you do an aggregation over the two NUMA nodes?
D
You're asking how we decide the final score for the node? That's the question? — Yeah, correct. — Okay. So basically, assuming we have two NUMA nodes, as you said, we score each one of them independently and return the NUMA node with the minimal score as the node's final score. We won't aggregate them together or anything like that. Okay.
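In other words, the per-NUMA-node scores are reduced with a minimum, so one poorly fitting NUMA node drags the whole node's score down (an illustrative sketch of the aggregation just described):

```python
def node_score(per_numa_scores: list) -> float:
    """The node's final score is the minimum of its per-NUMA-node scores;
    the NUMA nodes are never averaged together."""
    return min(per_numa_scores)

# A node whose two NUMA nodes score 80 and 30 is reported as 30:
print(node_score([80.0, 30.0]))  # 30.0
```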
A
Is that decision consistent with the kubelet topology manager's algorithm? Because, for example, for node one — yes, we decide on NUMA node one in the scheduler — but when it comes to the kubelet's execution, it also has two choices there, right? So how can we ensure the result is consistent?
E
I can take that. The filter plugin that we have is almost a simplified version of the topology manager logic, so it does consider how alignment would happen, from a resource point of view, on a NUMA node.
E
And then the scoring plugin essentially scores the nodes based on their NUMA nodes and the resources that have been requested, and eventually you obviously select the node with the maximum score.
E
Yeah — that is a gap that is essentially known right now, because there's no way for the scheduler to convey exactly which NUMA node the resources should be allocated from. The topology manager still executes its own logic, so it is kind of best-effort that the topology manager will make the same decision the scheduler made; there could be a scenario where its decision differs from what the scheduler evaluated.
A
Yeah, and another potential risk is that you have to ensure the policies in both the kubelet and the scheduler are consistent. If you specify least-allocated on the scheduler side while the kubelet topology manager side is configured with most-allocated, that will also cause unexpected behavior.
E
Yeah, that's a good point. Where would you see this kind of hint? Obviously the scheduler is doing its evaluation, and we could figure out a way — maybe adding an annotation to the pod spec or something like that — for the kubelet, or an agent, to consider, to maybe act as a hint provider.
E
Yeah, that was actually me. I had proposed the KEP, as well as the implementation, upstream, but as I think we mentioned initially, the gap was that the CRD API itself is not mature yet. That's why we moved toward a scheduler plugin — to gain some attention and allow people to use it there — and then eventually move toward an in-tree plugin.
F
And what is missing — I think I would rather like to see the CRD maturing, yeah.
E
Yeah, so the CRD exists. And I want to clarify: is the question about the filter plugin that was proposed previously, or the scoring plugin that Talor just presented?
A
We got a lot of pushback on getting CRDs into the upstream, so I think once the NodeResourceTopology API is mature and we can get it incorporated into the core APIs, then we can move to the in-tree approach, yeah.
E
They are kind of related, and basically I'm the owner of this piece of work — it's me. I was initially working on the in-tree enablement, then started working on the out-of-tree enablement, and once things are in a reasonable state and we have people using it — and maybe we get some more feedback on the CRD API and think it's in a reasonable state — we'll go back to having those conversations with SIG Architecture, SIG Node, SIG Scheduling, everyone.
F
I see — okay, all right, I don't have any more concerns then, cool. But I would suggest you continue pursuing the graduation of the CRD into an API, and that should definitely involve SIG Scheduling when you do it. Even if it's only node-related, I think it's better for us to be involved from the beginning.
E
Totally — you can have a look at the current state, how it looks at the moment. I did open it in the staging section in kubernetes as well, but then I didn't want to bypass everyone agreeing on how the API should look. So yeah, just have a look and maybe give us some feedback if there are things you think should be done differently, or anything that needs to change.
E
Okay, so just to wrap this up, I would like to make sure we are all on the same page. We were thinking that, for the time being — given that the topology-aware scheduler plugin is in the scheduler-plugins repo — is it okay to go ahead and maybe create a separate KEP and push an implementation PR to scheduler-plugins for this change?
A
I think we should first raise a KEP to the kubernetes repo, and that KEP should focus on the kubelet policies first. So basically we have two options: one is to make the scheduler's NUMA node decision known to the kubelet; the other is to add scoring policies to the kubelet that have to stay consistent with the scoring plugin we later introduce on the scheduler side.
E
Sounds good — okay, that sounds like a good direction. We'll follow up on that then, and I'll keep you in the loop as we go along.
E
That is, a proposal for the hint and the coordination between kubelet and scheduler.
A
So it's not a good fit for first-time contributors; it does need some understanding of the internals. So yeah, that's it. Before we end this meeting, does anyone have any questions?