From YouTube: Kubernetes SIG Scheduling Weekly Meeting for 20220616
A: This meeting is being recorded. All right, thank you, Wei, for starting the recording. As you might know, this meeting is being recorded and it will be uploaded to YouTube. Remember to adhere to the CNCF code of conduct.
A: Sorry, yeah, I mean you are good to go. Oh yes, is my screen being shared? Yeah, I can see your screen. Okay, perfect. So.
A: Let's quickly recap some of the SIG Scheduling KEPs. The first one is kind of obvious; I guess we have been working on this for a lot of releases. We've gone through v1alpha1 and v1alpha2, and we've been through beta 1, beta 2, and beta 3.
A
So
after
all
these
iterations,
we
we
think
is
the
time
to
finally
graduate
the
scheduler
component
config
to
ga
and
well
that
practically
means
we're
just
copying
the
code
copying
the
latest
beta
3
api.
We
have
and
renaming
it
to
v1.
So
this
this
was
fairly
straightforward.
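For reference, a minimal sketch of what pointing at the GA version could look like, assuming the graduated types are published under k8s.io/kube-scheduler/config/v1 as proposed; the schema is the v1beta3 one, and only the apiVersion changes:

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	schedv1 "k8s.io/kube-scheduler/config/v1" // assumed package path once the GA API lands
	"sigs.k8s.io/yaml"
)

func main() {
	// Same schema as v1beta3, served under the new v1 group/version.
	cfg := schedv1.KubeSchedulerConfiguration{
		TypeMeta: metav1.TypeMeta{
			APIVersion: "kubescheduler.config.k8s.io/v1",
			Kind:       "KubeSchedulerConfiguration",
		},
	}
	out, err := yaml.Marshal(&cfg)
	if err != nil {
		panic(err)
	}
	// Roughly the YAML you would pass to kube-scheduler via --config.
	fmt.Print(string(out))
}
```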
A: I'm not seeing... I'm not looking at the call, so please speak up if you want to say something. Okay, the next one is a new feature.
A
This
we
actually,
oh,
yes,
this
is
a
new
one.
Basically,
today,
when
you
define
a
pot.
A
You
have
to
specify
how
to
match
the
pots
by
specifying
the
the.
A
Of
of
the
label,
this
has
been
working,
okay,
except
when
you,
when
you
have
a
rolling
upgrade.
A
So
when
you
have
a
rolling
upgrade
each
each
version
has
a
different
label,
and
you
might
want
to
limit
your
spreading
to
specifically
this
version
right
or
this
version,
or
this
replica
set
so
that
you,
once
your
replica
set
finishes.
Upgrading
or
I
mean
the
old
replica
set-
is
fine,
fully
removed.
A
Your
new
replica
set
is
fully
spread
it
so
for
that
alex
introduced
or
is
proposing
a
new
api
that
simply
expands
the
the
topology
spread
policy,
so
you
can
just
say:
derive,
derive
the
label
from
the
current
part
and
use
that
to
match
with
other
parts.
A
This,
as
far
as
I
know,
is
already
yes,
this
is
tracked,
so
this
is
good
to
go.
It's
just
pending
implementation
and
reviews.
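For illustration, a minimal Go sketch of the proposed field, assuming it lands as matchLabelKeys on topologySpreadConstraints and using a k8s.io/api version that already contains it; the scheduler would look up the listed keys (typically pod-template-hash) on the incoming pod and fold the resulting key=value pairs into the selector:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	constraint := corev1.TopologySpreadConstraint{
		MaxSkew:           1,
		TopologyKey:       "topology.kubernetes.io/zone",
		WhenUnsatisfiable: corev1.DoNotSchedule,
		// Static selector: matches pods from every revision of the Deployment.
		LabelSelector: &metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "web"},
		},
		// Proposed field: the value of pod-template-hash is taken from the pod
		// being scheduled, so skew is computed per ReplicaSet revision and a
		// rolling update does not distort the spreading of the new revision.
		MatchLabelKeys: []string{"pod-template-hash"},
	}
	fmt.Printf("%+v\n", constraint)
}
```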
A: Okay, the next one, let's look at this. The next one is a graduation from alpha to beta.
A: For the topology spread policy, currently the calculations are based on the nodes that exist, and this is somewhat limiting if you have the cluster autoscaler: maybe your nodes today are only in two zones, but you want to spread across three zones. So with this simple minDomains API field you can now say: I want to spread among zones, but I also want there to be at least three zones. So if the scheduler doesn't find a third zone, it would consider that there is a zone which has zero nodes, and if the skew doesn't work out with that theoretical zone, then the pod doesn't get scheduled.
A: And then the autoscaler would kick in, and you will have perfect spreading after the autoscaler adds nodes. So this was already released as alpha in 1.24; we're targeting beta now in 1.25. So that's what is left here.
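For illustration, a minimal sketch of the minDomains field using the k8s.io/api types (the field is alpha in 1.24); note that it only takes effect together with whenUnsatisfiable: DoNotSchedule:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	minDomains := int32(3)
	constraint := corev1.TopologySpreadConstraint{
		MaxSkew:     1,
		TopologyKey: "topology.kubernetes.io/zone",
		// minDomains is only honored with DoNotSchedule.
		WhenUnsatisfiable: corev1.DoNotSchedule,
		LabelSelector: &metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "web"},
		},
		// If fewer than 3 zones currently have eligible nodes, the missing
		// zones are treated as having 0 matching pods; pods that would exceed
		// maxSkew stay Pending, which lets the cluster autoscaler bring up a
		// node in a new zone.
		MinDomains: &minDomains,
	}
	fmt.Printf("%+v\n", constraint)
}
```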
A: So what is this one about? Pod topology spreading currently ignores taints and tolerations. So this KEP is introducing two fields, one for node affinity and one for tolerations, to determine whether or not they should be taken into account when calculating skew. As a feature, I think this one, yes, also has the same problem: it has an outdated template, but the code is already there.
A: I think the code is even already merged, so the KEP needs to be updated. Yes, so that's also going for alpha in 1.25.
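For illustration, a sketch of the two proposed fields, assuming they land as nodeAffinityPolicy and nodeTaintsPolicy with Honor/Ignore values and using a k8s.io/api version that contains them; the values shown mirror today's implicit behavior:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	honor := corev1.NodeInclusionPolicyHonor
	ignore := corev1.NodeInclusionPolicyIgnore
	constraint := corev1.TopologySpreadConstraint{
		MaxSkew:           1,
		TopologyKey:       "topology.kubernetes.io/zone",
		WhenUnsatisfiable: corev1.DoNotSchedule,
		LabelSelector: &metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "web"},
		},
		// Should the pod's nodeAffinity/nodeSelector be respected when
		// deciding which nodes count toward the skew calculation?
		NodeAffinityPolicy: &honor,
		// Should node taints (versus the pod's tolerations) be respected when
		// deciding which nodes count toward the skew calculation?
		NodeTaintsPolicy: &ignore,
	}
	fmt.Printf("%+v\n", constraint)
}
```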
A: And I noticed, Wei, you added this one today. I suppose this is currently under discussion, because there is a lot missing in the KEP currently.
B: We will try to polish all the missing pieces, like the implementation details, as well as the explanation of how to integrate with the CA, the cluster autoscaler, and some other things. So, basically...
B: I think the idea was brought up in a previous meeting that we want a Kubernetes-native API to represent a scheduling capability to schedule a group of pods all together, or to schedule enough of them. So that's basically the scheduling directive we want to introduce natively. That's the background. In terms of the design and the implementation, there are still some details that need to shape up, so Alex and I are working on it, and hopefully we can catch up with the freeze date next week.
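Since the design is still being shaped, the following is only a hypothetical sketch of what such a "group of pods, or enough of them" object could look like, loosely modeled on the PodGroup CRD used by the coscheduling plugin in kubernetes-sigs/scheduler-plugins; none of these names are settled by the KEP:

```go
package main

import "fmt"

// Hypothetical, illustrative types only; the real KEP may use different
// names, or attach this to the pod spec instead of a separate object.
type PodGroupSpec struct {
	// MinMember captures "schedule enough of them": members start running
	// only once at least this many pods in the group can be placed.
	MinMember int32
	// ScheduleTimeoutSeconds bounds how long partially placed members are
	// held before the group is rejected and its resources are released.
	ScheduleTimeoutSeconds *int32
}

type PodGroup struct {
	Name string
	Spec PodGroupSpec
}

func main() {
	timeout := int32(60)
	pg := PodGroup{
		Name: "training-job",
		Spec: PodGroupSpec{MinMember: 4, ScheduleTimeoutSeconds: &timeout},
	}
	fmt.Printf("%+v\n", pg)
}
```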
A: Right, yes. As it currently stands, the KEP only proposes an API without an implementation, and it's actually a goal not to give an implementation.
A: Yes, and with that, I think the major concern from my point of view is that there are a few alternative implementations out there. Because if we later decide that we need...
B
It's
yeah,
I
agree
we
we
do
want
to
do
it
right
from
day
one,
but
on
the
other
hand,
I'm
seeking
a
way
to
do
the
things
into
iteratively
like
okay.
We
can't
get
this
implemented
and,
additionally
adding
the
support
for
reservation
for
back
filling
so
instead
of
well.
B
We
so
basically
I'm
not
that
convinced
that
we
need
to
get
everything
ready
and
then
start
to
implement
the
idea
so
like
if
the
code
scheduling
is
a
standalone
feature
that
can
be
implemented
right
now,
so
it's
better
to
have
the
feature
to
open
to
the
users
and
when
the
reservation,
api
and
backfilling
stuff
are
ready,
they
can
add.
On
top
of
that
feature,
that
is
how
the
principles
we
design
each
scheduling,
plugins
and
each
works.
A
But
this
proposal
is
also
adding
us
a
field
to
the
v1
pod
spec
and
that
might
be
more
problematic.
So
we
need
to
figure
out.
B
I
do
I'm
aware
of
the
reservation
requirement
that
can
not
not
a
guarantee
but
best
efforts
govern
so
the
better
best
efforts
to
reserve
some
resources
for
the
power
group
right,
but
the
resolution
can
be
only
very
core
screen,
I
mean
so,
even
if
you
reserve
this
kind
of
resource
there
might
be
some
like
interpod
constraints
out
there
that
you
are
pretty
hard
to
reserve
the
resource
on
each
knows
to
quite
satisfy.
B
Otherwise
you
have
to
make
the
knowledge
of
all
the
past
distribution
of
the
cluster
view,
so
they
can
make
a
very
guaranteed
reservation
right.
So
I
mean
my
point:
is
the
reservation?
Is
the
best
efforts
manner
right,
so
that
is
good
to
have
to
on
top
of
the
code
scheduling,
but
because
scheduling
doesn't
quite
depends
on
the.
B: Maybe we should have composed one single object, called something like scheduling constraints, and put everything inside there. But I mean, this is the current state and we are not breaking it.
A
Well,
I
specifically
mean
multiple
objects,
but
again
the
the
problem
is
that
the
the
size
of
the
database.
B
C
C
C: I guess if we have a second, I just want to call out that in the descheduler repo we started doing some of the refactoring that we were talking about, trying to make it more of a framework, and to get the code a little bit more stable, more adaptable, and customizable. So those changes are all going on in the repo.
C
We
talked
a
lot
about
it
on
these
calls
in
the
past,
calls
and
posted
the
design
docs.
For
so,
if
you,
if
anyone's
interested
or
wants
to
help
out
or
offer
input,
please
feel
free
to
join
in
on
any
of
the
for
us
that
are
opening.
I
know
young
has
opened
a
couple
already
and
we're
just
gonna
start.
You
know
migrating
some
of
the
internal
code
of
the
descheduler
into
more
of
a
you
know,
plug-in
design
like
the
scheduling
framework.
So
just
a
little
shout
out
to
that.
That's
all.
A: Kueue is a project sponsored by SIG Scheduling as well, just like the descheduler. And yes, it is. The question was whether it's a separate repo only temporarily, or whether it would be merged back into kube-scheduler eventually.
A: For example, or maybe just the APIs would be merged into Kubernetes, but that's something that needs to mature more before we can even consider it. And it kind of relates to the pod group API as well: right now there is a similar API in Kueue, so it might also be valuable to somehow see if we can merge the pod group with the Workload API, and they could share basically the same object in the future.
A: But yes, it's still an open discussion, a long-term discussion.
B: If we don't have anything else, I have a rough idea to improve the metric for measuring scheduling latency, but before I share it, I can give time to any other folks who have other topics.
B
If
not,
I
can
share
my
screen.
I
will
take
you
another
or
eight
minutes
bear
with
me.
So
basically
in
some
production
system,
especially
there's
a
a
lot
of
tenants
and
you
know
like
3k
or
5k
nails
cluster,
there
might
be
some
different
priority
parts
and
your
pos
scheduling.
B
May
your
scheduling,
scheduling
chances
may
be
head
of
blocked
by
other
parts,
but
right
now
we
don't
have
a
very
good
metrics
to
measure
that.
So,
basically,
if
you
look
at
this
picture
as
past
successful
scheduling
may
increase
several
attempts
and
each
attempts
in
include
three
kind
of
phases.
B
So
the
part
here
has
to
be
weighted
for
the
for
the
voice
chance
and
we
are
missing
the
time
that
the
party
is
waiting
in
active
queue
and
also
then
the
next
phase
is
that
okay,
the
father
has,
has
this
term
to
be
scheduled
to
be
popular,
then
it
enters
the
pure
scheduling
phase.
That
is
the
usually
the
regular
phase
we
are
talking
about,
like
pre-filter
filter,
pre-score
score.
All
the
things
happen
in
this
phase,
so
we
could.
B
We
can
call
it
either
pure
scheduling
phase
and
after
that,
if
it's
claimed
to
be
unscheduleable,
it
will
have
to
owner
the
global
back-off
timer
settings
so
that
it's
like
a
penalty
so
saying
that,
okay,
you
are
unscheduled.
You
have
to
be
sit
there
for
a
while
and
until
another
the
back
of
timer
is
is
up.
You
can
be
popped
out
back,
so
that
is
the
backup
queue
so
for
right.
B: So right now, within one scheduling attempt, and by attempt I mean it might be an unsuccessful attempt because no node can fit the pod, we are missing the orange part and the green part, and we only have the blue part. In the code, the local variable is called scheduling latency, but the metric name is scheduling_attempt_duration_seconds, and a legacy version was called e2e_scheduling, but that name is misleading, so we deprecated it a while back.
B: I think this metric is deprecated and should be replaced by this one. Okay, anyway. So what I want to propose is that we may want to record the duration of the orange block, or even the green block. So right now I did a PoC internally where I expose the orange block: I add a phase to the scheduling latency metric, and locally...
B: So this is just like the current metric we have. I want to check, in each attempt, the pure scheduling time period. So it's like: okay, I only spent 952 nanoseconds, which is pretty good because it's running in my local cluster, and only two milliseconds... but I do not know.
B: Then I execute this query, the 90th percentile, and the span is not a small number; it's almost 300 milliseconds, right? So if you want to break down some production issues, say a customer may be asking you why their pod stays pending.
B
They
are
painting
for
so
long
time
and
you
want
to
narrow
down.
Maybe
it's
a
custom
scheduling,
plugins
implementation
issues
which
cause
the
issue,
because
the
performance
issue,
or
it's
just
waiting
in
actual
too
long
or
waiting
back
off
queue
too
long.
So
that
is
the
idea
I
want
to
put
bruh.
I
just
want
to
get
you,
although
it's
pretty
preliminary
about
to
gather
some
feedback.
If
you
want
to
have
some
or
have
similar
requirements.
A
If
I
may
it,
this
seems
very
valuable,
but
I
think
we
probably
want
a
new
metric,
just
maybe
dedicated
to
cues
queuing
time
or
something
q,
q,
q
duration,
because
because
the
the
other,
the
the
metric
is
already
stable
right.
Yes,.
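For illustration, a minimal sketch of what a dedicated queue-duration histogram could look like, built with the Prometheus client library; the metric and label names here are purely illustrative, not the scheduler's actual metrics, and it stays separate from the stable scheduler_scheduling_attempt_duration_seconds:

```go
package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical metric: how long a pod sat in a scheduling queue before being
// popped, labeled by which queue it waited in.
var podQueueDuration = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "scheduler_pod_queue_duration_seconds",
		Help:    "Time a pod spent waiting in a scheduling queue before being popped.",
		Buckets: prometheus.ExponentialBuckets(0.001, 2, 16),
	},
	[]string{"queue"}, // e.g. "active" or "backoff"; a "priority" label could be added later
)

func main() {
	prometheus.MustRegister(podQueueDuration)

	// Example observation: a pod waited roughly 300ms in the active queue.
	start := time.Now().Add(-300 * time.Millisecond)
	podQueueDuration.WithLabelValues("active").Observe(time.Since(start).Seconds())

	http.Handle("/metrics", promhttp.Handler())
	_ = http.ListenAndServe(":9090", nil)
}
```

A percentile query such as histogram_quantile(0.9, sum(rate(scheduler_pod_queue_duration_seconds_bucket[5m])) by (le, queue)) would then show whether pods spend their waiting time in the active queue or the back-off queue.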
B: And do you think recording the duration spent in the back-off queue is valuable? For now I didn't implement that; I just implemented the duration in the active queue.
A: Good. So for the active queue, you count from the time it enters the queue until it's popped?
B: But maybe we also have to introduce other label dimensions, like priority, to see each priority's pod durations. Yeah, but yeah, it's just some early PoC; we will try to draft a formal proposal.
A
Oh
there's
one
question
always
feel
free
to
to
talk,
but
if
not
I'm
happy
to
repeat
the
question
from
the
chat,
is
there
any
benchmark
to
test
the
throughput
of
all
of
all
or
nothing
scheduling?
Now.
A: I don't know if this was ever discussed, but there is some contention, at least on my side, with using the term gang scheduling. I prefer coscheduling, or all-or-nothing, which is even more expressive.
A
All
right
with
that,
I
think
we
can
close
this
session.
Yeah
just
remember
the
enhancements
freeze
is
next
week
and
yeah.
If
you
still
have
any
proposals
you
you
have
some
time,
but
it's
very
limited
so.