From YouTube: Kubernetes SIG Scheduling Weekly Meeting for 20230810
Description
Kubernetes SIG Scheduling Weekly Meeting 2023-08-10T17:02:14Z
A: Hi everyone, welcome to today's SIG Scheduling meeting. The meeting is being recorded, so be respectful to each other; we'll be publishing it to the internet. Okay, let me share my screen.
A: The first item: last weekend I found some potential issues, so I will start with the original bug I found and give you a heads up.
A: It's a kind of regression in 1.27, because when I upgraded scheduler-plugins from 1.26 to 1.27, I spotted this issue. The issue is that we introduced a skip machinery into PreFilter.
A: Every PreFilter plugin is sort of wired up with a hook, the PreFilterExtensions (AddPod/RemovePod), because those need to be called during the preemption dry run. Some PreFilter implementations rely on reading a pre-calculated state under a particular cycle-state key, and that causes problems if the plugin is skipped and doesn't write that key. And also, in the latter phase, when preemption calls the AddPod and RemovePod functions, it will hit errors and make preemption totally non-functional.
A: If you look at this logic: if a PreFilter plugin returns Unschedulable, it just returns, without setting the skip-filter-plugin state. So the fix was to ensure that this state is always populated, so that in the latter phase we don't get this issue. It happens on 1.27, so we should cherry-pick it to 1.27. And this turned into another discussion with Kensei, which is that this might be a general problem.
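To make the first regression concrete, here is a minimal, hypothetical sketch of the fix described above: the set of skipped PreFilter plugins is written to the cycle state even when a plugin returns Unschedulable and the loop returns early. The types (`Status`, `CycleState`, `PreFilterPlugin`) and the state key are simplified stand-ins for illustration, not the real scheduler framework API.

```go
package main

import "fmt"

// Simplified stand-ins for the framework types discussed above.
type Code int

const (
	Success Code = iota
	Skip
	Unschedulable
)

type Status struct{ code Code }

func (s *Status) IsSkip() bool    { return s != nil && s.code == Skip }
func (s *Status) IsSuccess() bool { return s == nil || s.code == Success }

type PreFilterPlugin struct {
	Name string
	Run  func() *Status
}

// CycleState stands in for the per-scheduling-cycle state map keyed by
// well-known state keys (such as the skipped-plugin set).
type CycleState map[string]any

const skippedKey = "skip-prefilter-plugins" // hypothetical key name

// runPreFilterPlugins sketches the fix: the skip set is recorded in the
// cycle state no matter how the loop exits. Before the fix, an early
// Unschedulable return skipped this write, so the preemption dry run
// (AddPod/RemovePod) later read a missing set and misbehaved.
func runPreFilterPlugins(state CycleState, plugins []PreFilterPlugin) *Status {
	skipped := map[string]bool{}
	// defer guarantees the write happens even on the early return below.
	defer func() { state[skippedKey] = skipped }()

	for _, pl := range plugins {
		st := pl.Run()
		if st.IsSkip() {
			skipped[pl.Name] = true
			continue
		}
		if !st.IsSuccess() {
			return st // early return: the defer still records `skipped`
		}
	}
	return nil
}

func main() {
	state := CycleState{}
	plugins := []PreFilterPlugin{
		{Name: "A", Run: func() *Status { return &Status{Skip} }},
		{Name: "B", Run: func() *Status { return &Status{Unschedulable} }},
	}
	st := runPreFilterPlugins(state, plugins)
	fmt.Println(st.code == Unschedulable, state[skippedKey].(map[string]bool)["A"])
}
```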
A: Say plugin B is placed after plugin A; then plugin B doesn't have the chance to do its PreFilter calculation, that is, to pre-calculate the state which will be used in the latter phase. Yes, sorry, in the latter phase that is used for preemption. Then, once the pod turns out to be unschedulable, plugin B will raise an error and kill the whole preemption process. So this is the general problem I want to bring up today. Kensei?
A: One proposal is that we can say: okay, continue to run PreFilter even if a plugin returns Unschedulable. That will be sort of wasteful, and it's also not aligned with the current implementation; it means going through all the PreFilter calculations no matter whether they will be used or not. So this is one proposal, and he raised a PR, yeah, to run all PreFilter plugins when the preemption happens in the same scheduling cycle. I think... yeah, go ahead.
B: I have some questions. I think there is one bug that affects the kube-scheduler, right?
A: Yes, but luckily, or unfortunately, all our scheduler plugins return UnschedulableAndUnresolvable rather than Unschedulable, which means they block the preemption anyway. So the vanilla scheduler will be good. It just will be...
B: But I think there was a bug where we were not injecting this status into the nodes when we had Unschedulable.
A: Yeah, that's a second regression, which is also real. Kensei and I were discussing this, and we were both skeptical about our memory. I do think that UnschedulableAndUnresolvable should block the preemption, but the behavior is not like that. So we tried to find the PR that introduced the regression, and we did track it down. This is the third thing I want to mention later, but it definitely needs to be cherry-picked to 1.27 and 1.26. So this...
A: Yes, correct, they are affected. That means this semantics doesn't work at all. You will waste some cycles on preemption, because, okay, the plugin already tells you that this will be unresolvable, so why spend extra effort on the preemption dry run, etc.? That definitely needs to be checked.
A: Yeah, so this issue is not a regression. I would say it's a minor issue that may exist in all versions, because in-tree plugins always return UnschedulableAndUnresolvable, so the vanilla scheduler won't be impacted; it's just out-of-tree plugins. Yeah, Elia, go ahead.
C: Hi, I have a slightly related question, but maybe slightly different, so let me know if it's not applicable. This plugin specifically will disallow preemption in the scheduler. However, the pod can still be evicted through a drain, right? So let's say somebody wants to remove the pod by other means that go through eviction and PDBs; it will still be evicted, right? So in a sense, the functionality in the scheduler and the other paths to preempt will be slightly inconsistent. Is that a concern, or...
C: So in this case we will say: oh right, even though otherwise this pod should be scheduled here, because it cannot run anywhere else, we're going to exclude it. Which is not the same as if somebody decides to drain, you know; then that pod will be rescheduled or removed without reservations about its schedulability anywhere else. What I'm trying to get to is that those two paths will kind of behave slightly inconsistently with each other, right?
A: If you are talking about the regression where UnschedulableAndUnresolvable was not honored, then yes, it will impact the preemption path, because the preemption path doesn't need to be executed in this case, right? But it has happened since 1.26, so that may trigger some totally unnecessary eviction of pods. Yeah, you're right; that can cause your symptom. Yes, so...
B: Wait, but I think the bigger problem here is not that preemption might do unnecessary preemptions, but rather that it would crash. It could potentially crash, right? Because not all PreFilter plugins run.
A: Yeah, it won't crash, I would say, because once you hit the first Unschedulable status you return, right? And when you return, this semantics tells you: I don't want to continue to preemption. But you did. So the symptom is that, although it's not totally hopeless for preemption for this pod, it still goes through preemption, and then there may be some unnecessary eviction, but it still cannot find a good spot or something. Yes.
B: But if we stopped adding the status to the status map, then the scheduler, the preemption, would still try to...
A: Okay, so yeah, maybe this is confusing. Basically, this discussion involves two regressions: one is that the skip result is not populated; the other is that UnschedulableAndUnresolvable is not honored. He posted fixes for both regressions, and that inspired another discussion: whether blurring the difference between Unschedulable and UnschedulableAndUnresolvable is bad.
A: So the more general discussion is: for out-of-tree plugins, or for future in-tree plugins, that may use Unschedulable, Kensei says maybe we should continue running the PreFilter plugins for failures that only return Unschedulable, because that is a lightweight error, meaning that a further preemption may help. So this is one option. The other option is, as I said here, maybe open up an option in the scheduler config to let users choose whether they want to continue running PreFilter after the first Unschedulable or not.
A: The other is to try to infer the intention of the user without opening up a configuration. Basically, we can detect whether a PreFilter plugin implements the PreFilterExtensions or not. That is the critical part; that is what impacted the preemption, because preemption continues to call the AddPod and RemovePod functions.
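A rough sketch of that inference, using simplified, hypothetical stand-ins for the framework interfaces (in the real scheduler framework, a plugin's `PreFilterExtensions()` method returns nil when it does not implement AddPod/RemovePod):

```go
package main

import "fmt"

// Trimmed-down, illustrative mirrors of the interfaces mentioned above;
// the real ones live in the kube-scheduler framework package.
type PreFilterExtensions interface {
	AddPod()
	RemovePod()
}

type PreFilterPlugin interface {
	Name() string
	// Returns nil when the plugin does not implement AddPod/RemovePod.
	PreFilterExtensions() PreFilterExtensions
}

// A plugin without extensions: its pre-computed state is never consumed
// by the preemption dry run.
type noExt struct{}

func (noExt) Name() string                             { return "no-ext" }
func (noExt) PreFilterExtensions() PreFilterExtensions { return nil }

// A plugin with extensions: preemption will call AddPod/RemovePod on it.
type withExt struct{}

func (withExt) Name() string                               { return "with-ext" }
func (w withExt) PreFilterExtensions() PreFilterExtensions { return w }
func (withExt) AddPod()                                    {}
func (withExt) RemovePod()                                 {}

// usedByPreemption infers the intention without a new config knob:
// only plugins that expose PreFilterExtensions feed the preemption dry run.
func usedByPreemption(p PreFilterPlugin) bool {
	return p.PreFilterExtensions() != nil
}

func main() {
	fmt.Println(usedByPreemption(noExt{}), usedByPreemption(withExt{}))
}
```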
A: Right now... but, you know, plugins can be combined. So if we don't support that, we should add a sort of documented limitation: okay, if your PreFilter returns Unschedulable, you may place only one plugin of this kind, right? And maybe also place it at the end of your PreFilter plugin list. The other thing we can do: for the scheduler-plugins project, I think only one plugin, which is called CapacityScheduling, which is the...
A: Yeah, that would be a no-op for the in-tree plugins.
D: So I'm one of the engineers on the Apache YuniKorn project, which does make use of these APIs. I am...
D: We're not returning just Unschedulable; we're returning UnschedulableAndUnresolvable specifically because we don't want preemption to occur. Currently we actually have the entire preemption plugin disabled in our configuration, but being able to selectively control that, instead of globally, would probably be preferable. I like the semantics of Unschedulable meaning "we might try it," and UnschedulableAndUnresolvable almost...
A: Yeah, okay, I think the conclusion is that both of the two regressions will be cherry-picked, and then for this one we can just let it soak for a while, because it doesn't impact the vanilla scheduler, and it's also a very rare case that you have two PreFilter plugins returning Unschedulable at the same time. So if you want to understand the impact of this general discussion, you can read through this doc about where and how this can be triggered.
B: Yes, so I was debugging a customer issue, and we found that, you know, in Kubernetes we have this asynchronous nature. So what could theoretically happen is that you could schedule some user pods on a node before the system pods from a DaemonSet are created, because of this asynchronous nature, right? And, well, this is okay for service workloads.
B: Sometimes it's not okay. Certain applications, like gaming, video calls, or certain AI/ML frameworks, are not very resilient to this kind of preemption, so they need stronger guarantees. Now, these kinds of workloads also don't want to be evicted...
B: ...if there is some kind of temporary disruption on a node; for example, a node temporarily becomes not ready or unreachable, things like that. These applications don't want to be evicted in that case, so we provide semantics for not being evicted, right? That's through the NoExecute taint and the corresponding toleration.
B: So these users use the toleration for the not-ready NoExecute taint, right, which is supposed to give you this behavior of not being evicted. However, the actual semantics of NoExecute is that it also restricts scheduling.
B: So if there is a NoExecute taint, your pod cannot be scheduled; but additionally, if you provide a toleration for this taint, then you are allowed to schedule before the node becomes ready. So...
B: So the semantics of NoExecute is eviction and scheduling, whereas the semantics of NoSchedule is just scheduling. So yeah, if you have a toleration for NoExecute, you're tolerating being scheduled early and you're tolerating disruptions, or sorry, not being disrupted. Yes, so in a sense you can think of NoSchedule as a subset of NoExecute.
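The effect semantics being described can be summarized in a small sketch. The effect names match Kubernetes, but the helper functions are illustrative only, not a real API:

```go
package main

import "fmt"

// Taint effects as described above.
type Effect string

const (
	NoSchedule Effect = "NoSchedule"
	NoExecute  Effect = "NoExecute"
)

// restrictsScheduling: both effects keep new pods off the node,
// unless the pod tolerates the taint.
func restrictsScheduling(e Effect) bool { return e == NoSchedule || e == NoExecute }

// evictsRunningPods: only NoExecute also evicts pods already running on
// the node (again, unless tolerated). In this sense NoSchedule's behavior
// is a subset of NoExecute's.
func evictsRunningPods(e Effect) bool { return e == NoExecute }

func main() {
	for _, e := range []Effect{NoSchedule, NoExecute} {
		fmt.Printf("%s: blocks scheduling=%v, evicts running pods=%v\n",
			e, restrictsScheduling(e), evictsRunningPods(e))
	}
}
```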
E: Right, so you can have tolerations for them separately, but that's just the semantics of NoExecute.
B: So if there is a NoExecute taint and you tolerate that taint, you will be scheduled and you will be exempted from evictions. That's the behavior. What these workloads want is being saved from eviction, but they don't want to be scheduled early. They don't want to be scheduled before the node is ready.
B: So, basically, what they want is for the system pods to be ready, and only once the system pods are ready do they want to be scheduled; and once they are scheduled, they don't want to be evicted.
B: ...adding these taints for not-ready without the NoExecute effect, but at the same time we need the taint as it exists. So what I just said would be a breaking change, right? So we would need to make a NoExecute taint...
A: So if the symptom is something like a network partition, the node is already supplied with a system NoExecute taint, and then suppose you want to tolerate this kind of network partition, because maybe you are more confident, or you have an operator to handle the disruption of these pods. So you want to tolerate the system's automatically added NoExecute taint, so your pod already carries the NoExecute toleration.
B: Yes, that is correct. The problem is that we cannot change the behavior of NoExecute; that would be breaking, because you could have, for example, a user-defined system DaemonSet that needs to be scheduled before the node is ready, or is actually part of the readiness checks.
C: I have a clarification question, maybe. So, if I understand, the gist of the issue is that no...
B: Yes, but the problem is that, as I pointed out, the not-ready taint, there's only one, and it's NoExecute.
B: Yes. Now we have... the problem is that we still have a backwards-compatibility problem here, because if we add this extra taint, there might be pods, system pods, that don't tolerate it, and then they would... yeah.
C: Yeah, I'm looking at the nodes right now, at some of the examples where I see the nodes being not ready, and I do see a collection of taints. So you have node.kubernetes.io/unreachable:NoExecute, then you have node.kubernetes.io/unschedulable:NoSchedule, and, yeah, you have node.kubernetes.io/unreachable:NoSchedule as well. So the unreachable key comes with both NoSchedule and NoExecute, right?
B: For pods that... let's say for system pods, we were assuming that the only taint is NoExecute, so they explicitly tolerate NoExecute.
C: Yeah, that's what I'm doing; I'm specifically challenging that assumption, based on what my understanding is, and it could be entirely incorrect. I'm sorry if I'm wasting your time. While it is possible to have only one NoExecute taint, in reality, when nodes go not-ready or become unschedulable, they typically come with a set of taints, one or more, and usually they come accompanied by a NoSchedule taint with one of the different keys. But again, I could be wrong about that. So that's why I'm kind of trying to challenge the assumption.
A: Oh, and also one thing: I'm not sure if you're using just one single node or several nodes. There's an internal "fully disrupted" mode, where a different taint gets applied, or something that may make the behavior a little different. So, for example, if I'm just using one node, then maybe I won't see the expected behavior, because that is controlled by an internal special state called fully disrupted.