From YouTube: Kubernetes SIG Scheduling Weekly Meeting for 20221215
B
All right, welcome everybody to this session of SIG Scheduling. Today is December 15th and, as you may know, this meeting is being recorded. We actually have kind of a packed agenda.
B
We wanted to start today by mentioning a few of the things we are planning to do for the 1.27 release.
C
Yeah, can you give me back co-host? I think... yeah, can you make me a co-host?
B
Okay, so in the meantime, as I was saying: with the help of SIG Scalability, we were running some performance tests and we discovered a few low-hanging fruits in the scheduler that we can address to improve the throughput.
B
As you might know, for several releases we've been working on performance, and we are really getting down to the very low-level details, but there is still room for improvement. So here we opened a list of issues that will actually improve the performance of the scheduler. There are a few. This one is about some calculations, if I remember correctly.
B
The second one, right, that's also calculations. Some of them are even closed already. Some of them require API changes: for example, the third one might require adding some configuration to the kube-scheduler config, which is the component config for the scheduler. And there are more optimizations here and there. Maybe more interesting is the last topic in the scheduler.
B
So, for example, if a new node is created, we retry: we put the pods back into the active queue. But we even have extra checks: for example, if the pod has a node affinity, we immediately check just the affinity against that new node, and if it doesn't satisfy it, we skip. So in general, we ended up doing a lot of quick calculations to reduce the amount of retries for pods. But there is still an unconditional retry of pods, which we call flushing.
B
So after a certain period, the pods are put back into the active queue regardless of why they were unschedulable. We've been experimenting with increasing this timeout, this flushing period, but keeping the ability to revert it just in case we still have some bugs. So this issue is mostly about increasing the flush period even further. We are currently at five minutes by default.
B
We want to make it 15, but we still want to keep the option of reverting it back, and we want to do that through the kube-scheduler config API. So this is what the issue is about. If you have any concerns: I think we've been doing this carefully through other releases and there haven't been issues lately, so that's why we're proposing to increase the period even further, but please communicate any concerns.
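To make the flushing behavior concrete, here is a minimal Go sketch, assuming simplified stand-in types; the queue type and names are illustrative, not the actual kube-scheduler implementation.

```go
package main

import (
	"fmt"
	"time"
)

// Pod stands in for the scheduler's queued pod info.
type Pod struct{ Name string }

// queue holds active and unschedulable pods; flushInterval mirrors the
// flushing period discussed above (five minutes by default today).
type queue struct {
	active        []Pod
	unschedulable []Pod
	flushInterval time.Duration
}

// flush unconditionally moves every unschedulable pod back into the
// active queue, regardless of why scheduling failed earlier.
func (q *queue) flush() {
	q.active = append(q.active, q.unschedulable...)
	fmt.Printf("flushed %d pods back to the active queue\n", len(q.unschedulable))
	q.unschedulable = nil
}

func main() {
	q := &queue{
		unschedulable: []Pod{{Name: "web-0"}, {Name: "web-1"}},
		flushInterval: 15 * time.Minute, // the proposed new default
	}
	// In the real scheduler this would run on a timer firing every
	// q.flushInterval; here we trigger it once for illustration.
	q.flush()
}
```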
B
So currently, when a filter runs, we don't know if the filter actually made sense for a particular pod. For example, if we have the node affinity filter and the pod didn't define any node affinity, we want to be able to tell in metrics that this filter didn't make any meaningful decision for this pod. That way, for any new feature that we add, we can more clearly tell whether there are pods that are using the feature.
B
So
this
is
what
the
the
this
issue
is
about.
The
first
step
is
actually
determining
some
form
of
status
that
clearly
says
that
the
the
plugin
didn't
do
anything
meaningful.
So
that's
that's
the
current
War,
the
current
work
that
cancer
is
doing,
but
then
we
have
to
migrate.
We
have
to
do
a
pass
through
all
the
all
the
plugins
to
make
sure
we
are
using
this,
this
new
distinction,
and
with
that
once
that
is
that
is
finished.
B
We
would
also
like
to
include
that
into
the
logs.
So
it's
it's
more
easy
to
debug,
and
there
is
some
we
could
generalize
also
for
score
plugins,
but
we
haven't
really
thought
of
all
the
implications,
but
once
once
we
finish
this
and
there's,
if
there
is
still
time
in
the
release,
we
can
definitely
look
into
improving
metrics
and
logs
for
for
score
for
scoring.
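As a rough illustration of the idea, here is a hedged Go sketch, assuming a simplified status enum and flat label maps rather than the real scheduler framework API:

```go
package main

import "fmt"

// Code classifies a filter result. Skip is the new status discussed above:
// the plugin had nothing meaningful to decide for this pod.
type Code int

const (
	Success Code = iota
	Unschedulable
	Skip
)

// Pod and Node are simplified stand-ins for the real API objects; a real
// node affinity is much richer than a flat label map.
type Pod struct{ RequiredNodeLabels map[string]string }
type Node struct{ Labels map[string]string }

// filterNodeAffinity returns Skip when the pod declares no node affinity,
// so metrics (and later logs) can tell "passed" apart from "not applicable".
func filterNodeAffinity(p Pod, n Node) Code {
	if len(p.RequiredNodeLabels) == 0 {
		return Skip // nothing to evaluate; not a meaningful decision
	}
	for k, v := range p.RequiredNodeLabels {
		if n.Labels[k] != v {
			return Unschedulable
		}
	}
	return Success
}

func main() {
	node := Node{Labels: map[string]string{"zone": "a"}}
	fmt.Println(filterNodeAffinity(Pod{}, node) == Skip) // true: filter not applicable
	matching := Pod{RequiredNodeLabels: map[string]string{"zone": "a"}}
	fmt.Println(filterNodeAffinity(matching, node) == Success) // true
}
```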
B
So yes, that's kind of the small side of things that we want to achieve during this release, things that don't require KEPs. Abdullah, over to you to talk about the bigger features, the KEPs.
C
Okay, so yeah, we've got like one, two, three, four, five KEPs that we would like to graduate.
C
We have only one that is proposed as new, which is the first one. So, if you recall, we introduced the idea of matchLabelKeys in pod topology spread to solve the problem where, when you update a ReplicaSet, it goes through a rolling update.
C
You wanted a way to basically apply the constraints to the new ReplicaSet, not the old one, because while you're doing the update we know that at some point the pods of the old ReplicaSet will be scaled down and removed, and so those old pods from the older ReplicaSets should not be taken into account when doing the calculations of skew and whatnot.
C
So there is a proposal to introduce the same idea to pod affinity and pod anti-affinity. I think it's reasonable, making the API symmetric, although I expect that anti-affinity is the one where it will be used more often, rather than pod affinity. In general, I'm not sure how much usage we have for pod affinity, but it's nice to have; I wouldn't say we should block it.
C
So that's the first one. The others are, again... this is the pod topology spread one; it's the same idea as I discussed for pod topology spread. We want to graduate this to GA, to stable; it's in 1.25 now. I think this is a really nice feature. It actually makes it easier in general to use pod topology spread: instead of specifying the label by both key and value, you only need to specify the label key, and the value will be detected by the scheduler.
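For reference, such a constraint looks roughly like this in the corev1 Go types; a sketch assuming the matchLabelKeys field as it shipped in alpha behind a feature gate, with pod-template-hash being the label a Deployment stamps on each ReplicaSet's pods:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Spread "app: web" pods across zones. The scheduler resolves the
	// value of pod-template-hash from the incoming pod's own labels, so
	// each ReplicaSet of a Deployment is spread independently during a
	// rolling update, without users spelling out the hash value.
	c := corev1.TopologySpreadConstraint{
		MaxSkew:           1,
		TopologyKey:       "topology.kubernetes.io/zone",
		WhenUnsatisfiable: corev1.DoNotSchedule,
		LabelSelector: &metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "web"},
		},
		// Only the key is listed; the value is detected by the scheduler.
		MatchLabelKeys: []string{"pod-template-hash"},
	}
	fmt.Printf("%+v\n", c)
}
```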
C
The third one is the mutable scheduling directives. This is in beta as well, and the hope is to graduate it to GA in 1.27.
C
The next one is an enhancement related to the scheduler having more guarantees around respecting the PodDisruptionBudget, I guess, on preemption. I don't recall the exact details of this one, like what is it that we're trying to...
E
Okay, I can mention a little bit. So the background is that PDB was sort of a first-class citizen in terms of preemption since day one, but in a best-effort manner. That means, if for example two candidate pods are both PDB-protected and they are tied, then in this case we will preempt the PDB-protected pods anyway. But in the other case, what I mean by best effort is:
E
If there is a pod not protected by a PDB and another pod protected by a PDB, then we will definitely choose the pod without the PDB protection. So that is why I call it a best-effort manner. But some users say: PDB represents the disruption semantics in all dimensions, not only preemption in the scheduler, but also in the eviction API and others. So it's reasonable for us to at least give an option to the user that says: don't preempt pods protected by a PDB.
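A minimal sketch of the two policies, assuming simplified types; the strict flag here is hypothetical and just stands in for the opt-in being discussed:

```go
package main

import "fmt"

// victim is a simplified candidate pod for preemption.
type victim struct {
	name         string
	pdbProtected bool // would preempting this pod violate its PDB?
}

// pickVictims contrasts the two policies: today's best-effort behavior
// prefers pods whose PDBs are not violated but will still preempt
// protected ones when nothing else is available; the strict opt-in
// discussed above would refuse to preempt PDB-protected pods at all.
func pickVictims(candidates []victim, strict bool) []victim {
	var unprotected, protected []victim
	for _, v := range candidates {
		if v.pdbProtected {
			protected = append(protected, v)
		} else {
			unprotected = append(unprotected, v)
		}
	}
	if len(unprotected) > 0 {
		return unprotected // always preferred today
	}
	if strict {
		return nil // opt-in: leave PDB-protected pods alone
	}
	return protected // best effort: preempt them anyway
}

func main() {
	pods := []victim{{name: "a", pdbProtected: true}, {name: "b", pdbProtected: true}}
	fmt.Println(len(pickVictims(pods, false))) // 2: preempted anyway
	fmt.Println(len(pickVictims(pods, true)))  // 0: protected
}
```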
B
Yes, you linked this one, which needs to graduate to GA: minDomains, right. But we also have another KEP for topology spreading, which is the one about matchLabelKeys, right. That one is in alpha, so we need to make it beta in 1.27.
C
That makes sense, okay, right: so we're graduating that one from alpha to beta, and the other one is from beta to GA, correct? Yes, sounds good. Okay, yeah, so this is an overview of the features that we will probably focus on in 1.27 for the scheduler. Please, if you have any other things in mind:
C
Please add them here, because we need to go through the opt-in process again, like last time. You need a lead to tag the issue so that it gets tracked. So yeah, back to you.
B
Okay, well, any questions? Or we can go to the next topic.
B
Okay, so, Sergey, thank you.
F
Hi, thank you for having me. I came here from SIG Node for an early announcement of the sidecar KEP that we plan to run in 1.27. If you don't know about sidecars: sidecars are containers that run inside the pod and do some infrastructure work. It may be a logging container, a metrics container, or it may be a service mesh proxy that runs inside the pod and handles the networking, so all the networking goes through this proxy instead of the regular network.
F
The problem with sidecars is that if we implement them as regular containers, they affect the pod lifecycle, and that is a big problem. So we have been trying to address this problem for a long time, to enable this scenario for our customers, and in 1.27 I think we got to the point where we know what the API will look like.
F
So, if you switch to this email: there is a proposal that is currently on the table, and everybody seems to agree that this is what the desired state would be if we designed from scratch. We want to extend init containers with a new type of container with restartPolicy: Always. Those init containers will run in the same order as regular init containers would typically run.
F
The
only
difference
is
that
we
will
wait
for
Readiness
of
these
containers
and
then
we'll
move
on
to
next
one
leaving
this
one
alone,
so
it
will
leave
through
the
installation
stage
through
the
regular
containers
life
cycle
and
they
will
not
block
determination
of
a
port.
So
if
all
other
containers
in
a
job
for
instance
completed
then
this
contains
will
be
terminated
by
coblet.
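A sketch of the proposed shape, written with the corev1 Go types; at the time of this meeting the container-level restartPolicy field was still a proposal, so treat the field and its Always value as illustrative:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	always := corev1.ContainerRestartPolicyAlways
	spec := corev1.PodSpec{
		InitContainers: []corev1.Container{
			// A normal init container: runs to completion first.
			{Name: "fetch-certs", Image: "example/vault-agent"},
			// The proposed sidecar: started in init order, kept running
			// (and restarted on crash) until the pod terminates, and it
			// does not block pod termination once all other containers
			// in a Job have completed.
			{Name: "proxy", Image: "example/mesh-proxy", RestartPolicy: &always},
		},
		Containers: []corev1.Container{
			{Name: "app", Image: "example/app"},
		},
	}
	fmt.Println(*spec.InitContainers[1].RestartPolicy) // Always
}
```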
F
So this is an overview of the change that we propose, and it has a lot of issues for customers and issues for implementation. One obvious one is backward compatibility: before, people didn't expect that init containers would survive through the whole lifecycle, and now some of them will. And from a scheduling perspective, this change means that the resource calculation needs to change.
F
So if you want to understand whether the pod will fit onto a specific node, the formula before was: the maximum across all init containers, and the sum of all other containers, and you take the maximum of those. Now you need to be more elaborate, because these sidecar containers will run during the initialization stage and will survive through the containers stage.
F
You have to understand the order of startup and take the maximum over specific chunks of sub-lists. I can explain more, but the key point here is that the resource calculation will change without a major change of the pod structure: ideally, we extend init containers with at most this one extra field for the new containers.
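A minimal Go sketch of the old and new formulas, assuming simplified containers with only a CPU request; this is illustrative, not the shared helper code itself:

```go
package main

import "fmt"

// container is a simplified view of a container's CPU request.
type container struct {
	cpuMilli int
	sidecar  bool // restartPolicy: Always under the proposal
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// effectiveCPURequest sketches the change described above. The old formula
// was max( max over init containers, sum over regular containers ). With
// sidecars, each already-started sidecar keeps running, so its request is
// added to every init container that starts after it and to the sum for
// the main stage.
func effectiveCPURequest(inits, regulars []container) int {
	peak, running := 0, 0 // running = requests of sidecars started so far
	for _, c := range inits {
		peak = maxInt(peak, running+c.cpuMilli)
		if c.sidecar {
			running += c.cpuMilli
		}
	}
	sum := running // sidecars stay up alongside the regular containers
	for _, c := range regulars {
		sum += c.cpuMilli
	}
	return maxInt(peak, sum)
}

func main() {
	inits := []container{
		{cpuMilli: 500},                // one-shot init container
		{cpuMilli: 100, sidecar: true}, // proxy sidecar
	}
	regulars := []container{{cpuMilli: 200}}
	fmt.Println(effectiveCPURequest(inits, regulars)) // 500: init peak wins over 100+200
}
```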
F
So now I want to open it up for major questions and maybe some recommendations, some suggestions. I can talk about some ideas we've been throwing around for backward compatibility, how to work around this problem, but yeah, I want to hear questions first.
C
About the restart policy here: so you go through the init containers one by one. The current semantic is that if one fails, you continue to restart it, but you don't go down the list until it succeeds, correct? And here, when you get to this one, the one with restartPolicy: Always, what happens?
F
For this one, we will wait for readiness of this container. So the container needs to go through the startup stage and get into a ready state: if it defines a readiness probe, we will wait for the probe to succeed, and then we'll move on to the next one. And restartPolicy: Always indicates that whenever it crashes, even if it crashes after the initialization stage, we will keep restarting this container.
C
So in this case, basically, the difference is that this container will be allowed to continue to run after it finishes initialization. This is the new semantic as well, right? Which means you need to change the calculations to take into account that it's not just the maximum: you need to add it on top of whatever the normal containers in the pod spec have. Exactly, okay.
F
Yeah, it looks like a regular container, but we need it during the initialization stage, because some of the service meshes provide TLS networking for everything, including the init containers themselves; but then there are some init containers that initialize sidecars. So in this example, we download a certificate from some vault, and then, once we have the certificate, we can enable TLS networking for the entire pod, and all containers after this istio-proxy will use istio-proxy for all the network communications, and nobody is allowed to use the regular network after that. Okay.
B
I have a request: when you go through these changes, we should probably have a single library, probably in component-helpers or in the component-helpers staging repo, because we have had issues before where there was a slight difference in implementation between the kubelet and the scheduler, or the cluster autoscaler, or even, I guess, the resource quota calculations. So yeah, we really should have a single implementation that is shared among all of these systems. Yeah, and the cluster autoscaler.
D
Not a problem; if you have it sorted out in the scheduler and resource quota, the cluster autoscaler will be fine.
B
Right, so it's more about the kubelet, the API server and the scheduler having a shared implementation, so we don't have problems. So if there is some intermediate work needed to first do the cleanup or the refactoring, let's do that first and then change the implementation.
F
Pod spec changes have a weird lifecycle: they are really hard to implement, and rolling them out takes a long time because of the version skew policy, right. Okay, so one question I had: once you have this library to share the resource calculation, I know there are custom schedulers, and I know there are some plugins. How much will this backward-compatibility change hit us from the perspective of third-party tooling? Is it something big enough that we need to worry about?
F
So
we
there
is
no
way
we
implement
it
and
we
need
to
go
different
route
and
rename
something
significantly
so
it
will
be
like
breaking
change
ish
like
so
people
wouldn't
just
list
any
containers
any
longer
or
we
it's
manageable
and
with
enough
communication
we
probably
can
get
away
with
just
adding
things
to
this
collection.
B
So when people implement custom schedulers, they are encouraged to use the existing plugins, yeah. So once we fix the scheduler, they should be able to benefit from the existing plugin, the NodeResourcesFit plugin. But they might have extra plugins, I don't know, for quota calculations or things like that, which might break; but I don't think that, as the Kubernetes project, we can offer guarantees in that regard.
B
And given that we would only be able to graduate, let's say, to beta in two releases at the earliest, that gives some period where they can get up to date with the implementation.
E
I want to add one point, thinking about custom schedulers. I looked at the code of some third-party schedulers, like YuniKorn. In their official documentation they claim they support the latest release, 1.25 or 1.24.
E
But that's not really the case, because I looked at, for example, YuniKorn: their dependency is only upgraded to Kubernetes 1.22. That means if we introduce a new field in 1.23 or later, they are not even aware of the new field, so how can they claim to be compatible and honor the new APIs?
F
One proposal we had was to completely duplicate the init section: have a new section with the same properties as the init section has today, for the new containers. But it doesn't make things much better, because people wouldn't know about this new section, and they would probably have the same problem with the new section as they have with a new flag on the existing section. Another proposal was to have fake containers in the containers section that duplicate the sidecar containers and are ignored by the kubelet when the pod actually runs.
F
I don't know how much we want to invest in that workaround; it sounds like a little bit too much to worry about. But if you think it's something we need to think about, I would really appreciate the feedback.
F
By early January we will have a KEP out. It's kind of a big change, and we already have like three or four previous attempts to write this KEP. This time there's a big working group, so hopefully... and we already got API approval, I mean, head nods from the API reviewers on the API, so I think we are in much better shape this time around. So yeah, I will send the KEP link to this group as well once we have it.
B
Okay, so one more question: this is about startup...
F
They will be terminated once the job completes; so this will be implemented. One big problem we are trying to wrap our heads around is what we do during graceful termination of regular pods or regular containers, because if it's Istio providing the networking, the network may be needed during termination. So you need to be really careful with ordering and restarting: if this proxy crashes, we need to restart it even though we are in the termination stage.
F
Yeah, and also they will not block the termination of the pod.
F
Good sidecars today already have this mechanism: they just ignore SIGTERM for some extra duration of time and try to clean up all the buffers. So this will still apply.
B
Excellent, this is very exciting. So, with that: Michelle.
A
Hello! For folks that don't know me, I am one of the SIG leads for SIG Storage. I came in here because I wanted to let folks know about this user community called Data on Kubernetes. It's a community full of people who are trying to run stateful workloads on Kubernetes.
A
So there are a lot of database vendors there that are writing operators, and there are also just users of those operators as well, and I'm trying to organize sort of a regular round-table session.
A
With this group, between Kubernetes maintainers and people from the DoK community. My goal is that we can get direct feedback from users about any sorts of friction or problems or things they would like to see us enhance in Kubernetes to make their lives easier.
A
So we are going to schedule a first round-table session in January, and I'm just kind of going around to all the SIGs to gather interest, and when that session is actually scheduled, I can reach out to everyone here. And I think eventually, if from that first session there are enough topics to talk about to be worth creating a working group or something like that, like a stateful working group, I think that is also a potential option on the table.
A
I think the idea is that running a stateful workload is a lot more than just storage; storage is just a small piece of it. They might have specific scheduling problems, or maybe specific node problems, or networking problems that they need to deal with. So that would be the purpose of the working group: to put this together across SIGs, because the problems might be cross-SIG.
B
Sorry, does this have any relationship to a user group from the CNCF, or is this a separate effort, an ad hoc group?
A
I think just add your name here, and then when the first meeting is scheduled, I will be reaching out to everybody. I guess I should probably find a better way, maybe like a mailing list or something.
B
I guess you could still consider reaching back to the CNCF and figuring out whether starting a user group is a good idea as well.
B
Well, okay, we'll see you in the next meeting... see you in two weeks? We don't have a meeting in two weeks; it's December 29th, so let's cancel that one. We'll see you after New Year's. Happy holidays, happy New Year!