From YouTube: Kubernetes SIG Scheduling Weekly Meeting for 20210729
A
Hi everyone. Today is July 29, 2021, and welcome to this week's SIG Scheduling meeting. This meeting is being recorded, so be aware of what you're saying; this will be uploaded to YouTube later.
A
Customizable scheduling queue abstraction or interface. The author talked to me offline at the very beginning, and the background is that we do expose some internal mechanics of the scheduling queue, like an adjustable flush interval and some other settings, but the internal queue as a whole is not fully pluggable.
A
So if a user wants highly customizable scheduling queue behavior, not only the sorting but also full control over the internals, such as how to do backoff, how to do flush, and maybe some advanced features like multiple sub-queues inside the scheduling queue interface, then maybe it's a good idea to expose the interface for them to implement. So this is the background of the KEP. But in terms of a KEP, right now a KEP is a very strict Kubernetes process.
A
We may not have a long-term plan for how to refactor the scheduling queue, especially the internal scheduling queue implementation, because right now some things, like the options as well as some other internals, are quite coupled with the internal queue implementation instead of being fully abstracted at the interface level. In the long run, I do want the scheduling queue to be abstracted and refactored so that users don't need to implement a brand-new scheduling queue. That is too much of a burden for the end user.
A
But right now we don't have the time and effort to tease apart which parts should be abstracted, so we don't have a long-term plan, and that doesn't quite fit a KEP. I think Abdullah also mentioned here that maybe we can go with just the PR, with a document attached. That should be good enough for the initial phase. So, how do we move this forward?
B
I guess my point is that, as you mentioned, there is no clear plan. The proposal is really not clear or not deep enough. It simply says we want to be able to pass in an instance of the queue, instead of it being instantiated implicitly all the time as the internal one.
B
That is not a proposal, right? I mean, it doesn't really add too much. And even if we go with that as planned right now, which is that we're going to allow you to instantiate a queue and pass it in, I don't think we want to canonically support this the way we do for plugins, because I don't feel that this is the right interface that we would want to have for a supported, you know, customizable queue.
B
You know, similar to the framework, for example, in a sense. Or if we go with a much simpler approach, like, okay, it's just an interface that you would want to implement, then that interface should again be abstract enough from the current internal implementation, and there should be examples of how different implementations can use that interface, how different implementations can apply to that interface.
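As a rough illustration of the point above about an interface that is abstract enough for different implementations, here is a minimal sketch in Python (the scheduler itself is written in Go). Every name in it (`SchedulingQueue`, `FIFOQueue`, `PriorityQueue`, `drain`) is hypothetical and chosen for illustration; none of it is the kube-scheduler's actual queue API, and backoff and flush are left out for brevity.

```python
import heapq
import itertools
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class SchedulingQueue(ABC):
    """Hypothetical pluggable queue contract (not the kube-scheduler API)."""

    @abstractmethod
    def add(self, pod: str, priority: int = 0) -> None:
        """Enqueue a pod that is ready to be scheduled."""

    @abstractmethod
    def pop(self) -> Optional[str]:
        """Return the next pod to schedule, or None if the queue is empty."""


class FIFOQueue(SchedulingQueue):
    """Ignores priority and serves pods in arrival order."""

    def __init__(self) -> None:
        self._pods = deque()

    def add(self, pod: str, priority: int = 0) -> None:
        self._pods.append(pod)

    def pop(self) -> Optional[str]:
        return self._pods.popleft() if self._pods else None


class PriorityQueue(SchedulingQueue):
    """Serves the highest priority first; ties keep arrival order."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal priorities

    def add(self, pod: str, priority: int = 0) -> None:
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._seq), pod))

    def pop(self) -> Optional[str]:
        return heapq.heappop(self._heap)[2] if self._heap else None


def drain(queue: SchedulingQueue) -> list:
    """Scheduler-side loop: it depends only on the interface."""
    order = []
    while (pod := queue.pop()) is not None:
        order.append(pod)
    return order
```

The caller (`drain`) never touches the internals of either queue, which is the kind of separation the discussion asks for before a custom queue could be supported.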
B
That would be interesting. This one, I mean, seems like just a hack, basically; what the proposal does is enable a hack, and I don't support supporting this long term, or even short term. I don't think we want to see this in the docs: "Oh, if you want a customizable queue, then go ahead and do this and that." I don't think we want to.
B
We don't want to advocate for this at all. But if whoever is going to do this wants to take the risk, it's at their own risk. We will try to help them by doing this small refactoring, but I don't support doing anything more than that at this point without a clear proposal for how to support a customizable queue.
A
Yeah, I got your point. You do not like the current short-term way, which is to just put the very heavy interface there, have them do a sort of hack of the implementation, and say that we support this, right? We don't think this is a good idea to claim.
A
B
Yeah, even if we do this, I don't support documenting it or saying that we support it the way we do for the framework and plugins and whatnot. And again, we're just doing it because it's a small refactoring. Sure, we can instantiate the queue at an earlier stage so that you can replace it.
B
If you like, we can make it easier for anybody who is importing the scheduler code to change it, basically, but I don't think we want to go further than that at this point without a decent proposal for a customizable queue interface or framework, whatever that may be.
A
All right. So maybe let's comment on the issue that we should first do a more detailed refactoring of the scheduling queue interface and tease apart the interdependencies between the internal implementation and the interface. Then we can have a better interface, and at that time we can really have the customizable scheduling queue part.
C
D
So I guess this kind of goes beyond just that; it's more about SIG Apps than SIG Scheduling, right? So maybe we can chat a little bit about it once we finish with the agenda topics.
C
A
C
Yeah, thanks. Let me briefly introduce some of our ideas on queue management. We mainly want to achieve two goals.
C
Second, the operator will find that the job has a suspend annotation, and the operator will stop creating the pods. Third, we have the two parts of the queue controller. The first one is the extension: the extension will identify the job's template and will create the queue unit. The queue unit is a unit of our queue. Then the fourth step.
C
After the job is scheduled successfully, we will update the status of the queue unit. Then the extension will remove the suspend annotation, and the operator will create the pods. This is the main idea of our system. We set up a project called kube-queue, and it is also used at Alibaba and [unclear] in production environments. The second page is the implementation of the queue controller.
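The flow described above (queue unit created for a job, pods held back while a suspend annotation is present, annotation removed once the queue unit is admitted) can be sketched roughly as follows. This is only an illustration under stated assumptions: the annotation key, the status strings, the class and function names, and the choice to have the extension add the annotation are all hypothetical and are not taken from the project being presented.

```python
from dataclasses import dataclass, field

# Hypothetical annotation key, used only for this sketch.
SUSPEND_ANNOTATION = "example.com/suspend"


@dataclass
class Job:
    name: str
    annotations: dict = field(default_factory=dict)
    pods: list = field(default_factory=list)


@dataclass
class QueueUnit:
    """One entry in the queue, created per job."""
    job_name: str
    status: str = "Enqueued"  # hypothetical status values


class Extension:
    """Watches jobs and mirrors them as queue units."""

    def __init__(self) -> None:
        self.units = {}

    def on_job_created(self, job: Job) -> QueueUnit:
        # Assumption: the extension marks the job as suspended itself.
        job.annotations[SUSPEND_ANNOTATION] = "true"
        unit = QueueUnit(job.name)
        self.units[job.name] = unit
        return unit

    def on_unit_updated(self, job: Job) -> None:
        # Once the queue controller admits the unit, release the job.
        if self.units[job.name].status == "Dequeued":
            job.annotations.pop(SUSPEND_ANNOTATION, None)


def operator_reconcile(job: Job, replicas: int) -> None:
    """Job operator: creates pods only when the job is not suspended."""
    if SUSPEND_ANNOTATION in job.annotations:
        return
    if not job.pods:
        job.pods = [f"{job.name}-{i}" for i in range(replicas)]
```

In this sketch the scheduler never sees pods of a suspended job, which matches the idea of gating at pod creation time rather than at scheduling time.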
C
A
Yeah, I have some questions. If I understand correctly, the latter part, the queue management you mentioned, is the controller. So it's not inside the scheduler, and it's also not a scheduler plugin; it's a totally independent component? Yeah, the second page. Okay. So how does that associate with the scheduling framework, like QueueSort, Filter, and Reserve?
C
But the goal is the same as the queue issue we talked about before. We just solve multi-tenant scheduling through the controller, not within the scheduling framework.
B
C
B
Helpful. How long has it been in production, if I may ask? About seven months. Seven months.
B
If I may ask, how many jobs does this controller handle? How long is each queue going to be, and how many queues do you have? Just for me to get a rough understanding of what scale you are targeting with this design.
C
Yeah, maybe more than this. Yeah, I will.
B
Okay, yeah, that's all pretty great information; it gives us a sense of the scale. Okay, that's great. As Aldo probably mentioned, we are working along the same lines, and we're interested in collaborating on this. If you are planning to open source this, we can take a look as well and contribute to it. If not, and you only want to discuss the design, then we can implement something in the open for the community.
B
C
B
C
The operator will stop creating the pods. We talked with people from the Kubeflow community, and we will add this logic to the operators in Kubeflow. They have also set up a common operator for the jobs, and we also talked with them about adding this logic to that operator. Yeah.
D
What I'm thinking is that, early on, we can definitely agree on common APIs, and then the decision of whether we want to share the same controller can be left for later.
D
B
I think this belongs more to SIG Scheduling. I don't know; yeah, you could have another working group, but I mean, we're just adding another layer of organization. I'm not sure SIG Apps as a whole SIG is invested in this, other than the Job API. SIG Scheduling is more invested in batch scheduling. We have a lot of people here.
B
They're trying to, you know, enhance the default scheduler to support batch cases. But as for hosting it, I mean, we have a clear option, which is something similar to the plugins one.
B
But that's, I think, premature talk. Let's see the proposed design first, and maybe then we can see how to proceed. Good.
C
A
Could you make the slides shareable and paste the link either in the issue or under the agenda? So now we have gone through an alternative design that manages pod creation, managing the pods that belong to a job or a batch workload at creation time instead of at scheduling time. So that is another angle for solving this problem.
A
Yeah. Could you do that, and put your slides link back here?
A
So the second item is that I discussed with Aldo offline about refactoring some of our behavior on how to handle internal scheduling failures, including some internal scheduling faults and the like. Right now we don't distinguish internal errors from the standard errors, like filter errors, the fit error, or some other things. I think Aldo's idea is that with this kind of internal error, maybe next time the internal, transient error will be gone.
A
So we prefer to treat this kind of error as transient and make it retriable as soon as possible. That means we prefer to put the pod in the backoff queue instead of the unschedulable queue, because if your pod is in the unschedulable queue, you have to wait for a related event to come in and trigger that pod to be retried. Yeah, that is the background of this discussion.
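The routing being proposed here can be sketched as a small decision function. This is an illustrative sketch only: the `Failure` enum, the `requeue` function, and the 1-second initial and 10-second maximum backoff values are assumptions chosen for the example, not the scheduler's actual code.

```python
from enum import Enum
from typing import Optional, Tuple


class Failure(Enum):
    UNSCHEDULABLE = "unschedulable"  # no node currently fits the pod
    INTERNAL = "internal"            # e.g. an API request failed


def backoff_seconds(attempts: int, initial: float = 1.0,
                    maximum: float = 10.0) -> float:
    """Exponential backoff: doubles per attempt, capped at `maximum`."""
    return min(initial * 2 ** (attempts - 1), maximum)


def requeue(failure: Failure, attempts: int) -> Tuple[str, Optional[float]]:
    """Pick the destination queue for a pod that failed a scheduling cycle.

    Internal (transient) errors go to the backoff queue and are retried
    after a delay, with no cluster event needed. Unschedulable pods are
    parked until a relevant event moves them back.
    """
    if failure is Failure.INTERNAL:
        return ("backoff", backoff_seconds(attempts))
    return ("unschedulable", None)
```

For example, `requeue(Failure.INTERNAL, 3)` yields a 4-second backoff under these assumed defaults, while an unschedulable pod gets no timer at all and waits for an event.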
D
Yeah, no, not really. The distinction is basically internal errors: maybe the API server was down, or the API request just failed because the API server is overwhelmed. This is different from an unschedulable pod. Yeah. So.
D
So yeah, my idea is that we should move these pods to backoff directly. I think we've seen a similar problem with PVCs: when a pod has PVCs and the PVC is somehow not created yet, because it's all asynchronous, then we should be retrying that.
B
But, right, the PVC, when it gets created, will create an event, and that will move the pod back. I mean, the first example you gave is reasonable, if you have errors related to the API server. But I don't know in which plugins we face these errors; none of these plugins actually talk directly to the API, other than the ones that try to...
D
B
Yeah, the binding issue is one good example.
B
D
I'm not sure if any other plugins would actually have different retriable errors. Most of the errors should be unexpected, right? They only happen if there is a weird bug somewhere.
D
So in those cases, I guess I don't know what to do, but for binding, I feel like the pod should go directly to the backoff queue. So that's one topic; I don't know if we all agree. But the side topic is that the whole retry logic, the whole recording of scheduling errors and all of that, is very flaky. The code is getting very complex, so we might need to find a better design just for that part.
D
B
D
Where we're trying to handle these kinds of errors differently for each pod: if a permit fails, or if a permit is denied, then we have to decide what to do with that pod. And I just realized that we're not making these kinds of decisions; we're just always putting everything in the unschedulable queue. I have a feeling that this triggers those questions of: why do we need a backoff queue?
D
Why can I not just remove the backoff queue? I feel like it's because these kinds of retries cause a delay on these pods when they shouldn't. So yeah, but I don't have a real-world...
A
All right, I think that's pretty much it for today's meeting. And Abdullah, maybe sometime we can talk offline about the discussion for the next release.