From YouTube: Kubernetes SIG Scheduling meeting - 2019-01-10
A
Hello and welcome to the SIG Scheduling meeting, and happy new year. This is the first scheduling meeting of 2019; I hope you have all started a great new year. As you know, this meeting is recorded and will be uploaded to the public internet. All right, with that, let's start the meeting. I have a few updates, probably more than usual, given that we didn't have any meetings in the past couple of weeks.
A
So, let's see. I'm going to try to start with the more important ones, because we may not get to all of the updates this week, but hopefully we will cover as many as we can in the time we have. All right. First of all, let's go over some of the project updates. As you know, we have a relatively long list of items for 1.14, and some of these items are already done.
A
One of the items that is already done is adding a backoff mechanism for unschedulable pods. This was an item that had been pending for a while, given the complexity of the area, but it's already merged. The idea is that pods which are unschedulable are not going to be retried in a tight loop; they basically yield to other pods. Our current scheduling queue has a pretty sophisticated mechanism to give precedence to pods which have higher priority, but if such high-priority pods are unschedulable, they could block the head of the queue. In order to avoid that, we have added this backoff mechanism: when a high-priority pod is unschedulable, it is subject to backoff so that it does not block the queue. Basically, that item is done.
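For illustration, here is a minimal Go sketch of the backoff idea described above. The constants and names are assumptions made for the sketch, not the actual kube-scheduler implementation: each failed scheduling attempt doubles a per-pod backoff so an unschedulable high-priority pod stops monopolizing the head of the queue.

```go
// Minimal sketch of per-pod exponential backoff for unschedulable pods.
// Constants and structure are illustrative, not the real kube-scheduler code.
package main

import (
	"fmt"
	"time"
)

const (
	initialBackoff = 1 * time.Second  // assumed initial backoff
	maxBackoff     = 10 * time.Second // assumed cap
)

// podBackoff tracks how often a pod failed scheduling and when it last failed.
type podBackoff struct {
	attempts    int
	lastAttempt time.Time
}

// duration returns the exponential backoff for the current number of attempts.
func (b *podBackoff) duration() time.Duration {
	d := initialBackoff
	for i := 1; i < b.attempts; i++ {
		d *= 2
		if d >= maxBackoff {
			return maxBackoff
		}
	}
	return d
}

// recordFailure is called when a scheduling attempt finds the pod unschedulable.
func (b *podBackoff) recordFailure(now time.Time) {
	b.attempts++
	b.lastAttempt = now
}

// readyToRetry reports whether the pod has waited out its backoff and may be
// moved back to the active queue, where priority ordering applies again.
func (b *podBackoff) readyToRetry(now time.Time) bool {
	return now.Sub(b.lastAttempt) >= b.duration()
}

func main() {
	b := &podBackoff{}
	now := time.Now()
	for i := 0; i < 4; i++ {
		b.recordFailure(now)
		fmt.Printf("attempt %d: retry after %v\n", b.attempts, b.duration())
	}
	fmt.Println("ready immediately after failure:", b.readyToRetry(now))
}
```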
We have also optimized node status updates. In large clusters, node status updates are very frequent: every node sends a status update every 10 seconds, and every time one of these node updates arrives, the scheduler retries scheduling of unschedulable pods. But a lot of these node updates are essentially no-ops; they are simply heartbeat updates, and there is nothing changed on the node. So we've added an optimization that checks what has changed on a node, and if there is nothing that could make the node more schedulable, the scheduler is not going to retry. This item is done.
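A rough sketch of that update-filtering idea follows, using the Kubernetes API types. The helper name and the exact set of fields compared are assumptions for illustration, not the real scheduler function.

```go
// Sketch: only retry unschedulable pods when a Node update could actually
// affect scheduling; ignore heartbeat-only updates.
package main

import (
	"fmt"
	"reflect"

	v1 "k8s.io/api/core/v1"
)

// conditionSnapshot keeps only the condition fields that matter for scheduling,
// dropping heartbeat timestamps that change on every update.
func conditionSnapshot(conds []v1.NodeCondition) map[v1.NodeConditionType]v1.ConditionStatus {
	m := make(map[v1.NodeConditionType]v1.ConditionStatus, len(conds))
	for _, c := range conds {
		m[c.Type] = c.Status
	}
	return m
}

// nodeSchedulingPropertiesChanged reports whether the update could change
// scheduling decisions: spec (including taints), labels, allocatable resources,
// or condition statuses.
func nodeSchedulingPropertiesChanged(oldNode, newNode *v1.Node) bool {
	return !reflect.DeepEqual(oldNode.Spec, newNode.Spec) ||
		!reflect.DeepEqual(oldNode.Labels, newNode.Labels) ||
		!reflect.DeepEqual(oldNode.Status.Allocatable, newNode.Status.Allocatable) ||
		!reflect.DeepEqual(conditionSnapshot(oldNode.Status.Conditions), conditionSnapshot(newNode.Status.Conditions))
}

func main() {
	oldNode := &v1.Node{}
	updated := &v1.Node{}
	// A pure heartbeat (nothing relevant changed) should not trigger retries.
	fmt.Println("retry unschedulable pods:", nodeSchedulingPropertiesChanged(oldNode, updated))
}
```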
A
We have landed another feature to improve the performance of the scheduler. As you may know, we have been trying to score fewer nodes than the whole cluster, especially in larger clusters, in order to improve throughput. In the feature's initial phase we started with scoring 50% of the cluster, and then in a subsequent phase we added a dynamic percentage: in larger clusters this percentage goes lower. So in, say, a 5,000-node cluster, as soon as we find 10% of the nodes feasible, 500 nodes in this example, we stop scanning for more nodes and use those 500 nodes to schedule the pod. The percentage is higher for smaller clusters: in clusters of around a hundred nodes or fewer, it's going to be 100% of the cluster, and in clusters with a couple of thousand nodes or so, it's going to be about 30 percent of the cluster. Anyway, there is a linear formula for this. It has improved scheduler performance quite a bit in 5,000-node clusters.
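Below is a sketch of a linear formula consistent with the numbers mentioned above. The exact constants are assumptions chosen to match the discussion (100% for roughly 100-node clusters, about 10%, i.e. roughly 500 nodes, for a 5,000-node cluster), not necessarily the values in the shipped scheduler.

```go
// Sketch of "how many feasible nodes to find before stopping the search".
package main

import "fmt"

func numFeasibleNodesToFind(numAllNodes int32) int32 {
	const (
		minFeasibleNodes = 100 // small clusters: just check every node
		basePercentage   = 50  // starting percentage for mid-size clusters
		minPercentage    = 5   // floor for very large clusters
	)
	if numAllNodes <= minFeasibleNodes {
		return numAllNodes
	}
	percentage := basePercentage - numAllNodes/125 // linear decrease with size
	if percentage < minPercentage {
		percentage = minPercentage
	}
	found := numAllNodes * percentage / 100
	if found < minFeasibleNodes {
		return minFeasibleNodes
	}
	return found
}

func main() {
	for _, n := range []int32{100, 1000, 2000, 5000} {
		fmt.Printf("%5d-node cluster -> stop after %d feasible nodes\n", n, numFeasibleNodesToFind(n))
	}
}
```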
A
I don't see Harry here. On the design of the new equivalence cache, Harry was working on that; I don't know what the latest status is. He had a prototype, but we haven't seen a PR yet. Next, gang scheduling. Klaus, you're here; I know that the proposal is already merged. I would like to hear your thoughts, and thanks for the update in the community meeting this morning, by the way.
B
Sure, my pleasure. Yes, I gave an update about gang scheduling this morning. I didn't leave time for them to ask questions, so I didn't get much feedback for now, but as far as I know there are several people who would like to try running batch workloads or machine learning containers. I think co-scheduling is a good starting point for us to move on. Yes, for gang scheduling, the basic feature is implemented in kube-batch, and we are working on the remaining pieces and a controller for this part.
B
For kube-batch, yes, I'm also trying to align with the scheduling framework, as I said before. Yes, the feature set of kube-batch is almost there; in the short term maybe we will add a few more features there, so I'm trying to align with the scheduling framework so we can share some components, yeah.
A
Okay, excellent, thank you. Now, speaking of the scheduling framework, Jonathan has worked on the scheduling framework a bit, mostly around polishing the design and reflecting some of the changes that we made afterwards. One of the important ones was, of course, the fact that we didn't want to build another scheduler from scratch; we decided to bring all the ideas of the scheduling framework into the current scheduler, in order to keep backward compatibility and everything. So the new revised version is there.
A
There is a PR; I have linked it in our meeting notes, the document which is also in the calendar invite, so you can check it out. Please go and read it if you care about the scheduling framework, because this is an important time to give us feedback. Later on it will be harder to change things, and of course after it's implemented and out there it becomes even harder, and further, once things become beta or GA, they become almost impossible to change.
B
Some people asked last week how to use the default scheduler as a library, as an SDK, for example for the performance part. So yes, I think we need to show how to use that. Then the community can say, "oh, this is what I want" or "this is not what I want", so they can give us feedback, yes. So I would like to have, I believe, two examples. The first one is how to use the scheduling framework.
B
The other one is how to use our default algorithms, such as the predicates and priorities. Yes, I think there have been some issues introduced, or issues where we cannot use the predicates and priorities from outside the scheduler; you know, some people have submitted pull requests about this part. Yes, because...
A
It's still a little early for the scheduling framework to have code examples, at least not in its design proposal. But definitely, as we are building the scheduling framework and adding extension points, we're going to have some examples along the way, and of course we're also going to build some tests, which will also work as examples of how to use the framework. That's a good point; thank you for the feedback. Yeah.
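Since examples come up here, a purely hypothetical sketch of what "a plugin at an extension point" could look like follows. The interface and type names below (FilterPlugin, Status) are invented for illustration; the framework's real API was still being designed at the time of this meeting.

```go
// Hypothetical shape of a scheduler plugin registered at a "Filter" extension
// point. Names and signatures are illustrative only, not the framework's API.
package main

import "fmt"

// Status is a hypothetical result type returned by extension points.
type Status struct{ Reason string }

// FilterPlugin is a hypothetical extension point: decide whether a pod fits a node.
type FilterPlugin interface {
	Name() string
	Filter(podName, nodeName string) *Status // nil means the node passes
}

// zonePlugin is a toy plugin that rejects one fixed node, just to show the shape.
type zonePlugin struct{}

func (zonePlugin) Name() string { return "example-zone-plugin" }

func (zonePlugin) Filter(podName, nodeName string) *Status {
	if nodeName == "node-in-wrong-zone" {
		return &Status{Reason: "node is outside the requested zone"}
	}
	return nil
}

func main() {
	var p FilterPlugin = zonePlugin{}
	if s := p.Filter("my-pod", "node-in-wrong-zone"); s != nil {
		fmt.Println("filtered:", s.Reason)
	}
}
```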
A
All right. There have been several issues in the scheduling queue, and also a race condition in setting the nominated node name in the preemption logic of the scheduler. All of these issues are already fixed, and cherry-pick PRs have also been sent out, so hopefully those issues will be addressed pretty quickly, yeah.
A
These are issues that mostly affect larger clusters, or clusters with a lot of pending pods and things like that. All right, so one more update is about affinity and anti-affinity. There is a thread started by Wojtek; for those who don't know him, he is one of the top contributors to Kubernetes.
A
Of course, I don't agree with all the details in his proposal, but the discussion is going on. He was thinking that maybe we should remove the feature, or make it very limited, to node-level affinity only, but I don't think that would be enough, because there are users who want to use the feature for use cases such as "I want to put all my pods in the same zone" in order to avoid network delays, and also to avoid network charges, which are pretty common in most cloud providers.
A
If you go across zones, that is. So I think there are use cases for affinity to zones, and to larger collections of nodes in general, other than just a node itself.
Anyway, the discussion is ongoing; nothing is finalized, and I don't think we have concluded anything yet. If you are interested, feel free to go take a look at the discussion and participate.
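For reference, the zone use case mentioned above roughly corresponds to pod affinity with a zone-level topologyKey. The "app: web" label and the beta zone label key below are illustrative only.

```go
// Example pod affinity: co-locate with pods of the same app at zone granularity.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	affinity := &v1.Affinity{
		PodAffinity: &v1.PodAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: []v1.PodAffinityTerm{{
				// Co-locate with pods labeled app=web...
				LabelSelector: &metav1.LabelSelector{
					MatchLabels: map[string]string{"app": "web"},
				},
				// ...at zone granularity rather than on a single node.
				TopologyKey: "failure-domain.beta.kubernetes.io/zone",
			}},
		},
	}
	fmt.Printf("%+v\n", affinity)
}
```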
A
These are most of the things that I wanted to talk about today. There are a few items where I would like to hear from the contributors about their status. Let's see, Ravi is here. Ravi, as far as I can tell, there is currently no plan for making the scheduler a separate component of Kubernetes. Is that right? Is that still okay?
A
So we would like to keep the list shorter. Also, having people on the reviewers list causes GitHub to auto-assign some of the PRs to them, and as a result PRs may not get reviewed by anyone for a long time, which also gives a bad experience to our contributors. So I would like to clean up some of those entries. Basically, the reason I asked the question is exactly that I wanted to make sure that, if he's not working on this, we remove him from the list. Yeah.
C
So because of that, it is kind of becoming a problem, because from a security perspective we do not want nodes, or kubelets, to update the taints on themselves, because they could steer workloads towards them, or say "I do not want a particular pod to land on me." So updates are not possible, but we can register the taints during kubelet initialization, that is, add taints to the node when the kubelet finishes initialization. I have created a PR for that, but it won't cover the case where there are updates, meaning...
C
We will eventually do it, but Klaus has a point where he mentioned that after about 300 seconds the pod would usually get evicted anyway, so even if it takes some time for the node controller to apply the taint, as long as it's within 300 seconds, how come... The 300 seconds is coming from the toleration, the default tolerations.
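For context, here is a minimal sketch of the default toleration being referred to, as added to pods by the DefaultTolerationSeconds admission plugin; the key and the 300-second value are shown to the best of my understanding of the defaults.

```go
// Sketch of the default "not ready" toleration that drives the 300-second eviction.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	seconds := int64(300) // pod is evicted 300s after the NoExecute taint appears
	toleration := v1.Toleration{
		Key:               "node.kubernetes.io/not-ready",
		Operator:          v1.TolerationOpExists,
		Effect:            v1.TaintEffectNoExecute,
		TolerationSeconds: &seconds,
	}
	fmt.Printf("%+v\n", toleration)
}
```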
C
I'm not talking about the time that is needed for applying and removing the taint; I'm talking about after the taint has been applied. For example, for a NoExecute taint, the default toleration time is 300 seconds, correct.
So in both cases, I do not know the exact time, like how long the node controller takes before applying the taint. When I tested it locally, it was happening almost instantaneously on the clusters that I tested. But the toleration is something that Klaus mentioned on the PR: it might be difficult, especially in an online environment where some of the services would say, okay, after 300 seconds I am going to be unavailable; that might be kind of a difficult proposition for them. So he has given a couple of solutions, like how we could add those taints within the node status.
C
As of now, the taints are actually in the node spec. So he has a few suggestions, and he wanted a discussion to be started within SIG Scheduling so that we could come up with some proposal and eventually close this out. Without that, we have to disable TaintNodesByCondition, yeah.
A
That's a little unfortunate, because we had to do that for 1.12 and 1.13 in GKE, given this condition that exists, but we would like to re-enable it if possible. I don't have a full understanding of the proposal that you mentioned, but one question that I have, and Klaus is here himself too: I am aware that when you specify a toleration, you usually have the timeout, the 300 seconds or whatever, and afterwards the pods are evicted, but...
B
I think for the fix, I'm okay if we have something in the short term, yes, to handle this part. But TaintNodesByCondition is already in beta, and the thing that worries me is the race condition, so I'm thinking whether we should rethink our design for this part, yeah. So yes, for the fix, I think we need to, I will, create a solution for this part; we don't want to block the users' use case, yeah. And for the long term, I would like to resolve this problem.
B
Yeah, I think there are two threads here. The first one: we need to have a clear fix for our releases. And the other is that we need to rethink whether TaintNodesByCondition is the right direction; if there is a race condition, we may have some other options.
A
So one question: before this feature, TaintNodesByCondition, we had node status, or node conditions, which were things like "not ready" or whatever, and the scheduler was taking those into account. Kubelets are allowed to change those, right? I mean, weren't there the same security concerns about that?
A
That was the first solution that came to my mind as well; maybe that's something we can do. It feels a little bit like a reaction to the problem, and a bit of a hack to me, but yeah, that's one of the immediate solutions that comes to mind. Okay. I don't think we can actually solve the whole problem right now; I will take another look at the PR and share my thoughts.
A
So I think your proposal is fine, actually. Initially I was not so sure about it, but after you explained it, I think it's fine. We can have LT and GT operators; they should be okay. I don't think it's going to make performance a lot worse, but we need to actually make sure that's the case; we basically need to have some performance tests. Are you going to work on it? Do you want to send a PR, or... yeah.
A
It is related to priority, but what you are describing is one of the things that resource quota controls. Basically, resource quota controls many things, including priority classes; it also controls the number of pods, for example, that users can create, and the memory, CPU and other resources that users can consume, things like that. So one of the things it controls is how many resources you can have at a particular priority.
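For illustration, one way this shows up in the API is a ResourceQuota whose scope selector matches a priority class, limiting what can be consumed at that priority. The class name "high" and the limits below are examples only.

```go
// Example ResourceQuota scoped to pods using the "high" PriorityClass.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	quota := v1.ResourceQuota{
		ObjectMeta: metav1.ObjectMeta{Name: "high-priority-quota"},
		Spec: v1.ResourceQuotaSpec{
			// Limits on pod count, CPU and memory consumed at this priority.
			Hard: v1.ResourceList{
				v1.ResourcePods:   resource.MustParse("10"),
				v1.ResourceCPU:    resource.MustParse("4"),
				v1.ResourceMemory: resource.MustParse("8Gi"),
			},
			// Only pods using the "high" PriorityClass count against this quota.
			ScopeSelector: &v1.ScopeSelector{
				MatchExpressions: []v1.ScopedResourceSelectorRequirement{{
					ScopeName: v1.ResourceQuotaScopePriorityClass,
					Operator:  v1.ScopeSelectorOpIn,
					Values:    []string{"high"},
				}},
			},
		},
	}
	fmt.Printf("%+v\n", quota)
}
```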