From YouTube: Kubernetes SIG Scheduling Meeting - 2019-01-24
A: Renee is here today. Let's start with you, since you may not be interested in the rest of the meeting. I'll let you start; actually, I can give a quick update about what this topic is, then we can discuss it, and after that you can feel free to disconnect if you want. So, for other folks: I have put a link in our meeting notes to the issue, or the PR I sent, which is a proposal for in-place updates of pods.
A: The idea behind in-place update of pods is that sometimes pods want to change their resource requirements; for example, a pod may need more CPU or memory. In those cases you would like to have a system which automatically changes those resource requirements. This is especially needed because a lot of users don't know how much their pods require, and even if they know, these requirements often change during the course of execution of many pods. For example, a server may suddenly get more traffic and, as a result, need more CPU. So we would like to build a system that automatically changes these resource requirements based on the actual usage of pods. There is a proposal; I just sent a link to it. We already have a mechanism for changing resource requirements of pods, but the existing mechanism requires pods to be restarted after these changes are applied. What we are pursuing is another approach that does not require, or doesn't necessarily require, a restart, and that's the subject of our discussion today. There are some design decisions that we would like to make, and Rene is here for that.
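The difference between the two mechanisms described here can be sketched in a few lines (illustrative Python only, not Kubernetes API code; all names and fields are made up for the example):

```python
# Minimal sketch contrasting the restart-based resize, which replaces the
# pod, with an in-place resize, which patches the running pod's requests.

def resize_with_restart(pod, new_requests):
    """Current mechanism: the pod is killed and recreated with new requests."""
    return {
        "name": pod["name"],
        "requests": dict(new_requests),
        "restart_count": pod["restart_count"] + 1,  # workload is disrupted
    }

def resize_in_place(pod, new_requests):
    """Proposed mechanism: mutate requests on the live pod, no restart."""
    pod["requests"].update(new_requests)            # restart_count untouched
    return pod

pod = {"name": "web-1", "requests": {"cpu": 0.5, "memory": 256}, "restart_count": 0}
restarted = resize_with_restart(pod, {"cpu": 1.0})
updated = resize_in_place(pod, {"cpu": 1.0})
```

The point of the proposal is the second path: the CPU request changes while the restart count, and hence the running workload, is left alone.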
B: Thank you. The backstory of the proposal that we were looking at is that there have been a couple of design proposals over the past year. In our design proposal, we looked at how an in-place update should happen and what the control flow should be: who the initiators are. The initiating actor in our case is a job controller; our customer requirement stems from a long-running job. Having max resources allocated for the peak usage is expensive, and they wanted to see if they could get something that can scale up and scale down as the resource needs increase and decrease. The other main use case here is VPA, which is currently using a restart mechanism and is looking into an admission controller to update the resource requirements on a pod that's being created, before it's scheduled. The design that we worked on involves keeping the scheduler in the loop.
B: What we have today is: when you create a new pod, the controller goes and creates the pod, and then the scheduler sees that a pod has been created that's not bound. It goes and runs its predicates on the pod, one of which is a resource check, finds the best node for it to be assigned to, picks a node, and assigns the pod to it. The flow that we want to use is similar.
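The resource-check predicate described here amounts to comparing the pod's requests against what is left on each node; roughly (a simplification, not the real scheduler code):

```python
# Toy version of the scheduler flow: run a resource-fit predicate over the
# nodes and bind the pod to the first node where it fits.

def fits(node, pod):
    used_cpu = sum(p["cpu"] for p in node["pods"])
    used_mem = sum(p["mem"] for p in node["pods"])
    return (used_cpu + pod["cpu"] <= node["cpu"] and
            used_mem + pod["mem"] <= node["mem"])

def schedule(nodes, pod):
    for node in nodes:
        if fits(node, pod):
            node["pods"].append(pod)   # bind the pod to this node
            return node["name"]
    return None                        # pod stays pending

nodes = [
    {"name": "node-a", "cpu": 2.0, "mem": 4096, "pods": [{"cpu": 1.5, "mem": 1024}]},
    {"name": "node-b", "cpu": 4.0, "mem": 8192, "pods": []},
]
assigned = schedule(nodes, {"cpu": 1.0, "mem": 2048})
```

Here node-a fails the CPU check, so the pod lands on node-b; the in-place-update proposal wants the same predicate consulted for a resize, not just for a new pod.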
B: So, as part of this, the flow that we were looking to do is have the scheduler pick up the resource update first and then either say yes or no. The proposal that Karol has here, which is what I'm planning to merge, accounts for one other case, where the scheduler can preempt lower-priority pods and get things going; that case was not caught in our proposal. Here is the issue with the current flow.
B: Okay, going into the details of this particular comment: it looks like Bobby already commented, and thank you, Bobby, for looking at this particular issue. The issue that I was looking at is that in the current proposal we update the resource requirements after validation; it's updated in the pod spec, then the scheduler checks the pod first and preempts. At the same time, the kubelet tries to apply it, and this is where I saw the problem.
B: The scheduler is at that time working on scheduling another pod to the node, against the capacity which it sees as available, and it schedules that pod. The scheduled pod goes to the kubelet, the kubelet sees it doesn't fit and rejects it, and then the kubelet processes the update request and fails it, saying: okay, no.
B: This node is full, I cannot do this update now, and the initiating actor has to take that pod out. The kubelet's work is kind of doubled, because it has to schedule pod P2 again and then P1 gets rescheduled, because the initiating actor most likely will kill and then recreate the pod, which increases the kubelet's workload overall. So this is the point that I felt was an issue, and I'm working on a flow proposal.
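The race described here can be reproduced in a toy model: the scheduler admits a new pod P2 against the capacity it currently sees, while an in-place increase for P1 is still in flight to the kubelet, and whichever change lands second is rejected (illustrative numbers, nothing here is real Kubernetes code):

```python
# Toy model of the scheduler/kubelet race around an in-place resize.

def kubelet_admit(node_capacity, running, request):
    """Admit a request only if it still fits on the node."""
    used = sum(running.values())
    return used + request <= node_capacity

capacity = 4.0
running = {"p1": 2.0}                  # kubelet's view of the node (CPU cores)

# 1. The initiating actor asks to grow p1 by 1.5 cores (request in flight).
p1_increase = 1.5

# 2. The scheduler, unaware of the pending increase, still sees 2.0 cores
#    free and places a new 2.0-core pod p2 on the node.
scheduler_view_free = capacity - sum(running.values())
p2_admitted = kubelet_admit(capacity, running, 2.0)
if p2_admitted:
    running["p2"] = 2.0

# 3. The resize request now reaches the kubelet and no longer fits.
resize_ok = kubelet_admit(capacity, running, p1_increase)
```

One of the two operations had to lose; in the flow being criticized, that loss turns into a failed update, a rescheduled pod, and doubled kubelet work.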
B: It's more in line with what's being proposed right now. There are some nice things to Karol's approach, where he's reusing existing pod conditions and container statuses. So I'm writing some quick-and-dirty code to test it out and see if there are any major gotchas in applying our approach as used by the controllers.
B: The controller would have some smart retry mechanisms: okay, if the in-place update failed and it has to be done in place, then what we're going to do is look at pods leaving that node. When pods leave that node, there is a potential that capacity has opened up, and then we retry. That way we are not burdening the scheduler with random retries, and the retry is done when there is a certain expectation that it should succeed.
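The "smart retry" policy sketched here is essentially event-driven: park the failed resize and only resubmit it when a pod leaves the same node. A minimal sketch of that idea (the class and method names are made up for illustration, not real controller APIs):

```python
# Sketch of a retry policy keyed on pod departures rather than timers.

class ResizeRetrier:
    def __init__(self):
        self.pending = {}                 # node name -> failed resize requests

    def resize_failed(self, node, resize):
        """An in-place resize was rejected for lack of capacity: park it."""
        self.pending.setdefault(node, []).append(resize)

    def on_pod_deleted(self, node):
        """A pod left `node`: capacity may have opened up, retry resizes there."""
        return self.pending.pop(node, []) # caller re-submits these requests

r = ResizeRetrier()
r.resize_failed("node-a", {"pod": "p1", "cpu": 1.5})
nothing = r.on_pod_deleted("node-b")      # unrelated node: nothing to retry
retried = r.on_pod_deleted("node-a")      # departure on node-a triggers retry
```

The payoff is exactly what the speaker says: no random retries hitting the scheduler, and each retry happens when there is some expectation it will succeed.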
A: I actually went and read the whole thing, and I agree that it's not a good idea to have racy sorts of algorithms between two components. As you mentioned, it causes extra activity in various modules, including the scheduler, which is sort of a precious resource for us, especially in larger clusters. If this happens a lot, it could negatively impact the scheduling throughput, so we prefer to remove the races as much as possible.
B: Another question that I was looking into, with Karol as well: Derek had raised a concern about the "gamification" of the resources, and Karol had suggested using the max of resources allocated versus resources requested (the desired resources). I'm trying to understand that a little bit more closely, mainly from this perspective: it does seem to act more on the conservative side, which is good; you will end up in a situation where you're not over-provisioning any node.
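The max-based accounting being discussed can be stated very compactly: while a resize is in flight, charge the node the larger of the allocated and desired values, so neither a pending grow nor a pending shrink can cause over-provisioning. A sketch under that reading (illustrative only):

```python
# Sketch of max(allocated, desired) accounting for in-flight resizes.

def accounted(allocated, desired):
    """Charge the conservative (larger) value while a resize is pending."""
    return max(allocated, desired)

def node_used(pods):
    return sum(accounted(p["allocated"], p["desired"]) for p in pods)

pods = [
    {"name": "p1", "allocated": 2.0, "desired": 1.0},  # shrink in flight
    {"name": "p2", "allocated": 1.0, "desired": 1.5},  # grow in flight
]
used = node_used(pods)   # charges 2.0 + 1.5, not 1.0 + 1.0 or 2.0 + 1.0
```

The shrink of p1 is not credited until it actually lands, and the grow of p2 is reserved up front, which is exactly the conservative behavior described above.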
A: I don't remember this exact problem; my memory is not that good, so maybe I had this conversation, but I don't recall exactly. Do you know what exactly this max of resources is used for? I mean, alternatively, if you're okay with having different QoS classes, we could use limits for setting the maximum, to indicate the maximum resources that VPA could go for, for a particular pod. So I don't exactly remember what the max of resources is used for.
B: During a reduction it might also work out, in the sense that: okay, your desired resources are now lower than the actual resources, and when you're accounting for it you account for the max, so that you're not over-provisioning. To take a simple example: let's say there is one node with one pod which is using all the capacity, and then the scheduler reduces it to half, and then another new pod comes for scheduling which can fit on that node.
B: The one potential problem I see is: the scheduler goes and reduces the capacity and tells the kubelet, okay, go ahead and reduce its capacity. The kubelet hasn't seen that update yet, and then this new pod comes in and gets scheduled to that same node. Now, if these two updates are reordered, then the scheduled pod will get rejected. That is one potential problem I see. Do you see that happening? Could that happen? Yeah.
B: That's what I want to wrap my head around today and understand more closely what it exactly means; maybe that's what the "gamification" term that Derek used refers to. I'll follow up with Derek and see if this is what he meant, and whether this particular scenario is what Karol and Derek were worried about. As far as the rest of the document goes, I believe you had a couple of questions in there.
B: Yeah, okay. I will take these new parameters that Karol has, and probably tomorrow I'll be able to finish my quick prototype and verify. The only outstanding issue in this, for which there is no good solution, is when you have two schedulers acting independently. In that case the best thing to do is: the scheduler sees that the kubelet has failed the update request and it deducts it; let's say it's an increase, and then the scheduler can detect a transition.
A: Basically, the basis of the idea is to accept the fact that there is a race condition and try to deal with it. If it happens, we should not insist on the same decision that we have made already, because it has failed. The scheduler can try a different node, although we don't have any of that logic currently; basically, the scheduler does not keep any history, so maybe we need to add that. That's one approach, but there could be other options as well. Yeah.
B: The approach we have taken is to keep the scheduler simple. The controller, the initiating actor (in this case it could be VPA or the job controller), would see that it has failed and then retry it, and we were looking at a couple of different policies. In the case of the job controller, when it sees that it has failed, it goes and retries it at a later point, when pods leave, and we use the reason field in the pod condition to specify why it failed: it failed because of capacity.
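The reason-field idea can be sketched as follows. Note that the condition type and reason strings below are hypothetical stand-ins for whatever the proposal settles on; only the shape (type/status/reason, as in a Kubernetes pod condition) follows the discussion:

```python
# Sketch: surface the resize failure through a pod condition, and let the
# controller choose a retry policy from the recorded reason.

def make_resize_condition(ok, reason=None):
    return {
        "type": "ResourcesResized",        # hypothetical condition type
        "status": "True" if ok else "False",
        "reason": reason,                  # e.g. "InsufficientNodeCapacity"
    }

def retry_policy(condition):
    if condition["status"] == "True":
        return "done"
    if condition["reason"] == "InsufficientNodeCapacity":
        return "retry-when-pods-leave"     # capacity may open up later
    return "give-up"

cond = make_resize_condition(False, "InsufficientNodeCapacity")
policy = retry_policy(cond)
```

Because the failure reason is machine-readable, the initiating actor can tell "retry later when capacity frees up" apart from failures that are not worth retrying.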
B: Okay, the node doesn't have capacity now, but it will have capacity when pods leave, and this is something the controller has a view of, so the controller can decide: when that burstable pod leaves the node, now is a good time to retry. The other cases are when you have deployment controllers, where you're resizing all the instances of the pod, and if the resize requires restarting, then you could violate the pod disruption budget.
B: VPA might have other ideas of how to handle it; I think they were looking at a flow where the scheduler takes the task of kicking out low-priority pods, which makes sense, because outside of the Kubernetes cluster nobody has control. The control is with the scheduler and the controllers, to see if they can move low-priority pods off the nodes so that the higher-priority pods can get the resources that they desire.
A: In the past, we tried to change the kubelet and add logic to taint the nodes at startup, but that had upgrade issues. So we are now pursuing a different approach, basically changing the API server to taint nodes at creation time. It looks like the PR is now ready to get merged, and we will cherry-pick this change to all the releases, basically since 1.12.
A: Hopefully this will resolve some of those issues. Another issue we have faced recently, which Wei has been working on, is that the scheduler sometimes leaves some pods in the pending state and does not retry those pods. As some of you may already be aware, we have logic in the scheduler to not retry unschedulable pods until there is a change in the cluster that makes the pods more schedulable, for example a node being added, and things like that.

A: In the past, the scheduler was reacting and retrying all these unscheduled pods at every node heartbeat, which arrived roughly every 10 seconds from each node, so in a larger cluster it would happen very frequently. Nowadays the scheduler is more efficient, but we know that there could be some races where the scheduler could possibly miss some of these events while it's trying a pod, and if that pod, which is in flight, is determined to be unschedulable, then sometimes this pod may not be retried. So we've had a mechanism to retry some of these pending pods, but that mechanism is only in master; we may need to cherry-pick it into all the releases to solve this issue. I will follow up with Wei on this.
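The event-driven behavior described here, parking unschedulable pods and only retrying them when a relevant cluster event arrives, can be sketched as a two-queue structure (this mirrors the idea only, not the real scheduler's queue implementation):

```python
# Sketch: unschedulable pods wait in a separate pool and move back to the
# active queue only when a cluster event (e.g. node added) could help them.

class SchedulingQueue:
    def __init__(self):
        self.active = []
        self.unschedulable = []

    def add(self, pod):
        self.active.append(pod)

    def mark_unschedulable(self, pod):
        self.unschedulable.append(pod)    # no blind periodic retry

    def on_cluster_event(self):
        """E.g. a node was added: flush parked pods back for another attempt."""
        self.active.extend(self.unschedulable)
        self.unschedulable = []

q = SchedulingQueue()
q.add("p1")
q.mark_unschedulable(q.active.pop())      # scheduling attempt failed
before = list(q.active)                   # empty: p1 is parked
q.on_cluster_event()                      # node event triggers the retry
after = list(q.active)
```

The race being discussed is a pod whose failure is recorded just after the event that would have flushed it, so it sits parked; the retry mechanism mentioned above exists to unstick such pods.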
A: So, Valerie, I know that you have raised interest in one of our issues, which is non-preempting priority classes. I do support adding that feature. I'm not so sure that I will have enough time to help with fixing the problems in the existing PR for this feature, but if you think you can help with that feature, I would really appreciate your help.

C: [inaudible]
A: So yeah, for helping with the code base and sort of mentoring people, I don't know if I will find enough time, to be honest with you. I will be happy to answer maybe some questions which are a little bit quicker, but for the ones which need more time, it's really hard for me at this point.
D: SIG ContribEx has been working on the mentoring program for a long time to try to meet needs like this, whether it's something like a code-base tour or help on a particular PR. So I think it's quite likely that the knowledge that's needed is not just in Bobby's head but somewhere in this SIG,
D: among a few of us. So I think, you know, asking this SIG who has time for, for example, a code-base tour is one step, but another would be finding the landing page for the mentoring program, to see if putting in a request is the right path for this particular issue. Yeah.
A: Thanks, these are all great points. You can seek help there, and in fact the failure that we are seeing is not in the scheduler code; it's actually more on the API side, so it falls mostly in the API machinery and how the API should be added. And since this change is touching the API, there is some amount of work to be done there.
A: Some amount of code should be automatically generated, and we should make sure that you're touching the right places to ensure that all this new code is generated properly. I'm not sure whether the failure you're seeing is because of that. So yeah, seeking help from, say, SIG ContribEx is probably the best approach at this point. Okay, yeah, thank you. Any other questions or comments, or updates from projects you guys are working on?
A: Okay, one quick thing: Harry, I know that you and one of your colleagues, I believe, have been working on the equivalence cache, or equivalence class; at this point, I guess, it's not quite a cache, it's more like a class. I'm sorry that I haven't had the chance to take a look at the PR, but it's on my to-do list; I will definitely take a look, and hopefully we can get that going.