From YouTube: Kubernetes SIG Scheduling Meeting - 2019-03-14
A: All right, let's start the meeting. As you know, this meeting is recorded and will be uploaded to the public internet, so whatever you say is likely to remain there for a very, very long time. With that, let's start talking about some of the items that we have. Unfortunately, we don't have that many participants in today's meeting, so some of the feature owners may not be present today. We can actually go through some of the items that we have in the recording, so hopefully those people will get back to us later.
One of the items for 1.15 is the new equivalence cache and the new equivalence class. Hopefully we can get these to a level that is performant enough soon, so we can have those in 1.15. We definitely need to have a proper KEP for these in order to merge them in 1.15. Kube-batch is one of the other items, which is still in the incubator.
Jonathan actually helped us with refining the design, so that is in. I will probably start creating a few issues for building some of the smaller items — smaller extension points — in the scheduler for the framework. I mean, we already have a couple of extension points based on the older design, the previous design. Those probably need to be modified slightly to make them fit better with the newer design. So probably either myself or Jonathan —
one of us will probably make those changes, and then we will create some other issues for other folks to contribute to. Hopefully we can get a major portion of this into 1.15. We also still would like to work on implementing scheduling policies — these are the policies that specify what kind of scheduling features people can put on their pods. That's another item.
As soon as you're here: we are working on resource bin packing. I left some comments, and I saw that you answered some of those comments already, with respect to basically modifying one of the existing priority functions instead of just creating a new one for resource bin packing. So, I mean, based on your comments it looks like you're fine with that design as well. So hopefully we can get that in in 1.15 as well. Is that fine with you? Do you want to discuss it?
D: Yes.

A: Sure. Supporting non-preempting priority: I know that Valerie has been working on this. You ran into an issue, I believe, and Alex — no, not Alec; yeah, Alex — I believe he actually managed to fork that and fix the issue. There was a test failing with non-preempting priority, and he managed to fix that issue. So hopefully we can get that in 1.15 as well. We might actually need a KEP for this — I think we do — yeah, so we don't have a KEP right now, correct.
We need more scheduling metrics; Wei has volunteered to take care of that. All right, and then there is one relatively large feature that we would like to add to the scheduler, and that's being aware of physical nodes in a cluster. The problem here is that, in many cases, people run Kubernetes on-prem, basically on physical hosts, and when they do so, they usually run Kubernetes on top of a virtualization layer — for example, they run vSphere, or they run KVM or something like that — and then create virtual machines.
These virtual machines become the nodes of the cluster. So now imagine, you know, you have, let's say, three physical hosts in your cluster, and each runs 10 virtual machines, so 10 nodes of the cluster land on a single physical machine. Kubernetes, in its way of spreading pods, spreads pods among the nodes that it sees. The problem is that some of these nodes — or, in some situations, many of these nodes — land on a single physical host. Now, let's say you run a web server which has ten replicas.
It spreads these ten replicas among ten nodes, and those ten nodes happen to be on the same physical host. So when that physical host goes down, all of those ten replicas go down, and your web server — your web service — will see an outage. This is not ideal, and a lot of folks are running into this issue. It's not even limited to physical hosts or on-prem clusters.
It kind of applies, to an extent, even to the cloud providers too, because they have a very similar setup. I mean, your Kubernetes cluster would be even more reliable if you could do the same thing for cloud platforms. So we are planning to build something that allows Kubernetes to be aware of physical hosts, and then labels the nodes of the cluster with their physical hosts as well. Those physical hosts essentially become another failure-domain label in Kubernetes, and Kubernetes then starts spreading pods among those different failure domains — the different physical hosts. We currently, as I said, only have a way to spread pods among the nodes of the cluster. With this, we would change our priority function to spread among physical hosts. If that is not possible — for example, in the case where the nodes are not labeled with a physical host —
then we don't have any other option than spreading among other nodes in the cluster, some of which could happen to be on the same physical host. So I have filed an issue for this — it's in our spreadsheet — and we would like to work on it. I believe Alex has already volunteered for doing this, but this, I believe, is a more-than-one-person project. We need more volunteers. So basically, there are multiple aspects to this.
One is to define, you know, the label for the failure domain, and add functionality to the scheduler to spread pods among physical hosts. That's one part of the project, which is probably a one-person project. But there is a much larger part of it, which is essentially building admission webhooks. That's one option — you know, you may come up with other options as well, and those might be even better — but the one that I thought about is just building an admission webhook for the various virtualization layers.
For example, we could build one for vSphere, we could build one for AWS, we could build one for GCP, and so on. What this does is communicate with some of these APIs — for example, with the vSphere APIs — read the physical-host information from vSphere, and then, at the time of node creation, add the label to the node.
For example, let's say we call it failure-domain — I don't know, k8s.io or whatever — slash physical-host, or something like that, and then that admission webhook is responsible for adding these labels. We probably need to build a few of these admission webhooks; we can start with vSphere for now.
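As a rough sketch of what such a mutating webhook could compute — the label key, the AdmissionReview shapes, and the `mutate_node` helper are all illustrative (the meeting had not settled on a name), and a real webhook would serve this over HTTPS and read the host from the virtualization layer's API:

```python
import json

# Hypothetical label key -- the actual name was undecided in the meeting.
PHYSICAL_HOST_LABEL = "failure-domain.example.io/physical-host"


def mutate_node(admission_review: dict, physical_host: str) -> dict:
    """Build an AdmissionReview response that labels a Node with its
    physical host. `physical_host` would come from the virtualization
    layer's API (e.g. vSphere) in a real webhook."""
    uid = admission_review["request"]["uid"]
    node = admission_review["request"]["object"]
    labels = node["metadata"].get("labels")
    if labels is None:
        # No labels yet: add the whole label map in one patch op.
        patch = [{"op": "add", "path": "/metadata/labels",
                  "value": {PHYSICAL_HOST_LABEL: physical_host}}]
    else:
        # JSONPatch (RFC 6901) escapes '~' as '~0' and '/' as '~1' in keys.
        key = PHYSICAL_HOST_LABEL.replace("~", "~0").replace("/", "~1")
        patch = [{"op": "add", "path": f"/metadata/labels/{key}",
                  "value": physical_host}]
    return {
        "apiVersion": "admission.k8s.io/v1beta1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            # A real response carries this base64-encoded over HTTP.
            "patch": json.dumps(patch),
        },
    }
```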
In the same issue, I commented that there is already a vSphere plugin that has this functionality. We can actually borrow ideas from there, or at least look at it for reference as to which vSphere API we should call to find the physical host, and then build that admission webhook. So if any of you folks — or other folks who are not here and listen to our recording in the future — are willing to participate, please go ahead and mention that in the issue. We can probably divide the work among multiple folks to take care of different parts of it.
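As a sketch of how such a label would be consumed once a webhook sets it — the key `failure-domain.example.io/physical-host` is hypothetical (no name had been chosen) — today's soft pod anti-affinity could already prefer spreading the ten web-server replicas across physical hosts:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 10
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web
              # Hypothetical label key added by the admission webhook.
              topologyKey: failure-domain.example.io/physical-host
```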
Basically, you can specify a set of operators, and you can say that you want the labels of the pod to be in this set, or not to be in this set. But we also want to support less-than and greater-than operators, so that people can specify a range of values and say that if the pod, for example, is in this range, then we have affinity to it, or we have anti-affinity to it. So Leon is working on that; the KEP is out, and we need to implement this.
One of the main concerns here is that pod anti-affinity is already slow. We have made it a lot faster than before — in the past it was super slow, like a thousand times slower than the other predicates. It's now bearably slow, like 10 times slower, but still significantly slower than the other predicates, and by introducing these operators we might make it even slower. So we definitely need to make sure that this is not going to impact performance as much.
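For comparison — node affinity (unlike pod affinity) already has `Gt` and `Lt` in its operator set, with values compared as integers; the proposal discussed here would add analogous range operators on the pod-affinity side. The existing node-affinity form, with an illustrative label key:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: example.io/cpu-generation  # illustrative label key
          operator: Gt                    # existing NodeSelector operator
          values: ["5"]                   # parsed as an integer
```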
C: It has this issue because it would make the design confusing and awkward in providing the even-pod-distribution function. So basically, I think, after the offline talk with Bobby, plan B — which is to make the even pod distribution a standalone predicate and priority — makes more sense, because it may be more well-defined and maybe less error-prone, and, you know, it's a standalone thing. Basically, my idea is that we can extract some existing fields in affinity into the even-spread expression.
So that means we have to implement similar things to affinity, to express which pods should be grouped together — so that they are more attracted to each other, to be placed together, right? And based on that, we should have a top-level maxSkew setting to control the degree of imbalance — whether they have to be perfectly even, or they can tolerate some degree of skew, sort of like that. And we can also — we should — have a top-level...
The definitions are exactly the same — they are kind of identical — but we are not forcing people to have the same definition in both the spreading and the pod affinity here, right? So it's good to have them both there if they want, all right, so we don't import them. The only issue is pod anti-affinity, because these work independently, so we are not forced to change the semantics of the current pod anti-affinity, which is to run only one pod exclusively in a topology.

A: Yes, yes.
So, thank you — thank you for sharing this. I agree with many things that you said. By the way, first of all, I apologize that I didn't mention this in, you know, our plan for 1.15: it is actually one of the important projects that we would like to implement in 1.15. The idea here is that we are going to add a new feature to spread pods evenly among different failure domains. Basically, it's very similar to anti-affinity, but with anti-affinity — like hard anti-affinity —
A
We
were
letting
only
one
part
to
exist
in
a
particular
failure
domain.
For
example,
if
the
failure
domain
was
a
node,
we
were
spreading
parts
among
the
nodes
and,
if
more
than
one
part
had
to
land
on
the
same
node,
and
there
was
no
other
node
in
the
cluster,
the
end
up,
I
wouldn't
get
scheduled
right
now.
Let's
say
that
you
have,
for
example,
10
replicas.
You
have
5
nodes
with
this
new
feature.
You
evenly
distribute
these
pods
among
these
five
nodes,
so
each
part
gets
each.
Each
node
gets
two
parts.
This is the idea behind evenly distributing pods among failure domains. Regarding the topology key for distribution here: I honestly prefer to go with the same topology key, because it's essentially the same thing — it's the same concept as we have in affinity — and I feel it makes more sense for our API to be consistent. But once this feature is built, I think we need to go back and change certain things in anti-affinity; that's what we had from the very beginning.
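For reference, this even-spreading design later landed in Kubernetes as pod topology spread constraints; the ten-replicas-across-five-nodes example above would be expressed roughly like this (field names shown as they eventually shipped — they were still under discussion at the time of this meeting):

```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1                        # tolerated degree of imbalance
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule  # hard constraint; ScheduleAnyway = soft
    labelSelector:
      matchLabels:
        app: web
```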
In a cluster that has anti-affinity, that anti-affinity could get violated by the pods being scheduled, so anti-affinity is a little hard to check and takes a lot of cycles. If we make it limited to a node only, then it becomes a much, much simpler check: when we run predicates, we can just easily check whether there is any anti-affinity in the other pods on a particular node. So, yeah.
B: I had one comment, which is that, on the pod spreading, I definitely see — like, most of the users that wander into our Slack asking questions about the scheduler are basically asking for this feature, so I think it's going to be really valuable when we deliver it. I also think it supersedes maybe some of the other affinity changes that are in the pipeline, and I'm wondering if there's some —

A: Yes.
So particularly, this one addresses one other feature that we were targeting in the past, which was supporting, like, max pods or something like that for anti-affinity. Basically, the idea there was that, instead of having anti-affinity to only one pod — which is the current state — we could have another parameter in our anti-affinity API to say what the maximum acceptable number of pods that we have anti-affinity to is. So, instead of one, we may say: okay, we can have, like, three pods in this failure domain, and after the three we no longer want to have any more pods in the failure domain. With this option, that particular feature is no longer needed.