From YouTube: Kubernetes SIG Scheduling Meeting - 2018-12-20
A
KubeCon was useful for getting some feedback from customers. There have been a lot of questions regarding scalability of the scheduler. Some people are asking about larger clusters and saying they cannot do it, partly because of the scheduler. We know that there are other bottlenecks in Kubernetes, but definitely we would like to address some of the issues with scalability.
A
The scheduler in particular, and hopefully we can do that in the next quarter, and maybe further work in the following ones. There have been some questions regarding the scheduling framework. A lot of people are waiting for that; they would like to customize the scheduler for their own workloads. We have a tentative plan to have some of these extension points in 1.14, and hopefully we can go from there.
A
Basically, we will change that and make it a little bit more generic, usable, and sort of future-proof. We'll see how that goes. But anyhow, our plan is to have something usable in 1.14, and then probably in 1.15 we want to have a fully fledged scheduling framework with a lot of extension points, something that we can use for building our own scheduler: basically moving our own priority and predicate functions, and hopefully the preemption logic, to plug-ins.
A
Sure, yeah, I can actually write something down. So on scalability, the questions were basically around two topics. One is throughput of the scheduler: some people want higher throughput from the scheduler. An example was Uber. Uber was comparing Kubernetes to the system that they operate in house, and they were saying that their system is a lot more performant compared to Kubernetes.
A
Although I feel like some of their numbers were a little off; we have some scalability numbers that do not fully agree with the numbers they presented. Maybe they had a slightly different setup, or maybe they measured something else. But anyhow, one request is basically having higher throughput. The other question is supporting larger clusters. The scheduler itself does not have much of a problem with larger clusters. Of course you would require more memory, but still the memory usage and everything is not completely unreasonable.
A
So I don't see any huge issues with raising the cluster size as far as the scheduler is concerned. However, it depends on what the expectations for throughput are. If you raise the cluster size to, say, 10,000 nodes, the Kubernetes scheduler's throughput drops to around 20 or 30 pods per second at maximum, which may not be acceptable for everyone.
A
So these are basically the two main things related to us as SIG Scheduling, but we know that Kubernetes has some other issues around scalability as well: the number of pods in the system, the number of namespaces, the number of volumes that you can attach to nodes, and, you know, DNS records, iptables records and rules. All of those can become an issue if you go beyond a certain cluster size.
A
Yeah, there are some folks who have done this. So, for example, I was at a talk by Atlassian, and those guys have done something like this: instead of having a small number of large clusters, they have gone with a large number of relatively small clusters; not very small, but relatively small.
A
There definitely is an issue about that, but even if you build tools to manage a large number of clusters, you still cannot achieve the same resource utilization in a large number of small clusters as in one large cluster with a lot of nodes. Partly because, you know, it's similar to sharing: if you basically give a cluster to one or two teams, let's say those guys always submit their jobs during the daytime, and then once the work hours are over, the cluster remains empty.
A
But if the cluster is larger and you share it among, I don't know, 50 teams, chances are there are at least a few teams, at all times of the 24 hours, that have some workloads to run on your cluster. So when you share it, usually the resource utilization is higher. You can overcommit your resources, and many of these options become available that don't exist in small clusters; by that I mean overcommitting usually doesn't happen in smaller clusters.
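As a rough illustration of this time-sharing argument, here is a small Go sketch; every number in it (50 teams, 100 cores per team, 8 busy hours, a 1.5x combined peak) is a made-up assumption for illustration, not a figure from the meeting.

```go
package main

import "fmt"

// utilization returns average demand as a percentage of provisioned capacity.
func utilization(capacity, avgDemand float64) float64 {
	return 100 * avgDemand / capacity
}

func main() {
	const teams = 50.0
	const peakPerTeam = 100.0         // peak cores per team (hypothetical)
	const activeFraction = 8.0 / 24.0 // each team busy roughly 8h a day

	// Dedicated clusters: each must be sized for its own team's peak.
	dedicated := teams * peakPerTeam
	avg := dedicated * activeFraction

	// Shared cluster: with staggered usage, assume the combined peak is
	// only 1.5x the average demand, so it can be provisioned much smaller.
	shared := avg * 1.5

	fmt.Printf("dedicated clusters: %.0f%% average utilization\n", utilization(dedicated, avg))
	fmt.Printf("one shared cluster: %.0f%% average utilization\n", utilization(shared, avg))
}
```

Under these assumptions the dedicated clusters sit around a third utilized while the shared cluster is roughly twice as efficient, which is the effect being described.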
B
What I want to add is that, for the scalability problem, as anyone notices, the scheduler is not the only thing in the critical path; I mean, it's not the only part of making it scalable. Actually, the etcd and API server path matters a lot. So if you check the etcd repo, you can see that we are trying to solve this problem from the etcd side as well. You can check the patches, and also we are refactoring the backend database of etcd.
A
Thanks for the update. I wasn't aware of all the efforts in that area, but I totally agree with you; etcd in particular is one of the bottlenecks here. We have always been aware of it and have been trying to address it, but I'm glad that there have been some important improvements over there. So is this mostly about etcd, or both etcd and the API server?
B
Currently, most work is done on the etcd side. You can track it upstream; we are doing this totally in the open. Almost everything is upstream, and you can check the PRs. After that we will also do something like indexing and pagination on the API server side, which will also help a lot, and also split events out of etcd, so we can have a much larger cluster, yeah.
A
All right, that's good. So yeah, maybe the scheduler is not the main bottleneck here, but the scheduler is certainly a problem in large clusters: if you go to something like five-thousand-node clusters, the scheduler becomes a problem. etcd, of course, becomes a problem if you create a lot of objects as well; etcd can handle a certain number of objects, but if you want a higher pod throughput on top of that, you cannot get it in a five-thousand-node cluster, at least today.
A
So that's why we're trying to address those as well. Scalability of Kubernetes in general is a problem that cannot be solved by fixing one component. This is a team effort, and many teams need to contribute to make it more scalable. As I said, there are many areas that need improvement, and the scheduler is one of those.

A
All right, so one more update about the scheduler. We identified a relatively serious issue in the preemption logic. We see there is a race condition between setting a nominated node name for a pod and the next scheduling cycle. So when we set the nominated node name for a pod within the first scheduling cycle, for, let's say, pod number one, then the next scheduling cycle, for pod number two, may start before that nominated-node event has arrived.
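A minimal sketch of the race being described, with invented names (nominatedPods, preempt, nextCycle) rather than the scheduler's real types: because the nomination is recorded through a deferred event, the next scheduling cycle can read a stale view and hand out the resources that preemption just freed.

```go
package main

import "fmt"

// nominatedPods maps node name -> pods nominated onto that node after a
// preemption; in the real scheduler this state lives in the scheduling queue.
var nominatedPods = map[string][]string{}

// preempt decides that pod should eventually run on node, but the shared
// state is only updated when the emitted event is processed later.
func preempt(pod, node string, events chan<- func()) {
	events <- func() {
		nominatedPods[node] = append(nominatedPods[node], pod)
	}
}

// nextCycle models the following scheduling cycle: it consults nominatedPods
// to avoid handing the freed-up resources to a different pod.
func nextCycle(node string) bool {
	return len(nominatedPods[node]) > 0 // false => stale view, i.e. the race
}

func main() {
	events := make(chan func(), 1)
	preempt("pod-1", "node-a", events)

	// The cycle for pod-2 starts before the nomination event is processed:
	fmt.Println("cycle 2 sees nomination:", nextCycle("node-a"))

	(<-events)() // the event arrives only now, too late for cycle 2
	fmt.Println("after event, nominated on node-a:", nominatedPods["node-a"])
}
```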
A
We are trying to address that; I will hopefully send a PR today or tomorrow, although today is unlikely. So that's about that. I also found a couple of other issues, again related to scalability, with setting the percentage of nodes that we score in each scheduling cycle dynamically. Basically, if it is not specified as an argument, then the scheduler has a logic that determines the percentage of nodes to score dynamically.
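As a sketch of what such adaptive logic can look like: small clusters are scored fully, and the percentage shrinks as the cluster grows, with a floor. The shape matches the idea described here, but the constants and function names below are illustrative assumptions, not the exact upstream code.

```go
package main

import "fmt"

const (
	minFeasibleNodesToFind = 100 // below this size, just score every node
	minPercentage          = 5   // never drop below 5% of the cluster
)

// numNodesToScore returns how many nodes a scheduling cycle should score
// for a cluster of the given size.
func numNodesToScore(numNodes int) int {
	if numNodes <= minFeasibleNodesToFind {
		return numNodes
	}
	// Linearly decrease from 50% toward the floor as the cluster grows.
	pct := 50 - numNodes/125
	if pct < minPercentage {
		pct = minPercentage
	}
	n := numNodes * pct / 100
	if n < minFeasibleNodesToFind {
		n = minFeasibleNodesToFind
	}
	return n
}

func main() {
	for _, n := range []int{50, 1000, 5000} {
		fmt.Printf("%d nodes -> score %d of them\n", n, numNodesToScore(n))
	}
}
```

The trade-off is the one discussed above: scoring fewer nodes keeps per-cycle latency bounded in very large clusters, at the cost of possibly missing the globally best node.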
E
Some thoughts from the GitHub discussion: I think the end result of the last discussion was basically that we were trying to see if we can separate the event handlers in two, so that wherever we add to the scheduling queue, we put it in the queue package, in scheduler/internal/queue, and wherever we add to the cache, we add it in the other package, right. But the issue that we were discussing was that that has an ordering problem.
E
It might happen that an event goes into one of the two and not the other. And the way we have the code structured in the scheduler, we have that dependency; I don't think we can easily restructure it. So, without doing too much re-architecture, it feels like adding to both the scheduler cache and the pod queue should happen together, as an atomic operation.
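A minimal sketch of that atomic option, with made-up types rather than the scheduler's real ones: a single owner holds one lock, and every event applies its cache update and its queue update inside the same critical section, so no reader can observe one without the other.

```go
package main

import (
	"fmt"
	"sync"
)

type cache struct{ pods map[string]bool }
type queue struct{ pods []string }

// binder owns both structures and serializes every event through one mutex.
type binder struct {
	mu    sync.Mutex
	cache cache
	queue queue
}

// onPodAdd applies both updates under the same critical section, so the
// cache and the queue can never disagree about whether the pod exists.
func (b *binder) onPodAdd(name string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.cache.pods[name] = true
	b.queue.pods = append(b.queue.pods, name)
}

func main() {
	b := &binder{cache: cache{pods: map[string]bool{}}}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			b.onPodAdd(fmt.Sprintf("pod-%d", i))
		}(i)
	}
	wg.Wait()
	// The two views always agree, even under concurrent events.
	fmt.Println(len(b.cache.pods) == len(b.queue.pods))
}
```

The cost of this design is exactly the coupling discussed next: one place has to know about both the cache and the queue.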
E
So with that in mind: right now those add-event handlers are being called from factory.go, and my current PR moves them to scheduler.go, in the outer scheduler package. So yeah, I guess I have laid out a few other options. For example, should we move those to a common package parallel to queue and cache? We have internal/queue and internal/cache; so do something like internal/common and move everything there, or keep it the way it is in the current PR, which is in package scheduler.
A
Yeah, for sure. I actually thought about what Jonathan said as well. We definitely have a logical dependency on the ordering of updating the queue and the cache, and we cannot easily get rid of it. So one option is to basically have the event handlers in one place, but make that one place aware of the other. For example, you could move the event handlers to the cache, but pass a pointer to the queue into the cache, so that the cache updates the queue as well. But this is not, I don't know.
A
This is not necessarily a very clean design either; we are solving one problem and creating another, right. So I kind of like the idea of having maybe a separate common package, but it doesn't have to be internal/common. It can be, like, internal/eventhandlers or something like that, sure, yeah. Something like that sounds to me like a better option, but I have to actually look at the code eventually to see if it all fits.
E
I guess the other question is: I think what Jonathan was suggesting was to pass the informers into NewCache and NewQueue so that we can add them inside, right. So if we create a new package and put all the definitions of those event handlers there, then where do we call that from? I guess that's the question.
E
I think the main issue here is that we didn't want the event handlers to be in factory.go, which is like the internal part of the scheduler, right; they don't really belong to the factory part. They belong to the outer part. So should we keep them in the scheduler package and not in something internal?
E
It could be something in internal/common or internal/eventhandler as well, but then we will not be able to call it: if we put it inside internal, we will not be able to call it from something outside internal, right. And if we put it inside internal, then we cannot put it in the queue or the cache, because that will have the ordering issue. Or, I mean, the other way to look at it...
F
Yeah, right here. So regarding the feature to support pod affinity jointly on multiple pods: basically the API is settled, and also the performance test. The benchmark test, when there isn't any affinity, shows a decent performance difference; comparing the same codebase with and without the PR, I do see a slight increase of 6 or 7 percent.
F
And I mean I tested it against the current code base, right, and I also tested against the 1.13 code base, both without my PR. There seems to be a slight performance increase between these two versions. I guess maybe it's because we have done a lot of refactoring; we also had some issues, like what we are doing with the timestamp of pods of the same priority, right. So I'm not exactly sure whether it's because of that that I see some performance increase.
F
Okay, so I think it's actually correlated with our benchmark testing. What can cut performance is when we have a lot of affinity terms, which is not the typical case, right; we usually don't have too many pods with affinity. The other factor that may impact performance is when, for example, a lot of pods can satisfy the affinity and they are in different topology domains. In our benchmark case we just put them all in the same zone, so there is no sorting, and the intersection-calculation logic wasn't hit.
A
Thank you very much for the update. Is there anything else? No, this is it? Okay. And thank you very much, Harry, for the removal of the equivalence cache. I know that was a huge effort, because we kept running into issues with rebasing that PR, but finally it's finished. Thank you very much for your help, and we are looking forward to seeing the next phase of the project. Yeah, thank you. All right, guys, this is the end of our meeting.