From YouTube: Kubernetes SIG Scheduling Meeting - 2019-02-21
B: Is this the new implementation that I suggested, with that one map that basically has information, or maybe just keys, for unschedulable pods?
A: Yeah, yes. I see, I see, yeah. One of the issues that we've seen in the past has always been that adding more shared data structures, such as a map, could potentially cause a slowdown.
A: It's a little bit unfortunate that this one also causes a slowdown. So let me think. Okay, for the information of others, what we are doing here is that once the scheduler finds out that a pod is unschedulable, it looks at the pod, and if it has a parent reference, which essentially means that the pod belongs to another collection, like, for example, a ReplicaSet and so on...
A: ...it will add that parent reference as a key to a set. What that means is that this set is basically a set of pods that are potentially unschedulable. So it means that all the other pods which belonged to the same collection will also be unschedulable, because they share the exact same specification.
A: We don't store any particular value for it, but anyway, once this key is in the set, it means that all the other pods which are from the same collection will also be considered unschedulable. This is a performance improvement, because the scheduler does not need to check all the other pods in the collection. It can just look up the map and figure out that, okay, this other pod is also unschedulable because it shares the exact same parent.
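A minimal Go sketch of the set A describes above. The parentRef key type and function names are illustrative assumptions, not the actual scheduler code:

```go
// Sketch of the idea described above: a set keyed by a pod's parent
// reference, marking whole collections of identical pods as
// unschedulable at once.
package main

import "fmt"

// parentRef is a hypothetical key identifying the owning collection
// (e.g. a ReplicaSet), derived from the pod's parent reference.
type parentRef struct {
	Kind      string
	Namespace string
	Name      string
}

// unschedulableParents is a set: only key membership matters, so the
// stored value is the zero-size struct{}.
var unschedulableParents = map[parentRef]struct{}{}

func markUnschedulable(ref parentRef) {
	unschedulableParents[ref] = struct{}{}
}

// isKnownUnschedulable lets the scheduler skip pods whose siblings
// already failed scheduling, since they share the same spec.
func isKnownUnschedulable(ref parentRef) bool {
	_, ok := unschedulableParents[ref]
	return ok
}

func main() {
	rs := parentRef{Kind: "ReplicaSet", Namespace: "default", Name: "web"}
	markUnschedulable(rs)
	fmt.Println(isKnownUnschedulable(rs)) // true: siblings can be skipped
}
```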
A: So, a couple of suggestions for this one. One potential suggestion: you may want to consider using sync.Map. sync.Map is one of the map implementations in Go which is apparently more efficient compared to acquiring a lock over a regular map. I don't know if you have tried that one, but it might help a little bit. If it doesn't, we need to revisit this. I don't have any great suggestion right now off the top of my head, but I can take a look.
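A minimal sketch of the sync.Map suggestion; sync.Map is in the Go standard library, though whether it actually helps here is, as noted above, untested:

```go
// sync.Map avoids taking an explicit mutex on every access, which can
// help when reads dominate writes.
package main

import (
	"fmt"
	"sync"
)

func main() {
	var unschedulableParents sync.Map // keys: parent-reference strings

	// Store only key membership; the value is unused.
	unschedulableParents.Store("default/ReplicaSet/web", struct{}{})

	// Safe for concurrent use without additional locking.
	_, ok := unschedulableParents.Load("default/ReplicaSet/web")
	fmt.Println(ok) // true

	unschedulableParents.Delete("default/ReplicaSet/web")
}
```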
Another thing that I was thinking is: do we actually need to acquire a lock to access this map? The reason I am saying this is that the scheduler accesses this map in its single-threaded main loop, right, and it only adds to the map in its own single-threaded main loop as well. So it may not require any locks.
A: Yeah, you're right. Okay, yeah, the deletion is a problem, you're right. So, okay, again for the information of others: when an event happens in the cluster that could potentially make pods schedulable, we need to remove all these entries from this set so that the scheduler retries scheduling some of these pods. Once the scheduler determines that they are still unschedulable, we'll add them back to the set. But we need to remove these entries at some point.
A: And that point is, you know, the point when an event happens in the cluster that potentially makes pods schedulable. An example of such events is the deletion of a pod from the cluster, or the addition of another node to the cluster, and things of that sort, which happen in parallel to the main scheduling loop. That requires locking. You're right, you're absolutely right, Wei. I can take another look and see if we can find any better solution for this. But thanks for working on it, and thanks for sharing the problem with us.
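A sketch of the locking constraint just discussed, with made-up names: adds happen in the single-threaded scheduling loop, but deletions come from event handlers running in parallel, so the set needs a mutex (or a sync.Map):

```go
// The main scheduling loop adds and reads entries, but cluster event
// handlers delete them from other goroutines, so a mutex is required.
package main

import "sync"

type unschedulableSet struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func newUnschedulableSet() *unschedulableSet {
	return &unschedulableSet{keys: map[string]struct{}{}}
}

// add runs in the single-threaded scheduling loop.
func (s *unschedulableSet) add(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.keys[key] = struct{}{}
}

// clear runs from informer event handlers (e.g. pod deleted, node
// added), concurrently with the scheduling loop, hence the lock.
func (s *unschedulableSet) clear() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.keys = map[string]struct{}{}
}

func main() {
	s := newUnschedulableSet()
	s.add("default/ReplicaSet/web") // scheduling-loop path
	s.clear()                       // event-handler path
}
```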
A: All right, I also have good news today. Our scalability results are in, and we hit a pretty aggressive goal that we had: to schedule a hundred pods per second in a 5,000-node cluster. This is actually great. It was the result of multiple algorithmic optimizations that we made to the scheduler in the past six months or so, and the last one of them was a PR that got merged recently, basically just yesterday. Today's results were pretty promising.
A: So we actually see that the average in our scalability results hits 100 pods per second, and that's our rate limit. Essentially, the scheduler might potentially even exceed 100 pods per second, but because of this rate limit that we have in our scalability test, it cannot really go beyond 100. If that rate limit were lifted, it could potentially even go beyond 100 pods per second. I am trying to work with the scalability team to maybe raise that rate limit, because, going forward...
A: ...if we make any further optimizations to the scheduler, we will not see the results. We need to have this rate limit completely lifted, or at least raised to a larger number, so that we can keep observing our improvements. Also, if there is a change that reduces performance only marginally, we won't notice it: for example, since we don't know whether we currently achieve 110 pods per second or not, a performance degradation that takes us from 110 pods per second down to 100 pods per second won't show up.
A: All right, sorry, Wei, go ahead.

C: Yeah, that's really a milestone.

A: Yeah, all right. So there is another PR that Wei had sent out. Thank you very much, Wei, for that PR. That one is to basically not update the API server every time the scheduler tries scheduling a pod and determines that the pod is unschedulable. Recently we actually made a change so that every time the scheduler finds out that a pod is unschedulable, it adds a timestamp to the pod.
A: This timestamp is then used in our scheduling queue to sort pods which have the same priority. Basically, pods that were more recently retried go to the back of the queue. This is like a fairness-improvement mechanism in the scheduler. But updating the API server at every scheduling cycle is not necessarily efficient, especially in larger clusters with many unschedulable pods. This eats up a lot of our bandwidth to the API server. I was just telling you about the rate limit that exists.
A: That rate limit applies to any request that we send to the API server, including requests to update pod status. So it's important to save that bandwidth as much as possible, and this PR that Wei has sent is almost ready. I had just a couple, or like several, minor comments on it, so it's going to be merged soon. It's going to remove that recent change that updates the timestamp of a pod after every scheduling attempt. Instead, it keeps that timestamp in the scheduler's internal state.
A: There is really no reason to update the API server about this timestamp. This timestamp is something that is only valuable to the scheduler, so hopefully this will be merged soon and will further improve the efficiency of the scheduler, because some of the updates today... [inaudible].
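A sketch, with illustrative names rather than the real scheduler types, of keeping the last-attempt timestamp in scheduler-internal state instead of writing it to the API server on every failed attempt:

```go
// Only the scheduler reads this timestamp, so there is no need to
// persist it via a pod-status update to the API server.
package main

import (
	"fmt"
	"time"
)

// attemptTimes records, per pod key ("namespace/name"), when the
// scheduler last tried and failed to schedule it.
type attemptTimes map[string]time.Time

func (a attemptTimes) recordFailure(podKey string) {
	a[podKey] = time.Now()
}

// less orders equal-priority pods so that the least recently retried
// pod comes first, i.e. recently retried pods go to the back of the
// queue -- the fairness mechanism described above.
func (a attemptTimes) less(podKey1, podKey2 string) bool {
	return a[podKey1].Before(a[podKey2])
}

func main() {
	at := attemptTimes{}
	at.recordFailure("default/web-1")
	time.Sleep(time.Millisecond)
	at.recordFailure("default/web-2")
	fmt.Println(at.less("default/web-1", "default/web-2")) // true
}
```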
D: Yeah, so we tried... we try to reuse some pod priorities from the default scheduler. Now, when we try to reuse the image locality priority, we found that it depends on the metadata, and we only depend on the total number in this priority. But when we build the metadata from the factory, we have to pass several listers: there is the service lister, and maybe some other listers. So, yeah, I think...
D: ...maybe we can enhance this factory. Maybe... I don't have any solutions right now, but what I'd like to raise here is that maybe we can simplify the interface. For example, some user just wants one priority, and this priority only depends on maybe one or two pieces of the metadata. So we could have some way to build that easily.
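A hypothetical sketch of the simplification Wei is suggesting; none of these types exist in kube-scheduler, they only illustrate letting a priority declare the few pieces of metadata it needs instead of requiring every lister:

```go
package main

import "fmt"

// PriorityMetadata is a narrow view over scheduler metadata; a real
// implementation would carry pod counts, image state, and so on.
type PriorityMetadata struct {
	TotalPodCount int
}

// MetadataProducer builds only the fields a given priority depends on.
type MetadataProducer func() PriorityMetadata

// podCountOnly is a producer for priorities that, like the case
// described above, depend only on a single total count.
func podCountOnly(count int) MetadataProducer {
	return func() PriorityMetadata {
		return PriorityMetadata{TotalPodCount: count}
	}
}

func main() {
	produce := podCountOnly(10)
	fmt.Println(produce().TotalPodCount) // 10, no listers required
}
```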
A: You know, every time that something like this comes up, I think: why don't we have the scheduling framework yet? Yeah.
A lot of those issues would have gone away, sort of automatically, with the framework. But, all right, I see your point. I don't have any great solution at the moment, off the top of my head, but I agree this is worth exploring. We should take a look and see how we can improve the situation. Yes, all right. So it looks like Wei also had an update for us. Thank you very much, Wei.
A: Wei has added a new KEP for evenly distributing pods in an arbitrary topology. This is actually an effort that we have recently been working on. Basically, the idea here is that we want to distribute a number of pods across a particular topology domain.
A topology domain, for example, can be a zone or multiple zones, or it can even be a node, or a region, or whatever, depending on what labels you put on your nodes. So today we have anti-affinity.
A: The problem with anti-affinity is that once you put anti-affinity on, let's say, your pods, you basically tell the scheduler to not put more than one of such pods in that particular topology that you have defined in the anti-affinity rule. For example, the topology can be node: by setting that, you tell the scheduler, don't put more than one pod of this type on a node.
A: Now, let's say that you have five nodes in a cluster and you have ten of such pods. Putting anti-affinity on the pods causes five of them to not get scheduled at all. Oftentimes, users actually want to achieve distribution instead of just anti-affinity. So in the case where they have, like, ten pods and five nodes in the cluster, they oftentimes want to have two pods on each node, not necessarily one pod per node with five of them staying unscheduled.
A: So we are trying to address this problem by adding a new feature to Kubernetes where users can specify a topology, very similar to anti-affinity. But the difference between this and anti-affinity is basically that this one tells the scheduler to distribute pods evenly in a particular topology. So Wei has added a KEP. I am trying... okay, let me copy the link into our meeting notes.
A: So after this is, you know, implemented and merged, we may revisit the anti-affinity we have today. Anti-affinity is one source of scalability issues in the scheduler. As many of you know, we've been trying to address this, and we have achieved quite a lot. As you may be aware, we have achieved close to a 120x performance improvement for that feature, but the feature was a thousand times slower than many of the other scheduling features. So even after the 120x performance improvement, we are still like 8 to 10 times slower than other features.
A: 8 to 10 times is still a pretty large number. Most of the reason for this slower performance of the feature is that it's very, very flexible. Basically, once we have a pod with anti-affinity or affinity in the cluster, then for every other pod that is being scheduled, we need to check it and make sure that the anti-affinity rules of the running pods are honored.

C: So you're changing this?

A: Well, we're not changing it yet. We are thinking about changing it so that we only provide anti-affinity...
A: ...on the same node, not necessarily in any arbitrary topology, and then we add this new feature to evenly distribute in an arbitrary topology. Basically, our reason for making this change is that most people really want to use anti-affinity in an arbitrary topology domain to achieve an even distribution of their pods. So once we have another feature which provides even distribution, we probably won't need to have anti-affinity in arbitrary topology domains, and anti-affinity on the node will most probably be enough for pretty much all use cases.
C: Right, that makes sense.

A: All right, thank you very much. Yeah, I will definitely take a look at your KEP. Other people who are interested, please go ahead and take a look at it; the link is in our meeting notes. All right, we are 24 minutes into our meeting, and I don't have any further updates for you. The only thing that I want to mention is that this week is a little bit tough for us at Google.
A: We are in this, like, performance review cycle, so most of us are busy with other stuff. So I apologize for not being very responsive in reviewing all your PRs in a timely fashion, but I will get to those. I know, Wei, I owe you one review for one of your PRs, and also I didn't review your KEP yet.

D: Yeah, that's totally fine!
A: Other comments? Yeah, go ahead.

C: Yeah, I just want to emphasize the idea in my KEP. So originally, when we thought about this pain point from a customer, I proposed an original proposal named "max pods per topology", right? But we discussed it and exchanged a lot of ideas, and the thing with that idea is you have to define a specific number. This is pretty challenging for the users, right, because they have to tune it or something.
C: Yeah, so as an improvement, we introduced that kind of value called maxSkew. maxSkew describes the degree of imbalance of the pod spreading in the cluster. For example, the default value will probably be one. That means, for ten pods applied on a five-node cluster, the deployment rollout could be one on each node; then the sixth one, for example, is deployed on the first node, and at most a single additional one can be deployed on each of the rest. So...
C: And maybe it still gives the flexibility. For example, the distribution can be, for example, 2/0 for some instances if they want. For example, if right now the distribution is 2/1 and the skew value we set is 1, then it cannot become 3/1; it can only be 2/2, right? The maxSkew just describes the imbalance. So we will see. Of course, if we hard-code it to one and only provide a boolean value to the user, it's easier.
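A sketch of the maxSkew idea described above: skew is the difference between the most- and least-loaded topology domains, and a placement is allowed only if it keeps skew within maxSkew. The field name maxSkew comes from the KEP discussion; the function and data are illustrative, not the eventual API:

```go
package main

import "fmt"

// skewAfterPlacing returns the imbalance if one more pod lands in the
// given domain, where counts maps topology domain -> matching pods.
func skewAfterPlacing(counts map[string]int, domain string) int {
	min, max := int(^uint(0)>>1), 0
	for d, c := range counts {
		if d == domain {
			c++
		}
		if c < min {
			min = c
		}
		if c > max {
			max = c
		}
	}
	return max - min
}

func main() {
	const maxSkew = 1
	counts := map[string]int{"node-1": 2, "node-2": 1}
	// Placing on node-1 gives 3/1 (skew 2): rejected with maxSkew=1.
	fmt.Println(skewAfterPlacing(counts, "node-1") <= maxSkew) // false
	// Placing on node-2 gives 2/2 (skew 0): allowed.
	fmt.Println(skewAfterPlacing(counts, "node-2") <= maxSkew) // true
}
```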