From YouTube: 2021-01-21 Kubernetes SIG Scalability Meeting
Agenda and meeting notes - https://docs.google.com/document/d/1hEpf25qifVWztaeZPFmjNiJvPo-5JX1z0LSvvVY5G2g/edit?ts=5d1e2a5b
A: Okay, I believe we are recording, so welcome everyone to the SIG Scalability meeting today, January 21st.
C: Hi, hi Matt, hi Wojtek. My name is Swathi, I work for Red Hat, and I just had a couple of questions and wanted to chat with you about a use case that we had.
A: Okay, so before we do that, just a few announcements: we were finally able to figure out all the access issues to our SIG Scalability leads group, and also figured out how to publish the recordings to YouTube.
A: All right, so do you want to give us a quick overview of the tests you plan to run?
C: A quick introduction of the use case that we care about: it's topology-aware scheduling. There's a well-known issue that happens because the topology manager tries to align your resources, but the scheduler doesn't have visibility into resources at the NUMA-node level. Because of that, the scheduler sees nodes in the same manner regardless of the resources available per NUMA node and places the workloads, and then the topology manager, in case it is configured with a policy like single-numa-node, ends up with a topology affinity error. So that's essentially the problem that we are trying to solve. We have a few KEPs, and we have implementations as well, in-tree as well as out-of-tree. The first component is node feature discovery: it exposes CRDs per node to expose the capability and the resources at a per-NUMA-node level, so the scheduler has that visibility while it makes its scheduling decision.
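The gap described here can be sketched in a few lines of Go. This is purely illustrative (none of these functions are Kubernetes APIs): the scheduler-side check sees only the node's total free CPUs, while a single-numa-node topology policy requires the request to fit inside one NUMA node.

```go
package main

import "fmt"

// fitsNodeTotal mimics the scheduler's view: it only sees the sum of free
// CPUs across all NUMA nodes of a node.
func fitsNodeTotal(numaFreeCPUs []int, requested int) bool {
	total := 0
	for _, free := range numaFreeCPUs {
		total += free
	}
	return requested <= total
}

// fitsSingleNUMANode mimics the kubelet topology manager under a
// single-numa-node policy: the request must fit within one NUMA node.
func fitsSingleNUMANode(numaFreeCPUs []int, requested int) bool {
	for _, free := range numaFreeCPUs {
		if requested <= free {
			return true
		}
	}
	return false
}

func main() {
	numaFreeCPUs := []int{4, 4} // two NUMA nodes with 4 free CPUs each
	requested := 6              // a pod requesting 6 exclusive CPUs

	// The scheduler would place the pod (4+4 = 8 >= 6), but neither NUMA
	// node alone has 6 free CPUs, so the kubelet rejects the pod with a
	// topology affinity error.
	fmt.Println(fitsNodeTotal(numaFreeCPUs, requested))      // true
	fmt.Println(fitsSingleNUMANode(numaFreeCPUs, requested)) // false
}
```

A pod requesting 6 exclusive CPUs fits the node total and gets scheduled, but no single NUMA node can satisfy it: exactly the mismatch that produces the topology affinity error.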
C: When we went to SIG Architecture, they recommended that, first of all, we need an API review, and that in order to go ahead with this proposal we have to prove the solution and the design that we're proposing at a large scale. Initially they mentioned that maybe we can show that there's no regression at 5000-node scale. So, being new to the scale side of things, I wanted to get your opinion and see if there are some pointers from your side.
A: So, as I wrote on Slack, I think a good starting point might be our presubmit that's available for any Kubernetes PR. It runs 100 nodes. The only difficulty here is that we need to adjust it to enable all these topology-related features and also deploy your changes. So I would like to understand better what kind of changes we need to make in the cluster, in our test setup.
C: So I would say that would be the first requirement, a prerequisite for us. Then the other things would be deploying node feature discovery, which is the component responsible for examining the nodes and creating the CRDs, and then deploying the scheduler plugin. So we have...
B: Sorry, sorry, just to interrupt with a quick question: can we somehow fake that, to really prove the scalability part without having the end-to-end setup? My point here is that whatever is happening on the node itself, purely within the node, more or less seems not to affect scale. What affects scale is whatever is cluster-scoped: for example, every API call that is made by components running on the node.
C: So we did think about it, and probably that's an area we need to explore a bit further. But one of the things that we would like to test is running streams of pods and seeing, when they're requesting certain resources, how it works out: scenarios like how segmentation happens, and scenarios where there's a resource crunch; how does the scheduler plugin react in those scenarios? Obviously API load and the like is part of it.

I think we can achieve faking the resources through that, but we also want an understanding of how the cluster behaves and how these components are able to deal with the different, maybe edge, cases.

Probably for that we would like to test at scale, and I think the expectation from SIG Architecture is also that you showcase it in a real environment. They did mention simulators, if we can simulate it; I'm not sure whether that is almost equivalent to your point about faking the API components.
B: ...a smaller scale, but generating a bunch of, what do you call them, corner cases and load and stuff like that on a single node, and exercising what is happening there and whether it works and so on; and then the other part, which is more faked, at the large scale. Maybe what I'm saying doesn't make sense; I don't think I understand your proposal deeply enough to be able to say right now whether it makes sense and what other interactions you can expect at scale. So I'm definitely happy to be proved wrong here, that what I'm saying doesn't make any sense, but I think we should consider that.
C: Yeah, that's not a bad idea: having two sets of cases, one dealing with a smaller scale where we exercise all the edge cases, and then a larger scale where we just focus on showcasing that there's no regression with this new feature. That's not a bad idea. If there's anything I can do to maybe help you understand this, you can refer to the Slack thread and have a look. Yeah, we have a couple of KEPs I can drop.
C: Yeah, so I delivered a presentation to SIG Node; I can send the YouTube link of that, and I think that would give you enough. And I'll send you the link to the KEPs that we are proposing. All this while that we've been having these conversations, our proposal has slightly changed: now we are considering an out-of-tree scheduler plugin; initially it was just in-tree. So those minor things have changed, but overall, design-wise, that...
B: Yeah, I think we shouldn't point you to the presubmits but rather to the periodic jobs, because they are more... well, maybe, yeah; I mean, we don't have a five-thousand-node presubmit per se, and they are all kind of similar, so maybe that comment doesn't make much sense, but...
A: Yes, some links here. The problem is that the parts you will probably be most interested in are spread out: the setup of the cluster lives in one place, the config is actually in a different place, the load test lives in a different place, and all the kube-up stuff, which configures nodes and the master, also lives in a different place. So there are a lot of places, but everything should be doable in general, if you want to modify one part or another. So the question is just: what do you really need, compared to what we do right now?
A: I'll drop something later, to not waste time now. I wanted to mention just one thing: if it happens that we cannot really do that on GCE VMs, then I think it would be worth asking whether Kubemark can be somehow adapted for this test. And I just wanted to mention that there's a big space between end-to-end tests and benchmarks, I think, yeah.
A: So we started with end-to-end, and Wojtek mentioned simulating something, and I think you understand that we can just simulate the control plane, just the API server, which is probably not enough for you. But Kubemark, I'm sure you've heard of it, is a framework that we own, and it basically allows you to run end-to-end tests where the control plane is a regular control plane but the nodes are faked; we have the concept of a hollow node, a fake node. And the question is whether we can adapt Kubemark to basically simulate the behavior we want from the nodes and then test, because you have a real scheduler there, so I believe you can test what you need there. That would be awesome.
D: Yeah, hey, this is Abu. So I've been working on a PR to simplify the timeout path of the request in the API server that was mentioned here. Let me just copy the... yeah. Yeah, I just pasted it.
D: Yes, so if you have any feedback in terms of whether it has an impact on scalability, or any other areas... Basically, what happens today is that we have a fixed 60-second timeout in the timeout filter, and when the request enters the REST filter, we basically create another context with the specified timeout. What I'm trying to do here is add a new filter in the filter chain that creates the context with the user-specified timeout at the very beginning; and if a user doesn't specify any timeout, then you basically create a context of 60 seconds and go from there. That's pretty much it, in essence. But yeah, if you have any feedback, that would be really great.
A: So let me understand this better. Currently the situation is that if a user doesn't specify a timeout in the request, it is defaulted to 60 seconds, yes?
D: Yes; if the user doesn't specify a timeout... well, the timeout filter always uses a fixed 60-second timeout, no matter what the user specifies.
D: Okay, and when the request, basically the handler, enters the REST filter, I think we have code that creates a context based on the user-specified timeout, because the context acts like a tree. What I'm doing in the PR is basically doing that up front in the filter chain: as soon as the request enters the API server, we set up a deadline for the request, and that context is used throughout the handler chains and the REST handlers, so that it's more correct in that way.
A: I see. Yeah, I think I understand the correctness part, and that it's basically refactoring the calls to make them cleaner, but I'm worried about the user-facing changes here. So, yeah...
D: That's the change in behavior today. And I don't know: today, if any user specifies a malformed timeout, I guess we still serve the request with a 200, but with this PR we'll be sending 400s.
B: So, my other question. I'm generally supportive of that; the only thing that we may want to solve together with it (I didn't look into it, so maybe you already did, but just let me mention it) is that we are currently not cancelling any operations that are sent to etcd. We are setting the timeout for the etcd calls completely independently from anything, or we are not setting them at all, or something like that.

So in theory I can imagine that, with your PR, let's say that as a user I'm sending all my operations with, say, a one-second timeout, something very short, and a bunch of operations are taking more than a second because the cluster is super overloaded, something bad is happening. Then it's technically possible that we will exceed the in-flight limit, or whatever we have with priority and fairness; I can't remember the name, the concurrency limit or whatever it's called. It's possible that we can exceed it because, even though in the API server itself it works correctly, since we are cancelling the operation and so on, on the etcd layer we are not cancelling them, and they are still happening; they are still in flight.

So I guess it's a little bit independent from that PR, because we have this problem even now: if a list operation, for example, is taking more than a minute (or, I think, delete-collection is an even better example), the call to the API server is ending, but the underlying etcd call is still happening.
D: The call to etcd is already wired up with the context. So with this PR merged, the context now will have an actual deadline, and the call will fail if the context is properly used throughout the layers. Okay, so...
D: Based on what I've seen, I think it already solves it, but this is my follow-up task: to look at the storage layer deeply and see if the context is being wired properly.

B: I see; no, that makes perfect sense to me. If that's the case, then we've already solved the problem; if it's not the case, we just need to wire up the context, and that should be, I think, good to go.
A: I see, okay. Yes, so sorry for asking again, but I would like to understand how it works currently. So, once again, a question: before your change, we have this timeout filter, right, that sets the timeout to 60 seconds; so what happens with the user-provided timeout in the request?
D: Right. So previously, the timeout in the timeout filter was always 60 seconds, no matter what, and the context was created at the timeout filter. Now the timeout filter does not create any context; it just uses the context provided by the request-deadline filter, and that is always set to the right value: if there is a specified 10-second timeout, it would be 10 seconds; if the user doesn't specify anything, it would be 60 seconds, the default that we plumb through the command line.
A: Okay, so let me just ask a follow-up question. If we now have two timeouts, the 60-second one and the user-specified one, how do they interact with each other? Because I'm lost here, to be honest. So let's say I'm issuing a GET request with the timeout set to five seconds, right? So what really happens in that case now, without this change?
A: Okay, so I think just the last question from me: what does the timeout filter do? You said it's creating a goroutine, I think, there.
D: Yes; so what the timeout filter does is spin up a goroutine where it executes the rest of the handler chain, right, and then it basically waits up to 60 seconds to see if any response has been sent. Okay.
D: Please, please, if you have some time, review it; we need as many eyes as possible, to make sure that this doesn't have any other ripple effects.
A: Cool, yeah, totally, I will take a look, and I encourage others to also take a look.
B: Great. I think we've wanted to do that for a very long time, and no one had really sat down to it so far, so this is great to see.
D: Yeah, I think centralizing the timeout logic in a filter makes wiring up the context for the REST layer and for the admission-control plugin layer very, very easy; we don't have to worry about it. So I think there will be follow-up PRs that we can open after this one gets merged, appropriately.