From YouTube: 2017-03-23 Kubernetes SIG Scaling - Weekly Meeting
Description
Public meeting recording of the Kubernetes Scalability SIG.
See comments for Zoom chat log.
A
They've gone through, in the latest version of etcd, to make it nice and seamless to integrate in-process. So I don't know whether or not we want to have a configuration or a flag to potentially have an embedded version of etcd in the API server, for people that want that configuration. That allows you to get rid of the caches, because everything's in memory, so the queries will be fast. You also get rid of the serialization over the wire, so you don't pay that cost, for people that have that configuration.
B
I think that, yes, I agree that some caches we may now be able to get rid of, but it's not all of them. So, for example, the watch cache, I think, will still have to be there, because basically, if we are serving watches, then in etcd we obviously have serialized data and we are deserializing it, unless we would plug typed watches directly into etcd, which won't be the case.
B
If we don't have this cache, then it would mean that we would be deserializing the same data over and over again. So it's not that we will be able to remove every cache that we have. That said, I think it can potentially have some advantages, but I wouldn't like to be at the point where this is the only mode. We can consider making it an option, but it should be just an option.
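The trade-off just described, that without a cache the server ends up deserializing the same stored bytes once per watcher, can be sketched as a memoization layer. This is an illustrative stand-alone sketch, not the actual apiserver watch-cache code; the `Pod` type and the key-plus-resource-version cache keying are assumptions:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Pod is a stand-in for a decoded API object (hypothetical, for illustration).
type Pod struct {
	Name            string `json:"name"`
	ResourceVersion string `json:"resourceVersion"`
}

// decodeCache memoizes deserialized objects by key and resource version, so
// serving the same stored bytes to many watchers decodes them only once.
type decodeCache struct {
	mu      sync.Mutex
	entries map[string]*Pod
	decodes int // counts actual JSON decodes, to show the saving
}

func newDecodeCache() *decodeCache {
	return &decodeCache{entries: map[string]*Pod{}}
}

// Get returns the decoded object for (key, rv), deserializing raw only on a miss.
func (c *decodeCache) Get(key, rv string, raw []byte) (*Pod, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	ck := key + "@" + rv
	if p, ok := c.entries[ck]; ok {
		return p, nil
	}
	p := &Pod{}
	if err := json.Unmarshal(raw, p); err != nil {
		return nil, err
	}
	c.decodes++
	c.entries[ck] = p
	return p, nil
}

func main() {
	raw := []byte(`{"name":"web-0","resourceVersion":"42"}`)
	c := newDecodeCache()
	// Ten watchers ask for the same object at the same resource version.
	for i := 0; i < 10; i++ {
		p, err := c.Get("pods/default/web-0", "42", raw)
		if err != nil || p.Name != "web-0" {
			panic("unexpected decode result")
		}
	}
	fmt.Println("decodes:", c.decodes) // prints "decodes: 1"
}
```

Dropping the cache here would turn that one decode into one per watcher per event, which is the repeated-deserialization cost being discussed.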
A
Yeah, I agree, I agree. I'm not saying it's the one true path; there's obviously a trade-off there. But if you wanted a simplified deployment, this makes things a little bit easier; if you're trying to manage the number of bits that you're trying to deploy, it makes it easier. There are also some slight performance advantages there too.
A
We actually, you know, full disclosure, we actually did this in OpenShift, and you can provide the same endpoints that you normally would provide. What we did is we brought it up on localhost, so that way, if you're on the machine, you can do all the etcdctl operations, but you have to be on localhost.
C
Interesting, okay. I think it's worth looking at. I think, you know, it expands the test matrix yet again, but that's the nature of these things.
C
If we were going to go to the sort of hyperkube model, where it's just one process that brings stuff up, I think that starts to make some sense. But as long as we still have separate binaries, and we have to do bootkube-type things, you know, I'm not sure how big of a gain it ends up being.
B
We've seen some of that on some number of nodes already, but it was mostly because of the API QPS being starved, and the kubelet not being able to send the update to the node status itself. But that was a few weeks ago. So it obviously depends on the load.
B
So if you generate some very high load on the kubelet, for example if you create a lot of secrets or config maps for the pods that are running on that kubelet, and the kubelet is periodically re-fetching that data, that may potentially affect its ability to update the node status. So it very highly depends on the load that you generate and on the pods that are running on that node, and on whether this is really the problem that you are observing, because it can potentially be something else.
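A back-of-the-envelope sketch of the load pattern just described, a kubelet periodically re-fetching the secrets and config maps its pods reference; all the numbers here (pods per node, objects per pod, sync period) are assumptions for illustration, not Kubernetes defaults:

```go
package main

import "fmt"

// refetchQPS estimates the steady-state API request rate generated by one
// kubelet that re-fetches every referenced object once per sync period.
// Purely illustrative arithmetic; real kubelet caching behavior differs.
func refetchQPS(podsPerNode, objectsPerPod int, syncPeriodSeconds float64) float64 {
	return float64(podsPerNode*objectsPerPod) / syncPeriodSeconds
}

func main() {
	// Assumed: 100 pods on the node, 2 secrets/config maps each, 60s period.
	perNode := refetchQPS(100, 2, 60)
	fmt.Printf("per node: %.2f QPS\n", perNode)
	// Multiplied across a large cluster, this background load alone can
	// crowd out the kubelet's own node-status updates at the API server.
	fmt.Printf("1000 nodes: %.0f QPS\n", perNode*1000)
}
```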
D
So is this NotReady condition a normal part of the node lifecycle? I don't actually know how it works. I mean, I guess my assumption is that it comes from the kubelet not being able to talk to the API server. Sorry, can you repeat? "Not ready": what is that a measurement of? I don't even know how that's measured, whether by the kubelet or by the API server, once a kubelet is running.
B
We were seeing that, especially when we were having a lot of config maps and a lot of secrets and stuff like that on a given node. But it wasn't that the kubelet was ready or unready for a long time; it was just flapping between ready and unready. So it might be something completely different from what you are observing.
A
So you brought up a point there that I think is worthwhile. It wasn't the original topic, but it was the question of whether our QPS limits are still really fictitious, given all the changes we've made over history. I actually have a tool right now that I'm working on to rip through and do a ridiculous number of queries, and I'm a little afraid of how hard it could hit a loaded system. So the QPS estimates: are they in scope for, like, 1.7, to start to update them?
A
I don't know, maybe it's a reasonable expectation to have, but it's totally fanciful. We're making up... we're trading one magic number for a different magic number that's slightly variable. Yeah, I think it's reasonable to up the QPS limits to some value, but it needs to be qualified, right? That's the problem.
A
The biggest limitation on QPS was CPU; originally it was CPU. So back when we put these limiters in, it was way back in the 1.0 time-frame days. That was the original limiter, so maybe we bumped them back then, I think in 1.1, and then we've kind of left them there across multiple releases. And in the 1.2 time frame, you know, a 300-node cluster could pretty much completely...
A
We could probably easily write a benchmarking tool, which is not a bad idea honestly, to basically hammer the API server without a full end-to-end test, because it's just client stuff. You could load etcd up with fake data and just constantly have a client querying the heck out of it, and see what's the threshold by which we exceed some measure of okay-ness. We'd have to specify that: like, if we exceed four cores, then we came in over budget, or something to that effect.
B
Oh sorry, oh listen! So I'm saying that this benchmark-the-API-server idea: I've had it on my to-do list, on my backlog, for more than two quarters now, and it was always deprioritized. Oh yeah, I definitely agree that we should do it at some point. Maybe I will have time to look at it; please contribute to it.
B
Marcus
on
vacation
now
I
was
pretty
busy
11.6
also
so
like
I
I
think
that,
except
from
like
what
we
were
discussing
two
weeks
of
all
I,
think
I
think
that
mark
send
it
like
wider
to
the
whole
DK
death.
I.
Think,
sorry,
oh
sorry,
not
a
kubernetes
net
and
I
think
there
weren't
any
like
any
concerns
about
it.
Sometime
look
so
I
think
that
we
like
within
a
like.
We
should
be
back
in
a
month
I,
so
how
next
week
is
actually
Cuban
so
probably
into
like
make
a
final
decision.
A
Yeah, because these types of things, not only the formal efforts, which are super important for the long term, but these other benchmarks are also good. Because right now we kind of ship something out in the field, you know, and it's got governors in place, and users don't understand why things time out, and they write issues. It would be nice to have it more formalized, honestly.
A
Ideally, we would have some type of document for this scalability SIG that outlines: here are the knobs, these are their values, and here's the history behind them, so that somebody can go through it and new people that come on board could understand it. We've never actually done that. I think it's probably worthwhile for someone to take an action item to do that. I am swamped, though.
D
Yeah, I mean, the only other thing was rolling back to the node issue we were talking about. I don't know if we want to measure the not-ready stuff also, as a measurement of cluster failure, in this end-to-end or in another end-to-end, but it seems like it might be something it would be good to have a test for, maybe attached to the scale tests that we already have.
D
The other thing is the story of disks. I don't know if we really do anything there. There are still stores on disks that could be disruptive, but I don't know: is there any plan upstream for stressing disks out? That's the first question. And then, second of all, do we think we should actually have a metric for the not-ready stuff? I think we are, or we should, measure that, maybe in Density or something like that, as part of the e2e tests, to see if nodes go not-ready.
B
Yet
strips
the
I
think
that
we
are,
or
maybe
we
are
only
like,
I'm
sure
we
are
testing
something
about
not
readiness,
but
it
might
be
only
at
the
end
of
the
test
potentially
which
made
so.
It
means
that,
with
my
ID
Wharf
like
doing
something
Mordor
regarding
like
disk
I
think
that,
yes,
we
definitely
want
to
do
at
some
point,
but
we
are
still
more
more
focusing
on
a
little
bit.
I
would
call
it
stateless
work
out
then
stateful,
but
yes,
I
D,
like
I
whatever.
A
Yeah, so the problem with this, and you probably want to talk to the node team, is that there are no fences for disks. So if you started to do disk-stressing things, where you have multiple writers, you know, whether you're actually exercising local disk versus attached volumes, there are two parts right there.
A
There
are
no
Governors
that
exist
for
disk
and
there
are
no
Governors
for
network,
so
you
can
kill
it
in
two
ways
right,
so
you
can
kill
the
local
disk
by
having
too
many
things
writing
concurrently,
you
kill
the
network
and
basically
denial
service
attack,
the
control
plane
or
having
too
many
TVs.
That
exercise
and
stress
that
way
too.