From YouTube: 2021-09-16 Kubernetes SIG Scalability Meeting
Description
Agenda and meeting notes - https://docs.google.com/document/d/1hEpf25qifVWztaeZPFmjNiJvPo-5JX1z0LSvvVY5G2g/edit?ts=5d1e2a5b
A: So, hey everyone, this is the SIG Scalability meeting, and today is 16 September 2021. On today's agenda we have two points. So, the first one... I think this point has actually been on the agenda for quite a long time, and honestly, I'm not sure what its current status is right now.
B: Yeah, so I'm not sure if anyone else has started working on it, but I have some cycles to work on this. I'm pretty new to the scalability tests, though, and I'm not very familiar with the repo, so it would help if somebody could provide me some pointers. Also, for the P&F (API Priority and Fairness) test: is it going to be a new test, or are you going to extend an existing test?
A: Yeah, exactly, that's what I was actually going to propose. So do we have an idea in mind of what exactly we want to test regarding P&F?
B: For example, I would start with a basic question. In these scalability tests we have tests at different scales, right? I know we have the 5k-node tests, the 100-node tests, and so on. So the first question is: are we going to look at how P&F behaves at a very large scale, say a five-thousand-node cluster?
A: So I think that currently P&F is not actually blocking that many requests, right? And the question is, because our tests also measure the latency of calls, I'm not sure how exactly we want to test it: if we reduce, let's say, the concurrency limits in P&F, then obviously the latency will increase. So it's a trade-off. But I don't know, do you have any ideas?
C: On how we want to test it: I think it shouldn't be the same test, it should be a separate test. And I would focus not necessarily on a very large cluster, but on generating a much higher load and ensuring that the API server won't fall over, more or less, right? So it's more like a reliability test. More of a reliability test, yes.
B: So drive the concurrency to a very high limit and then see. So today, what's the usual number of requests in flight for our largest-scale test? How far do we go, how many requests are in flight at any time? Do you have any rough idea?
C: I can't remember off the top of my head, but I think it's more in the hundreds than anything else.
B: Okay.
A: So we have around 600 set. And how much do we actually reach, like 500?
B: Yeah, in the past, probably when P&F was in alpha, I tried... I did some testing, mostly a benchmark test, where I didn't know how to actually drive the concurrency up.
B: So what I did was actually add a filter that basically waits for, let's say, one second for all the requests that originate from a certain user, and basically I drove the concurrency up through that. But that requires, you know, changing the API server, so I'm not sure, because all the tests I've seen in scalability are basically end-to-end, like real tests on a real cluster, not any integration type.
C: So it's a little bit more expensive in that sense, so we should probably have a baseline with that webhook doing nothing, but it will let you simulate what you basically want.
B: So which repo is it in? Is it in the e2e test repo?
B: Okay, sounds good, thank you. Yeah, okay, so I think the first iteration I could try to look at is using an external webhook to add some artificial latency, right? So this will drive up the concurrency, and then run the test. And I also need to measure... I don't know what's there by default.
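A minimal sketch of what such a latency-injecting webhook could look like, assuming a plain HTTPS endpoint and a fixed one-second delay; the handler shape, paths, and delay value are illustrative assumptions, not the actual test setup:

```go
// Hypothetical "slow" validating admission webhook, used only to hold
// requests open and drive up apiserver concurrency.
package main

import (
	"encoding/json"
	"io"
	"log"
	"net/http"
	"time"
)

// Minimal subset of the AdmissionReview schema: just enough to echo the
// request UID back in an "allowed" response.
type admissionRequest struct {
	UID string `json:"uid"`
}

type admissionResponse struct {
	UID     string `json:"uid"`
	Allowed bool   `json:"allowed"`
}

type admissionReview struct {
	APIVersion string             `json:"apiVersion"`
	Kind       string             `json:"kind"`
	Request    *admissionRequest  `json:"request,omitempty"`
	Response   *admissionResponse `json:"response,omitempty"`
}

func handle(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	var review admissionReview
	if err := json.Unmarshal(body, &review); err != nil || review.Request == nil {
		http.Error(w, "malformed AdmissionReview", http.StatusBadRequest)
		return
	}

	// The whole point: hold each request for a while so that many
	// requests stay in flight concurrently.
	time.Sleep(1 * time.Second)

	review.Response = &admissionResponse{UID: review.Request.UID, Allowed: true}
	review.Request = nil
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(review)
}

func main() {
	http.HandleFunc("/validate", handle)
	// Admission webhooks must be served over TLS; the cert paths are placeholders.
	log.Fatal(http.ListenAndServeTLS(":8443", "tls.crt", "tls.key", nil))
}
```

Registering this via a ValidatingWebhookConfiguration for the resources under test, plus a no-op variant as the baseline mentioned earlier, keeps the delay outside the API server binary.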
A: In our default test... if you are going to develop a new one, then you will need to use, basically, our measurements that gather those results, and then you can set the appropriate thresholds that you want to use.
A: So the question is... okay, so I think that, first of all, we need to prepare a kind of test scenario, an idea of what we want to test, and then I guess we could help you with finding which parts we already have in ClusterLoader.
B: I guess we want to... I guess both. I just want to see, at a higher load, what impact P&F has. I think we have a metric that actually measures how much time a request spends exclusively in the P&F filter, and we can use that as, you know, a baseline, like "this shouldn't go higher than that", or we would come up with those goals later, I think.
D: Yes, there is a metric like that, but I'm afraid it includes the actual time spent waiting, so it will not measure the overhead, but rather, you know, how many, like, thousands of requests we have in the system. So yeah, I'm not sure if this is exactly what we need to measure. I asked because I was thinking that maybe we do not need to create, you know, some cluster loader tests or anything like that; maybe simply a Go benchmark would be a better tool for that. I have seen some examples where it has been used with success to validate where we spend CPU time, stuff like that, and then it is easier to iterate on it. So maybe this can be used for some tests, but I don't know, maybe we already have some. Yeah.
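For reference, a sketch of the kind of Go micro-benchmark being suggested, here wrapping a stand-in filter around a trivial handler; the filter and all names are placeholders, not the real P&F code:

```go
// Hypothetical micro-benchmark for per-request filter overhead.
package filter_test

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// withFakeFilter stands in for a priority-and-fairness-style filter; a real
// benchmark would wrap the actual filter under test instead.
func withFakeFilter(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Classification and queuing logic would live here.
		next.ServeHTTP(w, r)
	})
}

func BenchmarkFilteredHandler(b *testing.B) {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := withFakeFilter(inner)
	req := httptest.NewRequest(http.MethodGet, "/api/v1/pods", nil)

	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		h.ServeHTTP(httptest.NewRecorder(), req)
	}
}
```

Running it with something like `go test -bench=. -cpuprofile=cpu.out` yields a CPU profile showing where the time goes, which matches the use case described.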
B: I mean, definitely, we definitely can use some, you know, actual Go benchmark tests at a very, very micro level, but I thought this was about actually seeing what happens on a real cluster, so at a very macro level. I guess we can... both, I guess, are good ideas.
B: But does the CI already have all the machinery to run benchmark tests?
C: It has, yes.
B: Oh, okay, so we can do both, actually. Like, you know, if there is a particular module in P&F that we want to see how much CPU it spends, we can add those tests as well, and we can also...
B: The other thing is, for this test: I know that P&F ships with a set of bootstrap configurations. We can add more objects too, to see how much cost that adds to its processing, right? That could be another avenue we can pursue.
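As an illustration, "adding more objects" could mean creating extra FlowSchemas that point at an existing priority level, for example via client-go; the names and values below ("load-test-extra", "load-generator") are made up for the sketch:

```go
// Hypothetical snippet creating one extra FlowSchema for load-test traffic.
package loadtest

import (
	"context"

	flowcontrolv1beta1 "k8s.io/api/flowcontrol/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func addFlowSchema(ctx context.Context, cs kubernetes.Interface) error {
	fs := &flowcontrolv1beta1.FlowSchema{
		ObjectMeta: metav1.ObjectMeta{Name: "load-test-extra"},
		Spec: flowcontrolv1beta1.FlowSchemaSpec{
			// Reuse a bootstrap priority level so only the matching cost changes.
			PriorityLevelConfiguration: flowcontrolv1beta1.PriorityLevelConfigurationReference{
				Name: "global-default",
			},
			MatchingPrecedence: 10000,
			Rules: []flowcontrolv1beta1.PolicyRulesWithSubjects{{
				Subjects: []flowcontrolv1beta1.Subject{{
					Kind: flowcontrolv1beta1.SubjectKindUser,
					User: &flowcontrolv1beta1.UserSubject{Name: "load-generator"},
				}},
				ResourceRules: []flowcontrolv1beta1.ResourcePolicyRule{{
					Verbs:        []string{flowcontrolv1beta1.VerbAll},
					APIGroups:    []string{flowcontrolv1beta1.APIGroupAll},
					Resources:    []string{flowcontrolv1beta1.ResourceAll},
					ClusterScope: true,
					Namespaces:   []string{flowcontrolv1beta1.NamespaceEvery},
				}},
			}},
		},
	}
	_, err := cs.FlowcontrolV1beta1().FlowSchemas().Create(ctx, fs, metav1.CreateOptions{})
	return err
}
```

Creating a batch of such objects before a run and comparing request latency against a run without them would isolate the cost of FlowSchema matching.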
A: It can be, but we also create issues in our perf-tests repository. That's usually where we create all of these.
A: Okay, so I guess the question is: do we have anything more that we want to discuss right now? Because I see on the agenda that there is a release team intro, but I think there is no one here from the release team. So, I mean, we still have 15 minutes to talk about this testing, if we want.
B: I have a somewhat different question, if we're done discussing P&F.
A: Yeah, go ahead.
B: Yeah, so I know that we have the 5k test. Is it actually running on real nodes, or are these, like, kubemark nodes?
A: So what I would say is that, okay, let's say that you are making some changes to our CI. Then usually what we do is, you know, we have this kind of feature flag, and then we enable it on 100 nodes, and then we enable it on 5k nodes.
B: I see, okay.
A: And if it doesn't work, then we roll back.
A: Yes, yes, yeah. So basically, maybe I can give you a short introduction. The idea is that we are creating multiple objects, deployments, stateful sets. This generates some pod churn during the whole test, and then at the end we are deleting everything. What we are also doing is measuring different things, like out-of-memory errors, API server latency, and things like that, and basically, if any of those measurements goes above some threshold, then we fail the test.
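As a rough sketch of the pass/fail logic just described, assuming one gathered p99 latency per measurement compared against a fixed threshold; the names and numbers are illustrative, not the actual perf-tests code:

```go
// Hypothetical threshold check: fail the run if any measurement's p99
// latency exceeds its SLO.
package main

import (
	"fmt"
	"time"
)

// measurement is a stand-in for one gathered result, e.g. the 99th
// percentile of API-call latency for a given resource/verb pair.
type measurement struct {
	name      string
	p99       time.Duration
	threshold time.Duration
}

func evaluate(ms []measurement) error {
	for _, m := range ms {
		if m.p99 > m.threshold {
			return fmt.Errorf("measurement %q failed: p99 %v > threshold %v",
				m.name, m.p99, m.threshold)
		}
	}
	return nil
}

func main() {
	results := []measurement{
		{"APIResponsiveness GET pods", 350 * time.Millisecond, time.Second},
		{"APIResponsiveness LIST pods", 1200 * time.Millisecond, time.Second},
	}
	if err := evaluate(results); err != nil {
		fmt.Println("test failed:", err) // a CI job would exit non-zero here
	}
}
```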
A: So, do you have any more questions? I think that's all from my side. Okay, so I guess we can discuss it further in the issue, and I don't think we have anything more to discuss, right? So let's finish early, and everyone will have nine minutes back.