From YouTube: 2021-06-10 Kubernetes SIG Scalability Meeting
Description
Agenda and meeting notes - https://docs.google.com/document/d/1hEpf25qifVWztaeZPFmjNiJvPo-5JX1z0LSvvVY5G2g/edit?ts=5d1e2a5b
A
A
A
Yeah, Matt. Okay, perfect! It's always confusing when I do that on Zoom. All right, Arnold, I see you already added something to the agenda, so go ahead.
B
Matt, we wanted to discuss one of the issues we have raised, related to a couple of test cases we added a couple of months back for measuring network performance metrics, so.
C
B
At that time, I think you were on leave, and Oz was handling that issue, in the sense that we uploaded the test cases and the code, he reviewed it, and all that code is committed. But what we see is, okay, let me paste that issue here so that we can have a discussion.
B
So we have used a couple of tools, like iperf, to measure the UDP latency and other packet measurements. What we see is that, for the latency, negative values are being reported in the run. When we searched on the internet, we saw that time synchronization could be the problem.
A
B
That is what people are saying, but Oz was saying that all the worker nodes are perfectly time-synced in the CI setup where these test cases run. So there has been no progress on this issue. We asked him whether we can set those negative values aside so that we can keep going and keep reporting this metric measurement, because, say, out of many pod pairs,
B
maybe 50 percent of them, at least from what we have seen, report negative values, so we can at least consider the other, positive values for the calculation and go ahead. That was our question in that issue, but I think Oz seems to be busy, so he hasn't had time to look at this. That's why we joined this meeting today, to ask if he could take a look.
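The negative readings described above are what a one-way latency measurement produces when the receiver's clock runs behind the sender's. The sketch below illustrates that effect and the workaround proposed in the discussion, which is to keep only the non-negative samples. It is a minimal illustration under stated assumptions, not the code from the test cases: the function names, the 5 ms clock offset, and the sample values are all hypothetical.

```python
from statistics import median


def one_way_latency_ms(send_ts_ms, recv_ts_ms):
    """One-way latency: receiver timestamp minus sender timestamp."""
    return recv_ts_ms - send_ts_ms


def summarize(latencies_ms):
    """Drop negative samples and summarize the remaining ones."""
    valid = [v for v in latencies_ms if v >= 0]
    return {
        "samples": len(latencies_ms),
        "dropped_negative": len(latencies_ms) - len(valid),
        "median_ms": median(valid) if valid else None,
    }


# Hypothetical case: the real network delay is 2 ms, but the receiver's
# clock is 5 ms behind the sender's, so the measured value goes negative.
true_delay_ms = 2.0
receiver_offset_ms = -5.0
send_ts = 1000.0
recv_ts = send_ts + true_delay_ms + receiver_offset_ms
skewed = one_way_latency_ms(send_ts, recv_ts)

# Hypothetical per-pod-pair results, half of them negative.
pod_pair_latencies_ms = [skewed, 1.5, 2.5, -0.5]
summary = summarize(pod_pair_latencies_ms)
print(skewed, summary)
```

Filtering this way keeps the report running, but a positive sample taken across skewed clocks is shifted by the same unknown offset, so the surviving values are not necessarily accurate either.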
A
B
B
A
A
Yeah, is this the one? No, that's something else.
B
One eight: I have pasted it, actually, in the chat window.
A
Okay, thank you for bringing this to our attention. I will make sure someone takes a look at this and helps you unblock that. I don't think we will be able to debug it now, but definitely.
D
Just to add a few more details: we had at least one issue. The sourceforge.net link points to the issue that we raised against the tool itself. What they were saying is, I think, that by default GCE uses NTP, but I'm not very sure whether the place where we are running, I mean SIG Scalability, or the CI/CD pipeline where it runs, is really using NTP or some other clock algorithm. So I'm not really sure, but I think the default is NTP.
D
D
A
Okay. Someone from SIG Scalability will take a look, and we will try to help you here. Going forward, I assume this is blocking you, or maybe putting in the other relic. So what's the long-term plan here? Basically, are we running this test continuously, currently using our infrastructure, right? That's what we say.
D
B
D
Merged, I think. Perfdash has to be upgraded, and that has to be running in the test. That's fine.
A
A
Magic is not here today, but he was working on this. But this is closed, and the question is whether we upgraded or not. So just double check whether your changes are there or not. It's possible they are already there. If not, just let me know on Slack, so the.
D
A
B
But we see that that version of Perfdash is not there, right? So I.
A
A
B
But it was merged long back, right, with quite a.
A
We can check that online. Basically, just let me know on Slack; you can send me the PR number. But if you say it was merged a long time ago, then it should be there. Then the question is whether the PR did everything that is needed to actually have the jobs displayed, because Perfdash is not a great example of good code: the config is hard-coded, and there are actually many places you need to change if you want to add new tests, so it's very error-prone.
A
A
So, thanks again for bringing this to our attention. Arnold, do you want to shed more light on the migration of scalability jobs to the community infrastructure?
F
F
F
So, as I saw in some comment, there is basically a list of quotas, besides CPU, that need to be raised. I would like to know if you have a specific list of quotas, or if we just need to discover them over time, so.
A
Yeah, I don't think we have the exact list of which quotas should be raised documented anywhere. Off the top of my head, there will be storage quota. Basically, the issue is that we run this super large test at a scale of five thousand nodes, so to spin up 5000 VMs you need a lot of CPU quota, that's obvious. But I think there's also a quota on the number of VMs in a network, and, if I remember correctly, there are quotas for disk. The standard quota definitely won't be enough to spin that up.
A
Okay, probably there is something more. I see that you are working with yeah secure, so that's actually a great guy to work with. So if you can create a doc and share it with us, then we can definitely check the projects we have and fill in the quotas there, and then you can proceed from there.
C
So, actually, a few days ago I noticed this bug and shared it with the team, so I think almost everyone from SIG Scalability is aware of this issue. If you need help understanding what kind of quotas you need in the new project, then I think we can discuss it on Slack and help you. But, as Matt said, there is basically no specific list. We will just need to figure it out.
F
G
F
C
C
E
C
I think we posted the name of this bucket in one of the issues with buckets.
F
I think that's all for me. Thank you for the help.
A
Yeah, thank you for doing that and running this effort to migrate the jobs. So just start a doc, and we will make sure it's filled in with the requirements for migrating the tests to the new project.
A
I mean, that would be perfect. All right, cool, thanks. Okay, Abu, I see that you already had a comment. I wanted to ask about this. What I see, okay, is that you are going to spend some time on 1.23, right? So right now you are not working on that. That's completely fine! I just wanted to ask, more or less, about the timeline.
H
A
Okay, that's cool. All right, and last but not least, Marcel.
A
The example is the issue that was the first thing we discussed today, right? Yeah.
C
Yeah, exactly. I think two days ago I also found an issue from about two weeks ago which seems pretty important to me. It was a kind of memory leak with watches, so I believe it was also important to at least look at it and maybe redirect it to someone else. But basically, right now we don't have good visibility into what kind of issues are assigned to SIG Scalability.
C
C
First of all, we have at least, I think, three different repositories: test-infra, perf-tests, and also kubernetes, where there are multiple issues that are later assigned to SIG Scalability. So I did a little bit of research, and I found out that among the other SIGs there is one in particular, SIG CLI, that is using a tool called Triage Party, which is an open source tool. Basically, what it allows you to do is this: you have tabs, you go through them, and you can see all issues that are not triaged.
C
So I think this would be a great help to us, but it's probably not enough. A tool is just a tool; we would also probably need to spend some time. What I would like to do is start a document, and I also want to discuss it. What do you guys think about it? What is your opinion? How would you like it if, for example, some part of this meeting could go towards a bug scrubbing process?
A
So I'm very supportive of this. As I said, we have a great example even today, and you mentioned another one. I also remember that it happens a lot that we have some bug open and sometimes we just miss it. That is something we definitely should address, so starting a process for that is definitely a great idea. When it comes to using this meeting.
A
For this, I would say why not. It's actually maybe a good standing item on the agenda, to do something and just make sure all the issues are triaged and there is an owner for them. So yes, it sounds like a good idea: start a doc, propose something, and then we can discuss it, maybe next meeting.
C
Yeah, I would say that at the beginning we probably won't be able to triage all the bugs in our backlog, but we can do it for new issues moving forward. And then, if we have some spare time, we could go back and look at those backlog issues that are the newest, but not the one.
A
Yeah, this basically sounds like a great start. Let's start with something simple that we know we can handle, and once it works, we can figure out what to do with the backlog, right?
I
Hey, yeah, I think the backlog idea is a good one, because I think this is a problem for us as well: I keep getting pinged from time to time.
E
The one I got in particular was around duplicate metrics we had for watch counts. We have two metrics, two different things, both of which can count watches, and yeah. So very.
A
Tricky. So if you are already doing it, that's, I think, another reason to have it written down somewhere and make sure it's coordinated, so that we are not duplicating the work. Oh yeah, even more reason for doing that.
C
A
Yeah, cool, definitely a great idea. Thank you so much, Marcel, for tackling that.
F
I'm the one running Triage Party from sigrid. So if you need, just ping me, and I will walk you through the process to deploy Triage Party on the community infrastructure.
F
Thanks. We've been working on this for like a year now. We had some issues over time, because Triage Party basically has some glitches, but I can give you tips when you're on.
F
A
A
E
Yeah, I'm new to this SIG group, so I'm just wondering about the scope of this SIG group. For example, I have a proposal for us to increase scalability by running multiple scheduler instances, because, from my experience, the bottleneck is in the scheduler. So does that fall into this SIG group's scope: running multiple scheduler instances in parallel so that we can support a larger number of nodes?
A
A
We work with other SIGs, or we do things in the area of some other SIG. I think it's even written somewhere here that we very often do something that falls into the charter of other individual SIGs. It might be that you eventually need to go to some other SIG to also discuss it. In this case, SIG Scheduling is definitely a good place, or maybe not a good place to start, I don't know, but maybe they have experimented with running multiple schedulers.
A
A
Yeah, what is the throughput that you are trying to get when you're running?
E
Yeah, so basically it's the latency. If we have one scheduler, the latency is very long. It's all in the concept, right: it's all kind of blocked there and queued there. So I think if we can run multiple schedulers, that will help solve the problem in parallel.
E
So that's one thing. Another thing I think we can do, though I'm not sure whether it falls into this scope, is about the vertical autoscaler. We have the horizontal one, HPA, and the VPA, right? For VPA, as far as I know, it currently needs a kind of reboot if you want to increase the boundary of the pods. That's my understanding, that it needs a reboot. So if we can have a mechanism that doesn't need the reboot.
E
I think that would be great. But does that fall into this group, or does it fall.
A
E
Another general question: is this open to everyone? I mean, for every SIG group, is it open to everyone, or do I need to get access approval, something like that?
A
Sorry, for what exactly? If you could specify, sorry.
E
A
A
E
A
I think starting with the Slack channel, maybe, and writing down what you want to do or what problems you are facing is probably a good starting point. If it requires more discussion, or detailed discussion, then you can add it to the agenda in this document and we'll discuss it during the next meeting. The thing is, our TL.
A
He is not here today, but usually he is, so he'll definitely be a good person to have during this discussion. He has basically been in Kubernetes from the beginning, or almost the beginning, and he knows.
A
Ideas about running multiple schedulers will surely ring a bell in his head, and he will know whether someone already tried this or what the potential issues are. He will definitely know things about it. So, okay, if you write on Slack, then we can make sure that he takes a look.
A
If it requires more discussion, then we can discuss it in two weeks.
E
Okay, so if I'd like to discuss something, I can just add it to these meeting notes, right? Yep, totally. So it's open to everyone? I do not have access to modify.
A
E
A
You're welcome. All right.
F
Just quickly, I added in the Zoom chat the links for SIG Scheduling and SIG Autoscaling. So if KD wants information about those two groups, they have communication information in the links I put. What happens is you just join the mailing list of either of those SIGs, and you will get an invitation in your calendar for the next meetings, and in every README.