From YouTube: 2021-03-04 Kubernetes SIG Scalability Meeting
Description
Agenda and meeting notes - https://docs.google.com/document/d/1h...
A
I can probably give one update regarding one of the things we were doing: the KEP for efficient watch resumption, which is graduating and is now code complete for beta. For those who are not familiar with it, let me paste the link here if you are interested in reading. I will briefly summarize it in a minute; I'm just looking for the link.
A
Yeah, so basically the idea behind it, or rather the motivation for it, is this: when we are doing a rolling upgrade, for example of the masters, or of the control plane, the way the watch cache currently works is that the resource version it serves (there is a separate watch cache per resource type) only changes when one of the resources of that particular type changes, or when we are initializing the cache. Which means that if we have a resource type where none of the objects is changing, the served resource version stays where it was. That basically means that every single watch for that resource type has to be re-initialized with a list, which is a pretty expensive operation.
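As a rough illustration of what that relist costs, here is a minimal client-go sketch of a consumer resuming a watch. The function name, the choice of pods, and the simplification that the 410 surfaces as a request error (it can also arrive as an error event on the stream) are assumptions of this sketch, not anything stated in the meeting.

```go
package example

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// resumeWatch tries to resume a pod watch from lastRV. If the server can no
// longer serve that resource version ("410 Gone" / resource version too old),
// the client must fall back to a full LIST of the type and re-watch from the
// fresh resource version; that relist is the expensive step.
func resumeWatch(ctx context.Context, cs kubernetes.Interface, lastRV string) error {
	w, err := cs.CoreV1().Pods(metav1.NamespaceAll).Watch(ctx,
		metav1.ListOptions{ResourceVersion: lastRV})
	if apierrors.IsResourceExpired(err) || apierrors.IsGone(err) {
		// Fallback: relist everything, then watch from the fresh RV.
		list, lerr := cs.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
		if lerr != nil {
			return lerr
		}
		fmt.Printf("relisted %d pods to recover rv=%s\n", len(list.Items), list.ResourceVersion)
		w, err = cs.CoreV1().Pods(metav1.NamespaceAll).Watch(ctx,
			metav1.ListOptions{ResourceVersion: list.ResourceVersion})
	}
	if err != nil {
		return err
	}
	defer w.Stop()
	for ev := range w.ResultChan() {
		if pod, ok := ev.Object.(*corev1.Pod); ok {
			fmt.Println(ev.Type, pod.Namespace+"/"+pod.Name)
		}
	}
	return nil
}
```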
A
I'm sorry, I'm probably not doing a super great job of explaining this, because it's pretty tricky to explain. If you want more details, I'm happy to answer any more specific questions.
A
If not, I think it's pretty well written in the KEP, so you can find the problem described there in more detail, with exact examples of what happens. The way we are solving it is that we are taking advantage of the progress notify feature in etcd. When initiating a watch against etcd, you can specify something that is called progress notify, and etcd then sends a notification, or bookmark event, over the watch channel every N seconds, where N is configurable in etcd. Which means that we are able to keep updating the watch cache.
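For reference, this is roughly what the progress-notify option looks like from an etcd v3 client. The endpoint and key prefix below are placeholders, and the notification interval itself (the speaker's "N seconds") is configured on the etcd server side, not by the client.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// WithProgressNotify asks etcd to periodically send an empty watch
	// response carrying only the current revision, even if no key under
	// the prefix changed.
	ch := cli.Watch(context.Background(), "/registry/pods/",
		clientv3.WithPrefix(), clientv3.WithProgressNotify())
	for resp := range ch {
		if resp.IsProgressNotify() {
			// No events: just a fresh revision that a cache layer can
			// adopt as its new resource version.
			fmt.Println("progress notify at revision", resp.Header.Revision)
			continue
		}
		for _, ev := range resp.Events {
			fmt.Printf("%s %s rev=%d\n", ev.Type, ev.Kv.Key, ev.Kv.ModRevision)
		}
	}
}
```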
A
We are currently configuring it to five seconds. We were testing with intervals as low as half a second and that also worked well, so it seems we have quite a lot of headroom if it's really needed. Basically, it allows us to keep updating the resource version served from the watch cache, so during a rolling upgrade of the control plane we are not forcing every single watch to relist. Well, not every single one, but every one that is for a resource type that didn't have any change over that period. So yeah, that is going to beta in 1.21.
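The client-facing counterpart of the same idea is the watch bookmark. A minimal sketch (assuming an existing clientset; the function and namespace are illustrative) of how a Kubernetes client opts in to bookmark events and records the fresh resource version they carry, so a later re-watch can avoid the relist:

```go
package example

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// trackRV watches pods and records the resource version from bookmark
// events, so a later re-watch can start from a recent point instead of
// relisting.
func trackRV(ctx context.Context, cs kubernetes.Interface) (string, error) {
	w, err := cs.CoreV1().Pods("default").Watch(ctx, metav1.ListOptions{
		AllowWatchBookmarks: true, // opt in to bookmark events
	})
	if err != nil {
		return "", err
	}
	defer w.Stop()

	lastRV := ""
	for ev := range w.ResultChan() {
		if ev.Type == watch.Bookmark {
			// A bookmark carries no object change, only a fresh resourceVersion.
			lastRV = ev.Object.(*corev1.Pod).ResourceVersion
			continue
		}
		fmt.Println(ev.Type) // handle Added/Modified/Deleted as usual
	}
	return lastRV, nil
}
```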
A
If you have any questions about that, I'm happy to try to explain it a little bit better, but yeah, I think that's mostly what I had from my side.
C
That's very interesting. Maybe we can read the KEP and then, if I have any questions, I can probably ask you to follow up in the next meeting.
D
You said it will be beta in 1.21?
A
Yeah, so that's mostly what I had. If you have any other questions, not necessarily related to this, it's probably a good time to ask them.
E
Yeah, so I had a question: the watch cache that you mentioned, is it something that is stored on every node, or is it only part of the control plane?
A
The watch cache is part of the API server; it's a layer in the API server. In theory, at least, it's optional: you can disable the watch cache, but you can't really have very large clusters if you disable it, at least for some of the critical resources, the most frequently changing ones, or, well, maybe not necessarily the most frequently changing, but the ones that are heavily watched.
A
So, for example, if you disable the watch cache for pods, you won't be able to scale the cluster significantly, because the way we implement watch without the watch cache layer is that every watch hits etcd; it's basically being proxied through to etcd. And given that etcd doesn't understand our data model (labels, fields and stuff like that), when the kubelet is watching its own pods in etcd, that translates to watching all the pods in the system and then deserializing every single one.
A
Sorry, let me take a step back. Basically, if you have N kubelets, so N nodes in the cluster, it means there are N watches watching every single pod in the cluster. So whenever some pod changes, or is created or deleted or whatever, this event is sent to every single one of those watches, serialized and filtered, even though it will probably be delivered to only one of them, because the pod is running on only a single node. The serialization, filtering and sending to all of those watches from etcd is basically super expensive. So the watch cache was a critical step to scale to, I think, even 250 nodes, when we were doing that five years ago or so, probably more than four.
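A hedged sketch of the kubelet-style watch just described, using a client-go field selector; the node name and function are made up. With the watch cache enabled, the API server evaluates the selector once per event; without it, each such watch degenerates into an etcd watch on all pods that must be deserialized and filtered once per node.

```go
package example

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
)

// watchOwnPods is roughly what the kubelet does: watch only the pods bound
// to one node, expressed as a field selector that the API server's watch
// cache can evaluate per event.
func watchOwnPods(ctx context.Context, cs kubernetes.Interface, nodeName string) error {
	sel := fields.OneTermEqualSelector("spec.nodeName", nodeName).String()
	w, err := cs.CoreV1().Pods(metav1.NamespaceAll).Watch(ctx, metav1.ListOptions{
		FieldSelector: sel,
	})
	if err != nil {
		return err
	}
	defer w.Stop()
	for ev := range w.ResultChan() {
		pod, ok := ev.Object.(*corev1.Pod)
		if !ok {
			continue // e.g. an error Status object; ignored in this sketch
		}
		fmt.Println(ev.Type, pod.Namespace+"/"+pod.Name)
	}
	return nil
}
```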
E
Yeah, it does. Is there a resource that I can read about how the watch machinery works?
A
That's a good question: there is a design doc for the watch cache which might be a little bit outdated, but I think it basically reflects the idea. It significantly predates the KEP era, but I can probably try to look for it.
A
I'm criticizing myself here, but yeah, it kind of explains it.
D
If not, we can probably... oh, sorry, a question: do you know if SIG Scalability or SIG Architecture plan to do some scale-out work for Kubernetes, or does any SIG project aim to solve that problem? Like scaling out Kubernetes: the API server, or the storage layer, or the schedulers, etc.
A
I'm not aware of any horizontal scaling of any of the controller components, like the scheduler or anything like that. That is not happening, because we haven't yet really heard about significant demand, and it would be a significant complication. As for API servers, you can technically scale them horizontally; the only thing that doesn't scale horizontally is the watch cache, and there are some discussions about how we could redesign the watch cache to make it a little bit more memory efficient. Though I don't think anyone has a desire to scale it to state sizes larger than maybe 10 gigabytes or something like that, so with big enough control plane machines it more or less works now. And regarding etcd, it's probably the same story: I don't think we have strong needs here that would justify the complexity.
D
Okay, got it, yeah. Could you help find the issues for the watch cache implementation?
A
So, it's not something that is super high priority, but I think Clayton recently mentioned to me that they would like to explore it a bit more deeply.
B
Hey, hi. So this is a question I had asked you earlier, and it's in our public docs; I wasn't sure why this was written about endpoint slices, and I thought maybe you might have an idea. In our EndpointSlices public docs, they say that it might be possible for the same endpoint, at one point in time, to be in more than one endpoint slice.
A
That's if you want to do a reshuffle. Let's say you had a huge service with, I don't know, 20 endpoint slices, each fully packed (let's say the default is a hundred endpoints per slice, if I remember correctly), and then you scale down the service and you end up with just a single endpoint in every endpoint slice.
A
We want to do a reshuffling, because you basically have just 20 endpoints, to condense them into a single object, since there are not too many of them. And to be able to do that, you need to have a moment where you don't remove an endpoint from one slice before you add it to another one, so that it never ends up not included anywhere. So basically we first include it in another slice and then delete the old one, which means there are potentially short periods of time when this can happen. (A sketch of that ordering follows.)
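A minimal sketch of that create-before-delete ordering against the discovery.k8s.io/v1beta1 API (current at the time of the meeting). The real endpointslice controller logic is considerably more involved; the helper below, including its handling of ports and address types, is purely illustrative.

```go
package example

import (
	"context"

	discoveryv1beta1 "k8s.io/api/discovery/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// condenseSlices first creates one condensed slice holding all remaining
// endpoints, and only then deletes the old, mostly-empty slices. Between
// the two steps an endpoint is intentionally present in two slices at
// once, which is why consumers of EndpointSlices must deduplicate.
func condenseSlices(ctx context.Context, cs kubernetes.Interface, ns, svc string,
	old []discoveryv1beta1.EndpointSlice) error {
	merged := discoveryv1beta1.EndpointSlice{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: svc + "-",
			Labels:       map[string]string{discoveryv1beta1.LabelServiceName: svc},
		},
		// Assumed IPv4 here; real code would carry over the address type
		// and ports from the old slices.
		AddressType: discoveryv1beta1.AddressTypeIPv4,
	}
	for _, s := range old {
		merged.Endpoints = append(merged.Endpoints, s.Endpoints...)
	}
	// Step 1: make the condensed slice visible first.
	if _, err := cs.DiscoveryV1beta1().EndpointSlices(ns).Create(
		ctx, &merged, metav1.CreateOptions{}); err != nil {
		return err
	}
	// Step 2: remove the old slices; duplicates disappear once this finishes.
	for _, s := range old {
		if err := cs.DiscoveryV1beta1().EndpointSlices(ns).Delete(
			ctx, s.Name, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```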
F
Guys, I have a question regarding the API server, about the operations it does against the etcd cluster. When we create and update, we actually use transactions. We did some performance evaluation recently, and we found that transaction performance is actually much, much slower than single operations. For example, for put: if you go through a transaction, the performance is, I think, probably half of that of the plain put operations.
F
So
have
you
guys
ever
thought
about,
like
the
you,
like,
you
know,
push
the
active
site
so
that
they
add
some
atomic
operations,
such
as
like
create
update
and
like
delay
operation
so
that
we
don't
have
to
use
transaction.
We
can
just
use
these
atomic
like
the
operation.
A
I don't think we ever had a real need to do that, but I think that's probably an interesting thing to... well, I'm wondering if it really deserves a KEP; it might not really deserve a KEP by itself. I think it's purely an API machinery thing, so I would recommend reaching out to the API machinery folks and discussing it with them. I would need to think a little bit more about the consequences, about what exactly would change, but yeah.
F
Exactly. So, the idea is extending the etcd API so that when we do an update, we use some argument to specify that we are actually doing an update and that we want a particular previous version to be there when we do it. But because we are going through the common transaction interface...
F
If
you
use
like
right
operation,
it
first
needs
to
have
a
re
like
read
transaction
and
then
only
when
they
find
that
you
have
like
the
these
write
up
operations
and
it
will
or
close
the
read
transaction
and
open
another
write
transaction
which
is
exclusive,
and
this
will.
This
is
like
taking
a
lot
of
time.
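For context, a guarded write of the kind being discussed looks roughly like this with the etcd v3 client: a transaction whose condition compares the key's ModRevision against the revision the writer last saw. The key, value and revision below are made up; the questioner's point is that this conditional path is heavier than a bare Put.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// guardedUpdate succeeds only if the key is still at the revision the
// writer last read, which is the optimistic-concurrency pattern the
// API server uses for object updates.
func guardedUpdate(ctx context.Context, cli *clientv3.Client, key, val string, expectedRev int64) error {
	resp, err := cli.Txn(ctx).
		If(clientv3.Compare(clientv3.ModRevision(key), "=", expectedRev)).
		Then(clientv3.OpPut(key, val)).
		Else(clientv3.OpGet(key)). // fetch current state to report a conflict
		Commit()
	if err != nil {
		return err
	}
	if !resp.Succeeded {
		return fmt.Errorf("conflict: %s changed since revision %d", key, expectedRev)
	}
	return nil
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()
	_ = guardedUpdate(context.Background(), cli, "/registry/foo", "v2", 42)
}
```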
A
Okay, yeah, I've never really profiled those etcd parts, so maybe, I don't know, but yeah, that's definitely an idea that is worth considering. I would reach out to the API machinery folks and get their opinion about that.
A
If not, then thank you for today, and see you in...