From YouTube: SIG - Performance and scale 2023-07-13
Meeting Notes: https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.tybh
A: All right, July 13th. Okay, let's start with the post-v1 tracking. So v1 was released, and I took all of the v1 items and moved them over to this issue. So we'll have mainly things in here that we need to... I like the themes in here. It's mostly documentation stuff: we have the support matrix that Fabi is working on, and there's the API review process. All these things are part of their process, so they'll continue. And then there's the SIG-scale post-v1 one, which I created. That's the other one where I think what we wanted to go with: you've got the automation, right, you've got benchmark enhancements, then we have some of the documentation, and then I think one of them covers the rendering. I don't know if it's here.
B: I already marked that issue done, because we're able to get HTML pages on the KubeVirt website. Let me share the link. Actually, can you go to the previous issue that is linked in the... yeah. This is 2705.
B: Yeah, so, yeah, this one, okay.
A: Okay, all right, so this is the link to it. Okay, let me... oh, okay, so this is the v1 release.
A: Okay, okay. So we have everything then. Yeah, so let me see. Well, this is where I would ask... so this is up to June, okay, this is release v1. So what should we expect here? Is this page not going to change, or is it where the latest stuff would be?
B: I think this is not going to change. The latest stuff will be in the weekly VMI results, yeah, and then the index.html here, yeah.
B: Okay, right, right. So an automated way to update this index.html is what we have as a pending item, right?
B: Yeah, if you go to the issue, I have it... yeah, I have it under automation, the first bullet point.
A: I've got the screenshots. Okay, okay, sounds good.
B: So a question I have is: does KubeVirt have a schedule for a minor release, or how does that get decided?
A: Did we... I don't know if it's written down anywhere, I mean.
A: Not sure. I think what's going to happen is we'll get some z-streams, but then... The next thing is, there's something I have to do: I have to create the... you know, me or Fabi will have to write the schedule. We need at least a 1.1 schedule.
B: So the next release is going to be release 1.1, and it's going to be in three months. Sorry, it will follow the Kubernetes schedule.
A: Yeah, exactly: four months, sorry, not three, four months. So it should be in the fall, when, I think, 1.29 is released.
B: ...the storage used for scraping and storing that data, so the actual utilization will go up. And what I wanted to brainstorm a little bit is, you know: is it worth it to go set up automation for this, or should we have some kind of manual process where we look at it, and only if it gets out of hand do we introduce automation? I don't know, just sharing some thoughts. Yeah.
A: Even with this performance job running per PR, at least I haven't seen anyone complain, and I haven't seen, even from the CI metrics, that it was considerably worse. I actually haven't noticed it to be worse; by worse I mean jobs taking longer, or just being queued up because of some massive resource usage by the performance jobs. I haven't seen that, so I don't think we're near the limit, at least.
A: Maybe we already need a limit, but I don't know if I would say it's worth adding, just because I don't know if we need it yet. I think it's definitely shown its value, and I think we can demonstrate that, and if the case comes that we need more resources, let's have that discussion before we make the judgment of not including this.
B: Got it, okay. So then, if we say we want to include it, the task will be: we need to set up the performance job for the release-v1 branch, so that whenever there is a new PR...
B: ...the changes get tracked, and we compare that with the actual benchmark from before the release for additional information. And we'll do that for the next three releases, right, until we deprecate this one.
A: Yeah, right, we'll hold on to all those minor-release metrics and then get rid of them.
A: I don't think we get that much; there aren't a ton of backports or anything, it's not a crazy amount usually. I think the determination of whether it's worth cutting a release is how many things get through, and it's usually not that many. So I guess the question then is: should we keep them? Because I think, technically, KubeVirt at least right now...
A: ...allows people to continue to create minor releases. I don't think people are actively doing it, or actively requesting it, but there have been some cases, like 0.50 or 0.49 or something, where people have done additional releases. I think that needs to be sorted out, and I think it needs to be sorted out together with the...
A: ...with the release support matrix, I don't know. I think what we should do here is follow whatever the CI version is for the Kubernetes provider.
A: So if we're on 1.27, that should correspond to, you know, 1.0, and 1.28 should correspond to 1.1, and then we just kind of follow that window around. In other words, whenever we eventually want to drop support for those branches inside all our repos... yeah, I guess we'd just do it in our repos. We probably wouldn't change CI, right? We'd just stop tracking it in our repos.
B: Is there a way to do that? We'll have to change CI as well, right? Because we'll have to turn those post-submit jobs off and introduce new ones.
B: Got it, okay. Yeah, then I will reflect this in the tracking issue for post-v1. So somehow we need a way to keep updating the release branches from the actual release data in the benchmark repository. Yeah, I'll rework that a little bit and add it to the issue.
A: Okay. If there's something that we need to do there: I think the release team already updates a bunch of things, by the way, whenever we go between releases. If we need to add something to that list, maybe we can talk to Daniel and be like, hey, whenever we move to this new release, remove this piece of code. I think that would help us there, sure, yeah.
B: Anything else? No, I think there are a couple in that issue. If you can go back to the issue... yeah, so I think there were a couple of suggestions around benchmark enhancements. One is including VMs with instance types and preferences data; I've created an issue for this. It might be good to invite Lee to this call sometime and have a discussion about it.
B: And another suggestion was that we should also add averages along with the P95 data and P50 data. I think the technical justification there is that if you take the distribution of creation-to-running times, and there are peaks just before the P95 and just before the P50, the average will shift, but because we are tracking the P95, the P95 will stay the same. So having the average will help us understand a little bit more about the bimodal distribution of these runtimes.
B: So while the P95 and P50 are good indicators, if we want more data points on how the actual bimodal distribution is doing, we can add the average and get that visibility.
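
(A minimal sketch of the effect described above, with made-up numbers rather than real benchmark data: if a chunk of the distribution slides to just under a tracked percentile, the mean moves while P50 and P95 hold almost still.)

```python
# Made-up creation-to-running latencies illustrating the bimodal argument:
# mass moves from ~22s to ~38s, just under the P95, so the mean jumps
# while the tracked percentiles barely change.
import numpy as np

rng = np.random.default_rng(0)

def report(name, sample):
    print(f"{name:9s} mean={sample.mean():5.1f}s  "
          f"p50={np.percentile(sample, 50):5.1f}s  "
          f"p95={np.percentile(sample, 95):5.1f}s")

fast = rng.normal(15.0, 1.5, 5_200)   # majority of runs
mid = rng.normal(22.0, 1.5, 4_200)    # a second mode
tail = rng.normal(45.0, 1.5, 600)     # slow tail that pins the P95
report("baseline", np.concatenate([fast, mid, tail]))

# Regression: the middle mode slows from ~22s to ~38s.
mid_slow = rng.normal(38.0, 1.5, 4_200)
report("regressed", np.concatenate([fast, mid_slow, tail]))
# The mean rises by roughly 7s; p50 and p95 stay close to baseline.
```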
A: Okay, so I understand. The thing that we have to do is explain it. Right now we're handing people graphs, and we have a little bit of text in the readme. So as long as we can explain what its purpose is and how people can use it when they look at it, that's fine.
B: I think so. I'm not sure if this will go into the benchmark, but at least for our own visibility we should have that graph. Then, whether we want to ship it with the benchmark or whether the explanation is enough can be a secondary decision we make during releases, right? But at least when we are looking at the graphs for any performance problem, we should consider all three: P95, P50, and the average. Okay, yeah, and then we can explain it.
B: Okay, and then the next bullet point in the issue was about documentation. I think we have already discussed that; I don't have any additional discussion points there.
A: Okay, all right, let's close this one. So, next: the blog about explaining the v1 graphs. You've got something here, it looks like, okay. Oh, is this what we...
B: I'm using the same document but a different page, okay. So I did a little bit of brainstorming as to what kind of data can go into this detailed blog post that will be helpful to readers.
B: What I came up with is these five bullet points. We want to introduce what we are doing; we want to show the benchmarks; then help a little bit on how to interpret these graphs, because these graphs have some esoteric aspects that only people involved with this can understand.
B: So we want to decode that a little bit, then explain how this can be helpful for, let's say, any other project that is using a CRD and controller. And with this we kind of need the tooling as well, so that becomes a sub-point: what tools are used in order to set this up, and how you can do it for your project.
A: Okay, yeah, I follow that.
A: I'm trying to think in terms of developer blogs. How deep do you think we should go? We say: intro, show the benchmark graphs, just tell them what they mean.
B: I think three and four, right? Okay, so in three, what I'm thinking is: we explain the graph, then show its usage by tracking one or two PRs.
B: Right. So what I want to highlight here is how this tool, or the strategy we are using, can help achieve this: find a performance degradation and the actual PR that caused it, right, the source of the degradation.
B: Yeah, so there are two options here: either we take this toward how to get similar benchmarks set up downstream, or we make it generic and say how to get similar benchmarks for your project, and then, because it's generic, it extends to people who want to do this downstream.
B: Yeah, that's what I'm a little bit unclear on, as to what would be the best transition, because there are two concerns, right? We need to be careful about the length of this blog post, and if we put a lot of things in here for, let's say, your own project with a controller and CRD, this post can get really big.
A: Maybe what we do is focus on one of them. I mean, there's an opportunity for multiple blog posts here. So maybe this one is focused; the title almost writes itself, something like "Understanding performance regressions in your KubeVirt deployments", right? We have production users; they use this stuff.
A: They likely have some custom commits on top, right? Maybe it's just one or two, but that is still valuable to know. So if you're someone who's coming across this blog, you'd be interested if you have a production deployment, if you're downstream. Now, the other part of this is what you're saying, which is point two, that KubeCon title; that one would be for the developers, the community, so maybe for that one we can try to go to the CNCF.
B: Yeah, yeah. I want to kind of focus the discussion on what content we should come up with, and select the content, because there is a lot here once we brainstorm.
A: Wait, so if we want to do a blog post about end users, right, I think it's a great spot; in Vancouver we were focused on KubeVirt, and the thing is, we could talk about this and tie it in with v1. I think there's definitely a KubeVirt blog post to be written, and I also think there's one for the CNCF. I think it's fine if we do both, really.
A: Okay, I think they're both worth doing; it's just that when we look at it combined, it becomes a lot of work, it becomes a long blog. So I think we should decide which one we'd like to at least start with. v1 was just released, so if we want to use the momentum from v1, I think that's the direction we should go toward.
A: Do this one, and don't do the general controller blog yet; we can do that maybe in August. The other thing we've got to think about is that in August the KubeCon talks get selected, and we obviously don't know whether we're going to be selected or not.
A: If it's accepted, right; we could still write the blog post to get interest in the talk, and mention the talk as part of it. So it's sort of a win-win, I think, if we look at it that way; the timeline lines up, we just have to wait till August.
B: Yeah, I think I agree with you. So we can use the momentum to publish this part of the content for KubeVirt, and then, as we get close to the talk, if it gets accepted, the material prepared here can be brainstormed and ironed out in a way that we could use some of it for the talk and some for the blog posts.
B: Okay, yeah, I think that direction makes sense. So for this one, do you think the user guide documentation will be a prerequisite?
A: I think we should; I think it would be really nice to have. Maybe what we can do to shorten our blog is to have a user guide with a lot of detail, then do a high-level overview in the blog and point to the guide for the people that are interested in more, because there are a lot of pieces here: we have Prow, we have Prometheus, we have Grafana, all of these.
B: Makes sense. So I think, with that, this user guide setup should be one of the highest-priority items among the post-v1 issues.
A: Okay, okay, all right, so I'll go to the next one. I only had KWOK here with a question, because I know you sent me some of the stuff we've been doing. Do we want to talk about this at some point? I don't know what your plans were; I think it'd be good to show it in the community at some point, sure.
B: So I can give a little bit of background on what's happening here. We've discussed this before: KWOK is a project which allows us to run resources without an actual kubelet.
B: So we can fire up a VirtualMachineInstance where the virt-controller creates the pod, which is actually not running, but its status shows Running, and it's not backed by any hardware.
B: In order to fully support the VirtualMachineInstance API, we need to fake out the virt-handler parts, because virt-handler takes the pod that is in Scheduled phase and moves it into Running phase. So I've been doing some work, and I have a short one-minute demo where you can see that you can create a fake node and you can create a fake VMI.
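
(For context, a rough sketch of what such a fake node can look like. Per the KWOK docs, nodes annotated kwok.x-k8s.io/node: fake are the ones KWOK simulates; the node name, label, and taint below are illustrative, not taken from the demo.)

```python
# Sketch: generate a manifest for a fake Node that KWOK will manage.
# The kwok.x-k8s.io/node=fake annotation is what KWOK keys off; the
# taint keeps ordinary workloads away from the simulated node.
import yaml

fake_node = {
    "apiVersion": "v1",
    "kind": "Node",
    "metadata": {
        "name": "fake-node-0",  # illustrative name
        "annotations": {"kwok.x-k8s.io/node": "fake"},
        "labels": {"type": "kwok"},
    },
    "spec": {
        "taints": [
            {"key": "kwok.x-k8s.io/node", "value": "fake", "effect": "NoSchedule"},
        ],
    },
}

# Write a manifest usable with `kubectl apply -f fake-node.yaml`.
with open("fake-node.yaml", "w") as f:
    yaml.safe_dump(fake_node, f)
```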
B: The virt-handler, sorry, the virt-launcher pod gets to Running there, the VMI goes to Scheduled phase, and then this fake extension of KWOK transitions the VMI from Scheduled to Running state, and it does this with a little bit of jitter. KWOK has functionality where you can transition from one state to another and add delay seconds and a jitter period. What I have right now is 10 seconds as the delay and 15 seconds as the jitter period.
B: So that takes care of Scheduled to Running. The second thing we need to take care of is that virt-handler also removes finalizers during cleanup. So when the fake VMI is deleted, the virt-launcher pod will be deleted, but the finalizers on the VMI will still be there. So then the KWOK VMI controller will come in instead of virt-handler; it will remove the finalizer, and the deletion will go through...
B: ...of this object. So all the KWOK controller needs as input is that it's watching the VMIs for one state, and it is going to transition them from this state to the next state. So, for example, for Scheduled to Running, the conditions I have are that .metadata.deletionTimestamp is not specified and .status.phase equals Scheduled. If it is in this phase, then the transition from Scheduled to Running, with the jitter, will kick in.
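
(A rough sketch of what that configuration can look like. KWOK models such transitions as Stage custom resources with a selector, a delay plus jitter, and the next state to apply; the kwok.x-k8s.io/v1alpha1 field names below follow the KWOK Stage API as best understood here, using the 10s delay and 15s jitter from the discussion. Treat it as a sketch to verify against the KWOK docs, not the POC's actual manifest.)

```python
# Sketch: a KWOK Stage that moves a VMI from Scheduled to Running after
# a ~10s delay with jitter up to ~15s, matching the conditions described
# above. Field names are assumptions based on the Stage API; verify them.
import yaml

vmi_scheduled_to_running = {
    "apiVersion": "kwok.x-k8s.io/v1alpha1",
    "kind": "Stage",
    "metadata": {"name": "vmi-scheduled-to-running"},
    "spec": {
        "resourceRef": {"apiGroup": "kubevirt.io", "kind": "VirtualMachineInstance"},
        "selector": {
            "matchExpressions": [
                # .metadata.deletionTimestamp is not specified ...
                {"key": ".metadata.deletionTimestamp", "operator": "DoesNotExist"},
                # ... and .status.phase equals Scheduled.
                {"key": ".status.phase", "operator": "In", "values": ["Scheduled"]},
            ],
        },
        "delay": {
            "durationMilliseconds": 10_000,        # 10s base delay
            "jitterDurationMilliseconds": 15_000,  # jitter period up to 15s
        },
        "next": {"statusTemplate": "phase: Running"},
    },
}

with open("vmi-scheduled-to-running.yaml", "w") as f:
    yaml.safe_dump(vmi_scheduled_to_running, f)
```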
B: Yeah, and then for the second example I was talking about, the input is that .metadata.deletionTimestamp is set, and the action is to remove the finalizers. So if that is the input, the fake VMI controller will remove the finalizers; if the Scheduled state is the input, it will transition it into the Running state.
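
(Under the same caveats, the cleanup counterpart might look like the following: select VMIs whose deletion timestamp is set and clear their finalizers so deletion can complete, standing in for virt-handler's cleanup.)

```python
# Sketch: the companion cleanup Stage. Once .metadata.deletionTimestamp
# is set on a fake VMI, drop its finalizers so the delete goes through.
# As above, the field names are assumptions to verify against KWOK.
import yaml

vmi_remove_finalizers = {
    "apiVersion": "kwok.x-k8s.io/v1alpha1",
    "kind": "Stage",
    "metadata": {"name": "vmi-remove-finalizers"},
    "spec": {
        "resourceRef": {"apiGroup": "kubevirt.io", "kind": "VirtualMachineInstance"},
        "selector": {
            "matchExpressions": [
                # Only VMIs that are being deleted.
                {"key": ".metadata.deletionTimestamp", "operator": "Exists"},
            ],
        },
        # Clear all finalizers so the API server can finish the deletion.
        "next": {"finalizers": {"empty": True}},
    },
}

with open("vmi-remove-finalizers.yaml", "w") as f:
    yaml.safe_dump(vmi_remove_finalizers, f)
```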
A: In this case, do you have a special deployment of KubeVirt? Because I can't imagine you have to have... actually, maybe you don't, because you have fake nodes, right? So you must have the KWOK controller watching those fake nodes, and then its extension handles the work. Okay, so I guess you don't; you just deploy KubeVirt to get the APIs, though, and you need virt-api and virt-controller running, right? Okay, and then this takes care of the fake node part, okay, right.
B: So we need the KubeVirt deployment and the KWOK deployment, and the two of them will not intersect with one another. So, for example...
A: It requires a KubeVirt deployment with the... I'm going to just call it a KubeVirt extension, right? I don't know if that's the right terminology, but basically, yeah.
B: For the next part, I'm still figuring out how to abstract things; this is just a POC. Ideally, once we prove out the value with the POC, we should be able to have a decent enough abstraction where we don't need custom controllers for the extension. Then we can go to the KWOK maintainers and say: here's how you can extend it without maintaining custom controllers. That way we will have just a few configurations of KubeVirt-specific resources and use a vanilla KWOK deployment to run fake VMIs.
B: Yeah, so the next step we're going to take is to find out the difference between fake and normal VMIs in resource utilization of the control plane with a scale test, improve it to bring it close to the actual VMIs, and then, you know, work on the...
B: ...open question as to when exactly we can bring this into KubeVirt and into our KubeVirt post-submit jobs. The answer is: I don't have a plan yet, at least not until we can find out the difference between the actual resource utilization of a VMI and the fake one. Once we have that, we can brainstorm a little bit on how we can leverage it.
B: What we can do with an out-of-the-box KubeVirt and KWOK deployment is this: we know that the virt-controller is not fake, and we know that virt-api is not fake, so we can understand the scaling and performance behavior of virt-api and virt-controller without any improvements or additional development, with just what we have today.
B: But the problem is that our metrics are not categorized by component, so we don't know, from the benchmarks or the metrics that we track, which load is generated by virt-controller and which is generated by virt-api. The only higher-level aggregation we have is memory and CPU usage, and that's where we can start for now.