From YouTube: SIG - Performance and scale 2023-05-18
Description
Meeting Notes:
https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.tybh
A
All right, let's start. All right: it's May 18th, 2023, with the SIG Scale meeting. Please add yourselves to the attendees, please. Okay, let's start by looking at the performance job results. So we've got three... okay, so we've got the density one and the suite. Okay, anything to note on this one?
B
Yeah, the PR for the faulty density job is out. There were some changes needed to make sure that the density job works well. I've added it to the agenda; it would be good to get that merged. This particular graph is from March.
B
Actually, it's from February 2022, so more than a year's worth of data. I've created two PRs to add all of this historic data to the CI Benchmark repository that was newly created, and we can, you know, store this information in a very public way.
B
Let's see. The other thing I have observed is that there are a bunch of times when this job was not working, which is reflected in the blank stretches. But apart from that, the grouping has been very consistent.
B
On the same timeline... oh, and one more thing: I have added both p50 and p95, so we should be able to get an average as well as the worst case.
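As an aside for readers of these notes: the p50 and p95 discussed here are just percentiles over the raw latency samples a job produces. A minimal nearest-rank sketch in Python; the sample values below are invented for illustration, not taken from the job results:

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of
    the samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # ceil(n * p / 100) - 1, clamped to a valid index
    rank = max(0, -(-len(ordered) * p // 100) - 1)
    return ordered[rank]

# invented creation-to-running latencies, in seconds
latencies = [12, 15, 19, 22, 28, 31, 35, 40, 44, 49]
p50 = percentile(latencies, 50)  # typical case
p95 = percentile(latencies, 95)  # worst case tracked by the job
```

Tracking both numbers, as described above, separates "how is a typical run doing" (p50) from "how bad can it get" (p95).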
B
Yeah, so the worst-case numbers for these have been going down, but for the density clusters they have been going up. So there is some weird behavior. I'm not sure if it is related to the cluster version or the kubelet that runs this, but...
B
An observation: I have posted the PRs that went in around the time when this number started going down. So this was May 7th, May 8th.
B
So
from
from
these
I'm,
not
sure
yeah
that
one
I
doubt
that
could
be
the
one
that
that
is
saving
us
some.
So
my
understanding
is
that
a
it
that
PR
changes
the
way
network
is
set
up
for
VM.
It
uses
up
in
memory
cache
to
set
up
Network
for
VM,
so
that
could
be
yeah.
B
I was hoping somebody from the KubeVirt community would attend today, and we could ask questions like: does this affect the end-to-end tests, or things like that. But I think we can raise that question in the PR.
A
So yeah, like maybe we can dig into this. I want to see if... let's see... so this is... where is it? So this is... it's the periodic that you're getting this from, right?
A
That's appearing here, so I think it was 19 and a half, right there.
A
19.09, wow, which is our... oh no, that's not our low. Okay, so it looks like 15. So, whatever...
A
It's 15, yeah. Yeah, that's 15. Okay, so 15, yeah. So it is our low, and it's setting the lower bound for both of these. Okay, yeah.
B
You should be able to find a link that says "job history". Yeah, the first one.
B
So I wonder if we can find a job prior to this and see if it was lower or higher than 19.
A
And we should see if we can do this again, so if we can get two for two here.
A
All right, past them... okay, okay, 32, interesting. Okay, and so in your chart you've got... so it looks like... okay, so we're hitting a lot more of these 19s, so we're adding fewer of these. These look like 28s.
A
Interesting, okay. So maybe what we can do is, yeah, I mean, start a thread on the PR. You know, let's post our observation, and you can post this chart. I mean, I think it might be valuable just to see, like: hey, your PR was like right here. No, it's like May 8th... it was, yeah, it's like right here, and we're starting to see a better p95.
A
Do
you
think
that
it?
You
know?
Maybe
it's
associated
with
your
change.
Do
you
think
I
don't
get
their
thoughts,
see
what
alonas
says
I
mean
I,
don't
know.
Maybe
she
wasn't
thinking
about
performance
when
she
went
through
this,
but
it
would
be
good
to
identify
this
for
a
lot
of
reasons,
I
mean.
Maybe
we
have.
Maybe
we
can
understand
exactly
what
the
performance
changes
like.
Maybe
something
we
don't
know,
maybe
something
we
can
look
at
doing
in
other
parts
of
the
code
whatever
it
is
that
she
was
doing
here.
A
Cool, okay. Anything else interesting you noticed? Any other... well, so back to the change that we were seeing, or the increase in the... okay, I wonder if this is just kind of within the normal behavior. Like, I'm looking over here and we see it kind of goes up and down a little bit. Maybe this is... well, yeah, this is also a different version of Kubernetes, I mean.
A
Maybe we should still let this play out a little bit. It's sort of within our bounds, our lower and upper limit, where we're kind of less than 60, about, and greater than 50. Yeah, greater than 50, less than 60: we're still within there. Yeah, 56 and 49, we're still right around it. Let's see how this plays out, whether it kind of continues to go up, or maybe it comes back down.
B
Sure. Just for my understanding, what version of Kubernetes is running on the density cluster?
A
I forget, I think it's like... oh, it's OpenShift that does the installation, but I thought it was 1.24, I see, or 1.25, or something. Yeah, yeah.
B
And I know for the SIG performance periodic job, we install a new Kubernetes cluster and it runs on there. For the density cluster, it's just this standalone cluster, and we run VMs on it every day, right?
B
Yeah, I just wanted to understand the hardware, so we can better compare both sets of results. Now there we have it.
A
Yeah, so, well, I mean, the thing that I think I actually wanted to get to with these p95s... like, this is... so 49, and what do we have for... yeah? So it's higher, right? We're getting higher numbers, and that's like... we are creating more.
B
So I have a couple of suggestions from what I have observed. For the SIG performance test, we just run one test every day.
B
What I have observed is that... sorry, for the density job we run one test every day; for the performance job we run three tests every day. Running three tests gives a little bit better signal if there is a flake or something, and usually that helps me. So one of the suggestions I was thinking of was: can we change this density job to run three times a day? I think that will improve the signal we get from this cluster.
B
We should do it. And one more action item for me is to add a p50 creation-to-running metric for the density job as well.
A
Okay, sounds good. This is all really good progress. So now, we've got some other things that you've made progress on too. We've got the repo and everything; do you want to talk about that? So where are we with... I'm just not caught up, so, like, where are we? We've got the repo created that I think we need, and then... where are some of the other changes you've been working on? Or actually, no, we have it right here, right? Oh, so here we go.
B
I have put it in the agenda. If you click the first link, which is the issue for the project, everything is summarized there, actually. Great, yeah. So we have the repository, and we have a PR that's open that will scrape the results every week and dump them into that repository going forward. But before that PR merges, I have seeded the repository with historic data. So if you go... I don't have a link handy... oh yeah, it's in the agenda.
B
If you go to the repository, you will see two open PRs. I mean, all of that is generated data, so it will be hard to go through, but if you can just verify the directory layout, that should be enough for us to get started. Okay, we can merge that. So that will seed us with historic data, and then from there we can, you know, merge that project to start collecting data every week going forward, so we would not have to do this manually.
A
Okay, what are these... what are these numbers? Those...
B
The weekly results are in the weekly folder, and then they are separated by job name, so the performance name and the density name. Those should be two separate folders inside results and weekly. Within the daily results you will see a job ID; within the weekly results you will see it separated by metric.
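For readers following along, the layout being described might look roughly like the sketch below. The exact folder and file names are assumptions for illustration, not necessarily the repository's real structure:

```python
from pathlib import Path

def result_paths(root, job, job_id, week, metric):
    """Hypothetical layout: daily results keyed by job ID, weekly
    results keyed by metric (all names are illustrative assumptions)."""
    daily = Path(root) / "results" / job / "daily" / job_id / "results.json"
    weekly = Path(root) / "results" / job / "weekly" / week / f"{metric}.json"
    return daily, weekly

daily, weekly = result_paths("ci-benchmarks", "periodic-performance",
                             "1651234", "2023-05-15", "vmiCreationToRunning")
```

Keeping the two views side by side like this means a single daily run and a metric's week-over-week history can each be found without scanning the other.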
B
What I was saying is: within the daily results, it's separated by job ID. Then we aggregate all the jobs into weekly, and within a weekly it's aggregated by the metrics that we care about. So in this particular run you will see all the metrics that the job reports, so we have the data, but out of those, let's say the first five are the ones we care about; you'll see only those in the weekly aggregation. Okay, so we are not actually losing data.
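The daily-to-weekly roll-up described here could be sketched as follows. The metric names and the choice of a mean are assumptions for illustration, not the actual tool's behavior:

```python
from collections import defaultdict

def aggregate_weekly(daily_runs, tracked_metrics):
    """daily_runs: one {metric_name: value} dict per daily job ID.
    Keeps only tracked metrics in the weekly view; the raw daily
    data stays on disk, so nothing is actually lost."""
    buckets = defaultdict(list)
    for run in daily_runs:
        for name, value in run.items():
            if name in tracked_metrics:
                buckets[name].append(value)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}

# invented example: two daily runs rolled into one weekly number
runs = [{"vmiCreationToRunningP95": 49, "untrackedMetric": 3},
        {"vmiCreationToRunningP95": 51, "untrackedMetric": 4}]
weekly = aggregate_weekly(runs, {"vmiCreationToRunningP95"})
```

As the speaker notes, untracked metrics dropped from the weekly view can always be re-aggregated later by re-running the automation over the daily data.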
B
And in the future, if you want to compare, let's say, the last one, which we have not been aggregating weekly, we can always go back and run the automation.
B
Good. And one more thing: the weekly directory will have an index.html.
B
That's the one you see me linking in our agenda. Yeah, that's the one. So one thing that we have not captured yet is:
B
how to make sure that that index.html is rendered as a web page, the way I do it for my personal repository. We should have something similar.
B
GitHub has an automatic job for it.
B
So
when
you
merge
a
PR
with
index.html
GitHub
will
run
a
deployment
job
to
automatically
change
the
GitHub
pages
to
reflect
that
PR
margin.
The
only
thing
that
we
might
have
to
do
is
that
GitHub
expects
GitHub
pages
to
be
in
a
specific
format
like
the
way
I
have.
It
is
fixed
scale,
Dot,
username,
dot,
IO
dot,
github.io
right.
B
So
we'll
have
to
set
up
that
repository
and
then
it
will
be
able
to
run
this
automatic
deployment.
B
Yeah, just to render that automatically.
B
So if you look at my... I think it's the fourth tab, right. It's myusername.github.io, slash...
A
Okay, so this one looks fine, and then the second one is... oh, okay, the rest of the... oh no, that was the one I was just gonna...
A
Do we need just an approval on this, or...?
B
No, no, the one which is...
B
Yeah, so once this merges, we can ask Daniel to add that other command to also scrape the density job, and that should set us up. You can also...
A
All right, let me close the loop on this one, and maybe we can have him do a quick review, and if there isn't anything he notices that's in a funny place, then maybe we can get this merged quickly. Okay, sounds good, sounds good. Okay, so we've got... you've merged this performance one, we've got the density job changes, and we also... okay, here's the artifact search. Okay, so this one I think was... okay, so also, that means I'm working through the testing.
B
Yeah, so this is setting up the OWNERS just for the perf report generator. I've added you and Lugo as the reviewers and approvers. It's a copy-paste from the KubeVirt one, so I don't anticipate any problems, but...
A
Yeah, yeah, we'll get... let me get Lugo on these, and I think today they all look like they're making good progress. Okay, cool. All right, thanks. All right, anything else about this? I think that's gonna do it; this is a lot of progress.
A
Then... so I think Louisville said it's only when they're doing it. Okay, I thought Louisville was gonna do it, so I guess maybe we can ask Daniel and see if he's...
B
Already, yeah. See that comment, the second-to-last comment. Yeah, Daniel had ideas on how to do this already. I think that sounds good to me.
A
The last thing, then, is just for the SIG Scale work for Kubernetes. So we already have a lot going on, but the only thing I wanted to do is just make a little bit of progress on this, and I was hoping... I'm gonna try to do some of this today. I want to... I think what we need to do is... well, this is what we agreed to do: we take an inventory of the existing performance-related metrics. So it sounds like we...
A
Some of the things aren't actually totally known. I mean, I don't know all of them... I actually don't know many of them, but it sounds like, within a larger group than that, not all of them are known. So let me just take a quick look at what's there, and let's have a description of them; that'll help us out and guide us as to how we want to improve or change them. Anyway, so that's the first step that I see.
A
It seems simplest. Like, we know there are... I know there are a few phases, like, just attaching... we know there are a few phases, so I think it makes sense. Let's see. I think the way we want to approach this is: what are the phases for PVCs, who sets them, what transitions between the phases? Those are our entry points, and I think it's as simple as that. And then the final thing is, okay: what will having these metrics do? How would they impact anything?
A
You know, whether it's people using them, or how it would impact the kubelet, or how it could impact your monitoring. We now have a lot of PVCs.
A
A lot of metrics. I think that's sort of the final step. So just those three, and I think that would be a good start, and for us to begin the work. I'm not going to be here next week, so I'm not going to bring this up then. You know, if I get far with this, I'm going to talk about it at the design session, so if anyone else wants to talk about it there, feel free to. Or if I've got some documents, I can bring it up on Slack; that's fine.
A
We can... we can go that route in SIG Scalability in Kubernetes; we can go that way as well. So, depending on how far we get, we can coordinate how we can... so.
B
Ryan, this sounds great. I think we did a bunch of work to identify these metrics in the kubelet; I have it in one of the documents from our downstream effort. I can link you, I think that will save you some work. Okay, so to put this together... yeah, okay. The only thing is, I think it is missing the PVC metrics and the network-related metrics right now. So the thing it has is end-to-end pod...
A
Yeah, I suspect... I'm pretty confident that's all we're going to get. I just... I think we kind of caught them... my impression was we kind of caught them by surprise, in that their approach has been, and it makes total sense, their approach has been to measure things and report them in the aggregate, right? It makes a lot of sense, because they want to measure and report and have, you know, SLIs and SLOs. It makes sense.
A
That's
what
the
approach
was
where
our
approach
is
slightly
different.
It's
that
it's
sort
of
two
things
like
one
of
them
is
that
we
don't
want
to
do
the
aggregation
Within,
the
cubelet.
We
want
to
do
it
at
the
the
Prometheus
or
the
dashboard
level,
so
we
we
want.
We
basically
we're
going
to
essentially
pay
the
price
of
doing
the
holding
on
to
more
data
and
then
and
doing
the
aggregation
later
and
then.
The
second
part
is
like
our
approach.
Is
we
want
to
look
a
little
bit
further
under
the
hood?
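The idea just described, keeping raw per-phase transition timings and aggregating later in Prometheus or a dashboard rather than pre-aggregating inside the kubelet, could be sketched like this. The phase names and structure are illustrative assumptions, not actual kubelet or KubeVirt code:

```python
from collections import defaultdict

class PhaseTracker:
    """Keeps raw per-transition timing samples so percentiles like
    p50/p95 can be computed downstream, not inside the component."""
    def __init__(self):
        # (from_phase, to_phase) -> list of durations in seconds
        self.samples = defaultdict(list)

    def record(self, from_phase, to_phase, seconds):
        self.samples[(from_phase, to_phase)].append(seconds)

tracker = PhaseTracker()
# hypothetical PVC phase transitions with invented durations
tracker.record("Pending", "Bound", 2.4)
tracker.record("Pending", "Bound", 3.1)
```

The trade-off stated above is explicit here: the component holds (or exports) more raw data, in exchange for being able to ask new aggregate questions later without code changes.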
B
Yeah, I agree with you. I think the other surprising bit for me was that they were using the events to collect when things happened, even though the data is available in the scheduler metrics or kubelet metrics. So I think that change in approach that you were talking about, first, it should happen in the test, the end-to-end results that they are using: skip collecting things from events and use the actual exported metrics from the scheduler and kubelet.
B
That can, you know, fill in the entire story's missing data. I wonder if you need a bullet point here to locate the metric collection that they have been doing and suggest changes to that as well.
A
Okay, yeah, makes sense. Okay, okay, it sounds like a plan. I think that's something I'll try and get to today, and see if we can come up with... yeah, we'll come up with a plan. I mean, I'll just start by Friday, depending on how far I get. You know, I will approach them with it whenever it's ready. Okay, this sounds good.
A
All right, I think that's all. All right, anything else? Anything else you've got?
A
Oh, did you... I don't know if you caught some of the discussion we had. We had a few things we wanted to ask you. Yeah, so this actually... why don't you cover it, since you have written the points there? So what... what did you want? What do we need here still?
B
What we would really need is a post-submit job on that repository, so setting up Prow, like Daniel was mentioning here.
D
Sorry for interrupting you. I think that 2773 is exactly that: Daniel is trying to create a periodic job for it.
B
No, so my impression was that this does an initial data gathering. This will just push data to the sig... sorry, to the CI Benchmark repository. After the data is pushed, we need a post-submit job on that repository to aggregate it into weekly folders and publish a chart for it. So if you look at line 818, it just does the initial result gathering. For this, we need the same command with the weekly-report and weekly-graph subcommands to actually do the graphing.
B
Daniel was suggesting that we could do it as a post-submit job. So once this job publishes data, a second job, which is a post-submit job, will, you know, aggregate it and publish it. That way, we can change the publishing logic without affecting the aggregation.
B
Okay, yeah. I think that CI repository, the Benchmark repository, needs the Prow setup and everything. I don't know if it is enabled right now; it's a bare-bones repository with just admin access. So yeah, I think that would be really helpful. I've already summarized this here, and we can get this merged. Okay.
B
I've taken a stab at creating a PR for OWNERS in the perf report creator tool. I have added Ryan and yourself as the reviewers and approvers; I think that will help us iterate faster on this too. So if you are okay with it, please take a look at this PR.
D
Sure, just a note: I don't have approval rights on this repository, so I will need to ask somebody else to approve.
A
Okay, so this post-submit job... so are we gonna wait? Is that what it sounds like? Yeah.
B
Okay, we need Daniel to create and set up Prow on the new repository, yeah.
A
All right, sounds good. All right, the rest of these PRs we can look at. Lou, Elizabeth, here we have another... oh, actually, sometimes you can't approve this one. You can, though, yeah. I think just this one.
D
Thanks, I will try to take a look, but...