From YouTube: SIG - Performance and scale 2023-03-02
Description
Meeting Notes:
https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.tybh
A
The major focus of my work is to run workloads on pods and VMs against an OpenShift cluster, and it touches all the resources related to OpenShift: storage, network, CPU.
A
And to improve it: add more workloads and more tests, so that we get more coverage on the upstream side.
C
That sounds great. So I can do a little intro to the SIG-scale meeting. We have the SIG-scale group; it's scheduled weekly, but it usually ends up being two to three times a month. As part of this meeting, what we do is look at KubeVirt from the scale and performance perspective, and we look to build any sort of testing that could possibly expose bugs.
C
We look at ways we can calculate scale and performance in KubeVirt, from all these different perspectives. We look at tooling and things like that, and also at things in Kubernetes, and we take those and look at different ways we can improve things in the community: creating pull requests, creating issues, and even dashboards and metrics, just so that we have ways to measure scale. We've basically been doing this for a little while now, I think over a year, and we've made a lot of changes over time. When we started this, we began by looking at ways we can measure; that was a major focus for us.
C
It was focused on getting alignment on metrics we could get from Prometheus that could describe for us the performance of a job that we wanted to run, for example. So we did a bunch of stuff there; I can show you.
C
What I'll do is share my screen and share the document. As we dive into this, here's a link to the document; add yourself as an attendee, and I'll walk through some of it. What we have here is these jobs; this is something we've worked on over time.
C
These performance jobs; let's see, where is the performance one... I think this will clear up some of what we've been able to accomplish and give you an idea of what our direction is. We have this prow job that does a few tests for us and measures a bunch of things. When we go to the first one: there are actually three tests that we run in here.
C
The name of that is at the bottom. Basically, what we do is create a hundred VMIs, and we do this from nothing: previously we created a cluster with, you know, make cluster-up and make cluster-sync, and then we create 100 VMIs, and what we want to do is measure that.
C
For the image, I'm not sure; I think it might be a CirrOS image that we use. I don't know, it's one of the default images that we use. I don't think it's a containerDisk; I think it's something that we...
C
What we do is trigger it through the shell; basically, it runs the tests, and they're integrated into the test suite, with Ginkgo. So I'll show you those.
C
Okay, so here's the density test; let's see if we have the image in here.
A
And once you deploy the cluster, did you install all the related operators, like KubeVirt and all this stuff, for each step, or for each bunch of tests?
C
Yep. We use the same cluster, and then in the same test we'll create a bunch, then we'll delete them, and then we'll go to the next one.
C
So we do three: we have the batch of VMIs, then we've got the batch of VMs, and then we have the VMs with a single instance type and preference. So: create 100, delete them; create 100, delete them; create 100, delete them; and after each one we measure.
C
I don't know how many nodes it is or what the infrastructure provides, off the top of my head, but because we end up deleting, we won't overload the nodes: we create 100, we delete 100, create a hundred, delete. This is the meat of it right here, and this is what ends up showing up in these results here.
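As a rough illustration, the cycle looks something like the following minimal Go sketch. createVMI, deleteVMI, and waitAllRunning are hypothetical stand-ins for the client calls the real Ginkgo suite makes; this is not the job's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-ins for the KubeVirt client calls the real test suite
// makes; they only mark where the Create/Delete requests would go.
func createVMI(name string) {}
func deleteVMI(name string) {}
func waitAllRunning(n int)  {}

// runBatch mirrors the create-100/measure/delete-100 cycle: create a batch,
// wait for all of them to reach Running, record the elapsed time, then clean
// up so the next batch starts against an unloaded node.
func runBatch(count int) {
	start := time.Now()
	for i := 0; i < count; i++ {
		createVMI(fmt.Sprintf("density-vmi-%d", i))
	}
	waitAllRunning(count)
	fmt.Printf("batch of %d: %v from first create to all running\n", count, time.Since(start))
	for i := 0; i < count; i++ {
		deleteVMI(fmt.Sprintf("density-vmi-%d", i))
	}
}

func main() {
	// Three batches, as in the job: VMIs, VMs, and VMs with an instance
	// type and preference (the variants are not distinguished in this sketch).
	for i := 0; i < 3; i++ {
		runBatch(100)
	}
}
```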
C
So that's the basic idea: we create that hundred, and we do it three times. We sort of have two major pillars as part of this: we have the performance part and we have the scale work, and we look at capturing both in the job. So what you see here is a bunch of HTTP requests, right, and these look familiar.
A
And you input them as fast as possible to create this?
C
No, we have a rate control. So let me see, where is the rate control...
C
Here it is; we have this.
A
Okay, this is what we do, because we want to put real pressure, real stress, against our node and verify what will happen when the user creates them,
to investigate it. But I see that you solve it with a sleep between each VM, and I think that's not fair from a testing perspective, because it's
C
not the reality. Yeah, Ellie, I agree. I think what you're alluding to is that we're making a choice here: for us, this is what we found to be a reliable way to get to 200. But like I said, we don't expect the user to do this.
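The rate control being discussed amounts to spacing the create calls out at a fixed interval. A minimal sketch, assuming a hypothetical submitVMI wrapper and the 100-millisecond interval mentioned later in the call:

```go
package main

import (
	"fmt"
	"time"
)

// submitVMI is a hypothetical stand-in for the create call to the cluster.
func submitVMI(name string) {}

func main() {
	// Space the creates ~100ms apart instead of firing them as fast as
	// possible: less raw pressure, but consistent run-to-run behavior on
	// a shared cluster.
	tick := time.NewTicker(100 * time.Millisecond)
	defer tick.Stop()
	for i := 0; i < 200; i++ {
		<-tick.C // wait for the next 100ms slot
		submitVMI(fmt.Sprintf("vmi-%d", i))
	}
	fmt.Println("submitted 200 VMIs at a controlled rate")
}
```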
A
When we enter the node, we see that after a certain point there is intensive CPU usage, and this is something we need to continue investigating. And we have all the data now: it's a nightly CI, we show it in Grafana, the logs go to a storage bucket, so we can
C
So this is what I'm showing: in prow we have these objects that...
A
A question at the same time: when I write something, I always think about the performance perspective in general, not the functional one. So when I see this test with a for loop, it's actually more functional than performance, if you understand what I mean.
C
And I guess, to do what you're describing, you could create the manifests, render them ahead of time, and then... but...
C
When you run this step, what it'll do is make an API call to the API server, and then it will go down through KubeVirt and create the pod. So the VMI won't... there's
C
No, I understand what you're saying. This is one of the tests that Marcelo did try, and it's not one that we included in our CI. This was one that Marcelo did try out in the performance cluster, but I don't think we have it anymore, so it's a different area. What I'm showing here, specifically, is that the cluster is shared; since it's a shared resource,
C
the results can vary a little bit, so we make some caveats. The thing with this is that it does get us some information: we do get some pressure here, a little bit, and definitely the API server, I totally understand, can handle it.
C
But we want it this way, because what we want to do here is compare across pull requests. We need a simple way to compare across pull requests on a shared cluster, and we need to do it in a way that gives us some consistency. The goal here is not to apply crazy pressure and see how it holds up; not in this test.
C
What we want from this test is some consistency across different PRs and some data back; that's what we're doing here. But one thing that's important to highlight is that these are three tests that we're doing now, and we have a bunch of data that we've been gathering and using for a while. What you're talking about are a lot of tests that we have talked about before, even tried, or just one-offs that we want to do. We just haven't had the time to do this stuff. So what you're describing makes total sense to me; we just haven't had a chance to do it, and I'd like to.
A
I see, because when we run it our way, in performance, we start to see some problems against the cluster and things like that, and it makes the whole idea of the test more interesting. We found and saw issues that we cannot see in this test. Okay, yeah.
B
Each pod updates that it's ready.
C
No, it makes sense to me. We have a few other areas. We have this thing called the load generator that we built; we have a bunch of stuff in here. Where is it... I was just over it.
C
Here we don't wait. This is what we would use in the performance cluster, and this one's important, because this is the dedicated cluster: no one is sharing it, it's just for running the performance jobs. In this one we do sets of like 200, 400, 600, and we create them as fast as we can.
C
We can apply even more pressure than what we're doing now, even just with how we create things in the for loop. Maybe we can have all the objects created ahead of time in a buffer, and then just fire them all at the API server at the exact same time and measure. There's a lot of things we want to do here, and I totally hear you, but what I'm saying, Ellie, is: these tests...
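That "render ahead of time, then fire at once" idea could look roughly like the sketch below. renderManifest and submit are hypothetical stand-ins, and per the discussion the job does not do this today:

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-ins: build an object without touching the API server,
// and submit one object to the API server.
func renderManifest(i int) string { return fmt.Sprintf("vmi-%d", i) }
func submit(manifest string)      {}

func main() {
	// Phase 1: create all the objects ahead of time, in a buffer.
	manifests := make([]string, 600)
	for i := range manifests {
		manifests[i] = renderManifest(i)
	}
	// Phase 2: fire them all at the API server at effectively the same time.
	var wg sync.WaitGroup
	for _, m := range manifests {
		wg.Add(1)
		go func(m string) {
			defer wg.Done()
			submit(m)
		}(m)
	}
	wg.Wait()
	fmt.Println("all 600 submitted in one burst")
}
```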
C
It would be great if you could write some of them in the notes, because there are things that we want to do, and it would be good if you can enumerate them. We can discuss them, and I can help point you in the direction of where we can actually...
A
Go and implement these things, yes. In order to improve something, I need to get more details about the environment, more details about how all the things connect together and all this stuff; I don't have that yet. But for sure this is the direction to take: improve what exists, maybe add more workloads. So, given your recommendation, how do you think it's going to go?
C
Yeah, one thing that won't work: you can't just do it through Google. You have to go to the kubevirt-dev Google group and hit join while you're logged into one of your Google accounts; then you'll get access to this, you'll have write access.
C
Okay, what I'll do is put some links in here. Let me do two things. We've got our performance job that we run per PR; this is what I was talking about earlier, this is an example.
C
I think it gives us a few hours, but I think it takes us... oh, I just don't have the time on here. I don't know if it's in the job, but if it isn't, then it should be somewhere on the
C
Yeah, I don't know, I don't actually see it; there is a time somewhere. Well, the end-to-end time of the test, I guess, is important, but since we have multiple tests in here, it wouldn't be that valuable. What we have here, like I was saying before, is a bunch of data that we output. And here's another thing that I'll give you a link to. This is important, for this is how we used to measure.
C
So let me give you a link to it; it's not here.
C
To give you a high-level view of what this is: we've created a bunch of metrics, and these metrics get into Prometheus, and then what we do is scrape Prometheus whenever we run our tests.
C
It's written in Go; it's in here. The things that we capture you can find in this tool, all the things that we care about.
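For illustration, scraping a single value from Prometheus with the Prometheus Go client could look like the sketch below. The address and the example query are assumptions; the real tool captures a whole set of metrics it cares about.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Assumed address of the in-cluster Prometheus instance.
	client, err := api.NewClient(api.Config{Address: "http://localhost:9090"})
	if err != nil {
		panic(err)
	}
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	// Example query: 99th percentile of API server request latency.
	q := `histogram_quantile(0.99, rate(apiserver_request_duration_seconds_bucket[5m]))`
	result, warnings, err := promv1.NewAPI(client).Query(ctx, q, time.Now())
	if err != nil {
		panic(err)
	}
	if len(warnings) > 0 {
		fmt.Println("warnings:", warnings)
	}
	fmt.Println(result)
}
```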
C
We can add anything to it; I don't think we have, but you can take a look for yourself. Basically, this is some of the output. Some of the things we do, as you can see: we take p99s of things, we look at the deletion times, we look at the number of requests that get done, stuff like that, and we chart those.
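As an illustration of how a p50/p95/p99 summary like this can be derived from raw per-VMI timings, here is a generic sketch (a simplified nearest-rank percentile, not the job's actual code):

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// percentile returns the value at quantile q (0..1) from sorted samples,
// using a simplified floor-index nearest-rank rule.
func percentile(sorted []time.Duration, q float64) time.Duration {
	if len(sorted) == 0 {
		return 0
	}
	idx := int(q * float64(len(sorted)-1))
	return sorted[idx]
}

func main() {
	// Made-up per-VMI create-to-running times, for illustration only.
	samples := []time.Duration{
		20 * time.Second, 45 * time.Second, 90 * time.Second,
		180 * time.Second, 228 * time.Second, 310 * time.Second,
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
	for _, q := range []float64{0.50, 0.95, 0.99} {
		fmt.Printf("p%.0f = %v\n", q*100, percentile(samples, q))
	}
}
```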
C
The cluster? Yeah, so for deployment you can use the make cluster-up and make cluster-sync commands.
A
For the environment, I guess... I saw that it's Golang, so which IDE do I use for it, Visual Studio? Any IDE, to do the deployment and all this stuff locally?
C
No, no, it's pretty lightweight. It uses Docker; you just need Docker running on your localhost, and it's pretty simple: just running these two commands will get you a local running cluster.
A
In order to have this support for the cluster and all this stuff? Oh, okay, yeah.
C
To run the performance tests... I forget if we have it; it should be in make somewhere. Let me check the make recipes.
C
Do we have a performance one... yeah, we do, okay. Here we go: you do make perf-test, and this will run the... yes, it'll run this test, the one that creates them with the 100 millisecond sleep time.
A
Okay, do I need anything else to make it work locally?
A
Okay, nice, nice. So I will try to play with it. By the way, can I reach you directly, by email or something?
C
Yeah, that's fine. I'm on the kubevirt-dev Slack channel, in the Kubernetes Slack; you can reach me in that channel.
A
You talked at the beginning about the YAML, about which container image; can we see it in the test, or do I need to dig into this? Because I'm thinking about a huge VM, for example. I don't know if it's okay to run a Windows VM, because, you know, there's a license or something like that.
C
We don't. This is something we had some conversations about a while ago, but it hasn't picked up a lot of steam. It's something we can resume; we just haven't had the bandwidth to take it on. It's something that would be interesting to do, though.
A
But first I think I should start playing with it, because I'm not familiar with it. And once I actually submit something, someone will review it and then it gets committed; but we need...
C
I'm not sure, yeah; maybe something you can look at when you go through it. I mean, if there are unit tests, they'll run when you run... I think there's a make test command, not perf-test, this one.
C
I would just start with this, just to get familiar, because this is the building block of what we have. And then we have that dedicated cluster; I was going to look and see if it is up and running... oh yeah, here it is, here is the performance cluster. And okay, here's the time: there's two hours at the top. This one takes two hours to run; this is on the dedicated cluster.
C
This should be where we eventually want to target a lot of the tests that you're describing, because this is the only one where no one shares the resource; it's just for testing.
C
How many VMs is this? This looks like 200... no, 400... no, this is the 600 VM test. So you can see, this is our largest stress, or the largest stress that we can do right now. Maybe we can do larger, but yeah.
D
Okay, so when you guys have some time, I just have a question, unless there's something else on the agenda. (No, go ahead.) So I'm trying to capture some information. I'm trying to help a university that wants to build a shared campus, pretty much with virtual machines, and they want to try and understand who out there is running a massive amount of KubeVirt deployments, and I stumbled across,
D
of course, the NVIDIA one. I'm trying to find more out there, and some of the highlights that may have been found by the cluster-at-scale group. Are there any findings or any notes, or anything like that, documented anywhere?
C
This is a gap we have. We've had discussions about this for a while, about having documentation. I'm assuming you mean things like the recommendations for running at scale, and maybe the highest level of scale in the community, things like that; is that what you're looking for?
D
Like, at what point is it dumb to have more than a certain amount of VMs in a single cluster, with your control plane? Is 500 or a thousand VMs good? Once you pass the thousand number, maybe etcd requires different performance, things like that, right? So how many VMs do we want to run in a single cluster? How many nodes have we actually put into a cluster, and how many VMs?
C
Yeah, there's a good guiding document that we used a while ago from Kubernetes; let me see if I have it here somewhere in the notes. I would point to it if I could locate it, because it answers some of this. All right, here we go. The thing about this is, the way to look at this problem is: with KubeVirt it's really...
C
...it's pods, pretty much. There are the VMs, and there's the KubeVirt control plane in the middle, but one of the biggest factors is going to be Kubernetes, and this is what this document focused on and explains really well. Basically, what's described in this presentation is how different things affect the overall pressure that you apply. For example, let me see if I can find a good one based on what you're asking; there should be nodes in here.
C
So here's an example. If you have the number of pods per node at 110, let's just say, then the number of nodes you can scale to comfortably ends up being about 1300. And on the other side of this, if you have 30 pods per node, the number of nodes you can scale to comfortably is five thousand. This is from a few years ago; it's what was tested back in 2018, so five years ago.
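One way to read those two data points is that they imply roughly the same total-pod ceiling, so the trade-off is pods per node versus node count rather than either number alone; a quick check of the arithmetic:

```go
package main

import "fmt"

func main() {
	// The two configurations quoted from the 2018-era Kubernetes
	// scalability numbers:
	fmt.Println(110 * 1300) // 143000 total pods at 110 pods/node on ~1300 nodes
	fmt.Println(30 * 5000)  // 150000 total pods at 30 pods/node on 5000 nodes
	// Both land near the same ~150k-pod total, suggesting overall pod count,
	// not node count alone, is the limiting dimension.
}
```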
C
This is what was tested at the time, so I think this is along the lines of what you're looking for. And when you extend this to VMs: a VM is just a pod, right? I mean, there's the KubeVirt piece in the middle, but this will give you a sense of how it would work with Kubernetes, and it should work about the same with KubeVirt. For the most part, what we've been doing in this SIG is making sure KubeVirt keeps up with the Kubernetes scale.
D
That node quantity feels a little exaggerated, even for large companies, for a single cluster. Then does this imply that having 200 pods on 50 nodes is absolutely fine? Because I remember there were also networking limitations after 110 pods.
C
Yeah, I guess it depends on how you set up your IPs, but it should be fine. I don't know exactly; you'll probably have to test this. 200 pods, and how many nodes did you say, 500?
D
Yeah, even 50, right? Because 50 is already a very good quantity. And then, if we have, you know, 100 virtual machines per node, which is probably too dense, it's already a massive amount of virtual machines. So again, this is the scale of universities and different users; just trying to get into a sweet spot.
D
Right, okay. And are there any other limitations that are well known, I guess?
C
Yeah, this presentation actually goes into a few: some relationships between services, backends per service, namespaces, services per namespace. Here's another one that is important, and one we actually do see in our measurements a lot: pod churn. Even at NVIDIA, this is one of the biggest pressures that we see. In other words, the amount of pods you create, delete, and update per second.
C
This applies a lot of pressure. I don't know what your use case is, but if you have a lot of people creating and running workloads very quickly and at high throughput, then it will apply a lot of pressure.
C
Sure, yeah, there were a few others here, I think namespaces and stuff like that. The link, I'll copy it up to the top, just so I have it.
D
I'm assuming that once the virtual machines are created, the pressure on etcd or on the control plane drops off; is this something we see as well? And when you guys are running through these benchmark jobs, what are the things you're monitoring?
D
Is it just how long it takes to provision a VM, the provisioning time? Are you actually looking at the underlying infrastructure, the CPUs, memory and so on, as they are consumed, and the overcommit ratio as well? Are we doing an overcommit ratio like we do with things like OpenStack, like an 18-to-1 or 16-to-1, or do you guys just go one VM, one CPU?
C
So this is another area where, like Elliot talked about, these tests haven't expanded or matured to the point that we have some of the things that you're talking about. We could very easily start measuring some of the CPU and memory changes that happen on the control plane based on the amount of pressure we're applying; we don't have that.
C
We do have dashboards, especially on the performance cluster, that we were reviewing before, when we were actually going through and doing this, and they helped us find a bunch of goroutine leaks and a bunch of other stuff.
C
But it's not something that we report specifically as part of the job and check and fail our gate on. We do look at it from time to time, since each of these has a Prometheus instance that we can query, so we can see it; but we don't always look, just because we don't have it automated, is basically what I'm saying.
C
So we do that based on the number of requests we create, and then we measure, where's my... we measure the create-to-running time for this stuff and see how it changes based on the amount of pressure that we apply.
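A minimal sketch of that create-to-running measurement: record the creation time, block until the instance reports Running, and take the difference. watchUntilRunning is a hypothetical stand-in for a client watch on the VMI phase.

```go
package main

import (
	"fmt"
	"time"
)

// watchUntilRunning is a hypothetical stand-in: it would block until the
// named VMI reaches the Running phase and return the time that happened.
func watchUntilRunning(name string) time.Time {
	return time.Now()
}

func main() {
	created := time.Now() // timestamp of the create request
	running := watchUntilRunning("vmi-0")
	// The per-VMI sample that feeds the p50/p95/p99 summaries above.
	fmt.Printf("create-to-running: %v\n", running.Sub(created))
}
```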
C
No, no, this is the p99, the ninety-ninth percentile, so this is like the worst case; this is the 95th, and here's the average, the p50. So this is 228 seconds.
C
Yeah, this is 600 VMs created as fast as possible; on average, as we see, about that four-minute time.
C
Well, the way I'd look at it is that it's an average: there could be a few of them that were done in 20 seconds, and then it kind of slowly crept up, right? I guess the way I look at it is that the average of all 600 ends up being 228.
D
Because I don't know if this is 10 nodes or five, and that's a massive, massive difference. Okay, all right, just wondering. This is, of course, validation for... you know, if this university wants a development environment, which isn't the case right now, but if they want an environment that executes virtual machines on demand like this, then yeah. Okay, well, thanks for sharing, and we're past time.
C
Thanks, yeah, thanks for the questions, really appreciate it. If you've got any more questions about scale or things that are going on, please come back; we're happy to discuss more and try and solve some problems, anything you guys encounter.
D
Yeah, I do join from time to time, actually. I've had this call in my calendar for like two years, so I join about once a month or so, but I just stay quiet; most of the time it's pretty much about this dashboard. I just wanted to talk more hardware.
D
I will. Okay, thanks, thanks a lot for your time again.