From YouTube: SIG - Performance and scale 2022-06-16
Description
Meeting Notes: https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.tybh
A
Okay, this is SIG Scale. It's June 16th, 2022. I'll share the link to the notes in chat, and please add yourself as an attendee. Okay, first thing: so, Marcelo... yeah, this is actually good, I want to review these. So first is... which one, the QPS one? Yeah. So this one, this experiment. Marcelo, why don't you talk through it, do a little high-level overview, and then we can have a discussion on it.
B
Okay, so, well, the experiment is to test the VM creation latency. The VM object is, well, the object used to, you know, start and stop...
B
...the VMI. And I'm following the whole workflow: I create the VM object, the VM object will automatically create the VMI object, and then, of course, the VMI creates the pod. The focus here is on the VM creation time. It's interesting because I wasn't seeing this huge latency in the VMI creation time; it shows up in the VM creation time.
B
So
that's
that's
the
basically
the
experience,
the
experiment
and
the
experiment
it's
actually
in
is
medium-sized
cluster
or
maybe
can
be
considered
smaller.
You
know
it
has
12
worker
nodes
and
three
masternodes
and
I
am
creating
you
know.
1200
vms,
I'm
also
doing
another
experiments
in
the
report
there,
that
is,
to
creating
more
than
2
000,
vms
and
and
it's
which
means
it's
100,
vms
per
node
and
then
200
vms
per
hour.
So
and
the
the
experiment,
I'm
focusing,
I
tried,
I
changed.
B
...actually, you know, the queries per second (QPS) and burst of other components, but the one that makes the most difference is the virt-controller. So the PR is focusing on the virt-controller, especially to, you know, narrow down the discussion. So I'm changing the virt-controller QPS and burst from the default configuration, which is 20 and 30, and then I increase that to 100, then 200, and then 400 queries per second, to see how it impacts performance.
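To make the knob concrete: in a client-go based controller like virt-controller, QPS and burst are fields on the rest.Config the clientset is built from. A minimal sketch, assuming the standard client-go wiring rather than KubeVirt's exact code path; the 200/400 values are just the numbers under discussion:

```go
// Sketch: client-go rate limiting lives on rest.Config. Every clientset
// built from this config shares the resulting token-bucket limiter.
package client

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func NewTunedClient(cfg *rest.Config) (*kubernetes.Clientset, error) {
	cfg.QPS = 200   // sustained client-side requests per second (default here was 20)
	cfg.Burst = 400 // short-term burst above QPS (default here was 30)
	return kubernetes.NewForConfig(cfg)
}
```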
B
Especially because I was seeing, you know, some huge VM creation times when creating 1,000 VMs at the same time, like 22 minutes, which is not acceptable. And then, I should say, the latency also impacts the throughput, so how many VMs are created per unit of time. So it turns out that maybe... not maybe, but one of the reasons it's taking a lot of time to create the VMs... okay.
B
So there are two things here: the time to create the whole batch, for example how long it takes to create one thousand VMs, and then the per-VM time, and I'm also analyzing that here. So, of course, since creating one VM is very slow, the total time, if we're considering the whole amount of VMs being created...
B
It's
also
very
big
in
the
first
scenario
the
default
one
is
this:
is
the
default
configuration
okay
and
and
then
it
turns
out
that,
with
the
current
you
know,
configuration
burst
configuration
it
scan,
creates
only
one
vm
per
second.
So
in
it's,
there
is
a
bottleneck.
B
So
it's
it's
because
the
vm,
the
it's,
not
the
vmware
okay,
so
it's
the
vm
controller,
it's
not
being
able
to
do
too
many
requests.
So
just
comment.
This
query.
You
know
20
quarters
per
second
here
it's
shared,
be
in
between
all
the
controllers
in
the
bridge
controller.
So
internally,
in
the
virtual
controller,
it
has
the
vm
vmi.
B
You
know
node,
and
I
don't
remember
now
where
I
list
there.
So
there
are
many
other.
You
know
cues.
That
means
different
controllers
controlling
different
things
and
all
of
them
share
the
same
pairs
per
second
configuration.
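A hedged illustration of that sharing, using plain client-go types (the controller names are placeholders, not KubeVirt's actual structure): one rate limiter set on the shared rest.Config backs every client derived from it, so all controllers draw on the same 20/30 budget.

```go
// Sketch: one token-bucket limiter on the shared rest.Config means the
// "VM", "VMI", node, etc. controllers all compete for the same QPS budget.
package client

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/util/flowcontrol"
)

func SharedClients(cfg *rest.Config) (vm, vmi *kubernetes.Clientset, err error) {
	cfg.RateLimiter = flowcontrol.NewTokenBucketRateLimiter(20, 30) // one limiter for everything

	if vm, err = kubernetes.NewForConfig(cfg); err != nil { // used by one controller
		return nil, nil, err
	}
	vmi, err = kubernetes.NewForConfig(cfg) // used by another; same limiter instance
	return vm, vmi, err
}
```

Giving each controller its own budget, the option raised next, would mean a separate rest.Config or RateLimiter per controller instead of this shared one.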
B
Another
option
could
be
maybe
have
separate
characters
per
second
for
each
of
them,
and
then
we
have
more
control
of
that.
But
the
way
that
it's
implemented
now
it's
something
that
shared
between
everything
with
something
it's
internal
to
the
virtual
controller
yeah
it's
here,
so
you
can
see
highlighted
here
all
the
controllers.
B
You
know
evacuation,
so
it
will
also
impact
like
migration
that
you
know
at
some
point.
B
So
when
I
increase
that,
I
put,
you
know
for
the
maximum
of
course,
this
pr
after
some
discussion,
I'm
not
going
to
increase
to
the
maximum
curse
per
second
that
I
tried
because
some
concerns
I
I'm
going
to
get
a
middle
ground
with
something
in
the
middle
of
that,
but
with
the
maximum
carriage
per
second
that
I
test
400
and
600
for
burst,
I
could
get
up
to
17
gm's
being
created
at
the
same
time
and
note
that
my
test
I
configured
to
create
20
per
second.
B
Yeah
and
and
then
the
question
here
was
okay,
so
what
happens
if
I
increase
that?
Isn't
it
in
the
cluster
so
to
understand
that
we
need
to
analyze
first
the
number
of
requests
that
it's
been
generated
and
they
also
the
number
of
in-flight
requests.
The
current
things
request.
That
is,
the
current
request
that
is
being
processed
in
the
work
api,
because
using
flight
request
is
the
most
important
one
because
it
takes
let's
assume,
for
example,
it's
not
happening
okay,
but
we
need
to
understand
that,
let's
assume
a
scenario
that
could
be.
B
It's
overloading
divert
api,
getting
all
the
requests
that
vertex
they're
sorry,
api
server,
getting
all
the
requests
that
the
api
server
could
get
and
then
other
controllers
could
not
access
that.
I
know
that
we
can
have
priority
and
fairness
openshift
has
by
default.
Kubernetes
it's
you
know,
it's
still,
I
think
alpha
or
better,
but
it
will
have
priority
and
fairness,
just
something
that
we
will
improve
in
the
future,
because
it's
not
there
yet.
But
the
point
we
need
to
understand
how
is
cooper,
increasing
this
overloading
the
api
server.
B
It
turns
out
that
the
current
requests
that
are
being
processed
per
second,
it's
only
40.
and
the
default
ap
a
maximum
in-flight
request
in
the
api
server.
It's
500.
so
just
understand
what
does
it
means
it
means
when
we
see
here.
You
know
the
the
api
request
total.
Then
we
see
800
here,
isn't
it,
for
example,
the
maximum
scenario?
B
It
means
that
it
is
started.
You
know,
800
requests
and
it's
waiting
for
800
requests
so
and
summer
request
takes
more
than
one
second,
and
it
means
that
the
occupancy
you
know
of
in
here
it
will
be
more
than
100
seconds,
but
the
the
number
of
requests
that
are
being
processed
at
the
same
time
in
the
in
the
api
server,
it's
only
40,
which
means
we
are
not
getting
all
the
the
api
service
still
has
a
lot
of
room
to.
B
You
know
to
reply
for
requests,
because
summer
of
requests
here
takes
a
lot
of
time.
That's
the
point,
and,
and
I'm
running
a
system
that
has
you
know
it's
very
powerful,
cpus
very
fast.
You
know
machine
with
any
energy
ssds,
so
in
the
the
other
scenarios
means
in
slower
cluster.
It's
what
takes
even
you
know
more
time
to
maybe
process
some
lease
operation
or
request
like
that
and
then
we'll
take
so
our
post
operation,
because
the
pulse,
for
example,
create
a
gpu,
sorry
create
a
vm.
B
It
goes
through
a
lot
of
process
and
then
it
takes
time
to
process
that
so
and
that's
what's
happening
here.
So
it's
I'm
just
saying
that
the
odds
are
now.
Let's
just
say
we
are,
even
though
we
increase
that
we
see
an
increase
in
the
number
of
requests.
The
api
is
surveying.
This
is
on
a
safe
mark.
Okay,
that's
that's
the
explanation
of
these
two
figures
and
then
another
question
is
okay.
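For anyone reproducing this check, a hedged sketch of pulling the two signals being compared, the total request rate and the concurrent in-flight count, with the Prometheus Go client. The metric names are the standard kube-apiserver ones; whether this dashboard used these exact queries is an assumption:

```go
// Sketch: query the two apiserver signals discussed above. Assumes a
// reachable Prometheus; apiserver_request_total and
// apiserver_current_inflight_requests are standard kube-apiserver metrics.
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	client, err := api.NewClient(api.Config{Address: "http://prometheus:9090"}) // placeholder address
	if err != nil {
		panic(err)
	}
	prom := promv1.NewAPI(client)
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Requests issued per second (the ~800 figure in the discussion).
	rate, _, err := prom.Query(ctx, `sum(rate(apiserver_request_total[1m]))`, time.Now())
	if err != nil {
		panic(err)
	}
	// Requests actually being processed concurrently (the ~40 figure),
	// bounded by the apiserver's max in-flight setting (500 here).
	inflight, _, err := prom.Query(ctx, `sum(apiserver_current_inflight_requests)`, time.Now())
	if err != nil {
		panic(err)
	}
	fmt.Println("request rate:", rate, "in-flight:", inflight)
}
```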
B
So
what's
the
impact
in
this
resource
utilization
and-
and
here
is
the
virtual
controller,
of
course
it's
increased
the
restrictionization
from
few.
You
know
cp
utilization
to
at
least
one
or
you
know
one
and
a
half
core
in
the
system.
It's
it
has
some.
You
know
the
cpu
has
some
high
frequency,
okay,
it's
a
powerful
cpu,
but
I
wouldn't
expect
to
take
like
more
than
two
cpus
and
especially
because
we
are
going
to
the
scenario
that
here
that
took
only
you
know,
100
percent.
B
Here
it
means
one
one
cpu,
so
it's
different
controller
will
be
using
only
one
full
cpu,
which
I'm
considering
also
to
be
okay,
because
it's
an
extreme
scenario
that
we
create
1000
vm
and
we
enable
it
to
scale.
You
know
in
a
reasonable
throughput
and
perform
and
latency
and
the
other.
The
other
thing
here
is
to
show
the
impact
in
the
work
queue.
We
have
some
prs
before
some
discussion,
especially
in
the
beginning
of
the
sixth
scale,
about
the
performance
of
the
work
cube,
and
we
were
not
understanding.
B
We
have
a
lot
of
you
know
a
lot
of
discussion
to
maybe
to
create,
trace,
to
understand
that
better
and
turns
out
that
when
I
increase
the
queries
per
second
with
the
high
value
here,
it's
definitely
improved
a
lot.
B
The
work
queue
so
the
the
issue
that
a
guy
from
new
videos
presented
like
a
long
time
ago
that
some
keys
were
processing
very
slow
in
their
vert
controller,
and
he
was
actually
also
you
know,
proposing
some
another
approach
to
bypass
work
queue.
Something
like
that.
It
turns
out
that
only
increasing
the
cars
per
second
and
burst.
B
It's
you
know
eliminate
the
problem
here,
so
we
can
see
the
longest
running
process
drops
to
less
than
one
second
here
you
know,
and
and
very
few
of
them
you
know
considering
the
the
longest
running
process
here
in
the
first
scenario
default
one.
We
have
a
lot
of
them
that
it
takes
six
seconds.
You
know
to
process
a
key
which
is
very
slow
and
then,
when
we
go
to
the
best
scenario
here
well
at
least
the
scenario
that
the
maximum
scenario
that
I
test-
it's
drops
a
lot.
A
Did you... do you see... I wonder if you'd see this in the traces; you should be over the threshold, which I have at one second by default in the tracing.
A
It would be interesting to see what it also shows in the logs, because I understand this metric and what you're showing here, but I'd be curious to see, because the metric is granular enough that we should be able to see exactly where in the work queue this is being slowed down. If it's a specific call we're making, it would be good to find it.
A
There's a lot of really good data here, and, like you said, we at NVIDIA have done some experiments like this, and we saw that QPS was the biggest influencer on improving performance. But the thing that's really interesting to me, at least my takeaway from this...
A
Is
that
when,
when
you
talked
about
like
okay,
so,
like
you
know,
800
requests
or
whatever
500
big
fight
requests
whatever
like
it's
it's
well,
you
know
I'm
trying
to
like
picture
in
my
mind
like
what
is
like
the.
What
should
kuvert's
footprint
be
in
in
the
in
your
kubernetes
cluster
like?
Should
it
be?
A
Should
it
take
up
like
half,
should
it
be
like
be
able
to
take
up
half
of
the
apr
service
requests,
or
should
it
be
lower,
like
you
know
what
is
like,
you
know,
what's
the
what's
our
right,
the
right
approach
like
which
what
should
be
like
the
right
way?
We
look
at
this,
it's
very
possible
that
the
defaults
for
kubernetes
should
should
be
like
hey
like
we
should
need
to
take
up
half
the
pay
request.
A
You
know
for
the
api
server,
but
I'm
wondering
if
it
could
be
lower,
like
I'm,
I'm
kind
of
interested
in
seeing
like
if,
because
the
data
you're
pointing
out
here
is
actually
like
it
to
me,
it
seems
like
you,
you're
you're,
hitting
some
some
bugs
like
the
bug
isn't
necessarily
like.
I
agree
that
the
qps
burst
is
probably
a
little
low
as
the
default
with
whatever
it
is,
10
or
20..
A
It
probably
should
be
higher,
but
but
it's
also
when
you
go
to
the
high
end
it
and
the
way
that
it
affects
the
effects
it
has
like
how
much
important
improvements
in
performance
makes
me
think
that
maybe
we're
just
making
too
many
requests
like
it
just
seems
like
it
seems
like
we're
we're
a
little
too
active
with
the
api
server
like
it
seems
like
we
may
be,
I'm
wondering
if,
like
you
know,
if
there's
a
way
we
like,
we
can
end
up
with
the
same
performance
at
around
100
or
less
than
100,
or
something
because
that
seems
to
be
like.
A
To
me,
it
just
sounds
like
it
just
seems
like
we're
using
a
lot
like
like
400,
because
you
know
like
thing
is
this
argument
can
go
on
forever
right.
We
could
say
why
not
600,
why
not
800?
You
know
why?
Don't
we
just
use
the
whole?
We
don't
want
people
to
do
that
like
we
want.
We
don't
want
people
to
have
to
say,
like
oh
I'll,
just
increase
the
qps
and
burst
forever.
You
know
and
then
also
not
get
my
performance
that
I
need.
A
We
don't
want
that
we
want
to
you
know
we
want
to
reduce
it
as
as
much
as
possible
and
that's
like
you're
exposed
to
like
a
lot
of.
I
think
you
exposed
probably
multiple
bugs
here,
the
like
the
number
of
put
requests
per.
Second,
these
things
like
seem
very
high,
and
maybe
we
can
lower
them
and
then.
B
Yeah... oh, first I want to discuss two things. Sure, can you scroll up?
B
Just to comment about, yeah, the queries per second: you know, when we are seeing the 500, the 500 means... what's the amount... now, you can go here to the figure, yeah, the one below, yeah, that's the green one, yeah. So here is the number of requests in the API server, the real number being processed at that moment. So the API server has this maximum in-flight requests setting, and the default is 500.
B
You know, around 30 requests per second. But what it means, when we go to the 500 or 800 that we're showing, is that we issued more requests; you know, the components were able to request more things from the API server. However, some of these requests are waiting, are pending, you know, because they are being processed somewhere else; the in-flight number means it was actually answering the requests. You know, it means we are not overloading, we're not impacting...
B
You
know
the
vertipi
server.
You
know
in
a
bad
way.
It's
it's!
It's
just
safe
to
increase
to
this
number
that
that's
what
I'm
describing
here
and
you're
right.
We
can
improve.
You
know,
convert
based
on
this,
the
request,
if
you,
if
you
wrote
down,
I
replied
david
quite
very
in
the
end
yeah
yeah
here,
so
david
actually
asked
the
same
thing
that
you
mentioned.
B
What's
the
number
of
requests
per
vm,
more
or
less
like
that
and
okay,
I
don't
have
it
like
specific
per
vm,
especially
because
we
can,
as
you
mentioned,
so
I
didn't
remove
all
the
to
to
get
the
exactly
number
that
it's
been
doing.
B
Maybe
we
need
to
get
like
remove
all
the
you
know,
cars
per
second
restrictions,
and
then
it
will
do
the
maximum
that
group
vert
will
need,
and
then
we
can
understand,
but
I
think
we
are
close
to
the
to
the.
Maybe
we
are
close
to
the
limit
there
for
the
the
last
scenario.
B
In
any
case
it
was
able
to
create.
You
know
it's
requested.
You
know,
22
000
put
requests.
So
since
we
I
created,
you
know
more
or
less
one.
One
thousand
two
hundred
you
know
vms.
It
means
you
know
there
is
some
rounding
here
in
bermuda's,
so
but
it
will
be
approx
12.
You
know,
19,
put
requests
for
the
virtual
controller,
six,
poles,
four
packs
and
very
few
get
so
and
considering
that
when
we
create
a
vm
object,
it
create
the
vm.
You
know:
do
some
you
know
request
to
create
the
vmi.
B
It's
request
then
request
to
create
the
pods.
I
see
that
19
put
maybe
a
little
bit
high,
but
it's
still
fine.
I
don't
know
we
can
we
can
get
this.
You
know
impression
from
more
folks
about
that.
You
know
what
they
think
about.
I
I
don't
know
I
I
think
that
maybe
19
is
high,
but
it's
doing
a
lot
of
things
to
create
a
vm.
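As a quick sanity check on that per-VM figure, using the two totals he quotes (22,000 puts and roughly 1,200 VMs):

\[
\frac{22{,}000 \ \text{PUTs}}{1{,}200 \ \text{VMs}} \approx 18.3 \approx 19 \ \text{PUTs per VM}
\]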
B
We can maybe go over it again... so you had created that sequence diagram for creating a VMI, isn't it? Or...
B
Maybe it would be nice to, you know, expand it to the VM, and then we can, you know, try to figure out where the put, you know, and post... where the put requests are actually coming from.
A
Yeah
we
can
do,
we
could
do
nothing
yeah.
I
think
what's
interesting,
so
I
always
you
know
with
what
you
have
like
19,
that's
interesting.
I
like
it
does
like.
I
was
saying
like.
Maybe
it
is
a
sensible
number,
maybe
you
know
whatever
of
any
amount,
some
sensible
number
I
mean.
I
think,
like
that.
You
write
that,
like
you
know
your
graph
about
the
api
server
and
the
weights
be
able
to
handle
it
like
in
theory
right
we
could
always.
A
We
could
always
increase
api
servers
because
resources,
you
know
so
on
and
so
forth
and
eventually
will
be.
We
could
serve
the
load.
I
think
it.
I
think
it's
totally
possible.
It's
really
reasonable
yeah.
It's
just
sort
of
the
question
of
like
you
know.
What's
the
right
default,
I
think
that's!
You
know
one
important
question.
You
know
what
is
our
default
workload?
A
You
know
what
what
should
the
right
default
be
and
then,
in
the
case
of
like
your
example
like,
for
example,
sort
of
outside
of
like
what
the
average
person
is
doing
well,
we
need
to
then
document
it
like
we
need
to.
A
We
need
to
like
you
know
your
experiment,
highlights
the
importance
of
how
you
can
achieve
performance,
because
if
you
you're,
someone
comes
along
with
your
exact
use
case,
you
know
like
like
you're,
showing
here
they're,
not
going
to
achieve
nearly
the
amount
of
performance
that
they
should
be,
and
so
these
are
kind
of
this
is
that
whole
other
area,
where,
like
the
slo,
is
document
where
we
should
spend
a
bunch
of
time
finding
these
things
and
and
documenting
them.
So
like.
A
Like finding the defaults and documenting them. And then the third thing I'd say is, maybe we can, you know, reduce some of this stuff. I mean, I think the experiment we did a long time ago was trying to skip things in the work queue. Maybe the right thing to do, instead of skipping things in the work queue, is to skip, or not do, put requests immediately, or something; maybe just to combine them. Maybe it's to skip requests.
A
Maybe
that's
you
know,
sort
of
the
same
idea
just
you
know
just
a
little
nuance
like,
maybe
it's
so
instead
of
looking
at
it
as
like,
you
know,
you
know
skipping
steps,
let's
just
maybe
we
can
reduce
these
because
it
would
be
interesting
to
see
like
this
could
be
easy
to
tell
if
we
were
to
reduce
this
value
by
one
right.
This
value
should
go
down
significantly
in
your
experiment
and
we
should
see
pretty
quickly.
You
know
these
numbers
should
decline
like
and
we
can.
A
We
should
be
able
to
easily
measure
like
the
performance,
just
with
one
less
put
request.
So
it
would
be
interesting
experiment
because
of
how
impactful,
just
one
request,
or
even
two
or
three
could
be
on
like
you're
on
your
overall
performance.
So
that
would
be
really
interesting
to
see,
because
if
we,
because
I
mean
even
if
we
were
to
scrap
just
one
of
those
requests
it
would,
I
think
it
would
be
incredibly
valuable.
Probably
even
like
you
know,
20
to
30
qps
of
value,
just
by
you
know,
maybe
reducing
one
of
these.
B
So, for example, if we go to the scenario... the maximum scenario, okay. We have, like... I have different configurations here, but considering this one, just this one here:
B
So in the virt-controller, for example, the node controller... I was not expecting its work queue to be like that: it was 80 operations per second for the retry rate. So there is probably, definitely, a bug in the virt-controller's node controller; it shouldn't be retrying to process a key that much, isn't it?
A
I wonder if the tracing will pick this up, or, if it doesn't, whether there's something we can do to improve it, because it would really be interesting to see the tracing on this. Like we talked about earlier, the amount of time spent in the work queue: maybe it's the retries that are causing it. I hope that gets picked up by the tracing, but maybe it doesn't, and that would be interesting. That would be another thing to look at; it would be really interesting.
A
I guess if we don't see anything in the tracing, then it's all in this retry, and we probably aren't supporting retries in the tracing and need to add it, and I bet we could find something, because, yeah, that's really interesting. Though, like, I forget what happens in the retry.
A
I think if we fail during one of the steps, we just, you know, send ourselves back to the queue, and it's a rate-limiting queue, so whatever the default rate-limiting time is, we wait and try again. I mean, eight retries, though... that's... yeah, probably most of your time, maybe half that time, is waiting in the rate-limiting queue.
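For context, a minimal sketch of the standard client-go rate-limited work-queue pattern being described here: a failed key goes back on the queue with backoff, which is where both the waiting time and the retry-rate metric come from. The maxRetries budget is illustrative, not KubeVirt's actual value:

```go
// Sketch of the usual controller retry loop: failed keys are requeued with
// exponential backoff and only forgotten on success or when the retry
// budget runs out. A high workqueue retry rate (like the 80 ops/sec seen
// on the node controller) means this error path is being hit constantly.
package controller

import "k8s.io/client-go/util/workqueue"

const maxRetries = 5 // illustrative, not KubeVirt's setting

func processNextItem(queue workqueue.RateLimitingInterface, sync func(key string) error) bool {
	item, shutdown := queue.Get()
	if shutdown {
		return false
	}
	defer queue.Done(item)

	if err := sync(item.(string)); err != nil {
		if queue.NumRequeues(item) < maxRetries {
			queue.AddRateLimited(item) // wait out the backoff, then retry
			return true
		}
	}
	queue.Forget(item) // success, or budget exhausted: reset the backoff counter
	return true
}
```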
B
I think someone wrote some questions.
D
Oh yeah, that was me. Can you hear me? Yes? Yeah, okay. I was just piggybacking on Ryan's point, on the previous point, that is, the 19 put requests. Let's say the ideal number is not 19, right; say the ideal number is 15, or 10. How much performance improvement do we get if we bring that number down to 15 or 10? That was the question I had.
B
Yeah, I think, you know, it would improve the overall thing, especially the throughput we were seeing. For latency it's tricky; it's not easy to say, because there are many queues involved across different components, and also I don't know if the latency is coming from the put operations; it might be. But it's hard to say without, you know, testing.
D
And then another question I had is: when you were showing those numbers for the API server, right, that it has a total quota of 500 requests that it can process, but it is only going up to 40... So, sorry, I'm new to KubeVirt, but KubeVirt has its own aggregated API server, right?
D
I see, so this was against the actual kube API and not the virt API.
A
Okay, so let me close this point. So, Marcelo, here are the things I think we'll follow up on, if that makes sense. First: fine, let's find, you know, the right defaults, like a good balance.
B
I got some feedback, you know, and I defined... actually, I put two hundred and four hundred.
B
You know... well, it's not that it will be doing that amount of requests, but it's configured to support that amount of requests. So I'm saying it's not, like, a very insane value. And the other thing is, I checked there, you know, that it's not impacting the API server too much; we're still, like, within a safe margin. And, yeah, I forgot the third one...
B
The
other
point
that
I
was
going
to
say,
but
I
think
just
this
is
a
good
value
so,
based
on
the
experiments
that
I
did
and
yeah
okay,
someone
wants
to
say.
D
Yeah, sorry, you know, this is Alay again. So, in the past, whenever I... whenever our team had done these kinds of performance runs, the guidance I received from the API server folks was that 100 QPS and a thousand burst could be okay, and maybe 200 and a thousand burst is also okay; those are the defaults that we were running our controllers with. So I just wanted to give a data point.
B
Oh, this is good; it's more or less in line with what I'm suggesting here, so yeah.
A
Well, do you have any issues that we can point to, or, like, a mailing thread that we can point to, that talks about that? Just in case, as additional evidence for why we should go with a number like this.
A
Yeah, that makes sense. Okay, yeah, that'd be cool. So let's see, let's go to the second point: documenting QPS based on performance and scale requirements. I think this will be... we'll just need to... I think that's some of the stuff I have in that SLOs doc; we'll just need to refine that a little bit. I think eventually we'll find a place for this, I guess is the point; I think there's just going to be a place where, you know, this is configurable.
A
This
is
valuable
to
documents
just
because,
like
let's
just
say
you
only
run
vmis
like
you
know,
it
makes
sense
like
you
just
want
to
give
it
as
much
access
to
the
api
server
as
possible.
Like
you
know,
what's
you
know?
What
is
it
when?
When
should
you
expect
to
do
that?
I
think
those
are
like
good
questions
that
you
know
good
answers
that
we
can
provide
to
those
questions.
Yeah,
okay,.
B
Maybe, like, write a blog post on the KubeVirt blog or somewhere, and then...
A
Okay, yeah, that sounds cool. Okay, and then the last one: reduce the put requests. This would be an awesome experiment to do, just because of how... this is a small number, and if we reduce it by one, we should see some fast improvement, so that would be cool if there's a way we could do it. I think I saw in the chat, or in that PR, that the number of requests can be caused by conflicts, yeah. We've seen this with the number of put requests.
A
We've
seen
a
lot
of
conflicts.
We've
had
some
code
around
to
improve
this
over
time.
I
don't
know
marcelo
you're,
just
you're
dying.
Does
your
diagram
show
the
number
of
conflicts,
because
it
would
be
interesting
to
see
if
you
know
we
might
be
able
to.
A
...be able to pinpoint, like, when we do the put requests, whether that's why we're seeing... whether the puts are having conflicts, and that's maybe why we're getting retries, or why it's requiring 19 on average. Because, you know, it could be any of those things; it could also have to do with the work queue; it could obviously have to do with the number of requests, I think.
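If conflicts do turn out to inflate the PUT count, the usual client-go mitigation is to re-read and retry on 409 Conflict instead of re-issuing stale updates. A hedged sketch; the object and the mutation here are placeholders, not KubeVirt code:

```go
// Sketch: RetryOnConflict re-fetches the object and re-applies the change
// when the apiserver rejects an update with 409 Conflict (stale
// resourceVersion). Every conflict costs an extra GET+PUT, which is one way
// per-VM PUT counts climb above the happy-path number.
package controller

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

func markConfigMap(ctx context.Context, c kubernetes.Interface, ns, name string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		// Always start from the latest version to avoid repeated conflicts.
		cm, err := c.CoreV1().ConfigMaps(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if cm.Data == nil {
			cm.Data = map[string]string{}
		}
		cm.Data["touched"] = "true" // placeholder mutation
		_, err = c.CoreV1().ConfigMaps(ns).Update(ctx, cm, metav1.UpdateOptions{})
		return err // a Conflict error triggers another attempt
	})
}
```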
A
In any of your diagrams, do you have the HTTP error codes for the put requests?
B
I can comment there, you know. But anyway, if you discuss it, you know... if you agree with the numbers, things like that, it would be very valuable if you write it down.
A
Yeah, well, so this is what I'm saying: to me, this seems like a bug. Let me take what we've talked about here and I'll paste it into the PR as a comment, just so we have all the follow-ups. But I think, overall, to me it seems okay: 200/400 seems okay for balance. It just seems like we're too low now; those might be sane defaults, so I think I'd be on board with that.
B
I think it's a good start, because this is one thing that I said to David as well: you know, by increasing that, we're not hiding the problem; actually, we are highlighting the problems. Having a very low queries per second, that's where maybe we were hiding the problem; at the beginning we were not seeing what was actually limiting us, we were not understanding that. So it definitely relates to that.
A
Just, I mean, look how long this is, like how much time; you can see it right there, and then it gets thinner and thinner, and then when you look at these two, there are kind of diminishing returns in some way. I mean, it's good to see that it gets faster, but... I mean, look how much better of an improvement there is between just these two; I mean, it's like a fourth of the time.
A
You
can
just
see
it
right
there
I
mean
it's,
that's
passive,
so
yeah
that
yeah
there's
another.
This
is
a
really
good
illustration.
The
distance
between
this
line.
This
line
is
absolutely
massive.
I
mean
that's
such
as
that's
just
free
speed
that
we
can
get
with
a
simple
improvement
that
just
doesn't
start.
A
Yeah, and the third one is... we have 200/400, yeah, which, I mean, there's another jump in here that halves it again, and so, yeah, okay. Yeah, I mean, I'll write a comment on there; I think it makes sense to me. Okay, let's go to the next one, so we have enough time to get through these. So you added VMI migration phase transition times; this looks good, exactly.
A
Okay, did you find any... what did you find in here? Anything interesting in the results? Anything stick out to you?
B
Yeah, this is the latency; I think it might be in seconds. So, you know... the time that a migration takes, the whole time, depends, you know, on the size of the VM, but what we can see here is preparing the target.
B
It took like 36 minutes to prepare the target, which means...
B
No, no... seconds, I'm sorry, oh yeah, okay, it's 30 seconds, more or less; 36 seconds here, or a little bit more than that. I think this experiment was migrating 100 VMs from one node, and I had, like, a very high configuration: I could migrate in parallel, having 20 parallel migrations. And, yeah, so I don't have, like, a big conclusion on that yet, but what I'm saying here is, maybe, you know...
B
I
need
to
get
the
pod,
maybe
creation
time
latency
that
might
be
to
about
20
seconds
or
10
seconds,
actually
but
creation
time
and
then
20
seconds
to
prepare
the
pod.
But
maybe
it's
too
much
isn't
it.
I
don't
know.
I
don't
know
it's
just
like
a
few
seconds,
so
I
think
we're
fine
migrating.
A
I
I'm
so
I'm
not
I'm
not
not
familiar
with
how
the
like
what
the
expected
times
are.
So
it's
hard
for
me
to
comment,
but
if
you
I
mean,
maybe
you
have
probably
a
better
idea
than
I
do,
but
it
would
also
be
interesting
to
hear
from
like
I
I
don't
do
you
know
this.
It
would
be
interesting
to
hear
to
show
them
this.
You
know
they
can
see
if
this
meets
their
expectation.
I
mean
that
would
just
as
another
data
point.
A
I
don't
know
who
did
it,
but
I
don't
know
it
would
be
interesting.
I
mean
I
I'm
not
like
yeah
I
mean
it
would
be
interesting
to
see
just
because
I
mean
they've,
probably
maybe
they've
never
seen
this.
I
mean
they've
probably
done
some
migrations,
but
maybe
they've
never
seen
it
like
the
way
you're
doing
it
like
with
100
of
them.
B
Yeah, so KubeVirt has some migration metrics, you know, but they just count things, like, you know, how many VMs were migrated, things like that. Maybe it's also possible to create a gauge to see how many are being migrated at the same time, but maybe...
A
Maybe... maybe we should start a mailing-list thread on this, because we can kind of get some feedback from the people who, you know, have certain expectations around this, just to get a better idea. I mean, I don't know... like, preparing the target, 30...
A
To do: mailing list on performance... see, okay. Next one, let's go to...
A
Yeah, I think what made me realize... like, there's maybe another way to do this, like there may be some improvements that we can make on the current structure. So, yeah, I mean, we can do this in another one, I think, yeah.
A
Yeah, okay, yeah. I think, overall, when I saw this, it was fine; like, I think it's fine to proceed, and so I'll put my plus one on there. We can get this out and we'll do a refactor after, something like that. I think what we should do, maybe as a follow-up, is have a discussion here in terms of how we want some of these classes to look, because I have an idea of what they should be, but, like...
B
Yeah, I think there are different ways to implement that. For example, instead of having, like, a burst job and a steady-state job, we could have only one, but inside we could have, like, you know, actions. So it means we create, we delete the object, it waits between the deletions, and then we generate, like, you know, the steady state; and then it's only one kind of job, but it depends on how we configure the actions. But anyway, we can discuss that later, and if you can...
A
Yeah, I'll give you a plus one. I think I'm okay with what's there, and we can do it as a follow-up. I think what I'll do is book some time in this meeting next week and we can discuss, do just an overview of it, and kind of...
A
...do a little bit of design, see how we can properly structure this, just to make sure it's very clear what our interfaces are.
A
Okay, all right, the last one is... so, just looking at the performance job results. So I made a change: last week we talked about how one of the VMIs was stuck and didn't have enough memory, so I increased it, and that merged. I think I did it correctly... like, I mean, I did this, but I didn't increase the KubeVirt-allocated memory, right; I adjusted the...
B
It's actually interesting, and it might be an alarm for us, because that's the kind of thing that we want to see, isn't it? It means that the memory footprint increased; the overhead, the VMI overhead, increased.
A
Yeah, there was... I think it was pointed out, the issue where the minimum amount of VMI memory was increased, like the buffer just so that the launcher processes can run; that was increased, and we're starting to feel that effect. But I'm actually surprised by how much: we're only launching 100, and we've already had to increase to almost, like, 10 gigs now, and so it's a little...
A
Much
it's
it's
so
yeah.
It
doesn't
really
add
up.
Actually,
so
it's
a
bizarre,
but
I
so
anyway,
that's
what
I
did
because
it
should
should
have
fixed
this.
So
I
am
still
seeing
a
failure
here.
So
what
was
the
date
of
this?
I
just
I'm
sure
again
correct.
C
You
need
to
actually
increase
the
keyboard
memory,
because
I
think
that
that's
the
indicator
on
how
much
the
vm
should
have
by
the
way
the
memory
you
increased
should
be
only
for
the
job,
so
it
will
not,
for
example,
crash
on
the
cluster
and
so
on
and
inside
the
job.
The
the
vms
are
created.
A
Sorry
wait
which
memory
do
I
need
to
increase
it's
it's.
Not
this
you're
saying.
B
A
C
The
pod
we
are
creating
the
vms,
and
so
the
keyboard
memory
size
is
the
indicator
of
how
much
memory
each
vm
should
get
so
right
now
you
should
see
four
vms
with
10
gig,
and
so,
if
you
want
to
increase
the
memory
for
the
vm,
because
you
cannot
settle
the
vms
that
like
kuberiums
there,
you
need
to
increase
the
parameter
yeah.
Also.
C
A
Okay,
got
it
okay,
so
that's
that's
probably
why
I'm
getting
some
failures
here?
Okay,
so
this
probably
needs
to
be
12
or
something
then,
okay,
that
makes
sense.
I
was
getting
suspicious
as
to
like.
I
didn't
really
do
the
right
thing
here:
okay,
so
that
would
explain
this.
Okay,
let's
probably
explain
all
these
I
think
like
I,
I
didn't
go
look
through
them,
but
I
have
a
feeling
that
I'm
gonna
find
that
there's
there's
still
an
room
in
here
somewhere
like
it's.
A
Not
it's
not
gonna
work
for
what
to
have
okay,
so
it
all
increases
to
12,
and
I
think
that
that
should
that
should
fix
this
okay,
I
think
so
we've
done,
I
think
we
were
originally
eight
is
what
we
were.
Let
me
check
the
blame.
I
mean
I
think
we've
gone
up
almost
so
it
would
be
like
four
gigs
by
the
end
of
this.
I
think,
let's
see
what
it
was.
A
Well, I think it's... I mean, Lubo, I think you were the one who pointed this out to me, that it had to do with...
C
Yes
correct:
we
had
a
pr
that
recalculated
the
overhead
which
is
needed
for
the
pot
for
the
build
long.
This
one.
A
Yeah, I was just wondering where we started at, because... oh, okay, so it was nine, okay. So we went nine to ten, and then we're going to go from 10 to 12, okay. So we're going to go up three gigs just to deal with this issue. Does that make sense? Because I've heard 30... 30 megs per VMI, and that doesn't quite add up.
C
So if you increase it by one, then you actually have four more gigabytes of memory, because you have four VMs. So in the cluster you will have four gigabytes more memory.
A
Yeah, like, there was one out of 100 that wasn't... so, I mean, I guess maybe it should be 11. I mean, so that, and now we're raising it... but, I mean, four-ish should have been enough, right? Because it should have been roughly three gigs, since it was about 30 megs per VM.
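For reference, the overhead arithmetic being sanity-checked here (assuming the ~30 MiB-per-VMI overhead figure quoted in this exchange):

\[
100 \ \text{VMIs} \times 30 \ \text{MiB} \approx 3 \ \text{GiB}
\]

so the per-VMI overhead increase alone accounts for roughly 3 GiB across the test, which is the scale of the 9 to 12 GiB adjustment being discussed.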