From YouTube: SIG - Performance and scale 2022-01-13
Description
Meeting Notes: https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.yg3v8z8nkdcg
A: Can everyone see the document? I assume you can see the screen.
A: Excuse me. Okay, let's start.
A: So we have three topics for today. First I want to talk about the density func test and trying to get some more results. Really the only thing I want to cover there is that I've looked this over a bunch, just trying to get some clarity on what we should be expecting.
A: The density test right now runs in the periodic job, and we've seen all sorts of different results with the create pod count: we expect 100, but we see all sorts of different numbers. And this is a change I'm making locally — actually, this is from a test I'm running with 20 VMs — I just made the change to run the job in the func tests.
A: I don't know if it's going to work, but my hope is that if it eliminates part of the possible timing issue, we can get a little bit closer to an answer, because it's still not clear to me what the problem is. When I do local testing, everything here looks great, which has probably been the same for everyone running this locally.
A: It seems like it works fine locally, but based on some timing I would see different results, so maybe this will make a difference in the periodic job. If it doesn't, then where I want to turn next is how we're doing the reporting in Prometheus. I want to see if we can... maybe it's...
C: I have one question for you. Running this locally — the most obvious metric here is the pod creation count, since we know exactly what it should be, and 19-point-whatever is really accurate with respect to the expected 20. So when you ran this locally, did you always see pretty close to 20 pod creates?
C: That's just great — that's really encouraging, actually, because I was beginning to doubt this tool entirely. The fact that you're able to get it to work consistently, at least locally, makes me have a little more faith that what we're trying to do is possible.
A: Yeah, well, I had all different ideas about this thing. Because I was seeing it work, I thought, maybe I'm not running enough VMs — maybe it falls apart after 50 or something, I don't know. I tried a few different things: I tried five, it worked fine; ten too; I went to 20, and then I was like, okay.
B: Can we go through my comments? I just posted them. Okay, so I did some tests, and there are actually two different phases. One phase is when I run the test with KubeVirt already installed. The second phase is when I install KubeVirt and then run the test — meaning I run make cluster-sync, which will, you know, uninstall KubeVirt and install it again.
A: I ask because I just don't see anything here — there are these flat lines. This is, like you're saying...
A: You did a fresh install? This is when you installed KubeVirt, and this is when it was synced?
B: No, sorry — this is just an update on an experiment I'm currently running. What I'm doing here: KubeVirt has been installed for a while. I install it, and after many minutes I run the test — and actually I run the test multiple times against the one install. Then I run the test while playing with some sleeps.
B: So I wanted to double-check whether the problem we're seeing is actually with the tool we use to collect the metrics, or with the metrics themselves in Prometheus. That was my question, and it turns out that the problem is with the metric.
B: What's showing here: considering the timestamps, I put a sleep of 60 seconds before running the test and then 120 seconds after running the test. The timestamp we see here, 13:06:38, is the one I collected before these sleeps, and then there's the ending timestamp after all these sleeps. Then I ran all of this to collect the data from Prometheus, and we can see here the delete event, for example.
B: Maybe we can go to my comments, before and after the figure. The delete event appears at 45 — sorry, 55 seconds — and, even actually waiting two minutes after the test finished, the timestamp was 45. So the audit tool, as expected, didn't collect the delete event, because the delete event appears more than two minutes after the test.
B: You know, after the test finished — which is way too long, isn't it? But it appeared there, and again, sometimes it appears earlier, sometimes later; but I ran it many times and I never saw it appear more than three minutes after the experiment ran.
B: No, things aren't slow — what's happening is that Prometheus is not collecting metrics. And maybe I know what's happening: KubeVirt creates the ServiceMonitor when it's deployed, doesn't it? It checks the name and the parameters — the monitoring namespace — creates a ServiceMonitor, and then Prometheus starts to collect metrics. So maybe this is it.
B: That's the point — or maybe what I'm saying doesn't make sense, I don't know.
C: What you're getting at is interesting. So the thought is that when we first create the cluster, maybe Prometheus takes a while to begin actually collecting. Okay, so there's a gap: we create a ServiceMonitor, or whatever it is that tells Prometheus to scrape from this endpoint, but Prometheus might begin scraping late, because maybe it hasn't fully initialized or something like that, and we're not checking that condition before we start the test.
B: Exactly. So I put in a sleep of 120 seconds before running the test, and it started to collect a little bit more metrics when I ran it just after that, but it's still missing a lot of metrics. Right now I'm testing waiting 360 seconds — six minutes — before running the test. It's a lot, and it's collecting more metrics, but it's still missing some metrics.
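The sleep experiment above amounts to padding the query window on both sides of the test run. Making that padding explicit when pulling the data back out could look roughly like this sketch — the Prometheus URL, the metric name, and the default pad values are illustrative assumptions, not what the audit tool actually does:

```python
from urllib.parse import urlencode

def padded_window(test_start, test_end, pad_before=60, pad_after=120):
    """Widen the collection window so samples that Prometheus only
    scrapes shortly before/after the test still fall inside the range."""
    return test_start - pad_before, test_end + pad_after

def range_query_url(base_url, promql, start, end, step="30s"):
    """Build a Prometheus HTTP API range-query URL over the padded window."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"

# Example: query DELETE requests around a test that ran from t=1000 to t=2000.
start, end = padded_window(1000, 2000)
url = range_query_url("http://localhost:9090",
                      'apiserver_request_total{verb="DELETE"}', start, end)
```

With the defaults this queries from 60 seconds before the test start to 120 seconds after the end, matching the sleeps described above.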
C: Would that potentially be enough time for everything to catch up? I think that's total overkill, especially the padding after the test, but if we pad enough — did you test that? Maybe you were about to get to it.
B: Yeah, I think it's worth testing that, so I will try.
A: When you say you get more metrics, do you mean you're seeing more of these get populated?
A: Okay, yeah, I get that. All right, so there just haven't been many events at this point, right? I think that's our interpretation: we haven't seen many, and then here we clearly have some — we have 60-something here, 1700 for whatever this is — and you've already seen some other events come through.
B: Yeah, sorry, maybe just to clarify: my graph was actually misleading. In the previous one in Prometheus, if you click on the name, you filter out the other series; but this one also has many more values than the other. It doesn't show the numbers here, but it has a lot of values, and you can see the color change because it's just filtered out. They are there — not showing, but there.
A: Yeah, I like that line of thinking, Marcelo. To me, the only other explanation, if it's not timing, is something to do with the metrics themselves. The other thing I noticed kind of gets at this as well.
A: I combed through like 50 different tests yesterday for a few hours, just comparing what's going on, and on occasion I would see the delete verbs — on rare occasions, maybe one out of ten or more would have the delete requests — but then...
A: Yeah, so in the periodic job — you know, the output results like this — about one out of ten times I would see the delete pod counts, and they would actually be different. Sometimes greater: I think one of them I saw was greater than the create pod count, and I was like, what?
A
Led
me
down,
this
path
was
because
I
was
like
I
I
don't
know
what's
going
on
here.
I
don't
get
it
here,
obviously,
because
I
don't
clean
up,
but
when
I
was
doing
when
I
had
when
I
run
the
test
locally
before
this
change,
I
would
see
the
delete
show
up
pretty
much
every
time,
so
I
was
like
okay.
Maybe
I
can
try
to
see
if
it's,
if
it's
something
to
do
with
that,
you
know
for
just
catching
kind
of
somewhere
in
the
deletion
process
or
somewhere,
but
whatever
it
is.
It's
like
the
timing.
B: Okay, just to double-check: every time I redeploy KubeVirt, I don't get the metrics if I run the test right after. But if I leave it — I don't know how long; I didn't measure that — if I leave it for a while and then run the test, I see the metrics. So it sometimes takes a while to bring up this whole metrics-scraping process.
B
I
don't
know,
and
the
thing
that
you
mentioned
about
you
see
more
deletions
events.
I
totally
understand
that
because
I
think
I
commented
for
you
before
this.
The
metric
is
collecting.
So
it's
summing
all
the
events,
even
though
the
one
that's
failed,
so
you
can.
We
can
filter
that
for
the
code
like
200
codes
of
the
metric,
but
it
might
be
that
it's
it's
yeah,
504
or
whatever
you
know
code
that
it's
returning
the
deletion.
B: So maybe it failed, and then you see more events there.
C: I'd want to know that, though — that's part of it, knowing how many API calls were made. Maybe I didn't completely understand, but as far as the audit tool that's creating the report goes, I'd like all API calls, whether they return 200 or 404 or whatever, to be captured.
A: I really want to try exactly what you did here. I'm going to launch my test and find when it starts to be tracked in Prometheus, because I want to see what you're seeing with the timing — how it shows up. I think that's an interesting avenue.
A
I'm
gonna
try!
I'm
gonna
try
to
even
you
know
with
and
without
this
change,
just
to
see,
if
there's
any
like
difference
or
anything
there
and
just
to
see
what
let's
see
where
this
takes
us
yeah.
B: It's definitely interesting to understand, because we're doing performance tests: if we lose a lot of metrics just by running the test right after deploying KubeVirt, that's something that needs to be taken into account in all the tests. For functional tests maybe it's not a problem, but it is for performance tests where we're collecting Prometheus metrics.
A
Oh
yeah
definitely
yeah,
we
need
it
all.
I
would
yeah
I
mean
I
think
yeah.
We
definitely
need
it
all.
I
think
it's
just
surprising
that
it's
not
like
it.
It
makes
logical
sense
kind
of
what
we're
doing.
I
just
I
don't
know
we're
missing
we're
just
missing
one
little
piece
of
information
here,
so
yeah,
let's
try
and
see.
If
we
can.
B
If
we
can
identify
actually,
what
it's
is
you
know
is
low.
In
start,
we
would,
you
know
we
could
watch
for
that
before
run
the
test,
but.
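Under the ServiceMonitor theory discussed earlier, one concrete condition to watch would be Prometheus's own view of its scrape targets: block until the target reports healthy instead of sleeping a fixed number of seconds. A rough sketch against Prometheus's `/api/v1/targets` endpoint — the `job` label value and the Prometheus URL are assumptions about the deployment, not verified:

```python
import json
import time
import urllib.request

def scrape_target_ready(targets_payload, job):
    """Return True once the targets payload lists at least one active
    target for `job` whose health is reported as "up"."""
    for target in targets_payload.get("data", {}).get("activeTargets", []):
        if target.get("labels", {}).get("job") == job and target.get("health") == "up":
            return True
    return False

def wait_for_target(prom_url, job, timeout=360, interval=5):
    """Poll GET /api/v1/targets until the job's target is up, instead of
    sleeping a fixed number of seconds before starting the test."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        with urllib.request.urlopen(f"{prom_url}/api/v1/targets") as resp:
            if scrape_target_ready(json.load(resp), job):
                return True
        time.sleep(interval)
    return False
```

For example, `wait_for_target("http://localhost:9090", "kubevirt")` would return as soon as Prometheus starts scraping that endpoint, or False after the timeout.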
A: Okay, all right, I think that makes sense. This at least gives me a path forward on what to do here, and Marcelo, it sounds like you're also running some tests there, so that's good. Okay. For the second item for today, I wanted to take a few minutes to talk about this: we brainstormed some ideas before the break, like defining tests, and I kind of wanted to get...
A
You
know
when
our
overall
picture
is,
I
kind
of
sketch
a
little
bit
more
of
a
design
here,
because
we've
talked
about
a
bunch
of
different
tools
like
right.
We
have
our.
We
have
our
audit
tool,
we
have
our.
You
know,
load
generation
tool
like
we
have
these
two
things
right
like
we
need
to
like.
A
I
want
to
get
some
more
clarity
on
like
how
we
are
going
to
use
these
things
and
like
our
especially
the
load
generation
tool,
how
we're
going
to
how
we're
going
to
do
tests
like
like
this.
You
know
and
like
what's
our
path
forward.
B: There are different paths — okay, I'm actually working on that. Internally I extended kube-burner, which is actually a load generator, as the base of our load generator 2 — load generator 2 is inspired by the Kubernetes one, I would say — and I actually extended kube-burner to create VMIs and collect metrics.
B: It's doing more than maybe the CI should do, especially the metrics I'm collecting for my local tests, for performance evaluation. But I definitely want to extend kube-burner to have the steady-state test; it already covers burst tests. That's why all of this discussion came up: I've been thinking about it for a while, and I really want to do those tests.
B
You
know
we
maybe
can
highlight
very
quickly
here
again
about
the
difference
about
this
burst
test
and
steady
state
test
again.
So
I
I
would
maybe
stick
with
burst
test
instead
of
batch,
because
it's
the
what
kubernetes
calling
those
kind
of
tests
okay
again,
so
I'm
not
coming
up
with
those
ideas.
You
know
out
of
the
blue.
So
it's
something
that
kubernetes
scalability
group
is
also
defined.
Okay,
and
we
just
want
to
okay
bring
that
concept
for
our
context
as
well.
B
So
the
burst
test
is:
is
you
create
like
a
number
of
vms
and
waiting
for
them
to
be
created
and,
and
then
there
was
just
only
that
okay,
so
this
is
the
use
case
for
burst.
Test
is,
for
example,
failure
recover.
B
So
you
know
many
nodes
went
down
and
then
they
come
back
again
and
then
a
lot
of
vms
will
try
to
be
created.
A
user
want
to
create
many
vms
and
also
even
the
pool
vm
pool
that
david
was
implementing
it's.
It
will
be
something
like
that,
so
you
have.
We
have
like
a
burst
of
vms
being
created
in
a
timestamp,
and
so
the
definition
that
kubernetes
gave
for
that
is
the
birth.
So
I
try
to
find
more
reference
for
that
for
books,
papers
and
everything
it's
be
very
hard
to
find.
B
So,
apart
from
the
kubernetes
definition
anyway,
I'm
still
looking
for
more
reference
for
these
two
kind
of
tests.
Okay,
I
know
that
the
people
are
doing
that
for
a
while,
but
it's
hard
to
find
reference
anyway.
So
and
then
it's
it's
for
what
they
call,
no,
not
normal
situations.
B
It's
for
you
know,
suddenly
increase
traffic
in
the
cluster,
so
that's
what's
happening
with
diversity,
it's
idle
and
then
suddenly
we
create
1000,
10,
000,
vms
and
the
system
should
you
know,
cope
with
that
in
a
reasonable
time
and
also
what
they
say
that
this
kind
of
purse
tests
we
should
measure
the
total
time
of
creating
the
vmis
okay,
which
actually
I'm
not
doing
any.
B: I've been analyzing this data more as a steady-state test, which I'll describe now. Anyway, for the burst test the total time to create the VMIs should be the more important number — the burst test is more about analyzing the throughput, isn't it? Something like that.
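For reference, the burst measurement described here reduces to a small calculation once per-VMI ready timestamps are recorded: total time is the spread of the timestamps, and throughput is count over that spread. A minimal sketch — where the timestamps come from (e.g. VMI phase transitions) is left open:

```python
def burst_stats(ready_timestamps):
    """Total time (seconds) to bring up a burst of VMIs, and the
    effective throughput (VMIs per second) over that window."""
    if not ready_timestamps:
        return 0.0, 0.0
    total = max(ready_timestamps) - min(ready_timestamps)
    throughput = len(ready_timestamps) / total if total > 0 else 0.0
    return total, throughput

# Example: four VMIs became ready at t=0, 10, 20, 50 seconds.
total, rate = burst_stats([0, 10, 20, 50])  # total=50, rate=0.08
```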
B: The steady-state test, then, is what they call analyzing normal situations, which means the system is working properly and we introduce some constant load into it — which is what we'd expect, for example, in a public cloud.
B: Isn't it? Many users creating and deleting resources there. So the steady-state test has three phases. The first is the ramp-up phase: we create a number of objects, and when the test reaches that number — for example, we want 10,000 VMs in the cluster, and it reaches the 10,000 VMs — then it starts the steady phase. In the steady phase it will cycle the VMs.
B: What the cycle means is that we'll have VMs being deleted, maybe updated, and recreated — Kubernetes defines this as churn. In their tests they defined a churn of 20 pods per second, which is maybe, you know, a good random number — I don't know how they came up with 20 — but for us it's 20 VMIs per second as well. So we delete, update, and recreate these VMIs, and we should sustain a churn of 20 VMIs per second for a constant time, which should show some stability of the system, isn't it? And then we have the ramp-down phase, which just deletes everything.
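The churn phase can be paced by spacing operations evenly at the target rate. A minimal sketch of sustaining 20 VMIs per second — the delete/recreate callbacks are placeholders for whatever client the test would actually use:

```python
import time

def churn_schedule(n_ops, rate_per_sec=20.0, start=0.0):
    """Evenly spaced submission times that sustain the target churn rate."""
    interval = 1.0 / rate_per_sec
    return [start + i * interval for i in range(n_ops)]

def run_churn(vmi_names, delete_fn, recreate_fn, rate_per_sec=20.0):
    """Cycle VMIs at a fixed rate: delete then recreate each one, sleeping
    between submissions so the churn stays at rate_per_sec."""
    interval = 1.0 / rate_per_sec
    for name in vmi_names:
        delete_fn(name)    # placeholder: e.g. delete the VMI via the API
        recreate_fn(name)  # placeholder: e.g. re-apply the VMI manifest
        time.sleep(interval)
```

At 20 per second the interval is 50 ms between submissions; a real implementation would also need to account for the time the calls themselves take.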
B: And then the latencies — API latency, VMI creation latency — those things should actually be collected during the steady-state test, while the burst test is more of a dramatic test. For example, I just did a test trying to create a burst of 30,000 VMs, and I actually broke the cluster, because I was creating them at a rate of 20 VMs per second.
B: I know that thirty thousand is maybe too many, but it's just interesting to see, isn't it? If we want to test the system with more VMs, it's the steady-state test that will give us more feasible tests — ones we can run without breaking the system — while also collecting the metrics, especially following the way Kubernetes is doing it. Okay, yeah, those are my comments.
C: These are great. The burst test is something really interesting; like Marcelo was saying, I haven't seen a lot of people really talk about it in the Kubernetes ecosystem, and it's something that matters a lot. When I look at infrastructure as a service in the cloud: there were times when I worked at a previous company that we would have to burst enormous numbers of VMs to reproduce a QE sort of test.
C: So there are use cases for this sort of thing, and I think we should at least begin tracking how we perform under these scenarios, because improving the burst scenario improves the efficiency of our control plane, which then improves everything else. It's a great test. Same with the steady-state test as well — just a continuous churn.
A: So Marcelo, we have our general idea of these tests now. Maybe talk through the directions a little bit: you're looking at doing this with kube-burner — this is kind of what we have now, but you're already exploring kube-burner. Is that where you want to go with this, or where you think we should go?
B: Well, the point is, if we have control of the tool, we can do whatever we want. For example, kube-burner actually doesn't use watch for waiting on resources — it's doing lists.
B
Maybe
I
want
to
suggest
that
in
the
future
to
change
that
there,
but
I
don't
want
to
send
too
many-
you
know
like
before
I
sent
a
pr
a
huge
pr
with
a
lot
of
change
for
them,
so
maybe
slowly
we
can
improve
uber,
but
we
can
even
have
like
you
know
more
the
way
that
we
want
the
tool
in
our
convert
code,
so
I
think
both
work,
so
we
see
a
lot
of
just
you
know
things
on
kubernetes
also
they
are
always
changing
all
their
tools.
C: I got no response — I messaged the primary contributor directly, but then, like six months after I messaged him, he messaged us asking why we didn't use it. So I was concerned that maybe there were some communication issues, and that it might be a difficult project to contribute our custom logic to regularly. I could be totally wrong — maybe they're super responsive, and that would be great, because there's a lot of overlap.
C: It's just that we're going to need a high level of flexibility to make changes and not be encumbered by existing constraints — we need to do what we need to do. That's the best way I can say it.
C: Test the waters there, and if we see that there's great communication, great feedback and acceptance of what we need, and good direction as well, then I think kube-burner would be a great area to invest in. If there's any sort of friction or difficulty getting your stuff in, it's just going to slow us down. We're just talking about testing infrastructure — stuff that really isn't that complicated to do ourselves as well.
A: Yeah, Marcelo, that makes sense. If they've got a community that's supporting the project, then we can get changes in — and why not? We've got a good use case, so perhaps they'll be open to it.
A
Is
that
where
you
want
to
go
with
this
marcel,
you
want
to
try
and
submit
a
change
and
then
see
how
they
respond,
and
then
maybe
we
decide
based
on
how
that
goes.
B: The way kube-burner works now, we can collect a bunch of metrics from Prometheus, and it can push all these metrics to an Elasticsearch cluster — I'm actually doing that now for my tests. This approach has some drawbacks and advantages. The advantage is that we control the amount of data we store, and it's easier. For example, right now...
B
We
cannot
keep
like
too
much
data
in
prometheus
because
permeating
will
explode,
especially,
for
example,
let's,
let's
assume
that
we
have
like
six
months
running
the
experiments
in
the
performance
cluster,
for
example,
and
or
whatever
you
know
in
experience
that
it
we
are
going
to
do
it's
hard
to
get
to
store
all
this
data,
but
with
coolburn
it's
we
just
you
know,
collect
the
metrics
that
we
want
and
put
in
elasticsearch,
and
also
it
puts
some
index,
for
example,
the
job
name,
the
experiment
name
that
it's
easier
to
search.
B
You
know
for
that
in
the
grafana
later,
and
it's
also
possible
to
do
that
with
prometheus
roman
was
mentioned
about
that,
but
it's
it's
requires
some
hacking.
Like
a
restarting
prometeurs,
you
know
to
introduce
some
new
labels
of
the
job
name
in
the
inside
the
metric.
So
but
with
coobern,
it's
easily
add
a
new
label
with
the
name,
then
the
name
of
the
experiment,
and
then
it's
might
be
easier
to
you
know
to
to
see
different
experiments
in
the
in
aggregate.
So
I
already
created
also
this
graphing
dashboard
for
last
search.
A: Yeah, Marcelo, if you see features in kube-burner that we can leverage, I don't see why not. Like Dave was saying, it's really just a question of whether it makes sense for us to leverage the community — not just on the technical side, but in terms of whether we're able to avoid getting blocked by anything.
B: That might be an issue, a problem. We need to see that, yeah.
A: Yeah, and you already have an idea — like you said, using watch instead of a list — and we have a great case for why we shouldn't be using lists. Hopefully they'll be responsive to it. Maybe it's just time we reach out and talk about it again and see what they say.
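To make the watch-versus-list point concrete: with a watch, the client reacts to individual change events instead of repeatedly LISTing every resource to poll for readiness. A sketch of the consuming side with a synthetic event stream — the event shape mirrors a Kubernetes watch response; in a real test the stream would come from the client's watch API:

```python
def wait_until_running(events, target_count):
    """Consume a stream of watch events and return the set of VMI names
    once target_count of them have reached the Running phase.  No
    repeated full LIST calls are needed: each event carries one change."""
    running = set()
    for ev in events:
        obj = ev["object"]
        if obj.get("status", {}).get("phase") == "Running":
            running.add(obj["metadata"]["name"])
            if len(running) >= target_count:
                break
    return running

# Example with a synthetic event stream:
events = [
    {"type": "MODIFIED", "object": {"metadata": {"name": "vmi-1"},
                                    "status": {"phase": "Scheduling"}}},
    {"type": "MODIFIED", "object": {"metadata": {"name": "vmi-1"},
                                    "status": {"phase": "Running"}}},
    {"type": "MODIFIED", "object": {"metadata": {"name": "vmi-2"},
                                    "status": {"phase": "Running"}}},
]
ready = wait_until_running(events, 2)
```

Polling with lists instead would re-fetch every object on each iteration, which is exactly the load-on-the-apiserver argument for the change.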
A: Okay, all right. Well, I guess we'll leave it at this, then: Marcelo, we'll see where you can take this and how responsive they are. And on the test side, I think we have good coverage of this. What I do want to do, though, is write a README for it.
A
I
think
that
would
be
good,
maybe
like
yeah
like
just
to
get
a
sense
of
just
to
record
like
because,
if
anyone
outside
of
just
us,
you
know
having
talked
about
this
reads
about
these
tests,
I
just
wanted
to
be
clear
kind
of
what
to
expect
why
we're
testing
and
then
I
think,
a
bunch
of
things
can
evolve
out
of
this,
like
I,
I
think
we're
gonna.
We
could
have
the
ability
to
get
some
some
slos
out
of
this.
Eventually,
I
think
kubernetes
like
have
their
own
sll
readme.
A
They
maintain,
I
mean
I
could
see
we
could
get
a
bunch
of
them,
especially
out
of
here,
like
you
know,
with
churn
rate,
for
example,
like
you
know,
I
think
we
can
gather
a
bunch
of
interesting
things
or
even
just
burst
rate,
maybe
that
the
control
plane
could
handle
or
something
we're
gonna
find
some
information
that
we
could
write
about.
So
I
mean
I
I
what
I
want
to
go
with.
A
This
is
I'll
write,
a
readme
for
this,
just
with
some
more
details
and
then
something
that
we
can
kind
of
build
on
over
time
with
based
on
what
we
observe
at
this,
when
we
do
this.
A: Yeah, mainly — Marcelo, I can look at that. I mainly just want to have a few general details, and then we can iterate on it. If you see things I don't have that you want to add, we'll build out the README together. Okay, cool. All right, that's mainly what I wanted to cover with this. And then I think you have an issue in Git tracking what you're looking at with the load generator? I think so.
A
But
I
don't
know
I
can
find
it
after.
Okay,
let's
move
on
to
the
last
bullets,
it's
kuvert
summit,
the
talks
submission
is
open
for
the
next.
A: ...I don't know, I think it's through next week. I wanted to get your sense of things we could talk about. I was thinking of a session about SIG-scale — just kind of the things we've done. This whole document is filled with different things we've already accomplished: bugs, the tools we've written, all sorts of things we have in here.
A
That
I
think
are
would
be
interesting
to
the
community.
That's
what
I
was
thinking,
at
least
for
one
talk,
but
I
mean
you
know:
what
do
you
guys
think?
Does
that
make
sense?
Or
you
know
what
else
could
we
do.
C: Talk about our method — just telling the story of the SIG, the types of things we're interested in, and the types of things we're beginning to measure and want to measure. That's all pretty interesting.
A: Yeah, and so I was going to submit this one. If you guys want to be a part of it, or want to talk about specific topics, feel free — I'd be happy to talk with you. Whoever wants to join it, that's fine with me.
A: I think what I'm going to do — I'm not sure yet how long it will be or exactly what I want to do with it, based on topics — but we could also cover features here as well, like the VM pools for one. That could also be its own topic. I don't know if you were thinking of submitting that, Dave, or if you want to do it here — what do you think?
B: Yeah, I was planning maybe to present the kube-burner extension that I did, and show a use case — how to use it.
B: Okay, so yeah, if it's possible I can also join — maybe just for the SIG-scale part.
A: Cool, okay, that sounds good. And then, David, I'm going to add you guys as the attendees up here. We had some people here earlier — I don't know where they went; they disappeared.
A
They
didn't
have
themselves
all
right.
I
think
we're
good
guys
see
you
online
thanks.
All
right
have
a
good
day.