From YouTube: Cloud Native Live: Building Stability
A: Welcome to Cloud Native Live, where we dive into the code behind cloud native. I'm Annie Talvasto, I'm a CNCF ambassador as well as a senior product marketing manager at Camunda, and I will be your host tonight. Very excited to have everyone join today.

A: So this week we have Andy Suderman from Fairwinds to talk about building stability. But before we get to the topic of today, another exciting thing that's happening in the CNCF universe at the moment is the KubeCon Europe co-located events. CFPs are closing soon, so if you have any talk ideas, or need to come up with some, you can go ahead and submit them now; soon will be too late. And, as always, this is an official live stream of the CNCF, and as such it is subject to the CNCF code of conduct.
B: Great, thank you. So today I wanted to talk about resource requests and limits, and those who know me are familiar with some of the things that we do at Fairwinds. Resource requests and limits are kind of a pet peeve of mine, you might say, or just a thing that I commonly harp on and talk about. We spend a lot of time telling people to set their resource requests and limits on all their workloads, and we tell them.

B: But what we don't talk about as much, at least out in the open, is what happens when you don't set them properly. What we don't get to see very often, except in real-life clusters, are some of the negative side effects that can happen. So I've been wanting to do this for a while: I put together some demos of the different things that you can break. Hopefully today we'll get to break some stuff, see some things fall over, and have an idea of why that's happened. Perfect.
B: So, all the code I'll be using today... is the screen share up? Sorry, I can't actually tell.

B: Right. So everything I'll be doing today is in a GitHub repository that I made public this morning, so if you want to go tinker with this, you can. What I have here is a GKE cluster.
B: I have just n2-standard-2 nodes, so they've all got two CPUs and eight gigs of memory. I've enabled node autoscaling across the three zones, so by default I have three nodes, and I can scale up to two nodes per zone, giving me six nodes total. And then I have an application running in this cluster. If you saw my last live stream a few months ago, I used the same app; it's kind of a fun little app.
B: You can just go to the app and vote for where you want to have lunch, and the counter goes up. We see how many page views there have been, we see the name of the backend that we're connected to, and all of this is stored in a DB in the cluster. So if we go take a look at our cluster, in the yelb namespace we have the app server, which is kind of the backend, and we have the DB.

B: We have the UI, and there's a Redis server for caching as well. All the code to deploy this into the cluster is in this repository, in the app directory. So if you want to deploy it to a cluster, you can just kubectl apply that app folder and get this app running. By default it creates a load balancer; we have just an IP address here. Very simple setup, really easy to recreate. And right now this app is, as far as we know, functioning; it seems to work.
B: I can click, I can vote for stuff, and it seems to be doing its job, and we see it's using relatively low amounts of CPU and memory right now. We have, you know, one millicore and 36 megs of memory going on here. So we have a happy app; that's great!
B: So what I'm going to do now is: I have in here this stress YAML, and it works with an open source tool that we have. Essentially I'm going to deploy two different Helm charts, which I'm about to deploy, that are going to spin up a program called stress, which is just going to eat up CPU and memory in the cluster.
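For anyone following along in the repo, here is a minimal sketch of what such a stress workload could look like; the image, args, and replica count are illustrative assumptions, not the actual Helm chart values from the demo.

```yaml
# Hypothetical stress Deployment (not the chart from the demo repo).
# There is deliberately no resources: block, so the pods get the
# BestEffort QoS class and will eat whatever CPU and memory is free.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stress
  namespace: stress
spec:
  replicas: 10
  selector:
    matchLabels:
      app: stress
  template:
    metadata:
      labels:
        app: stress
    spec:
      containers:
        - name: stress
          image: progrium/stress        # assumption: any image bundling the stress tool
          args: ["--cpu", "2", "--vm", "1", "--vm-bytes", "512M"]
```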
B: Yeah, no problem. What I'm using here is k9s; it's just kind of a TUI, a terminal user interface, for interacting with Kubernetes. Essentially, what this is showing us is: if I do a kubectl top pods, I can see the current utilization of the pods in the namespace of my current context, which is the yelb namespace.

B: That's essentially the same thing I'm seeing here. The nice thing that k9s (or "canines", I'm not sure how people say it) does for us is that it also shows us the percentage of the request and the percentage of the limit that we're using. So that's percent CPU of request, percent CPU of limit, percent memory of request, percent memory of limit, which is a nice thing to see as we go through the rest of this demo, and it's why I'm using k9s in this case.
B: What they're going to do is hopefully succeed at just overwhelming the nodes, and we have a couple of different ways to look at this. We can do kubectl top nodes and see the CPU percentage and the memory percentage utilization currently. We can also see this in the node view in k9s, or we have another tool that I typically use, which we'll use a couple of times throughout this demo, called kube-capacity. If we pass it the usage flag and make this a bit wider, because there's a lot of output... there we go. We can see the CPU request and limit totals for the nodes: what every pod on that node is requesting, the total limits of all of the pods on that node, and then the current utilization. And you'll notice that all three of these nodes are now packed at 103 percent CPU usage and over a hundred percent memory utilization as well, and hopefully our view down here will start to catch up with that. So, yep, there we go: CPU is at 2000. We have two CPUs on these nodes, so that would be full utilization; it's 103 percent of the available CPU on the node. And so now what we're going to do is, well...
B: First, I'm just going to go click on the app and see if it's still working, because that's the easiest way to check. But another thing we have in place for this demo that's going to be useful is another file I have in the repo.

B: It's a JavaScript file called load.js, and I'm going to use a tool called k6, which is a load-testing tool, to run a load test against this app. Essentially, it's going to go click on those buttons. The default is set to 10 iterations, so it's going to go in, load the main page, click on each button, and do that 10 times. And right now I'm using two of what they call virtual users, so it's going to use a couple of separate processes, essentially, to do that, so they all kind of happen in parallel. And if we look here at the request duration: the average HTTP request duration for this test was 78 milliseconds, and the average iteration time was about 4.8 seconds.
B: This is the baseline of what I expect for this app; if I had run this before we started stressing the cluster, this is the performance we would have seen. So we can see that the nodes being fully utilized, and completely overwhelmed by this other application that is behaving improperly, is not affecting the application that we're running in our cluster, because we set our resource requests and limits to give it the guaranteed QoS class. This is why, for anything that is critical or important to you, I generally recommend that you use that guaranteed QoS class: set your requests and limits to exactly the same values, and set them to a reasonable number that you've tested.
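As a minimal sketch (illustrative numbers, not the demo's actual values), a pod lands in the Guaranteed QoS class when every container in it sets requests equal to limits for both CPU and memory:

```yaml
# Guaranteed QoS: requests == limits for both resources, in every container.
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 200m
    memory: 256Mi
```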
A
Yeah
great
and
I
see
a
comment
from
gary
sorry,
I
showed
up
late,
not
sure
if
I
missed
it,
was
there
a
link
given
for
any
of
these
tools.
I
don't
think
we've
given
links
so
far,
but
we
can
see
if
we
can
add
some
during
the
duration
of
this
webinar
live.
B
Definitely
definitely
if
we
can
share
the
link
to
the
github
repository,
there's
actually
a
section
at
the
bottom,
that
has
a
list
of
tools
used
and
some
links
to
those
as
well.
So
if
we
can
just
share
that
initial
github
repository
url,
that.
B: Thank you. So now we've seen kind of what a... well, actually, there's one more thing to show, sorry. The other thing we're going to do is go take a look at the pods in that stress namespace, because we're spinning up a whole bunch of pods that are attempting to use way more CPU and memory than is available, and they have no resource requests and limits set on them whatsoever, and we're going to see something particularly ugly here.

B: We're going to see a whole lot of pods that have been evicted, because they're trying to use so many resources and because they essentially have the best-effort QoS class: we haven't set any requests or limits, we've just said "try to run this, see what happens." We're going to see them get evicted as the node hits the memory-pressure condition. We're running out of memory on the node, and we need to find some pods to get rid of to make space for other things.

B: These pods are the first on the chopping block to get removed, because they have no resource requests and limits set. So now we've really seen the detriments of not setting any resource requests and limits: you're going to see pod evictions, and you're going to see potential issues with applications running. And we've also seen the benefits of setting your resource requests and limits properly on your critical apps, so that they're not affected by other workloads in the cluster that may do bad things.
B: Let's see, any other questions about that? I think we're all right. So I'm going to delete the stress namespace and we're going to stop stressing this cluster so much. You may have noticed here in our node list that we have six nodes now; we've scaled up to our maximum number of nodes because of all of the extra pods I've been attempting to schedule and all of the memory pressure on them.

B: It's also interesting to note that if we were using the cluster autoscaler in this case, if this wasn't a GKE node pool, it's possible we would not have scaled up the cluster, because there are no resource requests for the pods that need to be scheduled, and so it may not have known to. The pods wouldn't have gone into a pending state, and the cluster autoscaler wouldn't have known what type of node to spin up. So in another type of cluster this may have had even more detrimental effects, not allowing the cluster to scale up.
B: There's been a lot of debate in the community about CPU limits and CPU throttling, what you should set your CPU limits to, and Linux kernel bugs that resulted in more CPU throttling than you would expect. So I'm just going to kind of cover what it looks like when you're experiencing a lot of CPU throttling.

B: The first thing I'm going to do is put a little bit of stress on the cluster; I'm just going to schedule some pods that use some extra CPU, just to create a little bit of extra noise in the cluster while we do this. This is the same app I was running before, but we're just stressing CPU, and we're not running nearly as many of the pods, so that we don't get quite the same behavior.
B: And then what we're going to do is go take a look at the app server deployment. I'm going to edit it, find the resources block, and turn this way down. So originally we had a CPU request of 100 and a limit of 200, or it may have been a bit different from that, but that's roughly right. I'm going to turn this way down to a CPU request of 10 millicores and a limit of 10 millicores, and we're going to take a look at the pods in that deployment.
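The edit described above boils down to a resources block like this (a sketch using the values from the walkthrough; the rest of the deployment spec is unchanged):

```yaml
# CPU request and limit both turned down to 10 millicores; a limit this low
# means the container is throttled almost as soon as it tries to do anything.
resources:
  requests:
    cpu: 10m
  limits:
    cpu: 10m
```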
B: It's running; we're waiting for it to go into a ready state, and now we start to see readiness probes failing. We're doing a GET request to the getstats API endpoint as our readiness probe, in order to tell Kubernetes when our pod is ready to receive traffic, and these are just failing, and the liveness probe is now failing as well; it's the same API endpoint.

B: So I would expect that. And if we try to get the logs for this pod and grab the previous one... oh, there's no previous yet. We'll grab the logs, and the pod's not logging anything; nothing's happening. So essentially what's happening here is: we have throttled the CPU down so far that this app can't even serve requests anymore. It just can't serve these requests.
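For context, the probes being discussed look roughly like this; the endpoint path, port, and timings here are assumptions for illustration, not values pulled from the demo repo:

```yaml
# Readiness and liveness probes hitting the same HTTP endpoint. If the
# container is so CPU-throttled that it can't answer within the timeout,
# it is marked unready and, for the liveness probe, eventually restarted.
readinessProbe:
  httpGet:
    path: /api/getstats
    port: 4567
  periodSeconds: 5
  timeoutSeconds: 2
livenessProbe:
  httpGet:
    path: /api/getstats
    port: 4567
  periodSeconds: 10
  timeoutSeconds: 2
```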
B
So
if
you
have
random
intermittent
failures
of
probes
that
you
can't
explain
if
you
have
a
pod
that
doesn't
come
up
just
because
the
probes
are
failing
and
there's
no
logging
or
maybe
there's
some
logging,
but
it's
very
intermittent.
This
is
usually
evidence
of
cpu
throttle.
Now
you
can
go.
Look
at
graphs
from
prometheus
or
from
stackdriver
and
construct
grafts
disease
or
whatever
your
monitoring
tool
is
but
first
thing
I
always
look
at
when
I
see
just
unexpected
probe
failures.
I
know
the
app
should
respond
on
that.
B
That
endpoint
is
the
the
cpu
limits,
specifically
the
limits,
because
that's
what
controls
the
cpu
throttling
so
we're
gonna
go
back
to
our
deployment
and
we're
gonna
edit
this
again
and
I'm
gonna
turn
this
up
to
something
a
little
bit
more
reasonable.
10
millicourts
is
tiny.
We
originally
had
100
millicourse,
so
let's
bump
this
up
to
40..
Maybe
we
just
started
this
thing
up.
Maybe
we
would
kind
of
expect
it
should
only
take
you
know:
40
50
millicourse.
We
want
to
be
conservative.
B: If we look at the last pod that tried to start, which is now terminating, we see it was well over its CPU limit and request, and it crashed twice, probably killed because of its failing probes, so it just wasn't in a good state. So we're going to try this new one here and see how it goes. Let me take a look at the logs on that.

B: It's also fun to note that, due to the wonderful features of Kubernetes, our app has actually been functional this entire time. That new pod, because of our deployment strategy, just didn't come up, but our other two pods were still running, and so if I had gone and clicked on the app or run my load test here, we would have seen that the app is still fully functional. All right, so we have one running... two running.
A
And
then
we
have
a
question
from
mark
about
the
slack
channel
for
the
chat.
I
think
it
was
linked
or
told
a
bit
earlier.
Yes,
there
we
go,
you
can
see
there,
so
you
can
join
in
there.
But
obviously
you
can
also
ask
the
questions,
as
you
did
mark
already,
via
the
chat
in
your
preferred
streaming
provider
so
that
you
already
are
doing
really
well
on
that
front
perfect
and
then
actually
there's
a
question
from
muhammad
again.
A
Sorry,
if
I'm
I'm
failing
in
pronouncing
the
name
by
the
way,
so
they
say,
look
still
like
looks
like
still.
Cpu
takes
100
percent
off
40,
better
to
increase
limit,
cpu
100m,
and
then
thank
you
so
much
to
jonathan
for
saying
awesome
great.
That
you're
excited
to
be
here.
B
There's
a
great
point
about
using
100
of
40
millicourse
as
the
limit
right
now
we're
sitting
at
two
percent
and
62
on
our
two
pods
that
are
running
with.
I
would
assume
very
little
to
no
traffic
unless
everyone
watching
the
live
stream
has
gone
and
started
to
click
on
this
thing
so
and,
and
that
is
actually
part
of
the
demo.
So
I'm
gonna
talk
about
that
here
in
a
second.
B
So
now
we've
got
it
we're
sitting
at
40
millicourse
our
app
seems
to
be
running
we're
passing
the
probes,
everything's
kind
of
stabilized
out,
and
so
I'm
going
to
run
my
my
little
benchmark
thing
here
that
we
ran.
B: And you may have already realized it's taking far longer than it did last time. If we look, we see our average request duration was 500 milliseconds and our average iteration was 12 seconds, so we've more than doubled the amount of time it takes to run this test. And this is a very small test; it's not indicative of any real-life traffic or anything like that.

B: If I ran this for a lot longer, leave the virtual users where they are but change the iterations to something like 10,000, and just let this sit, we would see that the app will use 100 percent of that CPU, and it's just sitting there getting throttled over and over and over again. This would be another good opportunity to go take a look at our metrics graphs and see that throttling in action.
B
You
have
to
be
a
little
bit
careful
with
those
kind
of
graphs,
because
sometimes
they
can
be
misleading.
Sometimes
you
see
throttling
and
you're,
not
sure.
If
it's
you
know
consistently
a
problem.
What
I
prefer
to
do
is
take
a
look
at
actual
latency.
Just
look
at
your
application
performance.
Look
at
those
gold
metrics
and
see
if
your
app
is
performing
the
way
you
expect,
and
in
this
case,
at
40
millicourse.
B
We
know
that
our
request,
duration
of
500
milliseconds,
is
way
too
high.
That's
just
not
right
for
what
our
application,
what
we
expect
our
application
to
do,
and
so
the
the
suggestion
in
the
chat
to
bump
up
to
a
a
limit
of
a
hundred
millicourse,
definitely
a
great
idea.
So
we're
gonna
do
that.
Let's
find
the
resources
block
and
bump
this
up
to
back
up
to
100
and
100
and
hope
our
app
starts
to
perform.
B: It's also interesting to note that during that test we never actually saw the CPU spike. CPU throttling is a very complex mechanism; I've watched a few videos on it and I'm not sure I fully understand it, but I understand the effects of it. And so if we see that our app isn't performing, our limits are too low, and we're seeing a lot of CPU throttling: turn those limits up, even if you're not necessarily seeing full utilization all the time.
A: Well, while we wait for that as well, there's another question from Antoine: is it a good idea to profile applications, you know, for worst-case CPU usage, and then adapt it?
B: You are correct in the last comment as well. The last comment is "resources.cpu is not necessary"; I'm assuming you mean the request. The CPU request is not the same as the limit, and you can keep the requests low and the limits high, and that is definitely true, but I am going to cover that a little bit later in the demos, so we'll talk about it in a few minutes. So, we ran our tests; we're back down to 120 milliseconds, still a little higher than what we started with.

B: That happens. All right, yep, we're back down under 100 milliseconds, so 200 seems to be a pretty decent sweet spot for us. I'm going to leave it there for now and we're going to move on. I believe the default in the actual YAML that we used to deploy this is 200 for both the request and the limit; it's set that way because we want to start with the best values. So if you go to run this demo yourself, or tweak it, you should start with that at 200.
B: So that's the end of the CPU throttling demonstration. I'm going to move on to the third demo, which is much simpler than the last one.

B: What happens if we just turn down the memory requests and limits? So we're going to edit this again. This one most folks are probably familiar with; it's a really common thing, and it's relatively straightforward, because the reaction to running out of memory is much easier to understand than CPU throttling. CPU throttling, as I talked about, is fairly complicated.
B: What happens when we turn the memory limit down is that we're just going to keep getting killed over and over again. So if we go describe this pod, we see the last state was terminated; we were OOM-killed, or out-of-memory killed, and the exit code for that is always 137. You may not necessarily see the reason shown as OOMKilled.

B: You may just see terminated, but if that exit code is 137, you know there's not enough memory there, and your pod is just going to sit there and crash over and over again. So we're going to go back and edit this again (oops, wrong button), edit, and find the resources. We started, I think, at 50, so let's bump this up to 20, because obviously 10 megabytes was not enough for this app to even start, just like in the CPU one.
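The corresponding edit, sketched out with the values from the walkthrough (the limit gets bumped again, to 40, shortly after):

```yaml
# Memory bumped from 10Mi to 20Mi after the container was OOMKilled
# (exit code 137) before it could even finish starting.
resources:
  requests:
    memory: 20Mi
  limits:
    memory: 20Mi
```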
B: I believe this app uses the most memory at startup, so we're going to go back up to 40 here.

B: You are correct, but this is not a Java-based app, as far as I know. I actually don't remember exactly what it's written in; we can go find out. This is the repo, it's linked in my repo, and it looks like, yeah, the front end is JavaScript and the back end is, I'm going to guess, Ruby, by the GitHub analysis here. I'm not going to go dig through it, because that's not so important to the demo. So we're at 40 megabytes, and we see, just at steady state...
B: Well, I'm not running any traffic right now, and we're using seventy percent of our memory limit. So I'm going to run that load test again and just let it run for a few minutes here, and a lot of times what we'll see is intermittent OOM kills. So things won't necessarily... you know, the app will start up, but under load it will start to fall over.

B: This is an opportunity either to adjust our memory limits up and have a little bit of buffer for it to surge, or to turn on a horizontal pod autoscaler, and that's usually a much better option: to scale horizontally rather than vertically. But as we'll see, this app is not so much memory-bound as it is CPU-bound; it seems to use a fairly consistent amount of memory, and so I'm not likely to get a ton of results out of this. But we'll see what happens.
B: If we go take a look, we'll see how much traffic we've done: we're at 22,596 requests, so we've done another couple of thousand or so. The app is still running just fine, and we can take a look at the stats.

B: So it's actually running quite well, and another thing to think about here is that 92 percent memory utilization might be a good thing; that might be exactly where we want to be. If our golden metrics, or our golden signals, aren't showing any issues, and our latency is still where it needs to be, 92 percent is fine. I'm just going to let it run.
B: Predictive autoscaling... that would be the golden goose, wouldn't it? I don't know of any great things out there. There are a couple of solutions for sort of creating buffer space to reduce the amount of time it takes to scale up: the node over-provisioner is a common solution, where you essentially just run a very low-priority pod that sits and holds those resources available, and then automatically gets evicted when those resources are needed by something more important. So essentially you keep those nodes kind of pre-warmed, and that can reduce the amount of time it takes to scale up. As far as predictive autoscaling goes, you really have to know the patterns of your traffic coming in to be able to do that, so it's a much more complex problem to solve, and I don't personally know of any great solutions out there for it, generally.
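A minimal sketch of that over-provisioner pattern, assuming the common approach of a negative-priority placeholder deployment; the names, sizes, and replica count here are made up for illustration:

```yaml
# A PriorityClass below the default of 0, so placeholder pods are the first
# to be preempted when real workloads need room.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10
globalDefault: false
description: Placeholder pods that reserve headroom for scale-ups.
---
# Pause containers that do nothing but hold the requested capacity. When a
# real pod needs the space, these get evicted and the cluster autoscaler
# brings up a fresh node behind them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioner
spec:
  replicas: 2
  selector:
    matchLabels:
      app: overprovisioner
  template:
    metadata:
      labels:
        app: overprovisioner
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```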
B: Generally, the solution is to just scale more aggressively, so turn your targets down, or things like that. So, we're still running at 93 percent here; we have an average request duration of 67 milliseconds, or 69 milliseconds, so it's actually doing quite well, even at 91 to 95 percent memory utilization.

B: I think this is a great spot to be at. I don't really care that it's at 95, 96, 97 percent, as long as it's not occasionally getting OOM-killed. If we do start to see those occasional OOM kills, then we may want to increase that just a little bit, but, as I said before, I don't think this app is memory-bound; it's very much CPU-bound, and so I'm not going to worry about it. So that was the easy demo: OOM kills, very simple.
B: We all relatively understand those, so I am going to move on to the fourth demo, and this is my favorite, because we see this a lot in the clusters that we run for our customers and that I've run for people in the past. It's very common to think: well, I know my app.

B: I know my traffic is going to be bursty, I know that my resource utilization is going to be bursty; you know, I don't need to request as much as my limits are. I'm going to take my requests and my limits, use that burstable QoS class, and set them really far apart.
B: So I'm going to request 10 millicores, because that's what I need to start with; that's really all it needs to get going, that's kind of what it uses at steady state. And I'm going to set my limits... and I'm doing this backwards on the screen, because I'm trying to talk while I type, but I'm going to set my requests to 10 millicores, which we already know is far too little for this app from the earlier CPU throttling tests that we were doing.

B: But I'm going to set my limit to 500 millicores. So our limit is way up there, we're not going to get CPU throttling, but, you know, "I don't need to request as much." It's a very common thing to do. And I'm also going to do the same with my memory. This isn't actually in the script; I'm going a little bit off script here, but we're just going to see what happens when we do that.
B: I'm going to set the memory request to 10 megabytes and the limit to 100, which is way higher than what we had before, and so I'm going to set that at the same time. Where this really becomes a problem (it's not so much a problem when you're running a static number of pods; it can be, but where it really, really becomes a problem) is when we start using horizontal pod autoscaling.
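Put together, the resources block being described looks roughly like this: a Burstable pod whose requests are a small fraction of its limits.

```yaml
# Burstable QoS with requests far below limits. The scheduler and the HPA
# both reason from the requests, so both will badly underestimate what
# this pod actually consumes under load.
resources:
  requests:
    cpu: 10m
    memory: 10Mi
  limits:
    cpu: 500m
    memory: 100Mi
```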
B: So I'm going to edit that HPA, that horizontal pod autoscaler, and I'm going to let it scale way up: I'm going to say min replicas 2, max replicas 200. And what we're going to see here, if we take a look at that HPA, is that it's a CPU autoscaler and we're targeting 50 percent CPU utilization, and the interesting thing about this is that...

B: Scaling is based on the request, and scheduling is also based on the request. So by requesting only 10 millicores, but allowing a limit of 200, or whatever I set it to (500, that huge number), we're going to see some probably interesting behavior in our cluster, and this is where I like to go back to that kube-capacity tool.
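The HPA in the demo looks roughly like this; a sketch using the autoscaling/v2 API, with the target name assumed rather than copied from the repo:

```yaml
# CPU-based HPA targeting 50% of the *request*. With a 10m request and a
# 500m limit, each pod can legitimately run at many times its request, so
# the HPA scales out far more aggressively than the traffic warrants.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: yelb-appserver
  namespace: yelb
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: yelb-appserver
  minReplicas: 2
  maxReplicas: 200
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```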
B: So let's just keep watching this for a minute. And actually, I'm going to keep running load against this, and I'm going to run a little bit more load. Say we got quite a surge in traffic: I'm going to use 30 virtual users instead of 10.

B: We're going to see what happens. So we're already sitting at 280 percent of our memory request and 278 percent of our memory request on these two pods, so we're asking for not nearly enough. We've scheduled this pod on a node thinking that it only needs 10 megabytes of memory and 10 millicores.

B: The scheduler has made this decision to put it somewhere, but that's way off, because now we're using three times that amount already. So the scheduler has already made a bad choice, because we told it to. And let's also... not demo one, sorry.
B: So if we take a look at our HPA, we'll see that we are now at fifteen hundred percent of our target CPU usage, and so what this is going to do is just explode the number of pods; we're going to scale way up here for really no good reason. So, if we look at our stats...

B: And take a look at our current request duration. Let's see... okay, nope.

B: So if we take a look at the number of pods here, we now have 75 pods. If we take a look at our HPA, we're starting to settle down on our target here, but we probably don't need this many pods.
B: For the amount of traffic I'm running, I would not expect to need 70 replicas. That's going to go up again in a second because of where we're at here; we're using a hundred pods for a relatively small amount of traffic. So first, we're going to see problems with scaling: scaling is going to happen too fast, because the percentage of our request is so far off from what we're actually using.

B: The second thing we're going to start to see is that nodes are going to get overwhelmed. So this is where that kube-capacity tool I talked about comes back in (it was written by an old co-worker of mine, Rob Scott), and it's going to sum up your CPU limits per node. So if we look at all the pods running on this node, we add them up.
B: You can get this information from a kubectl describe node output, but this really sums it up nicely in an easy-to-read way. You'll see that our summed CPU limits are currently 500, 400, 300 percent of our node capacity.

B: So we have the opportunity to attempt to use 500 percent of our CPU. I didn't really constrain on memory here, but I've seen this happen with memory, where you get to three or four hundred percent of the memory available on the node, and as soon as you get a large amount of traffic, those pods start to consume more and more resources.
B: I'm not saying that all applications have to be in the guaranteed QoS class, but for your critical applications, use that guaranteed QoS class, and when you're going to use burstable, be mindful of the entire cluster and the ecosystem that you're deploying into. Understand that if your combined usage starts to get to a point where maybe you could hit 600 percent of a node, that's probably not a great situation to be in, so be mindful of those summed limits and those summed requests.

B: So we'll just run the regular test real quick here and see what our current numbers come back at... looks like we're at 382 milliseconds average, with an iteration of 10 seconds. So even with all these extra pods spun up, we're still not getting great performance.
A: Right, and then there's an audience question again, which is amazing. From Jonathan: is this part of chaos engineering over Kubernetes?

B: I mean, you could kind of say that what I'm doing is a form of chaos engineering. If you really wanted to introduce some interesting chaos into your cluster, deploying something like that stress application to attempt to eat resources would be a form of chaos engineering. So, sure, yes. Thanks for the question.
B: All right, well, that's actually my last demo. I know we're a little bit early on time. Are there any other... I'm sure I can come up with some other ways to break this cluster. Any questions, other questions from the Slack channel, or anything like that?

B: Sorry, I didn't quite catch that. Wait, what's the channel? In the Cloud Native... I can go pull these up.
B: Oh, from YouTube, okay. Oh, I see: k6 versus pytest for endurance testing. I really enjoy k6; it's super easy to run, writing the tests is easy, their documentation is great, and their cloud product is actually pretty nice. It works for a whole bunch of different situations, and I've been using it for a while. I don't know if I've used pytest directly for this type of testing myself.

B: So I don't know if I can necessarily give a good recommendation there, but I'm a huge fan of k6; it's very simple to get running.
A: Great, keep the questions coming if there are more; we have time to take them. And then a LinkedIn user asks: is this a Kubernetes-native application?

B: No, I don't believe it is, originally. I will say it runs quite well in Kubernetes; I've had good luck with it as a demo application, and all the different pieces are nice. I grabbed the YAML for deploying it from the repository and modified it pretty heavily, so it definitely runs well in Kubernetes. Whether I could say it's necessarily, like, originally Kubernetes-native, I'm honestly not sure.
A: Great, and thank you to you, Dave, by the way, for asking the question. Great. So, any other questions, or do you have some other way to break things more?

A: While we wait for that inspiration to strike again, I actually have a question. So you talked a bit about different projects and so forth, but do you have a favorite CNCF project, or another open source project, regarding stability for Kubernetes?
B: Definitely. I'm a little bit biased, in that Fairwinds has released some open source projects. Goldilocks is one; it is designed for setting your resource requests and limits, or setting up a baseline for them, using the vertical pod autoscaler project, which is part of the Kubernetes project, to get recommendations on how to set those things initially.
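For reference, Goldilocks works by creating recommendation-only Vertical Pod Autoscaler objects for your workloads; a hand-written equivalent would look roughly like this (a sketch; the target name is assumed):

```yaml
# VPA in "Off" mode: it only publishes request recommendations
# (visible with kubectl describe vpa) and never mutates running pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: yelb-appserver
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: yelb-appserver
  updatePolicy:
    updateMode: "Off"
```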
B: ...and liveness probes, readiness probes, and stuff like that. So those are kind of our two main ones around stability, and then, of course, any of the tools that I showed today. All the Kubernetes-native tools: cluster autoscaler, and metrics server, which is obviously in use here as part of GKE. So I think that's a good list of tools that I frequently use.
A: Great, perfect. And yes, Jonathan, thank you for hyping us up: "awesome, thanks for the share, for this increased capacity over Kubernetes." Thanks a lot; thank you for attending and asking questions, and good day and good night and so forth to you as well. Great. And then there's, I think, a question on the Slack side as well: so, it's clear about setting good requests, but what would be the best practice for setting good limits?
B: Well, I feel like I touched on this a little bit. If you're going to set your requests and limits differently, if you're not going to use the guaranteed QoS class, I would say a good guideline is to just not keep them too far apart. As a very, very generic baseline, you could start with setting the limit maybe 10 percent over your request, depending on various variables.

B: So if you're going to use burstable, just don't keep them too far apart, and be mindful of the whole ecosystem. But in general, if it's a critical workload, if it's your main application, if it's sensitive to being throttled or to any sort of disruption, then using that guaranteed QoS class is going to be your best route: set your resource requests and limits the same, and set them high enough that your app can function and you get the lowest latency possible.
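As a rough illustration of that guideline (arbitrary numbers), a Burstable workload where the limits sit only slightly above the requests:

```yaml
# Limits kept close to requests (~10% headroom), so the sum of limits on a
# node can never drift far above what was actually scheduled onto it.
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 110m
    memory: 280Mi
```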
B: It was made by a company called Carbon Relay, which is now part of StormForge, I believe; forgive me if I get your names wrong. Essentially, what it does is allow you to run a series of tests. You set some variables that you want to monitor: you say, I want to look at my latency, and I want to balance that against how many resources I'm using, so effectively my cost. So, I want to minimize latency...

B: ...I want to minimize cost as well, and I want to tweak the CPU requests and limits. You give it a set of parameters, and then you give it a test to run, like a k6 load test or something like that, and it will make the change.

B: It's a really interesting tool and a fantastic way to sort of get an idea of what's going on in the cluster around these two types of things. So that could be a fun thing for folks to look at.
A
Great
thank
you
to
moisturize
for
that
great
question
and
yeah
there's
been
a
bit
talk
about
the
slack
as
well.
So
there
was
a
comment
then,
and
if
you
send
cncf
on
youtube
your
email
for
monaco,
we
will
get.
You
started
on
the
slack
as
well
on
that
front,
but
yeah.
Those
are
the
questions
so
far.
So
if
there's
any
still,
we
do
have
a
few
minutes
to
take
them,
so
keep
them
coming.
A
If
there's
anything
that
pops
into
everyone's
head
did
inspiration
strike
and
did
you
remember
the
good
way
to
break
everything.
B: It did not, actually; I was too busy answering questions. So I think these definitely cover the primary scenarios that we see most often. I'm sure there are dozens of other ways to break the cluster, but these are definitely, you know, the most common ones that we see.
A: Yeah. And to Dumb Chicken: you can see the Slack channel on the stream itself, where it says "join our live chat" and "CNCF Slack". For the CNCF Slack itself, I think you can find the link from the CNCF website, for example, to begin with.

A: It has also been linked in this chat, so you can maybe find it from there, but if not, the CNCF website itself should have the link to the CNCF Slack, where you can then find the cloud-native-live channel. And then there's a comment from Jerk saying: we've had long reviews on the request and limit range and come down to 20 in our particular case.
B: That's great. Reviewing your CPU requests and limits is always a good idea, something you should definitely revisit on a regular basis. And, you know, it's a balance between performance and efficiency, or performance and stability and cost, right? Because we can always get a very stable cluster by just turning our requests and limits way up, always using the guaranteed QoS class on everything all the time, and it will be very stable and we won't have any interruptions, but we're probably over-provisioned considerably if we do this.

B: And so reviewing that over time, looking at your metrics over time, and then making tweaks that don't affect your latency, or that keep it within a reasonable range that you're shooting for, is super important to do. So I think that's great.
A: Great, everything's going smoothly over there then. And then there's another question about what the channel for the Slack is, and it is still on the stream, so you can see it; you should see it in this direction: "join our live chat" on CNCF Slack, cloud-native-live. So that is the Slack channel you should be joining. But obviously in here we actually see comments from YouTube, so Stephen, we do see your comments here already, live from there, as well as from LinkedIn or from the other places. But that is true:

A: You can join the Slack as well and join the conversation on that side too, and thank you so much for joining, and I think, on the thanks...

A: ...and great messages on that side: the Slack link, I think you can find that from the CNCF website itself, if I'm correct, and it should be there so that you can join.

A: That's great if everyone's interested in continuing the discussion there, and it's the same channel that we have every week for these Cloud Native Live shows. So if you hop in over there, you can join in next week as well and see the chat in action, and the links to all of the live shows are in that Slack channel too. So that's a very nice tip for everyone here. Again, we have three minutes left in our scheduled programming time.
B: Not anything new; just make sure you set your resource requests and limits. I will continue harping on that, probably for the rest of my career. And thank you so much to everybody for all of your questions and for listening; it's been great having you all here.
A: Perfect, thank you so much for joining. So now that we have two minutes left, I think it's a great time to start wrapping up. We've had a lot of great interaction; thank you so much to everyone. So thank you, everyone, for joining the latest episode of Cloud Native Live. It was great to have Andy here talking about building stability.

A: And we both probably also really loved the interaction and questions from the audience. As always, we bring you the latest of cloud native code every Wednesday, so you can join in next week, and the week after that as well. Next week we will have another great session, from Jason Morgan, talking about more amazing topics. So thanks again for joining us today, and see you next week.