A
Hello everyone, welcome to Cloud Native Live, where we dive into the code behind cloud native. I'm Annie, and I'm a CNCF ambassador as well as a senior product marketing manager at Camunda, and I will be your host tonight. Every week we bring a new set of presenters to showcase how to work with cloud native technologies. They will build things, they will break things, and they will answer all of your questions, so join us every Wednesday to watch live. This week we have two amazing speakers here with us to talk about using the Litmus chaos engine and a microservices demo app to demonstrate automated RCA. As always, this is an official live stream of CNCF, and as such it is subject to the CNCF Code of Conduct, so please do not add anything to the chat or questions that would be in violation of that code of conduct.
B
Hey everybody, my name is Seamus, and I'm joined today by Braden. We're both DevOps engineers at Zebrium. We're going to talk today a little bit about Litmus, which is a cloud native, open source chaos engineering framework, and we want to talk a little bit about what that means and what we're doing with it. What we're doing with it is actually a little bit unique.
B
At Zebrium we built a product that analyzes logs to find the root causes of issues, so being able to cause issues on demand is absolutely invaluable for us. When we're validating and demonstrating our product, we need to be able to create problems within our Kubernetes demonstration clusters. We generally don't have access to customers' or prospective customers' environments, so it's important for us to be able to do this ourselves on demand, and Litmus provides on-demand chaos by simulating issues that can occur in environments: bad configurations, heavy infrastructure loads, rainy days, just any stability-threatening issue you can think of. Really, the only limit is imagination.
B
So, a quick thousand-foot view: what exactly is chaos, and what is chaos engineering in this context? There are many different testing methodologies available in the world right now. The thing they all have in common is that they all have blind spots, and the problem is that the blind spots can overlap, and then you can sometimes get really bad, unpredictable behavior. If the first time you find out about a resiliency issue is when a customer reports it at three o'clock in the morning, that's an ops fail. That's bad!
B
We don't want that to happen, and chaos engineering allows you to create these doomsday scenarios in a more controlled environment, as a way to test resilience before bad things occur in the wild. Litmus is by far the best cloud native framework for inducing chaos that we've found. We've been searching, and we've actually developed some stuff on our own, and Litmus is just by far our favorite tool for it. So what exactly is Litmus? It's a framework for conducting chaos experiments: individual little blurbs of bad things that can happen. This is done in a declarative way via experiment templates, and experiments can be orchestrated into chaos scenarios, which can include things like chained experiments. You can run experiments in parallel and sequentially.
B
You can set up and tear down experiment resources, and you can even deploy entire environments as part of a chaos scenario. It's an extremely versatile platform. Litmus was originally accepted into the CNCF Sandbox in 2020 and actually just moved into incubation, so huge congratulations to them for that. That's a big step up, and we're really happy for them.
B
So what kinds of experiments are available right now? There's a fantastic library available at hub.litmuschaos.io. There are currently 58 on the shelf, minimal-configuration-required experiments; they're pretty much just drag and drop, hit go, and receive chaos. They're available for a wide variety of cloud platforms, including Kubernetes, AWS, Azure, and VMware. If you use a major Kubernetes platform, there are compatible experiments waiting for you. So I'm going to hand things over to Braden now, and Braden's going to give us a live demo of setting up Litmus in one of our demonstration environments, configuring it, running some experiments, and seeing what happens.
C
Yeah, thanks Seamus. Let me go ahead and share my screen — which one? This one. Cool, and hold on, my mouse is probably there. There it is. All right, so what we're going to run through real quick is the Litmus install directions: we're going to spin up Litmus on the cluster, we're going to access Chaos Center, their UI, we're going to run their default test scenario, and then we're going to connect it to one of our live apps and actually break some stuff.
C
So the first thing we want to do: Litmus offers an install through kubectl — an apply with YAML — or through Helm. I'm just going to use the Helm one. So the first thing you do is add the Litmus Helm repo, and then make sure it added; yep, it's in there somewhere. I have a lot of repos. And then let's run the install command, so we're just going to install the base configs that come with Litmus.
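The steps he runs here follow the Litmus Helm install directions; a minimal sketch looks like this (release name `chaos` and namespace `litmus` follow the docs' convention, so adjust to taste):

```shell
# Add the Litmus Helm repo and confirm it's registered
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm/
helm repo list

# Install (or upgrade) the base Chaos Center components into a "litmus" namespace
helm upgrade --install chaos litmuschaos/litmus \
  --namespace=litmus --create-namespace
```

These commands need a reachable Kubernetes cluster in your current kubeconfig context.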
A
Should we have a bit more zoom on the terminal?
C
Okay, so going back and forth: the first one was just adding the repo and listing the repo, and now we're running the upgrade command.
C
Oh — so I'm using Lens as a kind of web UI in front of our cluster; it's a little easier than trying to remember and type 5000 kubectl commands. As we can see, it's going through and applying it right now, so we'll wait a little bit for that to finish.
A
Apparently there's a question from someone: will this be recorded and be available? Yes, it will be. It will be available on the CNCF YouTube pretty much immediately after this live ends, so you can tune in to watch it there.
C
Apparently if you close that, it wants the update. All right, so it looks like it's fully installed. The next step: when you deploy this out of the box — as you can see if I go look at the services (and apparently now Alexa's going off) — when it completes, straight out of the box, if you look here, there are no NodePorts, just cluster IPs.
C
There are instructions for exposing it: either going through and editing it to a NodePort, creating a load balancer, or putting an ingress object on it. I'm not going to dive into that. I'm going to cheat: Lens does a great thing where you can do an internal kube proxy and just proxy from the cluster to local. So that's what I'm going to do, just to get straight to a login.
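Without Lens, the equivalent shortcut is a kubectl port-forward; a sketch, assuming the Helm release was named `chaos` in the `litmus` namespace (the frontend service name is derived from the release, so check `kubectl get svc -n litmus` for yours):

```shell
# Tunnel the Chaos Center UI to localhost instead of exposing it
# via NodePort / LoadBalancer / Ingress.
kubectl port-forward -n litmus svc/chaos-litmus-frontend-service 9091:9091
# Then browse to http://localhost:9091 and sign in with the default
# credentials (admin / litmus), which you should change afterwards.
```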
C
It wants to save it to Google; no, and we're not going to change the password for now. Cool, so just like that, we have it stood up, we have the intro UI, and now we're ready to rock and roll. A couple of things to walk through here: chaos delegates — you can see this one's pending. What we installed was just what they call Chaos Center. It's the UI; it's kind of the command and control center.
C
You can specify scenarios, you can download from the hub, you can do analytics stuff. The actual real meat, or the bread and butter, of how this works is installing the self-agent, which — okay, there we go. Think of it as a runner.
C
The idea is that you can install different chaos delegates inside different clusters you own, so that the UI doesn't have to be inside the cluster where the chaos is actually going to happen. When you first open up the UI and sign in for the first time, it actually installs the self-agent for the cluster that you have it running on, which is what you see here. If we hop back into Lens, which is here, we can see that.
C
Let me close this. We can see that there are three or four servers, and that's the one that just installed. This is all part of the chaos operator and some of the subscriber buses — this is all of that self-agent that just installed in here.
C
So let's actually break something. The first scenario we're going to run through uses their demo app, which is called Podtato Head — it's actually a funny play on Mr. Potato Head. It's pretty funny, yeah. So you have Mr. Podtato Head here — potato head, sorry, my bad.
C
Yeah, so we're going to run through; we're going to leave this the same. So we have the sequence of what's going to happen. Since this is their predefined template, it's going to actually install the Podtato Head application.
C
It's going to install the chaos experiment that we're going to run. The chaos experiment for this one is a pod kill — a pod delete, as you can see right here: it deletes the pod. And then once that completes successfully, we are going to revert and uninstall the chaos container, as well as delete the application that we installed, directly here.
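Under the hood, the pod-delete step the UI schedules boils down to a ChaosEngine object; a sketch of the shape, per the Litmus docs (the name, namespace, and labels below are illustrative, not what the predefined workflow actually generates):

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: podtato-head-chaos       # illustrative name
  namespace: litmus
spec:
  engineState: active
  chaosServiceAccount: litmus-admin
  appinfo:                       # which workload the chaos targets
    appns: podtato               # assumed namespace for the demo app
    applabel: app=podtato-head   # assumed label selector
    appkind: deployment
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            - name: TOTAL_CHAOS_DURATION  # how long chaos runs, in seconds
              value: "30"
            - name: CHAOS_INTERVAL        # seconds between pod kills
              value: "10"
            - name: FORCE                 # graceful vs. forced deletion
              value: "false"
```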
C
These workflows are customizable; it's all YAML-based, so you can upload YAML. I believe they also have an API, so you can actually just apply directly rather than having to go through the UI steps. We'll circle back on what the weights do in a second. So, within a couple of clicks we have our first thing. We want to go ahead and schedule it now; it's just going to ask us to verify everything. It all looks good to me.
C
So let's hit finish, and let's go to the scenarios. Now we see it's running; you can see the experiment. The main experiment is a pod delete, and now it's just going to sit here. If we go back into Lens, you can see that Mr. Podtato Head actually went through and is spinning up a couple of containers: you've got one for the head, hats, left arm, left leg, main body, right arm, right leg. I don't actually know what this one does.
C
Yeah, so that looks like — let's see. If we click onto here, we can see where we are: we're in the middle of running the actual pod delete command. The one thing about chaos experiments — let me see if I can find it in here.
C
Oh, here it is. Every experiment gets spun up as a job, and that job does whatever the experiment does. So, as you can see, this is a pod delete. It has a target argument; that target argument is probably one of these mains that just got deleted — probably this one. And so that's what this job will do. You can see it terminated one pod and it's spinning up another now, so it's just a little job that runs in there.
C
They can be really complex, or they can be as simple as this one was, where it's: hey, we're going to delete a container and we're going to delete a pod and see if the pod comes back. As you can see, because of the way this app is designed, our hello server is still available — oh no, it's not. I lied.
C
If we had load balancing around that, it would have been available. So we're just waiting for this to finish.
C
Cool, all right, so it ran, and now we're doing cleanups. I know this is kind of the cheesy side, but does it help with declared configurations?
B
Well, what we're seeing here is actually the execution of declarative configuration. We didn't configure anything; this is all just completely off the shelf. This is just default behavior that comes with the Chaos Center installation, but yeah, every individual step — oh, there you go, there's the manifest.
C
So it does actually provide manifest files. You can go through and do a declarative manifest file and apply it directly like that; that will work. We're just doing it through the UI because I didn't write manifest files.
C
Yeah, no, it does. All right, so we ran it. You can see the experiment; the experiment has a result, and everything passed. So let's do some stuff where it doesn't. The first thing I'm going to hook up — the other thing that this allows you to do is you can tie into a previous data source.
B
It would be good to not allow just any arbitrary person on the internet to run chaos experiments on your production stuff.
C
They actually have the ability, inside the settings, to go into user management. You can add users to authentication — create new users, login details using passwords and all that. As part of the install directions, you can also do OAuth authentication.
C
I believe so — you can do that with OAuth. I'm not 100% sure; we haven't set that up or gone in that far yet. But yes, I do believe it supports SSO and OAuth. To also answer the other question about declarative: you can set it up for GitOps as well.
C
I think that's everything we're caught up on so far. Did I answer your question, Mark, about the OAuth and the authentication? I believe so — I've got to check the docs; I know there's a section in there. We haven't done it yet, I haven't personally done it yet, but I think so.
C
Because I don't want to get paged — that's the main reason, I don't want to get paged. All right, so we're actually going to play with the pseudo-production. It's one of our demo apps that we have installed, and it's been on here for a while.
C
Everyone's familiar with this, I think; if not, we're going to walk through the UI real quick, after I find my UI. So, Weaveworks Sock Shop: it's another open source project, one of the big microservice demo applications. I think the two big ones are Weaveworks Sock Shop and Google's Boutique app. We like Sock Shop better.
C
Because socks are cool, and it has more applications and several different database layers on the back side too. So this is Sock Shop. It's a fully functioning, basically, marketplace store, so you can go in and buy socks. We can buy Seamus some more socks.
C
Yeah, so it has a full catalog. You can go see colorful socks, non-colorful socks, super soft — oh, super sport socks. I can't really pick.
C
It has a full working cart, as you can see. Oh, we're missing shipping and payment. Seamus, can I have your credit card, so we can...
B
Yeah, yeah, absolutely — just start with the 404.
C
Okay, yeah, so it's a full app. Now that it's up, let's break this thing, because breaking's fun. That's the wrong tab — all right. Oh, the other thing to note: if I switch to the right namespace, we do have a load generator on this site that's been running for like 15 days. We just permanently keep it running, so there is load, and we're hopefully going to see some fun things in the graphs. This is all a Grafana Prometheus stack, so we'll see some fun stuff.
C
The Breaker of Socks — Game of Thrones fun, if anybody got that. So, as you can see, there are a lot of different experiments. Actually, let me back this up; I'm doing things out of order. So, Chaos Hub: he talked about it earlier. It is where Litmus stores all of the experiments they've written. This is the predefined scenario — we don't care about that right now, that's what we ran. So, chaos experiments: they currently have 58.
C
Yeah, so they have a little bit of AWS SSM, they have some Azure stuff, some CoreDNS, some GCP stuff, some generic stuff. So we'll play with the generic stuff — the generic pod stuff.
C
We'll stay away from the kube-AWS stuff because, like I said, I don't want to get paged. I don't know if I'm on call for this cluster.
C
Yeah, we should, yeah — and some of the EBS stuff. So we're going to do network corruption to start with, and then we'll do some fun stuff, if anybody has any suggestions or just wants to see anything. I keep hitting the wrong tab — all right, so let's go.
C
I misspell everything; it doesn't really matter. All right, so you wanted network.
C
Did it add it? Apparently it's thinking.
C
Just thinking about it — let's do it again. Hey — oh, I didn't click on it; I don't think it's done. There we go, all right. So we've added it; as you can see, it kind of defaults to app and nginx. So let's edit it and change some stuff. Experiment name — I'm not going to mess with that default. So this is where you're asking: can you target stuff? The answer is yes. There are two different ways to install Litmus out of the instructions — there's a — I think there might not be a DNS poison.
C
There's a DNS spoof, though, so that's kind of the same — but maybe not; it's close-ish. But there are two ways to install this: you can do it either with cluster-wide or namespace-wide scoping. I did this cluster-wide, so you can see all of our namespaces, but we're going to target sock-shop, and we're going to target an app. It targets based on labels, and let's do the carts. That's fine; we'll crash a TV next.
C
It's true — so you can add probes. What can probes do? I'm not going to bother looking up what the endpoints for this are, but yeah, you can add a probe: probe names, and it does an HTTP endpoint. You can do HTTP, command, k8s, or Prometheus probes — these are all probe types you can use. You give the timeout period and the retry period, and this will actually probe the endpoints of your application. And this is where the weighting comes in.
C
This basically says: hey, is this thing up, and is this thing healthy? If the thing is up the entire time, you consider it successful. If it's not up and the probe fails, then the test is considered failed and you're not resilient. That's where, if we look at this next section — that's fine.
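The probe he describes attaches to the experiment in the ChaosEngine spec; a sketch of an HTTP probe against the carts service, following the Litmus probe schema (the probe name and URL here are illustrative, and `runProperties` units have varied between Litmus versions, so check the docs for yours):

```yaml
experiments:
  - name: pod-network-corruption
    spec:
      probe:
        - name: check-carts-endpoint      # illustrative probe name
          type: httpProbe
          mode: Continuous                # evaluated throughout the chaos window
          httpProbe/inputs:
            url: http://carts.sock-shop.svc.cluster.local
            method:
              get:
                criteria: ==
                responseCode: "200"       # "healthy" means HTTP 200 the whole time
          runProperties:
            probeTimeout: 5
            interval: 2
            retry: 1
```

If the endpoint ever fails the criteria during the run, the experiment verdict is marked failed, which is exactly the "up the entire time" check described above.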
C
The weight thing: basically, if I were to schedule five or six different of these things — let's say I do a pod network corruption, I take out an AWS node (and oh, by the way, this is running on EKS, so that's why I keep referring to AWS), say I take out a node, and then I do a memory or CPU load test — I can weight the different tests accordingly, each of them on a one-to-ten point system.
C
So let's say I don't really care about network corruption; I can rate it at four. But if I were to go in and do our node kill, which is something that's much more likely to happen, I can rate that at 10. It will actually hit the endpoints, it will test everything, and basically, if it succeeds and it doesn't go down at all, you get that percentage of points calculated into the resilience score.
B
And this is part of the really cool stuff you can do as far as CI/CD: you can actually integrate Chaos Center with your GitOps, so that every time you're updating things, if you make changes, you can actually run resiliency tests automatically to see, okay, numerically, what is our score? What's our resiliency like? Like, for instance, when we ran Mr. Podtato Head, we came back with a perfect score. Okay — what if it wasn't so perfect?
B
What if one of our chaos experiments did actually cause a service disruption — like, how bad was the service disruption? This allows us to tune that to a more high-level view, one that especially management is really interested in seeing.
C
Yeah, basically. The chaos workflow won't actually remove the app's own resources when you run it. That thing I did at the end, where I said "let's clean up" — in this aspect, it's talking about cleaning up the network corruption pod that it is running. It's not talking about cleaning up the Sock Shop workload that actually exists.
C
The only reason it cleaned up the application, the pod — the Podtato one — is because it actually deployed that internally. So if it deploys it, you can then have a step to clean that workload up; but since we're using a pre-existing workload, the only thing it's going to clean up is the job that actually ran.
B
And I mean, if you really want to make life difficult for yourself, you totally can have it set up to delete something that already existed when you ran the chaos experiment on it. It has the flexibility to allow you to do that. I personally would prefer it not to do that, but no, that is something you can configure if you want.
C
And I think I actually got a different aspect of your question too: if we're doing node kills and stuff, this is where it would probably be beneficial to run Chaos Center in a cluster you're not trying to test, on the off chance you do nuke the node that it's actually running on and all that.
C
That's where the chaos delegates come into place: they're only a small, tiny subset of pods, and hopefully your environment is running on more than just one AWS node, so they can just get rescheduled.
C
I haven't actually tried the instance of doing a node kill, mainly because I don't want to be paged, but I believe in that instance it would kill the agent and the test would just fail — basically saying, hey, we've lost contact with the delegate.
C
All right, so now it's running that experiment; it's installing the chaos experiment.
C
Yeah, so the one interesting thing to point out: when you install Litmus and the chaos delegate, the workflows will actually run — you can see right here the workflow is running — inside the namespace that the Litmus chaos delegate is running in. So it's not even running inside my sock-shop namespace.
A
Can I ask a question? Perfect. So, how frequently do you use logs when troubleshooting?
B
Oh, constantly. That's the absolute constant thing for us, which is partially the reason why our software exists, why Zebrium exists in the first place. We were trying to alleviate some of the head-banging headache that goes into diagnosing root causes as issues occur, so we have a very powerful artificial intelligence engine that can actually help identify the root causes of your problems as they happen.
B
Well, there we go. So yeah, the normal log volume we would have to look at for Sock Shop: for a five-minute range, probably about two and a half million lines of logs, and we do not have the time or interest to actually try to look through that many log lines to figure out what exactly went wrong in our environment.
B
With Zebrium, we can actually pick out only the 30 to 50 log lines that are the actual relevant ones. It's much more user friendly, much more human readable, to present information like that.
C
Yeah, so all of that drops slightly because all the network activity drops — since this thing's communicating with itself, it all plummets. So we'll just wait for this to finish. "An error occurred fetching data" — well, yeah, that actually makes sense.
C
You know, yeah, I love it. So once the interface comes back up — which should only take 60 seconds, so it should be fine — it should be coming up. We should just be in the lag of scraping.
C
What the network corruption does is actually corrupt — I believe it removes the network interface for Docker on that container, or on that node. Oh — oh wow, this is a single-node cluster.
C
Yeah, and kind of to go full circle about this — full disclosure — this is our own widget that we have installed in Grafana. Like we talked about at the beginning, we use this tool to induce live alerts and stuff. So, as you can see here, it's a Grafana card that says: hey, the master pod was restarted and the kube-probe restarted. I don't know if that's actually — seriously, you're going to make me sign in now?
A
And then there's a new audience question as well, when we have —
C
Actually, yes and no. Let me go back and show you — let me go back to schedules, hold on. Let me get back into the manifest of this.
C
Everything's time-bound, so inside of this massive chunk of spec somewhere, there is...
C
It is. So everything's time-bound in seconds; for that test, we ran it specifically for 60 seconds. There's also a flag you can set inside — let's say you apply this with a YAML file directly: you can reapply that YAML file, and there's actually a —
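The time bound he's scrolling for is just an environment variable on the experiment spec; a sketch of the relevant fragment for a network-corruption run (values illustrative, env names per the Litmus experiment docs):

```yaml
experiments:
  - name: pod-network-corruption
    spec:
      components:
        env:
          - name: TOTAL_CHAOS_DURATION              # total chaos run time, in seconds
            value: "60"
          - name: NETWORK_PACKET_CORRUPTION_PERCENTAGE  # how much traffic to corrupt
            value: "100"
```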
C
Yeah, from that aspect, everything's time-bound. As for a glass-break — it's like there's a mixture; I believe you can cancel a test inside of here, as it runs, as like a hard glass-break stop.
B
Yeah, yeah.
B
What kind of log lines do we get out of that detection, by the way?
C
Yeah, so that's our end-to-end. I think the last one — I think Mark wanted to see us...
B
Honestly, this is one of the things I like so much about Chaos Center: I can just sit in here and I can just play. You know, I've never done something so catastrophically bad that I've not been able to just hit a button to reset everything — but yeah, today might be the day; let's find out.
C
It's a little buggy right now. I think most of that's due to me using a kube proxy and it being on a VPN; the combination of the two is a little fun. If I weren't lazy, I would actually set up a load balancer, or actually set up an ingress object and add in the proper annotations for it to spin up an internal ALB. It would actually be a lot better, but like I said, I'm lazy.
C
Let's do that; I'll schedule it now. Yeah, finish — put a scenario. This will be fun, because I have no idea what this is actually going to do.
B
So yeah, there are GitOps integrations that I have messed around with. I would assume there is something we can do to interact with Slack; I honestly don't know off the top of my head.
C
Yeah, I don't know either, and in full disclosure, us moving to v2 is kind of — this is definitely newer for us. We first grabbed onto Litmus v1; v2 is when they came and put the UI and the Chaos Center and everything in front of it.
C
V1 was entirely server- and API-based, and so what we were really doing was crafting YAML manifest files and just doing kubectl applies with a series of files that added the RBAC process and all that. This is definitely much easier, and after diving into this a lot, we both have tickets now to go look at, you know, full-scale implementing this all the way through with the UI, because it just makes everyone's life so much easier.
B
Well, yeah — it limits the grief factor and all that. Turns out people actually really like GUIs. That's interesting innovation, man, fresh out of the 70s.
C
Yeah, so I haven't honestly dove into that, so I can't really say. I believe they have a Slack integration, but again, I don't know.
C
No — I wonder if it's because it's looking for something that's not there to actually target. That would be my guess; it's looking for, like, kube-dns or something. If I actually went and read the manifest — like I said, we haven't ever done it. It's a cool thing to tinker with. Let me see if Prometheus says something interesting. Yeah.
C
Yeah, it doesn't look like that actually targeted anything correctly. I probably could have set it up wrong.
B
I mean, we could do something like a pod delete or something like that — something a little bit more innocuous.
B
Yeah, you might want a node selector for that one.
C
An EC2 instance ID — give me one second.
C
On some of it, like I said, if you're using EKS or GCP stuff, obviously authentication and authorization are going to be some form of an RBAC role — you know, defaulting back to their IAM management, using a role or something like that. Internally, I don't really know; I don't have a good answer for that, really. Like I said, we haven't played super much with isolating them and actually sharing it out to different clusters yet. I'm sure I'm going to have that same question in about a week.
C
I don't know if this will work, but we'll try it. Next, next, finish. Next — that should be the instance ID.
C
Well, they'll exist; they're persisted by a PVC, so I still have access to them. Here we go — yeah, so you can kind of see, here are the pod metrics. I mean, it's a limited use case — I think they're still building this out — but it's still kind of cool. You can see this actually needs a Prometheus scraper, which I didn't set up: there's a chaos exporter that will dump all the chaos intervals into Prometheus, as part of an exporter with a service monitor.
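If you already run a Prometheus Operator stack, wiring that chaos exporter into it looks roughly like the ServiceMonitor below. This is a sketch: the `release` label, the `app: chaos-exporter` selector, and the port name are all assumptions, so verify them against the exporter's actual Service in your cluster.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: chaos-exporter
  namespace: litmus
  labels:
    release: prometheus      # must match your Prometheus's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: chaos-exporter    # assumed label on the chaos-exporter Service
  endpoints:
    - port: tcp              # assumed metrics port name; verify on the Service
      interval: 10s
```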
C
It didn't play nice with our already-installed Prometheus — the directions they have kind of install Prometheus itself — but you can hack through it to get it running. Did this thing actually kill the node, or is it still up? Did it actually run? I might have done that wrong; we'll see. Oh, it failed.
B
Yeah, I've also seen some really good demos. Chaos Carnival is an annual conference specifically for Litmus, and there are some really good demos that came out of that.
B
We should open Prometheus and check this out.
B
Yep, yep, yeah. I really appreciate the opportunity to come show everyone us messing around a little bit with what Litmus can do.
A
Yeah, loved it — particularly the speed running was very, very nice. Perfect. Thank you so much, everyone, for joining the latest episode of Cloud Native Live. It was great to have a really good session about using the Litmus chaos engine and a microservices demo app to demonstrate automated RCA.