Welcome, CNCF community, and thanks for giving me the opportunity to join in for "Automating SRE from Hello World to Enterprise Scale with Keptn." This is an overview and introductory session on our CNCF sandbox project, Keptn. Here are all the links you need in order to find out more about Keptn: visit keptn.sh, follow the Keptn project, star us on GitHub, or join the Slack channel.
I am Andi Grabner, a DevRel for Keptn, and if you want to know more about me, feel free to reach out. We will also have a live webinar on the CNCF webinar schedule coming up next week, where I will be joined by Jürgen Etzlstorfer. There we can both show you more of Keptn live, you can ask us questions, and we will navigate you through the product.
But as a first step, if you want to learn more, definitely check out our website. From there you can reach all the tutorials and get access to additional resources, like previous recordings on different use cases, as well as testimonials showing how other users are using Keptn and how they benefit from it. We have also just recently released Keptn 0.8. It's March 2021 as I record this; depending on when you watch, there might even be newer versions, but just to let you know, this is the latest and greatest as of the time of the recording.
One of the problems we saw is that a lot of DevOps teams are challenged with very monolithic automation in their pipelines, and it becomes hard to deploy. What does this mean? An example comes from Christian Heckelmann. He is a DevOps engineer, and he is constantly challenged with pipelines that are broken.
Here is one of his screenshots, and this might be something you can relate to: some of these pipelines start small, but all of a sudden, well, it escalates, and fast. We end up with very complex scripts that do a lot of amazing things but are really hard to maintain and keep up with, especially once you have different permutations. So this is the first problem we see out there and that we want to address with Keptn.
The second problem concerns DevOps teams, or people who are in charge of tool integrations and pipelining. These pipelines tend to contain tool integrations, and they are often custom-made, custom-built, and then copy-pasted around because of a lack of standards. This is an example from Dieter, a senior engineer here at Dynatrace, and he says: onboarding or updating pipelines is manual and often error-prone. Now, his environment is much smaller than what we saw from Christian.
What's interesting: we've done some analysis, or rather Dieter has done some analysis, to see how much duplicated code we have across all the different pipelines in our different projects, and it is very eye-opening to see that there is a lot of red here, which means a lot of duplicated code. That means if there are bugs in there, or something needs to be changed, you need to change it in many different places, and often you don't even know anymore where the changes have to go. This is another problem.
We want to make this easier, because we are spending too much time on it. Another problem that we solve, or want to solve: we see a lot of SRE teams trying to get SRE practices around SLIs and SLOs, around performance testing, around chaos engineering at scale into their organization, but it is really hard to automate that at scale. Roman Ferstl, managing director at Triscon, has been working with organizations where they are limited in the number of tests they can run per year, or the number of apps they can test and validate against their SLOs.
The reason they are struggling with this is that a lot of the work is done manually. A lot of tests have to be rerun, because they only run, let's say, 15 times a year, so a lot of things change in between. It also means they have only about 10 percent of the projects in an organization onboarded; they haven't scaled it across the organization.
The reason for all this is that a lot of manual time is spent on script creation, on configuring your monitoring, on analyzing your test results and your SLOs, which you want to do if you want to get broader with your SRE practice: not only in production, but also across the whole lifecycle. So these are three problems and three challenges. Now I want to show you three examples of how Keptn users have been helped by Keptn to solve their problems. Sumit is at Intuit.
Keptn has a capability called SLO-based quality gates. They run their tests with their existing tooling and then hand over to Keptn to fully automatically and continuously evaluate their SLOs, something they had done manually before. Keptn allows them to scale. Coming back to Roman, who I brought up earlier: remember, he had about 15 to 20 tests per year and only five apps. Well, now they run 15 times the number of tests against 10 times the number of apps.
That is thanks to the automation Keptn brings in: because Keptn runs tests more consecutively, more continuously, more automated, and also automates the analysis, it really enables them to do automated performance and resiliency testing. And the third one: remember Christian, who was challenged with the ever-growing number of pipelines.
Well, they have now moved over to Kubernetes, which means new microservices and new pipelines that have to be onboarded, and they didn't want to repeat the mistakes of the previous architecture. So now they are using Keptn to orchestrate the whole end-to-end delivery pipeline: calling GitLab, having Keptn trigger their automated tests with Gatling and JMeter, using Helm for deployment, and then also doing the automated quality gate evaluation.
So these are some of the stories, and you can actually find videos of these three gentlemen, and more, if you go to the Keptn website and look at the Keptn resources; there are some other nice testimonials you can find there as well. What I really like is Taras from Facebook, who says: Keptn feels like a reference implementation of Google's Site Reliability Engineering and the Site Reliability Engineering Workbook.
This was really nice for us to hear, because it seems a lot of people understand that we really try to help, especially the SRE community, to bring automated SRE into your cloud-native continuous delivery. All right, so now: what is Keptn? Keptn is something different for different personas, whether you are an ops person, an SRE, a dev, or a performance engineer.
Whoever you are, Keptn allows you to pick a use case where you are currently struggling with automation, whether with automating it in general, or with automating it the way you want and integrating it into existing automation tools. So Keptn lets you pick the use case you want to automate: quality gates, delivery, SRE automation, or auto-remediation for production.
Depending on the use case, you then bring your configuration. For the quality gate evaluation, you bring your SLI and SLO definitions; for your performance test automation, you bring your workload definition; for your auto-remediation in production, you bring your runbooks. And best of all, Keptn doesn't execute these things itself. Keptn is an orchestrator: Keptn connects to your tools, so you can bring the tools that work well in your particular environment. Everyone has a different environment.
Everyone has favorite tools they have investments in, so you can bring these tools and connect them to Keptn. Keptn then takes your configuration and your use case and really automates the configuration of your tools, connects them, and provides the use cases completely as a self-service, and it does it all through a declarative approach.
All the configuration files are persisted, stored, and versioned in Git. Everything is centered around service level objectives (SLOs): every action Keptn takes is validated to make sure it doesn't break anything and that you are still within your SLOs. And the whole communication from Keptn to your different tools is based on the CloudEvents standard, so everything is standards-based; there is no proprietary integration.
The architecture was really driven by the new requirements we have seen. Remember: we have seen pipelines and automation scripts that grew too fast because they mixed information about process, tooling, target platform, and environments. There was also no clear separation of concerns about what developers should do, what DevOps engineers should do, and what a site reliability engineer should do. We packed everything together, and these were, I think, the fundamental problems of most of the approaches we have today.
So what we said is: in the end, we have processes, but we want to automate processes without the hard dependencies on the tooling. If you have a process on the left and you have these hard-coded dependencies, why not just break these things apart? Why not break, or remove, these hard dependencies and say: hey, we have a process that we want to automate. It may be build, prepare, deploy, test, notify, rollback.
So if you have the process on the left and the capabilities on the right, and we have a process orchestrator, then we need some way for them to communicate, and this is where eventing comes in. Keptn uses an event-based model: just as when we break monolithic applications into smaller services and then use eventing to connect them, we do the same thing here.
We allow you to define the process and, as we execute it, Keptn will send the right event at the right moment, for instance to say: hey, I need somebody that has the capability to deploy container number one in dev with a blue-green deployment strategy. Then you may have one or two capabilities: maybe you have Helm that could do it, maybe you have a Jenkins pipeline that could do it, or you have Spinnaker. Then these tools can respond.
A tool can say: yes, I can do it, because I'm registered for this and I have all the config files that I need for that environment, so let me do it. And when it's done, it sends back that the job was successfully done, or maybe that it failed, who knows, and then Keptn can continue with the workflow.
So really, what we did is we asked which events we need, and what the capabilities on the right side are, and then we connect them through eventing. From 10,000 feet, the way this looks is: you install Keptn on Kubernetes. You install the so-called control plane on a cluster, and it manages all of the workflow and all the logic I just explained; we are using NATS as the eventing engine.
Now, in order to use Keptn, somebody needs to say which processes, which workflows, which sequences Keptn should actually orchestrate and automate. You specify what type of process it is: is it a delivery process, a remediation process, a testing process? You declare this in our config files; we call them shipyard and remediation files. Shipyard covers everything related to continuous delivery until it ends up in production, and remediation covers all the remediating tasks in production.
The nice thing is, because we have a clear separation of concerns between the process definition on the one hand and the tooling and capabilities on the other, you can even have a different team define and install the execution plane, either on the same cluster or on different clusters. We just introduced Keptn 0.8, which now finally has the capability to install the execution plane in all of your different target systems. That team can then decide which tools to use in a given target environment and install these capabilities there.
These capabilities listen to CloudEvents, so it's all based on standards, and once they receive an event, they execute the action and respond. This means that in the end, the real beneficiary is the user, the dev, the ops, the SRE, who can say: I have a new artifact and I want Keptn to run an automated process for me, let's say test automation or even delivery. Keptn then starts sending the events defined by your process definition, which triggers the right tooling in your execution plane; these tools do the action and then report back whether the outcome is good or not.
The nice thing is, you can now easily change the process without having to think about which tool integrations you need to worry about, or might break, but you can also change the tooling without thinking about the process. You can say: I'm swapping from, let's say, a Jenkins pipeline that used to do my deployments to using Helm natively; or you may switch from JMeter as a testing tool to something like Neotys; or you switch from one monitoring tool to another.
Whichever monitoring tool gives you the observability data, the nice thing is you don't have these integrations hardcoded anymore: it's all process definition plus tool capabilities, and they are connected through events. So now I want to go into my first demo and show you a little bit of Keptn. All right, let me show you something I have here. And by the way, as I said, in about a week or so we will do a live webinar.
In that live webinar we will do more live demos with Keptn, so I just wanted to let you know. I've installed Keptn on an EKS cluster; this is a standard installation where I have the control and execution plane installed, and you see a couple of pods here. I also have my Keptn CLI authenticated against my Keptn environment, and I can now do things like the following. Let me just search my shell history for the artifact command.
I want to kick off a new deployment, and I'm too lazy to remember all of this, to be honest with you. What I want now is to say: Keptn, please, I have a new artifact for you, for a particular Keptn project and service, and here is my new image; now off you go. That is the "keptn send event new-artifact" command, with the project, service, and image as parameters. While this runs, I want to show you a little bit of what actually happens behind the scenes.
So here is my Keptn installation, and here is my Keptn 0.7 project. Keptn internally holds a config repo for everything it does, so for every project you get a config repo, and you can also specify an upstream Git remote; this here is my GitHub repository. What you can see here in the main branch is my shipyard file. This is essentially my process definition. This is where I tell Keptn: I want you to provide me three stages, dev, staging, and prod. You can give it different types of metadata to change Keptn's opinionated workflow.
That metadata controls what type of deployment should happen, what type of testing should happen, what type of approval should happen, what type of remediation should happen. What you see here is a shipyard file for Keptn version 0.7; 0.8 was just released as I'm recording this, so I will show you later how this changed in 0.8, because in 0.8 you are more flexible about what should happen in a stage. But I start with 0.7 here, because in the end it gets the point across of what Keptn is doing.
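As an illustration (not the exact file from the demo), a Keptn 0.7-style shipyard declaring these three stages looks roughly like this; the stage names and strategies match what the talk describes:

```yaml
stages:
  - name: "dev"
    deployment_strategy: "direct"      # deploy straight into dev
    test_strategy: "functional"        # run functional tests
  - name: "staging"
    deployment_strategy: "blue_green_service"
    test_strategy: "performance"       # e.g. JMeter load tests
  - name: "production"
    deployment_strategy: "blue_green_service"
    approval_strategy:
      pass: "manual"                   # manual promotion into prod
```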
So this is what I specified; that is my whole pipeline code, so to speak. Now, what else do I have? For every individual stage, Keptn created a branch for me. For instance, if I go into the dev branch, this is where I have all of my supporting configuration files for the individual tools and capabilities, so that they can do their job.
For instance, I have my JMeter scripts in here, because I'm using JMeter; so when JMeter is triggered later on, it can access this config repo for dev and figure out: okay, what are my files, what is my configuration? I can either specify this on a global scale for the stage, or I can do it for an individual service, because a project in Keptn typically contains multiple microservices that you want to deploy. Then you can have more specific files for a particular service.
Something was changed here, because remember, I triggered my deployment. Let me just go back quickly: I told Keptn to send a new-artifact event for this project and this service, simplenode, saying I have a new version. And one of the things Keptn does, the way I specified it, is the following.
Keptn will first send an event saying: hey, Andi wants to change the version, so take this version information and update it in the files where it's necessary. So the first thing that actually happens is a version change, and it made the update here, which is nice, because I don't have to deal with it and you don't have to do it. You can also do it through your regular GitOps approach, where you change your configurations in that Git repo yourself and then trigger the rest of the Keptn workflow.
If I go back to staging, this is probably still build number three, right? Build number four is in dev and build number three is in staging, and because I sent it on its way, Keptn will now go through the whole process until, hopefully, it ends up in prod. And you will see here that I actually didn't clean up my environment from some previous demos.
I had a couple of builds and runs earlier that made it all the way into prod, or rather almost all the way into prod, because I specified in my shipyard file that I want direct promotion from dev to staging if a build is good, but from staging to production I always want a manual approval. This is why these builds are waiting here, and now I get the overview of my SLIs and my SLOs and can make a go or no-go decision.
These are some old test runs that I never approved; that's why they are still lingering around. But what's interesting: this gives me an overview of what is currently deployed in which stage. And if I click on services, then I see in the list all of my previous attempts from my previous demos when I ran deployments. They can be triggered through the CLI, through a webhook, or as part of a GitHub Action, whatever you want.
If I go to build number four now and click on it, on the right side you actually see all these events I talked about earlier. Remember, in my animation I said Keptn sends events and with this triggers the capabilities, and once they pick up the job they say yes, I'm doing it, and then they send a message back once they are done. This is really neat, because I see exactly what is happening: deployment, tests finished, quality gates enforced, and then it goes on into the next stage.
Now let me switch to build number three, which I ran a little earlier, because here I see a little more: in this case the build was promoted all the way from dev into staging, and in staging we also ran some JMeter tests and did some more extensive quality gate evaluations, until it ended up waiting in prod for approval. So I can actually now, finally, kick this off and push build number three into prod.
Build number four is already on its way, but I'm good with this. So this was a quick overview. What you should take away is that in Keptn everything is declarative, meaning you declare what kind of process you want to automate; we have the shipyard file, for instance.
You then also add all of your configuration files for your specific tools and capabilities into that Git repository for a particular stage, either for the overall stage, meaning all your test files for that stage, or specific ones for a particular service, and then Keptn orchestrates everything for you. Keptn also has a very rich API. We have a Swagger UI where you can explore the API, and this is where you can, for instance, trigger an event from the outside, or create your projects and services.
You can fully automate Keptn; there are a lot of different options here to access the Git repository, upload files, download files, up to you. And you may ask: okay, but how do I build a new service? How do I extend it? Well, services are basically listeners to events, and a good starting point is the keptn-sandbox organization.
That is a great way to first see what types of services exist. We have a sandbox, we have a contrib, and we have the core Keptn project. In the sandbox you find, for instance, the Locust service and the Litmus service; you see there is a lot of stuff already built, like the Monaco service and the GitOps operator. We have a lot of things here already.
So what you've seen is a quick overview of how Keptn works. Now, the nice thing is that Keptn can easily be integrated into existing pipelines and existing tooling. That was also our goal: we don't want to replace everything, we want to extend it. We want to automate things that are currently hard to automate. One of the examples is Patrick: they are using GitLab for CI, for building their containers and pushing them to the registry, and then they kick off Keptn, and Keptn is then doing the delivery for them.
SREs especially are asked to run more performance tests and more chaos tests. You need to bring in observability data from tools like OpenTelemetry or your APM solutions, and there is so much data that it is really hard to then analyze it from build to build, from deployment to deployment. It is all possible, but it's not easy.
So we said: we want to tackle this problem and make it core to Keptn, and for this we looked at Google's SRE practice. SRE stands for site reliability engineering; I guess I'm not telling you anything new, but for those to whom it is new, it's actually very simple. You have SLIs, your service level indicators: a metric that you can measure and that is important to you, like the error rate of login requests. Then you specify what your objective is for this metric.
For instance, you want to make sure that the login error rate stays below two percent over a 30-day period, especially in production. These are the things you define. And then SLAs, probably the most well-known, are things like what happens if you miss your SLOs: you may have a legal contract, you may have some obligation, or you may lose users, whatever that is. In the end, Google did a great job advocating for this principle as part of the site reliability engineering practices, with great videos and great books.
I like the tagline: SLIs drive SLOs, which inform SLAs. Now, we thought it's great that more and more organizations are looking into using SLOs as part of their production deployments and production monitoring; you can use SLOs for individual services and applications, for different types of metrics.
You use them, and the error budget status, to make decisions on whether or not to deploy. But we thought: why not take the same concept and use it for everything we do, from when you create your first container image until you deploy it in dev and run your tests? Why not use the same concept of looking at metrics and then validating whether they are within what I'm expecting?
Whatever my expectations are. This is why we bring Keptn quality gates as a core component, based on the concept of SLIs and SLOs: metrics compared against objectives. Keptn just analyzes the metrics that are important for you, with every commit and with every build, and then makes a decision: good or no good.
Now, these might be different metrics and different thresholds than the ones you have in production, I understand that. This is also where you typically use regression detection between builds, because you want to know: did the new build maybe increase CPU consumption by 20 percent, or are you making 50 new database calls to the back end? This is something you want to flag. These might not be SLOs that are interesting for you in production; well, they might be.
But what I'm saying here is that we allow you to also specify different SLIs and SLOs as part of a quality gate. So, at a very high level, how does this work? You specify your SLIs in Keptn: what metrics you want, from whatever tool and data source; it could be Prometheus, it could be Dynatrace, it could be Wavefront, it could be any of the other monitoring tools. Then you specify your SLOs, where you can say: I expect this metric to be within a certain range, or I don't want this metric to go above a certain baseline, looking back at previous builds.
So if build number one comes along and everything is green, then great, and Keptn will tell you you're good to go: one hundred percent. If build two comes along and it seems you are slower on response time and failure rate, then you get penalized, getting a 75, and then you can decide whether it's still good to go, yes or no.
If everything is green, we're good to go. So that's how it looks in Excel; this is now how it looks in Keptn. The way Keptn treats SLIs and SLOs: you specify your SLIs as indicators in SLI YAML files (I don't want to start the YAML versus JSON debate now). You basically say: these are the metrics, and then you put the query next to each, in the query language of the particular tool you are using. And then you specify your SLOs in a separate file.
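As an illustration of the shape of these two files (the metric names and queries here are examples, not the ones from the demo), the SLI file maps indicator names to tool-specific queries, and the SLO file defines the objectives and scoring:

```yaml
# sli.yaml: indicator names mapped to the data source's query language
spec_version: "1.0"
indicators:
  response_time_p95: "builtin:service.response.time:percentile(95)"  # Dynatrace-style query
  error_rate: "builtin:service.errors.total.rate"
---
# slo.yaml (a separate file): objectives evaluated per build
spec_version: "1.0"
comparison:
  compare_with: "single_result"       # compare against the previous evaluation
objectives:
  - sli: response_time_p95
    pass:
      - criteria: ["<=+10%", "<600"]  # no more than 10% slower, and under 600 ms
    warning:
      - criteria: ["<=800"]
  - sli: error_rate
    pass:
      - criteria: ["<=2"]             # error rate at most 2 percent
total_score:
  pass: "90%"
  warning: "75%"
```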
If you have those, you can then ask Keptn: please evaluate. This is also a valid use case on its own; you can just say: Keptn, the only thing I want you to do is evaluate my performance metrics, my SLIs and SLOs. Keptn will then send an event saying: hey, which tools can give me these SLIs? Here are all the definitions. Then whatever tool you've connected can report the values. Keptn takes these values, scores every single value based on the SLOs, and then comes up with a total score.
That total score is then translated into pass, warning, or fail.
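To make the scoring idea concrete, here is a minimal sketch in Python of this kind of weighted pass/warning/fail evaluation. It is an illustration of the concept, not Keptn's actual implementation; the thresholds mirror the defaults shown above (90 percent to pass, 75 percent for a warning):

```python
def evaluate(results, pass_threshold=0.9, warning_threshold=0.75):
    """Score SLI results the way an SLO-based quality gate does.

    results: list of (met_pass, met_warning, weight) tuples, one per SLI.
    Returns (score, verdict) where verdict is "pass", "warning", or "fail".
    """
    total_weight = sum(weight for _, _, weight in results)
    earned = 0.0
    for met_pass, met_warning, weight in results:
        if met_pass:
            earned += weight        # full points: all pass criteria met
        elif met_warning:
            earned += weight * 0.5  # half points: only warning criteria met
    score = earned / total_weight
    if score >= pass_threshold:
        return score, "pass"
    if score >= warning_threshold:
        return score, "warning"
    return score, "fail"

# Two SLIs fully green: 100 percent, pass
print(evaluate([(True, True, 1), (True, True, 1)]))   # (1.0, 'pass')
# One SLI only meets the warning criteria: 75 percent, warning
print(evaluate([(True, True, 1), (False, True, 1)]))  # (0.75, 'warning')
```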
Now let me go quickly back into Keptn to show you how we did this with Dynatrace. This also works for Prometheus and others; I'm just using Dynatrace because that's where my day job is, so I'm familiar with that tool. With Dynatrace, we allow you to simply build a dashboard, and Keptn will automate all of this. Let me go back to my Keptn instance.
Now remember, I told you that normally you would go into your Git repo. If I go to staging, since in my case I'm using Dynatrace as the monitoring tool, here is where I specify my SLIs: I could go in and specify all of my queries in the Dynatrace query language. So I say: hey, there is an SLI called process memory, and this is how you query it.
You can do all that, and you can then also specify your SLOs in the YAML, just as I showed you earlier; here are all the SLOs, and I'm sure somewhere in there is the process memory one, with pass and warning criteria, weight, and so on and so forth. So you can do this, and while this is great, we did something to make it a little easier, because not everybody is there yet to do everything-as-code from scratch.
For our integration, we said: if you're using Dynatrace, you can also just build a dashboard. I have an observability platform; I build a dashboard, which is what you would normally do anyway. You normally build a dashboard, put all your metrics on it, and then you typically have an idea of how the metric should look.
I basically put in all the metrics that are important for me, and then additionally, if I zoom in here a little bit: instead of me looking at them and deciding what value I expect and what is good, I can specify my rules. Pass: it should be faster than 600 milliseconds, and it should not slow down by more than 10 percent. I can do this on service-level metrics and on transaction metrics.
The dashboard is then the source of truth: Keptn generates the SLI and SLO YAML out of it, so that internally it can process it the same way as with all the other monitoring tools. And then back in my Keptn's Bridge I have all the results, for every single metric, for every single run. I can look at them, and I can also click on the chart and then see things over time, which is really nice.
Now I also want to quickly highlight Keptn 0.8. I mentioned 0.8 was just released, and I'm really happy about this, because there are some really cool new capabilities in it. You have a nicer way of visualizing the stages, so you can really easily click and focus on the sequences. For the SLO validation, we now get a nice table overview, which was missing earlier: what's the SLI, what's the value, what are the pass and warning criteria, what's the result, what's the score, how much does it contribute to the 100 points?
This is all here, and obviously you still have the charts, as I just showed you. You can also ignore tests or runs in case you had a major issue that you are aware of, so that it doesn't pollute your baseline. So there are really cool things possible now, and especially a nicer visualization.
Quality gate evaluation is core to Keptn; we always evaluate our SLOs. But it also means you can use it standalone, and this is, I have to admit, the first use case that people start with. They say: I may already have my pipeline, I already deploy, like Christian here, with GitLab, and I already kick off some tests, but I have not yet automated my test validation, so I want to use Keptn for that. In this case you just trigger the Keptn evaluation from your existing, let's say, GitLab pipeline.
We can do the same thing for processes that we want to trigger as part of a problem in production. For instance, if you have a monitoring tool that alerts you that the conversion rate dropped and the root cause is CPU pressure, then you can specify, similar to the shipyard file I showed you earlier, a remediation file that says: these are the steps I would execute as remediation. And again, Keptn takes these steps, these actions, and treats them just like the delivery process.
Keptn sends an event asking: who has the capability to execute this particular action on that system? Because this is what I want to automate. So, for instance, the problem comes in and Keptn says: well, the first step is scaling up, so please scale up, whoever provides that action. And remember, as part of the auto-remediation we also validate the SLIs and SLOs, and also what we call BLOs, business level objectives, because in production you typically also bring in some end-user metrics.
Really cool, but I know it's also really scary: a lot of people don't trust this in production when they are running it for the first time. This is why we are also partnering, and we've seen a lot of movement here; this is why it's great that Jürgen will be with us next week in the live webinar, on integrating Keptn in pre-production with chaos engineering. So Keptn can trigger your performance tests to run some load against your system.
Another cool thing, and I showed you this briefly in the demo: we went to a different shipyard model. We now have shipyard version 0.2.0, which allows you to be more explicit about what should happen in each stage. In the previous versions we were very opinionated; that's the right word.
We were very opinionated on what happens in a stage. Now we give you more freedom: you can define your own tasks and sequences, and you can say which sequence should trigger when and what should happen after which sequence. It gives you more flexibility.
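As an illustration (names and task properties are examples in the style of the shipyard 0.2.0 spec, not a file from the talk), a stage now declares its own named sequences of tasks:

```yaml
apiVersion: "spec.keptn.sh/0.2.0"
kind: "Shipyard"
metadata:
  name: "shipyard-example"
spec:
  stages:
    - name: "dev"
      sequences:
        - name: "delivery"
          tasks:
            - name: "deployment"
              properties:
                deploymentstrategy: "direct"
            - name: "test"
              properties:
                teststrategy: "functional"
            - name: "evaluation"      # the SLO-based quality gate
    - name: "staging"
      sequences:
        - name: "delivery"
          triggeredOn:
            - event: "dev.delivery.finished"   # chain after the dev sequence
          tasks:
            - name: "deployment"
              properties:
                deploymentstrategy: "blue_green_service"
            - name: "evaluation"
```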
So have a look at Keptn 0.8. The best way to get started is to go to the tutorials; make sure you choose the right version. And if you have any more questions, feel free to reach out to us: make sure to follow us on Twitter and visit our website.
Join us on Slack, and yeah, make sure to also join us live at the CNCF live webinar, where we will talk about Keptn; there you can ask all your questions and we will just go through the product. Thank you so much. Happy SRE-ing, and happy scaling from a small project to a large enterprise scale. Thanks.