From YouTube: Argo: Real Enterprise-scale with Kubernetes
A
Okay, I'm Libby Schultz; I'll be moderating today's webinar. We would like to welcome our presenters today: Al Kimner, principal software engineer and architect at New Relic; Daniel Jimble, staff engineer at New Relic; and Caleb Trotton, product manager, Telemetry Data Platform at New Relic. A few housekeeping items before we get started: during the webinar you are not able to talk as an attendee. There is a Q&A box at the bottom of your screen. Please feel free to drop your questions in there and we'll get to as many as we can at the end. This is an official webinar of the CNCF, and as such it is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct, and be respectful of all of your fellow participants and presenters.
B
All right, hello and good morning, good afternoon, or good evening, everyone. Welcome. Before we get started with the presentation, I want to have a word from our legal team, so this is our safe harbor slide. We can move on; next slide, please. I'm Al Kimner, principal engineer and architect. I help engineering teams build software and systems that are simple to maintain and scale. My favorite hobby is scuba diving.
B
We have a packed agenda for our presentation today, with two compelling demos. I'm going to give you an overview of New Relic's ingestion, streaming, and storage architecture, which should set the stage for what problems we have and how Argo fits into that. I'll cover our use of Argo CD and the scale at which we're using it. Caleb will walk us through how Argo Rollouts gives us a better experience than a Kubernetes rolling update for a deployment, showing a demo of a deployment with an automated canary analysis.
B
After that, we'll cover additional needs we have with orchestration at scale and how Argo Workflows helps us there. Daniel is going to cover how we use Terraform and Open Policy Agent with an Argo Workflow to safely roll out infrastructure-as-code changes. Our main objective today is that you'll be able to understand how to safely implement continuous delivery at scale for both Kubernetes resources and infrastructure-as-code pipelines.
B
This is where Argo CD enters the picture. First, some history about Argo. Argo was created in 2017 at Applatix, which was acquired by Intuit in January of 2018, who open sourced Argo a few months later. BlackRock contributed Argo Events to the Argo project. Argo then joined the CNCF in April of 2020.
B
So why Argo? Well, at New Relic we are constantly evolving our systems, along with our internal engineering processes and operations. One of those changes was introducing Kubernetes. Kubernetes was a good fit for us because we have been multi-cloud for many years and our services exist across multiple public cloud providers and private data centers.
B
This is a long list of features that made it compelling for us to pick Argo CD for our continuous delivery needs, and this is not even the full list; I just ran out of room on the slide. One of the main drivers was for us to have the ability to easily manage and deploy to multiple Kubernetes clusters with a GitOps workflow.
B
Working at a company that focuses on observability, it would not feel right without sharing some stats about our current Argo CD instance. We are at approximately 3,000 applications and over 10,000 Kubernetes deployments in the last month. The Kubernetes clusters are very big, with most over a thousand nodes. We've segmented our services into different workloads, and workloads are assigned to different-size Kubernetes clusters. I'm pointing all this out because we have lots of variables, with dozens of internal engineering teams, a whole bunch of services, and lots of changes.
C
Thank you, Al. So I'm going to talk to everybody today about Argo Rollouts and how that helps us with the safety of so many deployments happening every day.
C
So with hundreds, if not thousands, of deployments a day, we need a way to make sure that changes roll out safely and don't require an extreme amount of effort from engineers to make sure that the deployment works right. For this we like to use a canary deploy strategy. If you're not familiar with the canary deploy strategy, it involves rolling out a change to a small subset of instances first, verifying that it's healthy, and then rolling it out to the rest.
C
This can totally be done manually, in terms of verifying whether the canary is safe, but we don't want to have a human involved for 30 or 60 minutes every time a deploy is made when we're making hundreds of them a day. So something that we were really looking for was automated canary analysis.
C
So we looked at Argo Rollouts for this, because the standard Kubernetes Deployment resource doesn't provide most of this stuff that you see here. Its rolling update strategy allows you to roll things out slowly, one at a time, as long as the probe conditions are met, but not really that advanced use case of stopping, pausing, and running analysis in more granular steps.
C
So Argo Rollouts does provide this stuff for us, and I'm going to walk you through now some of the pieces of Argo Rollouts and why it was compelling.
C
So let's talk about experiments first. An experiment, at its core, creates two different ReplicaSets.
C
Each
of
those
replica
sets
has
its
own
pod
spec
template,
so
you
can
deploy
something
with
as
small
of
a
change
as
a
different
version
in
your
docker
image,
or
you
could
have
two
wildly
different
pod
specs,
it's
up
to
you.
C
What you do with those pods with just an experiment is up to you; they'll be run and you can go poke them however you see fit. But where it gets a little more interesting is when you pair an experiment with an analysis template to be run against those pods. An analysis template describes the metric providers.
C
From there, the experiment will use the analysis template and initiate an analysis run. An analysis run is really just an instance of that analysis template with arguments filled in, typically with information about one or more of those pods in your stable or canary ReplicaSet.
C
And finally, what most folks actually interact with in Argo Rollouts is the Rollout resource. Like I said, this is a drop-in replacement for Deployment. If you didn't use any of Argo Rollouts' advanced features and didn't specify a canary strategy, you could just use it like a drop-in Deployment replacement, but you're not going to get all the goodies in it until you specify a special strategy like canary or blue-green deployments.
C
So on the right here we have a really basic example, mostly taken from the Argo Rollouts docs, showing that most of the spec is just like a Deployment. However, what we see here is an alternate strategy with canary. In this example, what we're doing is deploying 20% of the instances first, pausing for five minutes, and then running a one-time analysis using the success-rate analysis template and passing in an argument with the service name of the service being deployed.
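The canary steps described here could be sketched like this (the name and image are placeholders for illustration, following the shape of the examples in the Argo Rollouts docs):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: webinar-demo-app          # name assumed for illustration
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20             # shift 20% of pods to the new version
      - pause: {duration: 5m}     # bake time before analysis
      - analysis:                 # one-time analysis after the pause
          templates:
          - templateName: success-rate
          args:
          - name: service-name
            value: webinar-demo-app
  selector:
    matchLabels:
      app: webinar-demo-app
  template:
    metadata:
      labels:
        app: webinar-demo-app
    spec:
      containers:
      - name: app
        image: example/webinar-demo-app:1.0   # placeholder image
```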
C
The analysis type that's used here, running a one-time analysis at the end of that pause duration, is one way that Argo Rollouts lets you configure analysis to run, but there are other ways, including running analysis in the background the entire time that your canary steps are progressing.
C
Last, before I get to the demo, I want to talk about all the different metric providers that you can specify in an analysis template. So first, you can run a Kubernetes Job. This would instantiate a Kubernetes Job on the cluster and just look for the exit status of that job. If it exits zero, your job is successful.
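As a sketch of the Job provider just described (the image and command are assumptions, not from the talk), an AnalysisTemplate metric can wrap any container whose exit code decides success:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: smoke-test
spec:
  metrics:
  - name: smoke-test
    provider:
      job:
        spec:
          backoffLimit: 0
          template:
            spec:
              restartPolicy: Never
              containers:
              - name: check
                image: curlimages/curl    # any image works; exit 0 means pass
                command: [sh, -c, "curl -fsS http://webinar-demo-app/healthz"]
```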
C
Also, you can directly query different metric providers: Prometheus, if you have PromQL queries that you want to run, as well as a number of commercial providers, which includes ourselves, New Relic, plus Datadog and Wavefront. The demo I'm about to show you is going to be using New Relic as the metric provider, because that's what we use here.
C
So let me pop out of the presentation and start showing you some stuff. I'm going to start first with a Rollout resource. What I'm going to demo is: I have an Argo CD application with two resources in it, a Rollout and an analysis template. This is the Rollout; you can see we have five replicas, and then we're going to have this canary deploy strategy where we deploy 20% of those resources, aka one instance, and we're going to pause for only 20 seconds.
C
This
is
a
demo
on
a
webinar
and
I
want
to
keep
it
a
bit
snappy
and
then
we're
going
to
run
a
one-time
analysis
against
the
error
rate
of
the
application,
we're
going
to
be
passing
in
a
an
application
name,
which
is
webinar
demo
app,
the
canary
hash.
This
is
a
something
provided
by
argo
rollouts.
C
It is the ReplicaSet identifier segment of the pod's name, and the "latest" value here basically says: give me the pod template hash from the canary group. And this is just another argument into our analysis template, saying how long to run our query for in our metric provider, so against New Relic.
C
The rest of this spec looks exactly like a typical Deployment spec. We have an image; we have environment variables. Most of these environment variables are just hooking up the New Relic agent, and then we have one environment variable that takes the Rollout pod template hash and, using the Kubernetes downward API, makes that available as an environment variable in our container. We are then using that to add to the instrumentation by the agent, so that we can pick up in our transactions whether a given pod belongs to the canary group or to the stable group.
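The downward-API wiring described here can be sketched as follows: Argo Rollouts labels each pod with `rollouts-pod-template-hash`, and that label can be surfaced to the container (the environment variable name is an assumption):

```yaml
env:
- name: POD_TEMPLATE_HASH        # name assumed; read by the app's instrumentation
  valueFrom:
    fieldRef:
      fieldPath: metadata.labels['rollouts-pod-template-hash']
```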
C
This error-rate analysis template that we're referencing here takes four arguments. It takes the application name and the canary hash; we saw those in our Rollout. We have this "since" that is defaulting to one minute. So for any rollout using this analysis template, for any of these arguments that have a value there, that value is the default, and you don't have to pass it in. For the arguments that don't have a default, you're required to pass in a value from your Rollout; we're passing in 20 seconds as the "since" here. The error threshold we didn't pass in; we're using the default, which is 1.0, a one percent error rate threshold. We're specifying a failure condition, which is that the error rate is greater than or equal to the threshold.
C
So this will fail if the error rate goes above one percent, and then the query that we're giving to New Relic, the shorthand of it, is: what is the error rate of this application with this pod template hash?
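Putting those pieces together, an error-rate AnalysisTemplate along these lines would match the description (the argument names and the NRQL query are a sketch, not the exact manifest from the demo; the `newRelic` provider also needs credentials configured in a secret on the cluster):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate
spec:
  args:
  - name: application-name
  - name: canary-hash
  - name: since
    value: "1 minute"            # default; callers may override
  - name: error-threshold
    value: "1.0"                 # default: 1% error rate
  metrics:
  - name: error-rate
    failureCondition: result.errorRate >= {{args.error-threshold}}
    provider:
      newRelic:
        query: >
          FROM Transaction
          SELECT percentage(count(*), WHERE error IS true) AS errorRate
          WHERE appName = '{{args.application-name}}'
          AND podTemplateHash = '{{args.canary-hash}}'
          SINCE {{args.since}} AGO
```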
C
So let me jump over here to Argo CD for a second, to this application. I have this Rollout running already. I am going to make some changes to it now, and I would caveat that I would typically do this in a more GitOps fashion.
C
But again, this is a demo and I want to keep it snappy, so we're just going to edit the manifest directly to simulate a new deployment. I'm going to deploy release 4 and we're going to see what Argo Rollouts does with this. You'll see, first, that it spun up a new ReplicaSet with one pod and scaled down the stable ReplicaSet to four pods, so we still have five pods total running.
C
In a second, what you're going to see (here we go) is that an analysis run was executed, and very quickly we see that the canary ReplicaSet has been scaled up to five pods; the old ReplicaSet is scaling down to zero.
C
So let's go the other way: let's deploy something bad. I have a version of this application that boots up completely fine, but there's a bug in it that causes all of the background processing that it does to error. So let's deploy that version.
C
I'm going to show you something while we're waiting here. This is New Relic, and I have a query here that is showing basically the same thing, the error rate for this application, and we're already seeing that this has just recently spiked up.
C
If I jump back to Argo CD, you're going to see that another analysis run occurred, and the newer ReplicaSet, rev6, scaled down because the analysis run failed. So it automatically rolled back, and the previous stable ReplicaSet scaled back up to its full five instances.
C
If you look at the analysis run, we get some events: the analysis failed, and specifically the error-rate metric failed. And if we look at some of the data behind the scenes, we have a full 100% error rate; of course we want to roll it back.
C
This kind of automated metric analysis is really important to us with the scale and the sheer number of deployments that we're doing in a day, again hundreds, if not thousands. That's not the kind of time and attention that we want to force engineers to pay. This is really allowing us to continue to move fast while making changes in a safe manner.
B
Cool, thanks, Caleb, that's a pretty compelling demo. I hope, everyone, this shows you exactly how to implement safe continuous delivery using Argo Rollouts, and how we do it too.
B
A common approach is to scale out deployments to other regions, but this doesn't work if your applications are sensitive to latency and you need to be close to where your customers are. It might seem straightforward to just keep scaling out your Kubernetes clusters by adding more nodes, but that's actually only a good practice up to a point. Our clusters are already thousands of nodes, so it seems like we need another mechanism.
B
This allows us to continuously deploy changes to a small subset of our cells without impacting all cells. Incorporating a cell architecture into the automated canary analysis deployment that Caleb showed means we can have really high confidence that our changes are not causing an issue. Not only are we doing canary analysis inside of an application deployment; it's also now inside of a cell that's isolated from our whole environment.
B
So the telemetry data platform looks like this inside of a region, where we can just keep adding cells as we need more capacity, in isolation, and we end up with N number of cells. This architecture can be applied to any number of applications; you just have to look at how you route and shard your data set.
B
The amount of orchestration at this scale requires us to have a flexible orchestration system that many teams can interact with, and that's where Argo Workflows comes in. Argo Workflows is perfect for this. In the systems we just talked about, moving to a cell architecture, we have approximately 20 teams involved.
D
I'm going to talk about how we have implemented our Terraform pipeline using Argo Workflows. When we started with our proof of concept to integrate our existing Terraform code into Argo, we had some requirements to accomplish. We had to use Argo Workflows to run Terraform, and every step had to be idempotent; that way it can be run multiple times if needed without affecting the current infrastructure.
D
Argo Workflows runs Docker images under the hood, so we had to create a new image to fit our needs, using Terraform, tfenv, OPA, and Conftest. Also, the image takes different inputs, and it doesn't require more interaction than plain Terraform.
D
We already had some existing Terraform code, where we were previously creating our infrastructure with another pipeline, but we had to make a few changes to make it even better. We had to switch to Terraform workspaces, so we don't have to duplicate the code for every cell when the code is always the same and only the variables change; and if we need to override some values for a specific cell, we can specify them in a .tfvars file.
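The workspace-per-cell flow described here amounts to something like the following (the cell name and file name are illustrative, not from the talk):

```shell
# One shared code base, one Terraform workspace per cell;
# per-cell overrides live in a cell-specific tfvars file.
terraform workspace select cell-01 || terraform workspace new cell-01
terraform plan -var-file="cell-01.tfvars" -out=plan.out
terraform apply plan.out
```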
D
The Open Policy Agent covers the need of making sure that the Terraform code that we're applying won't do any undesired changes, such as deleting a Kubernetes cluster, for example. It uses the Rego query language, where teams can specify their acceptance policies, and if the Terraform plan doesn't pass the OPA policies, the process is cancelled and it exits with an error. I'm going to show a quick demo.
D
So this is the Workflows interface, and here I have some Terraform code. What I'm going to try to show is that we are going to simulate that we are creating a Kubernetes cluster.
D
Then we are deploying some apps into that cluster, and, independently, we are creating an S3 bucket. If we go to the main.tf, I'm just using a null resource, because I'm not going to spend time creating a new cluster right now. But the interesting part here is this terraform.rego file, which is the one that allows us to make sure that we are applying things in a safe way.
D
For example, we have this resource scoring, where we are setting different weights. If the plan is going to delete a resource, it's going to add a hundred points; if it creates a new one, it's going to add ten points; and if it modifies one, it's going to add one point. We see that our blast radius is 30, so if the overall calculation of all the resources is over 50, it's going to be cancelled.
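As a rough illustration of that scoring (the real implementation is a Rego policy evaluated by OPA/Conftest against the JSON Terraform plan; the weights and the 50-point threshold here are the ones mentioned in the talk):

```python
# Minimal sketch of the "blast radius" scoring described above. The real
# check is a Rego policy run against `terraform show -json plan.out`.
WEIGHTS = {"delete": 100, "create": 10, "update": 1}
THRESHOLD = 50

def blast_radius(actions):
    """Sum the weight of each planned action ('delete', 'create', 'update')."""
    return sum(WEIGHTS.get(action, 0) for action in actions)

def plan_allowed(actions):
    """A plan passes only if its total score stays at or under the threshold."""
    return blast_radius(actions) <= THRESHOLD

# Creating a cluster, its apps, and an S3 bucket scores 3 * 10 = 30: allowed.
# Renaming the cluster plans a delete plus a create, 100 + 10 = 110: rejected.
```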
D
Here we have some secrets that are already on the Kubernetes cluster that allow us to clone the repository, and then here is where we launch the Docker image, and we are passing the values to the different environment variables that we need. For example, we need to pass the directory where the code is, and the Terraform version if we want to force one; otherwise it will detect the Terraform version written in the code and will download it automatically if it isn't already present in the local container. Then we specify the workspace, which basically is the name of the cell in our case, and where the OPA file is (we can use Conftest as well), and then the action that we're going to apply, which can be plan or apply. Then we have our AWS access keys, which again are stored on the cluster and allow us to communicate with AWS.
D
Anyway, this workflow here is simulating the one that I showed previously, the huge one. Basically it's a DAG, and I use tasks; I have three tasks here. One task creates the Kubernetes cluster, and it refers to the workflow template that I showed previously.
D
We are only going to pass different values: the workspace, in our case the cell name; the path that we want to apply, in our case the directory where our OPA file is; and then the action, which is apply. Then the second step is our deploy-apps. In that step, we can see that it has a dependency, so that task will be launched after the previous one is finished, and the only thing that changes is the path, and here we're using a Conftest policy instead of OPA. And then for the create-s3 task, we don't have any dependency, which means that at the beginning both the Kubernetes cluster and create-s3 are going to run in parallel. So let's go here.
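The DAG just described could be sketched as follows (the template and parameter names are assumptions; the real WorkflowTemplate was only shown on screen):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: provision-cell-
spec:
  entrypoint: main
  templates:
  - name: main
    dag:
      tasks:
      - name: create-k8s-cluster
        templateRef:
          name: terraform-runner        # shared WorkflowTemplate (name assumed)
          template: terraform
        arguments:
          parameters:
          - {name: workspace, value: "{{workflow.parameters.cell-name}}"}
          - {name: path, value: "clusters/"}
          - {name: action, value: apply}
      - name: deploy-apps
        dependencies: [create-k8s-cluster]   # runs only after the cluster exists
        templateRef:
          name: terraform-runner
          template: terraform
        arguments:
          parameters:
          - {name: path, value: "apps/"}
      - name: create-s3                      # no dependencies: runs in parallel
        templateRef:
          name: terraform-runner
          template: terraform
        arguments:
          parameters:
          - {name: path, value: "s3/"}
```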
D
So I'm going to submit the workflow. In that case it's just a Workflow, not a WorkflowTemplate, and I'm going to pass a parameter, which is the cell name.
D
I can check the logs from the console; we can see different colors, which in our case means that every color is a different step or a different container. Or we can go to the web UI, which is nicer.
D
So it's detected the Terraform version that we are using.
D
Then the workspace webinar-demo doesn't exist, so it creates the workspace automatically.
D
So now let's go to the code, and imagine that I say: oh, I'm going to change the resource name. If you are familiar with Terraform, you know that what this operation is basically going to do is destroy the previous cluster and create a new one. So: change the name, commit the changes.
D
Now it detected the workspace already, because it has been created before, and it says that it's going to destroy a cluster and it's going to create the new one that we have set. So the total score is 110, because it's 100 for deleting resources and 10 for adding one.
D
And then it says that it failed the OPA checks and it cancelled the operation, and as it cancels the operation, we can see that the deploy-apps step didn't run. And that's basically how we are using Argo Workflows with Terraform.
B
Cool, Daniel, that's a pretty sweet demo. It definitely showcases how we use Terraform and OPA with an Argo Workflow to safely roll out our infrastructure changes. This fits nicely with what Caleb showed: combined with Argo CD and Argo Rollouts, we can build new cells in a safe and automated fashion without any degradation of our service, at scale.
B
We had a question about CRDs. I think it was related to machine templates for k8s and how you manage the complexity of that. We do that with Argo Workflows: essentially, we'll have a step in a workflow that uses Argo CD and Argo Rollouts to push out the CRDs, and then we can run any validation that we want, and then you can have another step in the workflow that does whatever you need to do with machine templates, and those are treated as code as well.
B
The next question: how would the list of applications and the application details UI look with 200 microservices? Is it a drill-down UI, or is everything going to be on the same page? Caleb, you want to take this?
C
Yeah, I can take that, since we deal with having 3,000 applications in one Argo CD. Any individual application looks like what you saw in my demo, where it's scoped to just the resources that that application controls. The list of applications is one big list, but it is filterable, so you can group them together by project, which, at least in our case, we typically tie to a namespace or a team. And then you can also filter by which cluster you're targeting; you can put arbitrary labels on those applications and filter by that. Kind of a robust filtering system there. So it's not exactly drill-down, but it's like a filterable list.
C
I can take that too, because I think it probably depends on your DAST tool, but I can imagine using either a job metric provider in a canary release or a web metric provider, and, for example, running a Kubernetes Job that goes and asks your DAST tool to do a thing and then inspects the results.
B
Unfortunately not, but I think we will probably create a public blog post on New Relic and be able to share a lot of the content.
B
It kind of depends what the step in the workflow is doing. If you're using Rollouts, then you're in good shape, because it automatically rolls you back. For the Terraform stuff we showed, essentially, if it passes the OPA checks and then it fails for some reason, then you're going to get alerted and have to dive in to see what's wrong.
C
Are we running an Argo for each Kubernetes cluster, or do we have one that manages all of them? The short answer is we have one that manages all of them. The real answer is we have two that each manage all of them, but we definitely take the singular approach. The only one of those components that is deployed to every cluster is the Argo Rollouts component, because that is a controller that needs to run on every cluster; everything else is centrally managed in one.
B
I think the one from Alexi, probably: how do we create another environment for the application, as new instances of the application, which consists of hundreds of services and components and additional dependencies? For example, it might require creating a PV resource and may rely on centralized instances of, say, RabbitMQ. Can you copy an application to create new instances, i.e. copy dev01, test04, staging, etc., or do you create one from scratch every time?
C
Sure. This is why we do GitOps when creating a new version. Especially, I think, with the stuff that we're deploying, a lot of these are Helm charts, and Argo CD handles Helm and Kustomize and other renderers very well.
C
So you wouldn't necessarily be creating one from scratch, but you would be creating a new application with the exact same Helm chart, but with different parameters for your different environments, and that's how you deploy a new one. Yep, cool.
B
Yeah, so each workflow that a team owns has its own state backend in an S3 bucket.
B
That one I'm not sure of. I think you'll be able to look at the logs, but if you need to try to exec into the container, you might have to introduce your own pause mechanism there or something.
B
So this is a good question: does it compare versions of Kubernetes resources against what Argo created, for manual edits of things like envs, resource limits, requests, etc.? So yeah, it has a sync component to it. If you edit the state of a resource that is managed by Argo, like Argo CD or Argo Rollouts, if you edit that manually with kubectl, it'll essentially be out of sync with the Git repo it's coming from, and this is where GitOps comes in: if that happens, the next sync will essentially blow away those changes.
C
I can actually show this real quick, because I'm out of sync right now in that application that I demoed. Let me do this.
C
So, you know, I made some manual edits to that Rollout through this UI, and it now tells me that I'm out of sync with what's in Git, and you can get this diff here; let me show the compact version. You can get a diff about what has changed here versus what is in the Git repo backing this application. And somebody had asked about notifications earlier.
C
We specifically use the Argo CD Notifications project, which is in the Argoproj Labs org, and that will send out notifications, for example into Slack, when an application goes out of sync for any reason, which is really useful.
D
Yeah, that one asks: in your GitOps workflow, how do you manage credentials, like those required by Argo to access Git repositories? Are they stored in plain text or base64 in the repository itself? We're using Kubernetes Secrets to store all kinds of credentials, and then with Argo CD or Argo Workflows it's really easy to get them. It's just like when you define a pod that you want to use a secret; I think it's exactly the same definition.
B
Does Argo have LDAP and AD integration? Yeah, this was on one of my slides. It actually has SSO integration and very granular RBAC controls that will let you have multi-tenancy and things like that, and you could even deploy different Argo instances to the same Kubernetes cluster if you really needed to, in their own namespaces, and have some granular control of what teams can access and stuff like that.
C
We looked at some commercial providers we previously had, and so we looked at Harness; we looked at Tekton; we looked at a number of other pipelining tools in the same space as Argo Workflows, like GoCD, and that list is a little long. We previously had experience with Spinnaker, so we had done an evaluation of that.
It
all
kind
of
boiled
down
to
this
collection
of
tools,
each
solving
specific
needs
that
we
had
very
well
in
their
own
like
pinpointed
way
like
so
argo
cd,
you
know,
did
the
get
ops
and
the
syncing
and
didn't
try
to
do
anything
else.
Didn't
try
to
be
like
an
all-in-one
tool,
but
our
go
workflows
did
the
pipelining
and
it
did
that
really
well,
but
it
didn't.
C
You
know,
try
to
be
everything
and
we
found
that
flexibility
to
meet
all
of
our
needs
and
together,
you
know
like
putting
these
three
things
together
drew
a
really
nice
overall.
C
I can answer one of these: how do you set up and configure Argo CD? Do you use cluster bootstrapping features, app of apps? So yeah, for the team that manages the Argo installation, all of this stuff is managed through a Git repo that is synced through Argo CD. So there's one bootstrapping step to get Argo CD up and running the first time, and then it's just an automatic sync of all of the Argo components after that, specifically using the app-of-apps pattern.
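A root "app of apps" Application for that bootstrap might look like this sketch (the repo URL and paths are placeholders):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argo-components          # root app; its path contains child Application manifests
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/argo-config.git
    targetRevision: HEAD
    path: apps/
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true                # delete resources removed from Git
      selfHeal: true             # revert manual drift on the next sync
```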
B
So here's a question: do you think Argo could be utilized for setting up base apps of the Kubernetes clusters, like a PaaS layer of the applications that we use in our infrastructure: logging (Filebeat plus Logstash), ingress controllers, MetalLB for bare-metal clusters, storage class definitions for cloud providers, etc.? Or do you think Argo is good primarily for application state management?
B
I think it's both, really. This is where the GitOps workflow really comes in: the source of truth is really in Git for all this, and Argo is just the intermediary applying those changes, and that really lets us have confidence about what the changes were and who made the changes.
B
Somebody reviewed the changes; you can have different environments for testing those out; and then Argo is the delivery mechanism, but it's also doing stuff like Caleb showed with Rollouts, which is giving us the metric-based analysis for canaries and other things, and then with Workflows orchestrating it all together. So we can kind of do anything we want with it.
In
some
regard,
I
think
one
of
the
important
things
is
like
we
really
push
that
all
the
things
we're
we're
deploying
are
idepotent.
So
that
way
we
could
rerun
the
steps
over
and
over
again.
If
we
need
to
right,
like
it's
safe
to
just
say,
hey
they
run
and
they
will
not
make
any
changes
if
they
don't
need
to.
A
All right, I think that's all we have time for. Thanks, everyone, for joining us. I'll remind you that the slides and recording will be up later today on the CNCF website, and thanks again for joining us. Thank you all for your presentation and all your Q&A, and we will see you at a CNCF webinar very soon.