Description
Don't miss out! Join us at our upcoming event: KubeCon + CloudNativeCon Europe in Amsterdam, The Netherlands from 18 - 21 April, 2023. Learn more at https://kubecon.io The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects.
A
Cloud Custodian is a YAML DSL policy engine for the cloud. It scales up from the startup level of just having a few resources, and it is used in production at massive enterprise scale by many large organizations. The intent is to drive behavioral change and tighter feedback loops for your developers. But what does that actually mean? I'll give you my plain explanation.
A
We can go back one. My plain explanation is that you write your policies in YAML, and Cloud Custodian is a rules engine that runs in the cloud's control plane and ensures that the policies you're writing get enforced on your cloud. A typical example we like to use is making sure that we're not opening a database and leaving it on the internet.
A
So you have rules that manage all of your resources, and typically you check these into Git or some other kind of version control. Cloud Custodian then ensures that those policies are enforced on your cloud. If you have a policy saying that you're not supposed to serve open databases on the internet, then we make sure of that.
A
In a lot of ways, the analogy we like to use is a seat belt for your cloud resources: if you accidentally do a thing manually, you have automation in place to keep you safe. Next, and more recently, cost has become more important for people over the past 18 months, and using these policies is also a great way to ensure that you're managing cost in your cloud. So people are using Cloud Custodian not just to ensure that they're meeting compliance needs.
A
But do you also have a bunch of unused EBS snapshots somewhere, or resources that might not be tied to the specific account you were expecting, or things like that? By defining all of these rules, you can manage your entire cloud deployment more smartly, and Cloud Custodian can garbage collect the things that aren't supposed to be there. That's where the analogy of the custodian comes from.
A
You define which resources are supposed to be in a certain place and what their limits are, and Custodian enforces that for you. This is especially useful on the cost side, because resources that you are not tracking tend to pile up. For a lot of organizations, that garbage collection ends up being a significant cost saving, by ensuring that what they think is running in their cloud is what is actually running.
A
And, of course, compliance. One of the great things about this tool is that you can capture your compliance rules as code, and by version controlling them and using them in CI/CD, Custodian enables a GitOps workflow that lets you manage all of that in a tight feedback loop, because it does real-time compliance checking of these rules.
A
So if today I were to try to deploy something into one of our resources and it violated a policy, Custodian, if you set it up that way, can remediate immediately and notify me: hey, there was a resource that I asked for that isn't getting made, for these reasons. What we are trying to do, as we alluded to earlier, is drive that behavioral feedback of: okay, so where do I fix this?
A
That costs resources, developer time especially. So the mantra behind a tool like this is to enable that tight feedback loop, driven entirely by the existing automation that you have. That's what we're going to talk about here today, specifically around Kubernetes clusters and around your Terraform.
B
Yeah, so what does all this look like exactly? You start out with a policy. The first thing you do is specify a name for your policy, and you also have to select a resource. In this case, we're looking at S3 buckets in AWS.
B
Then you can define any number of filters that you want to apply to those resources. In this case, we're saying we want to find any buckets that have HeadBucket and GetObject actions that allow the account listed here to access them. Then you can specify what actions you want to run. In this case, we're saying we want to notify the resource owner and also send a Slack message using a certain policy template.
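The shape of the policy described here (a name, a resource type, filters, then actions) can be sketched in a few lines of Python. This is a toy model for illustration only, not Cloud Custodian's implementation: the `Policy` class, the bucket records, the `allowed_accounts` field, and the `notify` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

Resource = dict  # a resource is modeled as a plain dictionary


@dataclass
class Policy:
    name: str
    resource: str  # resource type the policy targets, e.g. "s3"
    filters: list[Callable[[Resource], bool]] = field(default_factory=list)
    actions: list[Callable[[Resource], None]] = field(default_factory=list)

    def run(self, resources: list[Resource]) -> list[Resource]:
        # Keep only resources that satisfy every filter, then act on each one.
        matched = [r for r in resources if all(f(r) for f in self.filters)]
        for r in matched:
            for action in self.actions:
                action(r)
        return matched


notifications: list[str] = []


def grants_access_to(account_id: str) -> Callable[[Resource], bool]:
    # Hypothetical filter: does the bucket grant access to this account?
    return lambda r: account_id in r.get("allowed_accounts", [])


def notify(r: Resource) -> None:
    # Hypothetical action: record who would be notified.
    notifications.append(f"notify {r['owner']} about {r['name']}")


buckets = [
    {"name": "logs", "owner": "team-a", "allowed_accounts": ["111111111111"]},
    {"name": "public-site", "owner": "team-b", "allowed_accounts": ["999999999999"]},
]

policy = Policy(
    name="s3-cross-account-access",
    resource="s3",
    filters=[grants_access_to("999999999999")],
    actions=[notify],
)
matched = policy.run(buckets)
print([r["name"] for r in matched])
print(notifications)
```

The point of the sketch is the ordering: filters narrow the resource list first, and actions run only on what is left, which is why the notification goes straight to the owner of the matching bucket.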
B
That way, you can send these notifications directly to the people who are violating your policies, instead of having to keep a list, track people down, and pass around a CSV or something to your engineering teams.
B
Finally, to do all this, you just run the custodian run command, where you pass in the name of the file and give it an output location, and then you'll start to see your policies running. Here's another example policy. In this case, we're looking for IAM roles that are over-provisioned. You can see that we also support nots, ands, and ors, so any sort of Boolean expression that you want to have.
B
We're saying: ignore any roles that are named IAM provisioner, and check the permissions to find any roles that have the iam:ChangePassword action inside the role itself. Again, we want to notify on that. In this case, instead of sending it to the resource owner, we're sending it to the security email distro and copying the cloud team as well.
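The not/and/or composition mentioned for this policy can be illustrated with a small recursive evaluator. This is a sketch under stated assumptions, not Cloud Custodian's filter engine: the nested-dictionary expression format, the role records, and the role name `iam-provisioner` are hypothetical, and here `not` is taken to negate the conjunction of its operands.

```python
from typing import Any


def evaluate(expr: Any, resource: dict) -> bool:
    """Evaluate a nested not/and/or expression against one resource."""
    if callable(expr):
        # Leaf node: a plain predicate on the resource.
        return bool(expr(resource))
    (op, operands), = expr.items()  # each inner node is a single-key dict
    if op == "and":
        return all(evaluate(e, resource) for e in operands)
    if op == "or":
        return any(evaluate(e, resource) for e in operands)
    if op == "not":
        # Assumption of this sketch: "not" negates "all operands match".
        return not all(evaluate(e, resource) for e in operands)
    raise ValueError(f"unknown operator: {op}")


# Ignore the provisioner role, flag any other role that can change passwords.
expr = {
    "and": [
        {"not": [lambda r: r["name"] == "iam-provisioner"]},
        lambda r: "iam:ChangePassword" in r["actions"],
    ]
}

roles = [
    {"name": "iam-provisioner", "actions": ["iam:ChangePassword"]},
    {"name": "build-bot", "actions": ["iam:ChangePassword", "s3:GetObject"]},
    {"name": "reader", "actions": ["s3:GetObject"]},
]

flagged = [r["name"] for r in roles if evaluate(expr, r)]
print(flagged)
```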
B
Finally, Custodian policies can be run in two different types of modes. There's a pull mode, where you are querying the cloud itself, or the cluster, directly. In this case, every single time you want to check those over-provisioned IAM roles, you're checking everything that's currently out in the cloud. There are also event-based modes, which utilize things like CloudWatch event triggers, CloudTrail, and Config on the AWS side, and we have equivalents for those in Azure and GCP.
B
These modes allow you to trigger off of events that happen in your cloud, as well as in your cluster. That way you can be much more reactive, and do things like remove any non-compliant resources that are net new, instead of having to wait for the resource to exist in the cloud for a while and then take some sort of action, because that can lead to situations where you potentially take down live running services, for example.
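The difference between the two modes can be sketched as two entry points around the same check. This is only an illustration of the idea, assuming a hypothetical in-memory inventory and a hypothetical event shape; it does not use any cloud provider or Cloud Custodian API.

```python
from typing import Callable, Iterable

Resource = dict
Check = Callable[[Resource], bool]


def pull_scan(inventory: Iterable[Resource], violates: Check) -> list[Resource]:
    # Pull mode: list everything that exists right now and filter it.
    return [r for r in inventory if violates(r)]


def handle_event(event: dict, violates: Check) -> str:
    # Event mode: look only at the resource named in the incoming event,
    # so a brand-new resource can be handled before anything depends on it.
    resource = event["resource"]
    return "remediate" if violates(resource) else "ignore"


def is_public_database(r: Resource) -> bool:
    return r.get("type") == "database" and r.get("public", False)


inventory = [
    {"id": "db-1", "type": "database", "public": False},
    {"id": "db-2", "type": "database", "public": True},
]
print([r["id"] for r in pull_scan(inventory, is_public_database)])

# Hypothetical event emitted when a new database is created.
new_db_event = {
    "name": "CreateDatabase",
    "resource": {"id": "db-3", "type": "database", "public": True},
}
print(handle_event(new_db_event, is_public_database))
```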
B
Cloud Custodian on Kubernetes has support for those two modes. The first is pull mode, so you can query your cluster with the same policy language as your cloud. Basically, this means that if you're familiar with running Custodian policies for AWS, Azure, or GCP, you'll feel right at home. In addition, there's a k8s admission mode, where you can run Custodian policies in an admission controller mode to allow, deny, or warn on any sort of object lifecycle event.
B
It's easiest to deploy in your cluster with a Helm chart, and you can also do things like auto-label objects as they come into the cluster to determine resource ownership, for example. Finally, we have Terraform support as well, so you're not limited to governing the infrastructure that's already out there.
B
You can also use Custodian to govern your infrastructure as code. This lets your developers know ahead of time that the things they're deploying are not going to be compliant, or are not going to be in line with the guardrails that you've set. This way they can make those changes early on and not have to deal with the headache of going back and potentially having to do things like stop a database, schedule downtime, and recreate it.
B
So we'll go off and do a quick demo. The first thing that we'll start with is a Kubernetes pull mode example. On the left of my screen, I'm just running the Kubernetes admission controller, which we'll get to in a second, but first let's run the policy for Kubernetes. Like I said, this is pull mode, so this is pulling directly from my cluster. Let's take a look at the resources JSON that comes back.
B
This is basically all of the information that you would expect to see if you ran kubectl describe pod and got every single pod, and it's a great way to see the attributes that you can filter on, for example. Now let's take a look at the event-based modes. The first thing we'll do is take a look at the policies that we have here.
B
The policies here are just in a ConfigMap that we've deployed to our Kubernetes cluster, and you can see we have a few. The first one here denies pod exec based on the pod. We have another policy checking for missing recommended labels, another one restricting service account usage on pods, and one last one requiring at least three replicas on any Kubernetes deployment that we have. The first thing we'll do is try to create a pod.
B
Let's take a look at our pod manifest right here. We've got our pod manifest, and if we try to deploy it, you can see we get a warning about missing recommended labels: all pods must have foo and bar labels. You can see that in our manifest here we only have the foo one.
B
If we take a look at the pod that we created, not only do you see that we only have this foo=bar label, but we actually used the policy itself to append the owner contact label here. We can see that the Kubernetes admin was the one that created the resource, and we also have this additional message that we appended as a label, saying it's missing labels. So if we delete our pod there...
B
This is a great way to ease developers into doing the right thing before you put a hard restriction in place. For the next thing, actually, let's keep that pod up.
B
If we run kubectl exec, we see that we actually get an error saying that it failed due to these policies, which say you can't connect to any pods with database in the name or in the c7n-system namespace. This is really great for giving you more fine-grained control over the actions that developers can take against resources. The next thing we'll do is check what happens if we try to create a pod with a more restricted service account.
B
The first thing we'll do is create this service account here, called cluster-admin, and then try to apply the pod with that service account. Actually, let's take a look at what that looks like first. The main thing here is that we're using the service account called cluster-admin, which I'm sure you can assume has all sorts of permissions that you don't want everybody to use.
B
If we try to apply that pod with the service account, you can see that again we get a restriction, saying you can't use that service account. Finally, we had that policy restricting deployments, saying you have to have at least three replicas on your deployment. If we take a look at our deployment demo, we see that this one has three, so it should work just fine. Let's go ahead and drop that down to two.
B
When we run kubectl apply with deployment.yaml, you can see that it failed admission due to the policy requiring at least three replicas. So let's go back in and change that two into a three.
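The allow, warn, and deny outcomes seen in this part of the demo can be modeled as a small decision function. This is a minimal sketch of the idea, not the Custodian admission controller: the `Rule` class, the rule names, and the object dictionaries are hypothetical stand-ins for the policies and manifests shown on screen.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    kind: str                         # Kubernetes kind the rule applies to
    violates: Callable[[dict], bool]  # True when the object breaks the rule
    on_match: str                     # "deny" or "warn"


def admit(obj: dict, rules: list[Rule]) -> tuple[str, list[str]]:
    """Return ("allow" | "warn" | "deny", names of the rules that matched)."""
    matched = [r for r in rules if r.kind == obj["kind"] and r.violates(obj)]
    names = [r.name for r in matched]
    if any(r.on_match == "deny" for r in matched):
        return "deny", names
    if matched:
        return "warn", names
    return "allow", names


rules = [
    Rule("require-at-least-3-replicas", "Deployment",
         lambda o: o["spec"].get("replicas", 1) < 3, "deny"),
    Rule("missing-recommended-labels", "Pod",
         lambda o: not {"foo", "bar"} <= set(o["metadata"].get("labels", {})),
         "warn"),
]

deployment = {"kind": "Deployment", "metadata": {"name": "demo"},
              "spec": {"replicas": 2}}
print(admit(deployment, rules))   # two replicas: denied
deployment["spec"]["replicas"] = 3
print(admit(deployment, rules))   # three replicas: allowed

pod = {"kind": "Pod", "metadata": {"name": "demo", "labels": {"foo": "bar"}},
       "spec": {}}
print(admit(pod, rules))          # only the foo label: admitted with a warning
```

Letting a single deny outrank any number of warnings mirrors the approach described in the talk: warn first to ease developers in, then switch a rule to deny once a hard restriction is wanted.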
B
You can see that our deployment was able to go through just fine. Again, all the stuff you see on here is basically what you would see in the logs for your deployment when you deploy this on your cluster. It'll match against only the events that you actually care about from your policies.
B
Yeah, so if we go to c7n-left run --help: basically, you pass in a policy directory, which holds your Custodian policies, as well as a directory for your actual Terraform itself. If we take a look at the policies, you can see we have one that says all resources should be tagged, and specifically that they need to have this environment tag, and then we have one saying that all SQS queues must be encrypted.
B
If we run c7n-left run and give it our policies directory as well as our current Terraform directory, we can see that we failed two of these policies: the first one saying that SQS must be encrypted, and the second one saying all resources should be tagged. If you look at our main.tf, we can note that for this first one we have an SQS queue that we just have here.
B
It's not in a module, it's just directly in the main Terraform. If we add our tags here, like so, that should fix the first one. Then you can see that in the second one we're actually using a remote module. So rather than only being able to test the Terraform that you have directly inside your local Terraform workspace, it can actually look up the module references as well.
B
Here the problem was that we had managed SSE enabled set to false. If we set that to true, that should fix it, and if we run c7n-left again, you can see that we have passed all of our policy checks. You can also look at the summary based on the resources. In this case we have some IAM documents, and those pass as well as our SQS queues.
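The two checks from this Terraform demo (every resource tagged, SQS queues encrypted) can be sketched over a simplified, already-parsed view of the resources. This is an illustration only and does not reproduce how c7n-left parses Terraform: the resource dictionaries, the queue names, the policy names, and the `Environment` tag key are assumptions, while `sqs_managed_sse_enabled` and `kms_master_key_id` are arguments of the Terraform AWS provider's `aws_sqs_queue` resource.

```python
def untagged(resource: dict, required_tag: str = "Environment") -> bool:
    # Violation when the required tag key is missing.
    return required_tag not in resource.get("tags", {})


def sqs_unencrypted(resource: dict) -> bool:
    # Violation when an SQS queue has neither SQS-managed SSE nor a KMS key.
    if resource["type"] != "aws_sqs_queue":
        return False
    return not (resource.get("sqs_managed_sse_enabled")
                or resource.get("kms_master_key_id"))


CHECKS = {"all-resources-tagged": untagged, "sqs-encrypted": sqs_unencrypted}


def run_checks(resources: list[dict]) -> list[tuple[str, str]]:
    """Return (policy name, resource name) for every failed check."""
    return [(policy, r["name"])
            for r in resources
            for policy, violates in CHECKS.items()
            if violates(r)]


# Hypothetical stand-ins for the queue in main.tf and the one from the module.
resources = [
    {"type": "aws_sqs_queue", "name": "direct_queue",
     "sqs_managed_sse_enabled": True, "tags": {}},
    {"type": "aws_sqs_queue", "name": "module_queue",
     "sqs_managed_sse_enabled": False, "tags": {"Environment": "dev"}},
]

failures_before = run_checks(resources)
print(failures_before)

# Apply the same two fixes as in the demo, then re-run the checks.
resources[0]["tags"]["Environment"] = "dev"
resources[1]["sqs_managed_sse_enabled"] = True
failures_after = run_checks(resources)
print(failures_after)
```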
A
And that's basically a tour of Custodian on a cluster and on infrastructure as code. If you're interested in this, you'll find us at KubeCon + CloudNativeCon Europe in Amsterdam, coming up. We don't have any information now, but we're hoping to also have a maintainer session, if you're interested in contributing and checking out all the cool stuff that an open source project has to offer. And with that, Sunny, thank you very much, and thanks everyone for listening. Feel free to join us at cloudcustodian.io. Thank you.