Description
Multi-cluster management is hard. Technology, teams and culture clash in a race to deliver clusters and applications in a secure and compliant way. Red Hat Advanced Cluster Management for Kubernetes (RHACM) provides the capabilities to address common challenges that administrators and site reliability engineers face as they work across a range of public and private cloud environments. Clusters and applications are all visible and managed from a single console—with security policy built in.
B: Good morning, good afternoon, good evening, hello everyone, and welcome to another episode of Red Hat Advanced Cluster Management Presents. It's not a mouthful, it's a product. Today I'm joined by Scott Behrens and the wonderful team from our RHACM program development group. Scott, do you want to introduce everybody?
A: So, Chris, you know I was on your show a few weeks ago and we talked about the end-to-end story. Today I thought I'd bring in some real muscle. We had our VP and our senior architect last time, and I think we're going to go a step better this time. I've got Randy George (I don't know if that's left or right; he's in a Pink Floyd shirt), an architect who leads the observability pillar within RHACM. I've got Joydeep Banerjee, based out there in sunny Southern California; he's the technical lead who focuses on observability, search, analytics, all the goodness in there. And then Chris Doane, who brings our SRE perspective to the table today. Chris is based in Austin with me and Raymond. Chris really helps us dig into the bowels of the system and understand what's breaking, why, and how we fix it.
B: Yeah, so I mean, I have gray hair. If I had a beard, it would be gray, so there.
A: A few weeks ago we had the integration of Ansible and application management, so today we get to look at the observability piece and kind of dissect what we do there. We're bringing a new architecture into play, and I'm going to stop talking there, because I don't want to ruin it for Randy. So take it away, Randy. What are we doing?
D: Thanks, Scott. Yes, I'll give you a little bit of the problems we're solving first, and then I'll take you into the high-level architecture and how we approached it. So, as you know, and you've talked about this: what does RHACM do? You can break it down, simply, into cluster lifecycle management, application lifecycle management, and governance, risk, and compliance, right?
D: Well, if you think about it, it's very hard to do cluster lifecycle management and application lifecycle management if you don't have insight into the health of your clusters and your applications. Obviously you want to meet your corporate SLOs, et cetera. But why would I want to deploy an app to an unhealthy cluster? I'm definitely not going to meet my SLOs that way.
D: So the key thing we want to do is provide that insight, and we're doing that, as Scott mentioned, by adding observability. And again, this is a step along a roadmap. Observability is quite complex, so don't take what we show you today as the end game by any stretch of the imagination; it's the beginning. We have some key focus areas: initially we're going to focus on clusters, and then we'll get to the app layer, and Scott will take us through that roadmap.
D: I don't want to steal his thunder. So, some of the things I've talked about: first was health, and we'll get into some of that. Another thing, and we've heard this from customers quite a bit: as the various development teams work on their projects and deploy them to the clusters, what do they do? They request resources.
D: So they'll come in and find out that, you know, I'm reserving X amount of CPU or X amount of memory and I'm only using a tenth of it, for example. And that's a cost, a real cost: the compute resources, the cost of the OCP they're running on, or whatever runtime they're running on, et cetera. So they want to get their costs under control as well. It's not just making sure everything's healthy and stable, which is very important, but also making sure everything's optimized.
D: So those are a couple of the things, and then what I call the third facet: we detect issues around governance, risk, et cetera. We monitor the health and we can send alerts and notifications. Well, the SRE gets notified; how does he or she go about problem determination?
D: Well, think about the simple pattern: what does an SRE do? They form in their mind a hypothesis of what caused the problem, and then they have to either go prove or disprove it. If they can prove it, they obviously provide a fix or run a script to compensate; if they disprove it, they come up with a new hypothesis. So they need a way to interact with and introspect the data. It's not just dashboards.
D: Dashboards are great for status; they're horrible for problem solving. So that's kind of the third facet, and Joydeep and Chris will demonstrate some of this. We'll show you our capabilities for introspecting the data and discovering problems. The focus in this drop, and they'll demonstrate this later, is collecting metrics from the clusters. Like I said, we also have the ability to collect logs, but the approach there is slightly different. What I'll do is switch over to a picture.
D: A picture's worth a thousand words. That way you don't have to look at me as well.
D: Okay, so let me share this diagram of how we're going about it. Hopefully you can see it. Okay, there you go. So the other thing we had to think about was not just the capabilities but some of the challenges, and the challenges come with scale: the number of managed clusters we're going after, and also the size of those clusters. These aren't servers, they're clusters. So how many nodes, how many pods, how many namespaces, et cetera?
D: Then you also have to get into the concept of a long-term store, because especially when you're looking at things like SLOs or even compliance, if I notice a point where I became out of compliance, you want to look back over time: how many times did that happen in the last month, the last year?
D: So this is a real high-level diagram of our approach, of how we've gone after it. Starting from the bottom up: those are the clusters that are being managed by ACM, and that dotted diagram above is a small subset of the ACM hub.
A: Let me pause it real quick, because that's a key piece. Chris, what we talked about a few weeks ago was that desired-state model we play with in this Kubernetes world. We set the way we want things to be for an application or for the configuration of a cluster, and then we work up to that. And what we're doing in 2.1: this isn't even something extra you're paying for, it's in the box. You're going to get this observability add-on that brings this next layer of cluster health monitoring into your purview automatically.
D: Yeah, and that's a great point, Scott, thanks. In fact, I want to add one thing to what you mentioned about it being there in the box. When you install 2.1, everything for observability is installed; it's just not enabled out of the box. You have to do one apply, create a CR, to enable it. The reason being, there are customers, and this could even be in test environments, that maybe don't want to be collecting all this data.
D: So one step to enable is really all you need, and then automatically, like Scott said, you define the desired state with that enablement and everything will comply with that desired state. If you decide to change any configuration, again everything will get deployed and synchronized with that desired state. Okay, so we have an add-on, and initially, for OpenShift clusters, there's already a built-in Prometheus that scrapes and collects all types of data, and we collect from that Prometheus.
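For reference, the "one apply, create a CR" enablement described here comes down to a single custom resource. A minimal sketch, assuming the `MultiClusterObservability` API shipped around ACM 2.1 (the API version, field names, and the object-storage secret name vary between releases and are assumptions here):

```yaml
apiVersion: observability.open-cluster-management.io/v1beta1
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  storageConfigObject:
    metricObjectStorage:
      # Secret holding the Thanos object-store configuration (S3 bucket, credentials, etc.)
      name: thanos-object-storage
      key: thanos.yaml
```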
D: We can collect from other sources, but the initial drop will scrape from that Prometheus. And we don't gather everything. There are many, many, many time series, and they can be quite expensive down there. We're collecting a subset of that, what we think are the key and optimal metrics, a general-purpose set across all customer needs. If you want to extend that list, you can: we provide a ConfigMap where you can extend the list.
D: To make up a simple number: say we collect 10 metrics and you wanted 12. You can add the other two to the config and it'll start collecting them. So everything is collected and brought forward.
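The ConfigMap mentioned here can be sketched like this, assuming the `observability-metrics-custom-allowlist` name used in the ACM documentation; the two extra metric names stand in for "the other two" and are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names:
      - node_memory_MemTotal_bytes          # example extra metric
      - kube_pod_container_resource_limits  # example extra metric
```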
D: Technology-wise, you know, in Kubernetes management, Prometheus, and when I mention Prometheus I kind of mean the Prometheus ecosystem, the community if you would, is the de facto standard, so we had to comply. We want to make sure we're compatible with things like PromQL, the query language, because a lot of people have skills in that. We want to make sure we support Grafana, because a lot of people use Grafana and have skills there, and we want to take advantage of the alerting capabilities right there.
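As an illustration of those PromQL skills carrying over, a familiar query like the one below works unchanged against the aggregated store (the `cluster` grouping label, added by the collector to distinguish managed clusters, is an assumption about the label naming):

```promql
# CPU usage rate summed per managed cluster
sum(rate(container_cpu_usage_seconds_total[5m])) by (cluster)
```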
D: We want to take advantage of the OpenMetrics format that Prometheus uses, and the whole ecosystem of exposing metrics through an endpoint. So we're doing all that. However, we're adopting the Thanos technology for our storing of metrics over Prometheus. Thanos is built off of the Prometheus engine and is compatible with it; it just uses object storage towards the back end, which provides the ability to do the long-term storage that we require.
A: Let me pause here for a second, take a breath, and think about how we got here. We had a solution which was a federated Prometheus model, right, about a year ago, and we sat down and talked with Christian Heidenreich, who comes from the CoreOS background and has all the understanding of the monitoring space, and about Observatorium, which is being used internally at Red Hat to gather telemetry from thousands of connected clusters.
A: All that runtime knowledge from the past couple of years of Red Hat cloud is now in this project, now in ACM. So we take that goodness, as Randy was just talking about with Thanos: the data-store scalability, the long-term retrieval, all of that baked-in goodness with production runtime code around it, is now part of the ACM story. So instead of a federated Prometheus model that falls over and doesn't handle scale, we have Thanos and the Prometheus architecture.
F: Yeah, I've been playing in this world of containers since before Kubernetes came into being, and one of my first introductions to containers was using Docker and Ansible.
F: The thing about what we are doing here, as Randy mentioned, is that everything out here is well known to the community: Thanos, Prometheus, PromQL, Alertmanager. Everything is very well known, so once you start using it, you can get to the internals of this very quickly and, you know, ingest all the information. That is key. The other piece is this object store where we are storing the long-term time series.
D: Yeah, so, obviously, Randy George: a long, long time in this industry, probably doing management for the last 15 to 20 years, focused on observability.
D: Well, I'm going to go back to when we first started. I was with IBM before we moved to Red Hat, doing autonomics, so it's been probably a good ten-ish years on the observability side, focused on networks and then clusters. And to what Joydeep said about his experience on the Kube side of managing Kube: I mean, we ran Kube in a SaaS production platform when it was, like, the 0.7 release, I think, way back.
D: There was no EKS back then, or anything like it, so we ran just vanilla Kube and managed it ourselves, and kind of learned from that on the way up. It was a good way to learn, because there weren't a lot of capabilities back then; a lot of brute force was needed to do it. So: a lot of experience managing Kube from that perspective, and a lot of experience in observability and management in general.
C: Right, I mean, I've been on many calls with four or five Chrises, so I don't know.
C: Yes, sir. Hi, my name is Chris Doane and I'm representing the SRE perspective on this call. I've been in the industry for a long time, and in my current role I actually wear multiple hats. Right after this call I'm probably going to jump off and do some DevOps CI/CD work with our Jenkins cluster. But yeah, I use ACM, our product, internally, and I try to evangelize its usage throughout our community.
D: Yeah, and that's actually a really key point: folks like Chris use the product, and if we can't derive value out of it, how can we expect our customers to? So we actually eat our own dog food, or as some people like to say, drink our own champagne, however you want to look at it. That's a huge benefit: using it ourselves gives us real-time, early feedback and helps iterate and mature the product.
D: The only other thing I wanted to call out here: in addition to the object store where metrics and alerts are stored for long-term trends and pattern recognition, and again, Joydeep and Chris will demonstrate some of this, we also collect information about all the resources and their relationships on these managed clusters, as well as some of the key attributes and near-term metrics, the "now" metrics, not the trends that sit under Thanos. We store that in a Redis-based search database.
D: That has a graph DB plugged in, and this becomes very powerful for querying and introspecting the data, especially when you want to understand the relationships. These clusters, as you know, have a lot of relationships that are very dependent on each other, whether it's the flow of the app or where things are deployed and their relationships with other things, and those can be causing issues.
D: So we have that other database there to provide that sort of capability. Okay, another key pillar of observability is not just metrics, alerting, and eventing: logs are critical too, especially for an SRE. How am I going to solve problems if I don't understand what's going on in these logs? As you can tell from the diagram, a lot of people do log collection and log storage; there's a lot of value you can get from that, applying analytics and so on, but it's also very, very expensive: network bandwidth, storage, et cetera.
D: Our current approach, and down the road we may look into centrally storing logs, especially for something like edge, where that may be needed, is more of an on-demand approach. So if you're looking into solving an issue on cluster A, if you would, we can go out and grab the logs in near real time, pull them over to the hub server, and utilize them.
A: We talked about that a few weeks ago. Chris, you remember, with Michael and Dave and Jeff: how we ran into this problem in our own development. In creating our own Kubernetes platform, we said, well, now we've got a bunch of these clusters; how do we introspect them, and how do we dynamically gather information about them?
A: That's where RedisGraph and the search collector started to develop. That kind of became the core central theme of multi-cluster management, really trying to solve that problem internally, and that became the start of the next two years of the project for us.
F: So, since we were talking about search and viewing logs, why don't I jump over and, you know, show you.
F: So, on the screen: we didn't show our launch page. When you go into ACM, I didn't show how that page looks; you'll land on this home page, right, but what I tend to use most is our search page. Almost everything that you can see, you can get centrally from here, and it's very ad hoc in nature.
F: You know, you could click on a pod. So this is in a cluster which is named oregon2; you can click on the pod, and not only can you see the YAML, you can look at the logs. You can look at the logs to see what they are doing: response time, 200 OK, right? This is cool. Now, the real issue that we had, and Chris is aware of it; he and I were in the same boat during development.
F: We had, what do you call it, a blip, let's put it that way. It never happens normally, but, you know, we had to restart the pods. So what do you do? Go here and just delete the pod; the pods are restarted. That's huge. That was huge. And in real-life situations, as you can imagine, developers do not always have access to log in to all of the managed clusters.
D: Before you go there: just on that use case, the way we have this set up is we have our Alertmanager configured to talk to a Slack channel. So if we detect a problem, we notify Slack. So he didn't just come in here cold and say, let me look for a pod with this name; we actually detect a problem by collecting the data, analyzing the data, and doing notifications, and then he came in here.
F: Right, right. And talking about detecting problems: yes, we have configured alerting rules, the same way we do it in good old Prometheus. So let's take a quick look at an example of an alerting rule, which you can obviously go and change.
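A rule of that shape, in standard Prometheus alerting-rule syntax; the expression, metric names, and threshold below are illustrative, not the exact rule shown in the demo:

```yaml
groups:
  - name: cluster-health
    rules:
      - alert: ClusterMemoryHighUsage
        # Fires when a managed cluster's memory usage stays above 90% for 5 minutes
        expr: |
          1 - sum(node_memory_MemAvailable_bytes) by (cluster)
            / sum(node_memory_MemTotal_bytes) by (cluster) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on cluster {{ $labels.cluster }}"
```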
F: If you're working with Prometheus, it's pretty well known: you go and configure the Alertmanager to send to Slack and to a pager. In this example, I think in this cluster I've commented out the pager, but anyway, you're sending it to Slack, and that's how it launches. As he said, you're getting the relevant details in Slack, and you can jump back to ACM by clicking here. So, as Randy said, you're not coming in cold. But you know where I was going with this.
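The Alertmanager wiring described here follows the standard `alertmanager.yml` format; the webhook URL and channel name below are placeholders, not values from the demo:

```yaml
route:
  receiver: slack-notifications
receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: https://hooks.slack.com/services/CHANGE-ME  # Slack incoming webhook
        channel: '#acm-alerts'
        send_resolved: true
  # A pager receiver (e.g. PagerDuty) would sit alongside this one;
  # in the demo cluster it was commented out.
```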
F: Where I was going with this search thing is: given a problem, let's assume Scott and Randy are two brilliant engineers who have been given a problem to solve. They have some history, some background, so they might approach the problem differently. A classic case, which I think you folks can relate to very well:
F: some people, when they are told there's a problem, would like to log in to a container and look at the logs to see what is going on, whereas other folks might want to come in from the inside out. They might want to see, hey, what all has been created in the last hour, and let me filter down from there. What changed? And I want to be a little careful, Randy: as you know, we don't fully do that yet; it's on our roadmap. We are not exactly capturing
F: what has changed. We are capturing, right now, what has been created; there could be things that have changed which are not newly created. We are trying to get there. So there can be different ways of looking at this problem, and we capture all of it, no matter which way you look at it. And talking about eating our own dog food:
F: if you want to explore how multi-cluster observability itself works, Randy mentioned in the beginning that we have a CR. So if you go to kind: MultiClusterObservability and click on the CR, boom, this shows you the relationships around the CR. So imagine, guys:
F: something is pushed on you; you have to take ownership; you have to discover what's in there. This is telling you about the CR. Typically, two things that I would personally look at: first of all, what's the route, who is accessing it, what is this serving? So there's a related route, and it's the Observatorium API. In fact, this is the route through which everybody is looking at the metric data. And then: what are the related services?
F: This is not only a day-one tool; it's a day-zero tool, however you look at it. And the simple fact that you're sitting in one place and you have visibility across all the clusters, that's really awesome.
A: It's her job to keep it up and running without any concept of what's in there, what's baked in, what these routes and services are. And to Joydeep's point, you have the capability to learn, to understand, to use this tool to dissect it and figure out where things are, how they've been deployed and architected, so that the SRE, who typically just gets handed something and has no clue where to go next, now has a starting point.
C: Yeah, let me take the screen and do a little segment on how ACM is able to centralize the management of your fleet of clusters. I really found that in this latest release we were working on, where internally we have set up this hub and we try to manage a number of clusters, up to 50, and we deploy different levels of ACM into it. And like Joydeep was pointing out, sometimes there are bugs in the code, and I was able to use ACM itself, on itself, to help debug some of those issues.
C: So first of all, as an SRE, I also focus on the command-line interface. Another cool thing within our platform is that we provide this visual web terminal.
C: I normally go and look at all the managed clusters that are available in my hub. So I run `oc get managedclusters`, and these are the available clusters that have registered into my hub, and I can start drilling down into them.
C: So specifically, let's take a look at Singapore. I can do a search: kind pod, cluster name singapore. We'll scroll down to Singapore here, and then into a particular namespace. We were debugging a problem where some of the agent pods were not starting up or deploying appropriately.
C: And the cool thing is that even though I'm running commands from the command-line interface, this is like a pseudo command line with UI widget components. So once I get this list of all the pods in this particular namespace on this particular cluster, I can still leverage some of the widget context and do things like filtering.
C: I can also look at a particular pod by clicking on it and inspecting the logs, the same way that we're doing it from the UI, but this is like a combination of UI and command line.
C: Yeah, it generates the command for you; you don't have to type it and then press tab to complete it. That's a lot of efficiency there.
A: The cool thing is that that's a big step. Again, sorry, Chris, but you know, we're talking about an app SRE who gets handed a bunch of stuff and has to keep it running. Here's another tool in your pocket that you come back to every day: this visual interface, this visual web terminal, that allows me to interact with an environment that I don't know a ton about, but I'm learning on the fly.
A: They spent six months building it and I get six minutes to understand it, and now it's my job to keep it running. So I can use this to introspect, make changes, pull logs, get summaries and events. This is all based on an open-source project called KUI, and it's an Electron-based implementation, so it feels very similar to Slack and things like that. I'll drop a tag in the chat.
A: I bring it to the point and say: this open-source code, and these opportunities we have to flex on this multi-cluster problem, is really cool stuff. It's a challenge that we looked to solve from the outside in, bringing in some neat technology to solve these issues, but also to give Chris the tools to do his job. So boom, here he is, able to click on things and inspect and move through it in a way that is simply not possible with just a plain CLI.
C: So I think that's really one of the cool things that may not be highlighted, but I really gained that insight after using the platform for a couple of releases. In this last one, everything came together: the performance came together, the features came together, and I was able to experience a problem, locate the source of the issue, fix it, and then we were able to continue on with our work.
C: But if you have to have a session into the remote target, you can. Here I show a terminal where I actually log in to one of the remote clusters, and we can maintain a session, or a context, to this remote cluster, Singapore. I can run my CLI commands, like `oc cluster-info`, in the context of this remote cluster, and it is the context of Singapore.
C: `oc get pods` gets the pods in the current namespace, and I can do the same thing there. And the cool thing is that the widgets respond; the visual web terminal responds. We still have a widget; we can click on these items and then inspect, like we would do through search, and I can access a lot of the data without having to type a lot of commands.
D: To me, what you just did right there is so powerful: you went right to the logs in the context of a specific pod. You don't have to switch over to some other log-collection mechanism and search for the logs; you're just doing it right in context, and in near real time you have the logs right there.
F: We can do that because, at the end of the day, RHACM is just another workload running on OpenShift, one which uses some sophisticated features of Kubernetes, but that's what it is. And I guess the other important thing, Chris, that we were talking about yesterday, is the events. For example, if your pod is stuck in a state where logs are not yet generated: I think in your cluster, Chris, you do have some containers which are in ContainerCreating mode.
C: That's right. So in the event that we're not able to pull back the relevant data, then of course you can log in to the cluster directly and run the commands to debug the issue, and you can do that from this terminal. We maintain the Kubernetes context for you while you're in this session. I've used that as well before.
C: Right, sorry, that's right. Right here you can `oc rsh` into the remote pod and do any debugging that you need to do.
C: Another thing that was cool, that I recently experienced in KUI, was `oc get nodes`. Sometimes, when you're deploying different products or platforms or applications on an OCP cluster, you might have to actually log in to one of the nodes. So I was able to show the other day that you can actually run `oc debug node`.
C: Slash... I don't know, maybe this is not going to work. Let's see. There you go, which is kind of cool.
C: You can actually log in to the node and start to debug at an even lower level than the application layer. So I thought it was kind of cool that our visual web terminal supported that as well.
F: Okay, and you can't, you know, go and do a bunch of stuff in other clusters if you're not authorized to.
F: Could you just show folks where you click to get onto this wonderful visual web terminal?
C: Sure. If I go to the menu bar at the top, there are two options: you can open the visual web terminal in the current session, or open it as a separate tab.
C: Okay, so yeah. When I was doing SRE for our internal cluster, I used the visual web terminal a lot, because I was mainly focused on doing things through the CLI. But as Joydeep demoed a while ago, the search capability is also available through this component as well.
D: That is the power of the observability we're adding: we're collecting all this data that we talked about; we're storing it in an optimal way to allow you to search and understand relationships; and we're giving you the tools necessary to quickly introspect the data, whether through the search capabilities or KUI, the visual web terminal, which makes your CLI interface more powerful and gives access to logs, events, et cetera in real time, so you can quickly get to the roots of problems, like Chris was saying we do in real time and in development.
F: It's no fun being woken up in the middle of the night, I tell you. Once you've been woken up on two consecutive nights at 2 a.m., then you know all about the best principles and best practices, what you need to do to make your life easier. And I think we are hitting a few of the sweet spots.
D: Yeah, but I think you're spot on, JD. Like I said earlier, this is the initial entry into this space, and we went after some of the major items that are required; you have to. So we still have a ways to go, but you can be very productive, and Chris demoed this as well, with this initial set of capabilities.
C: Here's an example of a simple hello-world application. You can click on this search link, and it will fill in the search parameters for those particular fields for that application, and the same thing applies elsewhere.
C: We were deploying a number of applications in our internal environment, and having this feature, where I could select an application that was under focus and quickly have a link to search for its related components, allowed me as an SRE to focus on the key parts of that application, in terms of whether it was going to have any issues or not. So I think that convenience is quite powerful.
C: Yep. And one of the key indicators for the reliability or availability of your application, always the first thing people ask, is: is the pod deployed, and is it running? So with a couple of clicks I can quickly see whether my pod is running across the expected set of clusters or not. I thought having that reliability in our platform, being able to show this data, was one of the key things this release.
F: Well, and Chris, while you're here: you can use the same search, for example, to see which policies are being violated, back to the pillar point that Randy and Scott were making earlier. Three pillars: the third pillar is security. You can see which policies are being violated across the fleet of clusters here.
F: Yeah, and back when Randy was making the point: the two things that we are stressing inside the metrics-collection world are the health of the cluster and the optimization and capacity features. The health of the cluster, the health of the control plane, as we all know, is reliant on the API server; the API service is front and center.
F
This page is organized around the golden signals, right, the SRE golden signals. So we are showing the latency, the 99th-percentile latency, and we have a threshold; we have put it at one second. And then we are showing the request rate and the error rates, right, so the non-200 error rate, etc., etc.
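As a rough sketch of that latency signal (not the product's actual query or dashboard code), the 99th-percentile-with-threshold check amounts to something like this; the sample latencies are invented:

```python
def p99(samples):
    """Nearest-rank 99th percentile of a list of latencies (seconds)."""
    ordered = sorted(samples)
    rank = max(0, int(round(0.99 * len(ordered))) - 1)
    return ordered[rank]

THRESHOLD_S = 1.0  # the one-second threshold mentioned above

# Invented API-server request latencies, in seconds.
latencies = [0.05] * 96 + [0.4, 0.9, 1.2, 1.8]

worst = p99(latencies)
print(f"p99={worst:.2f}s, breach={worst > THRESHOLD_S}")  # p99=1.20s, breach=True
```

In the real dashboard this percentile comes out of the metrics pipeline; the point is that a single slow outlier (the 1.8 s request) does not trip the signal, but a sustained tail above one second does.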
F
Scroll down and you look at the queue depths and the queue rates for saturation. So.
F
Right, yeah, and then you have the optimization piece, back to what Randy was talking about. In real life we know this happens constantly, and this is.
D
With that, we've had customers talk to us about this and wanting this, right? So it's not just something, like you said, that we knew; it's been validated. You know, there are real customers coming and asking for these quick insights. And the way this is organized is, when you're running many clusters,
D
these dashboards give you a quick little kind of view of the status, but also of which ones I want to drill into, right. So it's not an "i equals one to n, go look, look, look"; instead, let me come up here and see which ones I want to attack. You can see: oh, maybe that's a dev cluster, I don't need to look at it, but this one is one I care about and it's red, let me go drill down, right.
F
Exactly, exactly. And so, you know, here you're seeing that I am utilizing only 16 percent, but because of my requests I've claimed 51 percent. And, you know, trust me, there have been cases where you might be utilizing 16 percent and your requests are adding up to 99 percent or something like that.
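That gap between utilization and requests can be flagged with simple arithmetic once both numbers are in hand. A hedged sketch with made-up figures mirroring the 16-percent-versus-51-percent example (the function name and the 25-point threshold are ours for illustration, not the product's):

```python
def reservation_gap(used_cores, requested_cores, capacity_cores):
    """Return (utilization %, requested %) of cluster CPU capacity."""
    return (100 * used_cores / capacity_cores,
            100 * requested_cores / capacity_cores)

# Invented numbers: ~5 of 32 cores actually busy, ~16 of 32 requested.
used_pct, requested_pct = reservation_gap(5.1, 16.3, 32.0)
if requested_pct - used_pct > 25:
    print(f"over-reserved: using {used_pct:.0f}%, requesting {requested_pct:.0f}%")
    # prints: over-reserved: using 16%, requesting 51%
```

As the discussion below notes, a large gap is not automatically a problem; the point of surfacing the two numbers side by side is to let you decide.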
F
Back to the point, Randy, you were making earlier: I think you cannot schedule an app on that cluster anymore, right? Kube will not schedule it. And talking about drill-downs, you know, you can go all the way down. We are obviously looking at the namespace level here, right, we are looking at the namespace level, and then from the namespace you can go all the way down to the pod level as well. So, you know, that feature is there.
D
They also found, as we drill down, that some of these are justifiable, right? If you're running in an HA-type environment, you are going to reserve more than you are really using, because you need that spare capacity running. But having the data allows you to determine: is this a real problem or not? You can also look at it as, this is how much I'm spending to get HA; is this really worth it, right? Yeah, right.
D
So it seems, kind of, you know, basically anybody you think about. There are so many people, and you've seen the charts, that are now starting to adopt Kubernetes at last, I see about, what, 30-ish percent or something. And there was just so much going on so fast that people really don't have good governance and really understand, are people.
D
One customer told us that their OpenShift was too expensive to run. I'm like, what do you mean, the license? And what we found out was, well, they're running all of these worker nodes and they didn't need to, but they're paying by core, right, right. They have all of them reserved, but they're not using them, right. And so they were able to roll back to more realistic usage and get their expenses in line, right.
A
Yeah, and you know the roadmap: since you've got the data, you can pivot around it, yeah. We didn't talk much about the trifecta, Randy; we didn't talk about metrics, logs, and trace data, but that's in the future, right? Like, we're working towards app monitoring. So today we're delivering cluster health monitoring, that's multi-cluster health monitoring, yes, and we'll start to build in the direction of app monitoring.
E
Open source project, that's.
A
Logs and monitoring together, events and alerts, which are in the box today; and then over time you'll see, you know, Jaeger, that trace aspect of this, start to come into the fold. And, you know, it's a lot of products that basically assemble into this observability platform. But what we're seeing is customers are already doing it. They already have an opinionated point of view on how they work together, but we want to make sure it not only services this.
B
That's right, yes. I was about to say, two weeks from now y'all return with the great Red Hat Advanced Cluster Management Presents show. "Not a specific use case, just curious," that says; Mac responding. Sorry: "Not a use case specifically, just curious, thinking like centralized executive reporting." Well, you could totally do that with what's in the box, right? Like, you can build those dashboards for your execs with Grafana, no problem.
B
Yeah, so look at it that way, right? Like, use the tool that's there to build that dashboard the way you want it, right. And that way it's native; there are no conversions, there's nothing. It won't break when an API changes, right? Like, it's just going to be there, so you can totally do that.
B
Yeah, like, this was awesome, it was really great. A lot of people tuned in, and I'm sure there are some questions. So if you have any, hit me up on Twitter, Chris Short; hit me up on email, cshort at Red Hat, and I'll get them to the team and we'll get you answers if you need them. So, thank you all for joining.
B
Thank you, Joydeep, Randy, Chris, and Scott, and we'll see you all in two weeks. Thanks.