Description
Context is King in OpenShift
Matthias Luebken (Instana)
September 23, 2020
OpenShift Commons Operator Hours
OpenShift Commons Briefing
A
All right, everybody, welcome back to another OpenShift Commons Operator Hour, which we like to do on Wednesdays. We have one of the many folks who have built operators that run on OpenShift come in and talk about what they're doing, why they built it, and what their operators do. Today we're really pleased to have Instana here, with Matthias Luebken, and he's going to talk about using Instana's offerings to successfully manage applications running in Kubernetes. So, as he likes to say in his title, context is king. I'm going to let him explain that and introduce himself, and then we'll have live Q&A at the end. Thank you all for joining us today, and take it away, Matthias.
B
Yeah, thank you very much, Diane. My name is Matthias Luebken, and today I'd like to talk a little bit about our experience of running our operator and running Kubernetes workloads, and about what we see from our customers, the new challenges they have running these. Just a few words about myself: I'm the PM for Kubernetes and infrastructure at Instana, and I've got some experience in software development all over the place. I've actually been with Red Hat and done some interesting stuff over there, and now at Instana we're focusing on helping developers and DevOps teams manage all these crazy things that we're seeing, and operators help a lot. I hope this talk helps a little bit with some of those experiences. All right, so basically it's one talk in one slide: this is the agenda, this is the talk.
B
What I would like to do today is give you an understanding of what to look for if you're running an application in Kubernetes. We separated the different aspects into different perspectives and came up with three basic perspectives, three different views on this, and that's what I'd like to share today, along with very tangible ways of doing it. The slides contain a lot of further links for deep dives on how to get it going for yourself, so I hope there are a lot of things to take away. This next bit is actually something somebody brought up in May, and I love the phrasing of it: Kubernetes is very good at solving the problems it introduces in your environment.
B
Kubernetes
and
openshift
are
awesome
platforms,
our
awesome
platform
for
distributed
for
for
managing
distributed,
distributed
setups,
but
at
the
same
time
it.
It
also
introduces
a
lot
of
the
complexity
that
we
might
or
might
have
not
been
exposed
to
initially-
and
I
think
this
is
a
this
is
a
real
challenge
right.
It's
it's
a
it's.
Yes,
we're
techies
and
we
we
want
to
get
this
all
solved,
but
it's
not
it's
not
always
simple
right.
B
So if we take a step back and look at what a Kubernetes application actually introduces, there are quite a few new attributes that make these things really challenging. First of all, we've talked about microservices-based applications before, but to be honest, that wasn't at the scale I've seen with Kubernetes. With Kubernetes, a decent application easily consists of hundreds or thousands of microservices, which is very, very different. You can talk about the pros and cons of microservices, but the fact of the matter is that Kubernetes allows this and gives us a lot of means for doing so.
B
So
you
know
people
are
taking
advantage
of
it,
but
they
also
need
to
manage
these
right.
The
change
also
increases
dramatically
what
with
developers
openshift
provides
a
great
platform
for
continuous
deployment,
and
you
know
the
increased
change
of
these
is
just
just
tremendous
and
parts
come
and
go
right.
So
you've
got
these
auto
scalers,
you've
got
rolling
deployments
and
everything
is
getting
ephemeral
right.
So
you
know
also
something
that
you
know
at
least
at
least
for
me,
but
also
what
I'm
seeing
from
customers
is
something
that
they've
not
not
been.
B
You
know
really
used
to
it,
and
you
know,
containers
are
awesome
right.
We've
got
now
the
whole
fleet
of
suddenly.
You
know
just
packing
everything
in
a
container
putting
the
runtime
on
it
and
then
just
pick
the
technology
you
want
everything
is
polyglot
now,
but
who
manages
this
right,
who's
who's,
taking
care
of
this.
B
So
when
you
are
then
looking
off
of
what?
What
does
that
mean?
What
are
the
problems?
What
are
the
challenges
that
that
you
face
that
these
are?
These
are
some
of
them
that
I
I
hear
hear
most
as
if,
if
there
are
using
astana
or
any
other
other
tool
of
looking
up
of
what
what
what
the
kubernetes
applications
are
doing
is
like
okay,
so,
first
of
all
what
is
actually
affected
like
like
I'm
building
my
application,
it
has
a
problem
what
are
actually
affected
and
what
what
metrics
do?
B
I
need
to
look
at
like
well,
what
is
the
cpu
utilization?
What
are
the
request
rates,
and
where
do
I
start
looking
at
what
would
be?
Actually
all
these
metrics
actually
mean
right.
Okay,
now
I've
got
I've
got
a
list,
but
how
do
these
work
together
right
now,
I've
got
a
couple.
What
else
do
I
need
to
look
at
and
what?
What
is
the
root
cause
to
all
this
problem
and
again
right?
This
is
about
you,
know,
developers
devops
that
are
are
now.
You
know,
working
with
this
new
environment
for
them
new
right.
B
You
know,
maybe,
for
some
of
the
industries
like
you
know,
been
working
with
with
this
for
quite
a
while,
but
this
is
like
you
know,
being
new
right.
It's
new
new
for
them,
and
operators
put
another
level
on
it
right.
B
They
do
codify
a
lot
of
these
things,
but
they
put
another
level
on
this
and
stitching
these
things
together
might
have
their
own
means
of
bearing
that
distributed
load
and
putting
their
own
custom
resource
definitions
and
again
like
which
metrics
do
I
look
at
what
do
these
mean
right,
so
another
layer
challenges
and
and
managing
the
company's
application?
B
That
that
you
know
just
you
know
to
make
it
a
little
bit
more
more
tangible
to
to
to
to
discuss
what
we
were
talking
about.
So
this
is
a
very
simple
I
don't
know
if
it's
a
very
simple
example.
B
The Elasticsearch node itself runs on the JVM, so that JVM is running in a container in a pod. Most likely we're running this on a Linux host, which is itself a Kubernetes node with its own properties, and all of this is running in a Kubernetes cluster with some availability setup. So the question is: what do I need to know about what happens? And I guess, if you're looking at a problem, then all of these layers surface.
B
That was the simple stack. The more complete example is that we're actually talking about a cluster: Elasticsearch is not a single node but rather a cluster, and each of these Elasticsearch clusters obviously runs on a couple of nodes. And we might have a Spring Boot application, or another Java application, most likely with a similar stack.
B
So let's look at a problem that could happen. Let's start in the lower right-hand corner with the first one: let's say the node has an I/O problem, and that particular shard in the Elasticsearch cluster gets into a state worth looking out for. Then maybe the thread pool of the JVM on that Elasticsearch node has a problem, and there's a warning sign, so I'm circling these. The yellow circles are warnings in your system. Then another Elasticsearch node needs to take over, and there the thread pool is so overloaded that it goes beyond a certain threshold, and requests are queued up so much that it can't fulfill the service it was supposed to provide. So the overall cluster throughput is decreasing, and the performance of my service, of my application, is decreasing as well.
B
So the message I want to bring across is that this is not a too-complicated, far-off application, it's a rather simple one, but you can see that specific problems could sit anywhere in this setup. I would like to bring a little order into this and give a little guidance on where to start and what to look for. Here's a little example from our tool.
B
So, Instana is a monitoring and observability tool. We gather data from all sorts of sources, and we've got this dynamic graph that hooks everything up. This is actually just a visualization, a fun project: seeing the "death star" of all these components hooked up. Quite fun, actually; the live version is even more fun, because it's dynamic and you can see things moving. All right, so how do we get started?
B
Everyone kind of agrees that the first thing to reason about is our services. A service is what the user, an internal user, or a dependency within the microservice landscape needs; it's the logical abstraction that we're looking at, and we're going to talk about it. But we can't stop there, as I tried to show earlier. The second perspective, and that's hopefully obvious for this audience, is the Kubernetes environment, the OpenShift environment: understanding what's happening there, getting an overview of the namespaces, the pods, the deployments, the other workloads, and so on.
B
Now, the third one, and let's see what we can get out of the conversation here: what I would like to argue is that the infrastructure level is not going away. We've tried to build up abstractions, but from what I'm seeing, looking into examples, there is always a point where you start looking at how a particular container is behaving on a particular host. So infrastructure, as I showed earlier with the I/O example, is still something that we need to look into.
B
But as the talk says: context is king. You need to understand how all these things relate to each other. That's the fourth dimension, or perspective, I'm going to talk about. So, in one slide: if you reason about how to manage Kubernetes applications, I would start by looking at these three core perspectives: the application, the Kubernetes layer itself, and the infrastructure layer, and at how all these things tie together. It's interesting: when I started talking about this, it turned out you could map these to specific roles, and I think certain roles are naturally bound to one or the other. You can think of the services as being for the developer, the Kubernetes side of things as the DevOps side, and the infrastructure as the ops side.
B
I just put them there, but I also think the nice thing about DevOps is that we're not building up walls. We don't want to cut things off and say, "I don't care about the rest, just give me a host and I'm done with it." We want to combine these, so I think it's also important to share this perspective: share these views, share these metrics, share dashboards between the different parts of the organization. I'm looking forward to what you think, but that's my, that's our perspective. Now, just a word: this is one view of it, the one we brought to Instana, but there are obviously lots of other perspectives.
B
Like end-user monitoring, business and custom metrics, synthetic monitoring, networking, security, yada yada yada. When I was preparing the talk, I actually had an argument with a colleague of mine that something else is becoming more and more important than the three perspectives I'm talking about here. But it's just one view, and if you think others are more important, I'm happy to reason and talk about it; still, I think these three are generally applicable. All right, so we've got these three perspectives.
B
What should I look at, and how should I look at it? And then, last but not least, let's put them in context. All right, let's start with the service: what am I looking for? The definition of a service, for me, is something that has a logical context to it. It's implementation- and infrastructure-independent, and we care about what the service provides to its user. We've also got SLIs and SLOs, which work perfectly with this, and the important piece is that it's a logical unit that serves a user.
B
Now, the important piece is that we're looking at an implementation-independent definition here, because technology-specific KPIs can be misleading, and maybe you'll want to exchange the service for a different technology at some point. There are so many things you could consider about the technology, so let's look at the logical unit itself, especially in Kubernetes, which, as I briefly mentioned, is a polyglot environment. So let's take the logical view on this and abstract. If we're then looking at what to observe, the first question you need to answer for yourself and for your team is the granularity of the service.
B
You're not bound, and I think you shouldn't bind yourself, to a Kubernetes Service object as such; think of a service as a logical unit of your own. If you have an old web application on a web server, then you might split out different endpoints; you might have different granularities of what works for you. In Instana, the default is something that is named and has a certain type, like HTTP, database, and the like.
B
The
other
part
is
that
you
know
need
to
consider
is
some
sort
of
higher
level
assemblies
you
know,
given
the
operator
right
operator
is
a
good
means
of
stitching
these
things
together,
there's
also
the
application
crd
or
or
helm,
but
you
know
it
doesn't
have
to
be
on
the
kubernetes
side
of
things.
You
can
also
talk.
Think
about
it
differently
of
you
know
many.
Maybe
some
other
back-end
unit
is
also
belonging
to
that
you
know
logical
assembly
and
I'm
and
I'm
making
this
term
very
very
loosely
we're
calling
these
application
perspectives.
B
There's a pretty good understanding right now of what to observe. The four golden signals from the Google SRE book introduced latency, traffic, errors, and saturation. Tom Wilkie, formerly of Weaveworks and now at Grafana, introduced the RED method, which resonates more with me, because it takes out saturation, which is often a very technology-specific component. That said, both work; I'm going to go with RED, rate, errors, and duration, throughout the talk.
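To make RED concrete, here is a minimal sketch (pure Python, with invented request records; not Instana's implementation) that derives rate, errors, and duration from a window of requests:

```python
from dataclasses import dataclass

@dataclass
class Request:
    duration_ms: float  # how long the request took
    status: int         # HTTP status code

def red_summary(requests, window_seconds):
    """Compute the three RED signals over a time window."""
    rate = len(requests) / window_seconds                 # R: requests per second
    errors = sum(1 for r in requests if r.status >= 500)  # E: failed requests
    durations = sorted(r.duration_ms for r in requests)
    # D: median latency; real systems keep full histograms and percentiles
    p50 = durations[len(durations) // 2] if durations else 0.0
    return {"rate_rps": rate, "error_count": errors, "p50_ms": p50}

window = [Request(12.0, 200), Request(250.0, 500), Request(30.0, 200), Request(18.0, 200)]
print(red_summary(window, window_seconds=2))  # {'rate_rps': 2.0, 'error_count': 1, 'p50_ms': 30.0}
```

In practice these three numbers per service are what the dashboards in the next slide show.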
B
So these are just some examples. On the left-hand side you see the Instana dashboard, where we show that information for the service, but obviously this is not bound to any particular tool: on the right-hand side I've put a Grafana dashboard that shows similar stats. And in OpenShift you get similar views showing the traffic and the errors, so again RED: rate, errors, and duration.
B
I guess the most common approach out there is capturing these natively, or with some library, out of the workload itself. In the OpenShift and Kubernetes space, Prometheus is the standard, and if we're looking at Java again, there are tons of options: using a specific library such as the Prometheus Java client, the JMX exporter, or the Micrometer exporter. That's the most common approach we're seeing, and it's probably also the most straightforward.
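Whatever library does the capturing, what Prometheus ultimately scrapes is the text exposition format. Here is a small illustrative sketch of producing that format by hand (the metric and label names are invented, not from any specific library):

```python
def render_prometheus(metrics):
    """Render {(name, labels): value} as Prometheus text exposition format."""
    lines = []
    for (name, labels), value in sorted(metrics.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"

# A hypothetical request counter split by status code, as a client library would keep it.
counters = {
    ("http_requests_total", (("service", "catalog"), ("status", "200"))): 1027,
    ("http_requests_total", (("service", "catalog"), ("status", "500"))): 3,
}
print(render_prometheus(counters))
```

A real client library maintains these counters for you and serves them on a `/metrics` endpoint; this only shows the wire format the scraper sees.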
B
A different way of doing this is capturing it from distributed traces. With distributed tracing, you look at the traces between the different services and use those traces to calculate the different KPIs. The advantage is that you don't have to do anything extra: if you're using a service mesh such as Istio, for example, you can capture them automatically. The other advantage, at least within Instana and some other tools, is that you can dynamically change the service composition and granularity, something I talked about earlier.
B
So that's an advantage of working that way. Just one note: if you're sampling the traces, please be careful about this and store the metrics separately.
B
That's always the starting point: that's where we're basing our SLIs, our service level indicators, and our SLOs, the objectives we strive for. It's the starting point for it all, and if you do only one thing, then do this. But for understanding the whole picture, I think the other perspectives are equally important, which brings us to Kubernetes.
B
I don't need to talk about Kubernetes too much for this audience, but the orchestrator of distributed workloads has a lot of new things to take care of that might have been hidden earlier. Kubernetes opens up this environment for us, schedules the workloads across the fleet, and makes the resources available to the actual workload. Something that I haven't included in the earlier example is persistent volumes: persistent volumes for the Elasticsearch environment, for the Elasticsearch application, so that the database can actually be stored. That's the job of Kubernetes: it has these great APIs everyone is talking about, and it makes sure the different setups, and the different and new workloads, run, together with operators.
B
Now the question is: what do I need to look at? I guess the first thing is just the cluster itself, depending on where you are; even if someone else is managing it, you probably need to look at the cluster itself, at the control plane, just making sure the cluster runs, so that if there is a problem you can correlate things with it. Is etcd behaving as expected? Has the state been distributed through etcd? That is just one indicator, one piece of information you need to gather. Now, on the workloads themselves, the distribution state is essential: how many of my desired workload pods are actually running?
B
If it's a DaemonSet, is it evenly distributed on all the nodes that I want covered? If I have a Deployment, is it at the scale that I need? And if something is becoming unavailable, is that still within my budget, or is it something I need to act on? On the workload side of things, for the scheduler to make sure the workloads are distributed, we have requests and limits, but we need to put these in context with the others to make sure we've got that covered. All right, again two examples here: looking at the different CPU resources, the requests and limits, and at their utilization, is the starting point for investigating things from the Kubernetes perspective. That's also something that I found very interesting.
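As a sketch of the kind of check meant here (the millicore numbers are invented), relating a container's actual CPU usage to its request and its limit:

```python
def cpu_pressure(usage_m, request_m, limit_m):
    """Relate actual CPU usage (millicores) to the scheduler request and the hard limit."""
    return {
        # above 1.0 means the pod uses more CPU than it asked the scheduler for
        "vs_request": usage_m / request_m,
        # approaching 1.0 means the container is close to being throttled
        "vs_limit": usage_m / limit_m,
    }

p = cpu_pressure(usage_m=450, request_m=250, limit_m=500)
print(p)  # {'vs_request': 1.8, 'vs_limit': 0.9}
```

A pod like this one runs fine in isolation, but because it uses far more than its request, it is the first candidate for trouble when the node fills up, which is exactly why the numbers need to be read in context.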
B
These three perspectives are in no way a strict order. We need to measure the services, as I said earlier, but it's also very natural for people coming from a different background to start somewhere else. In the Kubernetes environment, maybe I'm more on the DevOps side: I need to make sure that a new namespace is running smoothly, so I'll start with the namespace. But I guess the important piece is that when you're starting with Kubernetes, you also understand the environment, how things are running on the host itself and on the cluster itself, that you have some means of getting there, and that you also have some means of understanding the applications: what are the developers actually putting on there? So it's about having good starting points.
B
I think that's important, and so is linking them, which I'll talk about. As for how to measure: that's actually pretty nice in Kubernetes, because it's basically all there. We've got kube-state-metrics, which covers everything around the workloads and the configurations themselves and provides those metrics; on the control plane we have the individual metrics endpoints; and, just for completeness, there's the metrics server, too, for doing auto-scaling.
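For instance, kube-state-metrics exposes desired versus available replica counts per Deployment (`kube_deployment_spec_replicas` and `kube_deployment_status_replicas_available` are real metric names; the sample scrape below is made up). A minimal parser for that distribution-state check could look like:

```python
import re

# A made-up fragment of a kube-state-metrics scrape.
SAMPLE = """\
kube_deployment_spec_replicas{deployment="shop",namespace="prod"} 5
kube_deployment_status_replicas_available{deployment="shop",namespace="prod"} 3
"""

def replica_gap(exposition):
    """Return desired-minus-available replicas per deployment name."""
    values = {}
    for metric, labels, value in re.findall(r'(\w+)\{([^}]*)\} (\d+)', exposition):
        pairs = dict(kv.split("=") for kv in labels.replace('"', "").split(","))
        values.setdefault(pairs.get("deployment"), {})[metric] = int(value)
    return {
        d: v["kube_deployment_spec_replicas"] - v["kube_deployment_status_replicas_available"]
        for d, v in values.items()
    }

print(replica_gap(SAMPLE))  # {'shop': 2}
```

A gap of two here means two desired pods are not available, which is exactly the "is my workload at the scale I need" question from above.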
B
So everything is already there, and the dashboards are there. With these metrics being provided as standard, there are also a lot of prebuilt dashboards ready to give you this perspective easily; we have that in Grafana, and we have that in OpenShift. So it's pretty easy to get started with and to enhance with more metrics. All right: infrastructure. Why do we now need to look at infrastructure? The example that I gave should hint at it: the I/O problem on the host is something you need for the troubleshooting. But even beyond troubleshooting, I think it's something we shouldn't be afraid of; developers and DevOps folks shouldn't be afraid of keeping this in mind, and of not only looking at my pod and my JVM, but also understanding how the JVM runs: what are the threads doing, how is it running on the host?
B
So maybe that's one takeaway from this talk: encouraging developers to look into this and to understand what's happening there; hopefully this talk gives a little bit of guidance on what to look at. Now, very importantly: we talked about the services being the starting point, and that's still the case. CPU utilization, as Adrian Cockcroft says, is virtually useless as a metric in itself. There are so many assumptions in there that if you just look at the CPU utilization and try to act on it, you will most likely be wrong. But putting it in context and understanding what the service impact of a possible problem on the host is: that's the point I'm trying to get at.
There's a great method, similar to the RED method, by Brendan Gregg, an awesome performance engineer who gives lots of talks and has written great books. The USE method says: for all physical server components, so we're looking at CPUs, memory, and storage, look at basically three things.
B
First of all, look at the errors: if there's an easy way to get at the errors, look at them and at what they tell you. Then look at the utilization, so how busy the resource was serving the work, and at the saturation, so how much work is queued up for this resource to work on. On the host level we have a gazillion of these resources, on the host itself or connected to it. So something that probably everyone should have a look at is CPU and memory, on the usage side of things and on the load side of things. Again, dashboards for this are all over the place, and I guess it's just important to get familiar with them.
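A minimal sketch of that USE checklist for one resource (the numbers are invented; real inputs would come from something like the node exporter):

```python
def use_report(resource, utilization, saturation, errors):
    """USE method: for each resource, check Utilization, Saturation, Errors."""
    findings = []
    if errors:
        findings.append(f"{resource}: {errors} errors, investigate first")
    if utilization > 0.9:
        findings.append(f"{resource}: {utilization:.0%} utilized")
    if saturation > 0:
        findings.append(f"{resource}: queue depth {saturation}")
    return findings or [f"{resource}: ok"]

# e.g. a disk that is busy and has requests queued, but reports no hard errors
for line in use_report("disk", utilization=0.95, saturation=12, errors=0):
    print(line)
```

Errors come first deliberately: they are the cheapest signal to check and the least ambiguous, while utilization and saturation need the service context discussed above.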
B
This is another example that I found interesting, on the JVM side: the JVM being such an important part of our system, we should look into it more deeply and at the different metrics there, be it threads, be it the heap and the memory pools, and especially the garbage collection. That is something to understand and to have ready when you're looking at problems.
B
So again, we've got the infrastructure metrics: how would I get them? In Kubernetes, the best way is again to work with some exporters: there's the node exporter and the JMX exporter. Also, cAdvisor provides a couple of good metrics on the infrastructure side of things. But it's important to note that for performance reasons you sometimes need to look at the instrumentation itself, because more native information might be needed. That's true for our sensor, and probably for other sensors as well: we do more natively.
B
That's
roughly
50
of
the
instrumentation
we're
getting
and
out
of
in
a
native
way
choose
just
to
be
more
more
performant
all
right,
so
we've
got
the
service
kubernetes
and
infrastructure
and
the
they're
all
needed,
and
I
guess
you
know
to
be
taken
care
of
in
itself
now.
What
do
we
do
with
the
context?
How
do
we
stitch
these
things
together
and
that's
something
that
you
know
we
within
the
standard
we've.
You
know
basically
built
our
our
tool
about
upon,
but
it's
it's
something
that
you
know
you.
B
But it's also something you can do yourself. When I was preparing the talk, I actually realized that there's a pretty good upcoming standard that hints at a lot of this, and that's OpenTelemetry.
B
So
there's
lots
and
lots
of
things
in
the
open
telemetry,
but
something
that
for
for
this
context,
I
would
like
to
highlight:
is
the
resource
semantic
conventions,
so
the
the
resource
semantics?
We
mentioned
the
notification
they
describe
how
a
resource
should
be
considered
of
in
a
consistent
manner,
and
so
there's
a
couple
of
you
know:
kind
of
tagging,
suggestions
and
open
telemetry.
B
There
are
they're,
not
only
suggestions,
but
there's
also
some
mandatory
required
ones
and
some
optional
ones,
but
I
think
that's
a
pretty
pretty
decent,
pretty
good
starting
point
if
we
are
thinking
about
how
to
correlate
these
ft
together.
So
if
you
are
working
with
a
if
you,
if
you
are,
if
you've
got
a
service-
and
you
picked
like
a
service
name
and
open
telemetry
talks
about
the
service
name
space
that
makes
these
things
unique
together
and
then
you
correlate
it
to
a
service
instance
id.
B
So
something
that
serves
the
service,
then
you've
got
an
unique
identifier
of
what
what
this
thing
is
that
actually
serves
this
right
again
and
then
sana.
We
also
have
the
service
type
but
which
you
know,
I
think,
makes
a
certain
use
cases
easier
and
easier
to
to
get
at.
But
you
know
open.
Telemetry
does
not
not
now
so
we've
got
the
service
right
and
if
we're
using
this
tagging
theme,
then
we
can
start
correlating
things
together.
We
can
see
okay.
This
service
belongs
to
this
container
to
this
host
to
this
kubernetes.
B
You
know
part,
for
example,
and
the
other
way
also
around
right.
The
the
information
that
we've
gathered
from
all
these
different
other
instances
are
common
tagging
schemes
that
we
can
use
to.
Excuse
me
that
we
can
use
to
correlate
one
to
the
other
now.
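As a sketch of that correlation (`service.name`, `service.namespace`, `service.instance.id`, and `k8s.pod.name` are real OpenTelemetry resource attribute keys; the data and the assumption that the instance ID equals the pod name are invented for illustration):

```python
# One attribute set per telemetry source, tagged with the shared convention keys.
service_resource = {
    "service.namespace": "shop",
    "service.name": "checkout",
    "service.instance.id": "checkout-7d4b9-xkq2p",
}
k8s_pods = [
    {"k8s.pod.name": "checkout-7d4b9-xkq2p", "k8s.node.name": "node-3"},
    {"k8s.pod.name": "catalog-5f6c8-mm9zl", "k8s.node.name": "node-1"},
]

def pod_for_service(service, pods):
    """Join service telemetry to its pod via the shared instance identifier."""
    return next(p for p in pods if p["k8s.pod.name"] == service["service.instance.id"])

pod = pod_for_service(service_resource, k8s_pods)
print(pod["k8s.node.name"])  # node-3
```

The join key is the whole point: because both signals carry the same identifier, a service-level alert can be walked down to the pod, and from there to the host, without guesswork.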
A different way of looking at this is the trace-based world, and I'm mentioning this explicitly because, as I said earlier, the service metrics that we gather are based on traces. So we infer a lot of this; we and others do, it's not unique to Instana, but those who work that way infer a lot of this information from the traces. And the way to look at it is: if you've got traces, you can use the trace ID itself to correlate things.
B
Here's the first example, from Grafana: they talk about how to use the trace ID in logs and to use common service tags throughout the system (their service name tag is slightly different), so that they can then navigate to the service. Something interesting that I found with Zipkin traces: the tags of the spans had the pod ID, and for the service naming they were doing a reverse lookup on the pod ID and then enriching the span data.
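A minimal sketch of that trace-ID-in-logs idea with Python's standard-library logging, stamping a (made-up) trace ID onto every log line so logs and traces can be joined later:

```python
import logging

class TraceIdFilter(logging.Filter):
    """Attach the current trace id to every record so log lines can be joined to traces."""
    def __init__(self, trace_id):
        super().__init__()
        self.trace_id = trace_id

    def filter(self, record):
        record.trace_id = self.trace_id
        return True

logger = logging.getLogger("checkout")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
logger.addHandler(handler)
# In a real service the id would come from the active span's context.
logger.addFilter(TraceIdFilter("4bf92f3577b34da6"))

logger.warning("payment backend slow")  # emits: trace_id=4bf92f3577b34da6 payment backend slow
```

With the ID on every line, a log search for one slow request and the trace view of that same request become two doors into the same data.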
B
OpenTelemetry talks about making this more and more automatic in an open standard; we already do this, as do others. So this is an example, and we could also do a short demo if you like, but this is how we are doing it and how it is visualized in Instana. It's our example, but basically we separate these three different perspectives, the application perspectives, the Kubernetes perspectives, and the infrastructure perspectives, and from any entity you're looking at, you can link to the others. Conceptually, though, it's not bound to us: you can rebuild this with your own tools, or again, just give it a try within Instana. So, key takeaways.
If you like: we've got service, Kubernetes, and infra. Please consider all of them. Please also consider them independently, and make the best use of each when you're looking at them independently, because there's always someone coming from that particular background, and if you overload them with information from different perspectives, they might be overwhelmed. These different perspectives: share them. Make them shareable within the team to ensure a common understanding of what to measure and of why you measure it. Why is this particular saturation metric the most important one for your workload? And last but not least: context is king. Link these together, make them visible to everyone, so that everyone has the same understanding of what you can do. All right.
A
And that would get them, you know, at least to know where you live and breathe at Instana as well, and where all the docs are.
B
Right, so a couple of words about the operator. We've been really stoked about the operator, and the Instana operator is basically available wherever you like: it's obviously listed in the OperatorHub, you can also get it directly, we've included it in our documentation on how to install the agent, or you can just get the source and everything from GitHub directly. Now, the operator, the agent operator, does a lot of nice things for us, and it helps us distribute what we do with our agent.
B
So, going back to what I talked about, correlating all these different things together: that's something the operator and the agent do. And if we take this a step further, we've not only got infrastructure, Kubernetes, and services; you've got all of that mixed in with all the cloud stuff, with different operating environments, with different runtimes.
B
So
there's
lots
and
lots
of
things
to
do,
and
the
operator
helps
us
on
distributing
that
workload
to
throughout
the
throughout
the
cluster
and
just
picking
or
selecting
different
nodes,
putting
some
intelligence
to
our
operator
and
making
the
operator
very
making
the
agents
very
dynamic
and
and
in
what
they
do.
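The distribution pattern described here, one agent pod on each selected node, is what Kubernetes expresses as a DaemonSet. The following is a generic sketch of that pattern, not Instana's actual manifest; the names and image are placeholders.

```yaml
# Generic sketch (placeholder names) of the per-node agent pattern:
# the DaemonSet controller schedules one pod onto every matching node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: monitoring-agent
spec:
  selector:
    matchLabels:
      app: monitoring-agent
  template:
    metadata:
      labels:
        app: monitoring-agent
    spec:
      nodeSelector:
        kubernetes.io/os: linux    # only schedule onto matching nodes
      tolerations:
        - operator: Exists         # also run on tainted nodes
      containers:
        - name: agent
          image: example.com/agent:latest   # placeholder image
```

An operator adds intelligence on top of this: it creates and reconciles such resources from a single custom resource, instead of the team maintaining the manifest by hand.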
A
Okay, so a little "context is king" here: OperatorHub.io hosts all of the open source operators, and they run anywhere, on any Kubernetes. Michael, maybe you can explain a little bit what the catalog.redhat.com operators are, what that's all about.
C
Yeah, sure, and I'm so sorry that I'm late; I have no control over this. I've been working from my cabin in the mountains for the last seven months, and it's a DSL phone line running through the woods, so when a moose gets crazy, things can go down. So I apologize for being so late. But I did link the... and hi, Matthias, how are you? It's nice to see you.
C
Nice to see you wearing an Instana name badge these days. But yeah, I did link the Red Hat catalog, because our team works with companies like Instana and others to run their operators through the Red Hat certification process, which really allows customers to know that all the parts and the internals of it are...
C
You know, like the Pillsbury muffin man seal of approval: that the Red Hat components and the Instana components are all supportable and they can be used in a production environment. That's where our customers can go to download something and make sure that they're getting genuine "Intel Inside" parts.
B
Yeah, and again, you can think of the operator in multiple dimensions. So far we don't have dedicated support in our tool for monitoring operators themselves; an operator just shows up as a custom resource definition. But obviously, as operators get used more and more by developers, that would become one of the additional perspectives.
B
Okay, we've got the services, we've got Kubernetes, we've got infrastructure. Taking a deeper, more intelligent look at the Kubernetes layer, the operator gives us a means of even better understanding, of better linking these together and putting some semantics into the operations. Take Elasticsearch, for example: that's exactly a layer I can think of adding. And, you know, we are running our operator.
C
His name was Pete Abrams, a terrific guy, and I was talking to him about what we were doing. Instana was probably one of the first APM-type vendors that ever certified a container for the Red Hat portfolio and built an operator, and that was because we were working with them very closely. Pete invited me to your sales kickoff in Miami.
C
It was probably two or three years ago now, so my team and I flew down there; we bought appetizers and drinks for the entire Instana sales organization. So we've actually had a really good, close working relationship with your whole team, including your marketing people, for a number of years. This doesn't just happen by accident, and we're doing these types of things together to make the overall customer experience as good as it possibly can be in a cloud-native environment.
B
Yeah, and we continue to do so. This is on the agent side, but we also have a lot of backend components. That's a conversation for another time down the road, and I'd need to get my colleague online for it, but we're going to continue investing there.
A
There is one question: someone is asking about the backend operator status that you referred to. Can you give us any hints on when that's coming?
B
Exactly. So for Instana itself, you obviously need to have the agent running, but we also have an on-prem solution, or a self-hosted solution as we like to call it, and Kubernetes has for a very long time been our primary way of shipping this. With our experience with operators on the agent side, we're also looking into what we can do on the backend side to make the on-prem install easier and faster.
C
Okay, and is that being driven by customers saying, you know, we have certain requirements where we need to have the full APM solution inside our infrastructure, from a security perspective or something like that?
B
Right, so the traditional on-prem questions apply here: making sure that it's secure and in-house, but also performance reasons, ensuring that it's near the actual workloads. So there are multiple ways of reasoning about or motivating this. We don't take a stance there; we just try to make it as easy as possible, and operators, together with a homogeneous environment like Kubernetes or OpenShift, give us the means of installing it.
C
Yep. Hey, we've got another question; I'm going to read it here, and maybe you can translate for me, Matthias. Jeffrey says: in New Relic, which we're using right now, we have the Apdex score, which measures satisfaction on response time against a set threshold, to get insight into application health. Does Instana have any corresponding features?
B
That is a really, really important aspect of monitoring your applications. Something that we are leaning more towards is the SLI/SLO way of looking at this. We've just introduced the ability to define service level indicators on your customer journeys, to define these and alert on these; and with our application perspectives you have much more fine-grained control of which traffic, which aspect, you're looking at and alerting on.
B
So right now we don't have the exact equivalent of what Apdex is, but I think if you look at what Instana provides, in the end you may even prefer the way we translate these things.
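For readers comparing the two models: Apdex is an open standard rather than a New Relic-specific feature, and its score relates directly to a threshold-based latency SLI. A minimal sketch, with sample data invented for illustration:

```python
# Apdex buckets response times against a target threshold T:
# "satisfied" (<= T), "tolerating" (<= 4T), "frustrated" (> 4T).

def apdex(latencies_ms, threshold_ms):
    """Apdex score = (satisfied + tolerating / 2) / total."""
    satisfied = sum(1 for t in latencies_ms if t <= threshold_ms)
    tolerating = sum(1 for t in latencies_ms
                     if threshold_ms < t <= 4 * threshold_ms)
    return (satisfied + tolerating / 2) / len(latencies_ms)

def latency_sli(latencies_ms, threshold_ms):
    """SLI-style view: fraction of requests at or under the threshold."""
    return sum(1 for t in latencies_ms if t <= threshold_ms) / len(latencies_ms)

samples = [120, 180, 250, 900, 2100]  # response times in ms, made up
print(apdex(samples, 500))        # 3 satisfied, 1 tolerating, 1 frustrated -> 0.7
print(latency_sli(samples, 500))  # 3 of 5 requests at or under 500 ms -> 0.6
```

The SLI view drops the "tolerating" middle bucket, which is why SLO-based alerting is often easier to reason about: either a request met the objective or it did not.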
C
Okay. Hopefully, Jeffrey, that addresses your question; if it doesn't, I'm pretty sure that we can get you answers to just about anything. Where would we send people if they have follow-up questions? I mean, my email address is waite at redhat dot com, it's just w-a-i-t-e at redhat dot com, and I can connect people with just about anyone at any level of the organization at Instana. I am, from top to bottom, very close with everyone over there. Matthias, do you have... there you go.
B
That's my email address at instana.com. There are a gazillion ways to reach out; ping anyone and we'll get back to you. And Jeffrey, if you would like to talk more about the Apdex standard, I'm happy to go into some detail with you, especially looking at use cases: why are you looking at that specific metric?
C
Hey Matthias, I really wanted to ask you this at the very beginning, but as I said, I've been dealing with the legacy internet issues.
C
But I was going to ask you: you were at Red Hat for several years, and then you moved to Instana. How lucky are you? I mean, I think, being able to be a part of that team at this time, when everyone needs APM to help them get visibility and insight into running their business in the hybrid cloud... are you just absolutely thrilled to be there?
B
Well, first of all, I was thrilled to be at Red Hat too. Red Hat was really a great time, and we built some really awesome tools. I was more on the developer side of things there, in CodeReady, and we really did build an awesome tool for analyzing dependencies. So a shout-out to all my ex-colleagues.
B
That was a tremendous time. And yes, obviously Instana is great, because we're challenged in a new way: going up against other players in the market, but using this new microservices movement to our advantage and building something very unique that is just very well suited to this new environment. Just one example:
B
I think what's really important, and what's also very dear to my heart, is that it's really, really easy to get started. Our agent discovers everything and throws it all on the dashboard. Yes, you need to tweak and configure things, but everything is just there, right? And as we were talking about the different perspectives of what people are looking at:
B
I think for myself, but I also hear it from customers: it's just great that you've got a platform that you look at, you see the majority of things already there, and then you can dive into the details and start tweaking. You're not lost at the beginning, and that's something I value a lot about Instana. And the whole distributed tracing topic is just fun; it's just a fun technology.
C
Cool. Well, I don't see any more questions coming in, and I know Chris Short is going to let us know that we're just about out of time. So thank you so much for coming on. I mean, I reached out to Star, who's my marketing contact over there, and I said: you've got to find me someone really, really good to be a part of this.
C
This is, you know, one of our early-on OpenShift Commons briefings, and we're really glad that you guys could help be a part of this today.
B
Glad to. Thank you very much for the invitation, always a pleasure. I'm happy to come back, and yeah, it's some great technologies mixing together. Happy to be here.