From YouTube: Gaining Observability in OpenShift Using InfluxData
Description
Learn how to quickly set up a robust monitoring solution for OpenShift using InfluxData. We'll touch on best practices for gathering and storing time series metrics with Telegraf and InfluxDB, visualizing that data using Chronograf, and alerting your team when problems occur using Kapacitor.
My name is Russ Savage, and I'm giving a talk about gaining observability in OpenShift using InfluxData. It's a little bit of a mouthful, but we'll talk through what InfluxData is if you're not familiar, and then we'll talk a little bit about best practices with OpenShift and other Kubernetes platforms and give you some nice screenshots. So I've got 20 minutes, and I'm up against lunch. Actually, the last time I gave a talk, the conference ended at 4 o'clock on Thursday and I was at 3:49 on Thursday, so I'm moving up in the world. I think by maybe 2025 I might be keynote. You guys saw it first; you guys were here at the beginning. So congratulations.
As I said, I'm a product manager at InfluxData. I started in October, and I work specifically on two of our products, Chronograf and Telegraf, so come talk to me afterwards about that. For anybody who's not familiar with InfluxData, our booth is literally right over there; Nikko and the marketing team are waving. We deliver a modern engine for metrics and events, otherwise known as time series data. So anything you need to do with data that has a timestamp associated with it: it could be infrastructure metrics, it could be IoT sensors, or, if you're using OpenShift, a lot of container metrics, all that stuff. You can use our platform to collect, gather, display, and keep track of that data.
A couple more marketing slides before we get into the actual details here, but we're really focused on developers and builders. We provide a platform; you take that platform and make it your own. We try to be as easy to use as possible. We're open source; we love open source and we love the community. All the stuff I'm going to show today is open source, and we pride ourselves on being easy to deploy. Our CTO always talks about the fastest time to awesome.
Yeah, and that's exciting. So first off: is anybody using InfluxData today? Anybody using InfluxDB? Anybody using Chronograf or Kapacitor or any of our products? Okay, so nobody really knows what's going on. Anybody who has heard of us knows about our time series database, InfluxDB.
We started building that approximately four years ago and realized very quickly that you can make the most amazing database in the world, fast, performant, enterprise-ready, but if you don't have the things around it, you're not going to see very good adoption. So we started building out other pieces of our platform, starting on the left with Telegraf, our collection agent. It's an open source Go collection agent, very easy to extend and build your own plugins for. We have about 150 plugins out there, maintained by the open source community.
We have internal developers that are maintaining it as well, but most of the code is contributed by the community. You can use that agent to collect and gather metrics and push them into InfluxDB. You can also use the agent to push into other things. We'd obviously rather you push them into InfluxDB, but we also have companies like Wavefront that leverage our agent to push into their system, and we're fine with that too. Data goes into InfluxDB.
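For context (a detail the talk doesn't spell out), the points Telegraf writes to InfluxDB are expressed in InfluxDB line protocol: a measurement name, optional tags, one or more fields, and a timestamp. A hypothetical point might look like:

```
cpu,host=node1,region=us-west usage_idle=87.2,usage_user=8.1 1524820680000000000
```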
A
Next
thing
you
want
to
see
is
what
your
database
data
actually
looks
like
right.
So
a
lot
of
people
here,
leveraging
dashboards,
probably
in
grow
fauna,
for
a
lot
of
their
visualization
tool.
Tooling,
we
have
our
own
called
cronograph.
We
love
grow
fauna.
They
make
an
awesome
product
if
we
use
cronograph
for
managing
the
arrest
of
the
components
in
this
back
some
custom
visualizations
and
really
exploring
your
data,
and
you
can.
It comes in our platform as well. The next thing is, once you start seeing your data in the system, you might want to be alerted when a system goes down or when your CPU starts spiking, all that stuff. That's where Kapacitor comes in: you can build out really complex alerting mechanisms using Kapacitor. All of these components work together as what we call the TICK stack, T-I-C-K; some people call it the InfluxData platform. You can call it whatever you want.
We just recommend you use it. So that's kind of an overview of the architecture of InfluxData and all the different products; we'll talk a little bit more about them in detail. But you guys are at Red Hat Summit, right? What about OpenShift? You know, that little containerized... container system that Red Hat has? Kind of important. So, InfluxData and OpenShift: all of our components of the platform are designed to run in containers and run in any container architecture, including OpenShift. We're a certified technology partner with Red Hat; the partner marketing team had to make sure that I got that in there, so I'm putting the logo there.
So we work with Red Hat to make sure that all of our stuff runs on Red Hat Linux and in their system, and we're really an enterprise-grade metrics system. There are a ton of metrics options out there, and a lot of people are probably using Prometheus for a lot of their metrics collection, which is great. We love Prometheus.
A lot of people that leverage us along with Prometheus come to us because of our enterprise features: enterprise security, high availability, clustering. We have a lot more control over retention policies and how data gets expunged from the system, and multi-tenancy features that are kind of built in. So a lot of people will start out with Prometheus and realize that they need a little bit of a longer-term view on their metrics, or a little bit more security, or metrics aggregated across a bunch of different containers.
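To make the retention-policy point concrete: in InfluxDB 1.x you control exactly how long data lives before it is expunged using InfluxQL. A hypothetical example (the database and policy names are made up):

```sql
-- Keep raw Telegraf metrics for 14 days, then drop them automatically
CREATE RETENTION POLICY "two_weeks" ON "telegraf" DURATION 14d REPLICATION 1 DEFAULT
```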
You want to make sure that you are definitely putting all your monitoring infrastructure into its own namespace, separating it out from your other application namespaces; that's kind of table stakes.
A lot of our tools need persistent volumes. In a containerized world, when things are spinning up and down, they need to be able to access the same data. So InfluxDB needs a place to store that information, and Kapacitor and Chronograf each have little metadata databases that they leverage for keeping track of all those things. Those all need to be put on persistent volumes in your deployments.
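A minimal sketch of what that looks like for InfluxDB (the names, namespace, and size are assumptions, not from the talk):

```yaml
# Hypothetical PersistentVolumeClaim backing InfluxDB's data directory
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: influxdb-data
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
```

Kapacitor and Chronograf would each get a similar, smaller claim for their metadata databases.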
Now, when we talk about visibility in OpenShift or other Kubernetes environments, there are kind of two parts, and you guys probably know this: you need to track the underlying resources on the bare metal, so that's CPU usage, memory, things like that, any sort of AIOps that's happening, but you also need to track the services that are running on those systems.
You guys know DaemonSets in Kubernetes and OpenShift, right? Telegraf is our collection agent, and what we've done internally is configure a DaemonSet for Telegraf. We use that to make sure the Telegraf agent is actually running on every single node in our cluster and reporting metrics back.
A
So
that's
kind
of
that's
kind
of
the
first
piece
right
that
gives
you
visibility
into
kind
of
the
underlying
infrastructure
of
of
your
of
your
cluster
right
in
OpenShift
other
metrics.
We
internally
there's
a
bunch
of
different
ways.
You
could
do
this,
but
we've
used
the
sidecar
pattern
for
this,
so
essentially
for
every
single
pod,
you're
deploying
into
these
open
openshift
environments
or
other
kubernetes
environments.
You
attach
in
a
telegraph
container
right.
A
Those
containers
share
the
network
space,
so
communicating
between
your
application
and
telegraph
is
is
really
easy
and
you
can
set
up
a
set
up
Telegraph
to
scrape
all
the
premiership
Prometheus
metrics
you
want
so
like
I
said,
a
lot
of
people
are
leveraging
Prometheus
as
the
endpoint
format.
The
slash
metrics
scrape
that
data
to
bring
it
directly
into
influx
in
in
an
influx
1/5
latest
latest
DB.
We
actually
can't
accept
Prometheus
read/write
endpoints
directly
into
the
database.
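A minimal sketch of the sidecar Telegraf configuration described above, scraping the application's Prometheus endpoint over the shared pod network (the URLs, port, and database name are assumptions):

```toml
# Scrape the app's /metrics endpoint on localhost (shared pod network namespace)
[[inputs.prometheus]]
  urls = ["http://localhost:8080/metrics"]

# Forward the collected metrics to InfluxDB
[[outputs.influxdb]]
  urls = ["http://influxdb.monitoring:8086"]
  database = "telegraf"
```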
Telegraf will communicate to InfluxDB over TCP, but for your application sending metrics into Telegraf, we recommend trying UDP. That means that if the agent goes down, your application won't break; it'll just fire things off, and if no one's listening on the socket, then those metrics will just get dropped, but at least your application won't stop running. So we just prefer UDP when you're talking directly to Telegraf, and then Telegraf can push the data to InfluxDB via TCP.
So what does that mean once you actually get the data into the system? I talked about a couple of different parts of the ecosystem that you need. So now the data is being collected by Telegraf, the agent, and it's getting pushed into InfluxDB for storage and rapid access. Now you actually want to see what the heck is going on in there. You can use the Data Explorer in Chronograf to browse through your data and quickly chart out the information that's coming into your system, so you can start building out more complex dashboards.
One of the cool things: I work on the Chronograf team, so I'm pretty excited about this. One of the long-standing features that we've been missing is tables, so we're adding that into the next release in the next couple of weeks. Chronograf 1.5 will have table support, so you can attach host lists and bad-actor reports and log data; you can push all of that into Chronograf and see it on a dashboard. That's going to be really exciting; I'm pretty pumped about that.
This is just a quick example. This is actually how we monitor InfluxCloud, which is our enterprise cloud offering; we leverage our tools internally to monitor all of our customers that are using it. So this is an example for a particular cluster, a very small cluster obviously, but it gives us visibility into what container versions they're running, what their IOPS are, what their memory usage is, all that sort of stuff, which is cool.
So data is coming in and you can see all that data. Now you start identifying trends, and now you want to build out alerting on top of that. Kapacitor is the tool that we have to build out alerting. It also does much more advanced stream processing once you start learning how to write TICKscript, but here you can set one up quickly.
A
So,
in
summary,
is
this
nice
little
slide
wipes
across
the
screen
again
in
flux?
Data
read
how
it's
certified
partner
again
need
to
get
that
in
there
all
of
our
products
designed
for
container
architectures
designed
to
be
run
in
a
cloud
from
the
very
GetGo
telling
about
use
telegraph
agents.
It's
really
powerful
agent,
very
flexible,
very
customizable,
use
that
to
gather
all
the
metrics.
A
You
want
out
of
your
systems
and
there's
different
deployment
techniques
in
order
to
accomplish
that
and
then
leverage
in
flux
DB,
where
you
need
enterprise-grade
time
series
data
with
security
with
clustering,
with
high
availability,
all
that
stuff.
So
so
all
the
different
parts
of
our
of
our
stack.
You
know
really
work
together
and
create
an
awesome,
awesome
experience.
A
That's
it
for
me.
I've
got
technically
four
minutes
left,
but
I
encourage
you
guys
if
you're
interested
in
influx
data
and
our
and
our
products,
our
booth,
like
I
said,
is
right
behind
right
behind.
Here
we've
got
people
over
there.
That
would
love
to
chat
about
influx
with
you
guys
and
get
a
better
understanding.