Description
Sponsored Keynote - Connecting Prometheus and OpenTelemetry Data for Faster Troubleshooting - Ramon Guiu, VP of Observability, Timescale
The last few years have been fantastic for observability practitioners with the growth of Prometheus as the standard for metrics monitoring and the emergence of OpenTelemetry as a standard for application monitoring. Interoperability is key for standards to be adopted and successful. In this case, these two standards can make it easier for engineers to both instrument their systems and troubleshoot problems faster. In this talk, we will show the true power of Prometheus and OpenTelemetry working together.
Hello everybody, and welcome to Prometheus Day Europe 2022. My name is Ramon, I work for Timescale, and today I'm going to be talking about correlating data from different sources, in particular Prometheus and OpenTelemetry, for faster troubleshooting.
I've been working on building observability products for the last few years, and this is a challenge I've always encountered.
As I talk to users of those products, I find they don't typically use just one tool; they use a lot of tools. If you just look at the cloud native observability landscape and all the different tools that are there (which, by the way, are not the only ones that exist; that's just the part within the community), there are a ton of them, and you are probably using more than one. Quite often the challenge is that you're collecting data and getting that data into different systems, and you have to correlate it somehow.
So the first point is that interoperability is key, and in particular this is about data: how you can get the telemetry, the metrics, logs, and traces, flowing through different systems, so you can more easily correlate the data and, with that, hopefully also troubleshoot problems faster.
Luckily, the CNCF is sponsoring and supporting two standards that have a lot of adoption and a lot of momentum. On the one side, for metrics, we have Prometheus, with its exposition format and the OpenMetrics standard. On the other side we have OpenTelemetry, for metrics, logs, and traces. Prometheus is obviously very widely adopted, and OpenTelemetry, as a standard, has a lot of momentum: there is still a lot of building happening, but it's the second most active project in the CNCF and also the second by number of contributors. So there's definitely a lot of momentum on both sides. The question is: as time goes by, most of you will probably end up having data generated using those two standards, so how do you correlate that data together? That's what I'll try to cover here. Before I start, I just want to paint a picture of what a high-level architecture of this system would look like.
So you have your services and infrastructure, and you're generating Prometheus metrics out of them, and they go into Prometheus. In this case you also store them in Promscale, which is a long-term store for Prometheus, so you can do long-term analysis and things like that. But the key thing here is that, as you start adopting OpenTelemetry as well, you'll have metrics and traces that come from OpenTelemetry, and OpenTelemetry doesn't have a backend: you have to store that data somewhere.
The first consideration is the metrics. You already have Prometheus, you know how to use it, and you probably want to have your data in there. So the first thing you have to figure out is how to convert your metrics from OpenTelemetry into Prometheus. Luckily, there is a component called the OpenTelemetry Collector that does a lot of wonderful things. It can receive data in a lot of different formats via components called receivers, then process that data (to do things like sampling or batching), and then export it to a lot of different solutions via exporters, one of them being Prometheus. What this architecture, and this configuration of the OpenTelemetry Collector, shows is the Collector taking OpenTelemetry metrics, transforming them into Prometheus metrics, and sending them to Prometheus via the Prometheus remote write exporter. For traces, it only does some processing and then still exports them in the OpenTelemetry format (OTLP stands for the OpenTelemetry Protocol). In this case, traces are being stored in Promscale, and because Promscale supports both Prometheus metrics and OpenTelemetry traces, we're storing all the data there and connecting Grafana to it so we can query it. You can query all the metrics using PromQL, but Promscale is also built on top of Postgres and TimescaleDB.
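As a sketch, a minimal Collector pipeline along these lines could look like the following (the receiver, processor, and exporter component names come from the upstream Collector distributions; the endpoints are placeholders, not Promscale's actual addresses):

```yaml
# Sketch of an OpenTelemetry Collector config: receive OTLP,
# batch, send metrics out via Prometheus remote write, and pass
# traces onward as OTLP. Endpoints are placeholders.
receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:

exporters:
  prometheusremotewrite:
    endpoint: http://metrics-store:9201/write   # placeholder
  otlp:
    endpoint: trace-store:9202                  # placeholder

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

The metrics pipeline converts OTLP metrics into remote-write samples, while the traces pipeline batches spans and forwards them unchanged as OTLP.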
So you can also use SQL to query both metrics and traces and do some interesting correlation.
Let's talk about metric and trace correlation first. One very common way, or at least the one we most typically talk about, is correlation via exemplars.
Say you have some Python code that is instrumented with both Prometheus and OpenTelemetry. Here we're creating a histogram metric to measure the duration of API requests to our service, and here we're recording a new OpenTelemetry span every time the random endpoint (the random method in that API) gets called. To correlate them using exemplars, what we do is add additional metadata when we add an observation to the histogram we created for API duration. The exemplar is this piece here: a piece of metadata, a set of attributes (in this case just one), that references data outside the metric set.
In this case, that outside data is a trace: the trace ID. When you do that and you fetch the metrics from the Prometheus endpoint of that service, this is what you get: on the left side you see the typical exposed metrics in the Prometheus exposition format.
If you enable exemplars in Grafana (this toggle here, which I believe is enabled by default), and exemplars were sent to Prometheus or to Promscale (which also supports them), it will show the data points you see here. Those are the exemplars: individual traces and how long they took. If you put your mouse over one of those dots, or click on it, you'll get this pop-up, and if you click on this button you can jump straight to the trace.
The hope here is that you're getting an example of a trace that took a certain amount of time, within the percentile or the bucket that you're looking at, and then you can see where the time is spent, as long, obviously, as that trace is representative of all the traces that fall within that bucket or percentile.
So that's the whole idea here. Instead of trying to figure out which traces were generated while this metric had these values, you can jump straight from the metrics to the traces. The other way, which is probably simpler but still really important, is correlation via labels and attributes. OpenTelemetry has the concept of an attribute, and it's basically the same thing as a label in Prometheus.
So the only thing you have to do, if your service was already instrumented with Prometheus metrics, is this: when you add traces, don't forget to also add the attributes that you're using in your Prometheus metrics; in this case, endpoint and instance. The syntax to do this is very similar (again, this is a Python example). When you do that, you can do things like this.
For example, this is a dashboard where you have a filter at the top (you're filtering by service). At the top it's showing metrics: these could be queries using PromQL, shown in charts. But at the bottom, especially the two panels on the bottom right, it's showing queries on traces, so you can actually see the performance of your service with the three golden metrics, but also the traces, and the slowest traces.
So you can jump straight into those, and maybe even into errors: there is error information in trace data, so you can see which errors are the most common. In any case, you can correlate the data visually in a dashboard. There are other things you could do in the case of Promscale, because you have SQL: you could run a query that returns all the hosts where there were traces or spans with the most errors, and then do a subsequent query, or a join, to retrieve and plot in a chart the memory consumption on those hosts, so you can try to understand if there is a problem; maybe memory is growing or peaking at some points. Going even further, you could do another join, all of that in the same query, to retrieve the exact processes that were consuming the most memory at that point. That gets you very quickly from spotting a problem to a much deeper understanding of what could be its source. So using labels and attributes is actually very powerful, especially if you can do joins on the data.
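To illustrate the shape of such a join (the table and column names below are hypothetical placeholders for illustration, not Promscale's actual schema):

```python
# Sketch: build a SQL query that joins the hosts with the most
# span errors against their memory metrics. Table/column names
# ("spans", "host_memory", etc.) are HYPOTHETICAL placeholders.
def error_hosts_memory_query(limit: int = 5) -> str:
    return f"""
    WITH error_hosts AS (
        SELECT host, count(*) AS errors
        FROM spans                      -- hypothetical trace table
        WHERE status = 'error'
        GROUP BY host
        ORDER BY errors DESC
        LIMIT {limit}
    )
    SELECT m.time, m.host, m.memory_used_bytes
    FROM host_memory m                  -- hypothetical metric view
    JOIN error_hosts e ON e.host = m.host
    ORDER BY m.time;
    """
```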
Another kind is metric-to-metric correlation. The first thing to take into account is that OpenTelemetry metrics and Prometheus metrics have different metric types, so they need to be mapped, and here you have the mapping; I won't get into it in detail. Another thing to keep in mind is that there may be some types you cannot map. An example would be OpenTelemetry's exponential histogram, which doesn't have a way to map into Prometheus metrics. That mapping is defined, by the way, in the OpenTelemetry spec, and there were a lot of discussions between the OpenTelemetry and Prometheus projects to arrive at this mapping and definition of the metrics.
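As a rough summary of that mapping (simplified from the OpenTelemetry–Prometheus compatibility spec as it stood around this talk; consult the spec itself for the authoritative rules):

```python
# Simplified, non-exhaustive summary of the OpenTelemetry ->
# Prometheus metric type mapping (circa 2022).
OTEL_TO_PROM = {
    "Gauge": "gauge",
    "Sum (monotonic, cumulative)": "counter",
    "Sum (non-monotonic, cumulative)": "gauge",
    "Histogram (cumulative)": "histogram",
    "Summary": "summary",
    # No classic-Prometheus equivalent at the time of this talk:
    "ExponentialHistogram": None,
}
```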
That's something to keep in mind: make sure you're using metric types that you'll be able to convert, so that you can map them together, especially if you're going to be storing them in Prometheus. And again, with metrics, most likely the only thing available for correlation is labels, so once more you'll correlate using labels and attributes.
So here is the same idea in code. This is the same code instrumented with a Prometheus client library on the left and with the OpenTelemetry SDK on the right, and except for the beginning, the rest is actually fairly similar. You define the metric: there's a name, and a description (or documentation, in the case of the Prometheus client library), and then you just increment the counter, adding some labels to it.
In this case we add the name of the API endpoint, which is add_product, as an example. If you do that, then you can start correlating metrics, again in a dashboard; you could be filtering data from both Prometheus and OpenTelemetry.
In this case, most of those charts (Grafana panels) are from Prometheus metrics, but the one on the top right is actually from a service instrumented with OpenTelemetry that is reporting metrics. So you can show, filter, and see in the same dashboard, for a specific service, the telemetry coming from OpenTelemetry as well as the telemetry coming from Prometheus. So, just to wrap up: tool interoperability is key.
At the moment, I'm really happy that we're seeing so much momentum, obviously with Prometheus on the metrics side, but now with OpenTelemetry as well, especially for traces, because that gives us the tooling and the foundation that we need. Then you have to think about planning: when you're doing instrumentation, plan carefully to make sure that you'll be able to correlate the data in the future, especially by using consistent tagging across signals.
Maybe think about using exemplars, and also choose your metric types carefully, so you can do the mapping correctly. Thank you very much. By the way, we have a booth just outside, so if you want to talk more about this, I'll be sitting around and happy to discuss. Thank you.