Hey everyone, my name is Joe. I'm with Grafana Labs, and today the promise was to talk about getting started with Tempo and to demonstrate an OpenTelemetry-instrumented application that supports exemplars, and we're mostly going to be doing that. But things have drifted a little bit, so this presentation is divided into maybe three sections: we're going to talk about Tempo; as promised, we're going to talk about the state of open source exemplars and why things stand where they do; and then we'll get to the demo.
So let's start with Tempo. Tempo is a new distributed tracing backend that we built at Grafana Labs with the goal of sampling 100% of our read path. At the time we were sampling maybe 15% or so of our read path, and we often had long-running queries that we wanted to diagnose.
We wanted to do this, but unfortunately, with our previous backend, we would have had to scale Cassandra or Elasticsearch to the point where the cost in memory and CPU, the operational cost, the cost to my sanity, would have been far more than was worthwhile. The size of that cluster would have meant my whole job was managing Elasticsearch or Cassandra, which is not what I want my job to be. I want my job to be building Tempo.
So we built Tempo, and the difference here is Tempo's dependency: the only dependency of Tempo is object storage, so S3, GCS, or Azure, and these are of course cheap to manage. I mean, there's essentially no management, right?
These are managed services; it's very cheap to store very large amounts of data in them and to use them. So the goal here is basically to build a tracing backend on object storage only. The trade-off, for now at least, is that Tempo can currently only search by trace ID. The goal, at least for upcoming versions of Tempo, is to support some sort of native search, but right now we can only do trace ID search.
So you are only going to be able to ask the question: give me the trace for this trace ID. This may seem limiting, but at Grafana we've found many different ways to get around this, and we'll talk about those in a second. Maybe not even "get around": ways that we feel are very powerful for finding traces without requiring native search in your backend.
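For a sense of what trace-ID lookup looks like in practice, Tempo exposes it as a single HTTP GET; a minimal sketch, with a made-up host, port, and trace ID:

```
curl http://tempo:3200/api/traces/2f3e0cee77ae5dc9c17ade3689eb2e54
```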
Tempo currently supports all the major open source instrumentation libraries: Jaeger, OpenCensus, OpenTelemetry, and Zipkin. So if you're already instrumented with any of these, you can immediately start using Tempo without issue. We put things in S3 or GCS, and also Azure; this slide was made before Azure support was added, and I wouldn't even recognize these symbols anyway, but the red one is S3 and the blue one is GCS. And we visualize everything in, shockingly, Grafana. Okay, on to discovery, since you can only look up by trace ID.
We were often going through our logs to find our traces, so we would be logging the trace ID on the same line as everything else. This is standard request/response-style logging, nothing new or weird about it: log the trace ID, the HTTP method, path, latency, status code, a bunch of common fields. Those fields have now effectively built an index into our traces.
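For illustration, a single request log line of that shape might look something like this (logfmt-style; the field names are just an example, not a required schema):

```
level=info ts=2021-02-04T17:32:01Z traceID=2f3e0cee77ae5dc9 method=GET path=/api/v1/users status=200 duration=2.31s
```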
Exemplars are the new upcoming feature that we're going to talk about in a bit here, the state of exemplars. This feature is a way to discover traces through your metrics as well. We'll talk more about what that means in a little bit. So: trace ID search only in Tempo.
Currently we use logs very effectively for discovery, and we'll look at that in the demo; hopefully in the future we'll be using exemplars. So where are we at? Around 380,000 spans a second, and it fluctuates more than I care to admit. I wish it didn't fluctuate at all; I wish I could just push this higher and higher. But we're at 380,000 right now, and at a little under 7,000 traces a second.
If you do the math there, that works out to roughly 55 spans per trace, which I should have worked out before the camera was rolling. But anyway: about 380,000 spans per second, about 7,000 traces per second, and our latencies are good. I'm very happy with this. Certainly we can always push this down.
In fact, recent additions to Tempo include the ability to scale the query frontend, to scale the query path, so we could actually reduce this quite a bit if we wanted to. But right now we're querying over 4 billion traces and our p50 is around 400 milliseconds, which I'm very happy about.
You can also see the p90 is right around 500 milliseconds, and the p99 occasionally reaches up to one or two seconds. So I'm very happy with these latencies; there's always something to improve, of course, but I think this is well within operational expectations for a tracing backend. Now, the architecture of Tempo. This is a little detailed and we won't get too far into it, so don't be afraid not to understand all these pieces, but it's architected roughly like Loki or Cortex, if you've spent time with those open source projects. So we have a distributor.
The distributor handles the replication factor and pushes our traces to ingesters. The ingesters then batch those traces up into blocks, and those blocks are pushed into our storage backend, into S3 or whatever. We have this idea of the compactor over here on the side, and we are currently flushing something like 400 blocks an hour. The compactor takes those small blocks and builds larger and larger blocks.
Right now we're doing, I'd say, 400 blocks an hour, 24 hours a day, and we currently have a retention of two weeks. So 24 times 14 times 400, about 134,000 blocks, would be our total blocklist length without the compactor, which would require a lot of time to search. The idea of compaction is basically to keep the length of this blocklist as short as possible in order to improve query performance. And then there's the query path.
We have this thing called the querier; its job is to ask the ingesters for recent traces and also check the backend. We have the query frontend, which handles parallelization and sharding of queries. And this tempo-query piece is hopefully going to go away soon: it's a shim to translate to Jaeger, which is the only tracing backend that Grafana can handle right now, so that's why we're using it for the near future.
Okay, so that was a lot of things, and you don't have to know all of them to just get started, and this is supposed to be a getting-started talk, so check out these links here. The single-binary deployment is important: a way to deploy Tempo not as a distributor and a querier and all these crazy pieces, but as a single binary, to get started and to understand what it does and how to configure it. There are tons of examples in the docker-compose folder you'll see listed here, and we have Helm options.
There's a very simple Helm chart that I made that I think the Helm community would hate, but it's kind of the way I would do a Helm chart, and there's a PR open now for a more official, more robust Helm chart that I think meets the expectations of people who regularly use Helm. There's also a blob of jsonnet to deploy with. So there are all of these different deployment options; hopefully you can dig into them.
Look at the YAML, look at the docker-compose files, get a feel for what configuration looks like and how to deploy this thing, so you can get started working with Tempo on your own.
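To give a flavor of what that configuration looks like, here is a rough single-binary sketch modeled on the examples in the repo; the exact fields and defaults may differ, so treat the docker-compose examples as the source of truth:

```yaml
server:
  http_listen_port: 3200

distributor:
  receivers:            # accept spans in any supported protocol
    jaeger:
      protocols:
        thrift_http:
    otlp:
      protocols:
        grpc:

storage:
  trace:
    backend: local      # s3, gcs, or azure in production
    local:
      path: /tmp/tempo/blocks
```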
Okay, so Tempo: it's our tracing backend; discovery happens through logs and exemplars; and high volume is the goal here. But exemplars are something we really want at Grafana, and we need to talk about where those stand, because part of this presentation was supposed to be demonstrating them with OpenTelemetry.
We need to talk about why we can't actually do that just yet. First: exemplars are a record of a single request, an instance of a single request, that was then aggregated away to create a metric. The power of metrics is this aggregation, right? I can aggregate a hundred, a thousand, ten thousand, however many requests into a very simple number, a single floating point, and if I aggregate it all away, I can query these extremely quickly.
I can store them extremely cheaply and provide very powerful visualizations of my infrastructure. That is the power of metrics. But what's lost is the individual instances, the individual requests, and exemplars aim to complete that picture: to display the aggregation and give you all that power, while at the same time letting you drill in and find a single instance of a request that was used to create that aggregation.
So what's going on right now? Well, exemplars are defined in OpenMetrics; the OpenMetrics spec has a defined standard for them. But that is not currently supported by OpenTelemetry, and in some of the issues I've read, they're not requiring it for GA. So I really just don't know exactly what timeline OpenTelemetry is looking at to support exemplars specifically. I believe they want full support for OpenMetrics, so we are excited to see that, but it's not quite there yet.
What about the Prometheus client libraries? The Prometheus client libraries support OpenMetrics generally, so where do they stand? Go and Python are ready to go. If you're using either of those instrumentation libraries, exemplars are available to you now: you can expose exemplars in an OpenMetrics-compatible format, or the literal OpenMetrics format, which Prometheus will then scrape, store in its backend, and make available to a visualization layer.
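To make that concrete, here's a minimal sketch of exposing an exemplar from Go with client_golang; the metric name, label key, and trace ID are made up, and in a real application the ID would come from your tracing context:

```go
package main

import (
	"math/rand"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	reg := prometheus.NewRegistry()

	// A hypothetical request-latency histogram.
	latency := prometheus.NewHistogram(prometheus.HistogramOpts{
		Name: "request_duration_seconds",
		Help: "Request latency in seconds.",
	})
	reg.MustRegister(latency)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// In a real app this would be the trace ID of the current request.
		traceID := "2f3e0cee77ae5dc9"
		// Record the observation with the trace ID attached as an exemplar.
		latency.(prometheus.ExemplarObserver).ObserveWithExemplar(
			rand.Float64(), prometheus.Labels{"traceID": traceID},
		)
		w.WriteHeader(http.StatusNoContent)
	})

	// Exemplars are only rendered in the OpenMetrics exposition format,
	// so it has to be enabled explicitly on the /metrics handler.
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{
		EnableOpenMetrics: true,
	}))
	http.ListenAndServe(":8080", nil)
}
```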
Java and Ruby have issues open, and the unofficial .NET client also has an issue open. There are a lot of other OpenMetrics clients out there, and if your language of choice is not Go or Python, if you're using Prometheus with the .NET or Java library or some other one, I encourage you to get into those repos and request OpenMetrics support.
Let the maintainers know these things are important to you, and hopefully we can all get exemplar support into these various client libraries. In the demo today I'm going to use Go, because Go already supports exemplars and it'll be significantly easier. But what about our backend, and what about our frontend? In Prometheus, there are two pull requests that need to be merged for complete exemplar support. The first is in-memory support.
The first PR, when it is merged, will add an in-memory ring buffer, an ephemeral store of exemplars. They're held only for a short amount of time, determined by how much you're scraping and the amount of memory you give to the ring buffer; they're stored internally only, and when they're thrown away they're just dropped.
The second PR adds support for exemplars in the WAL, to store them permanently and then to remote-write them, to push them to a backend. After that second one is merged, you would hopefully see support for exemplars start coming out in backends like Cortex or Thanos, or some of the other long-term Prometheus storage backends we're using. So with the first PR, those of us who use just Prometheus will immediately be able to use exemplars.
With the second PR, for those of us who use Prometheus in combination with a permanent long-term storage backend, those backends can start recording, storing, and exposing exemplars. Okay, so what about Grafana? Support for exemplars is in the tip of master of Grafana right now, so it's coming soon, actually soon, for real soon.
I
can't
commit,
of
course
anything
I
don't
work
on
the
grafana
team
directly.
I
don't
def
with
them.
I
don't
do
any
of
their.
A
You
know
milestones
or
project
planning,
but
hopefully
maybe
like
in
7.5
or
some
soon
near
future
release.
We
should
see
example.
Our
support,
like
I
said
right
now
it
is,
it
is
tip
of
master
merged
already,
which
is
fantastic.
Okay, so let's get to the demo and talk about instrumentation a little bit. The goal here again was to do OpenTelemetry everything, and OpenTelemetry has a really compelling and very powerful server and client instrumentation setup. It's very easy. I've got my server here, right? I just set up a new handler and wrap my actual handler, so I have some HTTP handler.
I just wrap it in this otel handler, then I serve the otel handler out of my server, and then magic happens: it instruments my HTTP server for me. Same with the client side. In this case I'm just replacing the transport with an otel transport, and my HTTP client is now instrumented.
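Here's roughly what that wrapping looks like in Go with the otelhttp contrib package; this is a minimal sketch with placeholder handler names, and it omits the tracer-provider boilerplate mentioned below:

```go
package main

import (
	"context"
	"net/http"
	"time"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

func main() {
	// Server side: wrap the application handler so each request gets a
	// span and any incoming trace context is extracted automatically.
	appHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello"))
	})
	http.Handle("/hello", otelhttp.NewHandler(appHandler, "hello"))
	go http.ListenAndServe(":8080", nil)
	time.Sleep(100 * time.Millisecond) // give the demo server a moment to start

	// Client side: swap in the otelhttp transport so outgoing requests
	// get spans and the trace context is injected into request headers.
	client := http.Client{Transport: otelhttp.NewTransport(http.DefaultTransport)}
	req, _ := http.NewRequestWithContext(context.Background(), "GET", "http://localhost:8080/hello", nil)
	if resp, err := client.Do(req); err == nil {
		resp.Body.Close()
	}
}
```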
In this case I'm worried about tracing, so context will be propagated correctly from client to server and everybody will be happy. This all works nicely with very few lines of code. Of course, there's also boilerplate setup code to initialize the tracing libraries and get things set up, but for just using an HTTP client and setting up a server, it's a pretty tight set of clean code. But OpenTelemetry doesn't support exemplars yet, so here we are.
So this is Grafana, a build off the tip of master, which is why it's not something you just pull as a Docker container or just install; it's not GA or anything. Like I said, hopefully support for this will exist soon. I'm querying Prometheus, and this Prometheus is an image from Callum's branch; Callum Styan is the author of the two PRs I mentioned.
So
I'm
asking
like
this
a
normal
can
I
maybe
I
should
zoom
in
a
little
bit,
maybe
a
little
bit.
So
this
is
a
you
know:
normal
histogram,
prometheus,
query,
p99
and
what's
being
returned
along
with
the
normal
return.
The
normal
return
is
the
metric,
and
this
is
actually
a
little
crowded
at
the
moment,
but
you
can
see
these
exemplars
as
well.
So
I
see
my
trend.
I
can
also
go
over
here
and
click
on
a
exemplar.
I
can
choose
one
of
these
and
I
can
immediately
jump
over
here
to
a
trace.
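The query itself is just a standard quantile over a latency histogram; for a hypothetical metric it would look something like this:

```promql
histogram_quantile(0.99, sum(rate(request_duration_seconds_bucket[1m])) by (le))
```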
So I have my metric, and that metric is a ton of requests all aggregated together. I can then use this new exemplar support in OpenMetrics, and hopefully very soon in Prometheus, to store an example of a single request, and then here I can dig into my trace; this is all normal distributed tracing for those of us who have played around with it. Cool. So exemplars are this new upcoming feature, and we've talked about all the different places where they will hopefully be available soon.
Cross your fingers. Like I said before, Tempo also depends on this; excuse me, it depends on logs for discovery.
So this is how we actually discover traces right now, and this is a Loki query. As discussed... hmm, it doesn't find anything at the moment. As discussed, this is Loki, but this is compatible with more than that.
Is it there? Oh yeah, okay, we were just in a weird spot. So this is compatible with Elasticsearch or any logging backend where you can build a link from an ID; this is going to work, so there's no dependency on Loki or anything like that. But to show you what we do in Grafana: you can see here I have the path recorded, and I have the latency down here.
I have the trace ID, and I probably should have other things too, of course, like the HTTP verb, status code, and other information. If I put all of these on a single log line, I can then do some really clever queries, like the ones sketched below. Let's look for something over two seconds, maybe. Yeah: so now all of these traces are greater than two seconds, and I'm now looking at some long-running traces, so I can diagnose some kind of latency issue.
If I were more clever and had, let's say, method equals GET, I could do another pipe and do something like status equals 500 if I were interested in failed queries. So with some careful logging, and this is normal HTTP request logging, which a lot of us already have, I can build basically an index into my traces that lets me do advanced searches and also discover traces.
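Against a Loki backend with logfmt-style lines like the one shown earlier, those queries might look roughly like this; the label and field names are illustrative:

```logql
# traces slower than two seconds
{app="myapp"} | logfmt | duration > 2s

# slow, failed GETs
{app="myapp"} | logfmt | duration > 2s | method="GET" | status == 500
```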
So let me jump over here, boink, and I should be able to get a trace out of this log line, and it's greater than two seconds, like I asked for. Cool.
Okay. So, Tempo is this new distributed tracing backend, like I said, designed for high volume, extreme volume, and designed to be inexpensive to run: put everything in S3 or GCS, and don't bother with complicated backends.
Discovery through logs allows us to store traces super cheaply, super inexpensively, and then soon, hopefully, we're going to see exemplars like the ones we dug into, which are only supported by some clients right now.
Exemplar support for Prometheus has two PRs up; once they're merged, we'll have complete support. Grafana has support in the tip of master. So we're real close to the point where we should start seeing open source exemplars in all of our favorite metrics tools. We also used OpenTelemetry for the demo application, so if you want to dig into the code, we showed some example code there, and we also had to use OpenMetrics.