From YouTube: Tempo Community Call 2022-02-10
Description
- Discussion of Backend Search with Serverless Architecture
- Metrics generator updates
- Pyroscope devs demonstrate linking traces and profiles
A: All right, so welcome to the Tempo community call, whatever month and year it is, February 2022. I believe we've got a bunch of stuff to talk about today, including a group from Pyroscope, which is awesome. We're interested to hear about some continuous profiling details as well; I think they're working on some kind of trace-to-profile linking, and I'm personally really excited to see what they're working on and to get an idea of what this looks like.
A: But before we get started there, we're going to talk about our latest Tempo release from this past month. We're going to get into search and some more serverless discussion, how we're using serverless at Grafana, and then we're going to talk about the metrics generator again, where we are on that; we're making progress there as well. So with the releases, I think Zach is going to take a look at that. Oh, we have a presentation! Somebody put a presentation together, didn't they? Was it Mario?
B: Nice, yeah, I like to see presentations. On the release front, I just checked the doc; it looks like we did cover the 1.3.0 release last time, so I won't belabor that here. Do check the release notes, though: there are a few breaking changes that are worth pointing out. So yeah, take a look at that.
B: We did have a 1.3.1 release since, which is a simple bug fix for a case where using etcd as the key-value store could cause a panic. 1.3.1 will fix that, so you'll want to upgrade if you are using etcd. And that's all we have for the releases right now.
A: This is the super professional and well-produced Tempo community call.
A: Yeah, that's what Zach was talking about: 1.3.1, which we released a week or two back. And then, right, did we talk about the breaking changes? Zach, did you detail some of those?
B: Yeah, we did cover it in the previous meeting, I was just checking the notes, but we can talk about that a little bit if you want.
B: There are three breaking changes in the 1.3.0 release. One of them is that the OTLP gRPC receiver default port changed, so you'll want to make note of that: firewall rules, policies, etc.
B: So if you look in recent versions of your /etc/services file, you will see OpenTelemetry listed there, which is super exciting for them. Also a config change: there are two parameters that moved from the querier config to the query-frontend config. You will want to make those updates if you are using search max result limit or search default result limit; both of those live on the query-frontend now. And then also, we removed a deprecated gRPC endpoint on the ingester and its associated data encoding.
B: This was deprecated back in 1.0, so if you're running earlier versions, you need to upgrade to 1.0 first and allow the blocks to be rewritten with the new encoding. So that's the three there, and then we've got a whole pile of features, so I would highly recommend checking out the release notes for the remaining ones.
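For anyone making that change, here is a minimal sketch of the config move, assuming the parameter names from the 1.3 release notes; verify the exact spelling and nesting against the changelog before copying:

```yaml
# Pre-1.3: these limits lived under the querier block
querier:
  search_max_result_limit: 100
  search_default_result_limit: 20

# 1.3 and later: they move to the query-frontend block
# (exact nesting per the 1.3 changelog)
query_frontend:
  search_max_result_limit: 100
  search_default_result_limit: 20
```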
A
Cool
thanks,
in
particular,
130
included,
back-end
search
for
the
first
time,
we're
going
to
talk
a
little
bit
about
that
now
kind
of
what
we
have
going
on
for,
like
this
full
scan
of
our
gcs
or
s3,
or
whatever
object,
storage
and
we've
already
had
some
people
experimenting
with
the
cool
which
is
cool.
I
think
that
really
points
towards
how
excited
everyone
is
to
be
able
to
search
their
full
background
with
tempo
instead
of
using
logs
or
exemplars
to
jump
into
traces.
A: But there are going to be some hiccups here, and there's going to be some config we have to talk about. So let's talk about what we're doing in search, what we're doing at Grafana, and hopefully we can help you all.
A: Hopefully we can help you all get your search configured correctly. So go ahead, Mario, if you want to move forward.
A: So, that serverless help: I think Mario copied what was in our doc. If you look in the community call doc, there's a link to our docs, just our standard Tempo documentation, and it covers some of what we're going to go over today, real quick here: some of these config options that you might need, to help you get started with serverless and get your backend search working with serverless, which is what we're using at Grafana.
A
Also,
oh,
that
is
totally
wrong.
It's
not
grafana386!
I'm
sure
I
typed
that
wrong.
It
is
graffana,
836
nope,
sorry,
mario,
you
got
it.
That
was
you
bud.
Grafana
836,
which
was
just
released
yesterday,
supports
back-end
search.
We
need
to
update
our
docs,
I
think
we
say
a
future
grafana
version,
but
we'll
update
it
to
say
836.
A: And you need to include, and I'll include this here, a feature flag, backend Tempo search. So we're still marking this as beta, and the reason we're marking this as beta is that it's just not at the performance we want for our level of ingest, which is about 250 MB/s. I think once we reach that scale, we'll feel comfortable dropping that tag and saying it's, you know, GA. We also just want it to be a little easier to configure.
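Enabling a Grafana feature toggle looks roughly like this in grafana.ini; the toggle name below is a best guess at the flag being described, so confirm it against the Grafana 8.3.6 docs:

```ini
[feature_toggles]
; assumed name for the beta Tempo backend search toggle
enable = tempoBackendSearch
```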
A
It's
still
a
little
bit
gorpy
in
that
world
and
I'll
show
you
some
of
that
and
talk
about
why
it's
a
little
rough
right,
serverless
help,
and
then
we
also
added
lam
to
support
since
131
was
cut.
A: So this is kind of the architecture, the way we're searching, or the way we're making use of serverless. Currently, when you do a trace-by-ID search, the query frontend hits the queriers, the queriers ask the ingesters, and then they search the GCS bucket directly. With serverless, if you want to use serverless for search, instead of digging directly into the GCS bucket, we query a serverless endpoint; it's just an endpoint that you configure using this querier search external endpoints option.
A: So if you want to use serverless, you have to configure this value, which tells the querier not to do the search on its own, but instead to offload the work. With this configuration point, your querier is basically a proxy when you're doing these time-range searches with different criteria, like when you're looking for particular tags or durations or whatever.
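Concretely, the option being described looks roughly like this; the field name follows the Tempo serverless docs of the time, and the endpoint URL is a placeholder:

```yaml
# Hedged sketch: hand backend-search jobs to external serverless endpoints
# instead of having the querier scan object storage itself.
querier:
  search_external_endpoints:
    - https://us-central1-<your-project>.cloudfunctions.net/tempo-search  # placeholder URL
```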
A: So the query frontend builds up a whole bunch of jobs: it scans the bucket, it knows how many blocks need to be searched, and it creates a giant queue of these jobs. The queriers take the jobs one at a time; some of the jobs say go look in the ingesters, and some of the jobs say go query a certain time range in the backend.
A
If
you
don't
have
this
external
endpoint
configured
that
search
the
back
end,
the
query
will
do
it
itself,
you
don't
need
serverless,
so
the
queriers
are
set
up
to
completely
search
the
background
on
their
own,
but
at
certain
scales-
and
if
you
want
certain
like
latencies,
then
you
basically
have
to
have
some
kind
of
burst
capacity
which
we're
using
serverless
technologies
to
do
so.
These
jobs
that
say
go
scan.
The
back
end
hit
the
query.
A
If
that
search
external
endpoint
is
configured,
it
will
just
hit
the
endpoint,
it
becomes
a
proxy
for
the
query
and
then
you're
kind
of
offloading
this
into
a
serverless
technology
like
lambda
or
google
cloud
functions,
so
that's
kind
of
like
the
key
config
option
that
turns
on
serverless.
It
changes
your
query
from
attempting
to
do
the
work
itself
to
offloading
it
to
something.
A: Cool. Why don't we go ahead and give you some numbers about what we're really trying to hit before we drop this beta flag. We're doing 250 MB/s of ingest; because we use replication factor 3, that actually turns into 750 MB/s written into GCS. We attempt to compact that away, but that's kind of the search space at time zero.
A
Essentially
it's
three
times
what
we're
ingesting
because
of
our
replication
factor
and
then
over
time
that
kind
of
gets
reduced
as
we
can
pack
some
of
that
away,
but
still
it
is
kind
of
makes
a
challenge
for
recent
searches.
A: We have 202 query frontends, and if you make a request for a time range of a single hour, regardless of your conditions, it creates about 38,000 jobs: 38,000 small jobs for the queriers to do, each one like "scan this small range of a block", and either the queriers are doing those or the serverless function is doing those.
A: We have 50 queriers, and these queriers act as proxies for the time-range search. For the trace-by-ID search they still do the work, but for the time-range search, the backend search, they become this proxy, and we're seeing spikes to about 2,000 serverless functions when we're doing this search. So it starts at zero, of course; you see a graph down there of the count of active serverless functions in GCP.
A: It's sitting at zero, of course, and then you start asking questions, you start doing these time-range searches. The query frontend hits the queriers, the queriers start slamming the Google Cloud Functions, which then scale up very rapidly, and you're able to service your query, hopefully pretty quickly. And this is kind of the current bottleneck.
A
We're
hitting,
which
is
gcs-
is
really
only
allowing
us
five
to
eight
thousand
or
so
queries
per
second,
and
that's
what
we're
really
trying
to
move
past
right
now
is
and
it
kind
of
generally.
We
want
to
be
agnostic
about
our
backends
and
just
kind
of
think
of
that
about
them
as
generic
object
storage,
we
don't
care
we're
writing
blocks
to
them,
we're
carrying
back
it
doesn't
matter
but
to
really
push
the
limits
of
these
back
ends.
We
find
that
we
are
going
to
have
to
get
into
some
of
the
details.
A
And
how
do
we
push
it
harder
because
all
our
entire
infrastructure
for
search
is
capable
of
doing
this
250
mega?
Second
search,
everything
is
set
up.
We
just
get
bottlenecked
on
gcs
gcs
won't
we'll
just
start
throttling
our
queries.
It
starts
returning
429s
and
other
error
codes
indicating
it
just
won't
handle
any
more
requests.
A: One more slide, two more slides, question mark? Yeah, okay. This one's kind of a super boring slide, but it shows you some of the config points that you're going to have to look at if you start dealing with these huge jobs. So if you're doing 100 to 200 megabytes a second, you're going to have tens of thousands of jobs and you're going to need to make some of these changes. They're documented at that link in the Tempo community call doc, so don't feel like you have to digest this all right now.
A: So our read and write timeouts are up quite a bit, of course, because these become more like batch jobs, which is not the feel we want, but it's just the way it is right now. We have our HTTP timeout set at two minutes on the query frontend, and then we have to massively increase the number of jobs the query frontend will queue, so you'll see there's this max outstanding per tenant setting.
A
Where
I
think
it's
only
like
50
right
on
trace
buyers
to
d
search
or
100,
but
we're
going
to
like
10
000,
basically
100
000,
because
we
make
38
000
jobs
and
we
have
to
put
all
of
those
into
queues.
So
we
have
to
increase
that
quite
a
bit.
There
are
some
other
queue
kind
of
configuration
options
there
and
then
the
query,
also
by
default,
we'll
only
try
a
few
queries
at
a
time
and
we
need
to
up
that
massively
as
well,
because
the
query
becomes
this
proxy.
A
It
no
longer
is
doing
the
work
so
instead
of
the
default,
which
I
think
is
only
like
two
or
five,
which
was
a
default
setup
for
query
by
tr
or
traced
by
id
search,
you
have
to
massively
increase
that
value
as
well.
So
max
concurrent
queries,
I
think,
on
ours
are
like
a
thousand.
Each
of
our
queries
will
handle
a
thousand
queries
at
once,
but
handle
a
thousand
queries
means
proxy
these
to
serverless.
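Pulled together, the tuning described above looks roughly like the following; the field names match Tempo's config of the time as best we can tell, and the values are the ballpark figures from the talk, not recommendations:

```yaml
server:
  http_server_read_timeout: 2m         # backend-search requests run long, more like batch jobs
  http_server_write_timeout: 2m
query_frontend:
  max_outstanding_per_tenant: 100000   # default is around 100; must hold tens of thousands of jobs
querier:
  max_concurrent_queries: 1000         # default is single digits; each "query" is just a proxy hop
```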
A
Basically,
so
these
config
points
are
what
you
have
to
kind
of
hit
inside
of
tempo
to
unlock
tempo
and
unlock
all
the
bottlenecks
inside
of
tempo,
at
which
point
the
bottlenecks
become
your
back
end,
I
suppose,
but
we're
kind
of
trying
to
figure
that
out
right
now
and
then.
Finally,
I
have
just
a
quick
blurb
there
we
support
gcp
and
aws,
and
it's
a
simple
make
command.
This
just
makes
a
zip
file.
A: That's it for me. Okay, so serverless: we're very excited about it, but we're kind of just learning it as we go. We feel like for these extreme, high-load environments it's going to take a little more nuance, maybe a little more in the way of data structures on the backend, to get the speeds that we want, but it is functional and available now, and we use it internally. We're also rolling this out in cloud.
A
Our
volumes
are,
you
know
our
per
tenant
volumes
are
at
a
level
where
we
feel
comfortable
rolling
this
out.
So
all
of
our
cloud
customers
are
going
to
have
access
to
this
to
their
full
backend
search
and
we're
going
to
be
iterating
on
this
in
the
next
months
years.
Frankly,
to
improve
performance,
add
features,
kind
of
build.
This
query
language,
we're
going
to
start
talking
about
and
other
things
cool,
so
search
is
in
a
good
spot.
A
If
anybody
has
any
questions,
you're
welcome
to
ask
now
for
serverless.
I
there's
been
quite
a
few
questions
in
slack
and
in
the
github
issues
about
how
to
set
this
up
and
other
concerns
people
have.
If
you
have
questions
you're,
welcome
to
ask,
or
we
can
say
that
for
the
end
you
can
ask
in
the
chat
or
you
can
unmute.
If
that's
your
style.
F: That's very cool, Joe. I have one question: how did we get past the cold start problem in serverless? I saw, in the graph of the number of jobs that you showed us, it was at zero and then it would massively scale up to a couple of thousand. How did we get past that cold start?
A
Honestly,
cold
start
really
hasn't
been
an
issue
with
gcp
cloud
functions.
I've
kind
of
had
my
eye
on
it,
but
the
bigger
bottleneck
right
now
is
the
gcs
requests.
A
Another
field
I
set
was-
and
I
I
think
I
referenced
this
in
the
docs
in
the
docs
that
are
linked
in
the
tempo
community
call,
but
there
is
in
google
cloud
functions.
There's
like
this
min
instances
you
can
set,
so
it
will
always
keep
one
alive,
and
I
did
see
that
making
a
substantial
difference
in
cold
start
just
telling
it
to
keep
a
single
one
around
seemed
to
be
meaningful
over
zero.
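As a hedged illustration, on the gcloud CLI that knob is the --min-instances flag; the function name and region here are placeholders, and flag availability depends on your gcloud version:

```sh
# Keep at least one warm instance around to soften cold starts
gcloud functions deploy tempo-search \
  --region=us-central1 \
  --min-instances=1
```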
A
Basically,
something
else
I
noticed
was
if
I
do
a
code
update
and
I
push
a
new
package
and
then
I
update
my
function.
A
The
very
first
query
after
that
update
often
took
a
long
time
like
there
was
a
serious
cold
start
issue
there
versus
you
know,
even
if
I
waited
like
a
day.
The
second
query
on
that
same
you
know,
function
was
significantly
less
so
that
very
first
query
seemed
to
have
the
worst
cold
start
issues,
after
that
it
did.
Okay,
so
certainly
with
serverless
functions.
I
think
you're
always
going
to
be
trading
off
right.
A
Some
kind
of
cold
start
latency
in
exchange,
for
you
know
this
huge
kind
of
extreme
burst
capability,
and
I
think
that's
just
always
going
to
be
like
a
challenge
when
we
use
this.
A: Yeah, yeah, the worst one is that one. We should set up a job that just curls it as soon as we roll it out or something, right? The instant we roll out a new version, just curl it, so we take the hit and a customer doesn't. I don't know, it's a good idea. Cool, all right, I'll hand this over to Mario; we've been doing a lot of work on the metrics generator.
A
We've
been
talking
a
lot
about
this
in
the
community
calls,
and
I
think
we
have
some
updates
in
that
world
as
well.
C: Right, yeah. I mean, we gathered some feedback, and I think we were happy with the point we were at; we have a good base to build a solution on. So the design proposal has recently been accepted and will be merged soon, and we encourage everyone to take a look at it. Even though we're accepting it and merging it, please feel free to ping us on Slack.
C
If
you
want
to
chat
about
it,
comment
on
the
the
same
proposal
open
an
issue
open
up
your
against
the
design
proposal.
It's
not
definitive
and
written
in
stone
yeah.
We
will
be
iterating
the
project
and
and
making
changes
as
as
we
built
it
so
yeah.
We
appreciate
all
the
all
the
feedback
that
we
can
get
so.
C
Yeah
well,
the
next
slide
is:
maybe
you
should
have
come.
First
is
so
what
is
actually
the
metrics
generator
for
those
who
don't
know
just
to
refresh
everyone's
memory?
So
the
metric
generator
is
a
new
component
that
we
have
designed
to
solve
this
project,
and
this
new
component
is
used
to
derive
metrics
from
ingested
traces.
C: Initially we have designed two kinds of metrics that we're deriving. One is what we call span metrics, which are basic metrics from spans.
C
Essentially
every
span
is
equal
and
yeah
we're
trying
to
start
extract
as
much
as
as
much
relevant
data
from
his
pants
as
possible,
and
the
other
one
is
service
graphs,
which
is
this
feature
that
we
built
for
the
graphene
agent
in
which
we
inspect
traces
and
try
to
match
expands
as
requests
from
the
client
and
server
perspectives.
C
With
that,
we
can
build
metrics
that
represent
how
services
communicate
and
we're
able
to
display
like
a
complete
graph
of
a
distributed
system
and
in
grafana
breeze
based
from
those
metrics
yeah.
I
mean
this
was
the
quick
introduction
to
the
projects,
but
if,
if
you
want
to
discuss
more
dive,
more
into
technical
details
about
or
have
any
question
feel
free
to,
ask
we're
happy
to
to
go
into
them
and
finally,
well
almost.
Finally,
what
are
the
next
steps.
C
At
the
same
time,
we've
been
working
on
the
design
proposal.
We've
been
building
an
initial
implementation
based
on
that
designed
and
yeah
we're
opening
a
pull
request
with
this
initial
implementation.
We
also
encourage
everyone
to
take
a
look
and
and
and
give
any
feedback
it's
always
appreciated
and
as
for
when
this
will
be
released,
we're
targeting
tempos
1.4
release.
C
So
I
will
say
a
couple
of
months:
it's
a
fur
estimate
and
yeah
also
to
to
announce
that
as
well
as
on
this
being
on
open
source
tempo,
we'll
be
integrating
it
to
grafana
club
traitors.
C
G: Yeah, I'll quickly share my screen so I can do a quick demo of the metrics generator and how we are running it right now. We have the metrics generator deployed in a development cluster internally; it's currently just generating metrics all the time, and we're using this as a platform to try out our new code and see how it works. So, just to give everyone an idea of what these metrics look like, I can show them and the different graphs you can get from them.
G: These can also be generated by the Grafana Agent, but in this case we're now generating them with Tempo itself. So Tempo is ingesting traces and also generating these metrics at the same time. And just to give a quick example of what you can do with them: the span metrics are great for seeing historical data about your spans. These metrics are an aggregated, high-level view over all these individual traces.
G: So, to give an example, this is a histogram quantile, the p95 of the span metrics. I trimmed it down: I only show the spans whose name starts with "HTTP GET tempo api", so that way I only show spans that are part of the Tempo API in this case, and then you can see them here. In yellow, those are the trace-by-ID requests, so all the requests to fetch a trace; in green, those are the search requests; and then you can see what kind of pattern is there.
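The query on screen is, roughly, something like the following PromQL; the metric and label names assume Tempo's default span-metrics naming, so treat them as a sketch:

```promql
# p95 latency of Tempo API spans, computed from the generated span-metrics histogram
histogram_quantile(
  0.95,
  sum by (le, span_name) (
    rate(traces_spanmetrics_latency_bucket{span_name=~"HTTP GET /tempo/api.*"}[5m])
  )
)
```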
G: Is it slowing down? Is it slower than usual? That kind of stuff. What's also really cool: we're generating metrics from traces, so we have the full trace information at the moment we generate the metrics. That makes it very easy to inject exemplars, and so all these little green dots you can see...
G
Those
are
examples
that
we
added
to
the
metrics
about
the
traces
that
these
metrics
are
based
upon,
and
so
what
you
can
do
is,
for
instance,
see
oh
there's
a
slow
request
here,
that's
kind
of
weird:
you
can
see
the
labels
of
this
request
and
then
you
can
jump
to
tempo
to
carry
this
trace
and
then
you
can
immediately
see
like
oh,
why
was
this
request
slow?
G
Maybe
it
was
hanging
somewhere,
whatever
you
can
continue
exploring,
and
what
would
be
really
cool
is
something
we
will
try
to
implement
in
grafana.
Is
that
if
you're
able
to
go
from
a
span
back
to
spam
metrics
so
that
you
can
go
in
the
opposite
direction,
that
you
can
select
a
span?
And
you
might
wonder
like?
G
So
those
are
the
spam
metrics
and
the
other
set
of
metrics
we're
generating.
Are
the
service
graphs
and
so
the
service
graphs?
These
metrics
extract.
G
You
know
information
about
how
your
services
are
linked
together
and
how
the
requests.
You
know
how
many
requests
there
are
between
each
service
and
and
the
latency
between
those.
So
what
you
can
see
here
is,
for
instance,
we
are
collecting
traces
from
from
loki,
and
this
is
like
how
the
loki
components
are
talking
to
each
other.
G
So
there's
a
gateway,
a
distributor
adjuster
and
you
can
see
the
arrows
of
how
the
requests
are
flowing
to
the
system,
and
you
can
also
zoom
in
and
then
see
like
how
many
requests
this
node
is
receiving
every
second
and
also
kind
of
the
latency
and
stuff
like
that.
This
isn't
the
final
ui,
because
it's
a
slightly
older
version
of
grafana,
but
in
a
new
version,
there's
also
the
option
to
see
a
histogram
and
to
jump
to
the
histogram
like
in
a
separate
panel.
So
you
have
the
full
graph
view
there
yeah,
so
yeah.
C
All
this
is
improved
in
as
conor
was
saying
in
in
new
versions
of
grafana.
G: Yeah, this is an example of what the service graphs look like. You always have a client and a server, and then the amount of requests that happened and the latency of those requests. Every client-server pair is what we call an edge; it's like a jump from one service to another. But yeah, this is currently being worked on in Grafana, so expect to see improvements all the time, basically. Yeah, that's kind of what I had to show for the metrics.
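For reference, the service-graph metrics can be queried along the same lines; again the metric name assumes Tempo's defaults, and the label values are placeholders:

```promql
# Requests per second flowing across one client->server edge of the graph
sum(rate(traces_service_graph_request_total{client="loki-gateway", server="loki-distributor"}[5m]))
```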
A: Cool, excited about that. I really want to see it in our operations cluster internally; I think we're only running it in dev right now. I think especially the automatic exemplar thing will be a great way to see trends in your spans and then jump back to the traces that are having the issues. Very cool.
A
Speaking
about
linking
traces
to
other
forms
of
telemetry
data.
I
think
we
have
some
people
from
pyroscope
here
who
are
interested
in
sharing
some
of
what
they've
been
working
on,
which
I
believe
is
linking
traces
and
continuous
profiling.
So
I'll
give
this
over
to
ryan
and
yeah
show
us
what
you
got.
D: Cool, yeah, what's up everybody. I'm Ryan; there are a couple of other people from Pyroscope in here. If you don't know what Pyroscope is, it's, as Joe mentioned, an open source continuous profiler, and something that we've been working on a lot lately is this idea of being able to link different types of telemetry data to profiles. So I will show you; we kind of have the beginnings of getting this in, yeah...
D
I
guess
we
kind
of
have
the
beginnings
of
getting
this
at
more
of
like
a
production
state,
but
one
thing
that
we're
definitely
interested
to
learn
more
about
is
sort
of
how
we
can
design
this
in
a
way
that
would
be
most
useful
for,
in
this
case,
particularly
tracing
how
you
could
get
more.
I
guess
like
value
out
of
your
traces
by
being
able
to
see
the
profiles
associated
with,
you
know,
specific
spans,
and
so
I
don't
know-
maybe
let
me
see,
as
you
said,
the
button
always
moves.
D: Yeah, all right, cool. So yeah, I guess I won't get into too much; I mean, if you aren't familiar with profiling, we do have a demo page you can look at, but just real quickly for anyone who's not familiar: basically, this is a flame graph.
D
It's
a
visual
representation
of
sort
of
resource
utilization
in
this
case
cpu,
where
the
top
is
a
hundred
percent,
and
then
each
node
below
that
is
a
you
know,
function
that
you
know
this
function
calls
this
function
which
calls
this
function.
So
this
is
kind
of
like
a
pie
chart
on
steroids
sort
of
like
the
yeah,
the
the
path
of
your
code
and
resource
utilization.
D: For example, we created our own little ride-share example to represent this. There are multiple endpoints, car, bike, and scooter; you literally just order a car or a bike or a scooter, and that's what this tracing example is showing. There's a load generator generating the load, just so that you have something to look at for the traces and the profiles, and so that's kind of the structure of this application. Right now it's sitting in a branch somewhere.
D
I
guess
I
can
paste
that
in
the
chat
after
this
example
I'm
about
to
show,
but
basically
what
we
have
here
is
you
know,
so
we
have
this
running
in
grafana,
and
so,
if
you
do
like
search,
you
can,
as
I
said,
there's
these
two
there's,
the
load
generator
app
and
then
the
ride
sharing
app
the
ride,
sharing
app
is
more
interesting.
So
if
we
search
here,
you
can
see
a
bunch
of
different
traces
and
I'll
just
click
on
one
random.
D
One
and
again,
as
I
mentioned,
it
starts
off
in
the
load
generator
and
then
from
there
it
gets
into
the
actual
ride,
sharing
app,
and
so
the
thing
that
we
did
here
was
we
added.
We
added
this
like
pyroscope
profile
id
and
pyroscope
profile,
url,
which
for
now
opens
in
a
separate
tab,
and
so
basically,
what
you
will
see
is
that
in
this
tab
for
this
specific
span,
you
know
this
yellow
span
right
here
for
this
specific
span.
This
is
the
profile
for
again
that
specific
span.
D
So
in
this
case
you
can
see
it's
calling
this
car.order
car
function,
that's
what
it's
spending
most
of
its
time,
working
on
and
yeah,
I
didn't
really
get
into
it.
We
we
made
the
ordering
a
car,
go
extra,
slow
on
purpose
kind
of
to
exemplify
some
other
aspects
of
this,
but
basically
but
basically
yeah.
That's
that's
what's
happening
here
and
and
as
you
you
can
drill
down
as
of
right
now
we
only
show
the
the
profile
for
like
the
the
root
span
here.
D: The idea here, what we kind of hope we could do eventually: we do have a Grafana plugin as well, and it would be cool to be able to come into this Explore tab, similar to what somebody was just showing, where you were able to link a bunch of things to the traces.
D
It
would
be
cool
to
have
that
here
we
have
a
similar
query
language
to
prom
ql
or
it's
almost
identical,
and
so,
if
we
actually,
you
know
copy
and
paste
that
in
here
you
can
see
the
obviously.
This
is
not
a
flame
graph.
This
is
a
json
that
would
otherwise
be
a
flame
graph
if
that
was
a
supported,
visualization
type
here,
if
I
do
go
into
the
actual,
let
me
see
yeah,
I
guess
here
we
go
so
yeah.
D
So
basically,
if
I
did
go
into
here-
and
I
were
to
like
paste
that
exact
same
query
in
here
and
hit
apply,
then
you
can
kind
of
see
it
in
the
panel
over
here
but
yeah.
I
guess
I
don't
know,
as
you
may
know,
the
these
panels.
This
is
like
a
custom
visualization
panel
that
we
made
specifically
for
pyroscope,
whereas
the
panels
that
are
available
to
actually
like
visualize
data
here
are
sort
of
hard
coded
to
be
either
graphs
tables,
logs
traces
or
node
graphs.
D
I
believe,
if
I'm
looking
at
the
right
thing
here
and
so
anyways,
that's
something
that
we
kind
of
hope
to
eventually
you
know
potentially
lobby
grafana
to
allow
either
custom
visualizations
here
or
something
other
than
those
sort
of
blessed
visualization
types
or
add
profiling
to
those
visualization
types
there,
but
yeah,
that's
that's
kind
of
from
a
high
level
how
this
all
works.
D
We
can
get
into
more
of
like
how
it's
implemented
or
whatever
is
interesting
to
you
there,
but
I
would
say
that,
from
my
side
from
pyroscope
side,
the
thing
that
we're
most
interested
in
from
you
all
people
who
are
using
tracing
and
using
it
frequently
is
sort
of
what
are
the
yeah
like?
Does
this
sound
interesting,
you
know.
D
Would
this
be
something
that
you
see
an
extra
like
use
case
for
being
able
to
go
from
a
trace
or
a
span
to
the
profile
associated
with
that
and
just
kind
of
in
general
sort
of
you
know
yeah,
whether
it's
for
incident
solving
or
capacity
planning,
whatever
it
might
be,
just
improving
user
experience
in
general
and
being
able
to
kind
of
move
throughout
your
sort
of
the
life
cycle
of
whatever
your
application
is
doing.
D
Yeah
just
honestly
interested
to
get
your
general
thoughts
that
can
help
us
sort
of
plan
as
we
as
we
design
this
and
build
this,
that
it's
we're
building
it
to
be
something
that's
actually
useful
and
we
figure.
This
is
a
space
where
you
know
you
guys
are
most
familiar
with
chasing
or
probably
more
than
most
and
and
so
yeah
we'd
love
to
hear
your
thoughts.
A: Cool. I do have a question about the profile that I think will help me understand how I can use this. Generally with CPU profiling, right, you are sampling the stack so many times per second, and then you're building your flame graph out of that. So when you're doing this link between the trace and the CPU profile, is the flame graph somehow scoped down only to the request, or is it the state of the process as a whole at the time of the request?
H: It is scoped down to the request. Hi, I'm Dmitri, also from Pyroscope. In Go, there's actually a very cool way to basically tell the runtime that this particular function, or I guess this particular goroutine, belongs to this trace or something like that, and that's how we're able to really scope it down, so it doesn't include other things like garbage collection or other background threads in the profile that Ryan showed. Cool.
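What Dmitri is describing maps onto Go's runtime/pprof label support; here is a minimal sketch of the mechanism (the label key and function names are illustrative, not necessarily what Pyroscope uses internally):

```go
package main

import (
	"context"
	"runtime/pprof"
)

// handleOrder runs the request's work under a pprof label, so every CPU sample
// taken while this goroutine (and any it spawns) is running carries the span ID.
// A profiler can then filter samples down to just this one request.
func handleOrder(ctx context.Context, spanID string) {
	pprof.Do(ctx, pprof.Labels("span_id", spanID), func(ctx context.Context) {
		orderCar(ctx) // the expensive work attributed to this span
	})
}

func orderCar(ctx context.Context) {
	// ... work sampled by the CPU profiler ...
}

func main() {
	handleOrder(context.Background(), "abc123")
}
```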
A
All
right,
so
I
think
my
my
instinct
here
is,
I
think,
both
have
a
lot
of
value
like
some
people
do
a
lot
of
just
like
in
out
tracing
right,
just
the
requests.
You
know
the
http
request
in
their
http
request
out
and
internally
inside
the
process.
There's
still
a
lot
of
questions
about
what's
happening,
and
I
think
what
you've
done
here
is
awesome.
A
In
that
case,
like
you,
don't
have
to
do
any
specific
instrumentation,
you
don't
have
to
do
anything
manual
and
you
can
immediately
see
the
state
of
you
know
the
stack
all
of
the
stuff
that
you
did.
An
instrument
is
given
to
you
kind
of
automatically
and
for
free
by
jumping
over,
but
I
also
think
there's
a
lot
of
value
to
knowing
everything
that's
going
on
in
the
process
at
that
time.
A: I've been trying to use profiling more as a general tool, and I think this could help me bridge that gap: get me into Pyroscope more, instead of just bringing it up occasionally when I'm having issues; use it more regularly and, you know, get more insight into your processes.
D
Yeah
yeah,
one
thing
I
would
add
there
too,
is
just
that
yeah
so
like
that
you
know,
is
while
that's
sort
of
like
the
specific
profile
for
that
trace,
I
mean
the
other
thing.
That's
nice
about
you
know
continuous
profiling
is
there
also
is,
if
we,
you
know,
removed
that
those
specific
you
know
profile
id
from
the
query
parameters
there
you
would
get.
You
know
a
full
profile
and
you
can
also
you
know.
It's
also
tagged
appropriately
in
this
case.
D
It's
tagged
by
vehicle
and
region,
as
I
showed
in
that
kind
of
graph,
and-
and
so
you
can
kind
of
you
can
kind
of
zoom
out
that
way
as
well
and
be
able
to
see
sort
of
just
like
at
that
time.
D
And
also
you
know,
if
you're
using
kubernetes
or
something
along
the
or
yeah,
if
you're
using
kubernetes,
you
can
see
you
know
for
that
pod
or
for
that
name,
space
or
whatever
you
can
slice
and
dice
it.
How
you
want,
and
so
we're
kind
of
trying
to
figure
out
yeah
like
what
the
balance
is
there
between
being
able
to
see
as
high
as
you
can,
but
then
also
being
able
to
drill
down
to
like
very,
very
granular
as
well.
D
Yeah
and
then
also
being
able
to
yeah
like
compare
sort
of
like
the
full,
I
don't
know
yeah
demon.
You
want
to
talk
about
what
you
were.
You
were
talking
about
earlier
with
comparing
the
sort
of
like
full
path
of
the.
I
don't
know.
I
guess
exactly
what
you
call
it
like
the
trace,
the
full
path
of
the
span
versus
like
one
that
was
particularly
slow,
yeah.
H
So
this
whole
idea
with
you
know
attaching
profiles
to
spam
started
by
you
know
like
originally,
we
had
only
kind
of
like
full
view
of
your
application
right,
both
garbage
collection,
all
the
threads
everything
some
people
wanted
to.
H
You
know
kind
of
get
more
detailed
information,
and
so
now
we
have
this
feature,
but
now
that
we're
talking
about
this
another
thing
people
are
asking
for
is
kind
of
be
able
to
compare
a
particular
profile
for
a
particular
span
to
kind
of,
like
all
the
other
profiles
for
that
particular
span,
and
so
we're
starting
to
think
about
how
we
could
do
that.
Basically
compare
you
know
compared
to
some
sort
of
a
baseline,
so
it's
some
somewhere
in
the
middle,
where
it's
not
like
your
whole
application.
H
It's
not
a
particular
spin,
but
it's
kind
of
like
in
between
hope.
That
makes
sense.
A
Personally,
I
find
myself
staring
at
metrics
a
lot
because
I
like
that.
I
like
to
get
a
feel
for
the
process
and
what
it's
doing,
and
I
think
it
would
be
neat
if
I
also
had
a
feel
at
the
profiling
level,
like
just
every
other
day
once
a
week,
just
jump
in
there
and
just
see
what
it
looks
like
it
during
a
normal
period
of
time
and
get
a
get
a
intuitive
feel
for
what
the
state
of
your
process
is,
and
I
think
these
kind
of
features
would
would
bring
me
in
more.
A: Go ahead.

F: I had one more question, if you've got time. So yeah, I know you can't see me; I'm part of the Tempo squad at Grafana. First off, really cool product. I had one question around sampling, actually. It seems like you would almost be storing profiles for every request, and that's kind of similar to what we deal with in tracing. Also, Pyroscope seems to be a pull model, from what I've read.
F
So
is
it
like
like
how
do
these
profiles
for
every
request
actually
how
they
get
picked
up?
It's
almost
like.
We
were
dealing
with
this
with
exemplars
when
we
add
exemplars
to
the
metric.
We
ship
those
traces,
but
it's
not
necessary
that
the
exemplar
will
be
scraped
by
prometheus
and
landon
and
land
in
the
u.s.
So
do
you
actually
sample
100,
or
do
you
have
like
these
rules
in
pyroscope
as
well.
H
We
we
do
use
sampling,
profilers,
so
they're
sampling
in
terms
of
like
how
many
you
know
stack
traces.
Do
we
look
at
each
second,
but
when
it
comes
to
actually
storing
them,
we
store
every
single
one
right
now,
like
every
single
profile
for
every
single
profile
id.
H
It's
valuable
it.
It
is
a
lot
so,
for
you
know,
we
we
build
a
pretty
good
kind
of
storage
engine
compression
algorithm
to
be
able
to
kind
of
scale
vertically
and
like
even
if
right
now,
you
install
just
one
pyroscope
server
and
scale
vertically
you'll
be
able
to
get
pretty
pretty
far
yeah
for
the
future.
We
are
planning,
you
know
we're
planning
some
sort
of
more
scalable
solutions
so
that
you
could
do
yeah.
You
could
do
horizontal
scaling,
basically.
F
Very
good,
no,
we
use
bioscope
already.
We
wonder,
like
I
think,
the
highest
volume
cluster
uses
spice
scope
right
in
tempo
yeah
cool
thanks.
D
For
sure,
well,
yeah
we'll
definitely
try
to
add
a
a
tempo
example
as
well
to
the
we're
still
like.
I
said
we
haven't
like
really
been.
You
know
promoting
it.
Yet
we
want
to
do
some
like,
like
some
ui
stuff
around
this
and
some
cleaning
up
the
way
that
we
we
do
it
in
the
back
end
as
well,
but
but
yeah
I
mean
once
we
kind
of
make
more
progress
there.
D
Yeah
we'd
love
to
kind
of
share
it
with
this
community
and
and
have
have
you
all,
try
it
out
and
and
get
some
more
feedback
on
it,
but
appreciate
you
giving
us
a
little
time
to
talk
about
it
today.
A
Absolutely
thank
you
all.
It
was
awesome
if
anybody
has
any
questions
about
pyroscope
for
that
team
or
just
about
tempo
or
continuous
profiling
or
distributed
tracing
or
any
of
these
fields.
That
would
be
a
good
time
to
jump
in.
You
can
chuck
it
in
chat
or
you
can
unmute.
Otherwise,
I'd
like
to
thank
all
of
our
friends
from
pyroscope,
showing
up,
of
course,
as
well
as
everyone
else
for
coming
in
and
participating
in
the
community
call.
E: (question about the resource cost of running the metrics generator; partially inaudible)
G: Like, approximately, yeah. So by adding the generator, the distributor has to do a little bit of extra work, since it's sending requests both to the ingesters and the metrics generator.
G
So
there
will
be
a
bit
of
extra
cpu,
but
it's
not
you
know,
really
significant,
since
we
don't
have
to
do
any
extra
processing
that
we
don't
have
to
do
for
the
investors
and
the
generator
itself
like
memory,
usage
and
cpu
seems
fine,
like
seems
pretty
low
right
now,
but
we
haven't
been
able
to
test
it
at
like
a
good
scale.
Yet
so,
if
this
isn't
like
a
small
dev
cluster,
it
seems
to
be
using
like
have
the
memory
of
the
ingester
right
now.
G
So
that
seems
all
right,
but
we
still
have
to
test
it
at
scale.
It's.
E
A
separate
component
right-
this
is
a
yeah,
okay,
okay
cool.
I
had
forgotten
that
okay,
nice,
is
that
what
also
generates
the
service
graph
that
you
were
showing.
G: If you're ingesting a lot of traces with low cardinality, your memory usage will be lower than if you ingest just a small amount of traces with huge cardinality, so it's a bit tricky to estimate at this point.