From YouTube: Tempo Community Call 2021-02-11
A: But that doesn't let us do cool stuff like make dependency graphs based on real traffic, do aggregations, and get average real latencies based on our traces; stuff like that doesn't work. Our plan was to use Dataflow and just iterate over the chunks in our bucket, which works, sure, but it's going to get really complicated as compression gets added and a bunch of other things come in. So that's why I was like: I want to hear more about this query frontend, so I don't have to deal with Dataflow.
B: That's pretty cool. So Joe added compression here, so he's to blame for all your troubles with parsing blocks with compression. But I don't know how much the query frontend can help with that; really it's designed to be something that can help scale out the query path. So what the query frontend does today is split each incoming query into shards, and each shard is basically a range of blocks.

B: Every block has a UUID, and the shard says: search between this range of UUIDs. The queriers pick up each of these smaller shards and work on them in parallel, and if we have multiple hits from the different queriers, they'll send all of those results back to the query frontend, which will merge all of the split traces and return the whole trace. I see.
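To make the sharding described above concrete, here is a minimal Go sketch of carving a UUID keyspace into shard ranges. It is only an illustration of the idea from the call, not Tempo's actual implementation; the shard count and output format are made up for the example.

```go
// Illustrative sketch of splitting the block UUID keyspace into shards, as
// described above. Not Tempo's real implementation; shard count is arbitrary.
package main

import "fmt"

func main() {
	const shards = 4
	for i := 0; i < shards; i++ {
		// Blocks are named by UUIDs, so the keyspace runs from 0x00... to
		// 0xff.... Each shard owns a slice of that range; a querier that
		// picks up shard i only opens blocks whose UUID starts in its slice,
		// and the query frontend merges the per-shard results afterwards.
		start := i * 256 / shards
		end := (i+1)*256/shards - 1
		fmt.Printf("shard %d: block UUID prefixes %02x through %02x\n", i, start, end)
	}
}
```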
D: ...about how that could work with Tempo, how we can do that with our existing backend, and what we would need to add or change to make that work. So yeah, that's kind of on our long-term roadmap.

D: But it's certainly nothing that's going to happen in the near, near future. We generally want to make a few... I think there's honestly a very short list of changes needed to really feel comfortable with the key-value store being in the millions of spans per second, and we're already kind of working on that, maybe design docs or floating around ideas about where to take it next. And definitely a query language to do search and metrics is on this long, long list somewhere. That makes sense.
D: The service graph would be hard. So when I'm thinking of a query language, I'm thinking of things like: give me, you know, a p99 of a certain span's latency over a given time period. To generate a dependency graph, though, you would need a different kind of question, right: you need something like, what are all the connections between all my services?
D: ...you know, some kind of TSDB or time series data. That's a different query. That's very interesting; I wonder if something...

A: Yeah, that's one option. Another option is to have some kind of query language where you could say: give me all the trace data for these criteria, and then you can process it like you want, right, and just do any aggregation that you might want to do. As long as you have some kind of starting point, like give me all traces that include this service, you can kind of go from there, since, like, if you know your critical path, which you usually do...
D: I think the issue with that is just the sheer amount of data you would need. I mean, we take, I think, something like seven to eight thousand traces a second, and if you wanted to generate a dependency graph, you'd need to see a good percentage of those. So let's say you want to generate a dependency graph for the past hour: that's 8,000 times 3,600, roughly 29 million traces. That's a whole bunch of data to request from the backend and then start doing some kind of work on.
D: I think a really cool path here might be this: the collector, the OpenTelemetry collector, which the Grafana Agent (or at least the trace pipeline in the Grafana Agent) is built on, already has a PR up to add a metrics processor, where they would, you know, calculate metrics live based on spans streaming through.

D: I think it'd be really neat to extend that, or add a new processor, work with that community to add a new processor, that would generate the data necessary for a dependency graph, store it in Redis or something (who cares), and then query it out of there.
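As a rough illustration of the kind of processor being described, here is a Go sketch that counts parent-service to child-service edges from a batch of spans. The Span type and field names are invented for the example; this is not the OpenTelemetry collector's actual API, and a real processor would stream spans through and push the counts somewhere like Redis.

```go
// Illustrative only: count service-to-service edges from a batch of spans.
// The Span type here is invented for the sketch, not the collector's real API.
package main

import "fmt"

type Span struct {
	SpanID   string
	ParentID string
	Service  string
}

type edge struct{ Parent, Child string }

func serviceEdges(spans []Span) map[edge]int {
	svc := make(map[string]string, len(spans))
	for _, s := range spans {
		svc[s.SpanID] = s.Service
	}
	counts := make(map[edge]int)
	for _, s := range spans {
		if parent, ok := svc[s.ParentID]; ok && parent != s.Service {
			// Only count edges that cross a service boundary.
			counts[edge{parent, s.Service}]++
		}
	}
	return counts
}

func main() {
	spans := []Span{
		{SpanID: "a", Service: "frontend"},
		{SpanID: "b", ParentID: "a", Service: "cart"},
		{SpanID: "c", ParentID: "b", Service: "postgres"},
		{SpanID: "d", ParentID: "a", Service: "cart"},
	}
	fmt.Println(serviceEdges(spans)) // map[{cart postgres}:1 {frontend cart}:2]
}
```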
D: That's true, that would not work well for cross-cluster work unless you had some kind of centralized data store, right, that you're putting this all in. I don't know, that's a good point... which is Tempo.

I: Would you say your dependencies, like your dependency graph, is kind of stable over time? Or, like, how much change is there, let's say, over a given day or an hour?
A: Yeah, it's mostly stable. The main value I see in running this regularly is that you can notice when someone sneakily adds a dependency, or accidentally adds a dependency. That's why you want to run this, for example, on a weekly basis; but it's not this thing that you would need to calculate every minute, right, it's more... Great, okay!

A: Every week, I want to take the last... I'm going to take one percent of the last week's worth of traces and just have a trawl through and see what I get, something along those lines. And then the challenge is getting a representative set of traces to actually do that with. You can obviously take them all, that's the obvious answer, but if you can have some kind of query language where you can somehow get a representative sample, that's better.
I: Right, yeah. Amazon X-Ray has a service graph that is built up, but it requires a time range, so it's kind of similar: for this given time range, here's what the dependencies look like, so the dependencies change every time, it thinks. Yeah, that's fine! Now, Joe, do you know about the Jaeger dependency graph, how it's generated? Is it across all time? Is it maintained, like, statically outside and added to, or is it within a time range, like how X-Ray's is?
A: Yeah, and in our case it's also not just about the dependency graph, it's also about: okay, what can I expect will happen if this thing breaks? You can do all kinds of fancy things once you have a dependency graph, like computing aggregate SLAs; but basically, if this thing is at that level, and this thing is at that, what can I expect of the thing that uses both? Stuff around that level is what we actually think is pretty useful.
D: Yeah, yeah, I really like the idea of what you're trying... Are you, Luna, are you with... I think I was just told about somebody who's writing batch jobs against Tempo.

D: You know, of all the four hundred thousand companies, it's Embark doing it... Embark, is that right? Okay, very cool.
D: It's such a neat idea. I'd really like to meet with your engineering team and see what you all have done and get some idea of what's going on there. I really like the idea of batch jobs for this. When I was thinking... yeah, like, when I was thinking of a query language, I wasn't really thinking of something that would fill this need, but if it is a need, we need to kind of document it.
D: Do you all mind maybe writing up an issue on the repo, kind of showing the work you all have done and saying what you all would like? We can use that to kind of document this feature, like service graphs and how Tempo could support that, and maybe use that as a running document going forward.
A: Gladly. The person who is invested to work on this is actually my boss, our director of infrastructure and services, who did a bunch of stuff like this at DICE, so I'll hit him up and ask him to write something up for now.
D: Very cool, yeah. It's a neat idea and it was not the direction we were thinking, so definitely getting it documented would be great, so we can kind of keep an eye on it and think about how we could do it ourselves as well, roll it up into what... What language is it? Is it Dataflow? It's Dataflow.

A: So Dataflow is this Google thing that essentially lets you iterate over things in the GCS bucket, and since chunks are stored in the GCS bucket and have a fairly easily decoded, fixed encoding format...
A: Oh, easy, right: you just write a small Java class that can extract data out of them and you're golden. Very cool. Obviously we knew that that was going to break, right: as soon as the data format becomes more complicated and you start looking at encryption, compression and all that stuff, that falls apart. It's fine, sure. That's why my thought was more like, okay...
D: Yeah, it's just hard. So what we've been looking at, when we talk about a query language, is just the amount of data we have to scan to answer questions, right. We're currently... what are we at, close to 50 meg a second in proto? Is that right? Yeah, something like that. So, you know, 50 meg a second times 3,600 seconds is going to be, you know, 180 gig in an hour, call it 175 gig in an hour. So to answer a question over an hour definitively, you'd have to scan 175 gigabytes of data. Looking at some internal Loki metrics, I think they were hitting... I think they're hitting around 10 gigs a second; is that what we saw, Marty?

D: Sure, let's just say 10 gig, so that would be 17 seconds. Is that terrible? That's not terrible: 17 seconds to answer a question over an hour, to scan a full hour of the data, and that's without duplication, without replication factor, assuming you perfectly compacted your time period. But that's a little bit more than I want for an hour query.
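Written out, the back-of-envelope arithmetic above looks roughly like this. The ingest and scan rates are the rough figures quoted in the call, not measured constants.

```go
// Back-of-envelope arithmetic from the discussion above; rates are the rough
// figures mentioned in the call, not measured constants.
package main

import "fmt"

func main() {
	const (
		ingestMBPerSec = 50.0   // ~50 MB/s of proto coming in
		scanGBPerSec   = 10.0   // rough scan rate seen on internal Loki metrics
		windowSec      = 3600.0 // one hour
	)
	dataGB := ingestMBPerSec * windowSec / 1000.0
	fmt.Printf("data written per hour: ~%.0f GB\n", dataGB)       // ~180 GB
	fmt.Printf("time to scan it: ~%.0f s\n", dataGB/scanGBPerSec) // ~18 s
}
```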
D: So I think... I think something that's more aggregate, right, like...

A: Yeah, something... very true. I can't just tell... I can't just tell Tempo: hey, give me a representative sample of my traces that include service A. And the "make it representative" part is what's hard, right: actually figuring out, okay, these two traces look the same except for some small difference in timings, so roll them up one way, and stuff like that. It's like...
D: Super hard, that would be extremely difficult. What might make more sense instead of that is... so a trace tends to have, you know, the service name or other metadata that lets you know when it crosses trace boundaries, or sorry, service boundaries. So maybe if the question were, like, what are all the relationships and counts: show me every ser... er, just give me a list of parent and child relationships in a query, and the number of times each happened.

D: That might be something, right. That would be, I think, more tenable: it would be a lot less data to return and more directed at what was attempting to be done, which is build a service graph, right. Of course, what else? What other kinds of data do people use in service graphs? Do they want latency? Do they want error rates? What else are people... I've never really used a service graph that I liked.
A: There's a bunch of things. There's: what's my p50, p95, p99 latency between these two services? What's my error rate between these two services? How many times do I have to retry between these two services? And then also comes the question, how the heck do I tell a retry from a regular request? Like, there's lots of things there, and for a bunch of them the answer is: make some metrics, record your metrics, go home. That's a perfectly valid answer for a lot of them, right.

A: But that's not a valid answer for a bunch of them, like actually generating the service graph and the dependency tree; that level of stuff you can't do with metrics. So I think the focus should be: what can Tempo answer that metrics cannot, and kind of gear it towards that, because for everything else, record it, damn it, and you'll be fine.
D: Yeah, I agree 100%. Metrics are just too good at storing what they're supposed to store, in terms of compression, the size you can keep them on disk at, and how fast they are to query and scan and do all these things. Generating metrics out of traces is like a last resort: here was a metric we couldn't capture due to cardinality or some other reason, or we just don't have a metric for it yet, and we want to ask a question that our metrics don't answer just yet, or can't answer, perhaps. And that's where you want to go query something like logs, or Tempo for traces. So yeah, this is a cool idea; definitely get an issue up, and I think that'd be a good place to chat about it. And yeah, I really want to meet Luna's team and see what you've done internally.
D: Cool. I don't think I captured half of that in the notes; I tried to get some of it, but I only got a bit. Marty or Ananya or whoever, if somebody could also try to help keep track of what's going on in the meeting notes, I'd appreciate it. For anyone who showed up during that whole conversation: we are just getting started on our community meeting here.

D: Luna had some really good input on building graphs and other information out of distributed tracing, as well as on kind of pushing Tempo forward in terms of the queries that we can use with Tempo; right now it's just a key-value store. We have an agenda in the document linked; let me link it again in the chat for everybody who showed up after I did, and please feel free to add anything.

D: You can treat this kind of call as an AMA or a community... or like an office-hours kind of thing; we're just here to chat. I do have some agenda items if anybody doesn't have anything directly to add, and also this isn't really a canned presentation thing, so feel free to jump in and talk whenever and add whatever you... whatever you, you know.
A: Another potentially interesting topic I have is doing tail-based sampling with Tempo. I'm not sure if anything there is planned, but I'd be surprised if it wasn't discussed before. Like, in my perfect world, everything I have in my clusters just traces 100%, and I can actually decide whether to save the trace or not when I can look at it at the end, as close to Tempo as possible, right. Has any work been done towards something like that? Or... yep.
D: So, let's see here. The trace pipeline in the Grafana Agent is based on the collector; either is fine for pushing traces into Tempo, but recently JP from Red Hat added support for tail-based sampling in the collector. So we're looking at also getting that into the agent and kind of playing with it in our environment, and trying to get a feel for, I suppose, how far it can scale: how many spans per second is it going to handle, how much CPU does that cost, and memory?

D: And all these things... it's a little... it's not obvious how to do it, and I think we want to try to help, you know, make it obvious, put together some documentation to help with that. But it's split: are you familiar with the OpenTelemetry collector? Have you messed with this thing at all?

D: Out of necessity, right; familiar out of necessity, that's right. So let me find... oh, it's in the contrib repo, I think. So there's a tail-based sampling processor in the collector now, but it requires the full trace to go through one collector, because it's not, like... right. So let's see if I can...
A: Yeah, and in practice, unfortunately, the moment you deal with a cross-cluster request, you're back to: oh, I don't have my full trace.

B: ...pieces from these individual clusters and then run a tail-sampling processor there; that would be huge, like egress, like network cost, just shipping so much.
D: The first one... the first one just does the actual work of trying to batch up a trace in one piece and make a choice about whether to pass it on or drop it. And then the second one I linked actually attempts to route a trace to one collector: it will batch up a trace, wait however many seconds, and then choose one of a set of downstream collectors (or, is that upstream? whatever, downstream collectors) to send the whole trace to, so the second collector can make the choice, right, because it's supposed to have the entire trace. So you actually need, like, two layers of collectors.
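A minimal sketch of the routing idea just described: hash the trace ID to pick one downstream collector, so every span of a given trace ends up in the same place and a tail-sampling decision can be made there. This is purely illustrative, not the collector's real load-balancing component, and the endpoints are made up.

```go
// Purely illustrative: route each trace ID to one downstream collector so the
// tail-sampling decision can be made with the whole trace in one place.
// Not a real collector component; endpoints here are made up.
package main

import (
	"fmt"
	"hash/fnv"
)

func pickCollector(traceID string, collectors []string) string {
	h := fnv.New32a()
	h.Write([]byte(traceID))
	return collectors[int(h.Sum32())%len(collectors)]
}

func main() {
	downstream := []string{"collector-0:4317", "collector-1:4317", "collector-2:4317"}
	for _, id := range []string{"trace-aaa", "trace-bbb", "trace-aaa"} {
		// The same trace ID always maps to the same collector.
		fmt.Println(id, "->", pickCollector(id, downstream))
	}
}
```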
A: Yeah, my thought was Tempo should have some kind of rules here, or... well, actually, I guess practically you could put a collector straight in front of Tempo, running in the same cluster, that gets the traces instead of Tempo, makes a decision and sends them along. But obviously, if you're using Grafana Cloud, that doesn't work, and you're introducing an extra point of failure. But that is... that is expected.
B: So what Luna just mentioned really sounds like a great feature to add to Tempo, where you can say: hey, this is an allow list of conditions that need to hold for me to store anything at all to the backend; the latency of my trace should be at least so much. We can probably add these conditions when we marshal into the trace object at, I guess, the distributor level.
D: I wonder... something we talked about a long time ago, but haven't done, is to kind of have a way to keep certain traces for longer. So interesting traces, traces that have failed, traces that are latent or whatever: keep them for a month; and traces that succeeded and are under your latency requirements, which you don't care about, you know, get rid of them in two days or something like that. That'd be kind of cool, too.

D: That way you can kind of keep everything immediately, so your logs-to-trace-ID jump will still work all the time and you can feel comfortable with that; but then, when you're wanting to see certain information a month later, maybe you can kind of keep the things that make sense. We could kind of do that at compaction time. That's another really good idea, so yeah, you're welcome to make that issue as well; please do.
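Sketched as code, the retention idea floated above might look something like this. It is a hypothetical helper to show the shape of the rule, not an existing Tempo feature; the thresholds are made up for the example.

```go
// Hypothetical sketch of the idea above: keep failed or slow traces longer,
// drop fast successful ones early. Not an existing Tempo feature.
package main

import (
	"fmt"
	"time"
)

func retention(hasError bool, duration, slowThreshold time.Duration) time.Duration {
	if hasError || duration > slowThreshold {
		return 30 * 24 * time.Hour // interesting traces: keep for a month
	}
	return 2 * 24 * time.Hour // everything else: gone after two days
}

func main() {
	fmt.Println(retention(false, 120*time.Millisecond, time.Second)) // 48h0m0s
	fmt.Println(retention(true, 120*time.Millisecond, time.Second))  // 720h0m0s
}
```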
D: Yeah, we also kind of are trying to balance changes we make for the open source product against, you know, Grafana Cloud. For Grafana Cloud it makes way more sense to put this in the agent, right, because that's where people want it: they don't want to send us the trace and pay for that and then have us drop it two days later; if they're going to pay us to keep it, they may as well...

D: You might as well keep it the whole time. But for the open source option, for people who are running this in their own clusters, yeah, I think that would be really valuable: you could reduce storage costs quite a bit if you could do some kind of filtering or whatever.
D: ...at compaction, cool. Let's see, so let's talk a little bit through this agenda here. 0.5.0 was released; that was released, I think, shortly after the last meeting, and we're going to say 0.6.0 is about to be released now. Marty or Nina, do you want to go over the 0.5.0 notes?
I: Right, yeah. So I guess the biggest feature was support for Azure blob storage. That was a community PR, I guess, so that was really cool to see. The query frontend was in that release... gosh, it's been so long, I didn't realize it was in that release, but that is a very scalable way to query the data, right.

I: It's a separate module that's independently scalable. So we did have a breaking change in there for the communication signature between distributors and ingesters, and that means there's a new endpoint on the ingesters. The ingesters have to be rolled out first to add the new endpoint; we don't drop the old endpoint, so after upgrading the ingesters, the distributors can be upgraded afterwards to push to that new endpoint. Disk-based caching is removed; probably it's...
I: Memcached and Redis, yeah, yeah, yeah. So this was another change to kind of help control ingester performance, so it can target... well, I'm sorry, never mind, that's in 0.6. So, a change here to just control some burst settings, and then other fixes; I don't know if we want to read through all of those, but the link is there.
D: Yeah, and then upcoming in 0.6.0... Marty, you had some good performance improvements; do you want to talk about those? Yeah.

I: Sure, yeah. So, well, one thing we're kind of working towards... nice, oh yeah, that's awesome! There was... who had a 3D-printed Tempo thing? Or maybe that was a Grafana thing, I thought.

H: Yeah, Richie can hook people up with swag, so at least stickers, at least if you're external to Grafana. So yeah, nice, yeah.
I: Yeah, so in the next upcoming release, probably the biggest thing is compression of the storage, and that is significant. The default right now is Zstandard, but there are a couple of other algorithms there that may work better, you know, for certain data. The savings, I think, were a 75% reduction, or something like that, so I mean, it's significant.

I: Storage was previously the largest cost driver, so I think that'll really help. The overhead of Zstandard is that, you know, we'll use more CPU, right, so it is kind of the heaviest algorithm; but like I said, there are multiple ones there that may work better if you need that, yeah.
D: Yeah, real quick, I can share... can I paste this anywhere?

D: Not my... not my department. So, the compression ratio for "none" is of course 100% there, right, and we're taking a two-gig block and we batch them; I think internally... well, right now it's by bytes, but at the time that was 200 traces per compression page, essentially. Gzip was the most expensive one in terms of CPU, but it also wasn't even the best one: Zstandard had the best compression and kind of a middle ground in terms of CPU cost, and so that's why we chose it. It was a little rocky at first; we had our queriers and, what was it, compactors and queriers just OOMing constantly, a serious memory leak, which we fixed.

D: So what we consider kind of supported, I suppose, in Tempo is "none" (no compression) and Zstandard. Those are the ones we've run at scale, and those are the ones we feel comfortable advising people to use. If you want to use Snappy... I don't see any reason to use gzip; it's in there, but whatever. If you want to use Snappy or LZ4, it will read faster, but the compression ratio is not as good.
I: Cool. So there are some additional compression changes for the traffic between the distributors and the ingesters; that's another area, I guess, we keep looking at, because that's the critical path for data flow, and we're changing the compression there from gzip to Snappy. The options there are more limited, but Snappy performance is looking better.

I: Yeah, yeah. I guess, what else out of the next upcoming release would be good to talk through?
D: Excuse me... Ananya's finished exhaustive search. The query frontend was added in 0.5.0, but exhaustive search is a cool feature which is now on by default; you can't even turn it off at the moment.

D: You're required to run it exhaustively at the moment. Ananya, do you want to talk about kind of what that looks like?
B: Sure thing. So yeah, this was what we were looking at with the query frontend, right; that's what we wanted to do when we introduced the query frontend.

B: There could be long-running traces that are part of multiple blocks, and we wanted a way to combine all of them and return the full trace, and that's now possible with the query frontend. We're searching for traces across blocks in parallel, and it works really well. Our latencies are still as good as they were when we were at replication factor 2 with 150,000 spans per second; we're now close to 400,000 spans per second internally, and our latencies are, I think... the 99th percentile is around two seconds.
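As a rough picture of what combining a split trace involves, here is a small Go sketch that merges trace fragments found in different blocks and de-duplicates spans by span ID. The types are invented for the example; Tempo's real merge logic operates on its own proto trace types.

```go
// Illustrative only: merge fragments of one trace found in different blocks,
// de-duplicating spans by span ID. Types are invented for the sketch.
package main

import "fmt"

type Span struct{ SpanID string }
type Trace struct{ Spans []Span }

func combine(fragments ...Trace) Trace {
	seen := make(map[string]bool)
	var out Trace
	for _, f := range fragments {
		for _, s := range f.Spans {
			if !seen[s.SpanID] {
				seen[s.SpanID] = true
				out.Spans = append(out.Spans, s)
			}
		}
	}
	return out
}

func main() {
	blockA := Trace{Spans: []Span{{"s1"}, {"s2"}}}
	blockB := Trace{Spans: []Span{{"s2"}, {"s3"}}}
	fmt.Println(len(combine(blockA, blockB).Spans)) // 3 unique spans
}
```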
D: Just to reiterate for a second, and to give you an idea of our scale internally: we're searching 13 billion traces over a four-day... I'm sorry, 14-day retention window. So whatever... I didn't exactly catch the latencies, but whatever they were, they are fantastic, and it's over that size of data. So whenever you're building and running Tempo yourself, you're kind of making a call about how much volume you are ingesting...

D: ...what retention do I want, and that turns into how long my blocklist is, and that really is the driver for your query latency. So you could do significantly higher volume with two-day retention, or you could do lower volume at a 30-day retention; you can kind of balance that against your query latency.
A: A question on that: is there an upper limit to how long a trace can be in Tempo? Like, am I going to get in trouble if I record a five-hour trace, theoretically speaking?

I: Well, this actually helps fix that, right. Whereas normally that would have been split across multiple blocks, because there's that trace idle period time where it will be cut, this will actually bring it all together and combine it fully. So different parts of this had been released in different places, but this was finally going to... no matter how many blocks it's split across, it will bring it all together and combine it, right. Yeah.
D: Something I think we've seen a little bit more of in the community is wanting to trace longer processes, like CI pipelines or other things... batch jobs that take hours. Initially Tempo didn't handle that well, but with Ananya's recent changes it does, and so we're meeting that need as well with this release.

D: Let's see, so 0.6.0: exhaustive search and compression. We have two blog posts I wanted to highlight for people who are kind of getting started doing some instrumentation: Marty wrote a nice post here, and I wrote one from before that's certainly not as nice. I should probably make these links instead of this horrible giant thing. The first one is Spring Boot.
D: The second one is .NET. So if you're using any of those technologies, you can see some good examples there of how to set it up with Tempo and how to instrument applications. I think maybe the people on this call are a little bit past that; we're trying to create more material that gets people involved in Tempo. If there's any languages or frameworks that you all are aware of that we could add...

D: We kind of are trying to target popular things, right, to drive some attention to Tempo and get more people, more training, on instrumentation. But I don't know what else. Perhaps the question for the group is: what other kinds of frameworks or technologies do you think are very popular that could benefit from a blog post like this?
H: So another thing to think about as you guys are working on documentation: our lovely go-to-market cloud writer is basically working on making tutorials and stuff that pulls things together, but it really helps him to have raw material to start from. So if you guys make more raw material, odds are higher that he will be able to go in and help make something cool and multimedia from that. Sure, good.
D: Sure, sure. For a lot of these we're trying to just, I think, get people started. There's a group new to tracing, right, who don't really know the beginning of the process or what tracing is, and these are meant to be an opportunity to draw people in. We chose .NET and Java particularly for that reason: there's lots of, you know, enterprise Java developers, it's extremely popular, Spring Boot's very popular, .NET is very popular. We want to attach or connect to those communities and kind of give them material to get involved in tracing.
D: I don't know... I mean, for me those are two of the most popular, huge, you know, kind of languages. What else could we do, Python? Do people still make Python back ends?
A: Regarding Python, and Rust, which I mentioned as well: what could also be very interesting for those languages in particular is that they're often used to make things that talk to back ends, and initiating a trace from there can also be really interesting, since you can start getting into tracing that much earlier. Even if you don't entirely trust the client, you can do some interesting things, like starting the trace at the front end.
A: There's an OpenTelemetry Rust client of some sort, and there are also some language-wide efforts in just introducing tracing as a provider-agnostic concept, which are currently in the works. So it would be interesting to plug into that.

G: Sure. I mean, I don't really know what to say; we mainly just kind of followed through the examples we started with. We've got a Flask app that then, you know, talks to a number of data sources.
G: Postgres, Redis... I believe the Postgres one was the only one that we didn't get working fairly quickly. So we basically did a combination of the Flask app using the Flask auto-instrumentation, the various data stores using auto-instrumentation, and then the back-end Python stuff was all manually instrumented; that's not Flask-based. So, okay, I can probably find some code snippets. What's that?
G: ...go back and dig out our examples. I mean, I don't know, a lot of what we did was from the OpenTelemetry Python documents. So yeah, one thing with the Python stuff is definitely that it tends to matter how you're passing through the thread context and such, and that seems to be pretty app-specific in my experience. So I'm by no means a Python expert, but I can help provide some examples if that would help you guys along.

D: Cool, yeah. If you have some links to share in the document, please do; that's another...
G: If you use structlog, you build kind of like a logging context, and then... so you start your instrumentation, then you build the logging context for structlog, and then basically you process the rest of the request through that. And then the data source auto-instrumentation, obviously, will pass through the trace context, or... I forget the exact name, but anyway, the data sources should pick up the parent trace ID for you. It's mainly just getting the initial log message out that has that, and yeah.

G: That does depend... again, that's kind of Python-specific to what you're doing. If you're doing something with a lot of forking, then that can, you know, obviously change how you do it. But in the example... or, like, also, if you're doing Gunicorn with Flask, I don't even remember the workaround that we did for that; somebody else did that one. But passing that context back and forth by env vars is one possible solution, but yeah.
D: Okay, Python's a good target then; we can look at that. I'll put out my personal notes: I've always wanted to mess with some of these languages, and kind of building these blog posts and these repos gives me an opportunity to dig into some of these frameworks and languages I don't have a lot of experience with, which is cool.

D: Cool, let me share some internal metrics here, just to show how we are running Tempo; I'm going to do some screenshots, I guess, of our internal dashboards. We are around 450,000 spans a second, which I'm happy with. I really wanted to push towards a million, but... it's one of those things where we could, but we don't need to, and the cost would be more than we want to pay right now.
D: 0.6 is going to drop TCO a ton for those of you who are running Tempo, and then we're going to see further improvements, particularly in the way we query our backend, in the next... I'd say a month or two. So, about 450,000 spans a second; it's about, I think, seven to eight thousand traces a second. I think our average is... is it about a hundred? I always think a hundred, but now that I'm saying it out loud, I want to make sure the math works.

D: That's divided by two, because it's after the replication factor... oh, okay, cool, never mind. And then we're looking at, let's say, 420,000, so we're at... oh, 50. Okay, yeah, we're about 50 spans per trace. I don't know what other people are seeing in terms of spans per trace. We were at 100, because our query-path traces are much larger, but we started to sample a larger and larger percentage of our write path, which is smaller traces, so our spans per trace is going down.
D: Are people seeing... I don't know, I really don't know what the expectations are here; I've only done tracing at Grafana. So what can people share? Do you know how many spans per trace you have on average in other places?

G: Ours varies quite a bit. In some cases we have like three or four spans per trace, but then we've had traces that were probably... like, 80 is probably closer to the average, especially when you're instrumenting all the data store calls; but we've had traces as large as 1,700 on kind of longer-running things that do a whole lot of data store queries.
D: Right. So, let's see, our average is 50, but our largest traces hit several hundred thousand. In fact, if you look at that thing I linked, the spans per second, you see some yellow "refused" spikes. Those are spikes because, generally, Loki is crossing the 350,000-span border: we have a limit that puts a max on spans per trace, and it's always Loki. So those few spikes are somebody running a huge query in Loki, basically, but yeah.

D: Sometimes I wonder at what point it becomes useless to have a trace, at least to just look at it in Grafana: at what size does it not even matter that you have this whole trace? But I also think, for some of the things Luna was talking about earlier in terms of service graph...
D: ...building, having that data is valuable anyway, to do kind of analysis kind of work; maybe not staring at a couple-hundred-thousand-span trace, but just using it as data for some other job, metrics or whatever. We are close to time. I was going to talk about our roadmap to GA, but let's not get into that: 0.5.0 is released, 0.6.0 is coming up, and it will have really good TCO improvements.

D: Cool, okay. Well, thank you everyone for coming and showing up. I wish Luna didn't have to run; I wanted to thank her for getting involved and sharing her work at Embark, and hopefully I'll meet those engineers soon. Thank everyone else for also joining, and we will see you in a month, hopefully.