From YouTube: 2022-06-23 meeting
Description
OpenTelemetry Profiling WG
A
All right, well, I think four minutes in, I feel like we can probably go ahead and get started. So, just to start off: welcome, slash welcome back, everybody. I was thinking we could recap what we've been talking about at the meetings so far and then get into some stuff for today. For those who don't know, or if you're watching this on YouTube: this group is meeting to talk about creating an event type for profiles that's supported by OTel. We've had two meetings so far; this is the third. At the first meeting we met with everybody to get an idea of what people wanted out of a standardized profiling format or profiling event, and to get a wide overview of what some of the goals might be.
A
For the overwhelming majority of people, the type of profiling we were interested in was one where we can connect profiles to other OTel signals. Another big goal was representing and transmitting profiles across native code and runtimes. A couple of people mentioned that one; we didn't dig too much into it yet, but it was definitely interesting to some people. And then there's being able to map between existing profiling formats and whatever new format we come up with, because there's already a decent amount of support for some profiling formats. So whatever format we ultimately choose, we identified it as a goal that it should play nicely with the existing formats, or be able to migrate easily, or something along those lines.
A
I guess let me pause there, in case I missed any. I put out here on the agenda that, after we've had a week to think about it, people can add any goals they think we didn't cover, or aspects of any of those. Oh yeah, Florian's here; he had mentioned in Slack something about network bandwidth. That's something we didn't explicitly talk about, but I suppose it falls under the bucket of the ability to do data-center, system-wide profiling. I don't know if you want to add anything there, Florian.
D
Yeah, sure. So the question is: if there is always-on profiling, system-wide and data-center-wide, it will generate some amount of data crossing the network, and depending on where everything is deployed, this can result in huge additional traffic, and therefore can cost customers additional network traffic. So I think this should also be a point for the existing or proposed formats: how expensive is the proposed, or decided-on, format on the wire?
E
I had a related question on Florian's point, which is: how do the OTel working groups typically think about this? Do they typically try to take into account the amount of data on the wire, or the amount of processing that might be involved in serializing or deserializing a format? How is that typically thought about? Because, as Florian says, in the world of continuous profiling these are potentially fairly impactful things.
F
Yeah, I can tell you about what we did for the OpenTelemetry protocol. The wire size of the format was obviously one of the points we focused on when we worked on OTLP, and I think it's also important, maybe even more important, for profiling, because you expect maybe an even higher volume of data. So it definitely is important.
F
When we were designing OTLP, we did multiple alternate designs and a deep comparison: benchmarking from the size perspective on the wire, uncompressed and compressed, and also from the perspective of the CPU required for marshaling and unmarshaling, all those things. So it very likely applies to profiling as well.
A
Yeah, I'm curious if that is publicly available somewhere, like in an issue or a PR or something, that we could look at to get an idea of how to set up a benchmark suite.
A
Okay, yeah. So, related to that: I also put a link in here to the logs data model. There's a file there that explains the logs data model, and that's where I was coming from for the overall structure of this, thinking about the goals and the types of fields that we care about.
A
All right. If anybody wants to add any, either throw them in the chat or we can talk about them later.
A
One of the things we also talked about last week was getting a bit of a taxonomy of the existing formats for profiling. That's obviously important as we start to think about this new format, even, as we just mentioned, for the benchmarks of the current formats. That gives us somewhat of a range once we actually get tangible numbers: whether whatever this new format is falls somewhere between, for example, pprof and JFR in terms of bandwidth, or whatever benchmarks we actually choose. So I guess that's a good segue. First, let's talk about the formats we had on here. I believe Alexey added some info here on pprof.
A
I think that's probably a good place to start, since when we polled everybody last week that was the most popular format people were using. We evaluated several different formats against a couple of those goals we mentioned. So I wonder if either Alexey or someone else wants to chime in here about pprof and how it stands up to the goals we've mentioned so far.
H
A little bit, and maybe Alexey can add stuff; you have more context on the history, and if I say something wrong you can correct it. Is my microphone working well? Yep, okay. So I pretty much tried to answer the questions on that doc, and I think the questions stem from the requirements you were talking about. The summary is that our main use case at Google, as I think was mentioned in maybe the first meeting, is statistical profiling, which is more of an aggregated view of profiles, rather than a timestamped profile that we can look into to see the change across time. The implication of that (and this was brought up in my previous conversations with Felix; I don't think Felix is here today) is that there's one pain point of this:
H
The pprof format doesn't carry a timestamp for each stack. It just says: here is this stack, and here are the aggregated values; multiple profiles all look like that. That is a bit of a disadvantage if we're working with timestamped profiling. And, Ryan, I think you pointed us to the custom profile format at Pyroscope, or something like it, that is trying to address that issue. If we're also trying to support something more general, like timestamped profiles, that would be the direction to go. Also, I don't think we have native support for coupling the profile data with, for example, tracing data.
H
That's my impression, at least. Another thing is the question about whether pprof is stateful. The data format alone just represents the profile, but if you're talking about the protocol, we have something where the server determines the frequency of the profiling. So a client tries to talk to the server, the server throttles it and lets the client know when it's time to collect the profile. I'm not sure, if OTLP is the name of the protocol, whether the OTel protocol has something similar to this. But if we want to make OTel generic enough to at least support Google Cloud, that is one requirement we probably should talk about as well. And finally, I wasn't super sure I understood the last question correctly:
H
Can pprof represent profiles generated across native code or runtimes? My interpretation is: is it able to represent enough info if the profile comes from native code, like perf, or from an interpreter stack on Java, and can those be mixed together? My answer is yes. It has enough fields to represent that kind of information, or it can just represent, for example, function names and lines, and that would be it. So that's the summary for pprof. I don't have a lot of historical context; when I joined the team, the pprof proto was already there. If the context matters, I think Alexey would be the better one to answer history or maybe benchmark-related questions.
B
Yeah, for benchmarks (can you hear me well?), not much to add. Overall, the pprof format tries to be compact. This is why, for example, string tables and string interning are used, even though many protos don't do that: if you take protobufs in the abstract, they don't do such extreme interning. It was a pprof decision to minimize size on the wire further.
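As a rough illustration of that interning, here is a minimal Go sketch of a pprof-style string table. The type and method names are invented; the one real convention kept is that index 0 is the empty string.

```go
package main

import "fmt"

// StringTable interns strings so repeated names (function names, file
// paths, label keys) are stored once and referenced by a small integer.
type StringTable struct {
	index   map[string]int64
	strings []string
}

func NewStringTable() *StringTable {
	// By pprof convention, index 0 of the string table is the empty string.
	return &StringTable{index: map[string]int64{"": 0}, strings: []string{""}}
}

// Intern returns the index for s, adding it to the table if needed.
func (t *StringTable) Intern(s string) int64 {
	if i, ok := t.index[s]; ok {
		return i
	}
	i := int64(len(t.strings))
	t.index[s] = i
	t.strings = append(t.strings, s)
	return i
}

func main() {
	t := NewStringTable()
	// The repeated name costs one table entry plus integer references.
	fmt.Println(t.Intern("main.handleRequest")) // 1
	fmt.Println(t.Intern("main.handleRequest")) // 1 again
}
```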
B
On the other hand, one thing that, over my time dealing with the pprof format, I always went back and forth on and wished maybe we could have done differently, is the stack representation. Currently in the pprof format the stack is basically a flat table, so each stack carries the full sequence of frames. It's not encoded as a tree, and that is one obvious place that can be improved.
B
For example, we have another internal format in a different tool, and there we encoded the stack as basically two arrays: one array holds the indexes of nodes, and another the indexes of parents. You can encode a tree in basically two arrays, and that is a more efficient representation; in pprof it's a flat table. On the other hand, having flat stacks is maybe more convenient for users. When you have a profiling agent, and you have those for many languages, like four or six languages, you want to optimize so that the agent code is as simple as possible, and a flat stack table usually maps well to how profiling agents capture the stack during runtime, because it would be a hassle to manage a tree
A
You're cutting out a little bit. Could you say that last part again? We heard, I think, "it's a hassle," and then you cut out.
B
Yeah, okay. It's a hassle for a profiling agent that runs within the application process. It's a hassle to manage, basically, a stack tree, because then you would need to deal with the lifetime of nodes if you want to evict a specific stack. I can put notes on this somewhere. One question for the formats overview could be: what are people unhappy with in the format that they use? There might be some interesting findings there.
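For contrast, a hedged Go sketch of the two-array tree encoding Alexey describes, where frames[i] and parent[i] together encode the call tree so shared stack prefixes are stored once. All names are illustrative, not from the internal tool.

```go
package main

import "fmt"

type key struct {
	parent int
	frame  string
}

// StackTree encodes a call tree as two parallel arrays: frames[i] is the
// frame at node i, parent[i] is the index of node i's parent (-1 for roots).
type StackTree struct {
	frames []string
	parent []int
	index  map[key]int // (parent, frame) -> existing node, for deduplication
}

func NewStackTree() *StackTree {
	return &StackTree{index: map[key]int{}}
}

// Add inserts a stack (outermost frame first) and returns the leaf node index.
func (t *StackTree) Add(stack []string) int {
	node := -1
	for _, f := range stack {
		k := key{parent: node, frame: f}
		i, ok := t.index[k]
		if !ok {
			i = len(t.frames)
			t.frames = append(t.frames, f)
			t.parent = append(t.parent, node)
			t.index[k] = i
		}
		node = i
	}
	return node
}

func main() {
	t := NewStackTree()
	a := t.Add([]string{"main", "serve", "encode"})
	b := t.Add([]string{"main", "serve", "decode"})
	fmt.Println(a, b)     // distinct leaves: 2 3
	fmt.Println(t.frames) // [main serve encode decode]; the prefix is shared
	fmt.Println(t.parent) // [-1 0 1 1]
}
```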
B
Can you hear me now? Is it better? I think so. Okay, so clients are supposed to aggregate the metrics per stack. For example, if it's CPU sampling, then CPU samples that happen at the same stack are supposed to be accumulated: the counts are accumulated by the client.
B
But I'm not sure I get the question. For statistical profiling, maybe it's because we don't timestamp data. Typically profiling is done for 10 seconds, and during those 10 seconds you get a call tree with associated sample counts. So yes, that data is aggregated, but maybe I'm missing the point of the question.
D
I think I did understand you correctly: you have a sampling rate, or time frames of 10 seconds, and you aggregate these stack traces on the client side before sending them out.
B
Profiling runs for 10 seconds, the sampling rate is 100 hertz, and the samples that occur within those 10 seconds are aggregated per call stack. Does that make sense?
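A small Go sketch of that client-side aggregation, assuming a 10-second collection at 100 Hz with counts accumulated per call stack. The types and keying are illustrative.

```go
package main

import "fmt"

// aggregate folds individual samples into per-stack counts, which is what
// a pprof-style client ships instead of ~1000 timestamped samples.
func aggregate(samples [][]string) map[string]int {
	counts := map[string]int{}
	for _, stack := range samples {
		k := fmt.Sprint(stack) // illustrative key; real formats use IDs
		counts[k]++
	}
	return counts
}

func main() {
	// 10 s * 100 Hz = 1000 samples collected, but usually far fewer
	// distinct stacks, so the aggregated payload is much smaller.
	samples := [][]string{
		{"main", "serve", "encode"},
		{"main", "serve", "encode"},
		{"main", "serve", "decode"},
	}
	fmt.Println(aggregate(samples))
}
```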
E
Sean here; my question is actually a little bit of a follow-up. What I was wondering about when Alexey and his colleague were describing their wire format is that it's a little difficult to evaluate the suitability of a wire format without knowing the context in which it's used. This touches on what Alexey was just saying.
E
I was wondering if, alongside the wire-format descriptions that we provide, it might also be worth describing some sort of architecture for how we currently use our agents, because I think what Alexey is describing is actually quite a bit different from what we do.
E
I think he's describing an architecture where they trigger collection for a certain amount of time, intermittently, on some subset of nodes. But the requirements you might have for that are quite different from if, for example, your assumption was to run all the time on all nodes, maybe at a lower frequency like 20 hertz or something, because that's what our use case is. Depending on which way you come at it, you might end up with a very different wire format being satisfactory.
B
I would say the pprof format is not designed with streaming in mind, and I think you're talking about something more like streaming, where you have ongoing profiling and clients send essentially deltas, explicitly relative to the previous state.
E
Yeah, to a degree. The main difference is that we run, say, continuously at 20 hertz on all nodes, all the time, and the constraints you have there end up being different than if you were doing it otherwise, perhaps less frequently. But my point really is: when we're describing the wire formats, I wonder whether it would be helpful to add some sort of description of how we're actually using them in production at the moment, because it will help contextualize the solutions.
C
We can definitely add that in, yeah. I think it's an important piece.
B
By the way, one thing just before I forget: I don't think we describe anywhere which wire formats OTel actually uses, for example, for protobufs.
B
So I wonder if we need to take that into account, because, for example, not in this format, but in some internal formats where we had to optimize for JSON encoding, we used structures of arrays extensively. Again, I don't know how deep we want to go here, but it would be nice to capture somewhere whether we want to optimize for any specific wire representations.
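As a hedged illustration of the structure-of-arrays point for JSON encoding (field names invented): the array-of-structs form repeats every key per sample, while the struct-of-arrays form pays for each key once.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Array-of-structs: every sample repeats the JSON keys.
type SampleAoS struct {
	StackID int64 `json:"stack_id"`
	Value   int64 `json:"value"`
}

// Struct-of-arrays: keys appear once, values are packed into arrays.
type SamplesSoA struct {
	StackIDs []int64 `json:"stack_ids"`
	Values   []int64 `json:"values"`
}

func main() {
	aos, _ := json.Marshal([]SampleAoS{{1, 10}, {2, 20}, {3, 30}})
	soa, _ := json.Marshal(SamplesSoA{[]int64{1, 2, 3}, []int64{10, 20, 30}})
	fmt.Println(len(aos), string(aos)) // keys repeated per element
	fmt.Println(len(soa), string(soa)) // keys amortized across all elements
}
```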
F
Yeah, the binary protobuf was the first encoding introduced in OTLP. JSON was proposed later; it is not stable yet, but it will likely become stable. Both are supported, and for transport we support both gRPC and HTTP. And to answer the other question, which I think was about throttling: yes, OTLP does do throttling.
F
The server can signal to the client that it is overloaded, and the client is supposed to follow a specific exponential back-off strategy. But it is generic throttling; it doesn't have any domain knowledge, unlike what was described earlier, where the server may know what profiling is and offer a specific profiling rate to the client.
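A minimal Go sketch of the generic back-off behavior described here. The base delay, cap, and jitter are illustrative choices, not values mandated by the OTLP spec.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoffDelays yields exponentially growing retry delays with jitter,
// the kind of generic strategy a client follows when the server signals
// that it is overloaded. All constants here are illustrative.
func backoffDelays(attempts int) []time.Duration {
	base, cap := 1*time.Second, 2*time.Minute
	var out []time.Duration
	d := base
	for i := 0; i < attempts; i++ {
		jitter := time.Duration(rand.Int63n(int64(d) / 2))
		out = append(out, d+jitter)
		if d *= 2; d > cap {
			d = cap
		}
	}
	return out
}

func main() {
	for _, d := range backoffDelays(5) {
		fmt.Println("retry after", d)
	}
}
```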
F
Gzip is pretty efficient at eliminating this duplication, which is what you actually gain by doing the dictionary encoding. It probably still is worth doing, but I'm curious to see some benchmarks which show how much you gain, particularly with compression, because you're very likely going to be doing compression when you send this data over the network.
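A quick Go sketch of the kind of measurement this suggests: gzip a payload with repeated stacks and a dictionary-style payload, then compare sizes. The data is purely illustrative.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// gzipSize returns the compressed size of b (errors ignored for brevity).
func gzipSize(b []byte) int {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	w.Write(b)
	w.Close()
	return buf.Len()
}

func main() {
	// Duplicated representation: the same stack repeated per sample.
	dup := []byte(strings.Repeat("main;serve;encode\n", 1000))
	// Dictionary-style representation: stack once, then small indexes.
	dict := []byte("main;serve;encode\n" + strings.Repeat("0\n", 1000))
	fmt.Println("raw:", len(dup), "vs", len(dict))
	fmt.Println("gzipped:", gzipSize(dup), "vs", gzipSize(dict))
}
```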
B
Yes, but compression assumes that at one point in time you have to hold in memory both the uncompressed and the compressed data set, and for in-process profiling this can have a memory-usage impact on the profiled application.
A
Okay, cool. We can follow up after this meeting with more, maybe more explanation of how OTLP currently does it and that kind of stuff, some documents there. In the meantime: Dmitri, do you want to talk a little bit about the custom format that is somewhat close to pprof, and give a little context there?
J
Yeah. So we've been internally working on a kind of variation of pprof that is slightly more optimized for correlating profiles with other OTel data, particularly traces, as well as optimizing the network bandwidth a little bit. Are we allowed to screen-share in this meeting?
J
All right, so let me show this visually. On the left we have pprof (this is a visualization of the proto file), and on the right we have the variation that we're internally working on. This is not currently used in production; it's still a work in progress. You can see that these are pretty similar: the mapping stuff is the same, location, line, a lot of these things are more or less the same.
J
What's different is the representation of samples. The problem we're seeing with pprof, when you correlate profiling data with traces, is that you end up with a lot of these Sample structures, where each one represents a stack trace, connects it to a set of labels, and has some sort of value. If you have a lot of different combinations of labels, you end up with a lot of duplication, because for a lot of label sets you'll have the same combinations of location IDs. I hope that makes sense. So we tried to normalize this a little bit: we moved these things into a separate stack-trace structure, and that allows us to de-duplicate these.
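A hedged Go sketch of that normalization (the type and field names are invented for illustration, not taken from the actual proto): samples reference a shared stack-trace table by index, so many label sets can point at one location sequence.

```go
package main

import "fmt"

// In a pprof-style Sample, each sample carries its full location ID list,
// so the same stack is repeated for every label combination. Normalizing
// moves the stack into a shared table referenced by index.
type StackTrace struct {
	LocationIDs []uint64
}

type Sample struct {
	StackTraceIndex int // points into the shared stack-trace table
	Labels          map[string]string
	Value           int64
}

type Profile struct {
	StackTraces []StackTrace
	Samples     []Sample
}

func main() {
	p := Profile{
		StackTraces: []StackTrace{{LocationIDs: []uint64{1, 2, 3}}},
		Samples: []Sample{
			// Two label sets share one stack: no duplicated locations.
			{StackTraceIndex: 0, Labels: map[string]string{"span_id": "a1"}, Value: 5},
			{StackTraceIndex: 0, Labels: map[string]string{"span_id": "b2"}, Value: 7},
		},
	}
	fmt.Println(len(p.StackTraces), "stack traces,", len(p.Samples), "samples")
}
```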
J
In addition to that, we are splitting out the global profile context. It's called "scope profiles" here; I think it's called that because it's somehow similar to naming in OTel, I'm not super sure. But the point is that we do this split to represent the profile, the holding structure that we send, and then separately represent all those individual profiles that are associated with traces. That's where the linkage happens.
J
It has things like trace ID and span ID, things like that. And the last thing I'll say is that we also started to incorporate other types that are defined in the OTel spec into this format. For example, we use the KeyValue structure that is already defined, to encode attributes, which I think are kind of the same as labels.
J
And I think we do a few other similar things like that. This concept of specifying whether a metric is cumulative or not, I think, is copied from a similar concept in the metrics spec. There's a doc that describes these in more detail; I'm happy to answer any questions people have so far.
J
Oh, you're talking about the time fields; I see what you're saying. Yes, it adds some nice things on top of pprof as well: start time and end time, also kind of global.
J
Actually, wait, let me think about that; maybe it doesn't do that yet. Another thing we talked about earlier is adding global labels to the whole profile. Maybe that's not here in this version.
K
John, sorry, I need to rename that; it should say Pete. I'll answer either way. Dmitri, I've got to say, those are really nice diagrams; thank you for taking the time to render them, very helpful. This comment is more about the thoughts I'm having as we dig into these details. One thought I realized we haven't discussed much: if you're just writing an agent, you may not want to comprehend the vast amount of thought that went into the normalization scheme within the protocol we're describing. What do we want the API into this thing to be, and what's the division of labor? Because if we're gzipping or hashing or doing something else, it starts to raise the bar quite a bit on what the person on the other end of this is going to do, and how they're going to interface with it.
K
Is the thing we're going to invent here responsible for taking something simpler, like "I'm just going to throw you a stack trace," and then some other piece of machinery that I don't own encodes it this way for me? Because that's kind of my secret hope, if this is where we want to go. And the other thought in my mind: I do think the convergence around scaled-up, data-center-wide profiling is very real, but I do see a few differentiations in how that's done. Some of the pprof discussion is "we're going to target something for 10 seconds and get that," and then there's some other version of things where we're always running profiling, maybe at a lower frequency, sending everything, and we're very efficient about how we send it.
K
Oh, sorry, final thought, and I think this might be something we just do: once we do this, we can do some back-of-the-envelope math on the bytes per packet and the packet rate we're going to send out, in some vague sense. There should be a very easy, back-of-the-envelope way to comprehend how much data is going to be sent over the wire.
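As an example of the back-of-the-envelope arithmetic Pete is asking for, here is a tiny Go sketch; every number in it is an assumption for illustration, not a measurement.

```go
package main

import "fmt"

func main() {
	// Assumed inputs; all illustrative, not measured.
	const (
		hosts          = 1000
		hertz          = 20 // samples per second per host
		bytesPerSample = 64 // average encoded size after dedup
		reportEverySec = 10 // batching window
	)
	bytesPerHostPerReport := hertz * reportEverySec * bytesPerSample
	totalBytesPerSec := hosts * hertz * bytesPerSample
	fmt.Printf("per host per report: %d bytes\n", bytesPerHostPerReport) // 12800
	fmt.Printf("fleet-wide: %d bytes/sec (~%.2f MB/s)\n",
		totalBytesPerSec, float64(totalBytesPerSec)/1e6) // ~1.28 MB/s
}
```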
A
Thank you; those are all great points. I added to the goals section that it should be reasonably easy to use this new format. Obviously we will have to define what "reasonably easy" is, but the idea is that it's not as much of a headache for those who have to work with it. All great thoughts. Jason, you had something you wanted to add?
L
Thanks, Ryan. Dmitri, thanks for showing that; I'll try to keep it short. I'm curious whether, in that format, you have anything that accounts for thread identity for multi-threaded languages, and whether that's important at all. And then I have a follow-up.
J
It doesn't have that out of the box. If you treat thread IDs as just another attribute or label, you could encode that information that way. I'm kind of curious, actually: the way our system works, we're never really interested in the specific thread ID of where something was running. So I wonder, why is this even important?
L
Yeah. If one of the goals of profiling is to track down where you have hotspots or problem points, those can often be associated with a problem thread. That's a place for a developer to go look: if I have the name of a thread, that can help me pinpoint it.
L
IDs or names, it's probably both, actually. I think the ID is probably more unique, but to a human user it's meaningless, right? So it's probably both.
B
It just seems to get into the land of debugging, because when I collect the data from hundreds of production machines, I don't know what the thread IDs are. Usually it's cattle, not pets.
L
Yep, respect that, okay. And then I noticed, in the linking with span and trace identity, that you included attributes there, and I'm wondering what the need for that was. Are those the same set of attributes that exist on the span itself, or a different set of attributes, just from the data model that you showed earlier?
A
Assume,
yes
matt,
you
have
no.
B
Sorry, I actually was muted. I saw that OTel labels, or the KeyValue structure, are used. Is that typed, or is it just string/string? Because pprof labels can have a numeric value and also have a unit field.
A
Matt, you had your hand up.
G
Yeah, hi everybody. I wasn't able to attend the first two meetings, so if this was already covered and it's not in the notes, apologies. In the past I've written profilers for embedded systems that use things other than timer interrupts as the predicate on which to collect a sample. For example, you might care about hot spots for TLB cache utilization, or other sorts of cache hit/miss counters, where every nth TLB miss you might want to generate an interrupt, something like that. So, in all of this discussion of protocol wire formats, are we making an implicit assumption, in sort of where we are now, that this is timer-based, or timer-interrupt-based, sampled profiling? Because if that's true, then all you need is the frequency or something like that.
G
But I would propose a more open or extensible definition of what the trigger for these samples is. That was one thing; I had two other responses to other things that were said, but on this topic, that was my primary rumination over the last week or so.
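A sketch of what a more extensible trigger description could look like, in Go. The field and constant names are entirely hypothetical and not from any existing spec: the profile declares which event drives sampling and at what period, instead of hard-coding a timer.

```go
package main

import "fmt"

// TriggerKind names the event that fires a sample. Timer interrupts are
// just one case; hardware counters like TLB or cache misses are others.
type TriggerKind string

const (
	TriggerTimer     TriggerKind = "timer"      // e.g. every 10 ms
	TriggerTLBMiss   TriggerKind = "tlb_miss"   // e.g. every Nth TLB miss
	TriggerCacheMiss TriggerKind = "cache_miss" // e.g. every Nth I-cache miss
)

// SampleTrigger describes the predicate: one sample per Period events of
// the given kind (nanoseconds for timers, an event count otherwise).
type SampleTrigger struct {
	Kind   TriggerKind
	Period int64
}

func main() {
	timer := SampleTrigger{Kind: TriggerTimer, Period: 10_000_000} // 100 Hz
	tlb := SampleTrigger{Kind: TriggerTLBMiss, Period: 10_000}     // every 10k misses
	fmt.Println(timer, tlb)
}
```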
G
Sure. For an example, and this is now 20 years ago: our team was porting some stuff from Win32's network stack down to Windows CE, and we were moving from an x86 CISC architecture to a MIPS RISC architecture, where suddenly pipelining and cache utilization are more predictive and determinate of performance than, say, CPU speed or where time was spent.
G
In the case of that networking code, one big pass through a TCP packet-processing loop would just obliterate the instruction cache on a MIPS, but not on an x86 that had larger I-caches. So when you're asking how come everything is working but it's 40% slower, you start caring about instruction-cache and data-cache utilization and hotness versus, say, time spent. So we took our sample-based, Monte Carlo-style, pprof-style profiler and just changed the predicate.
G
Instead of a timer interrupt, it's now generating a statistical profile of where you're blowing your I-cache. These kinds of profilers, I suspect, will be needed as we try to diagnose scenarios with lots of edge-computing nodes running non-x86 hardware, maybe on custom SoCs that have their own counters. So that's a concrete example of why I think it might be important to have a permissive, extensible way to describe what's generating these profiling reports.
G
What I wasn't sure about is: in the canon of all the modern profiling formats, is this already captured by any of them, or are they all pretty much implicitly timer-sample-based? On the matter of thread IDs: they can be really useful for debugging, but there are a lot of scenarios where thread pools might be in effect, or Go's use of its concurrency model, which doesn't always jibe exactly with the physical processors.
G
So I think thread IDs are useful, but in non-single-threaded scenarios, or scenarios where you have a work pool asynchronously processing things, sometimes they can cause a little bit of confusion. The last little thing, and I think I already heard it answered, was a minor finer point on labels: Dr. Sites's book Understanding Software Dynamics, which just came out, uses labels in a profiler at the kernel level with a base64 encoding.
G
So if we're willing to throw away a case and just have uppercase characters, then when we get down to benchmarking how many bits go on the wire, that can be a quick, easy way to encode labels that chops a bunch of storage out. But really, that first point was the one I'm most interested in hearing what people think about.
A
Thanks for adding that; that's a lot of information. I don't know that we've necessarily come to a decision on that yet. I think that's kind of what people were mentioning at the beginning of this call: it's something we still need to figure out.
A
Sounds good, yeah. Florian in the chat mentioned he double-plusses that as well, so we can definitely add that in. There was one more custom format, and we have a little bit of time, so I was going to ask Pete, if he's still here... oh yeah. So last week we talked about a couple of custom formats, and the Elastic profiler people mentioned theirs.
A
Just as a recap there: the idea is that those using custom formats obviously found something inconvenient, or something that could be improved over the existing formats, and we're trying to better understand that. Pixie uses a slightly custom format as well, so I was wondering, Pete, if you wanted to give us a quick overview of what that is in the time we have left.
K
Sure, all right. Thank you for the interest. I would describe our format as minimally viable. We went with a flat representation where all the stack traces go into a single table, with columns to represent the stack trace, and the schema for that table is directly translated into a proto format. That becomes the wire format, and it gets unpacked on the other side when we need it. Because it's flat, there is not as much efficiency in this. We basically put a timestamp; a stack-trace ID, which is loosely an integer that represents this particular stack trace; the stack trace itself as a string, which might have symbols or might have virtual addresses, depending on whether we were able to get the symbols populated for that stack trace; and then the count, which is essentially the frequency of that stack trace over the window of time in which it was collected.
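Based on Pete's description, one row of that flat table is roughly the following; this is a hedged reconstruction in Go, not Pixie's actual schema.

```go
package main

import "fmt"

// ProfileRow is one row of the flat stack-trace table described above:
// a timestamp, an integer ID for the stack, the stack rendered as a
// string (symbols or raw virtual addresses), and how often it was seen
// in the collection window. Field names are guesses from the talk.
type ProfileRow struct {
	TimestampNS  int64
	StackTraceID int64
	StackTrace   string // "main;serve;encode" or "0x4005f0;0x400720;..."
	Count        int64
}

func main() {
	row := ProfileRow{
		TimestampNS:  1_655_000_000_000_000_000,
		StackTraceID: 42,
		StackTrace:   "main;serve;encode",
		Count:        17, // times this stack was sampled in the window
	}
	fmt.Printf("%+v\n", row)
}
```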
K
So this is actually probably the simplest one I've heard so far. And, to add a little context on push versus pull: Pixie was originally conceived to not require customers or users to deploy additional resources.
K
So we constrain our memory usage, but we do store the profiles on the host where they are collected. You don't necessarily have a long history, but if you want to see a profile, you just ask the cloud component to go pull the profiles off the hosts, and it does so and renders them. In terms of data over the network, we're not always sending the data; we're just sending it on demand, so it wasn't really a network-bandwidth problem for us. I think the discussion around how to make this efficient is quite fascinating, and I've certainly seen a lot of really neat ideas out there.
M
This is Omid; I work with Pete. I think it might help: I'm a visual person, and I liked the diagram that was presented before. That really helps as we go through all the different implementations, look at their pros and cons, and go through this process. It would really help to have visualizations, and maybe we can put together just one picture for that purpose.
M
Yeah, Pixie did go with a very, very simple one, and we had plans to optimize it. Commenting on Pete's earlier point about the API: we were considering it more as just the API level of how we get it out to the rest of our infrastructure. But yeah, maybe we can just share something out, Pete.
A
Yeah, that would be great. I kind of suspected that this part of the conversation, about formats and such, would take some time, for sure. We'll definitely talk about how Elastic is doing stuff related to this next week, so maybe if you want to share that next week, that would be good in the meantime.
A
One more thing: I'm curious if anyone else has thoughts on what good next steps would be here. It sounds like one of the biggest themes is figuring out something around benchmarking: how we're ultimately going to evaluate these formats, and on what criteria. That seems to be a big theme as we continue to evaluate the existing formats. And then also some more granularity on the various goals that we've mentioned, and the context that matters as we approach those goals.
A
So, as I think Sean was mentioning: understanding not only the formats themselves but also the context in which they're most used is important stuff to add as we continue to evaluate the existing formats. Is there anything else anybody thinks should be top of mind currently? I'm also curious whether Tigran or Morgan have anything to add. But first: Matt, you have your hand raised.
G
Super brief. As we decide what happens next moving forward: there are some nice examples, for instance in the Prometheus data block format, where they have a scheme in which every block can be encoded with a different compression, or have a different encoding.
G
So, in talking about efficiency over the wire, this compression versus that compression, I personally would again favor sort of an open protocol that allows for a variety of different encodings or compressions or what have you. That might make this protocol more about what the data on the wire is, and leave some of the other scenario-specific concerns to be fulfilled within the protocol, but in a well-formed way, if that makes sense.
A
Yeah, that's a great point.
F
Yeah, just one comment: OTLP supports arbitrary structured events, right? These events can also represent profiling events, and it's possible to do that. It will likely be less efficient than a custom format designed specifically for profiling.
F
I think it's important to show how much worse it will be if you do this using generic OTLP.
F
All right, so what I was saying is that there is a way to represent these profiling events using OTLP. It would be great if you could show that whatever custom, profiling-specific format you're designing is significantly better; that would be a strong argument in favor of doing it. If you don't show that, it kind of weakens the argument for why you want a custom format and why we don't just use OTLP.
F
So I think we're out of time, so I'll stop here.
A
Yeah, that's awesome, thanks. We are out of time. If anybody else has any thoughts, feel free to throw them in the Slack channel. I'll try to organize these notes and then send them out with a recap.
I think we have a good start on next steps. Next week we'll talk a little more about the Elastic profiler, as well as potentially a more visual representation of Pixie. Thanks, everybody, for coming, and we'll hopefully see you all next week. See ya, thanks.