From YouTube: 2022-07-14 meeting
Description
OpenTelemetry Prometheus WG
A: All right, I guess we can go ahead and get started. Welcome back, everybody! We are, I guess, on a roll, on meeting five at this point. For those, if it's your first time: this is the OTel profiling group. We've been meeting to talk about creating an official format for profiling events in OTel, and we've met several times now. The past couple weeks we've been focusing on hearing from companies who are using custom formats, getting a better idea of what is custom about their format and why they're using custom formats, the idea being that hopefully we will come to some common denominator, some standardized format, that will be beneficial for the, I guess, collective group of people using profiling data. With all that being said, on the agenda for today: it's a little bit shorter of an agenda, potentially, so if we don't need all the time, we won't take it. But we have Pixie. We talked a little bit a couple weeks ago about their custom format, and they mentioned that they wanted to talk a little bit more about it, or I guess go a little deeper into it. And then the biggest thing that we haven't really discussed, or haven't discussed what the actionable steps are for, is on the benchmarking side. Over the past couple weeks, people have mentioned two of the biggest areas that we need: we have a lot of qualitative data about these custom formats, why people use them and how they're using them, but we don't have any quantitative data on just how feasible each might be as a candidate, or as part of a future format. So, I guess there's still a couple people trickling in; I'll re-paste the notes. Anyway, for those who just joined: I basically just went over what we've talked about the past couple weeks and what the plan is for today. So we'll talk about Pixie first, then benchmarking stuff, and then we can talk about, I guess, meeting cadence stuff; I added that in here just to get people's opinions. Yeah, so I guess, to start off.
C: Yeah, no problem. Just one moment while I pull the slide deck up and share the screen.

A: Cool, sounds good. Yeah, while he does that: there's a link to a benchmarking proposal that we will talk about after, if you all want to check that out. And also, I should add, if anyone else has anything they want to add to the agenda, feel free to do so. But with that, it looks like Pete is ready. Go for it.
C: Thank you. All right, let's dive right in. What we want to talk about is, to a large degree, describing what we have for the Pixie eBPF-based profiler, and this block diagram is supposed to communicate what we've done in terms of profiling and what sits on the node where the profiling is happening. So we use eBPF; I think there are other technologies, or other projects out there, doing exactly this, for very good reasons.

C: The eBPF component is triggered every 11 milliseconds; that's just a choice. As I think many will know, it collects virtual addresses, just the stack trace of virtual addresses, and it stores them in eBPF tables.

C: The next component for Pixie is a symbolizer, which we chose to aggregate every 30 seconds here, and we do symbols locally. I'd actually offer that this may be reasonably common, to do symbols on the target host, for various reasons. And then we have a local data store, a local table, that we push our symbolized stack traces into, and we only communicate data out of this over the network.
C: If there's a user-generated query that hits our cloud component, it sends the query to all the nodes being profiled, whatever got requested, and then we send out profiles at that point in time. I'm flagging this because I know it's a little bit of a divergence from what I think may be the assumptions going into the OTel context.

C: That being said, we can flag a few other choices here; there are other places at which I think people have chosen to send the data out. You could send the data directly out from the BPF tables over the network, or you could symbolize and then send the data out. Those are other choices we've discussed in this group. All right, moving on: right here is a table describing our schema.
C: This is the schema going into the local table store that I showed on the previous diagram, where we aggregate data locally before it gets shipped out, if a user generates a query. And this schema is both the table schema and it directly translates into the network protocol through protobuf. I'm not going to dive into how the protobuf component works, but it's both table and network.

C: So it's very minimal. We have a timestamp; our timestamp happens to just be assigned when we do the data aggregation. We have a thing we call a UPID. The UPID is meant to uniquely identify a process across clusters and across PID reuse, because there are only 32 bits of PID in Linux. So when we grab PIDs, we grab a timestamp, which is the process start timestamp, so this is enough to uniquely identify that process.
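The PID-reuse problem described here can be sketched in a few lines; the field names below are assumptions for illustration, not Pixie's actual definitions:

```python
from dataclasses import dataclass

# Sketch of the UPID idea: a PID alone can be reused by the kernel, so
# pairing it with the process start time (plus a cluster/agent component)
# makes the key effectively unique. Field names are hypothetical.
@dataclass(frozen=True)
class UPID:
    asid: int         # agent/cluster-unique short ID (assumed component)
    pid: int          # 32-bit Linux PID, reusable on its own
    start_ticks: int  # process start time, disambiguates PID reuse

old = UPID(asid=1, pid=4242, start_ticks=1000)
new = UPID(asid=1, pid=4242, start_ticks=2000)  # same PID, new process
assert old != new  # PID reuse no longer causes a collision
```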
C: There's also a stack-trace ID. Essentially, it identifies a particular stack trace, but we also might reassign a different ID to the same stack trace, because we're not keeping infinite history. It's convenient because, when further aggregations are performed, you can group by this value first, and it's just a little bit more efficient that way. The stack trace itself is sent as a string.
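A minimal sketch of the record shape just described (timestamp, UPID, locally assigned stack-trace ID, symbolized stack as a string); the exact field names are assumptions, not Pixie's protobuf definitions:

```python
from collections import Counter

# Each row carries a timestamp, a UPID, a locally assigned stack-trace ID,
# and the symbolized stack trace as a semicolon-joined string (assumed shape).
rows = [
    {"ts": 1, "upid": (1, 4242, 1000), "stack_id": 7, "stack": "main;foo;bar"},
    {"ts": 2, "upid": (1, 4242, 1000), "stack_id": 7, "stack": "main;foo;bar"},
    {"ts": 2, "upid": (1, 4242, 1000), "stack_id": 9, "stack": "main;baz"},
]

# Grouping by the small integer stack_id first, as mentioned above, avoids
# repeatedly hashing/comparing long stack strings during aggregation.
counts = Counter(r["stack_id"] for r in rows)
assert counts[7] == 2 and counts[9] == 1
```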
C: Okay, good to go. Next up: this is really most of what we had. We are aware of optimizations that we could pursue. Yes, do we want to let Alexa ask? Alexa, should you go now, or do you want to save it?

D: I can go now, go for it. On the previous slide: do you have file names? The example stack trace only has function names; I wonder if you capture file names too.
C: Thank you. We capture the so-called comm name, which is available in BPF; it's the truncated version of the command line. But we also have in Pixie a system that collects metadata about UPIDs, and I think that's sort of out of band. I'm sorry I didn't dive into that here, but the Pixie setup has this UPID concept, and on the back end it's going to know a lot more about the UPID.

D: Sorry, I meant in the stack trace: do you have the source file name in the stack trace? Because it only has, like, foo, bar, baz. And also: for Java, is it symbolized on the host? Because you have addresses and function names, I wonder when things are symbolized on the host versus offline.
C: We are always symbolizing on the host; if it doesn't get symbolized on the host, we're not going to do anything further. So for native binaries we're using the BCC symbolizer, and for Java we're using a Java agent that is basically like perf-map-agent, for people familiar with that. Yeah, so...
D: Yeah, this is kind of diverging a bit too much, but I wonder: what's your experience with the perf-map agent? Because my impression was that it can grow the JIT mapping file unbounded, and so it's not very suitable for continuously running servers.

A: Okay, I think Florian had a question as well.
B: Yeah, thanks. Thank you for the presentation. How do you solve the problem if you want to compare the same process across different nodes? Because, if I interpret the UPID correctly, every node will generate a different UPID even if it's the same process. And if you're using the comm, a process can change or have multiple different comms. Do you address this in the back end in some way?

C: That being said, I don't know if anyone has tried to do exactly that. I'm imagining you're saying something like: I know that I have a fleet of servers running some program, call it program foo, and I want to collapse all foo profiles into one sort of aggregated foo profile. Is that the idea?
E: I'll just jump in here. So yeah, as Pete kind of mentioned, we have metadata. Pixie was designed explicitly for Kubernetes, and so we have rich metadata on what these UPIDs can be mapped to: what pod they belong to, what deployments they belong to. And we have a scripting language called PxL where you can do these sorts of aggregations. Now, as Peter said, we haven't really seen this use case much, but it's possible, pretty much just through the PxL side of things.

B: For what it's worth, the use case is usually that you've got an infrastructure team that has deployed profiling, and then you've got a team that works on a particular service that would like to get profiles for that particular service.

A: Well, I think that's all the questions. If you want to proceed, Pete, we can ask more at the end if we have any, or as they come up.
C: Okay. So we are aware that we could do more with stack-trace IDs. We did not go there because it involves more complexity, and we have a slightly more self-contained use case; in particular, we're not as worried about the network-bandwidth side of things, with the setup where the user query triggers the data transfer.

C: It could maybe also save some memory and storage on the leaf hosts, so this is something we might pursue. We haven't committed to it yet, but it's interesting to us for sure. We'd also like to mention, for the group's consideration:

C: We think it would be interesting to also discuss, or possibly shift the discussion towards, what the OTel API might look like, and try to orthogonalize that a little bit from the wire format. And then, even if we don't land on a minimal feature set, there may be some utility to identifying what component of this is the baseline feature set, the minimal feature set that hits that 80% use case, and then what are the advanced or more sophisticated features. I'm not sure what the right phrasing is, but hopefully that makes sense. I think it would be very useful at a certain point to try to chart that territory a little bit.

C: And that's really all we had. So, yep.
B: Yeah, thanks again. Sorry for a second question: I think one of the first slides was about symbolization, and you do symbolization on the services, or the remote, itself.

B: Do you have an overhead number for this? Because I think the very same stack will be symbolized on every node, and this will sum up over a fleet of components. I'm wondering if some external service could be used, like debuginfod does, where you can request symbols for an address, for example.
C: There is overhead to doing the symbolization locally, and we have measured it. The insight we have come to (I'll go back to the slide here) is that the eBPF profiler is very, very lightweight and has very little impact in terms of CPU use.

C: The symbolizer is the main cost in terms of CPU for this setup, and I want to say it's something like one percent CPU: our standard test setup shows the symbolizer costing about one percent.
C: We have improved on the BCC symbolizer to a certain extent. The one other optimization we're aware of, and kind of have on the shelf right now, is trying to make the BCC symbolizer yet more efficient and submit a couple of changes back. So we understand that cost. But we also recognize, especially in the context of OTel, where you don't have full control over the ultimate deployment setup, that this style of profiling is a lot more self-contained: you don't require the user of the profiling to have coordinated, somehow, what's getting deployed on the leaf nodes with what's in the back end, and there's a lot less synchronization between the two if you do symbolization this way. So, in terms of ongoing efforts, minimizing the cost of symbolization is always something we're keeping an eye on.
G: I just want to add a thought on the overhead. I think that customers, people who deploy profiling, can often find much more than one percent to save, so to them one percent is a fair bargain. Of course, if you can get to 0.1 or something even lower, that's great, but in our experience customers have far higher overhead from other observability, like tracing or logging, so the one percent for profiling is probably acceptable to most. But of course that doesn't mean we have to aim for one percent; getting lower is, of course, good.
D: Just to share a perspective from Google scale: that would never fly at Google, because we stay at around two percent when profiling is enabled, but then at large scale it's amortized by using a constant sampling rate across all the nodes. Just to give you perspective: people get promoted for saving 0.1 or 0.2 percent of the fleet, because at large deployments, if you take one percent of a compute bill, a Google Compute Engine bill for example, for a large deployment, that's a lot of money.

C: Absolutely. I mean, I assume Google has already got their own fleet-wide profiling thing, and it's so cool to learn about what's offered and stuff like that.
A: Cool. Does anybody have any other thoughts or questions on that? This might be a good segue into benchmarking those exact types of metrics, unless anyone has anything else they want to add there.

A: All right, well, awesome. Thank you, Pete and Omid. We've heard a bunch of different formats at this point, and we appreciate both you all, and everybody else who's taken the time, for preparing some material and sharing about their format. I definitely think we have a lot more context now, as we start to move forward, for finding, like you said, some common minimal feature set that hits 80%, or some...
A: ...high percentage of everybody's use case. I think we're in a much better position to figure that out at this point. So I linked a benchmarking proposal in here. Dimitri, do you want to share your screen and walk through the proposal briefly? And honestly, I guess (and feel free, if others think differently, to chime in), I feel like at this point the two areas we've heard mentioned the most are network overhead, and then the CPU overhead of converting some format A to some format B. I think both of those are things we should start to think about how we can measure, just so that we can get a spectrum: what is the most expensive way to profile something in both of those dimensions, what is the least expensive way, and where do different profile types and profile formats fall along that spectrum? So, I'm curious if anybody has anything else they feel might be worth discussing before benchmarking; feel free to speak now, or after Dimitri talks about this proposal.
E: Yeah, so let me talk about the proposal real quick, and then we can discuss. From these conversations that we've been having in the group, it seems that there's consensus that the performance characteristics of the format we come up with, particularly the size over the network, are important for most people. We have all these proposals for doing things one way in the format, or doing things another way, and we were thinking it would be nice to have some way of making decisions, of picking one route over the other. So the solution we're proposing is this benchmarking suite that we could collectively put together and use to make decisions about certain features of the format. The overall idea is: we would collect a bunch of test cases, meaning a lot of different profiles of various kinds, and then we would have a reference encoder implementation that would encode the data into the format we're considering. We would also have some tool that would generate reports showing, for instance, how long the resulting message that the encoder generates is.
E: How much CPU did it use, maybe how much memory it used, things like that. With this, I'm hoping we could make decisions more efficiently. For example, if somebody had a proposal, maybe they could open a pull request against this reference implementation and show: look, if we do things this way, the messages are smaller and everything is encoded faster; or, if you do something this other way, for this particular type of profile, something else happens. Hopefully you get the idea. There are more details in the doc; I won't go too deep into them, but I will talk about a couple of things.
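The measurement loop being described (encode each test profile, record output size and CPU time) could be sketched roughly as below. The `encode` function is a stand-in for a candidate encoder, not anything from the actual proposal:

```python
import json
import time
import zlib

# Hypothetical candidate encoder under test: here we just JSON-encode and
# compress a profile dict. A real suite would plug in the reference
# implementation of the proposed format instead.
def encode(profile: dict) -> bytes:
    return zlib.compress(json.dumps(profile).encode())

def benchmark(profiles: list) -> list:
    """Report encoded size and CPU time per test case."""
    reports = []
    for i, profile in enumerate(profiles):
        start = time.process_time()  # CPU time, not wall clock
        payload = encode(profile)
        cpu_s = time.process_time() - start
        reports.append({"case": i, "bytes": len(payload), "cpu_s": cpu_s})
    return reports

reports = benchmark([{"stacks": {"main;foo;bar": 10, "main;baz": 3}}])
assert reports[0]["bytes"] > 0 and reports[0]["cpu_s"] >= 0
```

Comparing two proposals then reduces to running the same test cases through both encoders and diffing the reports.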
E: One thing I think is important is what we are measuring, and we're proposing two things. One is the message size, as I mentioned already, and the other would be the performance overhead of converting to this format. We can talk about it more, and I think there were already some conversations in Slack about this.

E: I was thinking we could have different runtimes, different languages, different types of profiles: maybe very wide profiles, profiles with a lot of nodes, profiles with very long symbol names, things like that. Maybe profiles with a lot of labels, which is something that comes up when you correlate profiles with traces. And one more thing I'll focus on is the intermediate data format.
E: We figured we all probably have a lot of examples of these profiles, but I imagine a lot of people wouldn't be super comfortable just sharing those, because there might be some private data in them. So we're proposing to create a tool that would take data in various formats and convert it to an intermediate data format, and we propose using pprof for that; I'll talk about that in a second. In addition to using this intermediate format, the tool would also anonymize, or maybe obfuscate, the data, so that it would be safe to share with everybody, with pprof as the intermediary.

E: All of this is up for discussion. We chose pprof because it's one format that supports labels, and we think that's somewhat important. But obviously there are probably a lot of different ways we could implement this, so I would love to hear other ideas as well. All right, and maybe the last thing I'll say, to wrap up:
E: We propose to create a centralized repository where we would collect all these different profiles, and to create two tools. One would convert from various data formats into this intermediate format and add the anonymization/obfuscation; the other would be the reference implementation, which would also generate reports on how well the reference implementation performs. Yeah, I think that's it for the proposal.
B: Sounds sensible. I think the question of whether we can use pprof as an intermediate format will only be resolved the moment we actually try to do it. It sounds plausible, but I'm sure somebody will bump into a sharp corner somewhere; then we can deal with that when we get there.
F: ...you know, all of this. And I guess what I keep coming back to is: there are a couple of dimensions that designs for a protocol, or a data model, or a data format pivot around, or are evaluated on. One is stateful versus stateless; another is push versus pull. And tied up in all of that is just: how do we think about this? What I would love to see is, if we consider what the logical, global, for-all-time unique key for a binary is, the one from which everything else derives (symbolic info, source info, the actual profile samples, and all of that), whether that could all fit into a superset data model. Right now we're talking about sample-based profiling, but keep in mind that execution tracing and other forms of profiling will be coming in the future, in addition to sample predicates that aren't merely timer-based.
F: If we have a data model for that, then at the point of collection we can say: we actually don't need all this stuff, so we don't necessarily need to make it all part of what traverses the wire. But we do need it to be eventually consistent: when a developer is looking at profiles, that's the point where we need the aggregate, consolidated data. So if we were to cleave our design, and really be careful about keeping the logical data model globally manageable in that way, where it's not specific to one host (it's just, you know, a GUID plus stuff from which you can make a globally unique key), then benchmarking the protocol really needs to be componentized and thought of in terms of, well...
F: ...what are we actually measuring? Which pieces of this data are traversing the wire? Because different people are going to need different ends of that spectrum: push versus pull, stateful versus stateless. So I think how we benchmark will obviously evolve, and what formats we use will evolve. But if the framework it's built around is inclusive, explicitly acknowledges this, and has this agreeable superset for a logical model, then we can actually have useful comparisons between different implementations, or different design choices for how to move that data.
E: Yeah, maybe to comment on that a little bit: if I understand you correctly, the way we're proposing to address that is to have a lot of different test cases, so that when you run this benchmark suite you would be able to see: oh, it performs really well with these, but maybe it doesn't perform well with some other profile examples. And maybe that means something; maybe it means we need to re-engineer it a little bit in some way.
F: Events were the first thing that popped into my head. If you look at what tensor's doing and what Knative's doing: Knative events are CloudEvents, they're the same format. I know OTLP has its own event format, and I haven't looked at the compatibility. But in terms of how this data is going to be used: maybe somebody wants to stream it over any number of transports, Kafka or other queueing stuff, and open standards already exist to just move stuff.

F: So I'd love to see benchmarking where (I forget who said it) before we go engineer something specific, we use what OTLP has already and see; that is a baseline. Maybe something like CloudEvents, or something like it, might be another useful baseline, just so that it's first about the raw data and then about compression.
A: ...and all the other stuff, yeah. We'll go to Alexa, who has his hand up, next. But I guess one thing I would also throw in here, which Tigron mentioned before, is that I suspect the sort of generic OTel events are not going to be the most efficient format to send profiling data. I suspect, but regardless, I do think that we will likely have to measure different types of overhead differently.

A: So, for example, we'll likely have to measure different types of overhead differently, but whatever we do, I think it would be good to start with using the OTel events, just so that we at least have the argument: if we have a full spectrum, and we say the generic OTel events are the worst end of the spectrum, then at least we know that whatever we come up with would be marginally better than that, and ideally much better. I just wanted to throw that in there. Also, as a TC member, I guess it would make sense to get the TC on board with the idea of a new format at all, let alone which specific new format we choose. Alexa, you had your hand up next.
D: One thing is: I think the document is a great start, and I think benchmarking is hard, but some benchmarking is much better than no benchmarking. So I totally applaud the effort, and I think we should start simple.

D: The actual question is: do we also want some kind of mini-spec for what kind of reports we want to produce? Like, I don't know, aggregation by functions, or querying cumulative, exclusive, and inclusive time per function? Or do we only plan to measure the conversion?
E: Right. So, right now this benchmarking is only about the conversion process. When you were talking about aggregating functions and things, just so that I understand: are you talking about measuring how long it takes to then parse that and get some data back? Or what do you mean by that?

E: At the end of the day, there will be some other component that you don't change, and it gives you results in the same format each time. But I think, if we're talking about querying the format, then you are writing both the implementation and the test cases that read it to confirm it's working, and I guess my point there is that you can make mistakes in the way you test it as well, and so you might still be reading nothing.
A: Yeah, I think that's a pretty fair point. We can add that to the proposal. Also, feel free, everyone, to add comments, or sections, or ideas to the proposal; we hope for it to be a collective effort as well. But yeah, Felix, you had something you wanted to add.
G: Yes, so, thanks for the proposal. The one question I have specific to the proposal is: what about timestamps? If pprof is used as the input format, you would only get a timestamp per profile, and my understanding is that at least several people here are talking about potentially sending a timestamp with every stack trace they're collecting, so that could be a problem. And my second point is more meta: are we ready to start talking about benchmarking?

G: Have we sufficiently defined the goals? For users: what would users of an OTel profiling track get out of this? And also for vendors: we all have existing tech, so what do we get out of either changing our stuff or making it accessible to everybody and reusable? So it might be a little early, but I think it's cool to do this in parallel, as long as we have those other conversations that I think we still need to get into as well.
E: Yeah, good point on the timestamps. One way we could maybe hack it together is to use pprof labels: if we want an example of a profile with timestamps included, we could add labels, and the label would be, you know, timestamp=x, timestamp=y. Although, maybe that's not the best idea, because then how do you distinguish between these timestamps and other labels? So that's one flaw of pprof.

E: So I suppose, for the intermediate format: if people have other ideas for the intermediate format, I would love to hear them.
H: Hello, I have a comment and, actually, a question on the proposal. The comment is: I don't think we should focus on the querying side now, because we still have to define the format, so it's probably better if we focus on the encoding and decoding part. That's about what Alexa suggested, that we also want benchmark data about the performance of retrieving the data.

H: Instead of querying, that is, retrieving the data at query time once it's stored in the backend. And my question is: why did you suggest pprof? Similarly to what Felix said: why not maybe a time-series-like format of all the stacks, which is much better in my opinion? Because what we actually want to measure is the performance of the profiler translating, you know, stack traces into a network payload that goes out to a backend, in our push model.

H: So what we really want to do is parse text, probably multiple samples of this text accompanied by a timestamp, and then create payloads that go onto the network. So the reporting that we should have is how many CPU cycles we spend on parsing and building the encoded version of the stack trace, and then how heavy the payload that goes onto the network is.
H: So, for this reason, rather than pprof for the stacks, I think the flame-graph format is probably more suited to this type of measurement, especially if we take the single flame-graph line for a full stack and put a series of them in, I don't know, JSON, or binary, or text, whatever.
E: Yeah. So, if I understand correctly, you're basically saying: let's take the folded stacks, where each line is a stack trace and then a number, and expand that format to have timestamps and labels included, and use that. Is that fair? Yeah.
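For reference, the classic folded (collapsed) stack format is one line per stack: semicolon-separated frames followed by a count. The extension being discussed might look something like the sketch below; the exact field layout here is invented for illustration, not something the group agreed on:

```python
# Classic folded stacks: "frame1;frame2;frame3 <count>"
classic = "main;foo;bar 42"

# Hypothetical extension: prepend a timestamp and append key=value labels,
# so each sample (rather than each profile) carries its own metadata.
extended = "1657800000123 main;foo;bar 1 trace_id=abc123 pod=checkout-7f"

def parse_extended(line: str) -> dict:
    ts, stack, count, *labels = line.split(" ")
    return {
        "ts": int(ts),
        "frames": stack.split(";"),
        "count": int(count),
        "labels": dict(kv.split("=", 1) for kv in labels),
    }

sample = parse_extended(extended)
assert sample["frames"] == ["main", "foo", "bar"]
assert sample["labels"]["pod"] == "checkout-7f"
```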
H: ...this is relevant for the rest of the group. Yeah, that sounds like a better measurement of the type of workload we want to measure the performance of.

A: Yeah. Alexa, you had your hand up as well.
D: Me again, yeah. It was that the format would need to be extended to include things like file names; it should also be able to represent inline frames, things like that.

D: I just want to make sure that we don't lose some of the data. It will basically need to be, in a sense, a union of the features that different formats support. But hopefully that's not... I think it will end up being fairly different from the simple look of folded stacks, but I agree that it's having something closer to what actually happens at collection time in an agent.
A: Okay, cool, thanks for that. We'll do Matt, and then I kind of want to go back to something Felix said (or is Felix still here? Oh yeah, there he is) about what this group collectively thinks is the next most important thing. So, Matt, you had something you wanted to add?

F: Yeah, just another couple of things; I'll be brief. You know, when thinking about a format for call stacks:
F
If
again,
if
we
have
that
logical
model
kind
of
well
thought
out,
then
a
call
stack,
isn't
really
a
string.
It's
a
collection,
you
know
it's
a
collection
of
frames
and
differing
levels
of
debug
versus
you
know.
Minified
binaries
might
actually
have
you
know,
arguments
in
a
debug
binary
version
of
the
call
stack
and
no
arguments
or
type
info,
or
you
know
all
manner
of
other
stuff.
That's
in
the
call
stack,
and
so
you
know,
if
again
we
think
about
like
that
sort
of
eventually
consistent
global
model.
F
If we're not thinking about it like every sample has this call stack and it's a string and we always have to shovel the stuff around... if we actually break it down into a logical model, then you could have optimizations and compressions that are, like, you know, kind of tries on steroids, right? Because the duplication across call stacks coming from different binaries, for example, is massive, right.
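The "tries on steroids" idea can be sketched in a few lines: intern each frame once per (frame, parent) pair, so call stacks that share a prefix share storage and a sample only needs to reference a leaf node. The names and structure below are illustrative, not a proposed design.

```python
# Minimal sketch of storing call stacks in a trie so shared prefixes
# across samples (and binaries) are stored once instead of per string.

class StackTrie:
    def __init__(self):
        self.nodes = [("<root>", -1)]  # (frame, parent index); node 0 is the root
        self.index = {}                # (frame, parent index) -> node id
        self.counts = {}               # leaf node id -> accumulated sample count

    def add(self, frames, count=1):
        """Insert a root-first list of frames; return the leaf node id."""
        parent = 0
        for frame in frames:
            key = (frame, parent)
            if key not in self.index:
                self.index[key] = len(self.nodes)
                self.nodes.append(key)
            parent = self.index[key]
        self.counts[parent] = self.counts.get(parent, 0) + count
        return parent

trie = StackTrie()
a = trie.add(["main", "handler", "parse"], 3)
b = trie.add(["main", "handler", "render"], 2)
# "main" and "handler" are stored once and shared by both stacks:
# five nodes total (root + 4 frames) instead of six frame strings.
```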
A
Yeah, well, we can definitely add that to the doc as well. Yeah, that's a great point. And then, yeah, I kind of wanted to go back to what Felix said, along the lines of, you know: are we ready for a benchmark, you know, for a benchmarking phase? You know, I do think it's something that, I think, is worth, yeah, like, you know, kind of doing in parallel. I definitely don't think we should...
A
You know, sort of not talk about... you know, have more conversation about the goal of the format. I'm curious if others here have, you know, I guess, yeah, similar concerns that we still haven't sort of determined the value of a standardized OTel event for profiling.
A
I guess I'm kind of leading the question, but, yeah, I guess: is there anything else that people here think we should perhaps, you know... so, for the next meeting, for example. Or we can start with Felix: what do you think is the most, you know, pressing topic that we should talk about in the next meeting?
G
I guess there's a lot to think about, but I think the two categories to think about are: first, what is in it for the users? Like, if we pull this off and we do it really well, what do users get out of it, and especially, what do they get out of it that they don't get out of the current situation, where every profiling vendor has implemented profiling already? It exists, they can use it, and switching between vendors is not as hard as it is with tracing.
G
That's what I think is an important point. And then, secondly, what's in it for the vendors? What's each vendor's motivation to participate? Can we share code?
G
Can we share clients? Like, maybe this means we don't all have to implement our own profiler, but then which one would we use there? Because that component seems maybe even more interesting than the format and sending-the-data-off component. And I think that's something where I just want to get a feel for whether people are willing to go all the way through: like, one vendor has a really nice component here; are they willing to donate that to OTel in the long run, or...?
A
I like both of those. Anybody else have thoughts or responses to that?
A
All right, I guess, you know... I mean, I think that's fair.
A
Yeah, the other thing... so, yeah, I mean, yeah, we can definitely talk about that as well. I mean, yeah, I don't think that those are necessarily mutually exclusive. I think, assuming that we can all agree that collectively there will be some format that is, you know, better than, yeah, everybody doing things separately... which I hope we can eventually come to that conclusion, as has happened with other, you know, data formats as well.
A
I mean, I would be surprised if profiling is so unique that it's, you know, different from tracing or logs or metrics, which have also... I guess potentially debatably, but I would argue... have, you know, found some value in, you know, having a standardized format. So, yeah, we can definitely talk about that. The one other thing I wanted to ask is... yeah, I mean, I'm fine meeting every week.
D
On the weekly versus bi-weekly: I think bi-weekly should be fine if we can move more of the work offline, because I would expect, like, more discussions to happen on the documents between the meetings. And also, like, maybe exchanging... like, one thing: I'm not super good at following the chat, because... yeah, I know many people use the chat, I think it's pretty active, but I sometimes just miss notifications. But if more discussions would happen on the document, that's easier to subscribe to, and that's something that I could follow more accurately.
A
I mean, okay, yeah... I guess we're about to come up on time, unless anybody feels strongly against bi-weekly. It looks like many are in favor of bi-weekly. Because, yeah, also, I feel like by the time, you know, I've been summarizing the meeting notes, it's, like, you know, Monday, and there's only, like, two days until the next meeting. And I do think we could start to transition to having more of this conversation offline and such. So I will talk with Morgan about getting the calendar invite changed, and hopefully we can...
A
You know, yeah... still, that way we can, like, plan out the meetings a little bit, have more time for more people to kind of catch up in between meetings, and, yeah, hopefully still make the same amount of progress, but save you all an extra hour of your week each week... or, I guess, each other week.
D
One thing... I know we're out of time, but one thing I was wondering, for additional topics and things to do, is if we should put some kind of, like, requirements section in one of the docs. Because I think we discussed many things, such as, like, timestamps or no timestamps, native versus managed, incremental, like stateful versus stateless, but I'm not sure we have anywhere kind of, like, a concise section of requirements... the kind of requirements from everything we heard.
A
Okay, noted. I will look into that. Other than that, we're a minute over. If there are any other thoughts or questions, feel free to throw them in Slack, or, I guess, even as a comment in the doc, if you'd prefer to use that. But, yeah, thanks for coming again, everyone, and have a good rest of your week.