From YouTube: 2022-07-07 meeting
Description
OpenTelemetry Profiling WG
B: All right, looks like the flow has slowed down a little bit, so I guess we can go ahead and get started. First of all, welcome back everyone. I hope you had a nice bonus week to catch up, or relax, or whatever.

B: If this is your first time here: this is the fourth meeting to talk about profiling and adding profiling as a supported event type to OTel. Early on we mostly talked with a lot of different people who were involved with various facets of profiling about what their goals were, and now we're starting to think about what an ideal format would look like that can support the widest array of use cases.

B: We've been evaluating various formats from companies who are using custom formats, as well as public formats like pprof and, hopefully today, JFR, in order to better understand what the landscape looks like and what types of problems we can hopefully distill down — and to find one agreed-upon format, or a generally agreed-upon format, that will support the most use cases.

B: A couple of weeks ago we talked about some general high-level goals, about what we're ultimately trying to achieve. The main ones being: the ability to do data-center, system-wide profiling; the ability to connect profiles to other signals; representing profiles across native code and runtimes — I think we'll hear some more about that today; and then the ability to map existing formats to whatever format we ultimately decide is best. That's still a working list, and we're continuing to add to it.

C: All right, I've got a quick question. I saw the list of goals and I missed the last meeting, but one thing we could maybe add over time is the motivation for the goals: which of these goals are so that we, as vendors of profiling, can share technology and reuse solutions, and which of them are in the interest of users — what the motivating factor behind them is. That would be good to capture.

B: Okay, I will note that down.
B: Right, cool. Well, if you are not part of the Slack I'll paste it — oh yeah, also feel free to add yourself to the attendees list in the meeting notes; I will post it in the chat here. Also, if you're not in the Slack, I'll add a link to the Slack as well. But the Elastic / profiler folks — I keep saying "Elastic slash profiler", I don't know which name you would prefer — anyway, they created a doc about their custom format.

B: It's super detailed — thank you very much for adding that; it's also just cool to see as someone interested in profiling. I wanted to give you all a chance to perhaps summarize it. There's a link to the doc in the meeting notes for those who want to dig a little bit deeper, but perhaps, if someone from your side wants to summarize it, we can discuss some of the key points from your format.
D: Yeah, so I think the key decisions that we made in the format, which to some extent I would like to see in a future standardization, are the following. First of all, I'm actually a strong believer in not sending the entire stack traces, but sending a hash of the stack trace — and that's probably also the most controversial decision in the protocol, judging from the comments on the doc. The second thing that I found beneficial is trying to do columnar alignment of values in the format, versus a row-based or value-based alignment, just for better compression. And lastly, I'm partial to using protobuf, similar to pprof — it doesn't have to be protobuf, but I'm partial to using something to specify the protocol that can then be used to generate parsers for different languages, because we all know that everybody is working with heterogeneous languages on the back end, and protobuf, or something similar, has the benefit of being able to generate parsers for very different infrastructures.
B: Yeah, thanks. It seemed like, from the conversation both in the doc and in Slack, that the not-duplicating-stack-traces bit was the main point of discussion. Do you want to describe how you're doing that, the reasoning or motivation behind it, and how you got to the current state it's in?

D: We see stacks easily exceeding 128 or 200 frames, so you get a huge amount of data if you send out the entire stack frame each time. We noticed early on that if we want to stay within the envelope of performance and network bandwidth that we wanted to stay in, just sending all the frames all the time isn't really an option — at which point we decided: okay, we need a way to avoid sending the frames all the time.
D: So what we literally do is we hash the stack traces, and that hash forms the ID of the stack trace. You send out the stack trace the first time your client sees it, and on subsequent encounters you just send the hash and the count — how often you saw it. If we wanted to go crazy, you could squeeze more efficiency out of it by splitting the trace into the leaf function and the rest of the trace, because everything but the leaf tends to stay more constant over time.

D: So you could even get things to be more efficient if you wanted; we decided against that possibility. Long story short: we identify a stack trace by a hash of its individual components, and the first time a client encounters that trace it sends the entire trace to the back end; on subsequent encounters it only sends the hash. It doesn't have to remember hashes forever — if you just keep a local cache, that eventually gets filled and entries get replaced, and you'll send the same trace again at some later point, which isn't actually bad for resilience in case the first message got lost. So over time you converge to full knowledge of everything on the back end, even if you had intermittent faults elsewhere. The summary really is: take the stack trace, hash it, and then, whenever feasible, send the hash instead of the full trace.
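As a rough sketch of the hashing scheme described above — not Elastic's actual wire protocol; the Frame, FullTrace, TraceRef and Sender names, and the crude eviction policy, are invented here for illustration — the idea might look like this in Go:

    package tracededup

    import (
    	"crypto/sha256"
    	"encoding/binary"
    )

    // Frame is a single stack frame, reduced to opaque identifiers.
    type Frame struct {
    	FileID uint64
    	Addr   uint64
    }

    // FullTrace is sent the first time a trace is seen; TraceRef afterwards.
    type FullTrace struct {
    	Hash   uint64
    	Frames []Frame
    	Count  uint64
    }

    type TraceRef struct {
    	Hash  uint64
    	Count uint64
    }

    // hashTrace derives a stable ID from the individual components of a trace.
    func hashTrace(frames []Frame) uint64 {
    	h := sha256.New()
    	var buf [16]byte
    	for _, f := range frames {
    		binary.LittleEndian.PutUint64(buf[:8], f.FileID)
    		binary.LittleEndian.PutUint64(buf[8:], f.Addr)
    		h.Write(buf[:])
    	}
    	sum := h.Sum(nil)
    	return binary.LittleEndian.Uint64(sum[:8])
    }

    // Sender remembers a bounded set of recently sent hashes. When the cache
    // fills up it forgets old entries, so the same trace is eventually re-sent,
    // which also gives resilience if the first message was lost.
    type Sender struct {
    	seen map[uint64]struct{}
    	cap  int
    }

    func NewSender(capacity int) *Sender {
    	return &Sender{seen: make(map[uint64]struct{}, capacity), cap: capacity}
    }

    // Report returns either a full trace (first encounter) or a hash-only reference.
    func (s *Sender) Report(frames []Frame, count uint64) (*FullTrace, *TraceRef) {
    	id := hashTrace(frames)
    	if _, ok := s.seen[id]; ok {
    		return nil, &TraceRef{Hash: id, Count: count} // hash + count only
    	}
    	if len(s.seen) >= s.cap {
    		s.seen = make(map[uint64]struct{}, s.cap) // crude eviction: forget everything
    	}
    	if s.cap > 0 {
    		s.seen[id] = struct{}{}
    	}
    	return &FullTrace{Hash: id, Frames: frames, Count: count}, nil
    }

Setting the cache capacity to zero degenerates to always sending full traces, which is the memory-versus-traffic trade-off that comes up a bit later in the discussion.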
E: Let's make it clear that you lose something by doing that: what you lose is the ability of full inspection at the intermediaries. We made a different choice when we were designing OTLP for traces, metrics and logs — we decided to include all the required, fully necessary state with all of the messages that we send in OTLP, so that intermediaries don't have to reconstruct state.

E: It may not be necessary for profiling, so perhaps that's the valid choice for profiles — but it's a trade-off, and demonstrating that having this statefulness saves very significant volumes of data, and perhaps showing that intermediary filtering is not even necessary for profiles, would be a very strong argument in favor of the choices that you made.
D: Cool, yeah — one very quick note: intermediary filtering of stack traces is something that is exceedingly difficult to do for native code anyhow, because the collecting agent usually doesn't have the debug symbols locally.

D: So the only comment I would like to add to what you said is that if we want to filter and make different routing decisions at an intermediary based on particular stack frames and functions being present, that is not something that is easily feasible for native traces anyhow. But yeah, perhaps.

F: Yeah, so I have a question regarding client memory requirements in this scenario. I imagine once you start hashing strings you have to keep track of that information. Maybe this is not really relevant in the context of it all, but I wonder if you've done an analysis of the extra memory requirements, and whether that's a consideration — whether that's a trade-off or not, that kind of thing.
D: So the beauty of the hashing scheme is that you can set the memory requirements, meaning you decide how many hashes you're going to be remembering, and you can more or less trade off network traffic for lower memory requirements if you want. I'd need to check what we are reserving — I think it's on the order of a megabyte or two, maybe ten — but this is absolutely at the discretion of the client; in the limit you set it to zero and then you're sending stack traces all the time.
G: Yeah, just a quick question for Tigran. I was wondering — because I'm not that familiar with the other OTLP protocols — could you explain what the use case for filtering is, or what it means to filter in intermediaries?

E: I need to have both pieces of information at the same time — the samples, the counters, and the stack trace — and with this approach you're not sending them at the same time, right? You send the stack trace when you see it the first time, but the counts that are sent subsequently reference the stack trace by the hash; they do not contain the stack trace.

E: So to do this filtering, I have to keep the stack traces when I see them, and I then essentially have to keep this state forever, because I may see a reference to that particular stack trace later, in order to do the filtering.
E: Scenarios like this happen all the time. When you have a large organization, you use intermediaries to collect and batch all the data before it goes to your vendor of choice, and sometimes what you observe in your intermediary is absolutely not under your control — but you still know that it's pointless data. You don't want it to be sent to the vendor, because it costs you money.

E: So you just want to drop it — but you want to drop precisely what you don't need. Again, it depends on whether you have such a need — and I may be wrong here, maybe in the profiling world it doesn't happen — but it is very typical to do this sort of filtering or routing. Let's say I don't drop it, but I send it to some other cold storage, cheaper storage. So filtering, reduction, and rerouting are very common operations that you do at the intermediary.
G: Cool, yeah. I think the philosophical difference with what we're doing is that we fundamentally want to catch everything all the time, and if something is significant enough that it will be costing you on the back end, then it's probably significant enough that you want the profile from it, if you get what I mean. But I haven't thought it all the way through.

B: I don't know if anyone with their hand raised wants to respond or ask questions. Okay.
H: I wanted to expand on this. If we make a parallel with logs, which are very simple to reason about: most of the time you wouldn't filter on the log message, but you might filter on the log labels — I don't know, host name, or service type, or whatever. If you think in this context, we also have additional labels that we attach to stack traces, like the container or the hostname, and you can filter on those even if the stack trace is a hash.

D: Sorry, if I can butt in for a second — it is true that the hashing will make it difficult, or impossible, to do filtering and routing at the intermediary without additional client support.

D: A question that I posed on the doc is that, if the filtering is desired, there would be the option of just pushing the data-collecting client to attach a metadata label, because we already have the container name, the pod name and so forth attached to the counts. But yeah, it is a trade-off in the sense that, if you want to implement that filtering, then you will need client support.
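A sketch of the label-based filtering idea: an intermediary can route or drop profile messages purely on the metadata attached by the client (hostname, container, namespace, and so on), without ever resolving the hashed stack traces. The message shape and the rules below are hypothetical, not an agreed OTel format:

    package profilerouter

    // ProfileBatch stands in for a wire message whose samples reference
    // stack-trace hashes; only the labels are needed for routing.
    type ProfileBatch struct {
    	Labels  map[string]string // e.g. "host.name", "k8s.container.name"
    	Payload []byte            // opaque samples referencing stack-trace hashes
    }

    type Route int

    const (
    	Drop Route = iota
    	SendToVendor
    	SendToColdStorage
    )

    // Decide applies simple rules of the kind a collector-style intermediary
    // might be configured with.
    func Decide(b ProfileBatch) Route {
    	if b.Labels["deployment.environment"] == "dev" {
    		return Drop // data you don't want to pay the vendor for
    	}
    	if b.Labels["k8s.namespace.name"] == "batch-jobs" {
    		return SendToColdStorage // cheaper storage for low-priority workloads
    	}
    	return SendToVendor
    }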
F: I'll just add, on the topic of filtering: one thing we've seen security teams insist on is some sort of intermediary that would filter out data that they don't want to share with vendors. Some people are very concerned about specific labels or stack trace names, so that's another kind of use case for that.

B: Cool, yeah. Any other questions about that? Alexey?
I: Yep, on filtering: one thing is that, with profiling data, sometimes you cannot just drop the data, because, for example, for CPU samples, if you drop some CPU samples then it will essentially skew the profile. I think it cannot just be dropped — it needs to be aggregated into some kind of "ignored" bucket or something. Just mentioning that, for example, in our profiling tools we go through many hoops to make sure that, even if there is some back pressure and data cannot all be streamed out, you still provide some aggregated metric for it.
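A minimal sketch of that point: instead of silently dropping CPU samples under back pressure (which would skew the profile), fold the dropped weight into a synthetic "ignored" bucket so the totals stay correct. The Sample type and the sentinel hash are made up for illustration:

    package sampledrop

    type Sample struct {
    	TraceHash uint64
    	CPUMillis uint64
    }

    const ignoredBucket uint64 = 0 // sentinel "other/dropped" trace ID

    // Squash keeps at most max samples, accumulating the remainder into a
    // single ignored-bucket sample instead of discarding it.
    func Squash(samples []Sample, max int) []Sample {
    	if len(samples) <= max {
    		return samples
    	}
    	kept := append([]Sample(nil), samples[:max]...)
    	var droppedCPU uint64
    	for _, s := range samples[max:] {
    		droppedCPU += s.CPUMillis
    	}
    	return append(kept, Sample{TraceHash: ignoredBucket, CPUMillis: droppedCPU})
    }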
B: Cool. So yeah, just to summarize, as we think about moving forward — I know you briefly mentioned it, or someone did — as you're thinking about this standardized format, I just want to make sure I understand: the main pieces that you would like to see are some mechanism that you can use to not duplicate stack traces, and columnar alignment versus row alignment. Maybe you could expand on how that would work?

D: So I guess, if I had to rank my wishes: wish one would be deduplicating stack traces; wish two would be to use something that has a description language, similar to protobufs or whatever, that can then generate parsers; and number three would be columnar storage. The columnar storage is mostly because compression works so much better when you do columnar alignment, so that's really just an optimization for getting better data compression.

D: So essentially my first wish is in order to reduce the overall volume; the second wish is to allow people with heterogeneous back ends to automatically generate parsers and not be stranded having to write parsers from scratch, because nobody wants to write a low-level parser; and the third one is, again, for efficiency's sake.
B: Cool. And I know you also touched in the doc on the comparison to pprof and JFR — I think we're about to transition to some JFR talk anyway — so maybe can you briefly touch on where pprof and JFR fall short in the areas that you needed?

D: My biggest concern with JFR is that there is no official spec, and there are very few implementations of anything parsing or writing it. As a former Googler, my instinct is obviously going with gRPC and protobuf — that's just the damage that that does to you. Let me pull up the doc.
I: Yeah, if it's okay to ask: there was one thing that confused me a bit, or that I was just looking to clarify for myself. You mentioned columnar storage, and my impression was that the storage format is almost out of our control, because if we talk about OpenTelemetry we kind of rely on the transport mechanisms that OpenTelemetry provides.

D: So when I speak about columnar, it's mostly about the arrangement of fields in the network messages, meaning ideally you want to keep similar fields close together, versus having them row-oriented in the network messages.

D: And it's largely because LZ-based compressors want localized repetitions. We can discuss whether it's worth doing — we got about 20% better compression out of it using gzip — so again, it's a trade-off.
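A sketch of the row- versus column-oriented arrangement being discussed, expressed as Go structs standing in for wire messages (the real format would be a protobuf or similar schema; these names are illustrative):

    package layout

    // Row-oriented: one record per sample, different field types interleaved.
    type SampleRow struct {
    	TraceHash uint64
    	Count     uint64
    	TimeDelta uint64
    }

    type RowBatch struct {
    	Samples []SampleRow
    }

    // Column-oriented: parallel arrays, one per field. The i-th sample is
    // (TraceHashes[i], Counts[i], TimeDeltas[i]). Similar values — e.g. many
    // small counts, many small time deltas — end up adjacent on the wire,
    // which gives LZ/gzip-style compressors the localized repetition they want.
    type ColumnBatch struct {
    	TraceHashes []uint64
    	Counts      []uint64
    	TimeDeltas  []uint64
    }

The same arrangement is also what makes the allocation benefit mentioned next possible: a few large slices instead of one allocation per nested message.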
E: Sorry — you're also going to see less memory fragmentation and better memory allocation by doing the protobufs that way, because these arrays are usually allocated as single slices, whereas the nested messages are usually individual allocations. So you benefit in memory as well with that encoding.

B: Oh yeah, if you could dig that up and share it, we'd love that, because we'll talk a little bit about benchmarking and such, so it would be useful to know.

B: Anybody else? I guess — yeah, okay, we'll move on for now; if people have thoughts on that, feel free to add them to the doc. So we have the Datadog folks, who also mentioned wanting to talk some about JFR themselves.

B: I don't think we had anyone from your side here last week, so feel free to take it away and go in whatever direction you want to take it.
J: Yeah, let me share the screen first. I assume you can see the purple screen with the presentation. I'm going to try to do a very, very quick presentation — a very brief introduction, not going into anything deep. So it's an overview of JDK Flight Recorder, JFR.

J: It originated a long time ago in the JRockit JVM; then it was acquired by Oracle, then it merged with Sun, and then finally it was open-sourced — the whole implementation, with the writer and everything — in OpenJDK 9, and we got a backport to OpenJDK 8 in update 262. Datadog was also taking part in the backporting effort, and Oracle is also still shipping a closed-source version, Oracle JDK, which is still kind of active. The key feature of JFR is that it's fully integrated with the JVM.

J: It's completely hooked into the runtime, the compiler, the GC, what not. It's very lightweight; everything is event-oriented, so the idea is to take as little time and as few resources to write the data as possible.
J: Everything in JFR — the implementation and the file format — is in service of this goal. That's why we don't have columnar structures or anything like that: it's more difficult and more costly to maintain. And it's a fully self-describing storage format: you just need to know a few details about how the format is structured, and after that all the events, all the types, all the values — you can read them without any extra knowledge or external description.

J: So everything is in the recording itself. Everything is an event — that's the base unit of a JFR recording. Each event has a start and end timestamp, it is associated with a thread through a thread ID, it can have a stack trace (it doesn't have to), and we can put any number of other custom data fields on the event as we want.
J: The stack traces are actually deduplicated per chunk — I'm going to talk about chunks slightly later — so we are not sending the full stack trace for each event; we just send the stack trace ID, and then it points back to the stack trace. We call it a constant pool for that.

J: There is an XML definition, and it explodes into a bunch of C++ files that get compiled together, and it's hooked in really deeply in the JVM. You can also have user-defined events, for which there is a Java API: with those you can write your own Java events, and they will be integrated with JFR and emitted the same way as the rest of the recording.
J: So now I'm going to talk briefly about the storage format. As I mentioned, we have the recording — that's the top unit: you start recording at one time, you end the recording at another time, so it's a time-bound collection of events. The recording is internally split into chunks, and each chunk is a self-standing unit of information. It has a header, with some information describing the chunk, and then a metadata event, which describes all the types which are in this particular chunk.

J: The metadata event can be repeated, and the definition of the types in the recording is actually incremental — you can use subsequent metadata events to add more information, like when you register new events during the recording; the new types used there will be in that new metadata event. Similarly, we have the checkpoint event — the name is kind of confusing; it is a constant pool event.
J: The checkpoint events contain the constant pool data, so they contain the data for strings — for deduplicated strings — and for the stack traces, but you can create custom pools for any type, even for user types. So if you define your own user type, you can tell the format that this type is using a custom pool, and then everything should use pointers to constants instead of putting all the data directly into the event.

J: The type descriptions are based on built-in types — there are numeric types, boolean and string — and then you can define other types based on these built-in or primitive types, so they are composite types. A type needs at least a name and attributes, and each attribute has a name and a type. So there is a very simple definition language for the types, which are then used in the particular chunk. Constant pools, yeah — these are the cache for redundant values, so we point back to the constant pool.

J: So we don't need to store the same data over and over. There are built-in constant pools for strings and stack traces, as mentioned before; there might be more custom pools for other types, and then it's up to the producer of the recording to actually store the data in the custom pools, and up to the parser to read from the constant pool.
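A simplified sketch (not the real JFR binary layout) of how a parser resolves constant-pool references: events carry small IDs, and the chunk's pools map those IDs back to the deduplicated values. All names here are illustrative:

    package jfrsketch

    type StackTrace struct {
    	Frames []string
    }

    // Chunk holds the pools written at chunk finalization; every chunk is
    // self-contained, so no state from earlier chunks is needed.
    type Chunk struct {
    	StringPool     map[uint64]string
    	StackTracePool map[uint64]StackTrace
    }

    type ExecutionSample struct {
    	ThreadID     uint64
    	StackTraceID uint64 // pointer into the chunk's stack-trace pool
    	StartTicks   uint64
    }

    // Resolve looks the referenced stack trace up in the chunk the event was read from.
    func Resolve(c *Chunk, e ExecutionSample) (StackTrace, bool) {
    	st, ok := c.StackTracePool[e.StackTraceID]
    	return st, ok
    }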
J: The idea of the chunks being completely self-reliant and self-describing comes from the need to write events from multiple threads at the same time with the minimum possible contention. How the JVM, or JFR, does it internally: it will open a chunk per thread, using a size limit or a time limit for the chunk, after which the chunk will be concatenated onto the main recording — and during this time only one thread is writing data to the chunk.

J: Internally it's memory-mapped on disk, and then it just appends new events. Once the chunk is about to be appended, it's finalized: that means the constant pools and the metadata with the types get written to it, and then it's moved to the part where it's going to be joined with the previous chunks.

J: While this makes it very easy to write in a highly concurrent environment with very little contention, there are some ordering issues, because we are basically flushing the chunks at any time and the events are not physically ordered by the time when they happened. So when you are parsing, you need to rely on the timestamp of the event to restore the order.
J: For the timestamps, JFR uses RDTSC when it's available, so it's a cheap monotonic clock source — but the thing is that the ticks are not convertible to epoch milliseconds. So in the JFR recording we did the trick of storing the epoch milliseconds and the ticks in the chunk header, and we also store the tick frequency, so we can divide the number of ticks by the frequency and there we go — we get the milliseconds from the ticks.
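A sketch of that chunk-header time conversion. The field names are illustrative, not the actual JFR header layout; the point is that the chunk carries a wall-clock anchor, the tick counter at that anchor, and the tick frequency, and event timestamps are ticks measured against it:

    package jfrtime

    import "time"

    type ChunkHeader struct {
    	StartNanos     int64  // wall-clock time at chunk start (epoch nanos)
    	StartTicks     uint64 // tick counter value at chunk start
    	TicksPerSecond uint64 // tick frequency
    }

    // WallClock converts an event's tick value into wall-clock time using only
    // information carried in the chunk itself.
    func WallClock(h ChunkHeader, eventTicks uint64) time.Time {
    	deltaTicks := eventTicks - h.StartTicks
    	deltaNanos := int64(float64(deltaTicks) / float64(h.TicksPerSecond) * 1e9)
    	return time.Unix(0, h.StartNanos+deltaNanos)
    }

Re-anchoring like this at every chunk boundary is also what limits the slow drift between the tick counter and system time that comes up later in the discussion.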
J: To get even better — well, let's call it compression — we also store not the full ticks in the chunk, but just the delta of the ticks relative to the chunk start ticks. And almost all the integer numeric values are LEB128-compressed, or encoded.

J: By default it is used, as I said, for almost all integer numeric types, but it can be turned off by a flag, and this flag is also written in the chunk header — so you can have a chunk where you don't have this compression, and you can decide what makes sense.

J: What we observed is that this LEB128, or varint, compression or encoding was not that great for large numbers with high entropy. When we tried to use it for IDs, basically everything was encoded in nine bytes instead of eight, so it actually kind of grew in size.
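A minimal sketch of unsigned LEB128 (varint) encoding, the scheme being described: 7 bits of payload per byte, with the high bit set on every byte except the last. Small values such as tick deltas fit in one or two bytes, while a high-entropy 64-bit value needs more than the plain eight bytes, which is why it grew for random IDs:

    package leb128

    // AppendULEB128 appends the unsigned LEB128 encoding of v to dst.
    func AppendULEB128(dst []byte, v uint64) []byte {
    	for {
    		b := byte(v & 0x7f)
    		v >>= 7
    		if v != 0 {
    			dst = append(dst, b|0x80) // more bytes follow
    			continue
    		}
    		return append(dst, b) // final byte, high bit clear
    	}
    }

For example, AppendULEB128(nil, 300) produces the two bytes 0xAC 0x02, while a random 64-bit ID typically takes nine or more bytes.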
J: But it's really good for small values — when you have the deltas, or small values in general, it's pretty nice. So yep, this was a very, very brief introduction. There is a blog post by Gunnar Morling, who went in and reverse-engineered the full JFR file format, so if you want to go there you can take a look — it has all the offsets and the meanings of the values at all the offsets.

K: I just had a comment — hi, my name is Stefan, by the way, first time joining here. I used to work at Oracle, on the JRockit components where this was implemented. As you can see, it's very optimized for writing fast from many threads, and that's because everything is an event and we collect a lot of events — it's not just CPU or method samples, it's everything from GCs happening to lock contention, etc.
B: Nice, thanks — thanks for adding that. We got a pretty good, similar rundown on pprof before, so it's nice to have one for JFR as well. Alexey, you had something you wanted to add?

I: I have a question on licensing: are there any licensing aspects of using the JFR format? That's one. And the second dimension is timestamps: I wonder if you use timestamps between profiles and how you deal with that — I assume we cannot assume that time is synchronized between hosts; it can be off.
J: Well, first, licensing: the source for the part which writes the recordings is GPLv2 with Classpath Exception — it's OpenJDK, so it's open source.

J: There is no format specification, there is no patent on the format, there is no copyright, anything — so basically it's up for the taking; Oracle didn't spend any effort on protecting this.

K: There is a parser in Mission Control, which has been open-sourced as part of the OpenJDK project. There's also a parser built into the JDK as well — less performant than the one in Mission Control, but it supports the file format. So there is a Java API you can use to read the files.

J: And yeah, timestamps: JFR does not deal with time synchronization across hosts, so that should be done by the infrastructure. The timestamps are always valid within the chunk.

J: As I said, at the start of the chunk we capture the ticks and the epoch millis, and we base everything in the chunk on that — all the timestamps are derived from it. This is also done in order to fix the time skew: since with the RDTSC timer the ticks move slightly faster or slower than the system time, you get out of sync after a while, so you need to re-sync between the ticks and the epoch millis.

J: So we do it at the chunk boundary.
D: First of all, thanks a lot for the overview — it was super helpful to get an idea. I think an interesting parallel to point out between what we've been doing and what JFR is doing: JFR deduplicates on a per-chunk basis, if I understand correctly — yes — and that's somewhat similar to deduplicating in the manner that we do, except that, because ours is a network protocol, there are no chunks as such — it's just a stream of messages. So I think that was a really helpful thing to learn.
B: Cool, yeah. I don't know if you have any thoughts on this, but as we think about a standardized format: it sounds like JFR has a lot of stuff built into it already, and one of the things we said is one of our goals is being able to map existing formats to a different format.

B: I'm curious if you have any thoughts on the viability of JFR being somewhat flexible in that way — you know, if we decided on a format that's not JFR. I guess it obviously depends on how different it is, but I'm curious if you have any thoughts on that.
J: Well, there is one thing right now which JFR does not support out of the box, which, for example, pprof supports: labels. But speaking just format-wise, it is possible to support it — JFR has the concept of arrays, or sequences. Right now the stack trace — or the profile event, as we call it — doesn't have anything else associated with it, but we could associate a sequence of labels with that.

J: And I don't know — I was thinking about it, and I cannot come up with anything which would be impossible in JFR, or not possible with small changes to the format. I know there were more powerful features in the type modeling in the past; they are not used now, but the format — and the parser, at least in JMC — is kind of ready for that. So, judging from that, it had this in mind. But yeah, it would have to be...

B: All right, thanks. Felix, you had something you wanted to add?
C: Yeah, I have another thought on that, because it was mentioned that the design goal of JFR is essentially to write away stupid amounts of data with very low overhead.

C: I think the idea of trying to convert JFR files on the client side, before sending them somewhere, defeats that point, because you're going to spend a lot of cycles undoing the smart things that have been done, and you'll get some poor results out of that. And I think that's a long-term concern for getting detailed runtime data, not just for Java: Go essentially has a similar problem with pprof, where you get pprofs out of the runtime and that's not under your control.

C: Parsing pprof, however, is not as much of a problem because it's aggregated, so maybe that's not a fair point — but the future of Go runtime observability may include runtime tracing similar to JFR. There's a Go enterprise advisory board where they share a little bit of what they're planning to do for the Go runtime, and the words "Go Flight Recorder" have been shown on the slides, so there's a chance that the Go runtime will move to a very similar architecture.

C: Yeah, that's fair, but I think that, ideally, a standard would allow a side channel for including raw data, for people who want to do that. Also, there's always going to be data in JFR that might not be supported in whatever OTel standard is going to be cooked up, because it's a very rich data format — but I could be wrong.
K: Just one comment: it's optimized for writing a lot of events from multiple threads, but that doesn't preclude you from having a parser thread as part of sending it out, right? The key is avoiding the hot path. Sure, you don't want to waste a lot of CPU cycles in a separate thread either, but it's about making sure that you stay out of the critical path. If, for example, one event is lock contention, it's going to be timing, for every thread, taking a lock, plus an address for that lock, from multiple threads, and then writing that information down — and you need that to be much faster than whatever lock contention you're having, so you really don't want to mess with things there in the critical path. And it's one thing to do the measurement; it's another thing to transform that data into something else.

B: Yeah, well, that's part of why we're here. Alexey, you had your hand up too.
I: Yeah, well, I think it's kind of obvious, but I think the path of having a kind of side channel and then including in the format the raw data, whatever the runtime produces — I think that's also a slippery slope, because it removes the advantage of being able to build shared infrastructure, like profile-processing code that can be shared.

I: I would view the format we define more as the core, and then, of course, it may be extended to a certain extent with labels or attributes or whatever things we define — rather than just saying: well, you've got the side channel and you can put in a pprof proto, a JFR, or Go Flight Recorder formats.

D: I mean, both JFR and our format are essentially sequences of events with some amount of deduplication for groups of events, and it seems to me like we should be able to come up with something that resembles a protobuf, or a similar specification, for events, that is extensible for future events in a similar way that JFR is — at which point...
J: I guess it should be pretty possible to define the wire format so that you can actually describe the types — that's what JFR does as well. The thing which describes the types is itself just an event, so you would define a particular event type that defines the events.

D: This is an interesting question I have: is fully self-describing a necessary and desirable feature? Clearly, my bias is from the Google experience.

D: If you've got whatever protobuf specification you're using under some form of version control, and have it public, the benefits of self-describing are not immediately clear to me — but I think reasonable people can disagree on this as well.
J: Yeah, the thing is: maybe for only profiling information — you get samples of allocations, of CPU, of locks — maybe it's not that crucial. But in JFR, and in the JVM as we're used to living now, you can have any number of events: you can create events for your application, you can create events for whatever subsystem you have — but you need to describe them somehow.

D: But what I'm saying is that protobufs are future-extensible, in the sense that you can add new message types to an existing protobuf specification, and old parsers will just keep on parsing and skip the new message types, whereas new parsers will then have their message types. The architectural difference is that the JFR format essentially keeps the protobuf-like specification inside of the data it sends out, whereas the protobuf approach is to keep that separate.
J: Again, it depends. For the custom events, anybody can come in — you might have events for Apache, for Tomcat, for Spring, for Akka, for whatnot — and if you want to have a protobuf distributed for each new event, then, I don't know, it's getting pretty difficult.

J: You can create events for applications — you can do a kind of structured logging — but this is probably not profiling; that's what I was saying. This goes beyond profiling, and that's where we would need it — for profiling, where you would have, like...

D: I'm going to make myself unpopular — in the sense of unpopular to vendors — now: one could argue that if there's a public protobuf spec for a profiling format, that more or less forces every vendor, if they want to extend profiling, to send a pull request with the message type. So the absence of self-describing may be a hammer to bludgeon people with. But...
B: Then let's wrap up with a couple of comments — go for it.

K: Yeah, just a quick comment: the other thing to think about here is that JFR was built at a time when there wasn't much other profiling — it was like: okay, we want all of these dynamic events, so we wanted people to build events in there. This is going to be part of OpenTelemetry, so some of the events might be...

K: This is going to be part of tracing, or log events, etc., where some of these might fit — so there might be a way to think about it as more of a locked-down format. I think the other part, probably for a later meeting, is just thinking about: okay, is it a sort of micro-batching type of thing that needs to be done just to keep performance, or is it the tracing type of thing, where every event...

K: Oh yeah, you can batch them out, but every event happens by itself, right? But here it might be low-level enough that it requires some type of aggregating and batching — which is the columnar storage, or the messages, you've been talking about — and that probably is required. So it might be worthwhile thinking about the whole OpenTelemetry ecosystem — what fits where — and that might be one way of not having to be self-described.
B: All right, awesome — good conversation, thanks everybody for the thoughts. We've got about one minute left, so yeah. One thing we talked about last week that we haven't really made progress on: regardless of which route we choose with a lot of these things, I think a key piece of data that we're missing for all of these formats is some sort of benchmarking, and we kind of agreed on that last week. We don't have a lot of time here, but we talked about it internally at Pyroscope and we came up with a proposal for how we might be able to set up some sort of process for benchmarking these different formats.

B: So we can follow up on that — I'll paste it in the Slack and we can follow up offline, and perhaps talk about it next week, or hear if others have other ideas on how we can do benchmarking. Then, outside of that, just some other points: I'm trying to reach out to some more people from the JFR community, and people like the Go language maintainers, to get their perspective on all this as well.
B: I think it's worth chatting about potentially moving these meetings to every two weeks in the future, but I would just propose that as a discussion topic, maybe for next week. Does anybody else have anything very, very quick that they want to add before we wrap up for the day?

K: What we talked about was: what are the data that we expect to flow here? Because, coming from our background, it's everything; coming from Pyroscope or from others, it's probably more like method samples — and the truth might be somewhere in the middle. So it might be good to start filling out that list, so we know what needs to be supported by it.