From YouTube: 2022-03-10 meeting
A
Hey everyone, I've never attended before; my name is Spencer. I'm not affiliated with a vendor. I sent a sort of introductory message in the sampling WG channel, I think on the Slack workspace. I just sort of like OTel, and I like math, and I saw that there was a lot of activity going on in this area of the spec.
C
I just realized that the Zoom link in the meeting notes is wrong, so I was locked into a different meeting.
D
I may have made it onto this call, and you can see there's a...
D
Oh
great,
I'm
sorry
that
I'm
always
having
trouble
making
this
meeting.
I
am
now
present
and
I
would
like
to
participate
in
a
meeting
and
it's
unusual
that
I'm
not
in
front
of
my
desk.
I
did
put
in
some
meeting
notes
and
since
I
missed
the
last
meeting,
I
would
love
it.
D
If
the
two
of
you,
peter
and
atmar
could
perhaps
recap
a
discussion
that
happened
about
the
discussion
was
about
cp,
potentially
trace
state
values
for
conveying
probability
for
after
the
fact
sampling
and
then
I
also
think
we
should
do
introductions
we
have
spencer.
Here
I
saw
who
I
spoke
with
yesterday,
has
joined
the
call
with
some
interests
of
his
own,
and
perhaps
he
can
speak
to
that.
C
Yeah, actually I missed the last meeting as well — probably because I used the wrong Zoom link, which is in the meeting notes, and...
A
Yeah sure. Hey everyone, my name is Spencer. I sent an introductory message on Slack, but in short: I'm not here representing any observability vendor. I'm just a very interested user of OpenTelemetry and things like it, and I generally enjoy math and statistics. So when I saw that there was a lot of activity happening — or a lot of open questions, I guess, still sort of preventing the really dynamic sampling that exists in some specific vendor products today, but not in OTel — I became curious to see what the path will look like for OTel to have those same capabilities. So yeah, glad to be here and meet everyone. Oh, and then for context...
D
Excited to dig in a little bit on the Honeycomb sampler that you mentioned to me yesterday, so maybe we can talk about that a bit for the group. I think Spencer's first question to me yesterday, when we spoke one-on-one, was very apt: it's hard to find a roadmap for OpenTelemetry sampling. And, of course, we've been facing this almost-disagreement in the community about where they want sampling to go.
D
...is that piece that does fixed-rate sampling, roughly speaking. That's where I see this going, and I think for the community, or for any interested parties, putting together that roadmap would be awesome. As I've said in the past, my role as an OpenTelemetry technical committee member means I'm often distracted by other debates, so I haven't been able to work on this myself.
D
In that sense, I think there's still some question of whether a new trace-state variable would be necessary or helpful to us. The reason this started, remember, was that there's a sampler in the OpenTelemetry collector called the probabilistic sampler, and it takes in spans that may already have been sampled, and this creates an opportunity to do...
D
...I guess I'd say, some of the stuff that the Honeycomb sampler is doing — I just want to say selective sampling, or weighted sampling. In particular, the example that Spencer raised yesterday when I spoke to him has to do with a sampler design where you emit a constant rate for a particular key, and the keys can be variable.
D
This is an idea that I find myself having worked through for a VarOpt sampler, so I know how to use an arbitrary weighted sampler to get arbitrary weights on traces after this probabilistic sampling stage has been done, and I can use that to make sure that I get the same expected number of sampled traces per category of some attribute label, for example. But I don't know how to convey the resulting adjusted counts in such a situation, which is why we were talking about, potentially, this...
D
I
call
the
c
variable,
which
would
be
just
the
explicit
adjusted
count
variable
put
that
in
the
trace
state
to
say
that
the
effective
count
of
this
trace,
you're
collecting
or
the
span
that
you're
collecting
you
know
could
be
a
arbitrary
number,
not
a
power
of
two
potentially
that's
something
that
has
definitely
been
appealing
to
me,
but
we
didn't
get
as
far
as
specifying
any
kind
of
c
value.
C
I think it should be unrelated to the r value, and then we should communicate the sampling rate that was used here with a different variable than p, because otherwise, yeah...
A
Go ahead — I was going to ask a clarifying question. I know that in some of the samplers in that January PR that got merged into the spec — there's a clause in there, basically, that Otmar's Java PR implements, I think — the spec requires support for arbitrary sampling probabilities, and then it probabilistically chooses a power of two such that the long-term average is your arbitrary zero-to-one value.
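The interpolation Spencer describes can be sketched as follows. This is a hypothetical Python sketch of the principle, not the actual Java implementation; the function name and structure are illustrative only. Given an arbitrary probability q, it randomly returns one of the two adjacent power-of-two probabilities so that the long-term average equals q.

```python
import math
import random

def pick_power_of_two_probability(q: float) -> float:
    """Randomly pick one of the two powers of two adjacent to q
    so that the expected sampling probability equals q exactly."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    k = math.floor(-math.log2(q))       # smallest k with 2**-k >= q
    hi, lo = 2.0 ** -k, 2.0 ** -(k + 1)
    if hi == q:                         # q is already a power of two
        return q
    # Choose hi with probability x so that x*hi + (1 - x)*lo == q.
    x = (q - lo) / (hi - lo)
    return hi if random.random() < x else lo
```

For q = 0.75 this returns 1.0 half the time and 0.5 half the time, so the average sampling rate is 75% even though each individual decision records a power-of-two probability.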
A
Could that same principle not be applied in the collector, basically — to continue to probabilistically, what would be the right direction, I guess, increase...
C
Yeah, sure — the same principle could be applied there as well, but the key question is which random value is used for the sampling decision in the collector. If it's based on the r value, then we can combine that into the p value, right — just the sampling probability, the effective sampling probability.
C
I don't know, actually. Well...
D
I think the way I would clarify the expression — so, to directly answer the question: I think the idea of using the same r value works, in the sense that we've designed it to make support for consistent sampling work even when the decision is made on independent nodes. So after-the-fact sampling to power-of-two probabilities, using the r value as the randomness, is not any different from changing the original sampling rate inside the process.
D
Half the time we're going to choose fifty percent, and half the time we're going to choose one hundred percent. The numbers work out — we get correct adjusted counts for all the spans — but you end up with the potential for a broken trace that you didn't have in the original setup, if the root sampler had been using our design.
D
Well, actually, I realize that I'm only interested in the case where there's a root sampler, so I'm not going to have a different sampling rate at my child nodes. In the case where the root sampler is configured at 75% and all the children are using parent-based samplers, I'm always going to have a complete trace: it'll either be rooted and complete, or it won't be present, when you do after-the-fact sampling to 75%.
A
I see. Then I guess another way to formulate the question is: is there a way for a program, given a tree of spans, to amend the trace states on all the spans in that tree so as to preserve them all, but also be, you know, correct in some sense as well?
C
The example you described, Josh — if you're using parent-based sampling, then you only have one sampling rate applied to all spans of the trace, right? In this case...
C
But what if we have different consistent sampling rates — consistent sampling applied to spans with different sampling rates on individual spans? Then it's important to know...
D
So what if we keep r and p the way they are for in-process sampling — powers of two and everything we've done so far, which I had no problems with — but we add a second variable in the trace state? We'll call it c, which is an explicit count; but instead of being an absolute count, it would be a relative count, meaning relative to one, so that it multiplies with whatever adjusted count you get from the p value.
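As a sketch of that proposal (hypothetical names — the c value was not specified in this meeting), the effective adjusted count would combine the power-of-two exponent p from head sampling with the relative multiplier c from any after-the-fact stages:

```python
def effective_adjusted_count(p: int, c: float = 1.0) -> float:
    """Effective adjusted count of a span: 2**p comes from the
    consistent head-sampling stage (probability 2**-p), and c is the
    relative multiplier contributed by after-the-fact stages
    (c == 1.0 means no resampling happened)."""
    return (2 ** p) * c

# e.g. head-sampled at 1/8 (p == 3), then a tail stage that kept
# 1 in 2.5 survivors (c == 2.5): each kept span represents
# 8 * 2.5 == 20 spans of the original population.
```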
C
Yeah, this is actually — the c value is actually the p value of the after-the-fact sampling step, right? So you actually store...
D
Okay, so the way I see this being useful — extremely useful — gets back to this weighted sampling stuff that I've maybe briefly talked about, you know.
D
I'm going to look at the shard numbers, for which I have a calculated distribution. Now I'm going to do inverse probability sampling using the shard-number weights, essentially to resample that data. Now, every span that comes out — there will be a fixed number of them. I will expect to get the same number of examples for every shard ID, which is my goal, and every shard will have a different, non-power-of-two adjusted count according to the VarOpt result, and then I can send my spans off.
D
Sorry — p values from any original sampling decisions that were made. And the VarOpt stuff works correctly: I'll use the p values as the input weights and take out the output weights, but I'll keep my p value preserved on the way through.
D
So
what
I'm
going
to
do
is
I'm
going
to
take
the
p-value
use
that,
as
an
input
weight
to
var,
opt
I'm
going
to
take
the
output
weight
from
var
up
divide
by
the
input
weight,
and
that
gives
me
my
c
value,
which
is
the
multiplier
I
want
to
use
relative
to
the
original
adjusted
count.
I
hope
that
made
sense.
That's
that's
roughly,
where
I
would
see
that
going.
It
makes
it
sound
like
c
value.
Is
a
pretty
good
proposal?
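A minimal sketch of that derivation, assuming a VarOpt-style sampler that reports an output (inverse-probability) weight per kept item; the names here are hypothetical:

```python
def c_from_varopt(p: int, varopt_output_weight: float) -> float:
    """Derive the relative c multiplier for a span kept by a
    VarOpt-style weighted sampler. The span's input weight is its
    prior adjusted count 2**p; VarOpt assigns each kept item an
    output weight, and the ratio of the two is the multiplier to
    record alongside the preserved p value."""
    input_weight = 2 ** p
    return varopt_output_weight / input_weight

# e.g. a span head-sampled at 1/4 (p == 2, input weight 4) that
# VarOpt keeps with output weight 10 gets c == 2.5, for an
# effective adjusted count of 4 * 2.5 == 10.
```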
B
I have some questions, or issues perhaps, with this. When I think about tail-based sampling, really the use cases for it are that during that time we might not see the whole trace.
B
Although in some cases we will see the complete traces, the key thing is that we will see things that were not visible when we made the head-based sampling decision — namely latency and errors, and perhaps some attributes that were added while the spans were running. So the sampling that we would do would not be unbiased, I would say, because we would look at certain things that would make some spans more likely to be kept, to be preserved — and how...?
D
...can be unbiased. The sort of example that I like to work through when I think about this is: let's suppose I have a scenario where I'm looking at an error situation, so I'm going to sample if there was an error, and I expect that my ratio of errors to non-errors is 100 to 1 — well, let's make it 99 to 1. So that's easy: I have one percent errors and 99 percent non-errors.
D
Now I do what I said — inverse probability sampling. I'm just going to apply a weight: it's 1 over 10,000 for the error spans, and it's 1 over 990,000 for the non-error spans. Now I run those through my weighted sampling algorithm, and at the end I get a thousand spans. So I've cut down from a million spans to a thousand spans.
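The arithmetic of that example can be checked directly. This sketch takes the idealized outcome — the weighted sampler splitting the thousand-span budget evenly between the two categories, since each category's total weight is equal — and verifies that the adjusted counts sum back to the original population:

```python
# A million spans, 1% errors: per-span weights of 1/10_000 (error)
# and 1/990_000 (non-error) give each category a total weight of 1,
# so the weighted sampler draws ~equal numbers from each.
total = 1_000_000
errors, non_errors = 10_000, 990_000
sample_size = 1_000

per_category = sample_size // 2             # expected 500 spans each

# Adjusted count = category population / category sample size.
adj_error = errors / per_category           # 20.0 per kept error span
adj_non_error = non_errors / per_category   # 1980.0 per kept non-error

# Unbiased: the expected counts sum back to the original million.
estimated_total = per_category * adj_error + per_category * adj_non_error
assert estimated_total == total
```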
D
I actually should be saying traces — I'm really thinking of spans, though, because that's the way this logic works in my head. Those adjusted counts will be different, but they are unbiased in the sense that, if I add up the expected counts of my errors and my non-errors, they equal a million expected spans. And that works for all the subsets as well, so I can slice my sample on any other dimension and still have what I believe are unbiased results.
D
It
took
me
a
long
time
to
like
really
reason
that
out
and
understand
it
as
well
as
I
do
today,
and
it
took
me
like
really
getting
to
know
those
weighted
sampling
algorithms
like
better
than
I
did
once
so.
That
to
me
is
what
really
magic
of
weighted
sampling
is
that
I
can,
after
the
fact
do
this
per
key
adjustment,
which
is
really
the
core
that
of
this
one
of
the
core
features
of
this
honeycomb
library.
D
There's a link to it in the notes, and that's kind of why I was leading us in that direction today, since the c value had come up. And we've got customers who really want 75% sampling — it's just like...
D
You'd think that doesn't matter very much, but it really does: a large customer with heavy load wants to balance their load to be steady, and if they need 75 percent, that's the number they need — and we don't have SDKs built yet for them to use. So after-the-fact sampling is exactly how they're planning to do it. We just need to figure out a way to get adjusted counts through to the vendor for these fairly simple after-the-fact samplers.
D
So I think a proposal for the c value would be great.
D
My next question is: are there any volunteers who want to write such a proposal? Having struggled through the last round of trace-state updates, I'm happy to be a part of it, but I'm also recognizing how much more work there is on metrics right now. So I'd love it if someone else wanted to take a stab.
A
Sorry if I'm being dense, but can you say one more time — and I feel like this might exist in a GitHub issue somewhere — the essential motivation for a c value, rather than updating p values in place?
D
So
we
agree
that
you
could
update
p
values
in
place.
I
think
atmar
is
the
one
with
the
reservation
and
it
has
to
do
with
knowing
the
variance
mark.
Can
you
summarize.
A
Wait — I thought I had it, but I don't. Could you continue, then?
C
Yeah. I don't know what you have read about consistent sampling, but the key idea is to have a shared random number, such that samplers which use the same sampling rate make consistent sampling decisions.
C
It just increases the chance of, for example, having full traces — because if you have samplers which act independently, then it's very unlikely that you get the full trace sampled: every sampler has its own random number, and it's very unlikely that all the samplers will sample a span simultaneously.
A
Right. I guess my mental picture for a p-value update procedure — given a trace, a procedure to update all the p values in that trace — my intuition is that a procedure exists to update p values in a trace in a way that surely doesn't result in broken traces and still implements a second stage of sampling, but I'm not sure.
D
You're correct, I think, Spencer, and I did actually write an issue that started this conversation, maybe three weeks ago, where I propose roughly that procedure. The way I think about it, it is correct in the sense that you're still not biased. But here's where I see Otmar's actual thinking, and maybe this helps. Imagine two scenarios. In one, I'm going to flip a coin with ten sides, and I get the one choice — so now I have a span with an adjusted count of 10.
D
So
now
I
give
you
a
bunch
of
those
spans
of
adjusted
counts
of
10
and
you
are
going
to
do
some
analysis
and
and
there's
some
analysis,
that's
going
to
tell
you
some,
maybe
averages
and
some
variances
for
that
data
that
I
did
one
in
10
sampling
on
now.
It's
a
different
scenario
where
I
do
one
and
two
sampling
followed
by
one
and
five
sampling
and
the
same
effective
probability.
In
those
cases
the
inclusion
probabilities
are
equal,
but
I
believe
the
variances
are
not,
and
I
don't
know
how
to
say
that
in
a
better
way.
C
I
don't
know
it's
so
so
what
your
meaning
is,
that
I
mean
if
you
have
equal
weights
or
it's
always
better
to
for
the
variance
right.
So
if
all
all
your
all,
your
samples
have
been
sampled
with
equal
probability,
which
means
that
they
all
have
adjusted
counts.
So
this
is
what
you
mean
joshua
so
with
well.
C
Of
that,
what
I
want
to
say
is
that
whatever
you
do,
whatever
your
sample
is
important
in
order
to
explain
or
estimate
data
is
that
you
know
how
a
data
record,
which
sampling
steps
a
data
record
has
going
through
right,
has
come
through.
C
So
if
there
are
multiple
sampling
steps
and
they
act
in
different
ways
like
we
have
a
consistent
sampling
step
and
a
completely
random
sampling
step,
then
we
need
to
know
the
probabilities
of
both
and
not
not.
We
with
just
one
value.
We
lose
this
information.
C
What
was
the
probability
of
this
consistent
sampling
step
and
what
was
the
probability
in
the
after
defect,
sampling,
which
uses
some
random
number
or,
and
where
is
some
some
reservoir
sampling
or
weighted
reservoir
sampling
step
where
the
sampling
probabilities
is
calculated
afterwards,
and
so
it's
very
important
that
we
have
these
two
values
that
we
can
reproduce.
Actually
the
the
whole
processing
chain,
the
sampling
processing
chain.
A
Does that change if the tail sampling is not quite so arbitrary — how should I say this — if it treats a trace as a single unit, picks a probability, conducts a new Bernoulli trial to decide whether to retain the root span, and then, whatever the result of that trial is, does any necessary updates? It's still a tail sampler, but at the granularity of traces, so it surely preserves all the children and only updates the p values.
C
If you sample homogeneously over all spans of a trace — if a sampling decision applies to all spans of a trace — it's quite easy, yeah.
B
Right, so it is quite possible to have consistent-probability tail sampling. For example, if you want 50% sampling, you just increase the p value for every span: if the new p is larger than r, then you drop the span; if it's still less than or equal to r, you keep it. This...
C
...way, it would preserve all the metrics correctly, I believe. But what you cannot guarantee with that is that you really downsampled by a factor of two, for example — because if it was already sampled at fifty percent, then you only have r values which are already greater than or equal to the p value, and if so, you would have to...
C
Yeah,
it's
not
not
so
easy
to
guarantee
that
you
lose
50
percent
of
the
data
in
the
second
sampling
step.
If
you
do
it
consistently.
C
An example: all your spans have a p value of 2, and you want to sample 50% of them. Then you have to increase the p value to exactly three. And if the spans have a p value of one, then you would have to increase it to two — and yes...
C
Yeah, if you do it consistently, yes — sure, you can have multiple consistent sampling steps after each other, and if you're using just the r value for your sampling decision, you just have to adjust the p value. The sampling criterion is quite simple, because it's just a comparison of the r and the p value: if you want to set the p value to four, then you just keep the data if the r value is greater than or equal.
A
That acts as a probabilistic 50% sampler — so in that case it would add one to all the p values in the traces, if I've understood correctly. But if it were 75%, could such a tail sampler increment p values by one half the time, and half the time increment them by zero — I guess, yeah, leave them alone?
C
Yes. Actually, I've implemented a prototype which basically uses reservoir sampling but makes it consistent, by adjusting the p value for some spans and not for others. It randomly chooses, such that you still fill the buffer and use the buffer. Because if you have only power-of-two steps — for example, you have a reservoir sampling buffer, and it gets full.
C
Then
you
have
to
remove
some
of
those
right
and
and
one
one
approach
is
just
to
increase
the
p-value
of
old
and
you
lose
50
percent
and
there
are
50
percent
free
spots
again
there.
But
then
you
do
not
use
the
entire
space
which
you
have
for
the
buffer
which
the
buffer
provides.
Yeah.
C
The algorithm which I proposed does this in a more clever way, to still use the buffer while ensuring that all data records have an equal chance to survive it. I have a prototype for that, and there is also an explanation of the algorithm in the source code, if you want to have a look.
C
I had it in the PR, but I think the pull request was too large, so I removed that part.
D
Go ahead — I apologize. This is the first time I've actually understood the algorithm you just described; having this conversation right now made it clear to me, and the way to do consistent tail sampling now makes so much more sense. Appreciate that. Spencer, you were just about to ask. The way I'm hearing this — and maybe I'll pass it right back to you — is this.
D
We
see
how
you
could
stay
within
the
p
and
r
scheme
and
we
see
how
it's
meaningful
and
correct
to
adjust
p
values
to
resample
in
a
tail
in
a
tail
sampling
kind
of
way,
and
perhaps
that's
good
enough.
I
I
question
it
because
I
haven't
thought
through
everything
yet,
like
I,
I
explained
how
to
use
var
up
to
do
something
earlier
and
I
don't
know
how
to
do
it
in
the
same
way
with
without
it.
D
So
my
question
is:
does
it
make
sense
to
also
have
a
c
value
where
the
adjusted
counts
are
going
to
be
if
you're
doing
consistent
sampling
you
you're
going
to
touch
p
and
if
you're
doing
arbitrary?
After
the
fact,
randomized
sampling
you're
going
to
touch
rc
and
if
you
do
two
sampling
stages
after
the
fact
you're
going
to
multiply
your
c's
together
and
but
but
consistent
probability,
sampling
is
done
using
p
and
I'm
guessing
that
there's
some
sort
of
restriction
that
you
can't
do
p
adjustments
after
you
do
c
adjustments.
C
...sampling step for all the estimations, I guess. But still, if you keep the values separated, like the c and the p value — I don't know if there's a need for that, or...
A
For my part — and I may be hyper-focused on this sampling design that I ultimately want — I'm not sure, honestly. It's been a while; I did read the first few pages of the VarOpt paper, but that was a while ago, and I might not be recognizing that I actually do want that, and that in some sense it cannot be accommodated by p-value updates. Presently I'm much more inclined to defer to you all on that, but I think most of what I want would be satisfied by this consistent p-value updating scheme we've described. I'm not confident yet, though — I too need to think it through a little bit more.
A
A quick question: is the appeal of a buffered approach — buffering some amount of traces — to eliminate the time bias that exists in head sampling, like preferring earlier traces over later ones?
C
I mean, the reservoir sampling algorithms do not prefer any...
A
Bucket
issue,
where
it's
like
head
sampling
in
general,
like
head
head
sort
of
rate,
limitee
sampling,
introduces
a
bias
in
that
like
given
two
spans
a
and
b
you
are
more
likely
to
retain
a
and
so
there's
like
some
sort
of
recency
yeah.
C
Actually,
it
does
not
introduce
a
bias
right
because
the
sampling
probability
is
still
and
the
justice
count
will
be
still
correct.
So
the
estimates
will
be
unbiased.
That's
true,
but
but
you
have
you
know,
the
adjusted
counts
will
have
different
values
right,
so
they
are
different
and
the
more.
C
Are
the
the
higher
the
variance
will
be
the
estimation?
So
that's
the
problem.
That's
why
it's
better
to
have
the
adjusted
counts
or
the
sampling
probabilities,
equal
right
and
yeah.
If
you
have
to
do
the
sampling
decision
immediately
and
then
and
and
and
you
also
have
to
to
limit
the
rate,
it's
hard
to
find
a
sampling
rate
because
you
do
not,
you
do
not
know
what
will
happen.
A
So
I
think
I
I
definitely
misspoke
in
saying
bias,
because
I
think
the
word
that
you
used
in
your
original
comment
where
you're
like
hey:
here's
like
a
timeline
of
traces
like
the
first
one
just
generally
referred
to
the
second
one.
I
think
you
use
the
word
fairness,
which
is
like
not
like
not
actually
the
same.
A
Yeah
yeah,
so
no,
but
I
understand,
then
how
the
the
variance
is
greater-
and
I
am
coming
around
to
remembering
now
like
why
var
opt
is
all
about
reducing
variance
by
like
yeah.
So
I
think
for
me
that
the
sort
of
takeaway
answer
to
my
question
was
a
buffering
approach
on
intel
position,
as
opposed
to
like
a
sort
of
instantaneous
decision
approach
and
also
in
tail
position
buffering
enables
you
to
reduce
variance,
and
so
that's
appealing
got
it.
Thank
you
very
much.
That's
super
helpful
for
me.
D
I've really enjoyed this conversation, and I've also learned a lot. It sounds to me like we don't truly need a c variable, except that it's so much easier for most people to understand. I have to think through more of the maybe-perceived benefits, I guess.
D
We've
sort
of
been
talking
about
two
different
variations
here.
One
is
where
you're
doing
tail
sampling
of
a
span
and
one
is
where
you're
doing
tail
sampling
of
a
trace
and
tail
stemming
with
span
is
also
like
choosing
exemplars
for
histograms
by
the
way.
So
there's
there's
a
notion
that
tail
sampling.
D
If
it's
a
span,
then
you
might
be
interested
in
keeping
p
values
and
not
introducing
t
values
because
later
on,
you're
going
to
look
at
a
trace
and
if
each
span
gets
sampled
independently
by
a
tail
sampler,
you
want
to
have
those
partial,
correct,
consistently
sampled
traces,
whereas
the
example
that
most
of
my
customers
and
my
my
team
here
at
lightstep
kind
of
cares
about
is
not
that
one.
No
one
cares
about
broken
traces
here.
We
want
complete
traces.
D
So
for
us,
the
c
value
is
just
a
straightforward,
simple
answer,
because
we
don't
expect
customers
to
come
in
and
do
per
span
after
the
per
span
tail
sampling.
We
expect
them
to
do
per
trace
tail
sampling,
which
is
just
makes
it
so
much
easier
to
explain.
However,
it's
not
a
job
to
make
things
easy.
Our
jobs
to
you
know
do
what
customers
want.
So
I
find
this
to
be
really
enlightening.
It
makes
me
want
to
think
a
bit.
D
I
came
into
the
meeting
thinking
I
would
be
proposing
a
c
value
for
trace
state
and
now
I'm
leaving
thinking
it's
premature,
I'd
like
to
understand
the
ways
of
doing
reservoir,
tail
sampling
in
consistent
ways
better,
and
this
20-minute
explanation
just
prior
was
really
helpful.
So
I
would
also
like
to
go
back
and
look
at
the
links
in
mars
pr
again
and
consider
what
I
don't
quite
yet
understand.
D
Yeah,
that's.
I
agree.
It's
so
open
question
for
me
too,
because
I
have
wrapped
my
head
all
the
way
around
it
so
well,
I
think
that's
a
positive
result.
I
would
like
to
hopefully
take
this
to
slack
and
in
two
weeks
we
can
meet
again
hoping
to
make
more
great
progress.
A
One quick question: you mentioned to me in our one-on-one — and I think you also just alluded to it — that you are aware of customer demands for different things, and obviously there are channels through which that information reaches you within Lightstep. What is your methodology, in the OTel space, for seeing what people want, or getting a feeling for what people want?
D
That's a great question. I think it's just being here long enough: I've had enough incidental contact with people coming from AWS and Jaeger, and those two precedents — the Jaeger and X-Ray systems — have given people a framing for this problem.
D
And I guess the only other answer is that, from struggling through the standards process to get the trace-state stuff we wanted for probability, it became clear to me how different the worldview of most people was: not really caring about span counts, but truly caring about being able to shape the traffic that they get from their tracer libraries.
D
I
have
this.
I
mean
my
only
other
interesting
perspective
here
comes
from
working
through
metrics,
with
open
telemetry
for
so
long
and
and
the
the
stark
difference
here
is
that
metrics
started
with
this
idea
of
views
in
scope,
so
you
can
configure
which
metric
instruments
work
and
which
keys
they
use
and
which
aggregations
they
use
and
which
aggregation
settings
they
use
and
so
on,
and
you
can
disable
metrics
and
you
can
you
know
you
can
do
all
this
stuff.
D
All
those
things
I
just
described
are
what
people
want
to
do
for
spams
and
instead
of
having
a
spam
view,
scope
and
like
really
tackling
this
question
as
a
first
class
thing.
Instead,
we
created
the
sampler
api
and
nobody's
happy
with
the
sample
api
because
it
doesn't
do
what
any
anybody
wants,
because
it's
not
really
a
user
level
feature.
It
is
something
that
people
like
us
understand,
but
but
very
few
people
are
going
to
get
from
the
sampler
api
to
the
thing
that
does
views
of
spans,
which
is
what
they
all
want.
They
all
say.
D
"I want to write a regular expression to change the sampling rate of this span" — that is what most customers want. And then there's this desire that my vendor has to do unified observability, to make metrics out of spans. That's pretty rare, and that's why we're talking about probabilities, but most people don't seem to care.
D
So: the idea of a sampler inside the collector that feeds into a standard metrics processor, which outputs metrics from your sampled spans and then puts them in Prometheus — when that works end to end, I think people will be a little bit more interested in this. But until then, they're just looking for ways to configure which spans have which probabilities, because they need to control their volume, and they need to find interesting spans — and they do that by turning off uninteresting spans. Hope I've helped there.
A
Yeah, thank you — I'll look into what Jaeger and X-Ray sampling is today. But it kind of sounds as though — I personally don't have a demand for emitting metrics based on my trace data, but I do have tremendous demand for the products, the user experiences, that Lightstep and Honeycomb provide, and so...
A
I
totally
get
it
and
that's
why
I'm
here,
I'm
not
interested
in
like
the
like
actual
sort
of
like
the
like
trace
to
metrics
pipeline
that
you
sort
of,
I
think,
that's
a
pretty
good
sort
of
marketing
sort
of
way.
You've
found.
D
Anyway,
I
get
it
hopefully
I
thank
you
for
that.
Okay.
Well,
I
remain
hopeful
that
the
direction
is
a
span
view
mechanism
that
lets
you
configure
these
sampling
actions
and
then
the
fact
that
it
can
be
done
remotely
is
just
kind
of
sugar
on
top.
There
is
something
that
x-ray
does,
that
we
haven't
really
spoke
about
spoken
about,
which
is
like
global
global
rate,
limiting
there's
some
sort
of
like
global
reservoir
or
rate
allocation
that
they
do.
D
I
was
hoping
to
get
someone
there
to
talk
about
it
here
eventually,
but
haven't
yet
anyway.
Now
I
think
we're
over
time
appreciate
this
conversation,
and
I
hope
to
do
it
again
in
two
weeks.