From YouTube: 2021-12-02 meeting
D
Good morning. I was just working on an agenda, so I didn't listen to you all talking.
A
Oh, that's okay. The Google Calendar has the Zoom link in the location of the calendar invite, but you can't click on it directly, I find.
D
I was wondering if this was also because OTel changed all of its Zooms and started using auto-generated ones, but this calendar invite hasn't been fixed yet. I think so; I might ask someone else about it. Good morning, everyone. I know, Will, that you were talking about configurable sampling, and we talked about having that be the primary agenda item today, as I'm looking at what other things we might put on the agenda.
D
There are two minor notes that I thought worth reminding the audience of, but I think we should not belabor them. So: the PR I've been working on with all of you, probability sampling using trace state —
D
it is what it is. It is kind of ready to merge, I think. There was one piece of feedback last night saying it could use more examples. That's great feedback; I like that kind of feedback during review. Ottmar pointed out at the last minute something that is worth understanding about how delegation can break probabilities.
D
I filed an issue to talk about it, because I think we should refactor some things, but that's for the far distant future. So I put that in the agenda — just a link there; we can talk about all those things. Last, I propose, Will, that we put you up ahead of all these other things.
A
Yeah — I plan to give an overview of how we do sampling and tracing at Autonomic: the mechanisms that we're using today, what I'd like to see for the future, and a bit about how we actually don't use probabilistic and adaptive sampling mechanisms like the ones first proposed by Jaeger.
A
So, just to give you a setting of our use case — a sort of high-volume tracing customer use case. Yeah, that's all.
D
Sure. I think we've been putting off this discussion. Configurable sampling includes Jaeger remote sampling; it includes Amazon's X-Ray system; it includes — I mean, I'm looking at Honeycomb in the audience — Honeycomb does stuff we're all interested in. How does the user get to control sampling?
D
And
you
could
you
could
phrase
that
as
a
question
for
the
client
itself,
but
I
think
many
of
us
are
going
to
like
immediately
say
yeah,
good
luck,
configuring
all
of
your
clients
with
a
configuration.
So
at
some
point
it
becomes
a
question
about
how
can
I
get
remotely
distribute
this
configuration?
D
It's a big problem space, but I think we're all interested in all of this, so I'd love to keep hearing more. Will?
A
Sure, okay — I'll get started then. Let me share my screen.
A
Okay, hi everyone. I'm William Tran; I work at a place called Autonomic. What do we do at Autonomic, and what are we tracing? We're building something called the Transportation Mobility Cloud, which connects vehicles to applications: unlocking your car with your phone, updating vehicle software over the air, or streaming vehicle telemetry to consuming applications.
A
So how are we tracing? Scalable tracing is challenging, and for us it's too costly to sample only at the tail — we can't collect everything where all the action is happening at the head and send it down a pipe, so we need to do some head sampling. And some traces contain hundreds of spans which are deeply nested; in those cases most spans are actually not on the critical path of interest.
A
And some use cases are more important than others. Some use cases are sampled at 100% to ensure we capture error conditions that are hard to reproduce, and we identify those use cases right at the very beginning, using the HTTP method and path at the entry point to our system. For other use cases, sampling just some traces is sufficient.
A
In all cases, though, we need analytics that represent reality. So here's our architecture, simplified to show just what's relevant to this discussion. We run our workloads in Kubernetes; our services are mostly Java-based, and we're using the Java SpecialAgent with Jaeger clients inside it. This stuff might be a bit outdated, but at the time we started tracing, OTel was just being born.
A
Yeah, exactly — the OTel Java instrumentation actually was a descendant of the Datadog Java agent, and the Java SpecialAgent had similar functionality, but it was a lot more open in terms of how you could modify it, I found. So we went with that and ran with it. It was also interoperable with many other OpenTracing tracers, and it was a project that came from LightStep.
A
Our Jaeger client sends things to the OTel Collector through the Jaeger receiver; then we have our own component that we've contributed to OTel Collector Contrib, called redactor; and then it goes through the Honeycomb exporter over to Refinery, which is Honeycomb's tail sampler, and then on to Honeycomb. So that's pretty much our pipeline in a nutshell. Some details about the components — yeah, the SpecialAgent.
A
We started tracing early — actually in 2019 — and things really started rolling at the beginning of 2020. We chose the SpecialAgent for the reasons I mentioned before.
A
We were trying out many different tracing back-ends, all at the same time — the OTel Collector was very helpful there as well — and we modified it and added some of our own auto-instrumentation. We do plan on moving to the OTel Java instrumentation in the new year. As for jaeger-client-java: this was the most fully featured tracer when we started. I know the OTel Java instrumentation has just recently reached feature parity and the Jaeger team wants to deprecate the client, but we really needed remote-controlled sampling.
A
We modified the jaeger-client-java samplers to convey an adjusted count and to support rate limits of less than one per second, and we wrote our own sampling config server, which serves pretty much the same stuff as Jaeger's remote-controlled sampling config, modified to allow configuring rate limits of less than one per second — the Jaeger data model didn't allow a rate limit below one per second, so we wanted to change that. So, to review Jaeger's remote-controlled sampling:
A
You get to apply these samplers, and they will be applied given the service and the operation name of the span — there's just a one-to-one mapping of operation name to a sampler instance. That actually worked out well for us, in having these samplers convey an adjusted count; I'll get into how we deal with that in a second.
A
So, redactor: that's a component we wrote and contributed upstream, to redact sensitive data for compliance reasons — that's the pull request there; we just have a skeleton up right now, and that's been merged. Refinery: this is Honeycomb's tail sampler, and it respects the upstream adjusted count; the little code snippet that makes that happen is where they can multiply.
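The multiplication mentioned here can be sketched as follows — this is an illustration of the arithmetic, not Refinery's actual code: if a span survived head sampling with adjusted count N, and tail sampling then keeps one in M of those, the surviving span stands in for N × M originals.

```python
def effective_adjusted_count(upstream_count: float, tail_sample_rate: float) -> float:
    """A span that carried adjusted count `upstream_count` from head sampling
    and then survived 1-in-`tail_sample_rate` tail sampling now represents
    upstream_count * tail_sample_rate original spans."""
    return upstream_count * tail_sample_rate

def estimated_total(sampled_spans) -> float:
    """Estimate of the original span population: sum of adjusted counts."""
    return sum(span["sample.count"] for span in sampled_spans)
```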
A
So I think the compelling thing here — and the reason we approached adjusted count the way we did — was really just to make everything work with Honeycomb. We saw this feature, we saw it would work, and said: let's use this, because we want to see a realistic representation of reality, and that's what Honeycomb can do for you.
A
Okay — rate-limited versus adaptive. Jaeger was beta-testing adaptive sampling when we started tracing, and I think they've just recently said adaptive sampling is now GA, but you need to use Cassandra. Adaptive sampling, in a nutshell, is where you want to produce a constant rate of output for variable rates of input.
A
The way Jaeger's adaptive sampling does this is by remotely controlling the sampling probabilities of the samplers to achieve a target output rate, so there's this whole distributed feedback loop that happens to enable that. We agree with the motivations for adaptive sampling — producing consistent output from variable input — but we weren't up for the complexity of running Cassandra and running that whole pipeline.
A
So: outputting the adjusted count in samplers. Rather than inferring the adjusted count as the inverse probability, we chose to convey it directly.
A
Each operation name has its own corresponding sampler, and we added a corresponding atomic integer counter for each operation. "Yes" decisions increment and then reset the counter to zero, and "no" decisions just increment the counter by one. This works regardless of the sampling algorithm — in some cases we do use probabilistic sampling, but this works for both — and we add this count to baggage, then copy that baggage key to every span as a tag called sample.count.
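A minimal sketch of the counter mechanism just described, assuming the skipped-plus-one semantics summarized later in the discussion. A plain dict stands in for the per-operation atomic integers of the Java implementation, and thread safety is omitted:

```python
from collections import defaultdict
from typing import Optional

# Per-operation count of "no" (unsampled) decisions since the last "yes".
_skipped = defaultdict(int)

def record_decision(operation: str, sampled: bool) -> Optional[int]:
    """'No' decisions increment the operation's counter; a 'yes' decision
    returns skipped + 1 as sample.count and resets the counter to zero."""
    if not sampled:
        _skipped[operation] += 1
        return None
    # The sampled span represents itself plus every span skipped before it.
    count = _skipped[operation] + 1
    _skipped[operation] = 0
    return count
```

In the pipeline described above, the returned count would be written to baggage and then copied to each span as the `sample.count` tag.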
A
And so this is kind of — I know I joined this SIG last year, when we were first talking about sample.count, and I regret not being part of the conversation around how this turned into probabilistic sampling until very recently — but hey, I'm here now. Okay, some caveats with our approach: the adjusted count is set at the beginning of the trace, and it's not modified until we hit the Refinery tail sampler. Those are just the limitations we've put on ourselves to have confidence that this mechanism works.
A
So we'll just assign a probability of zero to some part of a subtrace, identified by service and operation name — there are some limitations there, and I'll get into that later — so downsampling subtrees is not yet supported. That would be something like: okay, I want to see this subtree in maybe one out of every 10 traces that end up getting sampled. It is feasible to do downsampling, though; Refinery does it by just multiplying those sample rates.
A
And we wouldn't want subtrees to just appear on their own; we'd want the appearance of a subtree to be dependent on the sampling of its parent. That would be nice — I think the consistent probability sampling does this.
A
And sampling bias is possible with a periodic input to rate-limited samplers. I don't have a proof for this, but it just seems highly unlikely in our environment, given the characteristics of the input to rate-limited sampling decisions: where we make that decision, the volume we make those decisions on, and the data we're making those decisions on. It seems unlikely — but I can't prove it, whereas you have a very rigorous test, which I've seen, for the consistent probability sampling that proves there's no bias there.
A
Yes, I do see that there are guarantees of bias-free sampling here, but I think it goes a little too far in mandating p to convey adjusted count. Maybe your intent there is just that the scope of using p for adjusted count is probability sampling — if you're using probability sampling, then here's how you use p — I'm not sure.
A
If you mean, though, that p is the only way to convey an adjusted count, then we're precluding non-probabilistic samplers — and if you understand the trade-offs, those might still be appropriate for your use case. I would just like to see a more interoperable adjusted-count mechanism, or have it defined outside probabilistic sampling.
D
I think I understood this. This might be a tough topic to grasp, though, for everyone who hasn't been through the entire debate here.
D
So I wonder if I could summarize this. The feeling I got is that you just showed us a mechanism that computes what you called sample count by basically counting the number of yes/no decisions on a particular operation; then, whenever you sample, you essentially output a count of the number you've skipped, plus one. And I think the words you put on the previous slide are that this might not be perfectly unbiased, yeah.
D
A test case with a very contrived input would demonstrate why randomness is better, but I think you have a very good claim that this is good enough. And so, if you wanted to do something simpler, like you described, you wouldn't be able to use the adjusted count mechanism the way it's expected — partly because I specified how it should be unbiased, and so on.
D
So you might prefer to see us go all the way back to roughly where we were in the summer, when I originally proposed we use a span attribute to convey adjusted count — and we landed on: well, let's just use this trace state mechanism and record it with the spans.
D
It
sounds
to
me
like
we
could
add
a
portion
of
our
specification.
This
is
a
question
mark
really,
which
is
designed
to
help
essentially
in
this
type
of
situation,
where
you
have
another
way
of
computing
adjusted
accounts
which
may
be
biased,
but
are
good
enough
and
you
want
to
use
them
anyway.
D
If we had a semantic convention for a span attribute that was the adjusted count, then you could just use that — and that is actually one of the first proposals I had, so I understand it, I like it, I get it. What it would mean, I think, is specifying the rules of interpretation: you're looking at a span; it has an attribute that says sample count; it also has a trace state that says p-value. What does that mean? That's a question as well.
D
I take your point, though. I think it's a reminder that there was a lot of technical pushback on what we're doing, and there's also simple cost pushback, and adding a span attribute seemed to be more than we needed after all the debate — that's all I'm going to say; there's a lot of debate there.
D
However, it was going to be a logarithm again, because it's so few bits, and it wouldn't let you have an adjusted count of three, for example. So I can definitely see adding to our specification a span attribute, with semantic conventions, saying: this is my adjusted count — just trust me, I know what I'm doing. I just imagine a lot more questions coming up. You're using baggage, so it works?
A
Yeah, there are always obstacles. We actually take all of our baggage — except for a few metadata things that don't need to be revealed as span tags — and copy it all to span tags. It's kind of a way to do this denormalization at the clients, rather than somewhere further down the pipe, so that we can flatten out these key-values that help us correlate things.
A
And we find that a useful mechanism too, and it's no big deal that it's not part of any spec — we just went in there and did it because it works for us.
D
The metrics spec will tackle the question of how to configure attaching baggage to metric events in the next year, but I'm aware that it hasn't been done for spans, and I'm kind of surprised by that. Cool.
A
Yeah — so, just to sum this slide up: as long as there can be some room left for an adjusted count as-is, then I'm happy; and within the scope of probabilistic sampling, I love what I see — it all works and is very rigorous and efficient.
D
In the past I've referred to other ways of sampling which are perhaps not as good for most use cases we know about. But you can devise schemes that are not power-of-two-based sampling — you can imagine an adaptive reservoir scheme or something like that — that have adjusted counts which are not integers, and not necessarily powers of two either, of course. So I can certainly imagine I'm doing something esoteric:
D
I've plugged in my own sampling algorithm for whatever reason, and now my sample counts are floating-point numbers, and it would be cool if that worked with the vendors. There's one other esoteric corner case you run into when you start talking about this: people often want to turn their spans into a histogram, so you're counting latency measurements from your sampled spans.
D
If your sampled spans can have non-integer counts, now you need a histogram with non-integer counts, and we don't have that. So that's one of those other obstacles.
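The obstacle can be made concrete with a small sketch: building a latency histogram from sampled spans means adding each span's adjusted count — possibly fractional — to a bucket, so bucket totals stop being integers. The function and bucket bounds here are illustrative:

```python
import bisect

def weighted_histogram(latencies_with_counts, bounds):
    """Accumulate (latency, adjusted_count) pairs into buckets delimited by
    `bounds`. With non-integer adjusted counts, bucket totals are floats,
    which is exactly the difficulty described above."""
    buckets = [0.0] * (len(bounds) + 1)
    for latency, count in latencies_with_counts:
        buckets[bisect.bisect_right(bounds, latency)] += count
    return buckets
```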
D
Yeah — and I don't know what will happen with that. If we could just step back: it sounds like Will's suggestion is that it would be nice to have a broader, less-specified semantic convention to say: I know what my adjusted count is; please take this and use it, vendor, when you are counting my spans. I think that's reasonable.
D
I think that's worth filing an issue about and maybe coming back to. It's not something I personally need, given what I'm trying to accomplish for my own vendor's requests, but I proposed it in the past because I think it's fine — so that's good. But I think we may want to table that and come back to talking about configurable sampling in general.
D
I feel like everyone kind of knows what it is that we want, and we all have these examples in our head — Jaeger, and probably X-Ray — and what I'm hoping someone will do (it could be me, if my vendor wants me to, in the next year) is just sit down and write down
D
the specific proposed structure: the OTel sampling configuration. It might look exactly like the Jaeger sampling configuration, except that I expect certain semantic conventions to be updated — Jaeger has an operation name and we have a span name (those are the same); I think Jaeger has tags and we call them attributes (that's a difference). And we have parallels in the metrics space already, called Views, which are just getting to be stable in the OTel spec, which is where you say —
D
Hopefully we have a rate-limited one — there are probably several choices of rate-limited sampler that we know about, depending on whether you care about completeness or about rate limits being hard, and so on — but we essentially need a file format to do that, and I feel like all roads lead to configuration at this point in OTel right now.
D
As for a format in the SDKs: the collector has one, and that's probably the model we would look to. So yeah, I hope someone else can do this, because it hasn't been a priority, at least for us at LightStep — the priority is to get the counts done first — although our customers definitely want the same thing we just talked about. The only other concurrent parallel that's happening in the OTel group
D
that's related to this is that there is an agent configuration group talking about how we will configure agents. So if you had a file format, you could then go talk to that group about how to get it distributed to your agents, and then we could be talking about how to distribute it to your SDKs — and there are some open questions about whether agent management protocols can be used for SDK management.
D
Let's suppose they can't — that's too much complexity. Then we need something that's essentially a lightweight version of the agent management protocol, which is the same as what Jaeger has for its sampling endpoint: I'm an SDK; I've been configured with a destination for sending OTLP, but I also need a destination for getting my configuration. Is that the same as the collector I'm talking to? Is there a port I have to use?
D
Good — Will, the other Will. I was wondering if you would both make it here; sorry, I didn't notice — I didn't check my participants list. So Will is the other person who's been talking to me about this — both Wills — and so, X-Ray: I don't know the differences between the Jaeger and X-Ray approaches exactly, but this is what I'm interested in seeing.
C
Yeah, I was just going to say: we have myself and also Batik from the X-Ray team on the call, and we added an item to the agenda to discuss our remote sampling approach as well, which is something we were going to discuss later — but I don't know if now would be —
D
I think now is the perfect time — I'm moving you up in the agenda, so thank you. Will, I took some screenshots; I thought your slides were great. I'd be glad to have a link to them in the notes, if possible. So this is part two of our remote sampling discussion, then — thank you, Will.
D
Well, let's talk about it, why don't we? Now that we've discussed Will's backstory, I think it'd be nice to hear about the X-Ray system.
E
Hey everyone. I've attended the sampling SIG from a very early starting point.
E
Next, I'm just going to share my screen. I've sent a document to the Zoom chat; I'll share my screen to walk you through it.
D
Will Tran spoke about setting up Cassandra for the Jaeger centralized system; it looks like what we're going to talk about here is essentially Amazon's equivalent of that.
E
Yeah — so maybe you can go to the introduction.
E
Yeah, so I guess Will has talked about this — remotely configured sampling — in his slides. Basically, centralized sampling is where customers can configure the sampling configuration — say, reservoir size or fixed rate — somewhere in a remote console, and the SDK pulls in those configs and uses them to make sampling decisions.
E
The main goal of remote sampling is this: say you have multiple hosts in your fleet — five hosts — and the customer sets a configuration of five requests per second plus a five percent fixed rate. They want to trace five requests every second, but you still have five hosts in your fleet.
E
Typically this works well — or is easier to implement — in a one-host setup, where you can directly control how many requests you're going to sample in that second; with remote sampling it's a little trickier.
E
So the idea of centralized sampling is to distribute the sampling configuration across all your hosts. For example, if a customer defines five requests per second and you have five hosts, then maybe each host samples one request, and then the five percent fixed rate is applied, aggregated, on top of that. That satisfies the customer's criterion of sampling a particular number of requests within that second. That's the general idea of centralized sampling.
E
The X-Ray SDKs are open source and have provided this support probably since right after the GA launch, so it's been in production for quite a while now. The documentation mainly talks about how we can implement this
E
Using
the
open,
telemetry
and
as
as
basically
we,
we
would
also
want
to
define
like
like
a
like
a
spec
for
this,
so
that
you
know
everybody
can
utilize
this
pack
and
we
kind
of
like
build
a
general
model
for
like
configure
configurable
sampling
or
remote
sampling
or
centralized
sampling.
E
So I'm just going to walk you through the design of how the X-Ray SDKs implement centralized sampling. Can we go to the implementation design first, and then we'll go to the caveats? We've also implemented centralized sampling in OpenTelemetry Java, so, basically, here's how the general implementation flow works.
E
OpenTelemetry has a sampler interface, which defines each sampler. For centralized sampling we can create two decomposed samplers: one is the centralized sampler, which handles all the logic related to centralized sampling, and the other is the rate-limiting sampler. But before going into that, I'd like to walk you through the general workflow of how we do centralized sampling.
E
If you look at the centralized sampler: X-Ray provides two APIs, GetSamplingRules and GetSamplingTargets. GetSamplingRules pulls the sampling rules — pulls in the data
E
Like
you
know,
customer
data
that
has
been
set
by
the
customer
at
a
centralized
location,
which
is
like
a
reservoir
size,
fixed
rate
which
path
they
want
to
sample
like,
for
example,
if
it's
a
http
request,
then
there
would
be
some
paths
they
would
want
to
apply
for
service
type
service
name
all
those
parameters
which
we
definitely
have
some
equivalent
to
that
in
the
open
telemetry.
E
So using GetSamplingRules, the SDK gets that data periodically; and there's another API, GetSamplingTargets. The X-Ray subsystem computes the quota — basically, how many requests to assign each host: effectively, a reservoir size. The SDK makes those periodic calls to the remote back end; with the X-Ray SDK, currently, the fetching interval for sampling rules
E
It's
five
minutes,
because
we
don't
still.
We
still
don't
need
that
that
frequent,
but
for
sampling
targets
we
need
to
compute
like
more
frequently.
So
it's
default
period
is
10
seconds
so
that
that
would
continue
going
to
happen
like
in
the
background
of
the
sdk.
Now
like.
Basically,
the
main
idea
is
to
sdk
would
have
kind
of
contain
a
rule
cache.
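The two polling cadences and the rule cache might be sketched like this — a simple staleness check rather than real background threads, with the five-minute and ten-second intervals taken from the X-Ray defaults mentioned above and the fetch functions left as caller-supplied stand-ins for the two API calls:

```python
import time

RULES_INTERVAL = 300.0   # GetSamplingRules: refresh every 5 minutes
TARGETS_INTERVAL = 10.0  # GetSamplingTargets: refresh every 10 seconds

class RuleCache:
    """Keeps the SDK's local copy of rules and quotas up to date."""

    def __init__(self, fetch_rules, fetch_targets):
        self._fetch_rules = fetch_rules      # stand-in for GetSamplingRules
        self._fetch_targets = fetch_targets  # stand-in for GetSamplingTargets
        self.rules, self.targets = {}, {}
        self._rules_at = self._targets_at = float("-inf")

    def refresh(self, now=None):
        """Refetch whichever piece of state has gone stale."""
        now = time.monotonic() if now is None else now
        if now - self._rules_at >= RULES_INTERVAL:
            self.rules = self._fetch_rules()
            self._rules_at = now
        if now - self._targets_at >= TARGETS_INTERVAL:
            self.targets = self._fetch_targets()
            self._targets_at = now
```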
E
It contains the entire rule set, so that at any point the SDK is kept up to date with the recent changes on the remote backend. That's the idea behind it. Now, when there's a request coming in,
E
it's matched against the configuration that the back end has provided to the SDK — for example, host name, service name, service type, the URL path of the request. We're essentially matching the request attributes against the sampling configuration that has been set. If the request matches one of the sampling rules,
E
then, basically, it means those five requests are going to be sampled every second, irrespective of anything else. The customer sets a reservoir quota and a fixed rate; after the reservoir quota is consumed, we use the fixed rate to sample extra requests. It's kind of a guarantee for the customer: for this request path, five requests per second plus five percent are going to be sampled. So once the reservoir quota is consumed
E
for that second, the centralized sampler instantiates the other sampler — a rate-limiting sampler, kind of. For that sampler we can use whatever works for our case; when we do the implementation SDK-wise, there isn't a hard requirement.
E
We
could
have
used
like
3d
ratio
based
sampler
as
well
for
this
case,
but
it's
just
kind
of
like
some
probabilistic
sample
would
work
for
this
case
so
that
the
so
basically
the
major
idea
is
like
a
centralized
sample
would
basically
try
to
consume
the
reserve
quota.
Once
the
reservoir
board
is
consumed,
it
would
instantiate
another.
E
That's the case where the request is matched. For cases where the request doesn't match any of the sampling rules the customer has set, we have a default of one reservoir request per second and a five percent fixed rate. So that is the general workflow.
E
Both implementations — remote sampling with OpenTelemetry Java, and the X-Ray SDK — have two major caveats that we thought could be done better. One is that in the X-Ray SDK implementation we fixed the sampling-rules and sampling-targets fetching intervals, whereas in OpenTelemetry Java we let the customer — can you go up to the caveats?
E
Yeah — so with the OpenTelemetry implementation we're planning to provide a configurable interval that the customer can set when they set up the sampler. That would be nice for customers who aren't updating their sampling rules frequently:
E
they can manage some of the traffic going via the collector. The other major difference — not really related to this — is that the X-Ray SDK provides a local file of sampling configuration as a fallback.
E
It
would
use
that
file,
but
with
x-ray.
Sorry,
but
with,
I
think,
open
telemetry.
We
are
not
like.
I.
I
think
we
would
be
just
okay
to
like
provide
the
default
sampling
rules
so
yeah.
So
those
are
kind
of
like
major
caveats
can,
can
you
can
you
go
down
like
yeah,
just
still
go
down,
yep
yeah.
So
this
is
the
open,
telemetry,
sampler
interface
design
right
so
like.
E
Basically,
we
would
be
able
to
implement
like
a
standalone
sampler
like
centralized
sampler,
which
basically
implements
like
shoot
sample
and
description
methods
like.
Basically,
it
takes
the
sampling
parameters
like
kind
of
like
request,
and
then
it
would
feed
in
the
it
would
feed
in
the
sampling
configuration
that
it
has
been
received
and
makes
a
decision,
and
then
this
is
that
it
returns
the
response,
so
I
think
it
works.
I
think
basically
like
we
could
have
like
two
decomposed
samplers,
as
I
mentioned
above
so
this
is.
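The decomposition described — a standalone centralized sampler exposing the sampler interface's two methods — might look like this sketch. It is a hand-rolled stand-in rather than the real OpenTelemetry SDK interface; the method names mirror shouldSample/getDescription, and the rule and fallback shapes are invented for illustration:

```python
class CentralizedSampler:
    """Sketch of the decomposed design: rule matching lives here, and a
    caller-supplied fallback stands in for the rate-limiting sampler."""

    def __init__(self, rules, fallback):
        # rules: list of (matcher_dict, decision_fn) pairs
        self._rules = rules
        # fallback: decision_fn used when no rule matches
        self._fallback = fallback

    def should_sample(self, attributes: dict) -> bool:
        for matcher, decide in self._rules:
            if all(attributes.get(k) == v for k, v in matcher.items()):
                return decide(attributes)
        return self._fallback(attributes)

    def get_description(self) -> str:
        return "CentralizedSampler{%d rules}" % len(self._rules)
```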
E
This is the sampler interface design, which aligns with the implementation. And this is the Bernoulli sampling — we don't have to go through that — and we also don't have to go through the data model; it's minor detail, but this is the general data model we're going to use. There's one more thing I wanted to talk about — can you go down here? A little more — yeah. I wanted to talk about the matching data model.
E
So basically, whenever there is an incoming request, there would be some equivalent of data model fields like service name and service type in the OpenTelemetry data model.
E
The only concern here is that we wouldn't be getting all of these data model fields in the sampler. For example, some of the parameters are very specific to the instrumentation, like URL path and HTTP method attributes; we wouldn't have those in the sampler. We can't really define them ourselves, because it would be very specific to the instrumentation and how the instrumentation is going to emit those fields for us to use.
E
So one possible solution would be: whenever we define a sampler as shown above, we can take the resource, which would probably give us the service name and service type, and we can match against that; the matching of the remaining fields would probably be dependent on the instrumentation. I don't have any clear idea on what would be good for OpenTelemetry use cases, but I'm open to any suggestions.
A
Yeah, I totally see that relying on the existing data model is really specific to a particular use case. Imagine messaging systems: you might want to match on topic name, and that's none of these existing fields here. But if you want to open it up, attributes in OTel spans are just key-value pairs, and they could be anything, so you could open matching up to all attributes and throw in some key-values.
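The split just discussed, between resource attributes (such as service name, known when the sampler is constructed) and span attributes (such as HTTP method, only available per call), can be sketched as a merge-then-match helper. This is a hypothetical illustration; the attribute names are examples of OTel semantic-convention keys, not a specified matching algorithm.

```python
def match_rule(rule_attrs, resource_attrs, span_attrs):
    """Check whether every key/value constraint in rule_attrs is met.

    resource_attrs: known at sampler construction (e.g. service.name).
    span_attrs: arrive per should_sample call (e.g. http.method).
    Span attributes override resource attributes on key collision.
    """
    merged = {**resource_attrs, **span_attrs}
    return all(merged.get(k) == v for k, v in rule_attrs.items())
```

A rule requiring a key that neither the resource nor the span provides simply fails to match, which matches the concern above that instrumentation-specific fields may be absent in the sampler.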
A
The thing I would like to see fleshed out is how we deal with multiple potential matches for a sampling rule. You might have ones that are more specific or less specific. Do we automatically determine specificity and ordering through the configuration, or do we just...
D
Will, you're breaking up. I think Will just dropped off, but I can finish his sentence for him. The same issue comes up in the metric views specification, which I was talking about earlier, because the need for a configurable sampler is very closely analogous to the metrics case.
D
He was getting to the point that it's tricky when you have more than one rule matching. Do you go in order? Do you consider all of them? There's certainly probability logic implied, and certain answers may or may not run afoul of the probability rules.
E
Right, so we have a priority that we set for every rule in the remote configuration, in the back end, and if we have a first match, then we would use that rule to do the sampling.
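The priority scheme described, where each rule carries a priority assigned in the back end and the first matching rule wins, can be sketched as follows. The tuple layout `(priority, matcher, rate)` and the lower-number-wins convention are assumptions for illustration only.

```python
def select_rule(rules, attributes):
    """Return the rate of the highest-priority matching rule.

    rules: iterable of (priority, matcher, rate) tuples, where a lower
    priority number is assumed to win. Returns None when nothing matches.
    """
    for priority, matcher, rate in sorted(rules, key=lambda r: r[0]):
        if all(attributes.get(k) == v for k, v in matcher.items()):
            return rate  # first match in priority order wins
    return None
```

This sidesteps the specificity question Will raised by making ordering explicit in the configuration rather than inferring it from how specific each matcher is.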
D
For probability sampling, we have this consistent mechanism, and we have specified a mechanism for composing samplers. So if each rule is a consistent probability sampler, we should be able to compose them, because they can all consistently apply their probabilities. That problem may be solvable: the idea is that you can let all of the different rules be applied at once, and if any one of them decides to sample, you sample. Although that might not...
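The "sample if any rule samples" composition works for consistent probability samplers because every rule consults the same randomness. In the p-value/r-value formulation from the probability-sampling PR, a sampler with power-of-two probability 2^-p keeps a span when p <= r, so the union of several rules collapses to the smallest p-value. A rough sketch, assuming power-of-two probabilities:

```python
import math

def p_value(probability):
    # p-value = -log2(probability), assuming a power-of-two probability.
    return round(-math.log2(probability))

def consistent_sample(r, p):
    # Keep the span when p <= r, where r counts the leading zero bits
    # of the span's shared trace randomness.
    return p <= r

def compose_any(r, p_values):
    # "Sample if any rule samples": all rules see the same r, so the
    # union is one sampler with the smallest p (largest probability).
    return min(p_values) <= r
```

So composing a 1/32 sampler (p=5) with a 1/4 sampler (p=2) behaves exactly like the 1/4 sampler alone, which is why the composed probabilities stay well-defined.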
D
I think that's a good outcome. I don't want to distract us, but much like in the metric specification, there's a lot of complexity about how you specify these rules, and the priorities are implied.
E
Right, but I guess in that case it would still depend on where we are configuring the sampling configurations, because it would be different if we change the vendor; it would be very vendor-specific, I think.
D
So,
thank
you,
bautic
and
will
I
think.
D
As an engineer who's interested in this topic, I do have some questions about how the centralized reservoir is managed, but I don't think we should belabor those now, especially with only a couple of minutes left in the hour. I think the outline of the roadmap is pretty clear here: we want configurable sampling, and we want a way to describe a bunch of rules, where the rules are based on key-value pairs and the keys and values are semantic conventions that OTel has specified.
D
The
the
outcomes
of
these
rules.
Evaluation
are
probability,
sampling
or
other
sampling,
and
this
this
idea
of
a
fixed
rate,
a
minimum
rate,
is
a
non-probability
sample
or
the
way
we've
specified
it.
So
that-
and
this
was
actually
a
strong
use
case
that
was
given
in
the
beginning-
is
that
I
want
a
probability
standpoint
and
a
minimum
threshold,
and
so
we
can
do
those
so
that
would
be
a
standard
configuration.
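The "probability sampler plus a minimum threshold" configuration just described can be sketched as a probability sampler backed by a per-second floor. This is an illustrative composition, not a specified design; the one-second windowing and the class name are assumptions, and as noted above the minimum-rate branch is a non-probability decision.

```python
import random
import time

class ProbabilityWithMinimumRate:
    """Sketch: probability sampling with a spans-per-second floor.

    Spans kept by the floor are a non-probability decision, so they
    cannot carry an unbiased adjusted count (per the discussion above).
    """

    def __init__(self, probability, min_per_sec):
        self.probability = probability
        self.min_per_sec = min_per_sec
        self._window = None  # integer second of the current window
        self._taken = 0      # floor-kept spans in this window

    def should_sample(self, now=None):
        now = time.monotonic() if now is None else now
        window = int(now)
        if window != self._window:
            self._window, self._taken = window, 0
        if self._taken < self.min_per_sec:
            self._taken += 1
            return True  # minimum-rate floor keeps this span
        return random.random() < self.probability
```

With `probability=0.0` and `min_per_sec=2`, exactly two spans per second are kept regardless of traffic, which is the "minimum threshold" behavior.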
D
Then you run this configuration, you match your span, you get a bunch of rules, they decide to sample or not, you output a probability, and that's good. I think we should resume this discussion, but I think we should also find someone to champion it and make a proposal for us. I don't see it happening in the next few weeks myself, because we're entering the holidays and there's a bunch of other stuff happening in OTel right now, but this is definitely up next.
D
I don't think we have time to discuss any remaining issues on the agenda. There was this PR of mine, which has an approval from Otmar; thank you, Otmar. It already has a couple of approvers from the OTel spec approvers list, and I think it's going to merge.
D
We did discover some corner cases that are not exactly correct, but it's not because of the probability sampling; it's because of the existing parent-based sampler. If anyone is curious about what I just said, this link here is worth reading through. The last, and I think most exciting, part, with maybe only a couple of minutes left, is that Otmar has posted this consistent reservoir sampler. I believe the algorithm is similar to the partial tracing analysis algorithm, but I don't think I know the algorithm.
D
It is definitely a curiosity for me and something I'd like to know more about. Otmar, is this something that you've written up in one of your papers as well, or is this a proof of concept in code that you're going to...
B
...write up as a proof of concept. I still need to write some more unit tests, but the idea is basically: you have a random number, and to make reservoir sampling consistent, you keep just the smallest or the largest random numbers. Based on the r-value, you already know that your random value had a certain number of leading zeros, for example, and if you incorporate that in your reservoir sampling algorithm, then you make sure that you keep the spans which have larger r-values in your reservoir.
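The idea Otmar describes, keeping a fixed-size reservoir of the spans whose randomness has the most leading zeros (largest r-values), can be sketched with a min-heap. This is a guess at the shape of the algorithm from the description above, not Otmar's actual proof of concept; the random tiebreak for equal r-values is an added assumption so that ties are broken uniformly.

```python
import heapq
import random

def consistent_reservoir(spans, k):
    """Keep up to k spans, preferring larger r-values.

    spans: iterable of (r_value, span) pairs, where r_value is the
    number of leading zero bits in the span's trace randomness.
    """
    heap = []  # min-heap ordered by (r, tiebreak); root = weakest keeper
    for r, span in spans:
        tb = random.random()  # uniform tiebreak among equal r-values
        if len(heap) < k:
            heapq.heappush(heap, (r, tb, span))
        elif (r, tb) > (heap[0][0], heap[0][1]):
            # New span outranks the weakest span currently kept.
            heapq.heapreplace(heap, (r, tb, span))
    return [span for _, _, span in heap]
```

Because the surviving spans are exactly those above some r-value threshold (plus a uniform choice at the boundary), the result stays compatible with consistent probability sampling, which ordinary reservoir algorithms are not.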
D
Cool,
that's
really
exciting.
All
the
other
algorithms
I
knew
to
do.
Reservoir
sampling
were
not
going
to
work
with
consistent
sampling,
so
this
is
cool.
I
look
forward
to
more
on
that.
I
think
we're
out
of
time
right
now.
I
would
especially
thank
you
will
and
biotic
for
presenting,
and
I
think
that
we
have
a
little
bit
more
of
clear
road
map
for
what
what's
next,
and
I
think
I'll
see
you
here
again.
D
I don't know that these meetings are useful to have every week at this point. Does anyone have a feeling as to whether we should do this every two weeks?
D
I'm going to answer that myself: I think we should do this meeting every two weeks. I'm going to talk to someone in OTel about updating the calendar link to get the Zoom links right, and I will post in Slack when I succeed, hopefully, at rescheduling this down to once every two weeks.
A
And I'll try to get the slides linked in the docs; I just need to clear it with legal. Thanks.