From YouTube: 2022-05-19 meeting
Description
cncf-opentelemetry meeting-2's Personal Meeting Room
C
It occurs to me, I don't know what time zones folks are in. So you agree with me that it's the morning? I take it you're here in the U.S. Pacific as well. Yes, that's right! Yes, okay, okay.
A
Well, this small group, we know each other. I, as you know, have been focused a lot on the OTel metrics specs and have not really been paying too much attention here. We did get our spec marked stable, so that was a major milestone. Right now I'm focused on finishing our exponential histogram work, which was also backed up, so I still am not really focused here, but I did want to come and see what the group has been discussing. And Spencer, you put a question in Slack; we should probably discuss that.
C
Sure, okay, thanks. So yeah, I indeed had kind of... I don't want to call it a stream of consciousness; I tried to make it a little bit coherent, but there are a lot of questions in Slack. I think I identified what appeared, at least to me, to be internal inconsistencies in certain spec language, at least one, maybe two, I'm not really sure. And yeah, Joshua.
C
I see you nodding. So I'd like to say just a little bit about why I was even reading the spec in that way and what my objective was. In my actual day job I'm trying to do some OpenTelemetry sampling instrumentation of our systems in a way that is as forward compatible as possible with stuff...
C
...that I happen to know is coming, from being aware of this group and what we are working on. So I was trying to make it forward compatible in this way. Specifically, I think Joshua introduced the term "span-to-metrics pipeline," and that's what I was trying to do.
C
My task at work was to determine how... part of our telemetry pipeline is converting OTel data to a different representation where, rather than the p-values and r-value that this group has defined, the concept of adjusted count is stored, not in trace context alongside each span, but actually as a span attribute on each span.
C
I won't get into why that's not the best choice; as this group understood, it's better to put it in trace context. But this is just the system that I'm working with: it wants to put adjusted count as a span attribute whose value is an integer. That's the integer n in the sentence "one in n," so for 1-in-12 sampling you would set this span attribute to 12, and that would be probability 1 over 12.
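A minimal sketch of that translation, assuming the spec's power-of-two p-value encoding (a p-value of p means probability 2^-p, i.e. an adjusted count of N = 2^p); the attribute name is made up for illustration:

```python
# Sketch: translate an OTel p-value into a "1 in N" span attribute.
# Assumes the power-of-two encoding, where p in 0..62 means sampling
# probability 2**-p, i.e. adjusted count N = 2**p. The attribute name
# "sampling.adjusted_count" is hypothetical.
def p_value_to_attribute(p: int) -> dict:
    if not 0 <= p <= 62:
        raise ValueError("only plain probability p-values handled here")
    return {"sampling.adjusted_count": 2 ** p}

print(p_value_to_attribute(2))  # {'sampling.adjusted_count': 4}, "1 in 4"
```

Note that the 1-in-12 example above is not a power of two, which is the same non-power-of-two tension that comes up again later in the discussion.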
C
So that's kind of my use case. I'm trying to translate between the two: I want to instrument my system to set p-values and such, and then have a downstream converter that, at the last moment before storing into my storage system, would convert it to a span attribute with these other semantics, this "one in n" semantic.
C
So that was my setup. In order to specify what that transformation ought to be, between OTel p-values and this n in "one in n," I need to consider all the possible values of p that could be observed on incoming data. That's pretty clear for what I'll call zero through 62.
C
But then there's this 63 value that has a certain meaning. And that's how I got very focused on trying to be sure I totally understood what it would even mean for me to have recorded and exported a span whose p-value was 63. That might even be a contradiction and probably wouldn't ever happen, but I was trying to specify what this should... you know.
A
Yeah, I do think you probably have identified some inconsistencies in this text; we should look at that. The key that I think may be lost from the nitty-gritty text into the high-level idea here is that that value of p of 63 is meant to be special, because there's no "one in n" that gives you... you know, the value of n is, of course, infinite or something like that.
A
It would be an acceptable approximation, though, to use two to the negative 63 in this case: it's almost zero, very close to zero, and that's the reason why it ended up on that end of the spectrum. The primary reason I saw for that to happen came when we started talking about these so-called non-probability samplers. That was a creation to accommodate... essentially, the sort of leaky-bucket abstraction is close enough to a probability sampler in its requirements that it's not a good example, but we have examples that are more like "I want to see one trace per minute." That's just not a probability at all; there's not even a rate there.
A
So if you have a composite sampling policy that says "I need to select one per minute and one in ten," then you end up with this case. It's a corner case, but it's a case where the one-per-minute selected but the probability didn't select, and so we wanted to distinguish between unknown and zero in that case. An unset p-value means unknown, and with p of 63 you do end up with a known probability: zero.
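Under that interpretation, a downstream decoder might treat the three cases like this (a sketch; the return convention is my own, not spec text):

```python
from typing import Optional

# Sketch of decoding a p-value into an adjusted count, per the discussion:
#   p in 0..62 -> ordinary adjusted count 2**p
#   p == 63    -> sampled with known probability zero -> count 0
#   p unset    -> unknown probability -> None, excluded from estimates
def adjusted_count(p: Optional[int]) -> Optional[int]:
    if p is None:
        return None
    if p == 63:
        return 0
    if 0 <= p <= 62:
        return 2 ** p
    raise ValueError(f"p-value out of range: {p}")
```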
C
Okay, yeah, that's kind of where I landed. And do you also agree that, in theory, in the system that I've described, where you have services that are emitting spans, they probably shouldn't ever emit a span with a p-value of 63? That is, is that value internal, something that might arise within a process, but unrealistic to be present on an exported span?
A
You know, if you didn't sample, you still want to record it, because it's useful information, but it counts for zero. That sounds a little contradictory, but it's better than producing an unknown when you know that the probability was zero, because otherwise you can't count, I think.
C
It's tricky because of my situation. I think I'm struggling to grasp the situation where simultaneously we want to record it, but also, in statistics about sampled spans, you would leave it out. Literally, I understand what it's describing, but as a user of it, it's quite odd, I guess.
A
Because mathematically, the expected count has to be zero, for the sums to work out, on those things that you didn't select. I'm not sure how to help with this intuition at this point. Yeah, if you don't have a non-probability sampler, this just doesn't come up.
A
Again, I do feel that you would be within reason to use the value of 2 to the negative 63 as a substitute for 0 in this case... well, actually, wait. That's going to mean counting it.
C
Another question I had: one aspect of this spec, which doesn't sound like it's in question or anything, is...
C
I think I have a really 10,000-foot question about what we mean by composition. Specifically: I may have confused it initially for one process that I'm familiar with. I had it in mind when I was reading the spec, and then I thought...
C
...maybe it's not talking about this. This is the process where you have sequential decisions: the case where you have a sequence of services that call each other. A receives a request from the internet, A calls B, B calls C, C calls D. I'll call that sequential application of sampling, and I wasn't sure... that was my mental picture.
C
When I started reading the section on composition rules, I wasn't sure if that section described this sort of sequential thing, or if it described a different notion of composition. So I wondered if you, or anyone, could make what we mean by composition a little bit more concrete.
D
So composition is when you have multiple samplers deciding about whether to sample a single span. That's what was meant by composition. And as Josh explained, there can be some strange things happening when you have a non-probability sampler coexisting with a probability sampler, which causes some headaches, of course. But in general you can have an arbitrary number of samplers contributing to making the decision.
C
Okay, yeah, thank you. That was where I landed: it's multiple samplers OR'd together for a single decision, a joint decision for a single context. That made sense. What ultimately helped me land there was where the spec says, in terms of combining their actual results, just take the logical OR. I thought...
C
...oh, it's a single decision, we're OR-ing it; that's what is meant here by composition. Okay. One thing that I remain a little bit unclear on, and I would appreciate any explanation, and I said this in Slack: basically, when you're OR-ing a probability sampler and a non-probability sampler, it is surprising to me what the outcome is in terms of adjusted count. Well...
C
...I actually don't want to misstate it, but I have a little table that summarizes all the different rules. So if you have mixed probability and non-probability samplers, it was: if the consistent probability sampler says sample, then you take its probability, whereas if it doesn't, but the non-probability one does, then that's 63.
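That rule table, expressed as a sketch (this is my reading of the composition rule being discussed, not spec text; None stands in for an unset p-value):

```python
from typing import Optional, Tuple

# Sketch of OR-composing one consistent probability sampler with one
# non-probability sampler, per the rule table just described:
#   probability sampler samples     -> sampled, keep its p-value
#   only non-probability samples    -> sampled, p-value 63 (known zero)
#   neither samples                 -> not sampled, p-value unset
def compose(prob_sampled: bool, p: int,
            nonprob_sampled: bool) -> Tuple[bool, Optional[int]]:
    if prob_sampled:
        return True, p
    if nonprob_sampled:
        return True, 63
    return False, None
```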
C
That is the rule that I have written here, and it was surprising to me. It would have been my intuition that if you combine a probability sampler with a non-probability sampler, regardless of each sampler's decision, the resulting adjusted count is unknown, due to the non-probabilistic sampler sort of poisoning the...
C
...you know, knowability of the adjusted count, and that having a sort of viral effect: if you ever OR anything with a non-probabilistic sampler, you're now in the unknown regime. That was my intuition. I want to hear why that's not correct.
D
Right. So when you OR together a decision from a probabilistic sampler and a non-probabilistic sampler and they both decide to sample... you cannot treat it as a very simple boolean OR; you have to look more carefully at the cases. If the probabilistic sampler decides to sample, regardless of whether the non-probabilistic sampler decides to sample or not, you know the adjusted count.
A
Thank you, Peter. I was going to say roughly the same: you know the probability from the probability sampler and you know nothing from the non-probability sampler, and when you combine a known probability with nothing, you still get a known probability. And that p-63 value is the corner case to make it all work.
A
Meaning, if the probability sampler rejects, that's zero probability, so zero adjusted count, and nothing a non-probability sampler can do will change that. All the non-probability sampler can do is change whether you record the span or not.
A
This is a conceptual leap, so I understand. And the other thing that is perhaps lacking, or taking away from clarity here, is that the rules of probability that we learned in school, for normal probabilistic composition, are not being followed here. When you have two probability events and you want to AND them, you usually multiply probabilities; that's the common case, the kind of intuition we have. But for consistent decisions...
D
Well, the rules do not apply because these are not independent decisions. So let's say you have two consistent probabilistic samplers, OR'd together or AND'ed together, but they have different probabilities.
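To make the correlation concrete, a small simulation sketch. It assumes the spec's threshold-style predicate, "sampled iff p <= r," where both samplers read the same r-value, so their decisions share one source of randomness; under that assumption the OR selects with probability 2^-min(p1, p2), not the independent-events formula 1 - (1 - 2^-p1)(1 - 2^-p2):

```python
import random

# Sketch: two consistent samplers share the span's r-value, so their
# decisions are perfectly correlated rather than independent.
# Assumed predicate: sampled iff p <= r, with P(r = k) = 2**-(k + 1),
# so a lone sampler with p-value p selects with probability 2**-p.
def draw_r() -> int:
    r = 0
    while random.random() < 0.5:
        r += 1
    return r

p1, p2 = 2, 4                        # probabilities 1/4 and 1/16
trials = 200_000
hits = 0
for _ in range(trials):
    r = draw_r()
    hits += (p1 <= r) or (p2 <= r)   # OR-composition on shared randomness

print(hits / trials)  # ~0.25 = 2**-min(p1, p2), the larger probability,
                      # not 1 - (1 - 1/4)*(1 - 1/16) ≈ 0.30 (independent)
```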
C
I halfway understand the notion that one of these decided to sample, so the other one is immaterial. And I recalled... actually, I'm going to post in the Zoom chat what I think my hang-up is: looking at this from a more analytic angle. So I shared a message; this is from when I was trying to rationalize this to myself earlier.
C
I was trying, just from first principles, to say: okay, I have two things; what's the probability that at least one of them comes out a certain way? And so what I wrote here does not...
C
...it doesn't consider the actual decisions produced. It considers the parameters of the two samplers, the p1 and the p2. Only one of those... one is not even probabilistic, but I was trying to come at this from the angle of: what is the equivalent joint probability? Well, you can't say, because you don't know one of the p's, and that's, I think, how I arrived at mine.
C
I think I still struggle with understanding why this angle of approach leads me to the wrong conclusion.
E
With the non-probability sampler, I think the combination is not really defined. So if you combine both, then you just take the sampling decision of the consistent one and derive the p-value from it; it's the p-value of that one, and the other one does not have any impact.
E
But if you're just combining consistent samplers, as Peter already said, these are consistent, not independent, sampling decisions. So that's why the formula you described here in the chat would not apply.
C
That I follow, yeah.
E
And a use case for that is, for example: you have one sampler which samples, say, one quarter of all spans, and then you have an additional sampler which samples every span with an error at 50%, and you want to combine those. So you can define those two samplers and combine them with an OR, for example, and then you end up with a sampler that samples all the errors at 50% and the remaining spans at 25%.
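A sketch of that combined sampler, under the same shared-randomness assumption as above: error spans get p-value 1 (one half), everything else p-value 2 (one quarter), and the OR amounts to taking the minimum applicable p-value; the span shape here is hypothetical:

```python
# Sketch of the OR-composed sampler from the example: 50% of error
# spans, 25% of everything else, on the "sampled iff p <= r" predicate.
def choose_p(span: dict) -> int:
    # minimum p-value over the OR'd samplers that apply to this span
    return 1 if span.get("error") else 2   # 2**-1 = 50%, 2**-2 = 25%

def sample(span: dict, r: int) -> tuple:
    p = choose_p(span)
    return (p <= r), p                     # decision plus recorded p-value
```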
C
Yeah, I will say I'm very clear on taking the smaller probability of two consistent probability samplers: because they share their source of entropy, they're very correlated. That all makes sense to me.
C
I'm trying to convince myself that it's not an arbitrary philosophical choice. I'm still framing it in terms of the parameters of the two samplers themselves, and I'm struggling. I hear you and Peter saying it actually depends on their outcomes, that their specific realizations would change what the intrinsic...
C
I think my contention is that there is an intrinsic adjusted count for this pair of samplers. And what you are saying, and the spec says, and what I think I'm just not yet grasping, is that there is no intrinsic adjusted count for when this pair selects or doesn't; it's a function of their specific decisions, and there's a two-by-two matrix of the possible decisions that a pair of samplers could produce.
E
In general, the samplers are allowed to choose a sampling rate for each individual span, so the sampling rates or sampling probabilities do not have to be the same. Basically, for every span you're free to choose a sampling probability, maybe based on span attributes or whatever. And from that, if the span gets sampled, the span gets its individual p-value. So a sampler could implement very complicated conditions or formulas for how to derive the sampling probability for a specific span, and with composition you can express some more complicated samplers in an easier way.
A
This reminds me that there were other ways we could have gone forward with a specification about, say, multiple sampling policies being operational at the same time, and/or independent. We could have just described the situation where you have a sampler that's 10 percent of all spans and 50 percent of all errors, and we have a way to define... we think it makes sense to combine those into one sampler.
A
That's a combined sampler, but we could have also had a conversation about how to report multiple samples, so that this span can be reported as two adjusted counts: it has an adjusted count for the one sampler and an adjusted count for the other sampler. As long as you don't combine those two counts, you have accurate counts: you have a count of errors and you have a count of spans, and they're different counts.
A
So at some point I think there's an option to expand the scope here and have a way of carrying out multiple sampling operations that are not combined, that are maintained as separate. But that would mean having essentially multiple adjusted counts, or multiple trace states, for each independent experiment or sampling procedure that you're doing. I just wanted to throw that out.
C
Yeah, that's interesting; I've not considered that. And I will say I am decently acquainted with the world of choosing a p for a specific decision based on span attributes or things like that. But I think I need to go to a desert and meditate on this: there's no intrinsic adjusted count for a pair of samplers; it depends on the outcomes of the two samplers.
C
What the adjusted count is... I think that is at the core of my misunderstanding. If I come up with a different way to phrase it, or if I have any insight that might be useful from a pedagogy perspective for future people reading this stuff, I'll follow up. But I think, for me, before this thread takes up all of your time, it's kind of at its end.
C
So thank you for taking a half hour to try to explain stuff to me. I appreciate it.
A
Well, thank you for that, Spencer. I think we may have reached the end of your topic, and there is not anything on the agenda. Would anyone else like to raise a topic or discuss? And then, as always, I hope to be able to do more on this in the future and keep going.
A
All right, if anybody else has a topic, last minute, here we are. Otherwise, I think we should move on and keep working.
C
I did have another thing. I was trying to figure out how relevant it would be to our previous discussion; I don't think it was super relevant, which is why I didn't bring it up. But I was working on a problem adjacent to what I was just talking about, and I think I might have independently arrived at...
C
We've talked about this c-value in this group before, and I think I might not have fully understood it when we previously talked about it. Then I was thinking about some concrete problem I had, and what I came up with... I thought, oh, this might be that c-value we were talking about. So I'm actually going to share just a small set of notes.
C
I was trying to reason about what the adjusted count would be for a sequence of decisions: not a single context, but a context going through multiple separate decisions, separate samplers. And where I kind of landed was... actually, it depends: do you want to start abstract or concrete? Abstract is up here, concrete is down here. I'll give you a couple minutes to read this; I think it's all on the screen now.
A
Well, you have these independent probabilities, which you're multiplying together, right? That was what we were talking about: none of the consistent sampling composition rules that we have are designed to handle that type of scenario.
E
Yeah, for this scenario you actually need to collect another kind of p-value for the independent samplers, right.
E
I think the c-value was related to that; tail sampling is probably also independent, so you could maybe also interpret it as a c-value. But the point is that you need two different values.
E
You're not allowed to mix those values, because otherwise you cannot estimate correctly.
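A sketch of what keeping them separate could look like: the consistent stage records its p-value, independent stages multiply into a separate c-value, and only the final estimator combines the two into one weight (field names are hypothetical):

```python
# Sketch: carry the consistent p-value and the independent c-value as
# two separate fields, combining them only at estimation time.
def record_independent_stage(span: dict, one_in_n: int) -> None:
    span["sampling.c"] = span.get("sampling.c", 1) * one_in_n

def weight(span: dict) -> int:
    # total adjusted count = 2**p (consistent) * c (independent);
    # folding one into the other early would break the estimates.
    return (2 ** span["sampling.p"]) * span.get("sampling.c", 1)

span = {"sampling.p": 2}            # consistent 1-in-4
record_independent_stage(span, 3)   # independent 1-in-3, e.g. a collector
print(weight(span))                 # 12: each surviving span stands for 12
```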
C
Right, that was my conclusion as well. Okay. I guess my next question, then, may be directed at the people with actual customer experience, like Joshua and Ben. Obviously this group prioritized a scheme for transmitting consistent sampling over inconsistent, or independent, or whatever, and I was curious: can anyone think of a case where someone would reasonably want to say...
C
..."oh no, I want each node in my system to make an independent decision"? Is that a thing that anyone can conceive of demand for, or was this just an academic exercise on my part?
A
Don't we get a benefit in the common, simple, straightforward case, where you're sampling at the root? I think that's the case where it is possible to mix an independent probability decision with a consistent probability decision, and this is where, I think, Spencer has pointed out, we could add a c-value that can be maintained in parallel correctly. I think we would have to spell out the rules; is that what we're getting at here?
E
Another thing which has to be considered, if you allow independent sampling in the context of traces, is that the extrapolation can be very expensive... the estimation can be very complex, if you have to account for those independent sampling decisions. Especially if you're counting, for example, where you want to count the number of traces which called some service A and service B.
C
I'm not sure I followed. Is that saying that... I feel like I just heard, beneath the surface of what you said, something quite profound, which is that throughout a system there's no measure that can be taken so that local, isolated parts of the system can decide how to do things on their own? Rather, the entire system has to participate in a coherent sampling design in order for the whole system's output to be validly countable?
E
Or, to say it in different words: the number of possible outcomes, if you have independent sampling, grows exponentially with the number of sampling decisions, because every independent sampler has its own decision, which is completely independent. So if you have, say, 10 samplers, then there are 2 to the power of 10 outcomes of what you could get from a trace. Whereas with consistent sampling...
E
...the number of possible outcomes is quite limited; it's basically the number of distinct p-values which have been chosen within the trace. So it's much better to analyze. For estimation you usually have to consider all the possible outcomes and use the corresponding extrapolation factors and things like that, and so it becomes infeasible if you have independent sampling decisions, or if you have many of those.
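A tiny enumeration to illustrate the gap (a sketch, not spec logic): n independent per-service decisions give 2^n possible observed subsets of a trace, while with consistent sampling the observed subset is determined by where the trace's shared r-value falls among the distinct p-values in use:

```python
from itertools import product

services = 10

# Independent sampling: every service keeps or drops on its own coin,
# so any keep/drop pattern is a possible outcome.
print(len(set(product([0, 1], repeat=services))))   # 2**10 = 1024

# Consistent sampling with predicate "sampled iff p <= r": the outcome
# depends only on which distinct p-values clear the shared r-value.
p_values = [1, 2, 2, 4]          # p-values used within the trace
print(len(set(p_values)) + 1)    # one outcome per distinct p-value,
                                 # plus "nothing sampled" = 4
```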
C
Okay, I think that is a statement against what I had presumed to be true, although it may not be true: I'd imagined I could have a graph of services or processes or whatever, and those could all be configured to be doing independent sampling, and my presumption was that there exists some way of reporting that on span contexts and such that accurate, valid estimates are possible.
E
For per-span sampling decisions, yeah, that's actually what you did here. But if you want to estimate more complex quantities from a trace, then it's not so obvious anymore.
C
I think one thing that might be illustrative: consider a system where service A selects with probability one quarter, and it calls service B, which selects with probability 1/16...
C
...and is your statement about traces from such a system... I guess I would ask: what's an example of a quantity that would be difficult to estimate in that system, where A calls B, each doing independent sampling? What is such a quantity?
E
An example is if you want to know how many traces called A and B. This means that you need to see a connection between A and B within a trace, and you only see this connection if all the spans in between are sampled. So if you have independent sampling decisions, first of all you have a very low probability of seeing this connection at all.
E
Even if the estimation were simple, you get high variance, a high statistical error, for the estimate.
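The arithmetic for that A-calls-B example, as a sketch: independently, the A-to-B edge is observed only when both coins land, while consistently it is observed whenever the shared r-value clears the rarer of the two rates:

```python
# Sketch: probability of observing the A->B edge in a single trace.
p_a, p_b = 1 / 4, 1 / 16

independent = p_a * p_b      # both independent coins must hit: 1/64
consistent = min(p_a, p_b)   # shared randomness: the rarer rate decides: 1/16

print(independent, consistent)
# An unbiased edge-count estimator weights each observed edge by the
# reciprocal, so independent sampling uses weights of 64 instead of 16,
# which is where the much higher variance comes from.
```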
C
I appreciate the concrete examples of quantities that are either impossible to validly estimate or merely difficult. The reason I appreciate them is: first of all, I think no vendor in the observability space supports such quantities today, like the average depth of a trace or something like that, so those don't occur to me as possible things that could be estimated. But I have no doubt that actual research on tracing systems has thought up some pretty interesting quantities that could be asked of a tracing system that aren't yet in my intuition as things I would even care about.
F
I'm wondering if we are going to get pushed to have a stronger way of representing those two pieces coming together, and coming up with a useful probability, rate, or some sort of number that vendors can then use in the kind of trace-to-metrics pipeline, other than how it feels now. And I will totally admit that I'm only about 50% on the details, but it does feel like now we lose...
F
...we lose some granularity or some detail, or have the possibility of losing it, where if we end up in the space of this p-63 number, it's kind of game over at that point; there's not much to do to come back out of that.
E
Tail-based sampling and this kind of sampling, I think they can be combined; they're quite independent. The thing is that tail-based sampling is usually sampling the whole trace: it composes the trace first, and then you have one consistent decision across all spans of the trace. So it's just a further sampling stage, and of course, regardless of which sampling probability you choose for the trace...
E
...you do not have additional possible outcomes. So it's quite an orthogonal approach; they can be combined, can be used together. In the first stage you have span sampling using consistent sampling, and then, once you have the full trace already composed, you can add another sampling decision which acts on the whole trace. So I do not see any problems for tail-based sampling here.
F
Yeah, I think maybe that's what I'm getting at: whether that becomes an easier use case to focus on, to add any additional parts of the spec that might be necessary. Or maybe not; maybe what I'm hearing is that it actually could kind of work today, but that might be a nice place to start from in expanding the stretch, or the scope, of the spec that we have.
A
I want to say, I think with what Otmar has proposed, I've seen the code, and I think I get the basic idea of the algorithm to do consistent tail-based sampling. On the other hand, I want to...
E
Reservoir sampling of spans, right, which happens consistently. So this is a different thing; it comes before, and it can still work. The point is that you have the same place to do an immediate sampling decision, so in the processor you can buffer the spans, then drop spans, and then adjust the adjusted counts correspondingly. But this is still before the spans are combined with each other, before the tree is built.
E
Because what I consider as tail-based sampling is when you already have all the spans of a trace physically combined, and then you make a sampling decision on the whole thing. So that is just an additional stage. What I've proposed is an additional stage between the sampler and the exporter, yeah.
A
A pipeline, I think, is what you're describing. So, and perhaps I can speak for Spencer now: stepping back, what we want is a span-to-metrics pipeline, or this is one of the things we want. We know that there's going to be span sampling that happens in the SDKs, in context, and we have now specified how to carry out that consistent sampling and then collect those spans.
A
This is a new thing: coming up with a new way to say what the adjusted count of the trace is. What we're hoping for is that we can just pass through those spans and make sure that the count on the spans reflects both stages of sampling. And I think it's true that we know how to combine consistent sampling parameters, so that we could combine p-values and modify a p-value in the second stage.
A
But I think what we're getting at here is that everyone understands independent sampling: flipping coins is natural, and the outcome is so simple that we'd like to perhaps have another parameter, which is the combined independent sampling that's happened. That could be expressed as a c-value, and it might allow us to do both: to do the straightforward form of tail sampling, which is limiting spans to just the traces you want, and then continue sending spans.
E
We could have independent sampling on a span level and also on a trace level, which would be tail-based sampling. So this would require, I think, two additional values which you would have to store in order to be able to estimate correctly, because the tail-based approach makes this decision for all the spans of a trace, so it is also some kind of consistent sampling.
F
If I understand correctly, what I'd be interested in, and maybe we're moving towards, is starting with the consistent sampling in process and then, and I'm going to be a little ambiguous, the flexibility in the tail sampling. We would have to be clear on where these different pieces can be used and how they can be used properly together.
F
I think there's enough demand for this kind of tail sampling, where you can capture all the errors and drop all the other things that are less interesting and so forth. Enabling that in a focused way gives it an ease of implementation, and gives us that span-to-metrics pipeline, without going to the full suite, the cross product, of all the different ways of combining these things at both span and trace levels, in process and tail and so forth.
A
To me, this leads to a conclusion that there should perhaps be another probability number. We know people want to do randomization, and a probability falls out of it, but it's not consistent anymore. And it seems to me fairly straightforward: if you're applying one-in-three sampling at a collector, then we can just put a c-value of three on those spans, and when we come to the end of the span-to-metrics pipeline, you do the p-value calculation and multiply it by the c calculation, and that's your total adjusted count.
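A sketch of the collector-side half of that, to pair with the earlier two-field estimator sketch (field name again hypothetical; a real collector processor would of course look different):

```python
import random

# Sketch: a collector stage applying independent 1-in-n sampling and
# tagging survivors with a hypothetical c-value field.
def one_in_n_stage(spans: list, n: int) -> list:
    kept = [s for s in spans if random.randrange(n) == 0]
    for s in kept:
        s["sampling.c"] = s.get("sampling.c", 1) * n
    return kept

# At the end of the span-to-metrics pipeline the total adjusted count
# would then be 2**span["sampling.p"] * span["sampling.c"].
```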
A
I think what I'm hearing from Ben, and probably Spencer as well, is that there would be value in having that type of thing specified. You recall that there's a probabilistic span sampler in the collector; you can configure it for one-in-three, but we have no way to record that right now. And I had briefly proposed the idea: well, if it's a power of two, we can multiply, or add the logarithm to the p-value, and that's the right thing to do.
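The power-of-two special case mentioned there, as a one-line sketch (my framing of the bookkeeping, not spec text): when an independent stage's rate 1/c happens to be a power of two, its weight can be folded into the p-value by adding log2(c); a rate like 1-in-3 cannot, and would stay in a separate c-value:

```python
# Sketch: fold an independent 1-in-c stage into the p-value when c is a
# power of two, e.g. c = 4 adds log2(4) = 2 to p.
def fold_into_p(p: int, c: int) -> int:
    if c & (c - 1) != 0:
        raise ValueError("not a power of two; keep it as a c-value")
    return p + c.bit_length() - 1   # bit_length() - 1 == log2(c) here
```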
C
So I gathered that there's probably not a use case for many stages: it would be strange or uncommon for somebody's system to have multiple independent sampling stages. But one at the end is likely to be a pretty common sampling design.
E
You mean a last stage which makes a sampling decision for the whole trace, right? Yeah.
C
In my example in my notes I had two stages of independent sampling, and I think that may have misled people into thinking "that's kind of weird." Whereas if there's just one at the end, that's eminently normal. And I gathered the same mechanism, that same c-value, would be used for both; it's just a question of what people will actually want to do in practice.
D
We are over time here, but my personal choice would be to get rid of this independent sampling entirely and try to do everything using consistent sampling. For example, if you want to keep all the errors, that's not an issue.
A
So one development that we might bring in here, and I know we are over time, I think I've got to go: the W3C has moved and released an update with a new trace flag that was meant to help us out. The idea is that we can stop having an r-value; you can start setting that flag on the trace ID, so that we know that 55 bits, or 60...
A
...some number of bits are random, random enough, and you can use that to make your decision. We probably need to update accordingly; I was kind of waiting for that, because without it you have to have an r-value on every unsampled request, and that gets pretty expensive. To take Peter's point a little further, I think we can probably leverage that and truly build pipelines that take advantage of that randomness to do consistent sampling, so that we don't need to go to a c-value.
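For intuition, a sketch of how an r-value could be derived from random trace-ID bits once such a flag is set. The bit count and exact derivation here are assumptions for illustration, not W3C or OTel spec text; the point is only that counting leading zero bits of a uniformly random string yields the geometric distribution the consistent predicate needs:

```python
# Sketch: derive a geometric r-value by counting leading zero bits in
# the (assumed) random low 56 bits of a trace ID.
RANDOM_BITS = 56

def r_value_from_trace_id(trace_id: int) -> int:
    random_part = trace_id & ((1 << RANDOM_BITS) - 1)
    r = 0
    for bit in range(RANDOM_BITS - 1, -1, -1):
        if random_part >> bit & 1:
            break
        r += 1
    return r   # gives P(r >= p) = 2**-p, matching "sampled iff p <= r"
```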
A
However,
however,
we
still
have
people
who
are
going
to
say
I
am
doing
one
and
three
sampling,
so
power
of
two
consistent
sampling
is
is
a
limit,
but
but
now
that
we
have
this
random
data,
that's
that's
truly
random.
We
can
actually
do
non-power
of
two
sampling
as
well.
I
think,
although
I
don't,
I
know
that
that's
gonna
cause
some
concerns,
but
you
know
as
long
as
those
bits
are
ordered
we
can
we
can.
We
can
flip
coins
in
between
the
powers
of
two
as
well.
A
I
think
I
just
I'm
not
sure
that
that's
going
to
be
a
popular
idea.
I
know
atmar
in
the
paper
which
we
need
to
link
for
in
the
notes.
Basically,
you
would
like
to
not
have
arbitrary
probabilities
just
because
it
makes
the
calculations
intractable,
not
because
of
any
other
property.
As
far
as
I
know,.
E
Yeah, in the paper I listed a couple of reasons. One is, of course, that if you want to estimate integer quantities, and you're using powers-of-two sampling rates, then it's guaranteed that the estimates will also be integers, which is good, because if you want to show the estimate, then you do not have to round it. I mean, if someone wants to see an estimate of three point something...
E
People
here
so
so
this
is
one
thing,
then
the
other
thing
is
that
it
limits
the
number
of
possible
outcomes
of
which
you
can
see
which
allows
you
to
simpler
estimate,
more
complex
quantities,
as
described
in
my
paper
yeah.
C
E
Of course, the p-value can also be represented in a more compact way, because you only need to store the exponent instead of a floating point. These, I think, are the main arguments.
A
Yeah, I see that as a fairly motivating reason, just the compact representation. But we have to recognize that c-values are integers, so that first objection kind of doesn't apply; there's no objection there. As long as c-values are integers, then your estimates are going to be integers too. Like in the statsd sampling right now, we have...
A
...a note saying "please use integer reciprocals, or else we will be rounding and you will not know the difference." So anyway, we are over time. This has been constructive; I do think we can see the work ahead for this sort of tail sampling work. Stay tuned, keep coming, we'll keep at it. Thank you all.