From YouTube: 2022-11-17 meeting
Description: cncf-opentelemetry meeting-2's Personal Meeting Room
B
Last time we left off, Kalyana had presented a topic — a question, really, asking about sampling and span links — and there's an issue now that has some discussion. I'll probably put that in the agenda.
B
I put some information of mine into that issue, and we can talk about that. Next on the list was one that somebody asked us to talk about in Slack, asking about span exemplars, although I'm not sure there's much for us to say about it. And I found an old issue that we might eventually want to discuss again.
B
I'm starting to feel like it's time for some work on proposing the non-power-of-two probability sampling that we've kind of speculated about, based on the W3C trace context. I'm starting to think that because we've talked about it, and I know there was this issue — the one I just pasted into the notes — that we discussed a few months ago, maybe six months ago, maybe more. I should present my screen for that.
B
Well, actually, we should talk about this one — it was filed almost a year ago — so we'll put it last: it's about TraceState when you're not doing probability sampling at the head. Okay, so I've put three things in the agenda, and I think we should start with me presenting my screen.
B
This is the one that I thought was most relevant this week. You can read it — of course, you've probably read this already — but the summary is that we're asking about how a user might experience the presence of both span links and sampling. And there's a link: if you're following closely, you may be familiar with some debate in this space, unrelated to sampling but having to do with span links, that this fellow Johannes Tax has been working on for a while.
B
So the links kind of crossed here — or the issues crossed here. I wrote this one response earlier this week, sort of half answering what I think about this problem, but also trying to find solutions for the real underlying problem, which I think is exposed by that issue having to do with messaging and span links.
D
Thanks, Josh, this is certainly helpful. So, the first section here: I just did a small proof of concept yesterday. I tried to write that composite sampler, as you said, using a non-probabilistic sampler for the one which looks at the links, and then I just used a regular pattern-based sampler for the other, probabilistic sampler, and then composed both of them. I think that sounds like a fine way to go. I had a couple of follow-up questions. Number one:
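The composite sampler described here could be sketched as follows. This is a hedged, self-contained illustration: the class names and the `should_sample` signature are invented stand-ins for this sketch, not the actual OpenTelemetry SDK Sampler API.

```python
# Hypothetical, simplified stand-ins for an SDK sampler interface.
class LinkSampler:
    """Non-probabilistic: always keeps spans that carry links."""
    def should_sample(self, span_links, trace_random_bits):
        return len(span_links) > 0

class ProbabilisticSampler:
    """Keeps spans at a fixed probability, using 56 random bits."""
    def __init__(self, probability):
        self.threshold = int(probability * (1 << 56))
    def should_sample(self, span_links, trace_random_bits):
        return trace_random_bits < self.threshold

class CompositeSampler:
    """Keep the span if any delegate sampler says yes."""
    def __init__(self, *delegates):
        self.delegates = delegates
    def should_sample(self, span_links, trace_random_bits):
        return any(d.should_sample(span_links, trace_random_bits)
                   for d in self.delegates)

sampler = CompositeSampler(LinkSampler(), ProbabilisticSampler(0.25))
```

With a composition like this, a span carrying links is always kept (a non-probabilistic decision, so its adjusted count would be zero), while link-free spans fall through to the 25% probabilistic rule.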
D
Is
this
something
we
imagine
would
be
like
a
whole
tip
and
we
go
a
bit
further
in
terms
of
specifying
similar
to
how
the
other
things
you
have
in
the
like
in
consistent
in
the
link,
you
have
right,
like
the
composition
rules
you
talk
about.
This
is
how
you
do
should
we
should
we
go
a
bit
further
and
say
hey
if
it
is
links,
it
should
be
like
whatever
you
have
in
this
first
section.
D
Should
that
become
part
of
the
specification
in
some
form,
that
will
probably
lead
the
way
for
sdks
to
even
support
out
of
the
box
samples
in
in
the
future
or
or
do
you
think
we
should
stop
with
like
some
kind
of
a
guidance
or
like
a
sample
sample
is
probably
the
wrong
word
here
in
this
context,
but
like
an
example
or
a
demo
thing
or
like
I'm,
just
trying
to
see
what
is
the
spectrum
of
where,
where
we
would.
B
Well,
I
think
for
my
myself,
I
I.
The
reason
why
we
haven't
got
I
guess
like
an
SDK
spec
that
has
composition
rules
included
in
it
is
that
I
think
we
suspect
in
this
group
that
users
may
ask
for
configurable
sampling,
but
what
they?
What
they
want,
is
quite
a
lot
more
sophistication
than
just
one
aspect
of
configurable
sampling,
so
being
able
to
to
specify
a
single
composable.
Sampler
is
just
another
building
block
and
people
have
been
are
asking
for
hold
Solutions,
not
not
building
blocks
I.
C
I'll
just
say
that
you
know
like
one
of
the
use
cases
we
see
sometimes
is
somebody
who
has
you
know
they
have
a
trace
and
they've
sampled.
The
trace
and
the
trace
has
I
saw
one
this
week
they
had
on
average
three
times
as
many
span
links
within
the
traces
that
were
spans
within
the
trace
and
they
wanted
to
sample
those
span,
links
pretty
aggressively
and
like
they
didn't
care,
they
didn't
care.
C
If
all
of
the
links
were
present
in
a
given
Span
in
a
given
Trace,
they
were
willing
to
say
I
just
want
there
to
be
less
spans,
and
so
so
that's
just
like
a
data
point
of
there
are
some
people
who
have
for
who,
for
practical
reasons,
are
content
with
tail
sampling
even
and
saying.
No,
you
know
just
just
get
rid
of
some
of
this
crap
for
me
because
I
don't
want
it
all
here
and
you
know,
then
they
were.
C
There
were
requests
for
sort
of
more
sophisticated
ways
of
specifying
that,
but
but
on
a
base,
thing
like
the
your
probabilities
should
be
stacked
so
that,
if
you
pick
the,
if
you
pick
the
trace
you're
going
to
pick
all
the
links,
that's
actually
not
what
some
people
want.
So
that's
a
like
making
that
normative
and
say
it
has
to
work.
That
way
would
probably
be
against
what
a
lot
of
people
actually
want
to
see.
C
Sorry
they
had,
they
have,
they
have
big
traces,
a
trace
with
hundreds
or
thousands
of
spans
and
and
well
in
span
is
Honeycomb
defines
it
includes
fan,
links
and
Spain
events,
and
so
so
they
just,
but
they
show
up
not
in
the
trace
waterfall
as
we
call
it
for
the
the
view
of
the
whole
Trace.
C
They
show
up
as
like
points
attached
to
the
things
they're
linked
to
and
they
just
they're
like
you
know,
these
are
all
events
in
my
system
and
and
they're
interesting
to
me
sometimes,
but
but
in
the
context
of
a
given
Trace,
they
may
not
be
interesting
to
me
and
I
would
like
to
I
would
like
to
sample
span
events
differently
and
more
aggressively
than
I
am
sampling,
other
kinds
of
spans,
and
so
so
a
a
thing
that
says
the
probability
of
keeping
the
link
should
be
equal
to
or
higher
than
the
lynx.
B
Sampled,
like
we
can't
do
much
better
than
the
than
combining
probabilities
that
you're
going
to
get
all
your
span
links
and
that,
in
my
my
belief,
I
think
this
is
what
you're
sort
of
verifying
is.
That
is
that
users
generally
should
not
expect
all
of
their
strand
links
to
be
present,
but
still
I.
Think
there's,
there's,
maybe
a
need
for
guidance
and
I
I
want
to
I
think
that,
obviously,
from
this
first
response
by
Johannes,
I
didn't
make
myself
clear
in
this
sentence.
B
This
pair
of
this
section
here,
maybe
I,
can
try
it
in
words
and
Kent
and
see
if
this
might
help
for
users.
So
the
the
problem
that
I
see
is
that
when
we
create
a
child,
the
child
refers
to
as
parent,
but
the
parent
never
gets
a
reference
to
its
child
to
its
child.
There's
no
spam
event
saying
I
have
a
child.
You
assume
that
the
child
will
be
there
to
reconstruct
the
child
parent
relation.
B
Well
in
this,
in
the
case
of
a
span
link
and-
and
that
was
the
reason
why
we
have
this
non-ascending
probability
recommendation,
which
is
to
say
that
if
you
choose
something
right,
then
all
of
your
children
will
be
there,
but
as
long
as
you
don't
lower
your
probability.
Otherwise
you
have
this
Gap
that
you
won't
literally
won't
know
about,
because
you
can't
tell
when
an
unsampled
child
appears
unless
you,
unless
you're
consistent.
B
So
the
same
type
of
problem
happens
with
Spam
links.
You
are
creating
a
new
span.
It
is
going
to
link
to
something
and
that's
something
that
you
are
linking
to
has
no
notion
that
it
was
linked
to
and
unless
you
have,
and
so
this
notion
of
completeness
is
I
when
I
hear
it
I
think
what
people
are
looking
for
is
like
both
a
way
to
know
that
all
my
span
links
whether
my
spam
links
were
or
were
not
present,
and
that's
something
we
can.
B
We
can
indicate-
and
you
can
Shuffle
probabilities
to
try
and
improve
that
those
odds.
But
we
can
just
tell
you
these
were
weren't
sampled,
but
the
opposite's
not
true
and
I.
Think
one
of
the
reasons
that
we're
holding
back
on
this
second
part,
which
is
allowing
span
links
to
be
created
arbitrarily
after
the
life
after
a
Spam,
starts.
B
The
reason
why
we've
held
off
on
that
is
that
there's,
no
good
sampling
story
and
I
think
we
could
fix
the
sampling
Story
by
effectively
saying
that
links
are
bi-directional,
like
you
create
a
link,
so
you
you
are
pointing
to
something.
That's
something
you're
pointing
to
has
an
event
that
links
back
and
if,
if
either
side
is
sampled,
you
can
save
that
reference
to
see
that
there
was
something
whether
or
not
it's
sampled
is
independent.
B
So
so
it's
basically
a
summary.
That's
saying
that
if
we
added
to
our
data
model,
saying
anytime,
there's
a
link,
there's
a
there's,
a
reverse
link
receiving
it,
it's
a
reference
from
reference
from
type
of
event,
then
you
could
record
the
reference
from
event
in
the
parent
or
the
linked
to
context.
Everybody
can
sample
consistently
and
do
what
they
like
and
you'll
see
when
there
was
something
missing.
That's
that's
kind
of
the
the
critical
missing
feature
right
now
is
is
that
we
don't
have
a
way
to
tell
when
you're
missing
something.
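The "reference from" idea could look something like this in data — purely hypothetical, with invented field and event names: whenever one span links to another, a reverse event is recorded on the linked-to span, so either side being sampled is enough to discover the relationship.

```python
def make_link(from_span, to_span):
    """Record a link on the originating span and a hypothetical
    'reference_from' event on the linked-to span."""
    from_span.setdefault("links", []).append(
        {"trace_id": to_span["trace_id"], "span_id": to_span["span_id"]})
    to_span.setdefault("events", []).append(
        {"name": "reference_from",  # invented event name for this sketch
         "trace_id": from_span["trace_id"],
         "span_id": from_span["span_id"]})

producer = {"trace_id": "t1", "span_id": "s1"}
consumer = {"trace_id": "t2", "span_id": "s2"}
make_link(consumer, producer)  # consumer links back to the producer
```

If only the producer's trace is sampled, the `reference_from` event still tells you that a (possibly unsampled) consumer existed — the missing completeness signal discussed above.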
B
Right, right — it does assume that, and —
C
Which is not really the case in, say, a Kafka environment, where: okay, I'm going to embed my trace ID in the packet that I shoved through Kafka, and then meanwhile off goes my producer trace, and then sometime later my consumer sucks that thing up and goes: hey, here's a link to the producer trace. I've got no way to modify that trace; it's long gone.
B
Yeah
right,
what
makes
me
wonder:
I
mean
how
ambitious
we
all
feel
and-
and
this
is
really
already-
this
issue
is
bigger
than
sampling.
So
we
can.
We
can
just
sort
of
offer
opinions,
but
but
it
suggests
to
me
that
potentially
we
look
for
other
ways
to
record
those
incidental
contacts.
B
So
if
you're
recording
a
span
because
it's
sampled
and
you
have
some
links
that
is
effectively
recording
the
event
in
the
reverse
direction,
if
you
find
a
span
your
database,
it
matches
that
link
to
context.
You
can
create
a
link
from
that
span
or
to
that
span
from
the
thing
that
you
that
you
have
in
your
hand.
B
So
if
you're,
not
sampled,
but
you
start
a
span
and
you're
linked
to
something
that
is
sampled,
but
that
sample
thing
was
already
flushed
from
memory
now.
That
was
the
problem
that
you
were
describing
and
like
one
answer
would
be
to
like
write
another
span
record
or
span
continuation
record
or
span
addendum
record
or
log
saying
there
was
an
unrecorded
span
for
a
sample
context.
B
Unrecorded
span
link
for
sampled
contacts,
reference
record
I,
don't
know
like
I'm,
just
trying
to
say,
like
all
we
have
to
do
is
record
record
that
somehow
not
saying
we
want
that,
but
I
do
think
it
would
help
resolve
this
asymmetry
of,
like
the
the
sampling
sampling,
can't
guarantee
you
completeness,
but
what
we
really
really
want
from
sampling
is
to
know
when
something
is
incomplete
and
and
right
now
we
don't
know
when
you're
missing
events,
if
they
are,
if
they
happen
after
your
span
was
created,
which
is
the
problem
with
these
messaging
scenarios.
D
Thanks, Josh. Just to summarize what Kent and you mentioned: I think we are saying that different customers may want different things, and there is no universal scenario or agreement that the links should be sampled consistently. Hence the specification will not really prescribe much more in terms of "this is how you should do it," and it's up to people to build their own custom samplers that put together these kinds of probabilistic and non-probabilistic things and achieve what they want. Is that a fair summary?
D
Yeah
the
I'm
I'm,
specifically
talking
about
the
first
section
in
your
response,
which
is
links
created
at
the
like.
The
link
information
is
available
at
the
at
this
time
of
spam
creation
right
in.
B
That
case,
yeah,
I
I,
think
I
can
frame
the
question
then,
and
then
let's
maybe
ask
someone
else
to
try
and
answer
it.
I
have
two
parents
I,
let's
put
suppose
that
I'm
a
new
span.
I
have
two
parents
that
are
not
root,
that
are
not
parents
and
I'm
going
to
create
links
to
them,
and
they
are
each
sampled
to
50
and
I
have
to
I.
B
Think
I
have
to
know
that
I
always
have
two
parents,
and
then
I
can
say
something
like
well
I'm,
creating
a
new
span
and
my
parents
are
sampled
at
50
each,
and
so
they
either
have
a
just
account
of
zero
or
two
and
somehow
I
can
combine
my
two
parents
probabilities
to
compute.
Something
probabilistic
about
myself
is
that
is
that,
where
we're
heading.
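For reference, the adjusted-count arithmetic alluded to here works like this (a standard unbiased-estimator sketch, not code from any SDK): a span kept with sampling probability p counts as 1/p, and a dropped span counts as 0, so a parent sampled at 50% contributes either 0 or 2 to any count estimate.

```python
def adjusted_count(kept, probability):
    """Adjusted count of one sampled item: 1/p if kept, 0 if dropped.
    Sums of adjusted counts are unbiased estimates of the true count."""
    return 1.0 / probability if kept else 0.0

# A parent sampled at 50% contributes either 0 or 2, as above.
fifty_kept = adjusted_count(True, 0.5)
fifty_dropped = adjusted_count(False, 0.5)
```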
D
No,
no
I
was
kind
of
going
into
the
non-probabilistic
option
right
that
you
recommended
so
I.
Think
in
the
first
section
is
okay.
Whenever
you're
looking
at
links,
it
becomes
a
non-probabilistic
sampling
approach
and
so
the
adjusted
count
becomes
zero
and,
like
you,
you
set
the
P
value
of
six
and
all
that
stuff
right,
which
is
probably
okay.
I
mean
that
I
think
the
trade-off
is
you're.
Saying
Hey
I
want
that
more
consistency
across
my
linked
traces,
but
I'm
trading
off
that
ability
to
extrapolate.
D
And
if
you
were
to
do
like
a
purely
probabilistic
approach,
you
would
have
been
able
to
do
better
on
the
estimation
from
the
like
all
these
plan
to
metrics
and
all
of
that
right
so,
but
but
the
second
part
I'm
I'm
kind
of
trying
to
still
wrap
my
head
around
like
the
like.
Today.
The
model
doesn't
support
that
creating
links
outside
of
start
or.
B
Or
you
can't
just
kind
of
refer
to
the
biggest
reason.
Why
not,
which
is
that
like
I
may
be
linking
to
something
that
just
finished
and
it's
flushed
out
of
memory
already
and
and
I
think
you
raised
this
question
at
least
verbally
last
week
or
last
time
two
weeks
ago,
which
was
sort
of
like
yeah
I
mean
I'm,
linking
you
something
that's
perhaps
in
Flight,
perhaps
already
finished
and
I'm
sampled
and
for
some
reason,
let's
suppose
through
configuration.
I
want
to
say,
like
this
has
now
become
this.
B
This
thing,
I'm
linking
to
has
now
become
so
important
that
I'd
like
to
turn
on
tracing
all
like
retroactively
as
much
as
possible,
meaning
if
it
is
alive
and
it
start
continuing
to
make
child
contacts
I'd
like
to
begin
sampling
that
halfway
through
or
something
like
that,
I
think
that
was
what
was
question
was
whether
that
could
be
done.
And
my
my
answer
to
that
is:
let's
make
span,
links
bi-directional
and
you
can
add
a
link
to
a
span
after
it
starts
now.
B
I
might
be
able
to
add
a
link
from
a
sample,
expand
to
an
unsampled
span
and
and
detect
that
I
should
start
half
sampling
or
whatever
like
half
tracing.
It
would
still
have
zero
just
to
count
in
my
opinion,
but
you
could
turn
on
a
sampling
bed.
So
you
could
turn
on
non-probialistic
sampling
of
a
link
to
span
somehow
that's
kind
of
what
I
feel
like
we're
asking
about
or
looking
for.
D
I
see
kind
of
like
a
deferred
sampling
like
some
form
of
basic
tail
sampling
right.
B
That's
that's
what
I
hope
to
say
here
is
all
I'm
gonna,
try
and
I
will
without
the
action
here,
at
least
for
me,
is
I'm
going
to
follow
up
and
try
and
answer
johann's
question,
because
I
think
I
was
trying
to
to
give
him
a
piece
of
something
he
could
use
and
I
think
he
hasn't
seen
it
yet
so
I'll
work
on
that
I
I,
don't
feel
I
feel
like
this
is
quite
a
esoteric
conversation.
B
You
know
the
the
idea
of
turning
on
sampling
part
way
through
a
trace,
because
you
find
it
to
be
interesting
but
I,
but
this
is
like
not
uncommon,
like
if
you've
been
in
this
space.
Long
enough,
you
know
someone
who's
going
to
say:
I,
don't
want
to
trace
Everything
But
as
soon
as
that
fan
crosses
two
seconds
like
it's
become
errant
and
like
I
want
to
do
anything.
I
can
to
collect
anything
about
that
span,
but
only
when
it
crosses
a
threshold.
B
In time, for example. I know I've seen that request a bunch. And then you can say: everything from then on will be traced; it's become an error situation anyway.
D
One
last
comment
so
I've
seen
that
request
in
in
a
regular
Trace
right
for
getting
links
for
a
minute
like
a
regular,
hey,
my
Downstream
call
failed
and
I
want
my
that
thing
to
be
propagated
up
back
to
the
call
chain
and
all
that,
but
across
links,
I,
wonder
whether
that's
a
real
use
case,
because
links
by
definition
imply
different
life
cycles
right
or
different
lifetimes
of
those
traces
so
having
this
kind
of
a
something
that
is
linking
to
me
like
if
I'm,
depending
on
that
to
make
my
sampling
decision
I.
D
Think
I,
don't
know
because
usually
like
these
are
all
for
async
or
like
these
kind
of
messaging
scenarios,
where,
as
Kent
was
saying
like
like
the
lifetimes,
are
completely
different.
So
I
don't
know
whether
natural
fit
to
to
kind
of
depend
on
something
that
is
of
a
different
lifetime,
whereas,
unlike
in
a
in
a
single
trace
context,
they
are
within
the
same
lifetime.
D
You
know
that
because
the
Upstream
one
hasn't
finished
yet
until
at
least
in
a
synchronous
case,
so
I'm
not
sure
yet
whether
that
is
worth
solving
the
the
back
propagation
of
the
decision
across
leads.
B
I'll
go
through
here,
yeah
calling
back
propagation
is
probably
a
good
word.
That's
probably
one
I've
heard
before
as
well
awesome.
Thank
you.
Thank
you.
Okay,
any
any
comments
on
that
topic
before
I.
Try
to
move
us
back
to
the
agenda.
C
The
only
thing
I
will
just
note
so
I
don't
forget
it
later
is
because
we're
going
to
talk
about
the
power
of
two
problem
is
the
the
yeah
actually
never
mind
I'm
Gonna
Save.
Until
we
have
that
conversation,
oh
I
will
remember.
Okay,.
B
All
right
well,
then,
there's
this
one
that
popped
up
in
slack
and
the
summary
quickly
is
that
metrics
providers,
especially
the
Prometheus
ecosystem,
have
been
kind
of
waiting
for
open
Geometry
to
provide
these
span
exemplars
for
quite
a
while
when
you
get
down
to
it
or
something
a
little
tricky
about
it.
B
When
you
start
sampling,
because
exemplars
are
a
form
of
sampling,
of
course,
and
let's
see
foreign
this,
if
if
ever
we've
come,
you
know,
for
example,
Ottmar
has
worked
on
a
sampler
that
combines
an
exporter
and
a
processor,
because
you
need
to
do
some
sort
of
hard
rate
limit.
B
This
request,
I,
believe
runs
into
essentially.
The
same
type
of
thing
is
that
you
need
to
have
coordination
between
the
export
and
processor
and
buffer
things
differently
than
you
would
in
an
ordinary
Trace
export
pipeline,
because
this
is
basically
saying
that
when
the
histogram
chooses
an
Exemplar,
for
example,
you
might
want
to
okay,
okay,
start
tracing
this
or
you
might
prefer,
instead
to
just
say
limit
your
histogram
example
ours
to
trace
spans.
B
That
is
actually
the
position
that
someone
below
is
taking
is
don't
do
that
it
comes
it's
very
similar
to
what
we
just
discussed,
though,
is
like
you
know,
during
the
middle
of
your
untraced
span,
a
really
long,
histogram
event
happened.
The
histogram
wants
to
choose
it
as
an
Exemplar,
and
now
you've
got
a
thing
saying:
you've
got
an
unsampled
span,
but
I
want
to
record
it
and
maybe
I
want
to
like
start
tracing
it.
B
You
know
that's
that
type
of
thing,
but
basically,
and
if
you
read
down
you
know
this,
the
final
final
word
from
Josh
here,
other
Josh,
is
that
you
know
this
can
be
built
into
an
exporter.
You
can
buffer
your
spans
if
they
were
chosen
as
a
histogram,
Exemplar,
I
I
haven't
actually
dug
in
on
this
one.
I
just
want
to
make
sure
that
people
see
it.
B
In
case
you're,
aware
of
any
user
requests
involving
exemplars
basically
says:
if
you
want
to
do
tail-based,
sampling
and
you're
buffering,
then
you
can
do
this.
We've
seen
that
before
with
you
know
hard
great
limited
Samplers,
for
example,.
E
I
mean
you
can
do
that
also
consistently.
So
if
you
yeah,
if
you
know
that
a
Spam
is
Exemplar,
then
you
can
just
pick
yeah
simply
probability,
one
which
corresponds.
What
is
the
p-value
or,
as
you
get
zero
right,
yeah
and
then
it's
consistently
samples
for
sure,
and
that's
all
I
mean
this
is
a
valid
approach
and.
E
Actually,
in
consistent
sampling,
you're
free
to
choose
the
p-value,
you
can
also
use
attributes
for
your
choice
so,
depending.
E
No,
the
R
value
is
fixed,
so
it
is
a
constant
which
is
defined
at
the
root,
and
you
cannot
change
that
and
you
are
not
allowed
to
choose
the
p-value
dependent
on
the
r
value,
because
this
will
break
the
consistent
sampling
decision.
But
what
you
can
do
you
can
freely
choose
the
p-value
for
every
single
span,
so
you
can
choose
a
different
sampling
probability
for
every
spin
and
this
different.
E
This
choice
of
the
p-value
can
depend
on
span
attributes
which
are
available
when
the
sampling
decision
has
to
be
done
and
if,
if
it's
known
that
a
span
is,
is
an
example
when
the
decision
has
to
be
made,
then
you're
free
to
choose
a
p-value
of
zero,
which
corresponds
to
100
sampling
probability.
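The decision rule described here can be sketched concretely. In the consistent probability sampling proposal, a span with power-of-two probability 2^-p is kept exactly when p ≤ r, where r is the per-trace randomness value; this sketch shows only that comparison and the resulting adjusted count, not the TraceState encoding details.

```python
def consistent_should_sample(p, r):
    """Keep the span iff p <= r. Since r >= p occurs with probability
    2**-p when r is geometrically distributed, the span is kept with
    probability 2**-p; p = 0 means always kept (the exemplar case)."""
    return p <= r

def adjusted_count(p):
    """Adjusted count of a kept span sampled at probability 2**-p."""
    return 2 ** p
```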
B
Yeah, because the sort of convention is to collect one exemplar per bucket of a histogram, per collection cycle, so that if your collection cycle is 30 seconds long, you may have seen a thousand histogram observations, and some portion of those would be sampled. Basically, the idea is: there's a fixed reservoir, one per histogram bucket, and then you can do Otmar's fanciness to do a fixed-reservoir consistent sample, I think.
B
Yeah, so let's suppose there's an r-value and you want a sample of one, just as a special case, because that's basically what Prometheus wants: the special case of a consistent sample for a reservoir of one. Just choose the one with the highest r-value.
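A sketch of the reservoir-of-one idea just mentioned, assuming each candidate span carries its trace's r-value: picking the candidate with the highest r is a consistent choice, since it is the span that survives the most aggressive power-of-two threshold.

```python
def pick_exemplar(candidates):
    """candidates: list of (r_value, span). Keep the single candidate
    with the highest r-value; any fixed r-threshold that keeps exactly
    one candidate keeps this same one, so the choice is consistent."""
    return max(candidates, key=lambda c: c[0]) if candidates else None

winner = pick_exemplar([(3, "span-a"), (7, "span-b"), (5, "span-c")])
```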
E
I have to think about it. If there are multiple with the highest r-value, maybe you have to keep either all of them or none — I have to think about it, in order to have the possibility to extrapolate in a non-biased way. So I cannot answer that immediately.
B
That
sounds
fair
well,
I've
put
it
I've,
put
a
link
to
this
issue
and
I
I
at
the
very
least,
will
try
and
say
something
about
consistency
and
support.
Josh's
other
response
here,
which
I
think
is
correct.
Does
that
sound
like
we've
done
enough
here?
If
you
have
any
thoughts
on
that,
maybe
we
can
Circle
back
next
time.
B
All
right,
I'll
make
sure
to
follow
up
with
after
the
meeting.
Just
briefly.
B
Okay,
so-
and
this
is
the
one
where
Kent
had
a
topic
ready
to
go
and
I-
have
something
to
say
as
well.
I
know:
I
I,
put
a
link
to
an
old
issue.
I
might
as
well
open
it
just
to
show
you
what
we're
talking
about.
B
This
was
right
after
we
got
our
probability,
stuff
kind
of
landed
and
and
finally
merged
and
I
I
made
this
half-baked
idea,
which
turns
out,
was
pretty
Half,
Baked
yeah,
the
idea
being
that,
if
you're
doing
probability
sampling
in
a
tail
tail
sample
or
in
the
collector
that
you
could
invent
our
values
after
the
fact
or
sorry
for
different
P
values
and-
and
there
were
lots
of
reasons
not
to
do
that,
I
think
Peter's
and
not
Mars.
Both
responded
with
gut
reaction.
B
This
is
a
bad
idea
and
I
think
it
led
to
the
suggestion
that
we
probably
want
to
maintain
independently,
like
if
you're,
if
you're
doing
after
the
fact,
random
sampling.
That's
that's
like
a
different
type
of
adjustment
than
a
consistent
sampling
adjustment
and
they
might
want
to
be
kept
separate.
I
raised
this
topic
in
connection
with
just
generally
this
desire
and
I.
Don't
have
an
issue
about
it,
but
it's.
B
The
issue
is
essentially
following
through
on
the
w3c,
the
proposal
to
add
seven
bits
of
definite
Randomness
which
gives
us
this
opens
the
door
to
non-power
of
two
probability
sampling
this
issue
here
that
I
just
showed
you
was
like
the
point
being
here-
is
that
these
can
these
collector
configure
these
collectors?
B
That
are
it?
Let's
suppose
you
know,
one
of
three
sampling
is
happening.
It's
after
the
fact,
whether
there's
a
p-value
or
some
other
valued
indicate
after
the
to
indicate
random
sampling.
B
The
the
the
idea
is
that
if
this
you
can't
when
you're
doing
head
sampling-
and
you
want
one
and
three
sampling
you
can
you
can
Pro,
you
can
probabilistically
choose
between
the
nearest
powers
of
two
one
half
or
one
quarter,
to
get
one
and
three,
but
when
you're
doing
this
sampling
in
a
collector
pipeline
as
a
tail
sampler,
you
cannot
do
that
anymore,
but
you
could,
if
there
were
seven
random
bits
of
our
value
baked
into
your
Trace
ID,
begin
to
do
consistent
sampling
at
one
and
three.
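The interpolation mentioned here can be written out: to get a one-in-three rate from the nearest powers of two, choose probability 1/2 with some weight x and 1/4 with weight 1-x, and solve x/2 + (1-x)/4 = 1/3, which gives x = 1/3. A small sketch using exact fractions:

```python
from fractions import Fraction

def mix_for_rate(target):
    """Weight x for using the higher power-of-two rate (1/2 vs 1/4)
    so the expected sampling rate equals `target`.
    Solves x/2 + (1-x)/4 = target, valid for target in [1/4, 1/2]."""
    return (target - Fraction(1, 4)) / (Fraction(1, 2) - Fraction(1, 4))

x = mix_for_rate(Fraction(1, 3))
expected = x * Fraction(1, 2) + (1 - x) * Fraction(1, 4)
```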
B
Baked
into
the
trace,
ID
I
believe
so
that
I
think
we
can
start
the
the
that
this
issue
is
independent,
but
I
think
people
would
like
to
be
able
to
encode
probability
to
encode
all
the
adjustments
that
happen
to
their
counts
as
they
collect
their
traces,
and
this
probabilistic
sampling
processor
in
The
Collector
is
going
to
continue
getting
use.
B
I
would
like
to
see
us
have
a
solution
here,
as
well
as
for
propagating
our
values
that
lets
us
avoid
propagating
our
values,
because
the
trace
ID
has
seven
bits
of
randomness,
and
then
we
only
need
p-value.
We
talked
about
this,
maybe
three
or
three
meetings
ago,
once
you
once
you
have
r
value
baked
into
the
trace,
Trace
ID
and
you
have
seven
bits
with
reliable
order.
You
can
then
have
seven
bits
worth
of
sampling
probability.
At
that
point,
your
p-value
could
be
replaced
by
something
that
has
I.
C
So, in other words, we do an integer of one, two, three, four, up to ten thousand, or whatever number you want it to be. That's the model Honeycomb has and has been using forever: when we send the trace, we stamp the trace with a field we call "sample rate" — which is a bad name for that inverted probability, effectively — because then we can multiply the traces we have by that number to show statistics in the graphs, and it works pretty well.
C
Most of the time — I mean, there are places where it misleads, but in general it works okay, except when you start getting into aggregations; things get weird. Anyway, the point is: that model is more sophisticated than the powers-of-two thing, but the powers-of-two thing fits neatly into it, because it's just one, two, four, eight. And I guess the question is: are we talking —
B
Yes,
I
think
so.
I
I
mean
one
of
the
reasons
that
led
me
to
post
that
that
issue
before
was
that
75
sampling
is
like
really
a
desirable
number
for
some
people.
If,
if
you
know,
if
they're
just
putting
a
sampling
in
to
cap
their
bandwidth
and
they
want
as
much
traces
as
they
can
get
somewhere
between
50
and
100,
because
sometimes
the
answer,
and
so
so
I
and
I
was
I,
mean
I
was
surprised,
but
that
was
literally
the
first
user
response
to
like
hey.
B
We've
got
probably
two
stamping
now:
okay,
what
if
I
do
75
so
yes,
this
idea
of
a
t
value
is
is:
is
the
idea
that,
among
seven
bits,
you
can
produce
a
threshold
that
accepts
80
by
by
choosing
the
value
that
is
80
of
2
to
the
56th
or
student
56,
plus
one
or
something
like
that?
B
And
what
I
mean
57
bytes?
So
it's
56
bits
y
sub,
two
yeah
six,
okay,
and
then
we
don't
need
an
R
value
and
your
and
your
T
value
is,
which
is
the
non-power
of
two
p-value,
would
would
convey
both
integers
as
well
as
floating
Point
numbers
arbitrarily.
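A sketch of the t-value idea as described: map an arbitrary probability onto a 56-bit acceptance threshold, keep a span when its trace ID's random bits fall below the threshold, and report the reciprocal as the (possibly non-integer) adjusted count. The rounding and wire encoding here are placeholders, since those were left open in the discussion.

```python
BITS = 56
SCALE = 1 << BITS  # 2**56

def threshold(probability):
    """56-bit acceptance threshold for an arbitrary probability.
    (Rounding policy is a placeholder for this sketch.)"""
    return round(probability * SCALE)

def keep(random_56_bits, probability):
    """Keep the span when its random bits fall below the threshold."""
    return random_56_bits < threshold(probability)

def adjusted(probability):
    """Adjusted count of a kept span; non-integer for rates like 75%."""
    return 1.0 / probability
```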
B
With 56 bits of resolution. It does create that glitch that people are worried about some of the time, which is that your adjusted counts can be non-integers: with 75 percent, your adjusted count is, you know, 1.333…
B
Yeah, so I'm not sure that's such a problem, but I do think that another alternative to the t-value — one which can give you these binary numbers — is to have a c-value. This is hypothetical, but the c-value would be the literal count: so if your c-value is two, it means you did one in two; if your c-value is three, one in three; if the c-value is four, you did one in four. This makes your adjusted counts always an integer, but it's a little bit more mechanical.
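The hypothetical c-value, by contrast, carries the literal one-in-N reciprocal, so adjusted counts are integers by construction — at the cost of only representing 1/N rates. A sketch:

```python
def c_value_adjusted_count(c):
    """c is the 1-in-c sampling reciprocal (hypothetical c-value).
    The adjusted count of a kept span is exactly c, always an integer;
    non-integer reciprocals like 1.333 (75%) cannot be represented."""
    if c < 1 or int(c) != c:
        raise ValueError("c-value must be a positive integer")
    return int(c)
```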
E
I
mean
it's
it's
nice
to
have
I
mean
if
the
reciprocal
value
is
an
integer,
then
it
just
accounts
always
seemed
Tetris
right.
So
but
but
one
thing,
if,
if
you,
if
you
want
to
have
all
the
just
accounts
to
be
an
integer,
then
you
are
limited
to
sampling
rates
like
one
1
divided
by
2
1
divided
by
three
here
is
I,
can't
told
us
so,
but
if
you're
limiting
to
that,
then
you
have
already
a
huge
gap
between
one
and
fifty
percent
yeah.
C
I mean, I could totally imagine that number we currently have as an integer becoming a float, and so, if you specify, you know, a value of 1.333, that's 75%, right? That would require us to modify our plumbing so that we use floating-point math rather than integer math all the way through our system.
C
That's
work,
but
you
know,
might
be
the
right
answer
here.
So.
B
In
the
statsd
protocol,
which
is
kind
of
a
relic,
but
it's
still
alive
today,
we
have
an
example
where
the
open,
Telemetry
collector
has
support.
There's
a
receiver
written
and
is
receiving
sampled,
the
there's
a
fraction,
it's
a
decimal
fraction.
That's
that's
included
in
so
it's
a
floating
Point
number,
so
you
can
say
at
zero
point.
I
think
the
leading
zero
is
presumed,
but
you
can
say
at
point
one
and
it
means
one
in
ten
sampling,
but
the
the
hotel
receiver
has
no
way
to
express
integer,
non-integer
counts.
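For reference, statsd lines carry the sampling rate as a decimal fraction after `@` — e.g. `request.latency:350|ms|@0.1` for one-in-ten sampling — and a receiver extrapolates by weighting each event by the reciprocal. A minimal parsing sketch (illustrative only, not the Collector's actual receiver code):

```python
def parse_statsd(line):
    """Parse 'name:value|type[|@rate]' into a dict carrying the
    extrapolation weight (reciprocal of the sample rate)."""
    name, rest = line.split(":", 1)
    parts = rest.split("|")
    value, metric_type = float(parts[0]), parts[1]
    rate = 1.0
    for p in parts[2:]:
        if p.startswith("@"):
            rate = float(p[1:])
    return {"name": name, "value": value, "type": metric_type,
            "weight": 1.0 / rate}  # non-integer for rates like @0.75

m = parse_statsd("request.latency:350|ms|@0.1")
```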
B
In a histogram, for example — so when you receive a histogram sampled with a non-integer reciprocal, there is nothing it can do, and it's basically just a README statement saying: don't do that. Stop! Don't configure statsd with non-integer-reciprocal sampling rates, because the entire system downstream can't handle it right now.
B
I
think
that
that's
good
enough,
basically
saying
if
you
think
that
that
non-integer
sampling
rates
are
a
problem.
Don't
don't
do
that,
but
maybe
that's
not
enough.
B
But
the
proposal
is
that
we
we
move
forward
with
some
kind
of
document
that
says:
there's
a
non-integer
or
a
threshold-based
adjusted
count
mechanism
and
it
relies,
and
it
assumes
that
you
can
stop
sending
the
R
value
because
it's
baked
into
the
trace
ID
just
that
that's
the
the
whole
scope
of
this
and
and
then
I
guess
it's
a
separate
question
as
to
whether
a
tail
sampler
that's
doing
well.
B
Hotel
sampler
is
doing
consistent,
probability,
sampling
and,
and
it
sees
the
R
value
it
can
generate
its
own
p-value
at
this
point,
I
believe.
So
the
idea
is
that
if
you
have
that
seven
bits
of
bytes
of
Randomness
in
your
Trace
ID,
now
that
the
collector
pipeline
is
just
doing
consistent,
tail
sampling
and
it
can
output
a
t
t
value
to
convey
it
to
just
count
for
non-integer
sampling
rates,
which
is
exactly
what
I
wanted
to
get
on
this
conversation.
B
If that sounds reasonably believable to you, then I'm content. And I'm not trying to assign this to anybody — but it's available, and I might consider this more important as the months go on.
B
Cool. I think we could leave it there for now; at least, I can't commit myself, and I have to think about this a bit more anyway.
D
Josh, would that be an OTEP, or —
B
I
I've
lost
confidence
in
the
Otep
process,
although
it's
just
because
there's
there's
not
a
lot
of
success
like
people
are
there's
a
little
bit
of
a
stall
right
now
and
I
think
more
contributors,
core
contributors
like
me,
need
to
get
in
and
do
work
and
then
I
I
actually
have
held
today
to
do
catch
up
on
open,
Telemetry
I'm,
going
to
try
and
write
some
specs,
but
I'm
not
going
to
do
this
to
other
things
that
are
higher
priority
for
me.
B
But
if,
if
it
stalls
out
I'm
tired
of
it
and
I
want
I
would
propose
that
we
could
just
start
updating
this
spec
with
you
know
if,
like
this
is,
this
is
an
incremental
change
to
expect
that
we
already
have,
and
it's
not
so
it's
not
so
huge
and
it
you
know
it
does
presume
some
things
about
w3c
that
are
out
of
our
control,
and
so
we
can
just
say:
look
w3c
wrote
this
thing
here.
We
are
updating
our
spec
for
it.
Maybe
it
doesn't
need
no
tip.
B
Yeah
yeah
I'm,
trying
to
rescue
some
of
the
oteps
that
are
currently
stalled
and
I'd
like
to
see
some
of
them
merge
before
I
recommend
more
oteps.
B
Most
of
my
time,
I'm
working
on
this
Arrow
Apache
Arrow
column
representation.
We
have
a
really
nice,
Otep
and
I'm
trying
to
get
it
merged
and
it's
like
no
one.
No
one
wants
to
read
it.
So
I've
stopped
recommending
oteps,
but
that's
that
I
think
everyone
wants
the
sampling
stuff
and
time
is
coming.
B
All
right,
shell
tidbit,
let's
step
finally
out
of
support,
adding
support
for
p-values.
Finally,.