From YouTube: 2022-10-06 meeting
Description
cncf-opentelemetry meeting-2's Personal Meeting Room
B
As you may know, I'm mostly here to listen and stay up to speed with all of you. I did review Spencer's document, and I see there's been some discussion on the issue from the spec repo about the trace ID ratio sampler, which Kalyana has commented on very recently. I don't have an agenda for this meeting, but I think that's worth discussing.
B
This issue here is very old; as you can see, it was filed about the same time we marked the tracing SDK and API as stable in OTel. We actually decided this was the last issue, and we decided not to solve it, so it was left as a to-do, and this issue was filed, and here we are, you know, quite a lot later.
B
All of this discussion predates the probability sampling discussion, and that discussion predates our work on TraceState; that was the following summer, and it got revived 24 days ago. The reason why is that there's a bug filed in the OTel Go repository about it.
B
We've discussed it now, and we come to the end of this conversation wondering (a) how many bits of randomness do we need, and (b) what to do to fall back when we don't have it. The question has been asked, so we need an answer. Essentially, that's what the conflict with X-Ray is: we don't have a good answer for this question. I will step back now and would like to hear Kalyana; you have perspective from the W3C group.

B
Peter also answered the question of how many bits we might want. I'm enthusiastic to listen at this point.
C
Thanks, Josh. So my question, my concern, is mainly about the timeline. For example, the W3C Trace Context Level 2 spec: we just published the First Public Working Draft, and they have kick-started all the review processes, like the security review, the privacy review, and all of that, and I'm assuming it will take a few months to get to the recommendation stage, right? And then implementations have to pick it up; like, I don't know, all the OpenTelemetry SDKs, like .NET and all of that. That will take some time.
C
So it may be several months before everybody has adopted it and all of that; maybe, I don't know, maybe a year, maybe I'm being pessimistic. But if somebody wants to use consistent probability sampling in, like, three months, do we have a solution? I think that's my main worry. And the second part of it is, from what I've heard from folks like Sergey and Daniel Dyla:
C
It's that, even though the Level 1 spec of W3C Trace Context doesn't explicitly require that the bits be randomly generated, most implementations do generate them that way. Even if we look at AWS X-Ray here, I think only the first four bytes, I believe, or six bytes, have the time component in them; the last set of bytes are still randomly generated.
C
So if we can do a best effort in specifying the same algorithm to use, irrespective of whether the random flag is set or not, I feel that will increase the possibility of having more consistent sampling, right, irrespective of whether implementations have moved on to that Level 2 or not. Because it doesn't add any extra work for the SDK implementations, I believe; they just follow that new hashing algorithm that this group comes up with. So that was one question I'm trying to understand.
D
Well, I just wanted to note: at Honeycomb we have this sampling algorithm embedded in most of our libraries, and we went to some effort... sorry, we went to some effort to make sure that we used a consistent algorithm across all of our libraries, which was nothing more than a SHA hash, with a salt, of the trace ID, so that we had a consistent value from which to make a, you know, deterministic sampling decision like this. So, I mean...
D
I think what you're saying makes a lot of sense to me, and, I don't know, I guess the question is sort of like... I think we'll end up in a debate; the worry is that we end up in a debate about which is the right hashing algorithm and which one is performant enough to solve this problem at volume, and, you know, that then becomes a level of debate that is kind of meta to the question.
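What D describes above amounts to hashing a salt plus the trace ID and comparing part of the digest to a threshold. Below is a minimal sketch of that idea in Go; the salt constant, the choice of SHA-1, and the byte handling are illustrative assumptions, not Honeycomb's actual code.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
)

// salt is a hypothetical agreed-upon constant; different salts could
// separate different sampling purposes.
const salt = "example-salt"

// shouldSample is deterministic: every participant that sees the same
// trace ID and probability reaches the same decision.
// (probability must be in [0, 1); 1.0 would overflow the threshold.)
func shouldSample(traceID []byte, probability float64) bool {
	digest := sha1.Sum(append([]byte(salt), traceID...))
	// Interpret the first 8 digest bytes as a uniform 64-bit value.
	v := binary.BigEndian.Uint64(digest[:8])
	threshold := uint64(probability * (1 << 64))
	return v < threshold
}

func main() {
	traceID := []byte{
		0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6,
		0xa3, 0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36,
	}
	fmt.Println(shouldSample(traceID, 0.25))
}
```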
B
Well, I'm just trying to find a pragmatic solution. I think the recommendation Kalyana made is: if we see that the first four bytes of the trace ID have a precedent of being used for time, therefore not random... I actually confess I don't know the entire format of the X-Ray ID; I knew there was some time in there. Is it true that the other 12 bytes are random, or, I mean?
B
Yeah, I actually don't know what to say, other than that I feel like there's a pragmatic approach here that just uses those 12 bytes; but if I'm not sure we can trust those to be random, do we still need hashing?
B
For one, Ottmar in the past has spoken against trying to standardize on a hashing algorithm, because they tend to not necessarily have uniform distribution, as well as, right, a performance cost; the risk of not being uniform, as well as the performance cost, as I recall.
B
One of the big struggles, or at least challenges, I saw (and maybe it was wasted effort) in the TraceState investigation for probability sampling, for propagation of probability sampling information: I put in a sketch of how to test for adequate randomness. I think it got approved, so it's in the spec now. And, I mean, Ottmar helped me with this.
B
It's maybe good, it's maybe not perfect, but it is at least suggesting that we know, when it comes to an implementation of a trace ID, it's going to come from a random number generator, and how do we know that that random number generator is any good? Well, there are tests we can do. Some of them are pretty simple, and I wrote up one and implemented it.
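The spec's actual test isn't quoted in the meeting, so the following is only an illustrative stand-in for the kind of simple check being described: generate a batch of trace IDs and verify that the bits assumed to be random are roughly balanced between ones and zeros.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math"
	"math/bits"
)

func main() {
	const n = 10000    // trace IDs to inspect
	const lowBytes = 7 // only the bytes assumed to be random
	ones := 0
	id := make([]byte, 16)
	for i := 0; i < n; i++ {
		rand.Read(id) // stand-in for the SDK's trace ID generator
		for _, b := range id[16-lowBytes:] {
			ones += bits.OnesCount8(b)
		}
	}
	total := n * lowBytes * 8
	// For a fair generator the count of one bits is Binomial(total, 1/2):
	// mean total/2, standard deviation sqrt(total)/2.
	z := (float64(ones) - float64(total)/2) / (math.Sqrt(float64(total)) / 2)
	fmt.Printf("ones=%d of %d bits, z=%.2f\n", ones, total, z)
	if math.Abs(z) > 4 {
		fmt.Println("suspicious: bit frequency is far from uniform")
	}
}
```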
D
I worry about the implementations. You know, that's one thing I worry about: just assuming that the trace ID coming into you is adequately random, without running it through a hashing algorithm, is a really fraught decision. Like, I sort of understand why people don't want to do hashing, and, like I said, I kind of accepted that it would be a battle, but I also don't see a great alternative. Couldn't we identify a reasonably performant and reasonably well understood-to-be-uniform algorithm?
B
I assume they would, yes. I'm going to put a link in the chat to that section of the specification that talks about it, and I'll leave it at that. I was satisfied with this, but I did it over a year ago. I think there probably is... well, I think Ottmar should speak, but I'm guessing that the way we've come out of this in the past is to say that...
D
I do agree, like, in terms of uniformity of distribution this is great; but it should be relatively easy to verify that, given an input, the correct output is generated, because what you want is for the same input to generate the same output in this algorithm. If you're starting with some input trace ID, which somebody gives you because they just called it a string and they gave you my trace ID, you want that to generate a consistent value on the output.
E
Yeah, I mean, this is easily testable; yeah, you're right. So, I mean, it's basically testing whether a hash function is consistent with other implementations on other platforms. You just need a huge set of test vectors, your input and the output, and, right, you can test it very easily, basically, and you can realize consistent implementations across different platforms very easily if you do it right.
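A sketch of what E suggests here: one implementation emits a shared set of test vectors (input trace ID, expected output) from a fixed seed, and every other SDK replays them in its unit tests. SHA-1 with a salt stands in for whatever function the group actually standardizes.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"math/rand"
)

func main() {
	rng := rand.New(rand.NewSource(42)) // fixed seed: vectors are reproducible
	id := make([]byte, 16)
	for i := 0; i < 5; i++ {
		rng.Read(id)
		digest := sha1.Sum(append([]byte("example-salt"), id...))
		// Each line: trace ID, then the expected first 8 digest bytes.
		fmt.Printf("%s %s\n", hex.EncodeToString(id), hex.EncodeToString(digest[:8]))
	}
}
```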
D
Perfect. I mean, I had to deal with this a few years ago. I don't often admit this, but I worked in cryptocurrency, and we had the problem of having to do something similar across JavaScript and Go and Java implementations. We needed the same algorithms to generate the same results, and it's kind of annoying to do that in JavaScript, but we were able to be successful once we put our minds to it.
B
I think we've had a tendency in this group to stay away from hashing, I think for fear of expensive solutions. I'm wondering if it's time to give that up and just say: we haven't been able to move forward on true randomness very fast, it still looks like a year out before we can get this W3C thing, maybe we've been asking for too much; and settle on proposing the Honeycomb solution as a v0, just since it exists, and I'm sure Kent would like to support that idea. Maybe; but the idea, sure.
B
We don't have to rewrite them all, but yeah, a common algorithm available everywhere. I don't know; I mean, it sounds expensive, but it is in service of lowering the cost of tracing, so it should pay for itself, and it's not perfect, but it's essentially moving us forward.
D
I'll tell you what it basically is. I could probably find the code; I'll look in a second. But it's essentially: take the trace ID we're given, run it through SHA-1 along with a salt that we just defined (in order to be able to separate; like, we could use a different salt for different purposes if we needed to, or whatever), and then that result is used as the value. Actually, well, that result is then used as a random value, and in our circumstances, the thing I was just struggling with, or haven't really resolved this week (and I'm going to actually probably read your test cases), is the question:
D
We then take that value, take the lowest 32 bits, and use that in a sharding algorithm for deterministic sharding, which is maybe not as good as we'd like it to be. So that's one of the problems: I'm not convinced yet that SHA-1, in this model, or maybe with our salt or whatever, is appropriately uniformly distributed. I'd like to do that test.
E
SHA-1 is a cryptographic hash function, so it's quite expensive. You know, there are two classes of hash functions, cryptographic and non-cryptographic, and if we go for a hash function, I would vote for a non-cryptographic one, because they're typically one order of magnitude faster.
E
I'm also not so sure the cryptographic hash functions really produce uniform output. You know, the design goal of a cryptographic hash function is a different one: if you have a hash, it should be impossible to find some input which gives you that given hash. But the non-cryptographic ones (there are a couple of them) are really designed to produce output which looks like a random number, a uniformly distributed random number. And there's actually this test suite, this GitHub project, which I posted in the chat.
E
It's a set of statistical tests where they compare different, up-to-date hash functions.
E
And, yeah, comparing speed and also the hash quality. So if they pass all the tests, then it's a good sign that it's a high-quality hash function.
E
But I'm also not sure, because what we have is a special case: our input is quite small. The input is, you know, just the trace ID, which is 16 bytes, right, if I'm correct, and this case is maybe not fully covered, because they're considering inputs of arbitrary length. And, yeah, I don't know; so maybe we have to specify our own tests.
B
It's not for uniformity, though. Well, I mean, this is the reason why we came to the W3C group with a request for a random flag; but I suspect that if you scrutinize that request enough, the question will be how random those bits need to be, and we will end up again talking about the same question, which is: is it random enough? How accurate is my sampling, or how much impact on my sampling is there from, you know, non-random randomness? And I don't know what to say about that.
A
If I may: what I'm going to say is not constructive in any way, because I'm still struggling to understand why we are really going back to considering hash functions on trace ID bits. I believe that using R values and P values based on powers of two is a great solution, and it avoids calculating any hashing on anything. So what's the real gain that we are trying to achieve with this hashing?
A
Well, let me say that I believe this random bit is a great thing, and we can still use it, for example, if we want to sample with a probability which is not a power of two, because we can use the bits from the trace ID random bits to select the appropriate power-of-two probability. Still, of course, it goes back to the quality of these random bits and so on.
B
Okay, so one of the advantages... well, the reason why we were discussing this, as I recall, is that it would be nice if we did not need to add a TraceState variable for unsampled traces, and that was going to be required with the R value. By moving to the random bit, we do not need a TraceState to convey unsampled trace probabilities.
B
So that's the recollection that I have, and I think it potentially raises other possibilities. Like, you know, if there are 128 bits of trace ID, what if we just said six of those are going to be used for the R value? Now they're non-random, but we can control how they're used. That would be a completely new proposal; let's not, I don't want to make that. But I guess that the goal of having true randomness was that we don't need a TraceState, basically.
B
And when you have those actual bits, as opposed to the logarithm of those actual bits (which means more bits, which means more expensive), you can then do the non-powers of two, as well as the other cases, I think.
A
Well, yes, but on the backend, dealing with those probabilities will be a little bit tougher than when they are powers of two, because the differences between them can be very, very small, and then there can be a lot of different probabilities used across a trace for different spans. It probably won't happen very often, but we need to be prepared for that. And this is a complexity, or, sorry...
A
This is quite important, I believe, and this is sometimes underappreciated by people. They say that we want full traces, period. Well, that doesn't work in practice, because you can have execution branches that are very frequent or very infrequent, and you want to change your probability for these branches.
E
Yeah, and what you said about having multiple probabilities on the same trace: I think this can happen more often than you expect, maybe, because if you have, for example, a sampler which tries to, you know, sample at a certain rate, so a certain amount of spans per minute, for example, then it has to adapt to the load, and so every agent maybe chooses different sampling rates for the same trace. So it could happen more often in the future, maybe, right.
B
Yeah. So this is also... when we use the word consistent in this context, it means that multiple participants in the same trace will reach the same conclusion given the same probabilities, so that you can have essentially rate limiting in your span, you know, at a node in your trace. You know, this particular node is going to say no more than X spans per second, and then the upstream is...
B
...responsible, you know, for how many requests actually happen in your service, and this gives a way for services to protect themselves. Although, as a vendor, my vendor is mainly interested in complete traces that are both sampled and tail sampled as well, so I get that this is kind of a corner case. But it came out of, you know... this is the OpenTelemetry specification that had this trace ID ratio sampler that was meant to be consistent. That came from OpenCensus, exactly because Google had this problem, you know, 15 years ago.
B
It's like, you have this huge hierarchical service, and the leaves have no control over their sampling rates and the roots do, and you end up in a situation where the performance cost of tracing is out of control, basically. That's where this to-do, and the desire to be consistent, and how to achieve some sort of deterministic decision, came from.
B
You would need that when you were sampled. So there's a cost for sampled traces, or sampled spans, but not for unsampled ones. That was the one slight difference. I mean, going back a year in time, I feel like there was resistance to this proposal, partly because it's going to cost every span.
B
I
may
be
wrong
about
that,
though
I,
actually,
you
know,
with
a
year
of
perspective,
I
think
the
report
that
Spencer
wrote
about
how
we
haven't
finished
the
sampling
story
and
how
client
you
know.
Client
configuration
is
the
big
problem.
That's
like
probably
was
actually
holding
back
sampling,
not
that
it
costs.
You
know
a
Trace
State
field
per
span
to
do
it
or
a
Trace
State
header
per
RPC
is
really
the
cost.
C
Josh, I have one question on the non-power-of-two sampling rates. In the original OTEP on the p-value and r-value that you authored, like, a few months ago, you talk about an example where, if somebody wants, say, a five percent sampling rate, they can still achieve it by a combination: a sixty percent probability of using 1/16 and a forty percent probability of using 1/32 gives you 1/20.
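For reference, the arithmetic in that example works out:

$$0.6 \cdot \tfrac{1}{16} + 0.4 \cdot \tfrac{1}{32} = 0.0375 + 0.0125 = 0.05 = \tfrac{1}{20}$$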
E
It's not exactly the same. I mean, it's unbiased, okay; but if you estimate... because you have, you know, sampled spans with two different probabilities, essentially, which means you have two different adjusted counts, this will result in a larger variance, actually, than if you would choose exactly the mean probability in between, right. Yeah, that's...
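E's variance point can be made precise. A span kept with probability $p$ contributes an unbiased estimate of $1/p$ (otherwise $0$), with variance $1/p - 1$. By the law of total variance, the 60/40 mixture above gives

$$0.6\,(16 - 1) + 0.4\,(32 - 1) = 9 + 12.4 = 21.4,$$

versus $20 - 1 = 19$ for a fixed rate of $1/20$: both are unbiased, but the mixture has the larger variance.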
B
Because the user still has to sample at 100% in order for it to work; there's no consistency at 75%. You know, some spans are going to be in, some spans are going to be out. I forget exactly how to phrase this problem, though. I feel like the thought experiment was: there's a probabilistic span sampler in the collector; can I make it work, can I configure it for 75%, meaning pass them through one to one, whatever, you know?
E
Yeah. So, basically, if you have powers-of-two sampling rates here, then you have, for example, either 25 or 50 percent, yeah, and you actually want, for example, 40. So that means you sample spans of a trace sometimes with a probability of 25, and some other spans of the same trace with a probability of 50, because this information is not propagated. So this choice between 25 and 50 is not consistently done over the whole trace.
B
That sounds like the bulk of my memory, yes. I'm going to have to think through it a bit more, but yeah.
B
Hope that helps. Okay, we've talked a lot about this; I don't know if we're making progress. At this point, it seems like we've discussed a couple of options. One is: trust the 12 bytes of trace ID that are not Amazon's four bytes of timestamp.
B
Adding to that, we might still not actually have uniformity, or, like, some sort of statistical quality that we want; we can test it. So I could imagine specifying that we will use the trace ID, assuming it's random except for the top four bytes, which we know are not random, or whatever, and here's a test to determine whether your random number generator, combined with your trace ID construction algorithm, passes our baseline randomness test, which could be similar to the one I showed earlier.
B
We haven't talked about hashing at this point. We are now able to support the p-values without the r-values; we can now add a t-value to do thresholds, non-powers of two; and it all rests on that randomness being good enough, and that's where a test might come in. I feel like that's the more practical approach than trying to choose a hashing algorithm and specifying it.
E
We could increase the chance that we get eight bytes of randomness, or even fewer, by, for example, XOR-ing the first eight bytes with the last eight bytes, because then you only need either the first bytes to be random or the last ones, and if both are random, it's also fine. So in both cases you would get it, and this is a cheap operation, for example.
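A minimal sketch of that XOR idea in Go: fold the two 8-byte halves of the trace ID together, so the result is random if either half is (assuming the halves are independent).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// foldedRandomness XORs the high and low halves of a 16-byte trace ID.
// If either half is uniformly random, and the halves are independent,
// the result is uniformly random too.
func foldedRandomness(traceID [16]byte) uint64 {
	hi := binary.BigEndian.Uint64(traceID[0:8])
	lo := binary.BigEndian.Uint64(traceID[8:16])
	return hi ^ lo
}

func main() {
	id := [16]byte{
		0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6,
		0xa3, 0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36,
	}
	fmt.Printf("%016x\n", foldedRandomness(id))
}
```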
E
It's a proposal; but, I mean, it increases the chance that the current trace IDs would give us enough randomness.
C
But, sorry, that one doesn't align well with the random flag itself, correct? Because that's going to only promise the last seven bytes; and assuming that this goes into the recommendation stage in that form, then it's, like, the last seven bytes.
C
So why not bet on that as the approach? And then what I was just proposing in that issue, in that comment I made, was more like a stepping stone: because if it takes a few months for things to adopt this, it's more like, use that same thing, whatever we are using for when the random flag is set.
C
So in my mind, the options seem to be: either go with that approach, but use it irrespective of whether the random flag is set (and then eventually, in the fullness of time, everybody will adopt it and everything is guaranteed to be random and all of that, and you get the benefits), or the other approach is to just use the p-value/r-value proposal that already exists.
C
I mean, those seem to be the two things in my mind, and I think there are other options being floated as well; but I don't know whether we want to go to the beginning of the trace ID, right, as in that proposal. We know that, for example, AWS is never going to change those first four bytes if they are time based, so why even take a dependency on that, yeah.
B
All trace contexts are going to have this flag set, okay, or else there's really some reason not to, in which case you probably shouldn't be sampling.
D
Yeah, me too; it has the benefit of being basically straightforward, yeah, easy to kind of move with.
B
Daniel Dyla asked, immediately after I proposed something: do we need 56 bits? Has that been a discussion in the W3C group? Peter pointed out why we needed, like, at least 30; but somewhere between 30 and 50 is, like, not enough and too much. I'm not sure I would follow that. What...
B
Yeah, I think I'm more familiar with people worried about the synchronization cost of calling a random number generator; like, Google had per-thread, thread-local random number generators for its tracing library, as I recall, 15 years ago or whatever. So I guess that. But my feeling is that no one's going to object to 56 bits of randomness, because we're already generating 12 bytes of randomness and no one's complaining, for X-Ray, or, you know, more than that for the other OTel SDKs.
D
If so, then we should limit its length to no more than 52 bits, and so then maybe you say 48; I don't know.
E
But anyway, yes, I mean, it's... yeah.
B
We're going to be constructing a fraction at some point that will have 52 bits of significand in it, but there are many powers of two between one and zero, so I think we are able to use all those bits. That's my first answer to this question. I may be wrong, but...
E
But usually there are no random generators which directly produce floating-point random values. They usually consume one integer, one 64-bit integer, and just take 53 of those bits to generate the random float; and so it's a little bit strange to convert a random long first into a float and then back to some... I don't know what's the use case for that.
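The construction E alludes to looks like this in Go: keep the top 53 bits of a random 64-bit integer and scale them into [0, 1), since a float64 carries 53 significant bits. A sketch, not any particular SDK's code.

```go
package main

import (
	"fmt"
	"math/rand"
)

// uniformFromUint64 keeps the top 53 bits and scales them into [0, 1).
func uniformFromUint64(u uint64) float64 {
	return float64(u>>11) / (1 << 53)
}

func main() {
	fmt.Println(uniformFromUint64(rand.Uint64()))
}
```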
A
Well, but going with 48 leaves us at least one byte, so called... well, let's call it unused, in the trace ID, which we could in the future devote to something else, for example the r-value, if someone wants to go with the previous design. So we would have yet another bit which would say: well, we have the r-value embedded in the trace ID on byte such-and-such, or whatever.
B
So this could be a new trace flag in the future, beyond randomness, which says: actually, we're taking one byte of your seven bytes of random; now you only have six bytes of random, and we're going to put the R value there. Actually, you don't need the randomness, the other value... complicated idea. But there is, I think you're right, a potential future desire to embed some information, and we've seen how the r-value could be useful.
B
If you had eight bits, you could put all the r-values in the trace ID, for example; but that could also be... well, yeah, you're right, we can't take back the random flag, because the level two will not know what the level three is going to do. So that's a fairly good reason to say maybe we use not the lowest byte, or something like that; we just use six out of the seven lowest bytes, something like that. Worth thinking about.
B
All right, I can definitely summarize this; I will summarize this for the issue. I think my summary will say something like: we discussed the question about W3C; we think it's going to be a while; we are advising that OTel move forward assuming that whatever the trace randomness spec will do is done already, meaning we are going to assume that some portion of the bytes are random.
B
So there are ways that we could embed more data in the trace context without clobbering bits of the trace ID, it occurs to me, and that just seemed like a really far-out-there proposal to start making a year ago; but it would essentially be, like, imagine just putting the r-value and the p-value in the traceparent and never talking about TraceState.
B
That's basically asking W3C to do what we want, and I don't know what uses there are for the traceparent outside of OpenTelemetry, so it's hard to push that.
C
Josh, I think the only suggestion I have on your summary is: I think the W3C Level 2 is fairly close; I mean, it's not that that will take a while. But I worry that even if it takes, like, say, a few months, say three months, and we are, let's say, at the recommendation stage, even then implementations have to adopt it, right? And let's take the example of, say, .NET, which goes by, like, a yearly release cycle.
C
If we already know that the majority of the current implementations already generate it in a random manner (and that is something I heard from Daniel Dyla and Sergey; yeah, that's believable), at least the last few bytes, right; and all this proposal is saying is: if we are betting on the W3C thing, it's fine, we are betting on it, but at the same time, more like a stepping stone or a backcompat...
C
We are also assuming that, even if the flag is not set, we're just going to apply the same logic. That way, in the fullness of time, everything becomes more consistent; but in the short term it may be less consistent, if somebody really didn't adopt randomness for the last six bytes, or whatever we picked. But yeah; it's more like the combination of W3C plus the implementations is what worries me.
C
And I have a lot of interest in the consistent probability sampling with the teams I'm working with within Microsoft. I think many teams are interested in trying that, as opposed to the pure pattern-based sampling, and I cannot tell them, hey, come back after a year or two years for it to work, right? I want them to be able to start on it sooner, and I believe this may be, like, a good kind of plan to get them there eventually, with full consistency.
B
All right, I will do my best to summarize the position that we just found here in this issue, and I will see you all in two weeks.
B
I don't need to cut off the meeting, but maybe that is the end. I don't have much else.
C
Okay, I had a couple of questions, I mean, if we have time. (Yeah, there's a little time left, yeah.)
C
So, the two topics I wanted to talk about: one is the linked traces, right? When a span in trace one links to, say, a span in trace two, the sampling decisions for these two traces are going to be completely independent today, right? So has there been any discussion, any thinking, around how the sampling... like, do...
C
Do we want to even solve that problem of saying there is some notion of consistent sampling across linked traces? This may be a big topic, so we don't have to talk about it now; but at least for the next agenda, I wanted to see at least the opinions on, hey, no, this is not a problem we care about and we won't solve it, versus...
C
So if I understand correctly, that is for the adjusted count, and for the span-to-metrics pipelines and all of that, correct? So let's say I'm service A and I have a probability, a sampling rate, of say 25 percent, so my adjusted count is going to be four.
B
The trace state is a field that gets saved along with the span in the specs, so it's not as convenient as having a field called adjusted count in a span (and there were times when I proposed that, but it didn't fly very well), and the idea is that this trace state is already, per the spec, recorded, and we use the trace state. It all worked out for us, in the sense that it was backwards compatible with existing SDKs...
B
...to do it that way. And I do think that that is a not perfectly great thing. You know, as a vendor, I have to get a span, parse the trace state, look for the sampling information, and adjust the count.
B
And honestly, we haven't done that yet, because very few people are actually using that; our customers, like, you know, we're behind. So if there was a will, and I think if we had widespread use of samplers and probability sampling, then I think it would be a reasonable forward step to add a field in the span.
D
...decorates the outgoing span with a field called sample rate, that is, you know, four in this case, yeah.
B
This is another proposal that has... I mean, in ancient history, all these proposals have been heard. I mean, the idea of a span attribute called sampling probability: I feel like I have a conceptual objection, like, this is not about the span; this is, like, metadata. So I kind of wanted to see, potentially, refining our concept of attribute to have, like, descriptive and non-descriptive, or something like that. This is a non-descriptive attribute; it, like, doesn't say anything about the span.
B
It just says something about how to count it. But that's quite a lot of nuance, and so here we are; maybe an attribute called sampling probability makes sense, but I kind of want to... somehow that kind of bothers me a little bit. Your first question, just briefly: I have not heard much talk about how to handle links.
B
Lightstep is barely adding support for that, I believe. We do tail sampling, so, of the links, there's some way we can try to collect links; but we've already got enough trouble collecting full traces when they get quite large, and at some point the number of links is also a problem and we begin sampling those. I don't know how to answer your question, though.
C
On the first one, I see your concern about the conceptual side of it. Maybe sampling rate is not the right attribute name; something like a representativity score, or something like that, which says, hey, this span represents 10 other spans, right, something like that, which might attach more...
C
So I would prefer we standardize the attribute there, because going and parsing TraceState and looking for the p-value, it all seems like a lot of... okay.
B
I mean, like, a note that you said that: you're probably the first person in a year who said that, but it's not the first time. I think that's worth thinking through. There's a lot of roadblock on semantic conventions in OTel right now; we're having trouble, like, you don't want to put us... like, there's a lot of trouble with metrics, basically, in the cardinality explosion.
D
Great, thanks. I just pasted (I knew this was scratching my brain) something that got posted recently on our Refinery repository, which is a kind of discussion of sampling, you know, because this, your cross-trace sampling problem, it's not answered; it's more of, you know, part of the discussion there. So maybe, you know, next meeting we can talk about this; I'd love to dive into this further.
D
Sure. Oops, you're muted, Josh.