From YouTube: 2023-01-18 meeting
Description
Open cncf-opentelemetry-meeting-3@cncf.io's Personal Meeting Room
B
So while people are joining, perhaps we can start a discussion, Josh, on our item, because I think I can answer your questions. Or we can, I mean, we can... thanks.
C
Can you hear me? Yeah.
C
I wasn't ready to go first, but I'll try. So, I posted an issue just a few minutes ago, so no one's read it, but I wanted to brain-dump, because I spent most of yesterday afternoon looking into it. So you can read along. Let me pull it up, so that I don't make a fool of myself while I'm trying to speak about it.
C
So, the background here is: I'm working on this thing called OTLP Arrow. We're trying to get the Apache Arrow columnar representation and transport mechanism into place for improving compression rates. Part of that is the use of columnar compression, depending on your schema.
C
Only one part of that is using gRPC streams, and we have been asked to disambiguate the benefit, to actually say how much of it is unary RPC versus columnar compression. But let's just proceed in saying that some of us want the best we can get, and gRPC streaming is going to help us. So I'm trying to figure out how to make header propagation work in the context of the Collector and the apparatuses that we have, as well as using gRPC streaming. So I wrote down the facts of the background.
C
I think I may have found a slight bug. There is sort of an orchestration happening between receiver support (the include_metadata setting), exporter support, which has a static field called headers, an auth extension that lets you essentially dynamically add some metadata, and then the mechanism that gRPC supports and the mechanism that HTTP supports, which are very different. So you see how the Collector has standardized some stuff, but down in the depths of the HTTP and gRPC layers it's completely different.
C
So what we have today doesn't quite do what I need, and my goal is to set up this, what I call a bridge, so that if you were running one collector before, and, you know, it was succeeding, you would now run two collectors. Those collectors would have a bridge between them.
C
It would be the OTLP Arrow exporter on one side and the OTLP Arrow receiver on the other side, and the idea is that that connection will get really good compression. And I want it to be seamless, and that means, configuration questions aside, you should be able to configure your bridge so that, if metadata and auth were working on the first side of the bridge, it'll convey enough information to the second side of the bridge that the exporters work, so that I can install this bridge between a receiver and exporters.
C
Somehow. And I went into studying how all this works. Focusing only on gRPC for a moment: you get your request metadata from several locations. You get it from the auth extension, dynamically, and you get it from the exporter, statically. And the way that the gRPC mechanism is implemented,
C
it uses both an auth plugin as well as the outgoing context, and those are meant to be separate. And the problem I'm kind of running into here is that everything is fine with auth, and I want it to be the way it is, but I use a background context when I set up my stream, and that's because I don't want to set up routing and, like, different arrangements to separate my metadata upstream. I just want to be a bridge and pass the metadata through.
C
So the background stream may be authed using static metadata or a different plugin, and then the question is: as I receive and produce requests as an exporter, I want to pass them over this bridge.
C
Using Arrow, I can do anything I need to in the protocol. We're proposing to basically encode a bytes field using HPACK, just the way per-request metadata would be transferred, but to convey optional metadata associated with the request. And I say "export request" thinking of the plog Logs, the pmetric Metrics, or the ptrace Spans, right. So I'm trying to associate per-data-item metadata with my export request, and I don't have a way to do it right now. The auth was set up once, at the beginning of the stream.
C
There's a thing called outgoing metadata for gRPC, and you'd expect to be able to use it, but it gets overridden by the exporter. gRPC then does the right thing, but there's no way for me to get, through an official mechanism, the metadata that the auth extension would have given you dynamically at a per-export level, because the auth is run per stream, not per export.
C
Long way of saying: I think a new extension mechanism is actually called for here, as well as maybe a bug fix. The extension mechanism would be much simpler than the auth mechanism. It would just take a context, return a context, and, like, give you some keys, basically, that are going to be exported. Then that would be merged with the gRPC outgoing context, or it would establish another RoundTripper in the HTTP case to merge some more stuff.
C
But then, in my streaming export case, the OTLP Arrow exporter, I would just call this new extension on a per-export basis. I would have my per-stream auth extension called once, when I started the stream, and then I would have a per-export extension called to get me the outgoing metadata. That would merge the static metadata and the request metadata, but it wouldn't concern itself with auth. Really, it's just a simpler extension.
B
I hope you can still hear me; there were some Zoom glitches. Can you hear me? Okay, that's good. All right, so, quite a lot to unpack here. So, the way that the authentication works is, and I think it's not a problem for you...
B
Sorry. It's per stream within the connection, and that's not the problem that you're having, right? So that's one part of the problem, because, you know, you get a context with the establishing of the connection, and then you get these streams within the connection.
B
Now, I guess this problem is relevant for you because of batching operations, because whenever a batch happens, the context is just discarded, and then a new context is created, and everything that was inside the context is lost, like the client info, and the authentication information is within the client info. So we had this idea back then, and I see that Tigran is here.
B
So that was part of the discussion of having a new field on the pdata, on the plog Logs, Traces, Metrics, to store the telemetry data context, and Tigran even had a POC back then, perhaps a couple of years ago. Now, the problem that we were having was that we couldn't get the proto generators to work with this new function, or with this extra field, because the pdata that we have, so the Logs, Traces, Metrics, is auto-generated from protobufs.
B
We could have new methods but, as far as I understood, we couldn't just add new fields. That was the problem back then, if I recall correctly. So now, that would solve your problem, I guess, because then the data is attached to the data points instead of to the context that is passed through the pipeline.
B
Now, you mentioned the header setter, and I would assume that the header setter would work for your case, because you can place arbitrary things into the client info, so not only auth data, right? So you don't need an extension to place things in the client info. You can just set new values on it, and all the data is just one struct within the client info. So you can just add your own struct within the client info, injected at one point, like the receiver, and then extracted at the exporter side. But...
B
That will be stored within the context that is passed through the pipeline. Again, if batching is there, you lose that data. If batching is not there, then you get that struct down there at the exporter, but then the exporter and the receiver should agree on which key to use, right? And I guess that would be the role of the extension here.
B
I think I'd try to solve the problem with the original approach, like adding a telemetry data context to whatever we generate, and if that doesn't work, then perhaps a new extension. But again, this extension's sole role would just be to put in and get items from the context, right? Because...
C
Yeah, I think I'm proposing something that's essentially like the header setter extension, but not called on a per-connection basis; called on a per-data-item basis. So, I think I do like the idea of a metadata field in the data point itself. I guess that's a lot to ask for, and maybe Tigran could talk about it, but I can make this work.
C
It just means, I think, that I go through my list of extensions, I find any auth extension, and then I just call it again using the context that I have, even though I'm not authenticating a request. There's no requirement to use the result of the extension for auth purposes. I will get some metadata back, I will then convey it manually, and I think it'll work. It'll just be difficult to document.
B
I'll read your issue carefully, probably not this week; I have more than 900 items in my notification queue.
C
I was mostly fishing for interested parties, Juraci. You don't have to be the one to pay attention to this one, but I'd be interested to know if others have consulted on or run into this and considered what to do about it.
C
I don't know; probably different definitions of "transparent". I'm willing to configure whatever needs to be configured. I know I have to set include_metadata to even get it on the receiver, so I would probably require setting include_metadata on both the bridge and the original receiver. And then I know that I can configure a header setter extension to propagate things over ordinary, or unary, RPC and HTTP requests. So I'm kind of looking for an emulated behavior
C
that would do the same, so that when it comes out from the receiver on the other end of the bridge, it has the same client metadata it did, except for auth, because the auth is now different. And I just want to be able to document what's happening, so that each vendor can solve its own problem, I guess.
B
Did you add how we would accomplish that with the header setter?
C
Well, I will do that, yes.
C
Yeah. I will say, I prototyped something first, before I even understood the auth extension and the header setter, because it's hard to find all this stuff at once. My first dumb approach to this was simply to add something corresponding to the headers field in the exporter.
C
If the exporters had, as a sibling of the headers field, a propagated-metadata-keys setting, then we could just solve this in the exporter. At least, I know how to do it in the gRPC case. So right now, if you know the OTLP gRPC exporter, it has this enhanceContext method. It's called right before the actual export call, and that's where you put the headers in. And I've sort of... I think it's a bug: anybody upstream can't use the outgoing context, because the exporter overrides it.
C
So if the exporter had a second section, the propagated metadata keys, it would then just look up the client info, get the metadata, look up each key in its configuration, and set the outgoing context itself, and that'll work. I don't know how to do it in HTTP; I haven't looked at it. But, you know, that's actually the simplest approach that I tried taking, and it just doesn't use the auth or header setter extensions, and someone called that out, and I didn't know what to say.
C
Yeah, and I think that's another reason why I want this to work. Actually, no, I'll respond to that: you're right, I think we would probably like to figure that out.
C
From my vendor's perspective, I was mostly thinking about this from the bridge, from the making-the-bridge-seamless perspective. And if a user, a multi-tenant user, was sending data through one of these collectors, as I'm imagining, they would have to batch by the metadata key that they want, and then it would propagate through the Arrow bridge the way I'm describing. But I would need that first step, and so I haven't fully solved my problem just by getting the metadata over the bridge.
B
Yeah, so, a couple of comments. The first one is: we had a thought about making the batch processor aware of the authentication data, so we thought about having a group-by attribute on the batching processor, so that it would group into batches of similar data, and then you, as a user, define what is similar.
B
We never actually ended up doing that work; we were just brainstorming on how we could solve that. And the other thing is: when I use the routing processor, or the load balancer, well, actually only the routing processor, for any reason, I don't do the batching on the same pipeline. I just route to whatever I want, and then at the last mile I do the batching. Perhaps the last mile is within the same process, just a different pipeline, or perhaps it is another collector
B
in my physical collector, in my observability pipeline. And a final comment: you mentioned whether we see that the batching processor is so valuable that people are going to use it, and the answer to that is, it is actually recommended for production. So if you look at the documentation for the Collector, the memory limiter and the batch processor are the ones that we recommend using in every production deployment of the Collector. So you can assume that batching is always going to be there.
B
All right, but let's follow up on the issue itself. Let's go to the next item.
E
Sure, all right. What's going on, everyone? I posted a message in the Slack channel, but I will keep the intro short and sweet, as Felix will probably be doing most of the, you know, presentation part here. But yeah, we came over; there's a bunch of people here from the profiling working group.
E
You know, profiling organizations and vendors who are doing profiling, some on the open-source side of things, so a lot of good expertise on the profiling side. But many of us had not been particularly involved in OpenTelemetry before we started this working group, and so we are now kind of at the stage where we're trying to figure out how the Collector will sort of deal with profiles and profiling. And yeah, I wanted to bring that to this group, as it is probably the best place to, first of all, just get some fresh eyes on what we're doing, and also get some ideas on how we can basically add profiling, but not in a vacuum where it works differently from how the Collector treats other signal types, and that sort of thing. So yeah, there's a link to the presentation in the meeting notes.
E
If you want to jump ahead to the end of it, slide 22 has some questions that we will eventually get to throughout, but other than that, I guess I'll hand it over to Felix. Our whole group has talked about this presentation that he made, and we are generally in agreement with the content, and interested to see what you all have to say about it.
F
All right, yeah, I can do the presentation. Let me see if I can share my screen. Can everybody see that? It should be working. Cool. So, yeah, basically... let's see, here we go. That's the first slide. Yeah, so Ryan kind of mentioned most of the things that are listed here. Maybe the one thing to call out is that the initial proposal was actually made by Sean, who's here today, I believe, AKA "movie store guy". So he's the original profiling instigator.
F
Then, where I continue here is that in September, one thing our group proposed is a profiling vision OTEP. We would encourage people from the Collector side who want to understand more, like, sort of our high-level goals, to take a look at that. Yeah, for the presentation, Ryan already mentioned we have some goals. The goals are, yeah, socializing the idea of profiling as a new signal type.
F
Some of these questions I had a chance to sit down over with Tigran yesterday, so some of them I'm more clear on than I was before, so maybe we won't have to spend too much time. But, yeah, we're a little unsure what it means for two components in OTel to be compatible; like, are things compatible if they require a collector to talk to each other? That's an interesting question.
F
We have some ideas about maybe doing sub-signals; I'll explain that later. Running OTel collectors on separate hosts: Tigran already kind of confirmed that that's a common thing in OTel; that was something our group wasn't 100% sure how common it is. But can the OTel collector buffer messages for a stateful protocol and kind of become a little database? We'll dive into that.
F
How would pdata deal with binary blobs, that representation for JFR or pprofs? And, yeah, should we maybe consider reusing the logs signal type? The only reason I mention this now is to kind of give you an idea of what we want to ask at the end. But the definite non-goal is to convince any of you here of any particular proposal.
F
We're trying to take learnings from this meeting today back to the working group on profiling, and then come up with something that we think has a good chance to work well in the Collector and in OTel in general, I would say. So, yeah, a summary of the profiling vision OTEP: I think our group is aligned that the target environment where we want to do profiling is production. So production data is what we're after. There are different types of profilers that we want to support.
F
Perhaps the most important ones are whole-host profilers, which are collecting data from an entire machine, and also runtime profilers, which are collecting data from inside of an application, in collaboration with the runtime. That sometimes allows going more in depth but, of course, covers only a limited amount of the activity on the host, in particular one application and one runtime. Yeah, and the different profile types that we would be interested in supporting are CPU profiles.
F
That's probably the one most people think about and talk about the most, but also allocation profiles, lock contention, etc. We are very interested in minimal overhead for everything: CPU, memory, network, disk, latency. We don't want to make applications or hosts any worse by introducing profiling to them. And we're sort of particularly interested in it because, oftentimes, profilers can be used to kind of measure their own overhead, and so it's very often clear how bad things go when you turn them on.
F
We also think that we should have good support for popular legacy formats, specifically pprof and JFR, which we've identified as the two major formats that we should take into account. I'll talk more about that. And last but not least, correlation with other signals is going to be really important. In particular, traces are probably the most important one, but also logs and metrics. I'll talk a little bit about traces. So, yeah, click
F
the link, if you want more. The profiler landscape in 2023 is that there are many vendors, and many of them are attending the profiling working group. There are also many open-source profilers; I'm only listing a small sub-selection here. The two popular formats I've already mentioned that we care about; notable about both of them is that they are built into their respective runtimes, into Go and Java, but it's a really big ecosystem of things that is out there.
F
Mark Hansen has a really nice Profilerpedia website where he lists all the formats and tools in the space, or at least it's the most comprehensive listing of things out there, and we certainly don't try to support all of them. We've identified pprof and JFR as, sort of, the very important things. And another thing that our group has had in mind quite a lot in our discussions is something done by Elastic, formerly Prodfiler, or Optimyze.
F
They have a profiling wire format, which is a very efficient protocol to get profiling data from a host to a backend, avoiding a lot of duplicate data transmission in the process, and that is probably a direction we want to take if we decide to design a new format for OTel beyond JFR and pprof. Yeah, for those not super familiar with profiling data: it's typically just a set of stack traces, and they are associated with a count or weight, to show how big they would be in a flame graph.
F
A stack trace itself is often just a list of program counters in a compiled binary, or, depending on whether it's a dynamic language or not, there will also be readable symbols, like function names, file names, line numbers; either or both of those can be present. Stack traces might also come with their own timestamps. So, oftentimes, when you have a profile, the timestamps get kind of thrown away, and you get an aggregation of what happened in a time window, but I would say our group is also interested in collecting individual events with individual timestamps.
F
There are a few use cases for those, for example FlameScope, heat maps, and other things. Stack traces can also be associated with custom metadata; in Go there are pprof labels, to say, and there are similar mechanisms elsewhere. And one thing we are hoping to squirrel away in that metadata for profiling is the span and trace ID, so the correlation can happen. And, yeah, the most common visualization for profiling data is the flame graph.
F
If you haven't seen it... it's, yeah, the one that some people call a flame graph is a flame chart, and vice versa, but the flame graph is the one where the x-axis is not time. It's duration; it's not the passage of time, and the ordering is alphabetical. On the tracing correlation side, I just want to give a concrete example of something I was working on, but it's just an example.
F
Basically, one thing profiling could bring to tracing is: if you have a span, and it has an amount of self time that's not explained by other child spans, you can pull in the profiling data scoped to specifically that span ID, and then understand whether you were CPU-bound in that time range or not. And if you were CPU-bound, you can even break it down into a little mini flame graph.
F
So being able to do that in OTel would be really cool, I think. So now, let's talk about pprof for a second. It's an open-source project owned by Google.
F
It doesn't have first-class support for timestamps. Notably, labels (oh, that got misspelled) can be used for that, but if you put one label per timestamp on samples, it's not a very efficient way of doing it: you get a lot of duplication in the data structures in the format. Yeah, but again, the most important thing is that it's built into the Go runtime and tooling. Built into the Go runtime means the Go runtime, when you ask it for profiling
F
data, will often give you a finished pprof; at least that's the most comprehensive interface for getting that data, and if you wanted to convert it to something else, you would have to actually do that conversion somewhere, in most cases. There is a large ecosystem of tools and converters for pprof, so there's a whole bunch of tools out there. And if we decided to design a new format, pprof is simple enough that we think we could design something in profiling that would be a superset of pprof, so we could probably convert pprof to something new; that is doable without data loss. So that's, I guess, interesting. Here's how the data structures look in pprof, but I won't go into that detail.
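For readers without the slide, a heavily trimmed sketch of the pprof shapes being referred to, modeled loosely on pprof's profile.proto (most fields are omitted). Strings are interned in a table and referenced by index, one source of the format's compactness:

```go
package main

import "fmt"

// Profile is a drastically simplified stand-in for pprof's Profile message.
type Profile struct {
	Sample      []Sample
	Location    []Location
	Function    []Function
	StringTable []string // index 0 must be the empty string
}

// Sample is one observed stack trace plus its values (counts, nanoseconds, ...).
type Sample struct {
	LocationID []uint64 // leaf first
	Value      []int64
}

// Location resolves a program counter to a function (real pprof has
// addresses, mappings, and inlined line info here).
type Location struct {
	ID         uint64
	FunctionID uint64
}

// Function names are indices into the string table, not strings.
type Function struct {
	ID   uint64
	Name int64
}

func main() {
	p := Profile{
		StringTable: []string{"", "main", "work"},
		Function:    []Function{{ID: 1, Name: 1}, {ID: 2, Name: 2}},
		Location:    []Location{{ID: 1, FunctionID: 1}, {ID: 2, FunctionID: 2}},
		// One stack trace (work called from main), observed 5 times.
		Sample: []Sample{{LocationID: []uint64{2, 1}, Value: []int64{5}}},
	}
	fmt.Println(p.StringTable[p.Function[1].Name], p.Sample[0].Value[0]) // work 5
}
```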
F
Yeah, JFR is similar, as in it's the built-in observability signal in the JVM runtime, but it's different in the sense that it's not a profiling format. It's really a generic event-reporting format that is not limited, so you can basically put everything in there.
F
The next important thing to know: there's no official specification of JFR. There's only community reverse engineering, and code you can find out there that kind of deals with it. And, yeah, for profiling data, JFR will generally come with stack traces, symbols, and timestamps, so it has most of the things that we would care about in the profiling working group. But JFR has a lot of other things that can be totally unrelated to profiling.
F
So a conversion that is not a superset of JFR means you might lose some of the instrumentation that you've put into place and that you would like to use JFR for, or at least you're going to run into trouble. There's also an ecosystem of tools and converters for JFR. You can follow that link later; it goes to Mark Hansen's website. But, yeah.
F
Basically, I think that if we designed an OTel format, it would likely imply data loss if we wanted to convert JFR profiling data into the OTel format before we send it to the backends; whether that conversion happens on the client or the collector side doesn't matter. I think JFR is too complex to standardize a superset of in OTel.
F
Some people here might want to disagree on that, and I'm happy to have that conversation, but that's, I think, a take that most of us in the profiling group have right now. And the second thing is: even if we decided, like, hey, we can reverse engineer JFR, we can do a superset, JFR could still evolve and change in the future. I don't know how likely that is, but in that case the data loss might still be inevitable, unless OTel could be very quick to adapt and expand the format.
F
Yeah, here's kind of a taste of what the JFR data format looks like. It's part of a very good article; I don't think I've linked it, but I can send you a link later if you want to know where that's coming from. And that brings me to the last format that we are interested in in the profiling group. So, even though we want to support pprof and JFR, specifically for runtime-based profiling,
F
we also think that it would be good to potentially standardize a new profiling format for OTel, perhaps focused on the use case of whole-host profiling, where there's not so much worry about JFR and pprof, because those profilers produce their own data. And the direction that we're thinking about for that new format ("oprof" is just a working title; it doesn't exist yet) is to do something inspired by the Prodfiler wire format that I mentioned before.
F
But again, the important point here is that this is a stateful gRPC protocol, and not file-based, and what I mean by that is: the stack traces are hashed and only transmitted once. So even a list of program counters is represented as a hash and only transmitted once, and later on referenced by that hash ID. And so, yeah, the future increments of counts are like saying: hey, I've seen that stack trace again, please increment
F
a weight on it, and that is done by referring to it by hash. And symbols are also only sent once; for example, that can happen during CI, or it can happen the first time the symbol needs to be resolved. But, yeah, oprof, or something like the Prodfiler wire format if that becomes oprof, would avoid resending data. For pprof and JFR, the common case
F
right now is that you send the same symbols, for example, over and over again, and the same list of program counters for a stack trace is sent over and over again, and typically that's done once every 60 seconds. That's a lot of data duplication, and profiling data can be up to, like, 100 gigabytes per month per host if you do it naively, and at that point, bandwidth costs, especially across clouds, can be relevant. And so, yeah, the motivation for oprof would be to do something
F
that has significantly lower network overhead than pprof. There's no prototype or design yet, because we don't want to go too deep on that until we know that something more stateful, where different pieces of information are transmitted at separate times, is something that the Collector folks would be comfortable with, because it's going to be problematic if the Collector is trying to do processing on the received data. Here, yeah, exactly: the Collector would need at least three different endpoints; for OTLP you would need three endpoints for the different operations.
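The hash-once scheme described above can be sketched in a few lines of stdlib Go. The byte accounting is invented purely to show the shape of the saving (a 32-byte hash plus an 8-byte increment per later sighting, versus resending every program counter):

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// sender is a toy model of a stateful profiling exporter: stacks it has
// already sent are remembered by hash and never resent in full.
type sender struct {
	known map[[32]byte]bool
	bytes int // rough tally of payload bytes "on the wire"
}

// hashStack derives the stack's identity from its program counters.
func hashStack(pcs []uint64) [32]byte {
	buf := make([]byte, 8*len(pcs))
	for i, pc := range pcs {
		binary.LittleEndian.PutUint64(buf[i*8:], pc)
	}
	return sha256.Sum256(buf)
}

func (s *sender) send(pcs []uint64) {
	h := hashStack(pcs)
	if !s.known[h] {
		s.known[h] = true
		s.bytes += 32 + 8*len(pcs) // hash plus full stack, first time only
		return
	}
	s.bytes += 32 + 8 // hash plus a count increment on later sightings
}

func main() {
	s := &sender{known: map[[32]byte]bool{}}
	stack := make([]uint64, 50) // a 50-frame stack of synthetic PCs
	for i := range stack {
		stack[i] = 0x400000 + uint64(i)*32
	}
	for i := 0; i < 100; i++ {
		s.send(stack) // same stack observed 100 times
	}
	naive := 100 * 8 * len(stack) // resending all PCs every time
	fmt.Println(s.bytes, naive)   // prints 4392 40000
}
```

The receiver-side cost implied by this is exactly the buffering-of-state concern raised for the Collector.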
F
The Collector would have to buffer the state, and that could be complex and costly in terms of memory and disk. So, yeah, what are we trying to do at the end of the day? We're again trying to create compatibility between different clients, profilers, and backend receivers. So, ideally, the whole ecosystem of different vendors that are making different whole-host and runtime profilers could interact with each other through OTel.
F
We think that it would be nice if the clients could keep emitting JFR and pprof and, in addition, oprof once that exists; JFR especially, for the data-loss reasons mentioned. But, yeah, I don't think it's reasonable, or the group doesn't think it's reasonable, that backends should be forced to support everything, like: in order to be an OTel backend, you need to support JFR, pprof, and oprof.
F
That would be a very large burden on backend receivers, so we're thinking an ideal solution would allow sort of just a subset of these to be supported by backends and still be considered a part of the OTel family. We'd also like to avoid client-side conversion
F
for another reason I haven't mentioned so much, which is overhead, in addition to the data loss, because some of these files can be pretty big, and it could be relevant overhead in terms of CPU and maybe even memory usage to convert to something else on the client. But there's also the question of duplicate implementation effort: if every client is forced to speak two or three different output formats, that would also not be great, because those clients would, for runtime profilers at least, be written in different languages. Yeah.
F
So, how can we make this work? Obviously, the Collector could play an important part in this and fill the role of a converter. So maybe the Collector could come in as a piece where different applications send different data to the Collector, and then the Collector can figure out how to turn these formats into things understood by the backends, even if the backends only support a subset of the formats submitted by the clients, here on the left side.
F
But of course, if the Collector is not in the picture, then the compatibility story would be limited. There would be components in this picture that would call themselves OTel but wouldn't be able to directly talk to each other without the Collector in there, and that's something where we're not sure the Collector group is comfortable with that.
F
So we're not really quite sure yet what the architecture could look like. One architecture, the most obvious one, is to just have one profiling signal that carries multiple formats: so you have a payload that could contain a pprof file, a JFR file, or different oprof messages, and backends could choose to support a subset of these payload formats, but maybe not all of them, and then the Collector could be configured to convert between those payload formats.
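The "one signal, many payload formats" option can be sketched as a tagged union. All names here are invented for illustration; nothing like this exists in OTLP today:

```go
package main

import "fmt"

// Format tags the encoding of an opaque profiling payload.
type Format int

const (
	FormatPprof Format = iota
	FormatJFR
	FormatOprof
)

// ProfilePayload is one message in the single hypothetical profiling
// signal: a format tag plus opaque payload bytes.
type ProfilePayload struct {
	Format Format
	Data   []byte
}

// accepts models a backend that supports only a subset of the formats;
// a collector in the middle would be configured to convert the rest.
func accepts(backend map[Format]bool, p ProfilePayload) bool {
	return backend[p.Format]
}

func main() {
	backend := map[Format]bool{FormatPprof: true} // a pprof-only backend
	fmt.Println(accepts(backend, ProfilePayload{Format: FormatPprof})) // true
	fmt.Println(accepts(backend, ProfilePayload{Format: FormatJFR}))   // false
}
```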
F
Another option is to maybe have sub-signals, or separate signals, for the different formats, so there could be profiling.pprof, profiling.jfr, and so on, and in that case the clients and backends could explicitly specify which signals they support. So people who would be deploying these OTel clients would know exactly which backends they are directly compatible with, in which cases a collector would be required, and in which cases overhead would be incurred from the conversion. And, yeah, things would be a little bit clearer on the user side.
F
Of course, without a collector, only signal-overlapping implementations would be compatible, but when a collector is in the picture, it could maybe convert between sub-signals. I know from Tigran yesterday, you shared that OTel currently does not convert between signals; that's not something that's being done, but apparently it's under discussion already for other use cases.
F
So maybe that's not an entirely crazy idea, but we're not sure if it's a good idea. The last direction would be a unified signal or format, and that would probably be a compromise: one profiling signal that is basically a subset of JFR and maybe a subset of pprof. Data would have to be converted, which would cause data loss and overhead. It would potentially require a Collector for many users, because we probably don't want to do the conversions in the client.
F
I think if we did something like that, the reality is it would probably lead to a situation where vendors would prefer to continue using their existing formats, and maybe support an OTel-based ingest, but it wouldn't be the first-class citizen for ingest. I think everybody in the profiling group would really like to walk away with something where we all feel: yes, this would be our first-class ingest format going forward.
F
Yeah, then there's the question of what that protocol should look like, but we don't need to go there. So, to recap: we're trying to find a good architecture. JFR on top of pprof is difficult; data loss and overhead are hard to ignore. A new OTel format could make pprof redundant in the future. pprof is the one we're comfortable with designing a superset for, but JFR probably not. And yeah, as mentioned by Ryan.
F
We have tried to study OTel, but we're coming to it with little experience, and what we especially need help with is this: when we see things being done in OTel in a certain way, we're not sure which of these things are written in stone, core OTel that nobody wants to change, and which ones are a little flexible. We're very happy to receive advice on all of this. So thank you so much for your time, and if anybody has thoughts on these questions, please shoot.
D
So Felix, since we talked yesterday I've been thinking about this, so maybe I can comment first. The first thing is about statefulness versus statelessness. I think it's possible, right? It's possible to design a stateful protocol with delta messages and have a pdata design in the Collector that corresponds to those delta messages. I did a quick sketch yesterday night of a possible solution in the Collector, which I think should work.
D
And I think we can also support efficient pass-through of both a new format and the custom formats as well. However, I think it's important to first understand if we should do it at all. The statefulness adds a lot of complexity, especially to the Collector, and I personally would want to see a very good justification to adopt a stateful protocol. So far I don't see that justification in the form of any evidence. I read the profiler document.
D
It has a section that explains that raw profile data is too large to be transferred on the network, and then the document jumps to the conclusion that a stateful protocol is needed, one that uses dictionary compression, columnar format, etc. I'm not convinced by that explanation, to be honest. I want to see an apples-to-apples comparison of equally well-designed stateful and stateless protocols. With a stateless protocol I can also use all those techniques, right?
D
I can use dictionary compression, I can have a columnar format, and I want to see that comparison. I want to see benchmarks that show how similarly designed stateful versus stateless protocols behave, at least from the network-size perspective. If we can have this side-by-side comparison and show as evidence that a stateless protocol is not good enough and stateful is much better, then I think we should be working on solving this particular problem in the Collector.
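The point being made here, that a stateless protocol can use dictionary techniques too, can be illustrated with a minimal Go sketch in which every batch carries its own string table, so repeated symbol names cost one entry per batch rather than one per sample. The types are hypothetical, not Collector code.

```go
package main

import "fmt"

// Batch sketches a stateless message that carries its own string
// dictionary: each stack frame refers to a function name by index,
// and the dictionary is resent with every batch (no shared state).
type Batch struct {
	Strings []string // per-batch dictionary
	Stacks  [][]int  // each frame is an index into Strings
}

// buildBatch deduplicates function names within a single batch.
func buildBatch(stacks [][]string) Batch {
	idx := map[string]int{}
	var b Batch
	for _, stack := range stacks {
		frames := make([]int, 0, len(stack))
		for _, name := range stack {
			i, ok := idx[name]
			if !ok {
				i = len(b.Strings)
				idx[name] = i
				b.Strings = append(b.Strings, name)
			}
			frames = append(frames, i)
		}
		b.Stacks = append(b.Stacks, frames)
	}
	return b
}

func main() {
	b := buildBatch([][]string{
		{"main", "handler", "parse"},
		{"main", "handler", "encode"},
	})
	fmt.Println(len(b.Strings), len(b.Stacks)) // 4 unique strings, 2 stacks
}
```

A stateful variant would keep `Strings` on the recipient side across batches; the apples-to-apples comparison being asked for is essentially how much the resent per-batch dictionary costs on the wire versus that shared state.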
D
It's a complicated problem to solve. I think solutions are possible, but before we jump into that I would want to see some evidence that it's actually necessary. The other problem that we talked about is the ability to do efficient pass-through of custom formats like JFR. You want it to be passed through without data losses and also efficiently. I think that's also doable inside the Collector, without necessarily introducing a new concept of sub-signals or even having signal-type conversion supported.
D
I did a draft sketch of what that could possibly look like in the Collector's code base, and I think it's doable. There will be some limitations, but I think it's doable. There can be support both for our own native profiling data format and for custom formats, and I think we can support both efficiently. So we can look at the details; I have some code that we can all go over together and discuss. But that was my preliminary finding.
F
So if you keep resending the same 10 or 20 percent of your data over and over again, every 60 seconds, in your profiling system, it's going to be a very significant difference; we're talking at least 7x, maybe 10x. But we'd be happy to work that out in a little bit of detail and submit that for everyone to review.
C
Have we considered a protocol that allows you to send data once as a program starts, and relies on sort of best-effort delivery from many participants to achieve the delivery you need? So I don't need to send it every 60 seconds; I just send it once per process start, and that's often good enough. I feel like that's how you get from stateful to stateless, at least as a first step.
D
That's kind of a subset of a full stateful protocol, right, which can send deltas along as they appear. So what you're describing is also stateful: you have to keep that initial whatever-it-is you're sending on the recipient side.
F
So that's possible in theory, but there are some symbols that never actually need to be transmitted, and you don't know which ones are going to be dynamically relevant to the execution of the program. The more important thing is that stack traces, which are also a big part of the data, are also a dynamic property of program execution. It's the combination of all program counters in the program that can call each other, and that is not determinable upfront.
G
I can talk to some prior efforts for profiling that were made, specifically using the logs pipeline. For example, the Splunk HEC receiver and exporter both have some level of customization specifically for profiling, so there were some learnings out of that, and things you should know about; it's important.
G
We realized that we couldn't just mix logs and profile information in the same pipeline. The reason for that is that we lose a lot of batching. We also have a group-by-attribute step that happens for Splunk HEC, specifically for passing around the access token, so that we can reuse it across resources and batch things efficiently.
G
When you do that, you end up with your backend getting a mix of logs and profile information in a way that is very difficult to route at the resource level. Now, if you're familiar with resources and the signals and how they work, it's really a big component of OTel: you want to have signals of a certain type assembled in a resource, so you can avoid repeating yourself over and over.
G
Those are each of your attributes, right? So for logs you will have, for example, a thousand logs inside one resource. If, in the middle of your logs, you have some profiling information, now you're in trouble. So what happened is we had to do quite a bit of work, from what I'm seeing in the code, to disentangle them and create two different queues for profiling information versus logs information.
G
What that means is that for each and every one of your log components in OTel, you end up having to do the work of having a separate profiling signal type anyway, because your backends really need you to type things; they have to discriminate at a very fine-grained level otherwise. So I broadly support going for a different signal type for profiling.
G
Just for that reason. And it would certainly make the code of this concrete exporter a lot nicer to look at, because right now we have special cases all over the place, and based on headers of what is being passed in we perform different types of queuing so that we can send things in a timely manner.
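The two-queue split described above can be sketched minimally in Go. The types are hypothetical stand-ins; the real Splunk HEC exporter code is considerably more involved, but the essence is separating mixed records so each kind can be batched on its own.

```go
package main

import "fmt"

// Record sketches an item flowing through a logs pipeline that, as
// described above, may actually be carrying profiling data.
type Record struct {
	IsProfile bool
	Data      string
}

// split separates profiles from logs so that each can be queued and
// batched independently, instead of breaking batches whenever a
// profile appears in the middle of a run of logs.
func split(records []Record) (logs, profiles []Record) {
	for _, r := range records {
		if r.IsProfile {
			profiles = append(profiles, r)
		} else {
			logs = append(logs, r)
		}
	}
	return logs, profiles
}

func main() {
	logs, profs := split([]Record{
		{false, "log-1"}, {true, "prof-1"}, {false, "log-2"},
	})
	fmt.Println(len(logs), len(profs))
}
```

With a dedicated profiling signal type, this branching would live in the pipeline definition rather than inside every exporter.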
H
Hi everyone, I do have a couple of questions for you, Felix. The first one is about statefulness. Before, and Tigran knows this, we had a streaming protocol in OpenCensus, mostly for the same reasons, to avoid sending data multiple times, and we had quite some trouble with load balancing for that protocol. The problem is that once you start a stateful connection between two entities, you cannot simply move it.
H
So you will not be able to do load balancing at all; you will have to restart that connection periodically and so on, and it would be a mess for you. Because of that, I think you'll have trouble making that stateful protocol the official protocol inside the OTel environment in general. I think there is a good reason to have a stateful protocol when you are talking across, maybe, cloud providers and things like that.
H
But if you are in the same AZ, or in the same cluster, in the same data center, network is so cheap these days that it doesn't make any sense to have something this complex. When you are in the same data center, it's even faster than disk. So for somebody who wants to convince me that the stateful protocol is useful in this case, I'll have a hard time understanding that.
F
I agree with you, but my question here is: is it very common that everybody manages to be in the same data centers as their cloud observability vendors? Because that also means that, to be a cloud observability vendor, you have to be in every data center everywhere to make your customers happy, right? It's a big burden.
H
As I said, it comes down to understanding where you are sending the data and what you are sending the data for. So again, I'm not saying that we may not need something like that cross-cluster and so on. But the question is: is it the Collector that needs to receive this and support this cross-cloud-provider use case, or is the Collector mostly focused on supporting connecting sources to the backend?
H
So then OTLP becomes the protocol only between the client, wherever we collect the data, and the collectors; we put the data into the collectors and then send it to the backend, and the backend understands pprof, and that's it. We don't have it everywhere else. But I keep hearing from you that you want to do it without converting between JFR and pprof. And my question is: what exactly do you try to achieve with the Collector?
H
Why do you need the Collector in this case? Because it feels to me that you are actually designing just a proxy for the data.
H
If you don't convert the data into a subset that we understand, that everyone understands, you cannot support any type of operation on it. If the data is opaque at the Collector level, we cannot support any transformation of the data; we cannot support anything, because we don't know what the data looks like, hence we cannot modify or change it. So the Collector is just a proxy, and if that's the case, my question is: why do we put this in the Collector instead of just using a proxy server as it is?
F
Yeah, that's a good question. Maybe I can explain it better. The main thought there is: if you have JFR, and you've got things in there that you really, really like, you want to be able to deploy the JFR exporter in an OTel environment, with a Collector in between or without, without losing data. You just want the data that you have to flow all the way to the backend, so there should be a mode of operation where that is possible.
F
That doesn't mean the Collector shouldn't be useful for people whose backend might not have the ability to receive JFR. So if there's another vendor that you would like to try out, say they have a nicer flame-graph view or whatever, you might want to take the JFR that you currently have, and you're okay with the data loss, and convert it for that particular receiver. Then it doesn't just become an opaque part.
F
So what I'm saying is: we want the ability to forward the data unmodified, but that doesn't mean we're not okay with defining a pdata format that is a subset of JFR and that could be used for processing when needed. There should also be a pass-through mode that doesn't require going to a pdata format first, if that makes sense.
H
For those use cases they'll do that, but that last part cannot happen if the data is opaque.
F
Opaque? Well, there can be an envelope, right? So you use an envelope: here are your tags, here's some other information, some metadata about the profile, and here's the blob of profile data. So I would imagine this to be wrapped in an envelope whose fields are well understood by the Collector. Sorry, I should maybe have tried to specify that.
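The envelope idea described above might look roughly like this in Go: envelope fields the Collector understands and can route or batch on, with the profile body left as an opaque blob. All names here are hypothetical illustrations, not an actual protocol definition.

```go
package main

import "fmt"

// Envelope sketches the idea: the Collector understands the envelope
// fields (attributes, format tag) and can route or batch on them,
// while the profile body stays an opaque blob it never decodes.
type Envelope struct {
	Attributes map[string]string // e.g. service.name, understood by the Collector
	Format     string            // payload format tag, e.g. "jfr" (hypothetical)
	Body       []byte            // opaque profile bytes, passed through unmodified
}

// routeByService groups envelopes per service using only envelope
// fields, demonstrating pass-through routing without decoding Body.
func routeByService(envs []Envelope) map[string][]Envelope {
	out := map[string][]Envelope{}
	for _, e := range envs {
		svc := e.Attributes["service.name"]
		out[svc] = append(out[svc], e)
	}
	return out
}

func main() {
	routed := routeByService([]Envelope{
		{Attributes: map[string]string{"service.name": "checkout"}, Format: "jfr", Body: []byte{1}},
		{Attributes: map[string]string{"service.name": "checkout"}, Format: "pprof", Body: []byte{2}},
		{Attributes: map[string]string{"service.name": "search"}, Format: "jfr", Body: []byte{3}},
	})
	fmt.Println(len(routed["checkout"]), len(routed["search"]))
}
```

This is also where H's objection bites: any operation that needs to look inside `Body` (transforming samples, redacting fields) is impossible in this mode, so the envelope only helps for routing, batching, and metadata enrichment.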
D
So I think the Collector is still useful as a normalizer, right? You have it deployed always, and maybe you're sending the data to a vendor which supports receiving the data in exactly the same format as your sources emit, in which case there is no need for a conversion, and that's today. Tomorrow you switch vendors and you reconfigure just the Collector: you use a different exporter to point it to a different vendor, and now a conversion happens.
D
I'm saying there is value in having one standard reference architecture that you always deploy. You don't need to think about the specifics of your situation; it all goes through the Collector, and the Collector is smart enough to be efficient regardless of what format you use. It's kind of an idealized goal, but I think it's probably even achievable. That's what I was referring to when I said I was sketching a possible solution, I guess.
D
The bigger question here is the use of statefulness. I would want to see a justification for that: is it really necessary or not? And then there's the orthogonal second question that you're raising, and I completely agree with you there, by the way. I'm not saying you're wrong; I'm saying I see some value in having the Collector do efficient pass-through of custom formats. Maybe that's not a bad idea.
F
Yep. Then maybe the next step is that we'll absorb all this information in the group, and one thing we'll definitely do is capture some benchmarks that show the difference in bandwidth requirements for something stateful.

I would also want to add: when you do that apples-to-apples comparison, please also take a look at typical usage of metrics and traces etc., because you said that one gigabyte per month is a big footprint for profiling. Compared to tracing, for example, where it can be terabytes of data even per day for typical usages, so take that into account.
H
Yeah, and a last comment from me about this: if you want to support multiple types, as I said, it's going to be very hard to do transformations or any changes to the data, which is one of the main jobs of the Collector, to allow people to enhance the data and to transform the data. Okay, sure, to your point, you can do it on that small common subset; that's possible. But on any other type of data it's almost impossible to do.
J
I think everyone left. Okay, see you guys.