Cloud Native Computing Foundation Observability Special Interest Group, 2 Mar 2021

From YouTube: CNCF SIG Observability 2021 03 02

A

Hey Steve, how are you doing?

B

I am well, how are you?

C

Not too bad, not too.

D

Richie said he might be a couple of minutes late.

A

Uh, Constance, you're here in the wrong section; you need to add yourself further up, yeah.

D

That's me. That's me. Sorry.

A

She was on the call.

B

But she wasn't added.

D

To the attendee list last week, so I just fixed that too.

B

We're in that Google Doc.

D

And adding attendees.

E

Michael, at this point in the morning I don't function, so, like Steve knows, he's just going to start typing things.

A

You have that excuse, but here it's 5 p.m., so I don't have that excuse.

A

You're just adapted.

E

To our time zone, that's all. You're being very, you know, very caring about that. That's true.

A

True, yeah, very empathetic about it. I like that.

D

Anyone following the Ambassador rename for CNCF?

E

There's a rename for Ambassador?

D

Well, so Ambassador is trying to donate its component to CNCF, but the company used to be called something else, and the company renamed themselves to Ambassador. As a result, they can't upload a component called Ambassador for IP reasons, so they had to rename the component that they're uploading, and it's now called Emissary Ingress, and I'm not sure how I feel about it.

G

So the company was called something else. They created an open source project called Ambassador that became popular, and the company became called Ambassador. Now, if they want to go into the CNCF, the open source project that they're now named after needs to be renamed. That's funny.

H

Emissary Ingress just rolls right off the tongue, so I like it.

D

Has the best zoom name.

G

He's thanos spanish.

A

Yeah, please don't snap.

H

Thanos is the one with the glove with all the little pretty rocks, is that the one?

G

Yeah, I think, he's really into jewelry.

H

He really loves that glove. What's going on.

I

The project was actually named because of that, right?

G

I would assume yeah.

I

Delete all prometheus with us.

A

Oh, I thought it was the other way around. I thought marvel named it after ye.

A

I wonder if Richie is joining us. I think he had something. Yeah, he said he will be running late. I think we might need to start otherwise. Sure, yeah.

K

Yeah, exactly. Okay, so hello, everyone, and let's go with the agenda. The first question would be: what order makes sense? There are some quick topics that we could potentially get to first.

K

What do you think about doing that?

D

Yeah, that probably makes sense, because otherwise the OTel stuff's probably going to take up most of the call. So if the other ones are kind of quick, let's try to jump through them.

K

Okay, so the next one is Arthur's: an alternative for streaming API monitoring. How much time do you think you would need for that?

I

Let's say, I think this one would be better to leave to the next meeting. I was expecting some colleagues to join me, but they couldn't join today.

K

Okay, moving to the next one. Then we have a quick check-up on the observability white paper with Simone and you.

L

Yeah, so maybe I'll turn on my camera when I speak. We have parked this paper for quite a long time. We talked about it in January, and it started quite well, so with the beginning of the paper I'm actually quite okay with the content there. But then we started having.

L

Because we were not sure about the structure. It was basically Arthur who did most of the work here, so credit where it's due.

L

But then he didn't have more time to continue editing the paper, and I think we should continue and have one version for people here to review.

L

So this has been abandoned a little bit, and I think we should take it up. Some people have more experience in some areas than others, maybe more in application, others more in infrastructure. So maybe we should share small sections with people who have more experience in different areas, and then the load is not too much. I mean, writing one, two, three paragraphs and then somebody stitching the text together is not something big to ask.

I

I think someone needs to step up as some sort of leader. I was trying to do that some months ago when I was unemployed, but since I'm now employed, I have limited time.

L

Yeah, maybe you can share the link in the chat, so people who were not here before can open the article. Yes, thank you.

K

Okay, I guess we are looking for someone unemployed, then.

I

Someone that has, I mean, someone that is willing to do the work.

L

I mean, calling for contributions, stitching the text together, and trying to get a readable version, I can do that. But I cannot be the main person filling the gaps that we currently have.

L

We have some sections, at least where we start on data visualization and exploration, that are basically empty. There are some names that were written here as candidates. I think Michael was one of them, for use cases. Arthur was also here, writing something about a beginner's approach. But there are some other sections that are still empty.

K

No, I think that's a good direction, right? First of all, there is no expectation to finish it super quickly. It's community driven, so it's okay for some periods to take longer. I would see this as a community-driven piece of documentation about observability. So I think we just need to announce it more, maybe split out the interesting topics that have gaps in them, and try to ask people who might be interested personally as well, so proposing candidates.

K

I think that's a really nice solution. But yeah, Arthur and Simone, there is no expectation that you will write that on your own. That's definitely not what's happening. So I would say announce it better and try to gather more people on those, and one way would be to also make sure people are motivated.

K

So that some kind of goal of this document is well established. Like, you know, this is why we're doing this, for example.

L

Yeah, so the initial motivation of this document was basically to consolidate, to have something that comes from CNCF when somebody talks about observability at a high level, but also going a little bit more into details, like use cases, or your perspective if you're an application developer versus an infrastructure engineer dealing with observability, and some other things that are happening in the community as well.

L

So we don't have a very fixed structure there, but I would very much like to have some input from people here who are working in various areas to make this document happen.

L

And when you talk about announcements, are you talking about using our Slack channel, or going wider than that, actually?

K

Yeah, I would mention this more widely. We can definitely make sure this has some structure and goes in a good direction, but we can also ask on Twitter and say, hey, join our work group, and just make it more known. I think it's only a few people, or at least only this observability SIG, that is seeing it, and having more traction will give people more motivation to also contribute.

K

So that's my idea. That's what usually worked before.

A

There's definitely interest from the wider community as well. I'm working on the Chinese version of the observability community, and they also expressed interest in contributing there and then subsequently translating it into Mandarin. So I don't know what the best way is, a tweet or, you know.

A

Maybe we can even do a webinar or whatever on it, but I agree we should definitely broaden, probably the message out there.

M

I know I have just jumped in, but from the agenda, or the initial thing from the working group, we said that the canonical places are Slack and the mailing list, with the mailing list being a must and Slack being recommended. That being said, I do think that we would benefit from also having more public channels. So I think we need to split between people who take an active interest in working within and moving within the SIG, and the external communication of the SIG. But again, I'm jumping in in the middle, because I had a leadership thing at Grafana which ran over, and I'm sorry. So I don't know if that was helpful or complete hogwash for the context of right now.

I

I think it was worth it. We were not thinking about mailing lists. That's a good idea.

L

Are there other groups in the observability realm that would be interested here as well, Richie, that you have been involved with? Or others as well, OpenTelemetry or some other folks that are in other groups but are not here?

M

I mean, there's a bit. We have Prometheus and the wider Prometheus community, which we can leverage for stuff. We have OpenTelemetry.

M

There are some friendly conferences which we can also try and put this to. The main channel, I think, is just to talk to CNCF, and I can take that action item to see how they want to amplify stuff which we publish.

K

I can share with the Thanos community, Prometheus, Jaeger. Like, totally, it's an observability topic.

L

Yeah, I mean, at least for the tracing parts. I forgot his name now, but the guy from Jaeger, he could do so. I mean, he could at least read something here and say this is rubbish, or give some pointers. That would be interesting.

D

Yeah, Yuri. Yeah, definitely.

L

Yuri is his name, right.

L

He doesn't need to write it himself, but if he gives some shape that we can follow, that helps.

I

Simone, do you want to talk about this more in depth on Slack later? Yeah, okay.

L

But the mailing list and other folks in the observability groups that are not here, that would probably be my first bet before we go on Twitter with, let's say, a document that is not in great shape yet. I would go on Twitter to call for people once it's something that doesn't have as many gaps as we have now.

L

You're okay doing that?

M

I think getting feedback early is good. So as long as we communicate clearly at what stage any particular thing is, we should try and be aggressive in getting feedback early. But I'm also doing a little bit of a time check, because we spent, I think.

M

Yeah, so just to have the one thing, because I do expect that finishing the due diligence for OpenTelemetry will hopefully take the rest: nna, who might or might not be on the call, took a stab at writing an end-user guide about distributed tracing. So just look at it and give feedback if possible, and I think that's already it for what we're talking about there. So let's continue with the OpenTelemetry due diligence.

C

Steve is human, see, that's perfect, yep! I'm here.

D

All right, here we go again. Welcome, folks. Number three: Richie, do you still have the action item here? I mean, we talked about it on the TOC call, at least Bartek and I were there. Do you want to follow up on this one, or what are the next steps for number three?

M

I failed to follow. What do you mean, on the question of sub-projects? Yes. No, I failed to email them, as you probably noticed. Did any guidance come from the TOC call?

B

What's your take?

K

It's a good question. I think there was no clear decision. There was one opinion, which was that it's important to make sure that incubation is a high bar, so we expect things to be kind of stable. But others mentioned other aspects, where there might be communities where some features are alpha and the project is graduated, so it's fine. And then I kind of navigated that, you know.

K

Metrics and logging are important parts of the observability space, and missing those is a major part of OpenTelemetry. However, Steve, you pointed out that there are so many other aspects, like SDKs.

K

The whole collector stuff is many, many things. And there were two interesting experiences from Argo, I guess, and to be honest, I didn't get the direction of it. Was it that yes, if there's a movement behind it, if there is a strong community that we can trust, then even those big chunks of unfinished work will be worth a recommendation?

K

I think that was the message, but I might be wrong, Steve.

M

Maybe then, to take a step back, would it make sense at the SIG level to just say that there needs to be a decision from the TOC, and just bounce this out of the SIG level? Because, I mean.

D

Yeah, okay, I think that makes sense. The way I read the Argo thing is: it looks like Argo's going up for graduation and has a similar problem, in that it's a big project. There are plenty of aspects that are experimental or alpha or whatever you want to call it, and they too have to deal with it. So personally, I don't think that SIG Observability needs to solve this problem. Just raise it to the TOC, and the TOC needs to make that call.

M

Bartek, would you also be fine with this?

K

Yeah, I'm happy with that. It looks like the community is kind of happy with this approach as well.

M

Yeah, okay, so let's revisit that point, I guess. Also, we should close all the old comments, ideally during the call, so we have a paper trail of them being closed.

M

We leave it as such: "requests a decision from CNCF TOC". So can we just leave it, or should we strengthen that?

M

Okay, how about.

M

Or can we just mark it in a different color, put it on top that we request this thing, and move on? Because we will not resolve this within this call anyway, so I would much rather optimize for speed within the document.

M

Okay, so call for consensus: I'm marking this orange, and I'll also copy it to the top of the document as a reference, and then we just bounce it to the TOC. All agreed? Anyone disagreeing?

D

Good. Regarding the comment above, since we do want to clear things out: we have the adopters page updated. There was a request around adding different signals. That's a little bit harder, because adoption is from a client library perspective, and I've already received feedback from the adopters.

D

Saying: look, we're not going to keep updating that page every single time we change something in our environment. So I'd like to propose the update that I basically wrote, which is that, given tracing is stable, instrumentation library adoption is for the tracing signal today, and collector adoption is for tracing and metrics today. So I've clarified where we are without actually updating that page, because I think it would be a one-time update that's constantly out of sync. Adoption is from the components perspective, not necessarily from the signals perspective.

N

Yes, steve, that's a good point.

K

Yeah that's helpful. Thank you, steve.

D

So, any objections to resolving this comment?

M

Can we look at the table just once more? Because I'm looking at this from your link, but I'm not.

D

Happy to. That's one. Okay, there are still a few on the adopters page that I haven't been able to track down yet to update their specific areas, but I will get that done once I hear back.

M

So, just from the formal standpoint, does it make sense that you also just say this is current as of a timestamp, so it's clear that if it's out of date, it's not an actual problem? Just so it's clear as of what time the current state is given.

D

So I checked adopter pages for other CNCF projects. No one has such a mark. I would just use the history: you can see when things were committed, and you can go back and check it.

M

Okay, so as we have the comment that we need to review this, we at least need to make a call for consensus on whether that is a way forward. I think yes. So should we resolve that? Or let me rephrase: unless there are objections, I will just close the comment on the basis of what Steve just said. All agreed? Anyone disagreeing?

D

Very good, awesome. Next one: there's a question about forked repositories and such. Everything in the adopters page is adopting upstream. There's no easy way to track this aspect either. I mean, I'm open to suggestions. Even when vendors are listed, vendors are adopting it internally as well. It doesn't necessarily mean it's a distribution.

D

The solution here, long term, is that there will be guidance from the OTel governance committee on distributions, and it'll be a lot clearer. To date, from the adopters page perspective, the assumption is you're using upstream.

M

But there are distributions. So your statement is that all distributions are equivalent, or considered equivalent, to upstream as of today. Is that it?

D

So if you go to the community page, it lists vendor support for OpenTelemetry. Here are all the vendors, and if they have a distribution, links to those are supplied, or if they just have exporters, it links to the exporter specifically. The goal long term is that OTel will likely certify distributions, just like Kubernetes does today. It's not doing that currently.

N

Yeah, I mean, just to add to Steve's point: we are working on the requirements for certification of distributions, and we'll be publishing that clearly, keeping in mind the similar guidelines that Kubernetes also has.

M

So as Bartek raised this, and as fish is not on the call: Bartek, would you be fine if we just added this to that section, as just said, and move on?

K

Yeah, I would be curious, can you tell us quickly, roughly, what the certification would look like? Maybe I'm not familiar with the Kubernetes one. The concern is that those distributions can be anything right now and just claim OpenTelemetry's incubation stage, and they will do anything inside there, and we don't have any control, because they're some other projects. So I wonder, what are the measurements?

N

I mean, some of the measurements, again, are obviously not only clear guidelines on what is integrated from a component standpoint, whether that's the collector component, so that's Prometheus end-to-end support, or other components which are in the SDKs, that is, the library implementations, but also stability guarantees, whether those are performance guarantees, testing guarantees, or support and maintainability guarantees for long-term support, and really requirements.

N

You know, itemizing each one of those and then being compliant with the baseline that is set up by the project.

D

Yeah, and again, we're not trailblazing here. Here's the Kubernetes one; it's a GitHub repo. You can take a look at the instructions. They have automated processes that you run that make sure that your distribution conforms to the guidelines, and it publishes results. We're not going to invent something new here; we're going to follow something very similar to this.

M

So we can just write one or two sentences into that section, and that would be enough for you, Bartek? Basically, we don't have to discuss in detail how it's done, just that it is like Kubernetes.

K

Yeah, totally. I mean, I don't want to block too much. No, this definitely helped. It gives me some.

K

There is some kind of constraint on those.

G

Also, just to add a bit of color, from working at two vendors on this project previously and talking to the others: because the project is so focused on data collection, I think most of the vendors and contributors are racing as fast as they can to have as little proprietary code as possible, trying to push as much as they can into the core OSS so that they don't have to continue maintaining it and can share it with everyone else.

G

If this was a project that had a back end for processing, I think the incentives there would be very different. But because it's just data collection, most of the vendors don't want to specialize in data collection, and they honestly do not want to have their own data collection systems long term.

N

Yep absolutely.

D

Yeah, I added a quick one sentence with the link. If that works for folks.

K

There were mostly opinions from me and the OpenTelemetry guests here, and I invited Peter. Maybe, Peter, you want to give some, I don't know, what do you think about that? Because I feel like you've had a kind of nice experience with monitoring and collection stuff already.

H

I can say a few words here. I'm here because Bartek and I were having a conversation about this space, just kind of informally, and we touched on some of this stuff. So I'm here, I guess, as sort of an outside perspective. I don't know if this is actually what anyone wants or needs, but I'll happily summarize just a few things: when I learned about this whole initiative, the points that got raised in my mind.

H

So, just a bit of background: I was for a long time deep into this stuff in the Go world and the microservice universe, and I actually can claim that I was the one who invented the three pillars of observability, if you can believe it, in one fateful meeting with some key players back in the day. So anyway, accolades, check.

H

What is interesting to me about this whole conversation is what it reveals about the role that the CNCF plays as it selects projects to put little gold stars on, right? Because my understanding was about the things that it lifts up into incubation, and what are the other statuses that there are? There's, like, approved, vetted, sandbox.

K

Incubation graduation yeah.

H

Graduation, yes. So my understanding was that as projects move through those stages and get check marks from the CNCF, it is because they have proven themselves in the developer space. They're in use. They are de facto standards for these things. Maybe not all of them have to be de facto standards, but basically that's what the CNCF is doing: saying, hey, these things are good.

H

You can trust these things. But what it feels like is happening here is that there is a collection of expressed user needs that are real and need solutions, and the OpenTelemetry group is collecting them and then trying to suggest products that solve them in various ways, ultimately solving all of them, but it hasn't yet done so. And so it feels like either it's super premature, or it reflects that the CNCF isn't actually about what I thought.

H

The projects that it lifts up, the properties of the things: it's actually something else. It's actually, like, maybe.

H

Building solutions that its customer companies, or whoever is following the CNCF, need. Maybe it's that, and maybe that's okay. I don't know, but that's not what I understood. Anyway, this is my perspective. And maybe the most concrete thing, I should have maybe opened with this, is that before Prometheus was in that process.

H

I knew tons of people who used Prometheus, right, in my space, in my role. But I don't know anyone who uses OpenTelemetry. They know of it, but I haven't met anyone who has ditched their Prometheus instrumentation library for OTel. A couple have tried, but it just hasn't happened yet. So that's what's interesting to me, and that's, I don't know, maybe worth discussion. Maybe it's all already been said. I don't know, what do you think?

N

I think, Peter, again, you bring up some good points, but I would say that, having rolled out the AWS distribution for OpenTelemetry, we have several customers who are using not only OpenTelemetry as a commodity agent, if you will, but also Prometheus, and it is a win-win for both ecosystems. It's not a replacement, necessarily.

N

It's also several customers, large customers, moving towards building out their Kubernetes-based ecosystems, as well as other compute support. And I don't think it is fair to say that, because there are several customers, and I am happy to bring you a list if needed.

N

Also, understanding.

H

Is that they're all private, though, is that correct? Like, no corporate customers, yeah.

E

No, no, that's not true.

H

From the aws point of view,.

E

But this is from the adopters page. And people are wondering who the people you talk to are, because I think one thing that we need to address about OpenTelemetry and what CNCF has actually done is this: CNCF is used to only talking to niche cloud native companies, right, who are used to implementing everything on their own. But where CNCF is actually going is bridging the gap for everyone who used to be on-prem, and that's where OpenTelemetry is actually making the biggest inroads.

E

It's bringing in all these companies. Because if you look at the attendees at KubeCon now, and a lot of the open source summits, there are a lot of banking companies, a lot of companies who are still kind of doing on-prem or moving slowly, yeah.

A

They're going to.

E

Be actually, right, and they're going to be not using their own implementation of things, but trying to find libraries. And especially when it comes to the observability space, okay, yes, coming from Splunk, we are, quote-unquote, you can call us evil overlords, right, but a lot of our customers don't have time to implement their own implementation. They're going to want to use libraries because, from their point of view, it's not worth it, and that's like.

E

Our adopters list is actually showing that we have companies that are using our instrumentation and our collector.

D

And the other thing I'd add is the criteria for incubation. I don't know if everyone's familiar with the CNCF terms, but there are no incubation criteria about being the de facto standard, replacing other standards, or being the only solution. None of those are criteria for incubation. They're listed here.

H

I didn't mean to imply, like, the only choice or whatever, but wide adoption. And the question to me was, who am I talking to, right? Which is completely fair. I totally, everyone has their own little lens through which they.

H

Yeah, exactly. So the people I talk to are largely open source communities and adjacent Slack rooms and mailing lists and GitHub issues, so stuff that is public in that sense, often products of corporate environments, but largely not. And I think, in equal measure, a lot of small, medium, and some large startups and companies that I consult for, often in a Go capacity, often in an observability capacity.

E

Actually, what's interesting, though, is that Go adoption tends to be more associated with cloud-native companies and not with those who are on-prem, right. And as was shown from part of our adoption list, it's actually people who are bridging the gap, coming from on-prem and hybrid worlds.

H

Right, the people for whom OpenStack has completely failed, so on to the next thing. Here we go, let's try it again. Yeah, so I have no visibility into them, and if that is your user base, then I can't speak to it, right, because I have no idea.

E

Well, it's part of it, right? That adopters page shows we have people like Shopify, a massive cloud native company.

A

Sorry, you mentioned one thing which is a little bit worrisome for me, and that is standards. CNCF is not a standardization body, right?

A

It's not a standardization body like W3C or IETF. We are not creating, you know, declaring here that this is the standard. I feel, based on the conversations that I have with our customers.

A

I can vouch that there is certainly a huge interest there, especially around standardization across different signal types. And the clear mandate that we have here in the SIG is to provide that due diligence in terms of the requirements that we see in the process, and that is adoption, and that, to me, looking at what is there, is clearly given. I really don't want to get into that standardization discussion, saying, you know, there is a standard, and is OTel the all-dominating standard or not.

A

We have already seen.

H

I'm sorry for bringing the word into the conversation right, yeah, it's it's! It's not right!.

A

I just, let's stay clear of this "standard" there. It is a specification, and there is clear uptake and adoption there. Yeah, that's what I see.

H

But okay, so maybe this is an important point: does that uptake have to be organic, or can it be sort of outward, sales driven?

F

So it doesn't matter.

M

Document it. It is much more about the direction, and I don't have any problem letting this discussion continue. I just want us to use the time deliberately, so we can either focus back on the document or discuss this more. Both are totally fine with me. Maybe one point, and I obviously have like 20 hats, and I'm very much torn.

M

From my perspective, most Prometheus and Prometheus ecosystem users are not truly cloud native. I would say more of them are in the brownfield data center space, just because of sheer numbers. But oh no, I'm already going into discussion. So should we continue with the document, or should we discuss the points Peter brought up? Both are fine with me. I just wanted us to be deliberate here.

D

Yeah, I guess my comment would be: if it's directly related to one of the questions in the due diligence that requires consensus, then I think we should absolutely drill into it.

D

If not, then we should probably either create another doc, like Bartek did, or schedule another time to have that conversation, because I think the goal is to try to get through this doc from a consensus perspective. If it's related, like if it's an incubation criteria thing and we're not going to meet that criteria for whatever reason, that's the primary focus I think we should have for this audience.

M

General feeling of the room.

K

I mean, I still think, you know: does the project have production deployments that are high quality and high velocity? We kind of navigate that: for tracing, yes, but for the rest, no, so it's worth discussing. But that's only my opinion.

E

But based on what the TOC said, that's a good point, isn't it.

M

So should we as a as a possible consensus, should we then just say that we are that yes tracing has the adoption. uh The rest is.

D

Coming and can we do that? Can we do it from a signals perspective like do we have consensus that tracing is adopted, and maybe we don't have consensus that metrics and logs are and explicitly call that out here.

M

That's where I'm trying to go, that we have a consensus position of everyone in this call here, that that is a statement, because I, like again, I have 20 different hats here, so I'm very deliberate about which one I'm currently wearing, um but for the tracing aspect, I do think there is no debate that there is wide adoption.

M

So, okay, then, let's okay, let's try this and then we are overwriting the other thing.

M

So bear with me, this is just the first step, and Steve, I think you need to reload your screen, because I'm not seeing what I'm typing, yeah.

D

It's paused. Sorry, there you go.

M

Perfect, thank you. um That's similar to how the Prometheus things worked. If you saw the recording there, we nailed down one consensus, and then we built possible consensus on top of it.

M

Of course, that has shown to take the least time to discuss, um so as a first point of consensus, new consensus, because we're actually revisiting, but okay: SIG Observability has consensus that there is wide and organic adoption of OpenTelemetry tracing. All agreed? Anyone disagreeing? Okay, just for the record, I'm uppercasing that one T.

M

Okay! So should we make a what's a positive statement here.

M

Steve, Bartek, do you have any wording.

D

Suggestions, you want to basically note something about metrics and logs, without making it sound negative. Is that the goal? Yes, because I yeah, I don't believe.

M

In making, yeah, precisely, but while still retaining the concern of Bartek and Peter, yeah.

K

I mean yeah, I.

K

That, maybe, from my side quickly, yeah, metrics and logging are not in an incubation kind of quality state.

E

That isn't, but, based on what the TOC said, right, is that there could be parts of the project that aren't at the same level.

M

With everything in the statement about what we, that's not making a statement about the whole project, this is just trying to split out the subcomponents, so we in the SIG can find a consensus and then just bounce this to the TOC and they can decide whatever, because I don't think we will be making progress otherwise, and I would like to finish at some.

D

Point yeah, I guess one suggestion I.

D

Have is that there are, I kind of raised this in the TOC call, so some of you were on there, some of you were not, so I think I'll just repeat it for the entire audience: from an OpenTelemetry project perspective, all of the SIGs have components. Those components are adopted or not adopted. The primary components today are the instrumentation uh libraries, which are language specific, and the collector.

D

They are in our reference implementation on top of the specification, so you have a dependency on the specification to make that work. All of those components are then tied to signals. uh I feel like the last two meetings uh in this audience. We've been focusing a lot on the signals, but it says: does the project have production grade deployments that are high quality and high velocity for all of the components?

D

The answer is yes, the nuance there is that that is a yes for tracing signals, and I think that's what we've all kind of gotten hung up on so far, is that, well, what about the metrics and logging signals, and then there's kind of the back and forth on: does that matter or not? So maybe one way to approach this is: do we have consensus that all of the components in OpenTelemetry are adopted?

D

uh First, then, we could make a note about the tracing signal specifically, which is what the line in green currently captures and then the final one would just be more of a note. uh Metrics and logs are not stable yet and are not part of this consensus, or something like just articulate that there are additional signals that exist, but that's separate from the project as a whole right. You can still adopt and be in production, regardless of whether or not you have those other signals.

K

Right, and I think why I'm pushing in this direction is that you are right, those components might be adopted, right, but it's still only one signal that is implemented on those components. Like the collector, um the most kind of implemented base and the most adopted, why people use essentially the SDK and collector is mostly for tracing capabilities, right. So all of this is across all the components.

N

I don't, I don't agree with you, uh because, you know, what I can say is that we have built uh instrumentation around AWS monitoring back-ends such as CloudWatch, which handles metrics, or uh Prometheus, our managed service for Prometheus, and the collector components that uh exist today, are being added, are being used by customers for, you know, again supporting metrics.

N

Now there is a clear roadmap for metrics, uh which is being currently worked on by the project, and a very clear roadmap and milestones that, you know, have been established in the past few months and work is in progress, and so is that happening for logs. So to say that there is no support is inaccurate.

K

Right there are a couple of plugins for metrics yeah.

G

Well, there's more than a few right like there's.

G

Yeah, like Google, for example, when they have customers who come in with Windows VMs, they tell them to use the OpenTelemetry collector, and they went and added like Windows uh system metrics, right, like it's more than just a handful.

M

Is there an overview or like a list? I feel as if part of our problem here is we are discussing qualitative statements uh a lot and not going towards quantitative statements.

M

So is there, for example, a list, an overview, a matrix of metric signals and of logging signals being used in production, because I do think that this would alleviate part of Bartek's and Peter's concerns.

M

Yeah, but I'm going slowly here, I'm cat herding hasn't been on my or has been on. My own is.

H

It possible to deal with me yeah. I I understand no.

M

No, that's not the statement about you. It's just a statement about spending two decades in open source.

M

And always trying to get people to to agree um for those two decades.

N

Richard, again, I agree with you that, you know, we are happy to provide any quantitative, you know, measures or whatever that means, if we can define that clearly, and uh you know, the specification, because the implementations, as Steve pointed out, follow the specification, there is certainly a compatibility uh matrix or a matrix of features, which, you know, actually can be highlighted.

N

If that's helpful in terms of support.

K

Right, okay, I think, you know, yes, I mean definitely the number would be super nice, but my worry is that, you know, I always from the beginning was, you know, excited to see OpenTelemetry, you know, uh protocols to be, you know, kind of de facto standards, right. We know what to use for logging, for replicating.

K

You know, log lines across systems and from client to the agent, let's say, the same for traces, the same for metrics, and suddenly we totally are on, I mean, OpenTelemetry looks like it's very unfocused on this point, because we are adding collectors with um tons of APIs and we kind of quantify the quality of, if it's incubation, graduation, by amount of APIs. This is not healthy for the community.

E

So actually it sounds like your comments are more that it didn't meet your expectations, right, like, as you said, you had these expectations for OpenTelemetry, it's not meeting them, but that shouldn't be blocking incubation.

H

Well, it's let me chime in just one sentence here: it's like if we have more projects in this pattern of you know the maximal feature set and the maximal set of integration like is that going to take the cncf to a good place or a bad place and like for me, there's clearly so.

G

I just want to clarify something we talked about a few meetings ago, like we should consider the logging part of OpenTelemetry experimental, like the community has very purposely sort of put it off in a thing where we're slow-walking it and very slowly working on it, because we're very focused on tracing. Tracing is now GA, metrics is next, but I don't want to look at our ambition of eventually bringing in some logging components and doing some work in logging.

G

I I'm worried that that's really distracting from the core project, which was sdks collectors and protocols for traces and metrics.

G

The traces tracing yes right.

E

And we're actually at that point, if we reach incubation, that was.

E

Goal is to deprecate OpenTracing, and by us actually not reaching incubation, this is impacting the CNCF, where OpenTracing hasn't been maintained in two years, and this is also impacting all those users.

D

Yeah, the other thing I'd add is, I think, we're also blurring the lines between incubation and graduation. This is not a graduation conversation if it was. I think many of the points being raised here are completely valid. Like you, don't have any metric support, that's stable. Yet how can you graduate, but that's not what we're talking about we're talking about incubation and the criteria for that is having production deployments that are high quality.

D

We do, it doesn't say it has to be high quality for every single thing like it is high quality to the point where customers today have adopted in production environments. So I'd like to make a proposal. I I made an update here uh and I put three separate bullets that we can try to talk about to see. If maybe this gets us closer to a consensus, uh so I'll read it aloud just so we can talk about it live just.

M

Some you deleted the other thing.

D

uh It's this line, it's the exact same line. I just moved tracing from the end to here, but I I didn't make it green, because I think we should talk about it again.

M

I agreed. Just as a point of order: everyone who touches a green line, please uh talk about it before you touch the green line because they are present, yeah. It does, no, no, it's fine. We caught it, but this is like, my whole system is built on the green lines uh never ever being touched, which is why I made it explicit. Also you added "signal" in OTel, so that's my point, like I literally talked about making the T uppercase too, yeah.

M

Please don't but yeah.

D

Yeah so I removed the green so that way we don't even have consensus on it, uh but I won't touch again sorry about that. uh So I put a paragraph here at the top uh that people should read real quick to see. If there's any disagreements with that, and then we can talk about three.

M

The one point, and this is without my hat: as most of you will know, I usually focus on the wire formats first, because this allows you to freely interchange back-ends and libraries. I think they're the underlying thing of everything else, but I'm also fine with this order. The other hat back on.

D

Yeah, and from a stability perspective, that's how OTel kind of treats it too. So the specification has to reach stability first, before the actual implementation can. The specification is where, like, that data model and wire formats live, um so we follow the same model. It's just, we kind of bucketed the whole thing underneath the signal, which makes it a little bit confusing here, um but okay, in the interest of time, let's try this, uh so basically, given that there are components, and there are signals that the components are being used for.

D

I try to distinguish between them, uh so maybe, Rich, you want to kind of lead it, if we want to go back to this one specifically first, um I.

M

Would actually suggest that we put the signal first, because the wire format obviously includes the data model and, as such, this is, as you said, the basis for the.

M

Rest. Okay, so, as this is a slightly changed line, um SIG Observability has consensus that there is wide and organic adoption of the tracing signal in OpenTelemetry. All agreed? Anyone disagreeing?

K

I agree on my side.

M

SIG Observability has consensus that there is some adoption of the metric signal in OpenTelemetry.

D

This one might need clarification, right, because the specification is not stable. uh As Alolita is pointing out, there are people using metrics in production today, though, so I'm not sure how to draw that distinction. uh I don't know, Bartek, if you have some suggestions on like maybe explicitly saying the specification is not stable, but there are production users of metrics. I don't know how you want to rephrase it.

K

I would still, yeah, I wonder, if we would just scope this incubation um bar on just metrics, are we happy as a community, as SIG Observability, to kind of put this label on metrics? And I'm not sure about that. Maybe, but.

D

That's demonstration yeah, that's interesting. uh I think it's worth a conversation about that. What if we do just have line, what right now lines one and line three do we have to say anything about metrics and logs?

D

Is that a requirement yeah yeah.

M

I mean, with all my hats on at the same time, um metrics, logs and traces unification in a single set of unified client libraries, collectors and a holistic design is the core goal of OpenTelemetry, as far as I understand it. As such, it should at least be looked at. I'm completely fine just saying: well, it's future work, and be done with it, but I think it should be mentioned.

M

And I I try to to have a positive statement baked in that's what I'm currently marking.

D

Oh nice, I'm good with that.

M

Okay, so let's try for this one. Also, we are way over time and I have a hard cut in two minutes. But let's try: SIG Observability has a concer... has consensus.

M

SIG Observability has consensus that there is some adoption of the metric signal in OpenTelemetry, but stability and compatibility with OpenMetrics and Prometheus has not been reached, but is planned. All agreed?

N

It's working, Richard, it's work in progress, right? I mean it's being actively.

C

Yeah yeah yeah, that's fair.

N

It's not only planned, it's also big yeah.

M

No you're right that and takes yeah makes sense.

M

I can't type when people watch me.

M

How about this.

M

Okay, so let's try again: SIG Observability has consensus that there is some adoption of the metric signal in OpenTelemetry, but stability and compatibility with OpenMetrics and Prometheus are work in progress, and SIG Observability takes note of the Prometheus working group within OpenTelemetry.

M

All agreed anyone.

C

Disagreeing nice.

M

I will change the green in a second.

C

Is this a fair statement and I'll make it non-green in a second.

D

Is "experimental" better than "future", because there is technically some amount of support for it already.

D

There's work support.

N

And there's ongoing work that is actually already in flight, with the Stanza logging agent being integrated.

K

I mean, we heard voices that it was, yeah, postponed and delayed so far, so.

E

But Bartek, as we said in the last meeting, we actually showed that there's a log SIG, right. We talked about this in the last meeting. There is a log SIG, that's actually being worked on, right, and so I'm kind of missing the point that, if we've shown that there is active work on it, how is that being viewed as being denied if there's actually.

K

Work in a doc, what do you mean? You shouldn't, like, Jonathan mentioned that he joined a meeting and there was nothing there.

G

That's no! No! No! No! No! No! No.

E

Jonah, I asked you for proof last time you didn't give us any proof.

G

Jonah, Jonah said he joined the community call that was deleted six months ago, but was still on his calendar, and no one was there.

G

I followed, Jonah started joining the maintainers meeting, which had replaced it six months ago.

G

But the log SIG does have people, it's.

N

A very active uh amount of work happening. uh I can send you, you know, specific projects that we have done uh as well, as, you know, the uh log data model as well as the Stanza, uh you know, component being integrated into the collector right now.

E

Also Jonah approved log specs, like he is actually viewed as participating in GitHub PRs. So that's like, I don't think we can use, like, Jonah's quote might have been misplaced at the time, but I don't think that's fair to use it anymore. He.

G

Mentioned the community meeting, it was someone had modified it and it went back on.

M

People's calendars, uh I I would like to avoid he said she said uh situations.

M

How about Alolita takes an action item on um putting some quantitative data on this, as far as, okay, perfect, and we will try and actually finish this the next time.

E

Can we schedule an ad hoc? This has taken two months.

E

Right, like this has impacted OTel's ability to actually be a part of KubeCon EU in terms of maintainer track, which actually impacts its ability to like get more participation.

K

You know, by the way, I heard that many OpenTelemetry talks were actually approved where others were.

E

As the KubeCon co-chair.

E

The person who spent all of last week approving these talks.

K

I mean I would I would love to know that, because I heard you know uh the other open, I mean yeah. This is like.

E

No, but there's a difference between maintainers tracks and actual talks accepted. Talks accepted, those are based on people who submitted things. OpenTelemetry as a sandbox project isn't allowed a maintainers track talk.

K

I understand, but I'm saying that.

K

And that's fair that, you know, OpenTelemetry talks were agreed and like kind of um approved more on the normal track, because they are not part of the maintainers track. There are so many rejections because Thanos and Prometheus and Jaeger and others are already on the maintainers track. Let's be fair. That.

E

Actually, isn't the actual process. I'm glad to walk you through the process, as a KubeCon co-chair who walks through choosing the talks, but what we do is, for observability uh there's a program committee that gives us the top 10 to 15 talks, and we go through, right, based on their rating, and then after, choose those things there.

K

Yeah yeah, I understand, I just want to, I think that would be fair, um that, you know, to give OpenTelemetry um a voice because there is no maintainers track. So.

E

But I'm telling you that's not how we do it. I mean, as.

E

Who chooses the talks? I'm telling you that's actually not how we do it. So I'm sorry, I don't know where, like, you're getting this information, but that's not how we do it. You can talk to Steven and me whenever you want, but that's not how.

M

We choose topics for KubeCon. I think you're discussing on the level of process and on the level of perceived outcome in parallel. So.

N

So Richard, would it be helpful to highlight, you know, what we are, uh what opportunities we are missing, uh also very quantitatively, you know, in terms of the project opportunities uh which are really important for the project, and, you know, help in the work that is being done on the project to make progress.

N

Qualitative discussion.

M

On the on the level of of extra pr- yes, no, I don't think that matters on the level of the due diligence. That being said, I I do note that we take quite some time to to discuss this so completely orthogonal from the former point. um I do think it would make sense to to try and speed things up and if um interjecting a meeting helps with that effort. That seems to make sense.

M

um So the same time slot next week is the OpenTelemetry metrics data model SIG, that's the only clash which I can see offhand, um but it's not my call to make. It should be of the call here. So can we do a show of hands for saying yes, no on, we lost like half of the people.

M

Next week, yeah, let me do gallery view just a second.

K

I mean, we can talk offline on the mailing list, yeah, we can definitely make it more often, yeah.

M

So everyone who's in favor of having this this call next week as an interjection. uh Please raise your hand.

A

We sent out by the email list as well. Yes,.

M

Of course, okay, anyone who's against.

M

Okay, so uh like for, for the recording, two thirds are in favor. No one is against. Does anyone want to abstain?

M

Perfect, so no abstentions, um so I'll talk to Amy to put an intermediate thing into the calendar, but Bartek's point is also valid. We should try and, as always, move more out of the calls and into the mailing list, into the document, blah blah blah. I would much prefer to just go through comments and such and just close stuff which has been discussed during the week and where consensus has been reached, than discussing everything in depth every single call.

M

Okay, thank you very much and see you in one week not two.

N

Thanks, thank you.

N