From YouTube: Kubernetes Resource Management WG 20180425
Description
Meeting Agenda:
https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A: All right, well, welcome everyone to the April 25th Resource Management Working Group meeting. Two topics on today's agenda, as well as some special guests from, I guess, SIG Scaling and SIG Instrumentation. So thank you. First up on the agenda was talking about feedback on the testing work, so I guess, Balaji, do you want to talk through that?
B: Sure. This is about some of the node performance features that people are planning to add in the Kubernetes ecosystem, and having some objective data on them. The main reason is that there are some low-level optimizations, or performance optimizations, that people can add in the kubelet, and you want to have a way to objectively measure the performance of these workloads with and without the feature enabled, for example. That's what this proposal is about. So there are mainly three parts to this proposal, or three components to this proposal.
B: The first part is the provisioner itself, as in provisioning a Kubernetes cluster; it could be on GCP, AWS, bare metal, or any other platform. The second part is the test runner itself: how are we going to run the test? The third part is the result aggregator and graph plotter.
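To make that three-way split concrete, here is a hypothetical Go sketch; these interfaces are invented for illustration and do not exist in the proposal or in Kubernetes:

```go
// Hypothetical interfaces restating the three components described above;
// all names here are invented, not taken from the proposal.
package benchmark

// Provisioner brings up (and tears down) a cluster on GCP, AWS, bare
// metal, or any other platform.
type Provisioner interface {
	Provision() (kubeconfig string, err error)
	Teardown() error
}

// Results holds whatever the runner measures, e.g. pod creation
// latencies or CPU usage samples, keyed by metric name.
type Results map[string][]float64

// Runner executes one workload twice, with the feature under test
// enabled and disabled, so the two result sets can be compared.
type Runner interface {
	Run(workload string, featureEnabled bool) (Results, error)
}

// Plotter aggregates results across runs and renders graphs.
type Plotter interface {
	Plot(all []Results) error
}
```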
So there are several options for these. What we did was take these options and discuss this proposal with the SIG Testing community, and the initial proposal...
B: I think it is discussed at the end here, where we have a roadmap. In phase 1, what we are planning to do is add this test runner to the node e2e subdirectory within core Kubernetes, and leverage test-infra as much as possible. We discussed these same options with SIG Testing. There were a few people there, one person from Red Hat and another person, and Tim St. Clair; they were the ones giving us the main feedback, and they think this is a good idea.
B: So the option that we are exploring is to just leverage the infrastructure that is already in place in test-infra and then publish the results. So, in case of... I don't know if people have seen this; actually, there is one more node perf dashboard first, but I don't think... yeah. So node-perf-dash has some kind of result aggregation and graph plotting; not sure why it's not loading. It has a certain set of graphs published there with respect to, like, pod creation latency, CPU usage, some of those metrics.
B: So we want to do something similar, for a start with single-node runs. The workloads that we are discussing are something Connor and I have been working on. I don't know if this is visible to everyone, but we did some preliminary benchmarking for the CPU manager, with and without the CPU manager feature.
B: We have some initial results for some workloads, which are like select kernels that perform some scientific operations; so we have some initial results from there. We are also currently working on getting results on neural network training; I have some Docker images here, but this is still pending. So we want to add these workloads for a start.
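For context on what such a CPU manager benchmark workload looks like: with the kubelet's static CPU manager policy, a Guaranteed-QoS pod requesting whole CPUs is pinned to exclusive cores. A minimal sketch follows; the image name is a placeholder, not one of the benchmark images mentioned in the meeting:

```go
// A minimal sketch of a benchmark workload pod: requests == limits with
// whole CPUs puts the pod in the Guaranteed QoS class, which makes it
// eligible for exclusive cores under the CPU manager static policy.
package bench

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func benchmarkPod() *corev1.Pod {
	cpus := resource.MustParse("4") // whole CPUs, not fractional
	mem := resource.MustParse("8Gi")
	res := corev1.ResourceList{corev1.ResourceCPU: cpus, corev1.ResourceMemory: mem}
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "cpu-bench"},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:  "kernel",
				Image: "example.com/scientific-kernel:latest", // placeholder
				Resources: corev1.ResourceRequirements{
					Requests: res,
					Limits:   res, // equal to requests => Guaranteed QoS
				},
			}},
		},
	}
}
```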
B: This framework that we are trying to build will also be extendable with other workloads; at least that's the plan. For a start, what we are trying to do is make something minimal, something pretty simple, where we use the existing mechanisms to provision something on GCP, run the tests using existing mechanisms such as test-infra, and then somehow plot it. At least that's the first phase, as I mentioned.
C: I think that makes sense. I recently tried to set up Kubernetes end-to-end tests out of tree, and that was a huge pain. That doesn't mean it's inherently painful, and maybe I was missing some easy hacks, but I spent a lot of time just trying to get e2e tests working outside of, like, developing e2e tests outside of the core Kubernetes repository. So it probably makes sense to start there.
B: So I have some experience running it on GCP and AWS, just running the e2e tests manually while I was working on the CPU manager e2e tests, the node e2e. So it works; not only does it work in the CI environment within Kubernetes, it also works in your own GCP cluster and AWS cluster, yeah.
A: I have a preference that we start with node e2e, and the cluster one, to me, is more interesting when you want to do stuff with storage, and I don't know if that's, like, in...
C: Okay, I mean, do we care? Like, I don't have any specific objections to doing it at the node level. I just want us to have a framework that requires low maintenance over time, and so if, like, you're investing a month or two getting node e2e to work, and then we have to reinvest at the cluster level in any case, I'd be bothered to have made that choice.
B: I also explored the cluster tests and can report back. The feedback from SIG Testing was that they were fine with this plan. We need their support because we might need to make some changes in test-infra. Ideally, we don't want to make any changes; you just add the test as a periodic job and then we'd be done. But if we need some additional features, we can add them there too.
C: The framework, depending on whether it lands in node e2e or the cluster-level tests, might become part of SIG Node. I'm suggesting, ideally, whichever test framework it is, it should be part of SIG Testing. So I feel like this might be something that's actually per test case rather than... oh yeah.
A: If we had a way of structuring it... I hate that, like, political structures need to then bleed into code structures, but if there is a way of differentiating, like, the framework from, like, the actual test, and then having an OWNERS file that associates the one test with the right people in the right SIG, that's probably good. But, I mean, from a SIG Node perspective, I feel that, like, we want to know the CPU manager is working well across releases. So, like, the testing information is critical, and, I mean, I have no objections to it.
A: Well, thanks, Balaji, and if there are no other questions, I guess we can move on to the next agenda item. So I think the next agenda item represents, like, what I think this working group should figure out best, which is trying to get answers to unclear technical choices that span SIGs. So, I guess, Solly and Frederic, you guys have put together a brief document on, like, what you felt SIG Instrumentation would like effective monitoring, at least device plugin monitoring, to look like. Do you guys want to walk through it?
D: Sure. Right, awesome. So the key objectives we wanted here were that we wanted to make metrics less centralized through the kubelet and through cAdvisor; we wanted to avoid...
A: And then, to make sure I understand, your question is about finding out where the data comes from. When you're separating the control plane versus the data plane, the argument there is saying: is it okay for the key components to have to call back to the kubelet to get the enriching data, and therefore it's in the control plane path, but not inside the data collection path? Is that the right way of phrasing your thinking about your question?
D: Like, ideally, right, in my mind, components would either be given, or have some way of obtaining, all the necessary information, right? So, like, if you have, say, the CRI, right: the CRI implementations actually, as far as I can tell, get enough information about the name of the pod that a particular container belongs to, through the pod sandbox metadata, to be able to label any given metric about a container with the container name, the pod name, and the namespace it belongs to, right? And so that information...
D
So
when
the
when
the
CRI
implementation
exposes
a
metric,
say
CPU
usage,
it
should
expose
it
with
those
labels
and
not
say
we
rely
on
being
proxy
through
the
cubelet
and
then
having
the
cube.
Let's
say:
oh,
this
is
container
ID
blah
blah
blah
blah
blah,
so
I
need
to
attach
Chladni
and
and
the
namespace
name
to
it.
Does
that
make
sense.
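A minimal sketch of that labeling scheme, assuming the Prometheus Go client; the metric name and helper are illustrative, not code from any actual CRI implementation:

```go
// Sketch: the runtime attaches the identity labels itself, using the pod
// sandbox metadata (pod name, namespace) it already holds, so nothing has
// to be joined back through the kubelet by container ID.
package runtimemetrics

import "github.com/prometheus/client_golang/prometheus"

var containerCPUUsage = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "container_cpu_usage_seconds", // illustrative name
		Help: "CPU time consumed by the container, as read from its cgroup.",
	},
	[]string{"container", "pod", "namespace"},
)

func init() { prometheus.MustRegister(containerCPUUsage) }

// recordUsage is a hypothetical helper the runtime would call with values
// taken from the CRI pod sandbox metadata for the container.
func recordUsage(container, pod, namespace string, seconds float64) {
	containerCPUUsage.WithLabelValues(container, pod, namespace).Set(seconds)
}
```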
D: So, let's see; in my mind, one of them is that you don't have the kubelet as a bottleneck for metrics collection. If you, for instance, want to point your metrics collection software at all of these individually, so that it can monitor whether metrics collection succeeded or failed individually on a per-component basis, you have that. It means that you don't have any extra work, basically, going on in the kubelet beyond what is needed. It eliminates potential sets of bugs where, like, the container runtime thought one thing, or the component thought one thing, based on the information it had, and maybe the kubelet had a slightly newer set of information, and so it accidentally attached the wrong set of labels. Those kinds of bugs don't happen, because all that identity information is being attached together.
J: It would also mean that the information basically leaks into the kubelet, where the kubelet needs to know, based on which labels, what kind of information to collect, and then we end up with a similar situation. It's not quite as bad, but similar to how cAdvisor is compiled into the kubelet today, only that we have application-specific logic to add additional metadata, and that's basically what we're trying to avoid here, right?
H: Like, for example, requiring the device plugin to expose this information, the pod ID, may not be great; there may be some other reasons today, yeah, because of which we don't want to expose this information in the API. Like, for example, in the device plugin API we have been holding off on using this information, because we kind of don't want to, because of some other reasons, right? So a monitoring pipeline that will only work if you expose this information may not be great.
C: You want the plugins to do as little as possible, as I said; like, have a really well-defined scope, and that way we can have some reasonable performance and portability across environments, for example. On the other hand, for monitoring and introspection and tracing, you would need as much information as you can get. That's sort of why I was asking initially if it can do just the minimal amount of work such that it enables literally every possible extension to do their own instrumentation, whether it's monitoring or logging or tracing, I mean.
D: I know in the past there has been some discussion of exposing some sort of, like, pod identity API or something off of the kubelet, so that things running on the node could do associations with different pod identities. I'm not personally a huge fan of that approach, but it could be one solution to this.
A: Yes, so if people agreed with this requirement, then obviously changes would need to be made to the plugin points. I mean, I agree with Solly on the CRI front that, like, people that are running container runtimes, you would monitor those container runtimes separately from the kubelet itself, at least for CRI runtimes that would be running out of tree.
A: Whether that would be containerd or CRI-O, and the list keeps growing, production deployments of those would still monitor that container runtime. And then I agree that the container runtimes have sufficient data to map usage information to the pod and container boundary. It's just the question of, like, whether that has to be done for every third party or not, and...
D: Avoiding having more custom logic in the kubelet to do, you know, like: oh, for device plugins we always need to add this information, or, you know, for network plugins we need to add this information, for storage plugins we need to add this information, before it explodes. You know, also, like, if we need to add extra information in the kubelet...
A: I think you want to be careful about limiting the design at the device level, because currently a device is always a positive-integer resource, but in the future it may be possible to share devices. I think just limiting it at the device level may have some limitations, and also, you know, people may want to export device-level metrics; they don't have to be associated with a container.
K: That was mostly on, like, how and what the metrics look like, the sort of format and stuff, and doesn't necessarily cover very much of, like, how different things are going to get the metadata. I am working on that, and things are still in progress. But in terms of abstraction, I think, as they explained, the right level is going to depend on the metric being collected, and I like a model like this, where anyone can sort of attach metrics to a pod or container.
D: However you like, right: like, if you want to expose information broken down by pod and pod container, but you also want to expose information by device in general, because the device is, like, shared across different containers or pods or something, right, you can choose to do that.
A: That becomes the decision of the device vendor itself, then, to decide, right? So if you start treating the device as virtualized to some degree, and you can do that split, then yes, you would need to figure out a way to provide pod and container metrics with that in mind. But, like, if you're a device vendor who knows that your devices are handed out exclusively, then you could also still provide pod and container metrics. And then, I guess, the kubelet today is very strongly assuming, for accelerator stats, that it is exclusive and container-scoped.
D: And so by moving this decision into the specific plugin... what I want, I would like, is to have monitoring decisions moved as close as possible to the people who have the best idea about what's actually going on, right? Like, the kubelet cannot have the best idea about everything; I mean, that's the problem with cAdvisor today: we're trying to put all the logic for everything in one place. Whereas, like, you know, Nvidia presumably knows best how they have written their device plugin, and, you know, what the associations are, and what the constraints are around that system, and what the corner cases are, and whatnot. And so by saying: okay, Nvidia, here's the information that you need to make the associations, potentially, between device and pod; make the decisions that make the best sense for the Nvidia device plugin, and expose the metrics that make the most sense, with the associations that make the most sense.
I: Given that we are also discussing... like, I have seen SIG Network and SIG Storage really also have similar requirements, I feel. Perhaps we should also hear from some folks from CNI and CSI, I'd say, whether they have any special requirements in this regard. Like, for example, I think OCI, they are not tied to Kubernetes, and I don't think they...
E: So that was the issue we were facing, and maybe we do need that association in the device plugin API, to have the container-and-device mapping. But it becomes complicated, it becomes complex, at that point, if we want to have a separate sidecar monitoring container that uses the device plugin API.
A: The issue, I would say the concern I raised in the doc, was just how, like, it would be important for me, running production clusters, to know that the source and target for my metrics are both mutually trusted, and then how easy it is to get that mutual trust working in the world presented here, and to think about a potential proxy, yeah.
D: In terms of authentication, there are basically two options that we kind of thought about. The option I prefer, because it leaves things still more aggregated, is: there are actually several kinds of authenticating proxies, and we use them extensively in OpenShift, for instance, and we could probably use a trimmed-down version of one of these, which are basically designed to sit in front of APIs and provide Kubernetes authentication, right? So they know how to do token-based webhook and certificate-based webhook RBAC, authorization and authentication, and so they can provide that for you even if you don't already natively support it. And then consumers could just use the normal Kubernetes mechanisms that they're used to using, and you could write
Kubernetes RBAC rules about: okay, such-and-such a user or such-and-such a component is allowed to collect node metrics, or any node component, or such-and-such a thing is allowed to collect node metrics only about this particular device plugin, or only from the CRI, or whatever. And so you either build that functionality in using Go libraries, or you run one of these little proxies as a sidecar container in front of, alongside, you know, as one of the containers in the daemon set that runs your device plugin or whatever.
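As a rough sketch of the kind of RBAC rule being described, using the rbac/v1 types; "nodes/metrics" is the subresource the kubelet's own metrics endpoints are authorized against today, while a per-plugin narrowing like the one mentioned above would need something finer-grained that does not exist yet:

```go
// A hedged sketch: a ClusterRole that allows a component to GET node
// metrics. Scoping this down to one device plugin, as discussed in the
// meeting, has no built-in resource today.
package rbacrules

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func nodeMetricsReader() *rbacv1.ClusterRole {
	return &rbacv1.ClusterRole{
		ObjectMeta: metav1.ObjectMeta{Name: "node-metrics-reader"},
		Rules: []rbacv1.PolicyRule{{
			APIGroups: []string{""},              // core API group
			Resources: []string{"nodes/metrics"}, // kubelet metrics subresource
			Verbs:     []string{"get"},
		}},
	}
}
```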
D: I think that, to me, is the goal of adding the proxy, of having the proxy, right, having the easy-to-consume proxy: you don't have to choose to write a Kubernetes-specific integration in terms of auth; you don't have to deal with all the Kubernetes auth. You just drop this in. The way it works for a lot of the components that we use in OpenShift, for instance, is that you have them listen on localhost in the pod, and then you have the proxy, you know, bind to all addresses or whatever, and the proxy just sits at the port that the thing would normally be monitored at, and intercepts all requests and says: does it have the appropriate token header? It goes and makes the RBAC call for you, if you turn on making RBAC calls, and then, if everything is copasetic, it passes the request through.
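A minimal sketch of the check such a sidecar could perform, assuming a recent client-go and the built-in TokenReview and SubjectAccessReview APIs; the resource attributes and wiring are illustrative, not any real proxy's code:

```go
// Sketch of the sidecar's request path: validate the bearer token with a
// TokenReview, then ask the API server via SubjectAccessReview whether
// that user may read node metrics, and only then pass the request through
// to the component listening on localhost.
package authproxy

import (
	"context"
	"net/http"
	"strings"

	authnv1 "k8s.io/api/authentication/v1"
	authzv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func authMiddleware(cs kubernetes.Interface, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")

		// 1. Who is this? (token-based webhook authentication)
		tr, err := cs.AuthenticationV1().TokenReviews().Create(context.TODO(),
			&authnv1.TokenReview{Spec: authnv1.TokenReviewSpec{Token: token}},
			metav1.CreateOptions{})
		if err != nil || !tr.Status.Authenticated {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}

		// 2. May they scrape metrics? (RBAC check via SubjectAccessReview)
		sar, err := cs.AuthorizationV1().SubjectAccessReviews().Create(context.TODO(),
			&authzv1.SubjectAccessReview{Spec: authzv1.SubjectAccessReviewSpec{
				User: tr.Status.User.Username,
				ResourceAttributes: &authzv1.ResourceAttributes{
					Resource: "nodes", Subresource: "metrics", Verb: "get",
				},
			}}, metav1.CreateOptions{})
		if err != nil || !sar.Status.Allowed {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}

		// 3. Everything is copasetic: pass the request through.
		next.ServeHTTP(w, r)
	})
}
```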
D: Exactly. And there, you know, hypothetically, you could also use Prometheus ServiceMonitors from the Prometheus Operator, right? Like, I kind of left that open, but there are a number of options we can use that kind of fall under the standard Kubernetes question of how we do endpoint discovery in general.
C: I think the Summary API was never meant to be the core monitoring pipeline; it was meant to be a placeholder until we have this alternate pipeline. So, like I was saying earlier, I think this question has now boiled down to the specifics of how somebody would authenticate to the kubelet and how the kubelet would empower other third parties.
A: So I have no disagreement that figuring out the way by which the kubelet could pass this information down to third parties is the obvious next step, I think. But I do think it was very beneficial, at least, to have these discussions and understand that being able to get metrics delivered from as close to the source as possible seems to be a pretty uniform agreement across the group, right? So, yeah.
D: So, I mean, like, if we say monitoring should be distributed, or we design a system where monitoring can be distributed, and then someone chooses to implement all of these constraints on top of cAdvisor, that is their prerogative, right? But as long as we have a sufficiently flexible system where people can try to write this distributed monitoring, non-distributed monitoring should just strictly be a subset of that; it should fall into place.
D: On the question of metrics and how we deal with guarantees: I had kind of imagined this system in which you basically document names and semantics, and say: you always have to implement these metrics; you may choose to implement these metrics, but if you do, we have semantics that we define, you know, like: if you implement network TX, that's always network transmitted bytes, and if you're going to implement network transmitted bytes, give it the network TX name, right? And then there's, like, a third class of metrics, which is, whatever, like...
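To make the naming idea concrete, a hedged sketch with the Prometheus Go client; the exact name and labels are illustrative, not a documented Kubernetes metric:

```go
// Sketch: a component that opts in to exposing network transmit bytes
// uses the one shared, documented name and unit, so every implementation
// that chooses to expose it looks identical to consumers.
package namedmetrics

import "github.com/prometheus/client_golang/prometheus"

var networkTxBytes = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "network_transmit_bytes_total", // shared name, base unit: bytes
		Help: "Cumulative bytes transmitted by the pod's interfaces.",
	},
	[]string{"pod", "namespace", "interface"},
)

func init() { prometheus.MustRegister(networkTxBytes) }
```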
D: My concern with that was mainly that alpha and beta imply progression, right? Like, it doesn't make a ton of sense to me to have certain metrics at alpha level and just have them alpha forever, because it seems like the terms alpha and beta, or whatever, imply that eventually the alpha metric will become stable, and I think there are always going to be metrics that we don't, you know, want to guarantee, or we can't know the entire set of metrics, right?
D: We don't know what metrics Nvidia is going to choose to expose; we don't know what metrics certain CRIs might choose to expose. So we want to kind of have this class of metrics where we don't make any guarantees about them; they're never going to become stable, but they're not alpha, because they're not going to become...
D: I was kind of using "we" as, like, the broader Kubernetes developers, right? Like, we as Kubernetes developers are not necessarily going to know what metrics Nvidia is going to have, and I think we shouldn't try to prescribe what metrics Nvidia has. But I do think it's valuable, especially in the case of things like CRI, to say: all right, you might not expose networking metrics, but if you do, you should expose them like this; you should expose, like, these ones.
D: You may choose to expose these types of network metrics; this was just a naming example, I'm not saying specific ones. You may choose to expose network metrics because people find them useful, and if you do, you should expose them with these names, because, you know, people want these for consistency. But then there's always this third class of metrics that's, like, you know, specific to Nvidia, or specific...
K: Yes, I think we're closer than it maybe originally seemed. I was thinking of it more from the perspective of, say, a device monitoring agent maintainer: that, you know, even though a certain metric may not be available in all deployments, if you run this particular monitoring agent, then certain metrics are alpha versus GA versus... I was thinking more from that perspective.
K
I
do
agree
that
it's
useful
for
us
as
a
community
to
keep
a
list
of
known
metrics
so
that,
if
I'm,
for
example,
implementing
a
new
CRI,
runtime
and
I
want
to
expose
the
same
metrics
as
others,
I
can
go
to
someplace
easy
I
think
that's
certainly
useful,
but
I
I
do
don't
think
and
I
don't
want
to
spend
too
much
more
time
care
because
I'm,
not
sure
I'm,
the
other
people
are
interested.
I.
C: Like, cAdvisor offers no versioning. Yes, I think at least recommending, not mandating, just recommending, some naming schema, such that the monitoring experience is consistent across plugins in a Kubernetes environment, would probably help most of the operators. I mean, we do have some conventions on naming.
D: We do already have that to a certain extent. It's not universally followed; the biggest individual component that doesn't follow it is actually cAdvisor. But we do have instrumentation guidelines that all Kubernetes components are supposed to follow, which prescribe metric naming, label naming, label semantics, when to use what labels, when not to use certain labels (for instance, never use the UID label), and stuff like that.
A: We could keep digging into this, but I'm worried that my laptop battery is about to die, and I don't want to make the meeting last forever. But I wanted to thank Solly and Frederic for putting this information out and, like, joining the call today. It sounds like the next step that I hear is that we need to, like, come to some consensus on, like, the set of options that are worth exploring for making the information available from the kubelet, and it sounds like the next steps are, like, on the Google side.