From YouTube: Kubernetes SIG Node 20221129
Description
SIG Node weekly meeting. Agenda and notes: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.adoto8roitwq
B
We can just start off from the Kubernetes issue. Could one of you — I don't know if I have the permissions to share my screen? Oh, okay! In that case, yeah, let me try this.
B
Okay, hopefully you'll see my screen with the Kubernetes issue. Yeah.
B
Great — yeah, that stays — oh cool, okay. Just a quick introduction: I'm a software engineer working on Cilium, and I'm here to socialize our feature request to expose the pod cgroup path in a standardized way. We believe it will help us and others in the community to simplify and support their use cases. So, before we get to the pros and cons of the existing ways to retrieve the cgroup path —
B
Let's
just
quickly
go
over
some
of
the
use
cases,
so
I
I
believe
we
have
context
around
what
part
C
group
paths
are
so
I
can
skip.
That
part
is
that
okay.
A
Yeah, I think everyone does. Maybe the only thing you could clarify here is: do you actually want the pod cgroup path, or do you want every per-container cgroup path in that pod as well?
B
So the pod cgroup path is a good start, and, well, I guess more information is always better. So if getting a cgroup path at the container level is possible, then yeah, that's even better.
B
So right now we are using the container cgroup path, but it should suffice for our use case if we get a cgroup path at the pod level.
B
So
that's
the
psyllium
use
cases.
I've
I've
listed
other
open
source
projects
here,
I
don't
contribute
to
this
project,
so
I'm
not
intimately
familiar
with
this,
but
I
believe
we
are
using
Parts
degree
paths
for
like
variety
of
use,
cases
like
monitoring,
statistics,
collection
and
so
and
so
forth,
and
for
psyllium
we
use
ebpf
for
load,
balancing
policy
enforcement
and
other
use
cases,
which
is
where
we
require
pod
level,
secret
paths
so
yeah.
B
That's
the
that
that's
the
thing
on
use
cases,
and
so
these
are
some
of
the
ways
we
can
retrieve
part
C
group
path
and
every
approach
has
some
cons.
B
So
do
you
want
me
to
go
over
these
or
I?
Do
people
already
have
context
here.
A
I
mean
I,
know,
I
I,
have
context,
I,
don't
know
who
else?
If
anyone
wants
to
speak
up
I
think
it's
fine
to
go
through
it,
and
if
we
could
maybe
Timeless
discussion
to
about
10
minutes
and
we
can
either
choose
to
make
the
best
use
of
the
discussion
on
merits
are
doing
this
or
what
we
want
to
do
as
next
steps
or
get
everyone
on
the
same
page.
We
can,
we
can
see
as
you
we
can
go
either
out.
I,
guess:
okay,.
B
Yeah
I
guess,
if
you
have
only
10
minutes,
then
I
I
would
suppose
that
I
have
listed
all
the
information
on
the
issues.
So
people
can
just
reference
this
later,
but
yeah
the
merits
are.
These
are
all
the
projects.
B
This
is
not
an
investor
list,
but
it
keeps
a
feel
for
a
variety
of
use
cases
involving
particular
paths,
and
so,
if
you
click
on
these
links,
it
should
take
you
to
the
open
source,
so
we're
scored
where
you
can
see
how
every
project
has
its
own
way
of
retrieving
policy
group
paths.
So
there
is
no
like
a
clear
standard
way
and
then
these
are
the
limitations
with
each
of
these
approaches
that
is
open,
plus
projects
use,
including
philia.
A
That
I
could
see
wanting
to
put
the
C
group
path
on
the
kubernetes
API
in
the
in
the
Pod
status,
let's
say,
and
maybe
at
either
a
container
or
pod
level
granularity
is
that
you
have
a
control
component,
not
running
local,
to
the
node.
That
needs
to
do
something
with
that.
B
With
Celine
We
are
following
this
third
approach,
which
is
crafting
based
on
patterns.
We
do
it
on
code,
node
bases,
so
at
least
at
this
point
we
don't
have
like
a
centralized
component
where
this
information
is
needed,
but
it
could
be
possible
that
we
have
future
use
cases
for
that.
So.
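For reference, here is a minimal Go sketch of the pattern-crafting approach described above: deriving the pod-level cgroup path from the kubelet's usual naming conventions. The slice layouts below are the common kubelet defaults for the systemd and cgroupfs cgroup drivers; they illustrate the convention being discussed and are not a guaranteed interface — which is exactly the fragility raised later in this discussion.

```go
package main

import (
	"fmt"
	"strings"
)

// podCgroupPath builds the pod-level cgroup path (relative to the cgroup
// root) for a pod UID and QoS class ("burstable", "besteffort", or "" for
// guaranteed), following the kubelet's default naming conventions.
func podCgroupPath(podUID, qos string, systemdDriver bool) string {
	if systemdDriver {
		// systemd driver: kubepods.slice/kubepods-<qos>.slice/
		// kubepods-<qos>-pod<uid>.slice, with dashes in the UID
		// replaced by underscores.
		uid := strings.ReplaceAll(podUID, "-", "_")
		if qos == "" { // guaranteed pods sit directly under kubepods.slice
			return fmt.Sprintf("kubepods.slice/kubepods-pod%s.slice", uid)
		}
		return fmt.Sprintf("kubepods.slice/kubepods-%[1]s.slice/kubepods-%[1]s-pod%[2]s.slice", qos, uid)
	}
	// cgroupfs driver: kubepods/<qos>/pod<uid>
	if qos == "" {
		return fmt.Sprintf("kubepods/pod%s", podUID)
	}
	return fmt.Sprintf("kubepods/%s/pod%s", qos, podUID)
}

func main() {
	fmt.Println(podCgroupPath("0f7f8c3a-1234-5678-9abc-def012345678", "burstable", true))
}
```

Anything built this way silently breaks when an operator changes the cgroup root, the cgroup driver, or the runtime's layout — which is the limitation the discussion keeps returning to.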
B
Yeah, so — we've taken the example of, say, a container in a crash loop: it's getting recreated, so the container ID would be different every time the container comes up. In this case I'm not sure if the pod ID remains the same, but yeah — if there is a change in the pod cgroup path, then we would want to get notified about the updated pod cgroup path.
F
One
question
I
have
is
like:
aren't
these
actions
that
are
adding
hooks
or
anything
in
the
c
groups
level
like
in
line?
If
you
try
to
do
it
offline,
it's
going
to
be
racy
right.
You
can't
wait
for
the
status
to
be
updated
and
go
and
do
something
because
by
that
time,
pod
is
already
up.
So
whatever
Network
policy
or
anything
you
want
to
inject-
is
kind
of
racing
with
the
Pod
running.
F
So
what
I'm
trying
to
say
is
like
isn't
this
something
that
would
run
as
part
of
your
cni
bring
up
portion
of
the
pod.
B
Well,
I
mean
yeah,
so
we
don't
have
access
to
this
information
in
the
cni
workflow
either,
because
the
secret
paths
are
not
exposed
to
whatever
the
CRI
spec
is.
B
So I can give a data point here. When we were evaluating this CRI gRPC interface approach, we noticed that as part of the sandbox, this field — the container — hasn't been created yet, because only after networking is completely set up does the kubelet go ahead and instruct the container runtime to create it. So what we noticed is that, as part of the sandbox spec, we didn't have access to the cgroup path information.
F
That
doesn't
seem
right,
I
guess
we
will
need
more
details.
B
Sure
yeah
yeah.
A
Three things. Doing it on a per-container basis, particularly since those cgroups get created and destroyed — as Mrunal noted, the information would be kind of latent if we have to checkpoint it back to the kube API every time, so we have to be careful about that. And then, some other things I'm curious about, if they have come up in your CNI use case, would be the intersection of this with other runtimes —
A
Handle
containers
differently
like
so
any
intersection
with
Kata
or
oh
I.
Forget
the
name.
What's
the
name
of
the
Google
project,
you
all
have
done
gvisor.
A
Doing
that
on
a
per
container
basis
still
really
problematic
to
me,
but
scoping
it
to
just
the
Pod
basis,
doesn't
seem
to
do
enough
to
be
expansive
to
the
monitoring
use
cases
you
talked
about.
So
what
I'm
wondering
is
like
if
we
can
just
narrow
the
use
case
to
be
cni,
get
ensure
that
cni
has
the
right
information.
It
needs
at
cni
startup,
rather
than
having
to
put
too
much
more
into
the
actual
pod.
Api
itself
is
kind
of
the
tension.
B
Yes, that sounds really appealing, because — going back to, I don't know who it was who mentioned the racy conditions — that's definitely possible, especially when we install these eBPF hooks that are invoked at socket calls: for example, whenever a pod makes a connect call or something. It's important that we have this information available before any network traffic passes through a pod. So getting access to this information as part of the sandbox setup in the container runtime sounds like a great idea.
A
Next
step,
if
we
maybe,
if
we
relabel
this
issue
as
the
core
use
case
here-
which
this
thing
is
coming
up,
which
is
saying,
enables
cni
integrators
to
know
the
founding
C
group
of
a
pod
and
then
before
coming
to
this
particular
implementation
route,
we
could
also
look
at
the
other
alternate
route,
which
would
be.
What
could
we
do
to
ensure
you?
You
have
this
information
as
a
part
of
cni
setup
and
then
maybe
we
can
come
back
and
evaluate
pros
and
cons
on
either.
B
Yes — right now, since we're following the third approach, we don't have to elevate any security privileges, but at the same time there are limitations in supporting different kinds of environments.
A
That's just a documented convention. The paths you parse now have been in place for a very long time, and we haven't had strong reasons to change them that I can think of — but there are users who do change the cgroup parent path, I think. So anyway, that's probably another thing to think through. But is that next step basically like —
A
Can
we
skip
this
to
just
how
to
make
sure
the
cni
has
this
information
and
then
put
this
option
among
one
of
many
and
then
we
can
come
back
and
revisit.
B
It's
like
don't
fully
understand
the
implications
of
doing
it
at
the
BNI
level
versus
like
exporting
it
as
part
of
the
static
field
you
did.
You
did
mention
like,
for
example,
if
it
was
default
to
CRI,
then
it
can
be
a
long
process.
So
can
you
elaborate
on
that
a
bit.
A
You're looking at three months, four months to get that into an alpha phase, and then — it's basically a year-long activity. That's what I was just trying to call out, and it could be that there are other, faster ways for us to get you dependable ways of getting this information.
A
Just
changes
to
the
end
users
are
facing
API
or
given.
A
lot
of
scrutiny
is
basically
how
I
would
phrase
it
because
of
the
graduation
process.
They
need
to
go
through
and
it's
possible
that
the
alternative
path
of
seeing,
if
we
can
do
something
in
the
cni
bring
up
flow,
is,
is
a
equally
valid
and
faster
path
and
doesn't
make
users
look
at
the
cube
API
as
a
source
of
Truth
for
this
information,
particularly
if
it
rapidly
changes
so
like
crashing
pod
sandboxes.
A
We
wouldn't
really
want
to
go
and
drive
a
lot
of
right
rate
back
to
the
cube
API
server
right
so
in
general,
I
think,
there's
probably
a
lot
of
hesitancy
to
report
this
as
a
status
field
in
the
Pod
API
versus.
Can
we
find
other
ways
of
getting
you,
this
information,
stably
outside
of
the
Pod
API
and
maybe
Bernal
or
Don
or
others
chime
in
a
few.
H
I think there was the suggestion of doing this just on the local node — sorry, I have some coffee. So the CRI implementation can pass it to the CNI at runtime; that would solve your problem, right? I just want to add one other — sorry, I just want to add one other idea I didn't hear raised yet, which is NRI. I don't know if you're using containerd or CRI-O, but containerd has this concept of NRI.
I
That's been a little bit newer. It allows you to kind of hook into container creation and get back events, right — so you can get a callback when, say, the sandbox container has started, and from the sandbox container you can extract the OCI spec and get the cgroup path through that. So that's another option to explore that's kind of outside Kubernetes, but would also allow you to get this in your path. I just want to raise that as a possible idea.
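For reference, a rough sketch of what such an NRI plugin could look like, using the Go stub from github.com/containerd/nri. Method signatures and field names differ across NRI versions (newer releases add a context.Context parameter, for instance), so treat this as an illustration of the hook mechanism rather than a definitive implementation:

```go
package main

import (
	"context"
	"fmt"

	"github.com/containerd/nri/pkg/api"
	"github.com/containerd/nri/pkg/stub"
)

type plugin struct{}

// RunPodSandbox fires when a pod sandbox starts; the pod-level cgroup
// parent is carried on the event payload.
func (p *plugin) RunPodSandbox(pod *api.PodSandbox) error {
	if pod.Linux != nil {
		fmt.Printf("pod %s/%s cgroup parent: %s\n", pod.Namespace, pod.Name, pod.Linux.CgroupParent)
	}
	return nil
}

// StartContainer fires per container and carries the container-level
// cgroups path assigned by the runtime.
func (p *plugin) StartContainer(pod *api.PodSandbox, ctr *api.Container) error {
	if ctr.Linux != nil {
		fmt.Printf("container %s cgroups path: %s\n", ctr.Name, ctr.Linux.CgroupsPath)
	}
	return nil
}

func main() {
	// Register with the runtime's NRI socket and process events.
	s, err := stub.New(&plugin{})
	if err != nil {
		panic(err)
	}
	if err := s.Run(context.Background()); err != nil {
		panic(err)
	}
}
```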
B
That
sounds
quite
specific
to
CRI.
Is
that
right,
all
right,
cryo,
sorry.
I
Sorry
that
was
a
continuity
based
feature,
but
but
I
think,
like
the
idea
is
hooking
into
the
Container
kind
of
you
know
it's.
It's
kind
of
like
oci
hooks
in
some
sense
of
getting
back
container
creation,
events
right
and
then
get
getting
the
ocis
back
and
extracting
the
secret
path
from
that.
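For reference, a small sketch of the extraction step just described — pulling linux.cgroupsPath out of an OCI runtime spec using the standard runtime-spec Go types. The config.json location is a placeholder, since each runtime keeps its bundles in its own state directory:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// cgroupsPathFromSpec reads an OCI bundle's config.json and returns the
// linux.cgroupsPath the runtime was told to use for the container.
func cgroupsPathFromSpec(configJSON string) (string, error) {
	data, err := os.ReadFile(configJSON)
	if err != nil {
		return "", err
	}
	var spec specs.Spec
	if err := json.Unmarshal(data, &spec); err != nil {
		return "", err
	}
	if spec.Linux == nil {
		return "", fmt.Errorf("no linux section in %s", configJSON)
	}
	return spec.Linux.CgroupsPath, nil
}

func main() {
	path, err := cgroupsPathFromSpec("/path/to/bundle/config.json")
	if err != nil {
		panic(err)
	}
	fmt.Println("cgroupsPath:", path)
}
```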
J
Yeah, it would be specific. We do intend on supporting NRI in CRI-O eventually; we just don't yet. Also, I don't know —
A
It sounds like — for me, I would say I would be very cautious about having kubelet-to-kube-API-server write traffic be expected to keep this data field up to date as a source of truth for you as a CNI provider, if that is to be meaningful and not racy. I would prefer that we find a path that allows you to get this information local to the node without needing elevated privileges, and it sounds like, on this call —
A
Neighbors
next
steps
like
I'll,
throw
your
name
out
Peter
but
Peter
and
forget
the
gentleman
who
spoke
about
container
D.
Maybe
the
three
of
you
could
get
together
and
see
if
there's
an
alternative
path
that
hadn't
been
looked
at,
that
could
be
cross
runtime,
portable,
but
also
I.
A
Just
I
was
trying
to
look
at
myself
where
we
create
the
Pod
sandbox
right
now
on
the
cable
to
refresh
my
own
memory,
but
it
feels
like
we
should
be
able
to
do
something
at
that
point
without
needing
to
Rally
State
back
to
cube
API
and
the
more
we
can
do
that
I
think
the
better
and
more
narrow
the
solution
is
I.
Guess.
A
Put their names in the chat, so we can get a group together to do follow-ons.
B
So
just
a
clarifying
question:
what
would
the
follow-up
discussion
happen
on
the
issue
itself?.
A
That's
fine
too
I'm
just
trying
to
make
sure
you
get
connected
to
the
right
set
of
people
to
help
out
driving
us
forward.
I
think
it's
personally
unlikely
that
we
will
want
to
report
this,
and
so
I
want
to
make
sure
we
get
the
right
group
of
people
to
to
know.
If
there's
things
that
can
get
you
the
outcome,
you
want
without
needing
us
to
do
that
last
resort.
E
Yep
yeah
Michael
Zapp
has
already
made
some
changes
in
the
cryo
and
containerdy
at
the
same
time
to
enhance
cni
support.
So
really
this
he's
the
right
guy
I
think
to
do
this,
one
because
I
know
he's
already
adding
another
field
or
two
that
we
that
we
can
pass.
So
the
only
only
issue
you're
going
to
run
into
is
without
going
through
some
kind
of
a
hook
mechanism
or
using
what
you
have
now.
You
have
a
versioning
issue.
You
know
until
this
is.
H
Okay, can I just summarize what we agree about, at least — what we believe SIG Node should do. Okay: we shouldn't expose this node-level, OS-level — especially Linux-OS-level — detail to our API server. I think we agree on this one, because we have had this conversation in the past, right?
H
And then later it comes back again — so we keep going around in the community. But there are the many suggestions — I think we have several suggestions. The people who made them should write down their suggestions, so then we can carry on the discussion in the issue and figure out what best satisfies this. Did I summarize this correctly? Because I don't want people to think about it later and say: oh, we do agree, so what's next — and then we don't agree about the next step, again.
A
Yeah
personally,
I
I
don't
want
to
expose
secret
pass
in
the
Pod
API
at
both
the
Pod
or
container
level,
unless
I
knew
all
other
options
were
exhausted
and
I
think
my
recommendation
is
that
this
option,
if
pursued,
would
come
with
other
negative
consequences
that
Renault
and
others
raised
here,
but
yeah.
If
we
could
just
update
this
issue
to
say
how
to
best
inform
a
cni
implementer
of
the
Pod
location,
then
we
would
understand
the
core
use
case
wanting
to
be
solved
and
then
can
find
the
right
outcome
for
that.
B
Brought
up
a
good
point
in
the
chat
it
seems
like
with
the
CRI
approach
it
all.
The
Cris
would
have
to
agree
on
the
spec
of
British
spec.
Is
that
right.
A
Yeah, I think we just need to write down maybe the other approaches to solving this, to then understand their various pros and cons. But I think there are immediate negative consequences to using this as a stable source-of-truth value for you, particularly if the thing that's responding to it is running on the node and not running on the control side.
A
If
there
was
a
use
case
that
said
why
the
thing
had
to
run
on
the
control
side.
That
would
be
interesting,
I
guess
to
dive
on.
A
But
if
your
thing
is
running
on
the
Note
side,
then
we
already
need
to
handle
use
cases
where
failure
of
a
cubelet
to
talk
back
to
cube
API
is
an
issue
how
to
deal
with
the
right
rate,
how
to
deal
with
a
number
of
other
things
that
can
put
pressure
on
qvpi
as
a
concern,
so
I
just
I,
don't
think
we
know
enough
on
the
other
pass,
but
I
think
everyone
now
understands
the
end
goal,
which
should
be
a
cni
provider,
should
be
able
to
know
the
C
group
sandbox
on
a
Linux
host
for
a
pod
and
a
stable
Manner
and
right
now.
A
This
approach
is
one
possible
approach,
but
it
has
negatives
and
I.
Think
right
now,
Don
and
myself
and
probably
Bernal
would
say
like
we
wouldn't
want
to
go
this
route.
Yet
before
wanting
to
look
at
the
other
approaches
and
because
we
really
do
not
want
to
hunt
the
pot
C
group
information
in
the
end
user,
API
yep.
A
Maybe we can come back — I don't know what your desired timeline was, but maybe we can come back in a few weeks and see if we've moved forward on that with more information. Does that sound good?
A
So, the next topic here — it looks like the author is not here, unfortunately. This was around wanting to move forward with fine-grained SupplementalGroups control. I guess probably the best thing we can do is ask who would want to help move this forward; right now the KEP doesn't have any assignee.
A
It looks like — Kevin, are you on the call? Is there anything more than just an approval needed on the resource management stuff?
G
Yeah, this is the node resource topology API for NUMA-aware scheduling work. We had a discussion about this last week, so I just wanted to bring it up again. We managed to get a bunch of approvals on this; the only thing we're waiting for is your approval from, kind of, an architecture point of view. If you could take a look at that, I think we can get this closed out.
G
I
think
from
architecture
point
of
view,
the
direction
that
we
are
taking
from
no
point
of
view,
Dawn
approved
it
and
Tim
approved
it
as
well.
I
just
wanted
you
to
take
a
look
and
if,
if
it's
okay
to
move
forward,
yeah.
G
I think Tim has been looking at it, and he took a look at the new PR as well, so —
A
I
guess
at
the
CIU
and
for
money
in
the
list
of
API
approvers
on
here
and
I
was
just
wanting
to
make
sure
there
was
a.
If,
if
we,
if
there
was
anyone
else
who
maybe
came
forward,
that
wanted
to
do
that
as
well
or
not.
D
Well, I'll just — there's a question on there, but otherwise, yeah, that's it.
A
Right — Bobby, do —
I
Yeah, yeah, it's me. So, I talked a little bit about this a few weeks ago, and I wanted to bring up this topic again — I've got a little bit more feedback about it since then. To give everyone a little bit of context, the idea here is: there's a proposal — I'm working with someone more on the operational, monitoring side — that we want to add a health check to the kubelet.
I
It's the healthz endpoint — the kubelet already has a healthz endpoint, but we want to add the ability for that healthz endpoint to also report whether the CRI, the container runtime, is healthy. For a little bit more context: the container runtime today provides a Status RPC, and it returns conditions around the healthiness of the CRI itself — there are RuntimeReady and NetworkReady conditions that the CRI returns. So today, getting that information is quite helpful to understand whether the container runtime is up and whether we need to take some type of corrective action — for example, recreating some node or repairing it, or something like that, if something's not healthy. The issue today is that to do a proper health check we have to launch crictl — that's what we do internally. So the ideal thing would be to just integrate this into the kubelet healthz endpoint: the kubelet is already running, everything's already in memory, and so we can just ask the kubelet, "hey, is the container runtime up," and it would just give a yes or no. The kubelet already has a healthz endpoint, so this is not a brand-new thing — it's just slightly increasing the scope of that healthz endpoint.
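For reference, a minimal sketch of the check being described: calling the CRI Status RPC over the runtime's unix socket and inspecting the RuntimeReady and NetworkReady conditions. The containerd socket path below is an assumption (CRI-O listens on /var/run/crio/crio.sock instead):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Connect to the runtime's CRI socket (containerd default shown).
	conn, err := grpc.Dial("unix:///run/containerd/containerd.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	client := runtimeapi.NewRuntimeServiceClient(conn)
	resp, err := client.Status(ctx, &runtimeapi.StatusRequest{})
	if err != nil {
		panic(err) // runtime unreachable: the health check fails here
	}
	// Print each condition, e.g. RuntimeReady and NetworkReady.
	for _, cond := range resp.Status.Conditions {
		fmt.Printf("%s: %v (%s)\n", cond.Type, cond.Status, cond.Message)
	}
}
```

This is essentially what crictl does under the hood; the proposal in the discussion is to surface the same result through the kubelet's existing healthz endpoint instead of shelling out.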
A
The
node
is
not
healthy
if
the
plug's
not
responsive
I
was
just
which
is
also
trying
to
interact
with
the
container
runtime.
Is
there
something
different
here
that
I'm
missing.
I
Yeah,
so
the
main
difference
is
that
for
the:
if,
if
the,
if
the
container
runtime
is
not
responding,
you're
right,
the
flag
would
not
respond.
But
then
then
what
would
happen
is
like
the
node
would
try
to
report
the
status
right
to
the
API
server,
but
we
notice
in
some
cases
that
node
might
not
be
able
to,
for
example,
like
there
might
be
dispatchers
network
issues.
Something
like
that.
I
So
the
idea
here
is:
we
want
kind
of
a
local
check
on
the
Node
without
any
kind
of
API
server
connectivity
about
the
status
so
that
we
can,
for
example,
like
restart
Google
it
or
restart
the
container
runtime.
If
this
health
check
fails,
that's
sort
of
the
idea,
oh.
A
Okay,
I'm
sorry
I
missed
that
so,
basically
as
like
in
my
system,
D
unit
or
something
I
can
ask
if
my
my
giveaway
component
itself
is
healthy
or
not
and
drive
exactly
exactly
yeah
and
you
want
to
drive
a
restart
if
my
CRI
is
not
healthy
yeah.
I
So
I
mean
it's,
it
depends
on
the
problem.
Obviously
so
I
don't
think
people
will
always
fix
it,
but
that's
kind
of
what
we
do
today.
We
have
a
health
check
script,
actually
that
we
can
run
periodically.
That
goes
out
and
checks.
You
know.
Basically,
it
checks
out
the
kublets
up
by
querying
the
Kublai
helps
the
endpoint.
If
that's
not
responding,
we
restart
kublet.
If
we
and
then
we
also
launch
cry
cuddle
and
and
see
if
we
can
connect
the
container
on
time.
I
If
we
can't
watch
connect
the
container
on
time,
we
restart
container
D
on
our
side,
for
example.
So
we
have
kind
of
this
basic
health
check
script
that
we
run
periodically,
but
the
problem
with
it
is
that
it
has
to
continuously
launch
cry
pedal
which
is
going
to
have
to
connect
to
the
container
runtime,
and
we
ideally,
we
just
have
like
a
healthy
endpoint.
We
could
connect
to
okay.
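For reference, a minimal sketch of that kind of periodic, node-local probe against the kubelet's healthz endpoint (10248 is the kubelet's default healthz port). The restart action itself is left to whatever supervises the kubelet, e.g. a systemd unit:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// kubeletHealthy probes the kubelet's local healthz endpoint.
func kubeletHealthy() bool {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://127.0.0.1:10248/healthz")
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	for {
		if !kubeletHealthy() {
			// In the setup described above, the supervising unit would
			// restart the kubelet (and/or the container runtime) here.
			fmt.Println("kubelet healthz failed")
		}
		time.Sleep(30 * time.Second)
	}
}
```

The proposal under discussion would let this single probe also cover container runtime health, instead of a separate crictl invocation.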
A
I'm
just
trying
to
make
sure
I
understand
this
event,
so
the
in
the
case
where
the
the
CRI
provider
might
be
saying,
they're
healthy,
but
they're
unable
to
actually
accept
connections.
I
Maybe
something
like
that:
yeah,
like
some
connectivity
issue,
for
example
or
yeah,
so
something
a
good
work
somewhere.
So
we
often
see
that
you
know
just
simple
restart:
can
help
remedy
things.
So
that's
that's
kind
of
the
the
context
yeah.
J
Yeah, I'm trying to think of whether we've hit situations internally where we would want this. Usually — putting the Red Hat hat on — because on our nodes there's less manual touching of them, so if something goes wrong, then, you know —
J
The
the
admins
have
to
go
in
and
you
know
do
something
it's
like
I
could
see
it.
You
know
being
useful,
but
I
haven't
thought
of
a
specific
use
case
that
we
have
internally,
that
we've
used
it
for,
but
it'd
be
easy
to
implement
so
I'm
not
opposed
to
it.
I
And-
and
this
is
already
just
to
be
clear-
this
is
standardized
across
the
container
run
times
like
they
already
report
conditions
around
the
healthiness
of
them
right.
So,
if
you
ask
the
container
runtime
hey
give
me
those
conditions,
that's
already
implemented
in
the
container.
What's
not
implemented
is
there's
no
simple,
like
HTTP
Health
C
endpoint
on
the
container
runtimes
are
on
Google.
Let's
actually
get
that
status
right,
so
that's
kind
of
The
Proposal.
Here
it's
it's
basically
to
extend
the
Kubla
healthy
endpoint
to
just
report
back
that
information
from
the
container
on
top
and.
I
It makes sense — for our use case we just wanted to restart things, but I could imagine we might extend it in the future. You know, for logging, or there could be more action taken: we could report that condition down to some other systems that could do some type of remediation, or something like that.
I
The last time we brought this up — just to give a little more context — there were kind of two pieces of feedback. One was: potentially, does it make sense to integrate this in the container runtime, not the kubelet — which I think was pretty good feedback.
I
We
kind
of
looked
into
that
a
little
bit,
so
the
issue
I
think
right
now
is
it's
not
standardized
across
the
container
runtimes
there's
no
Health
Z
kind
of
on
the
container
on
time
side,
so
something
we
could
bring
to
the
container
online
community
I
suppose,
but
the
other
issue
is
that
it
doesn't
really
show
the
connectivity
is
working,
which
is
something
we
wanted
as
well.
We
actually
want
to
check
that
the
Kublai
can
actually
make
that
connection
to
the
container
on
time.
Not
just
ask
the
container
runtime
directly
if
that
connections
up.
A
Don't
really
have
any
downsides
on
this
David
I
was
just
trying
to
think
like.
Does
this
actually
fix
anything
and
then
like
right
now,
we
don't
really
do
much
of
anything
in
a
health
check
on
the
kilo
process,
since
I
was
trying
to
think
through,
like
where
we
inline
see
advisor.
You
know
is
that
more
likely
to
wedge
like
of
the
things
that
we
actually
depth,
rather
than
communicate
out
to
the.
I
We've seen it especially in throttling cases. Sometimes when things get kind of wedged — like when there's really high I/O latency for a period of time — things can get wedged periodically, and a restart helps shake things out. We've seen that in production, so that's one of the common cases.
H
So
in
in
a
in
the
history,
kubernetes
did
this
before
U.S
I
just
want
to
say
the
what
what
did
David
request?
Actually,
it
is
something
we
changed,
but
there
is
nothing
but
just
moved
to
the
different
component.
So
in
the
older
time
kubernetes
detect
of
the
container,
which
is
darker,
is
not
responsible
or
it
is
bad,
then
we
will
restart
even
kubernetes.com.
The
docker
in
a
really
old
topic
stock
is
really
unstable,
always
and
responsible.
So
we
we
end
up.
H
Think
of
a
kubernetes
is
the
bring
on
the
Node,
so
have
decided
okay,
he
have
to
restart
a
certain
thing.
So,
but
that's
obviously
it's
not
the
good
thing.
So
then
we
move
that
The
Logical
out
to
the
system
D
some
people
and
the
sum
is
not
in
their
startup
script.
So
then
later
we
try
to
move
to
this
MPD.
The
problem
is
even
for
the
load.
H
The
problem
detector,
just
like
David,
said,
there's
the
certain
cases
this,
which
is
a
necessary,
complicated
and
also
certain
cases
we
didn't
cover
like,
for
example,
like
the
kubernetes
and
to
connect
to
the
to
the
container
runtime.
Those
kind
of
things
is
not
connected,
so
what
did
David
asked
for
here?
Even
today,
kubernetes
still
report
container,
Readiness
or
not.
It
means
useless
to
report
that
kind
of,
but
because
we
are
not
really
report
anything
except
of
the
collectivity
to
the
to
the
container
runtime,
but
that's
not
in
the
in
the
point.
H
So
what
David
just
say?
Okay,
another
way
to
surface
this,
to
the
kubernator
current
kubernetes,
that
the
continuity
it
is
workable,
is
connectable
available
or
not
so
I
do
think
about
that's
now
to
really
add
them
anything
additional
information
and
also
clean
up
more
standardized
and
clean
up
the
current
cases.
Yeah.
A
What I was thinking through is: would we benefit from also having the stats provider added to the health check or not, and —
I
That's
I
think
actually,
so
the
way
that
this
was
implemented
currently
is
that
they
there
is
a
status
endpoint
on
the
container
runtime
right
and
it
returns
a
set
of
conditions,
and
so
we
check
I
believe
two
conditions
are
in
10,
runtime,
ready
and
network
ready
right,
which
means
that,
like
the
the
runtime
Services
up,
but
also
there's,
you
know
other
conditions
that
could
be
added
later,
like
the
image
Services
up,
you
know
SAS
provider
Etc,
so
maybe
that's
something
we
can
make
senseful
in
the
future.
A
— on the cAdvisor interface that we could also poll for a health endpoint, so that we can maybe do two things at once here. So I —
I
For
C
advisor
it's
a
little
bit
more
on
the
like:
it's
scraping,
you
know
it's
using
I
notify
on
actual
C
group
path,
so
it's
not
always
connecting
to
the
container
runtime
socket
or
as
opposed
to
Google
it
that's
kind
of
the
main
difference.
I
guess,
yeah.
A
I
think
you're,
like
all
those
messages
where
it's
like
the
migrants,
I'm
thinking
of
horrific
node
logs
I've,
looked
in
the
past,
where
it's
like
the
interval
10
seconds
can't
be
met,
type
thing
and
20
000
times
in
a
row.
It's
my
memory
right
now
on
where
and
see
advisor.
That
would
be
a
the
equivalent
Health
endpoint
was
a
little
rusty.
So
I
need
to
go
check
that
out,
but.
I
Cool, okay. So I guess the only question, then, is: is this something we think we should introduce? There was a question around whether we should make it configurable — what types of conditions are considered healthy from the container runtime in the healthz endpoint. Or the healthz endpoint could take as its input the conditions we want considered healthy — that's another option. Is that something we want to do here, or should we just stick with a static list of basic conditions? I think that's the only question that's left.
J
Well,
if
we
start
off
by
just
returning
the
the
runtime
status,
whatever
it
is,
what
like
either
we
couldn't
get
one
or
here's.
What
the
runtime
said,
then
a
client
could
parse
that
and
then,
if
we
wanted
to
extend
it,
we
just
have
to
extend
the
number
of
conditions
that
the
runtime
can
send
when.
J
Yeah, yeah — well, I'm just a little unclear: is the proposal now to have this healthz for the CRI, and then cause the kubelet to make, like, an intelligent decision about what to do about it? Like, you know, Derek is saying it shouldn't be restarted on NetworkReady being false — is the kubelet making that decision, or is this a third party — no?
J
I would almost have it be the unit for the runtime — basically, I think the manager of the runtime should be the one listening for this, to keep a separation of responsibilities. That was going to be my point, but if the kubelet isn't going to have anything to do with it, then that's okay with me.
A
This
is
saying:
if
the
cubic
can't
communicate
to
the
runtime,
then
the
cubelet
should
not
be
considered
healthy,
which
would
mean
the
manager
and
then
decide,
maybe
I'll
restart
the
cubelet,
and
that's
why
I
asked
earlier.
What
would
that
solve?
And
so
I
was
trying
to
think
through
the
error
case,
and
so,
if
it's
like
a
bug
in
the
client
library
that
talks
to
that
socket,
for
example,
Maybe,
maybe
that's
what's
being
fixed
and
I
trust
David
that
he
said
he
had
to
use
cases
where
this
helps
I
was
just
sometimes
jiggling.
A
The
handle
helps
a
lot
of
things.
We
didn't
know
the
root
cause,
and
so
I
just
didn't
know
if
we
knew
what
that
was
here,
but
if,
if
this
would
require,
if
this
would
make
like
many
unit
files
in
the
world
that
run
cubelets,
you
know
have
cable
to
be
restarting,
because
the
cni
was
not
yet
fully
deployed.
I
would
I'd
be
hesitant
on
that
that
one.
H
David, even today we have the container runtime check, but we only basically do it at the kubelet's start-up point, right? So this change is that we'd basically have periodic checker code, right? Yeah.
I
Yeah,
so
the
idea
here
is,
for
example,
for
us
we
already
have
a
systemd
unit
that
does
a
periodic
health
check
right
yeah,
but
it
uses
cry
cattle,
for
example.
To
do
that
so
the
benefit
here
is
we
don't
need
to
load
in
this?
This
big
binary
of
health
check,
interval
right
and
just
use
something
that's
already
running.
Yeah.
A
That would be a non-starter for many of my customers' or users' use cases, because the network is deployed late.
H
Right, excellent. Well, I think — the last topic.
D
So for the in-place resize, I think we're just targeting it for 1.27 now, and I'm hoping we can merge the API really early, because we need that for enabling some of the CI tests. I'm working on getting the full job fixed up, so that once it passes, we know it's passing with the latest containerd.
D
Okay,
okay,
I'll
be
on
vacation
for
the
rest
of
December,
so
I'll
meet
you
guys
in
January
and
have
a
great
vacation
have
a
great
holidays.
Happy
New
Year.
A
Yeah
same
to
you
as
well,
and
so
there's
no
other
topics,
then
we
will
end
today's
meeting
and
everyone
have
a
great
yesterday
bye
everyone.
Thank
you.