From YouTube: SIG Cluster Lifecycle - Cluster API 21-07-07
A
Hello everyone, and welcome to the Cluster API office hours meeting. Today is July 7th, 2021. We have a few PSAs for today, so you have the first one?
B
Yeah, sure. So some people, particularly OpenShift, are running with the OwnerReferencesPermissionEnforcement admission controller in those clusters. Controllers then need to run with the permission to set finalizers. We've just updated Cluster API, so this shouldn't affect you.
B
If you built a new controller recently with kubebuilder, it should be there automatically, but we started the project quite a while ago, so you might want to go and check your providers, if you own them, and make sure that you have also set the finalizers RBAC permission. So have a look at the PR; it's pretty simple, just add another annotation, and that should fix it up for those cases where someone has turned on that admission controller.
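As a hedged illustration of what "add another annotation" means in practice: the fix is a kubebuilder RBAC marker granting update on the finalizers subresource, along the lines of the sketch below. The group and resource names are placeholders for a provider's own types, not the exact markers from the PR being discussed.

```go
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
)

// MyMachineReconciler is a placeholder reconciler for illustration.
type MyMachineReconciler struct{}

// Kubebuilder RBAC markers live as comments on the reconciler; running
// `make manifests` regenerates the ClusterRole from them.
//
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=mymachines,verbs=get;list;watch;create;update;patch;delete
//
// The OwnerReferencesPermissionEnforcement admission plugin additionally
// requires `update` on the finalizers subresource before a controller may
// set blockOwnerDeletion on an owner reference, hence the extra marker:
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=mymachines/finalizers,verbs=update
func (r *MyMachineReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Reconcile logic is unchanged; only the generated RBAC differs.
	return ctrl.Result{}, nil
}
```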
A
Thank you. And I think this PR is still open, not merged. Perfect.
A
All right, I guess one other PSA that I have is that we actually just switched the book domain.
A
So when you go to the Cluster API book, you now see the main website, which is for v1alpha4, and we also added a tip. subdomain instead of master., which will always be at the tip of the main branch, and then there should be a release-0-3 one.
A
For release-0-3 the certificate still has to be regenerated; well, I'll check that later. But if you need the v1alpha3 docs, those will be there. I think we... yeah, there is the legacy docs link in here as well, so you can just follow that. That's the only other PSA that I have. Any questions on the book?
A
I wasn't here last week, I guess, when Cecile announced 0.4, and I wanted to do a shout-out to the whole community for all the changes. The whole release was huge; I don't know how much time Cecile and I actually spent on writing those release notes, but it was a few days, I think. Yeah, there was a lot, a lot of stuff.
A
We actually had to compress the docs changes to be in one line, because there were 77 contributions to the docs. So congrats everyone, this is super exciting, and thanks to all for collaborating with us on this new milestone.
A
So with that said, let's go into... I don't think there are any release-blocking issues, but let's go to the open proposal readouts. The first one, I think, is ClusterClass. There were a few lgtms yesterday on the proposal.
A
I think if there are no major objections, we can probably go ahead and merge it. I see we received a few more lgtms. Are there any last-minute comments? I think we can probably merge it and start the implementation right away.
A
C
A
All right, is this fixed?
D
Yep, sorry, I forgot to raise my hand. Yeah, I just saw it; I was about to commit the suggestion. I was just wary of the fact that all the lgtms would be gone once I merged this change, but I think that's fine. I'll ask some of the maintainer folks to add an lgtm again once I merge this change. I'll commit this suggestion in.
A
Yeah, I think that sounds good. Yeah, we're going to have to work on getting the lgtms again; the multiple lgtms are mostly to have consensus, and I think it's clear we have consensus to merge this in. Cool. So if there are no other last-minute comments on this, please take a look; we'll probably try to merge it either by then or by the end of the week.
A
E
Yeah, so Alex is on my team at Red Hat. As far as I was aware, he was just waiting for a review on that, and he's saying that he thinks it's ready for review. I think it was just mentioned in chat as well. So yeah, if we could get some more reviews on that; I need to do that as well, and I'll get on that, but yeah, I think it's just review for that.
A
Okay. I think there was one question from Cecile that I haven't seen answered, which is: what's your timeline, and are there any breaking changes? If there are, we probably can... well, we can merge it now, but we cannot release it until v1alpha5 or whatever the next version is.
E
Based on my understanding of it, there shouldn't be any, but I will double-check with Alex and get him to add a note on that.
A
Cool. I don't see Mike here. Load balancer provider?
B
Yeah, so we just started kicking that off again. There was a meeting last Friday, and the meeting notes are down below; we're still going through that existing proposal. We also need to ask Jason for his blessing to take it over. I'll leave the rest of that discussion for later, because it's on the agenda for some other bits.
A
Okay, this sounds good. I think, given that we're starting again, I would suggest closing this (I'm happy to do it) and moving into a Google doc again. There's a lot of stuff, a lot of open comments, so it might be easier to collaborate in a Google doc.
A
Okay, yeah, that's supposed to be linked in here.
A
Yes, so I've now just assigned you to it, and feel free to assign others, just so that you know about the Google doc.
A
There you go. IPAM integration, Jakob?
F
Yes. So recently, myself, Max, Nadir and... who else?
F
Yes, we had a talk about how we, because we needed it for ourselves, connected the CAPV provider to our Infoblox instance; we basically built a custom controller for that. Then we also faced the problem of connecting Metal3, or the Metal3 provider, to our Infoblox instance, and the CAPV folks have the same problem, or, I think, customers also requested support for external IP address management, and also, I think, Infoblox in that case. And then we thought that it might make sense to build a more generic solution.
F
The IP address manager, which is a subproject that basically does in-cluster IP address management, uses a system similar to the persistent volume claims that regular Kubernetes has, to just request IP addresses from a pool of addresses. The idea was to maybe extend that to be able to also interface with other external solutions, and not do it in-cluster. Then, for example, you have an Infoblox IP pool and you can make a claim on that IP pool and get an IP address.
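A minimal sketch of the pool-and-claim shape being described, mirroring the PersistentVolumeClaim pattern; all type names and fields here are illustrative assumptions, not the actual Metal3 ip-address-manager API or any agreed Cluster API one:

```go
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// IPPool declares addresses to allocate from; a backend-specific controller
// (in-cluster, Infoblox, ...) fulfils claims against it. Hypothetical shape.
type IPPool struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              IPPoolSpec `json:"spec"`
}

type IPPoolSpec struct {
	// Ranges holds CIDRs or start-end ranges available for allocation.
	Ranges  []string `json:"ranges"`
	Prefix  int      `json:"prefix,omitempty"`
	Gateway string   `json:"gateway,omitempty"`
}

// IPClaim requests a single address from a pool, the way a PVC requests
// storage from a StorageClass. Hypothetical shape.
type IPClaim struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              IPClaimSpec   `json:"spec"`
	Status            IPClaimStatus `json:"status,omitempty"`
}

type IPClaimSpec struct {
	// Pool points at the IPPool to allocate from.
	Pool corev1.ObjectReference `json:"pool"`
}

type IPClaimStatus struct {
	// Address references the allocated address object once bound.
	Address *corev1.ObjectReference `json:"address,omitempty"`
}
```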
F
Yeah, and this proposal basically just summarizes what we thought about. The idea is either to move it directly into Cluster API, if that's an option, or, as an alternative, to leave it as an external component, either as the Metal3 IP address manager or maybe as an "official", in quotation marks, Cluster API IP address manager, or something like that.
F
So it would be the one approved or agreed-upon solution to manage IP addresses, so that if providers want to support IP address management, they can interface with that one operator or controller to manage IP addresses, instead of having to write their own for each of them, which is what we are doing right now for our use case. But it would be much easier, or, well, not easier, but easier to maintain at least, to have a generic solution.
F
This is probably only interesting for Metal3 and CAPV, and maybe one other provider, like pretty much all the bare metal providers; not so much for the cloud-based providers, because they usually have IP address management covered. But yes, feel free to comment on or extend the proposal. It's still at a pretty early stage; I haven't had much time to work on it, but we just wanted to share it, so everyone that's interested can comment on it.
A
Thanks, Jakob. For a little comment, is there anyone that would like to...
A
Okay, so I think that the problem has a lot of merit, especially for bare metal purposes. I wouldn't put it in Cluster API directly as, like, a core thing that everybody has to install, but we can potentially look into having this as, as I said, I don't know if we should call it a provider, but I mean, it could be an IPAM provider, like the load balancing provider, one of those providers, and live in a different repo. Especially given that, you know, if we need to support a whole IPAM system, definitely try to see if there's something already built out there. Yeah, those are my two cents: just keep it external.
A
G
The second thing is, my understanding is that this is mainly trying to build an API, so we probably shouldn't be maintaining an IPAM implementation; rather, it's up to IPAM providers to plug into the introduced APIs and implement their own providers. So it wouldn't be on core CAPI to maintain those.
F
Yeah, I mean, regarding it being internal or external, I guess I'd also prefer it being external, just because it's going to get big. We would have to decide how we want to handle the different IPAM providers, like whether we want to have provider modules that you have to install separately, or whether we want to go with the route that I think external-dns is taking, where providers basically just have to contribute their implementation into the one operator. But otherwise, if...
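To make the external-dns-style option concrete, here is a hedged sketch of what "contribute their implementation into the one operator" could look like; the interface and names are hypothetical, not from any existing project:

```go
package ipam

import (
	"context"
	"net"
)

// Pool and Claim stand in for the CRD types discussed above.
type Pool struct{ Name string }
type Claim struct{ Name string }

// Provider is the hypothetical plug-in point: each IPAM backend
// (in-cluster, Infoblox, ...) implements it and is compiled into the
// single operator, the way external-dns bundles its DNS providers.
type Provider interface {
	Allocate(ctx context.Context, pool Pool, claim Claim) (net.IP, error)
	Release(ctx context.Context, pool Pool, claim Claim) error
}

// providers maps a pool's backend type to its compiled-in implementation.
var providers = map[string]Provider{}
```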
F
G
Yeah, and this is especially true because, especially for on-prem providers, this falls short: we basically don't have any mechanism where we can hook in any consistent solution for IP address management. So I think that at least having an API that describes what IPAM looks like would be very valuable. Now, for the location, I don't have any strong opinions; this can be either the experimental directory or another repo, as long as we define how this gets released and the compatibility.
A
Sounds good to me. I think, you know, if this is a generic API, it could make sense to put it in CAPI, but we don't have to do that right this second, because this is an extension, right? Only some infrastructure providers will probably use it.
A
G
Yeah, one last thing, to be fair: that ship has already sailed for APIs that aren't used by all providers. If I recall correctly, I think we have APIs that aren't used by everyone. So, to be fair, it might be worth specifying, through a policy or something like that, what we can have in terms of APIs or not.
A
Okay, Nadir?
B
Yeah, I think we can probably defer this conversation a little while, because I think the bit this proposal is missing, and I added some comments, is that we need to define the integration point and what that looks like from a CAPI perspective, because at the moment it's only covering the IPAM address allocation. So once we've got a clearer idea of that, we can see where it can land. I also got one comment around in-tree versus out-of-tree.
B
A
Cool, I think this could definitely be something we can collaborate on in the proposal, I think, and yeah, once the APIs are more stable we can make that call later. All right, any other questions on the IPAM integration before we move on?
H
Yeah, so we found out that when we dropped the kube-rbac-proxy, we actually broke our metrics collection in the e2e tests. Somewhere in the middle of the e2e test framework there is something like a dump-metrics-to-file step, and that doesn't work anymore. We can fix it, but we just had a question: if anybody is still using it, is it worth fixing, or should it be dropped?
H
B
Yeah, I think a lot of that code was... AWS was the main consumer of that. So I'm happy to take that over, in fact, because we haven't switched CAPA over to the same defaults around when metrics are on or off yet for the v1alpha4 release, so we will need to fix that anyway; in CAPA we collect a lot of metrics and they're quite important to us, around AWS API latency, retries, failures, etc. So that bit of the test framework is super important for us.
A
H
I
Hi everyone, a quick, quick note: John McBride recently sent an amazing first PR, which implemented backup and restore for clusterctl, built on top of clusterctl move. So basically it saves to file everything that clusterctl move usually considers, and then you can restore it from file. For me it is a great addition. The PR is, I think, already merged. So if someone is interested in this feature, please take a look at the PR, and again, kudos to John.
A
Awesome, great work, John, and thanks so much for doing this. For folks that don't know about the scheduled backup: it's a way to dump all the things to file and restore them into the management cluster whenever you want to, say if you have your infra... It's...
A
Oh okay, cool. Now let's move on to load balancers: Nadir, Joel and David?
B
We're still going over the UX. I'll let Joel speak to the internal versus external load balancers.
E
Yeah, okay, so this is something that came up before with some of the projects we're working on at Red Hat. At the moment, all traffic is assumed to go via a single load balancer to any CAPI cluster.
E
The idea, basically, is that some traffic from a cluster needs to go via the public-facing internet. For example, if you've got your management cluster in GCP and you're managing a cluster that's created in AWS, the traffic's going to go over the internet. But if you're managing stuff within the same AWS account, for instance, you can set things up like VPC peering, and maybe that traffic doesn't need to go over the internet; or in a joined cluster...
E
You probably don't want to go over the internet either. So this proposal, or at least this part of the proposal, is about having the ability to have multiple load balancers (and I'm going to suggest we limit it to two): one for internal, you know, private network traffic, and one for public network traffic. And this bit in particular is talking about the separation of the different parts of Cluster API and how those should be controlled, in terms of which would go via which kubeconfig.
E
So the upshot of the changes I'm imagining here is that, rather than having a single kubeconfig produced, we would end up with two, an internal and an external, and then different controllers would consume different kubeconfigs, depending on where they were and what topologies they're in. So this is something that I would like to push forward and work on, ideally in tandem with the main load balancer proposal, and yeah, I think that's something we've got support for in general.
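To make that concrete, a hedged sketch of the dual-endpoint idea follows; none of these names exist in Cluster API, since the proposal under discussion had not settled on an API:

```go
package lbsketch

// ControlPlaneEndpoints carries one endpoint per load balancer.
// Hypothetical shape, for illustration only.
type ControlPlaneEndpoints struct {
	// Internal is reachable only on the private network (VPC-peered
	// traffic); in-cluster components such as the kubelet would use it.
	Internal string
	// External is the public-facing endpoint an administrator would use.
	External string
}

// KubeconfigSecretName picks which generated kubeconfig a controller
// consumes: instead of the single "<cluster>-kubeconfig" secret produced
// today, a cluster would produce one per scope.
func KubeconfigSecretName(clusterName string, internal bool) string {
	if internal {
		return clusterName + "-kubeconfig-internal"
	}
	return clusterName + "-kubeconfig-external"
}
```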
E
One of my concerns is whether this is adding too much complication to the load balancer proposal, and I don't know how others feel about that.
J
So I have a comment, or question, about the internal versus external: do we think that's the only model we want to support?
E
So the reason that I'm suggesting we limit it to two is that I know on platforms such as Azure, or at least as I understand it, a VM can only have two load balancers attached, an internal and an external, and they have that explicitly within their APIs. So, to try and keep it consistent...
E
Maybe we just stick to the lowest common denominator there. I think there are possibly other use cases that we might want to consider, but this seems to be the main one: this idea of having a load balancer that your intra-cluster components, so things like the kubelet, etc., would use, the internal one, and then having an external one that something outside of the cluster, like an administrator, would use. Have you got a specific example of something else you think might work in this sort of way?
J
E
Yeah, that could be interesting. I guess that detail has come from me trying to keep this generic and originally separate from the conversation with the main load balancer proposal. Obviously that is a detail that we haven't tried to tie together, but maybe that is a reason that we should not even try and do these separately, but actually should try and do them together.
E
So if we tie things like that together, in the case where you're suggesting you might have multiple load balancers and traffic would need to go to one of a collection: how would that work in terms of, like, a kubeconfig, right? Because you're going to have one DNS endpoint for the kubeconfig API server to go to, right? Maybe I'm getting too much into the details here. I mean, it's a good question; I haven't thought that far ahead. Okay, that's something I think we should maybe dig into.
K
Yeah, so I had a very similar question. I mean, the thing is: are we talking about the load balancers for the multi-master scenario currently, right? That is...
E
K
Yeah, so why should we not have a load balancer per cluster? What I'm trying to say is: a load balancer is ultimately a virtual thing, and we want a virtual IP, at least in what we are trying to build. We are trying to have one virtual IP per workload cluster, because the owners of the clusters are independent; they don't have shared billing or anything.
K
They do not want to share resources, and we do not want there to be some sort of larger blast radius: in case we want to upgrade one versus another, we do not want all of the clusters to be down because the load balancer is down, for example, and so on.
E
The concern here is that at the moment, for example, in a CAPI cluster, your kubelet talks via a public internet load balancer to the cluster's API servers; that's not the Cluster API servers, I mean the Kubernetes API servers. So you've got traffic from the kubelet going out to the internet and coming back in, and that not only is less secure, but also, you know, increases your egress traffic costs on the cloud provider. So, to prevent that kind of thing...
E
One of the things we're trying to do here is, if we can have an internal load balancer as well (and I am talking about AWS here specifically), then the kubelet traffic, as one example, can go across the private network and not incur any egress costs or anything like that, and, you know, there's a monetary saving there as well. So this has come up...
E
You know, in the past with OpenShift customers; we actually manage this separately there, but we think it's an important thing for Cluster API as well. So that's why I'm getting involved here.
K
To add to that, there is also this one-load-balancer concept, wherein the internal part of the LB can be the internal IP address, and the external can be, of course, the public-facing IP address. So have you thought about that: just using the internal part of the LB for internal traffic?
E
Yeah, I think, from my understanding at least, that's not common across all load balancer implementations, though, and so we're trying to come up with something that will work across multiple platforms. To do that, having two load balancers, I think, is a safe way that, across multiple platforms, we can guarantee a similar user experience.
K
Perfect, yeah, okay.
A
All right, thanks, folks. Any other questions on the UX of load balancers?
J
So this is just a summary of some of the open questions we have for how the Gateway API might be used as part of the load balancer proposal. Nadir points out that there's a different multi-tenancy model, and there are some links here. If people have questions or thoughts about this, we can discuss them in detail, but I think it's possible that the Gateway API, assuming the multi-tenancy models can be made compatible, can be just an implementation of the load balancer proposal, and so I think I agree with...
J
And then, if no one has any questions about that, I have another question, about questions: we're talking about meeting with Nick Young, who's working on the Gateway API, and since it's going to be hard for all of us to meet at the same time, we wanted to collect questions, so that maybe we can meet one or more times and have all of the questions together.
J
If you also have this one, do you want to go through it, or should I? So, that is actually my previous comment, then: I think the Gateway API can be an implementation. So, in Kubernetes, our access control is on a per-CRD basis, and so by having a Cluster API CRD that then creates a Gateway API CRD...
A
Awesome, cool. Any other questions on the load balancers? A lot today.
L
Yes, thank you. So New Relic has been using machine pools quite a bit. We've had a lot of success with them, and the feature's still experimental; there have been a few times when there were some breaking changes in the infrastructure providers and things like that, as these resources have stayed in experimental. And that's in a sense been a benefit: it's allowed them to iterate quickly and get to a good feature state, and it has definitely increased their usability.
L
But that said, the activity around those kinds of changes, around breaking changes, seems to be waning, so it seems like it's kind of organically reaching a semi-stable state, and I wanted to raise the question of what we could do. Largely my question is around logistics: I think this would involve updating the CAEP proposals for these things, for the MachinePool and the AWS machine pool, and then beginning a review process. Is that about right? But it's largely just a logistics question.
C
Yeah, I think the main thing that has held us back so far from moving it out of experimental was the big question around how we were going to handle maybe breaking it out into having, like, MachinePool Machine resources for the autoscaler. That's still on the table, and that's the issue that I think David and you are actually assigned to. I don't think David Justice is here, but last I heard he was working on a proposal for this, on the CAEP.
C
So I think we should tackle that first, before we start moving it out of experimental, just because that might be a foundational change for the MachinePool resource. So that's the main thing on my mind, but I think it'd be good if we start with an issue, maybe, for moving machine pools out of experimental, and then maybe track, like, a checklist of all the things that need to be done.
C
For that to happen, including linking to the other issues; I think there are a few about documentation that are still left, just so that it's at the same level of testing, documentation, everything, as the other Cluster API resources. But I agree with you, I don't think we're that far. Hopefully, if the MachinePool Machines thing goes well and that happens in the next release cycle, I'm hoping we can get machine pools graduated for the next big release.
M
J
M
Sorry, yeah, no worries, and you've got to deal with our dog, like, squeaking in the background there. So I just wanted to echo what Cecile was saying about the autoscaler: I would love to know what the future ideas for machine pools are with respect to the autoscaler, because, you know, I know there are some complications around how to make them fit into the current model, and also there's this other notion that maybe we should be developing a separate autoscaler.
M
A
Cecile, I will give that back to you, but I don't know if we... I think we should just discuss it in an issue, like, you know, what the future of machine pools with respect to the autoscaler is. I guess, for the future, it's mostly going to be, like, integration points, right, Mike? It's, like, how do we tightly integrate these things together, and yeah, maybe, you know, go poke around in the community to see what we think is right.
M
Yeah, I just think at the moment there's a slight mismatch between what we expect from the cluster autoscaler side and what machine pools could provide. You know, with machine sets and machine deployments it's a little more concrete; I think with machine pools we just need to work out a few details there.
E
Yeah, so this has come up in conversation recently with a colleague of mine, Mark, who I think is on the call. We were looking at a problem that we were having with machine health check where, over a period of time, a lot of machines got restarted or rebooted. These are actually Metal3 machines, so it was just, like, a reboot rather than a full reprovision, while they were doing some upgrades.
E
So there was some maintenance happening: each machine went down, and then, when they were coming back up, you know, they were getting rebooted again. And we noticed, actually, that, unlike other CAPI controllers, there's no concept of a pause for MHC, and we think that would be really useful for this kind of situation, where, you know, you're having some maintenance or something going on, and you know that the machines are going to have issues for a period, and you want to stay within machine health check; obviously you can delete it, but that's not ideal.
E
If you're, like, you know, you've got some GitOps or something in place, or, I don't know, there are various reasons you may not want to delete it, I think. So then we noticed that it's not there in machine health check. So, is this something that has been discussed in the past? Is this something that people would be happy for us to propose as an addition to machine health check? Has anyone got any other thoughts on particular use cases where this might be useful?
G
Yeah, so we actually, like, filed... I actually filed an issue for this, because we have use cases where, when you're using basically an active-active setup with storage replication, and you have, like, a failover happening via an external system, you still want to pause MHC so that it doesn't interfere with that system. If I recall correctly, we added support, like, there was support added, for the paused annotation at the machine level.
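For reference, the machine-level pause mentioned here works through the cluster.x-k8s.io/paused annotation. A minimal sketch of the early-return pattern, assuming the util helper from sigs.k8s.io/cluster-api; the surrounding reconcile function is illustrative, not actual project code:

```go
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha4"
	"sigs.k8s.io/cluster-api/util/annotations"
)

// reconcileMachine skips work while either the owning Cluster is paused
// (spec.paused) or the object itself carries the "cluster.x-k8s.io/paused"
// annotation; the per-MHC pause being discussed would follow the same shape.
func reconcileMachine(ctx context.Context, cluster *clusterv1.Cluster, machine *clusterv1.Machine) (ctrl.Result, error) {
	if annotations.IsPaused(cluster, machine) {
		// Nothing to do until the pause is lifted.
		return ctrl.Result{}, nil
	}
	// ... normal health-check / remediation logic ...
	return ctrl.Result{}, nil
}
```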
E
Yeah, so I did look into that, and I noticed there was the idea that it could check at an individual machine level, and, you know, I can see that working in some cases, like the one you mentioned there. But what we were suggesting, or what we were seeing, is, you know, like, when we know we're going to go through and do an upgrade to the whole cluster at once, that is then, like, a bigger thing, where it's like...
E
Okay, I can go and pause every single Machine object, but would it not just be easier to pause the machine health check itself? I don't know; it does sound like there is a workaround that we could use there, but it feels like it would also be helpful to have it at the machine health check level.
K
Pause, yeah. So I have a more fundamental question, to understand this: we are moving towards more of an immutable-infra-like mechanism. So if you reboot a node, it is not necessarily the case that the new node which comes in is an incarnation of the old node; it's just another node, a node left and a new node joined, and we try to do that. So what are your thoughts about that, and why do you want to keep state in a node and ensure that the same node...
E
...joins? So, I think in certain environments that's true: like, with a cloud provider it's very easy to do immutable infrastructure, but in a bare metal environment that's a much harder sell, right? So, you know, looking at some of the ways that OpenShift is using similar stuff here, it can take a couple of hours to completely wipe and clean a machine.
E
And then we have problems, and I think one of the things that came up from this is, you know, we looked at the node lease proposal that had come out, the node maintenance proposal, that could possibly be used here and might solve some of these problems. But I know that's also a little bit off as well, in terms of, like, that being delivered, right?
E
For upgrading, as far as I'm aware, that doesn't integrate right now, but someone may be able to correct me if I'm wrong.
B
Yeah, so I was just going to say: there are actually use cases in the cloud providers as well. Sometimes when you're doing upgrades, or cluster initiation, and you've got MHCs as well, you can see a lot of churn in your, like, EC2 instances or Azure VMs for a period of time, until the cluster sort of starts settling down. So it's not just the case for bare metal or Metal3 use cases.
B
A
And then Dane.
G
Yeah, so I, like, recall seeing some checks in MHC, so I did a quick scroll, and it seems like we're checking the paused annotation on the meta of the MHC and on the cluster, and once one of them is paused, we are basically just returning early. So yeah, it might be worth exploring whether that's actually working or not. The second thing was regarding upgrades.
G
Today there is, like, no coordination between MHC and whatever is rolling out an upgrade. So basically it's up to you, if you deem it necessary, to pause MHC or not, but I would argue that sometimes having MHC during upgrades is also valuable, especially when, basically, the bootstrap of a new machine fails and you want it to retry the upgrade process for you.
E
Yeah, so I saw the cluster-paused check, but I was also wondering about that. In that case, you know, if I'm doing an upgrade of my cluster and I'm, you know, changing a lot of stuff, rebooting machines, I then wondered whether it's appropriate to pause the cluster, because that would stop new machines coming up.
E
If I understand correctly, if you are rolling through, and maybe you've got the cluster autoscaler in, then that would bring up a new machine; but if the cluster's paused, I think that breaks that. So I was trying to avoid having to pause the whole cluster in this scenario.
G
E
Yes, okay, fine. In which case, I had not noticed that particular part of it, and it is already there. In which case: thank you.
L
Dane, you're next. Just trying to understand the problem more, and to share an experience: in some of our larger clusters we've had to, and maybe there was a better way to stop machine health checks, but we had to delete them. That's how we handled it during large scale-up operations, because the churn would become so bad, the churn that was mentioned earlier, that the cluster could never...
L
I never had time to dig too much into it, and we had to work around it, but this sounds very similar to what was just described, at least in some way.
L
I don't know if pausing is the right answer (I mean, it's effectively what we're doing by deleting and then recreating machine health checks), or if this is fixable by having some kind of grace period on the machine health check. I'm not sure what the better way is here.
A
I think, instead of pausing... you can definitely, instead of deleting, you can definitely pause the machine health check, especially if your machine health checks are spanning across clusters, which is possible, because the labels could be shared.
A
An amendment of the current proposal, maybe, to add either, like, some sort of strategy for how we go about impacting a single cluster, or, maybe once we get ClusterClass merged, we can also think about groups of clusters in the same class.
A
G
One, one quick thing: I agree that this is a separate issue, and, like, for now I think that pause can at least solve a subset of the use cases. But I think that, for example, Nadir's comment is pretty accurate, because sometimes, and we've experienced that, MHC can be too aggressive.
G
A
Cool, we have two minutes left and three more topics. Zach, I see the clusterctl operator, like, that's for an update, but this was covered in chat. It's okay if I skip that; Zach had to drop. Okay, perfect. Oh yeah, okay, there you go: we got the update on the autoscaler from Mike, it needs, yeah, a little bit more time. And, are you still here?
N
Yes, yeah. I just wanted to say, I just wanted to add a link to the managed external etcd proposal, and it would be great if everyone can review it. Also, can I add it to the list of open proposals above, or, like, in the same doc?
A
Well, thank you so much, folks, for joining today, and for the great discussions. See you all next week; have a good one, bye everyone.