From YouTube: Kubernetes WG Batch Bi-Weekly Meeting for 20230622
A: Today is June 22nd, my name is Maciej, and I'll be your host for this Batch Working Group meeting. Today we have a couple of topics, but first a quick announcement: a Kubernetes AI + HPC Day is happening during KubeCon in Chicago, November 6th to 8th; if I remember right, the title is something along those lines. The call for proposals is open and due on August 6th, so you roughly have a month and a half to submit your proposals, and to sign up for the AI + HPC Day if you're interested in attending.
A: Hearing none, okay, I'll move on to the next item. In that case, Marcin, do you want to share your screen?
A: Isn't there that problem where, when you're sharing the screen, you can't talk? I remember that they fixed it. Okay.
C: Okay, there is another button that was moved somewhere else... okay, okay, perfect. So today I would like to talk about the ProvisioningRequest CRD. I have an AEP proposal in this link here, and yeah, that's the topic I want to cover today. So, the motivation: currently Cluster Autoscaler does not provide any way for us to express the fact that a group of pods would like to have capacity provisioned for them atomically, before the pods are created, and there are a lot of situations where this may be relevant for users. A couple of them: for example, users might want to have all-or-nothing semantics for a scale-up, and currently, due to the interactions between the workload controllers (be it the job controller or others), kube-scheduler, and Cluster Autoscaler, the VMs will be provisioned in a step function anyway. So in each step some of the VMs are there and sit idle, basically because the pods cannot proceed with their workloads without all of the VMs. And there is also the end-to-end latency of scale-ups: some of the workloads may be bursty, and for the same reason users will actually see bigger latency with each Cluster Autoscaler scale-up step. Cluster Autoscaler will just see some of the pods and send one request to the cloud provider to create some VMs, and the latency of each step will be bounded by the slowest VM. So if we want to have, say, 600 VMs, this can take roughly six times the latency of a single request for 100 VMs; there is some latency that we can cut off there.
C
So
this
is
my
motivation
for
for
this,
and
we
are
proposing
like
an
API
which
will
we
call
provisioning
request
crd,
which
is
custom
resource
object
and
would
be
a
nice
namespace
object
where
users
can
describe.
The
group
of
the
pots
would
be
created
and.
C
Like
that,
they
want
to
have
a
capacity
for
for
them
provision
in
a
particular
Manner,
and
it
should
contain
like
the
template
of
the
pots
and
how
many
pots
there
are,
and
possibly
the
different
pot
templates
for
for
different
within
a
one
one,
provisional
request.
So
we
can
imagine
like
I
wanna
have
10
tops
like
10
parts
that
will
be
running
a
job
and
One
controller.
That
will
be
just
scheduling,
some
some
specific
tasks
to
them
and
it
should
allow
admins
to
like
support
different
implementation
of
provisioning,
the
capacity.
C: So we can try to have an atomic one, and we can try to have other modes besides atomic, maybe the step function, but made configurable in a way that says "I'm happy with at least such-and-such VMs". This is an overview of how the YAML of one of the provisioning requests would look.
C: So we would have some name, some provisioning class, in this case the atomic scale-up, how long we can wait for the VMs to be brought up, and one pod set, which would consist of 20 pods.
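(For reference, here is a minimal sketch of what the object described above might look like. This is a reconstruction from the talk, not the AEP's actual schema: the API group and version, class name, parameter name, and pod-template reference are all assumptions.)

```yaml
# Hypothetical ProvisioningRequest reconstructed from the talk; the AEP
# linked in the meeting defines the real schema.
apiVersion: autoscaling.x-k8s.io/v1beta1
kind: ProvisioningRequest
metadata:
  name: atomic-example
  namespace: default
spec:
  # Which provisioning implementation should handle this request.
  provisioningClassName: atomic-scale-up.autoscaling.x-k8s.io
  parameters:
    # How long to wait for the VMs to be brought up (assumed parameter name).
    maxScaleUpTime: "10m"
  podSets:
  - count: 20                 # one pod set consisting of 20 pods
    podTemplateRef:
      name: worker-template   # template describing those pods
```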
C: Yeah, any questions so far? Hearing none... so here we have the generic atomic scale-up, and within the AEP I am also proposing a generic check capacity, which would basically be a one-off check to verify that within the cluster there is enough capacity to provision a given set of pods.
C: This is not a real guarantee: we will not block this capacity for them, so there is nothing preventing other workloads from being scheduled onto it. But it might be useful in scenarios where we have only one workload admission controller, for example Kueue. In that scenario we don't expect any other pods to hijack this capacity, so if there is capacity, you can just assume in the Kueue loop that it is there after the check. The other class is the aforementioned atomic scale-up, and yeah.
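(As a hedged illustration, a check-capacity request would presumably differ from the atomic one only in the class name; the exact name is again an assumption.)

```yaml
# Same object shape, but only asking Cluster Autoscaler to verify that
# capacity for the pod sets exists; nothing is reserved or blocked.
spec:
  provisioningClassName: check-capacity.autoscaling.x-k8s.io
  podSets:
  - count: 20
    podTemplateRef:
      name: worker-template
```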
C: The atomic scale-up would span, for example, different node groups, so different types of machines and whatnot. If we succeed, then we pass this information to the user; if we fail, we remove all of the partially provisioned VMs and retry later in some exponential-backoff manner, within the duration that users can specify, as mentioned in the example. What is also important is that cloud providers can provide specific classes. So, for example, one of the cloud providers might have a better API to provision VMs, one where we can go and say: I want a hundred of those VMs; if you can provision them, provision them atomically.
C: If you can't, just let me know and I will retry later on. This kind of logic will allow users, for example, to not pay for the partially provisioned VMs for the duration before we delete them. And yeah, the proposed lifecycle of the object is as follows. First, a user, or a framework such as Kueue, creates the object. Then Cluster Autoscaler picks it up, chooses a node pool, or more than one, and tries to create nodes; this is...
C: ...this is, for example, the atomic scale-up. So in that mode we provision nodes, while in the check capacity mode we just check whether the capacity is there. Later on we pass this information to users through conditions: whether we were successful, whether we are retrying, whether there is enough capacity or not. At this moment, for the atomic mode for example, if we have successfully created the VMs, users can create pods and mark them with a specific annotation saying that they are consuming this specific ProvisioningRequest. If all goes well, those pods should be scheduled on the dedicated capacity that was provisioned, and once all of the pods are scheduled, users can delete the ProvisioningRequest; otherwise it will be garbage collected. Yeah, I see there is a question; happy to answer it now.
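(A sketch of the consuming-pods step of that lifecycle. The annotation key below is an assumption based on the talk; the AEP specifies the exact one.)

```yaml
# Hypothetical pod consuming the capacity provisioned for the request above.
apiVersion: v1
kind: Pod
metadata:
  name: worker-0
  namespace: default
  annotations:
    # Marks the pod as a consumer of a specific ProvisioningRequest.
    cluster-autoscaler.kubernetes.io/consume-provisioning-request: atomic-example
spec:
  containers:
  - name: worker
    image: registry.example.com/worker:latest   # placeholder image
```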
E: Yeah, thanks for presenting. I had a question: you made a comment that users don't pay for partially provisioned VMs. Can you describe that a little bit more?
C: Yeah. So one of the modes we are proposing is the generic atomic scale-up, and in this mode users do pay for the partial provisioning, because we try to get them the capacity, and if we cannot get the capacity in full, then different cloud providers may give us only partial provisioning, and users will pay for that capacity, depending on the provider. But there may be a specific cloud provider API to atomically scale up, and then, within this specific class, users might not be required to pay for partial provisioning, because it will simply not happen.
E: It's up to the cloud provider how they, you know, give you the resources, or whatever the SLAs of that resource group are. Got it, yeah.
C: Yeah, cloud providers may have different provisioning classes, and those might have different guarantees about the atomicity.
B: One comment here, with a little bit of explanation of how Cluster Autoscaler is organized. We have an interface for providing resources in Cluster Autoscaler that is currently implemented by about 30 cloud providers. We will add something new to the interface; however, it will take time before all of these thirty-plus cloud providers respond and implement the API, and some of them may not have these capabilities available yet. So, as a stopgap solution, we want to provide something semi-atomic, like this generic scale-up.
B: That gives you some of the possible atomic experience, but in a generic, non-cloud-specific way. Some providers may have something better already, and in that case they will implement this API differently, using some different way to expand the number of nodes in the cluster, and then we will be able to provide better, stronger guarantees that once the...
B: As Google, we want to integrate it with what we have on our end. So the idea is that soon there will be a Google-specific implementation and a generic, semi-atomic implementation that should work for most clouds, and there will be an interface for other cloud providers to follow, so that they can use whatever private or public APIs they have on their own end.
G: Hi, can you hear me? Hey, thanks, Daniel, for presenting. We actually have a few users who are interested in this type of gang scheduling, I guess you could call it, so we're very, very interested in this. Going forward to your next slide, about the way it works:
G: Is there a way, maybe with a controller, where you could specify the pods along with the provisioning request, and then, once the capacity is available, those pods get scheduled? That would remove the need to submit the pods and then remove the provisioning request. If that makes sense.
C: Yeah, you mean like a fire-and-forget mode: you could create the pods and the provisioning request at the same time, kind of, and those should just happen along; once the capacity is provisioned, those pods should be scheduled, well, once we have the nodes for them, yeah.
G: I guess some abstraction here to say: here's my real pod spec; once the capacity is available for all of them together, schedule them all together.
B: That would be problematic, because that would require an additional step on the scheduler side. Right now, in this picture, we've got only Cluster Autoscaler, which reacts to scheduling decisions, right? So you would like to somehow block the pods, possibly with scheduling gates, and only unblock them once the provisioning request is there, right? Something like that?
B: So that's possible; however, possibly as a responsibility of some additional controller.
B: That's not rocket science, but it probably won't happen in the MVP; it's not for everyone. But yeah, it's something that could possibly be added to the proposal as, like, step number two.
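(For illustration: the blocking described here could use Kubernetes pod scheduling readiness, beta as of 1.27. Pods would be created with a scheduling gate, and the hypothetical extra controller would remove the gate once the ProvisioningRequest reports success. The gate name below is made up.)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-worker-0
spec:
  # The pod stays in the SchedulingGated state until a controller removes
  # this gate, e.g. after the ProvisioningRequest is provisioned.
  schedulingGates:
  - name: example.com/wait-for-provisioning   # hypothetical gate name
  containers:
  - name: worker
    image: registry.example.com/worker:latest
```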
D: If I may answer the question from the Kueue perspective, this is one feature that we can easily integrate with from the Kueue controller. If you're not familiar with it, the Kueue controller does job scheduling, and it supports some APIs, in particular the Job API. So if you use the Job API, you submit a Job, Kueue can create a provisioning request for you, and once the provisioning is ready, Kueue admits the Job and all the pods are created.
D: That's one controller that will make use of this API, making it seamless for the researcher, let's say.
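(From the user's side, that flow would look like an ordinary Kueue submission; the ProvisioningRequest would be created behind the scenes. A sketch with placeholder names; the queue-name label is Kueue's standard way of pointing a Job at a local queue.)

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: training-job
  labels:
    kueue.x-k8s.io/queue-name: team-queue   # placeholder local queue
spec:
  parallelism: 20
  completions: 20
  suspend: true        # Kueue unsuspends the Job once it admits it
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: registry.example.com/worker:latest   # placeholder image
```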
B: Yeah, that is the plan. We want to have it as soon as possible, and we want to make this API kind of a standard. Right now, as SIG Autoscaling, we are working with the Karpenter folks on convergence plans between these two autoscalers, and the idea is that this API will be supported not only by Cluster Autoscaler but also by Karpenter. So hopefully it becomes kind of a standard way of asking for capacity in your cluster.
A: I have a question related to the API, which is kind of related to what Alex was asking about. Looking at the ProvisioningRequest, the thing that stands out is that I need to copy the entire pod spec from every other resource that I'm working with.
A: Can we make the API a little bit more generic? For example, we could provide some kind of pod selector, so that it knows what the type of pod is and can read that pod from the cluster, getting the information from the cluster rather than me copying the entire pod spec from the Job that I'll be creating, or from whatever other resource type I'm going to be using it with. Or, I don't know, maybe provide a reference to the type? Although the downside, if we went with a controller reference, is that you would have to read the actual pod template from the referenced controller...
A: ...rather than copying it, and then eventually provide the pod spec as a fallback option if you cannot read that information off of the cluster anywhere.
B: We thought about many possible use cases, including using pod templates, which are also top-level objects in Kubernetes, and referencing them in the API, and that creates various types of problems: the pod specs can be changed, so they need to be monitored, and if you want to monitor all the possible referencing objects, like, I don't know, Job or Deployment, then you expand the number of watches that are required to implement these features.
B: So all of these things that you mentioned are possible, and they are not rocket science, but it would probably be better to do them after the MVP and the initial launch, and then see whether there is a real, strong need for them, because they will complicate the picture and the implementation. Maybe people can live with copying the templates and, in that way, make the life of API implementers a bit easier.
B: But anyway, if you have strong feelings that it should go the other way around, please add them to the proposal.
A: Daniel, one thing: the presentation, the slides that you shared, I tried and I cannot access them. Can you share them with the batch working group mailing list?
C: Okay, I will quickly update the link in the docs after the meeting.
A: I don't hear anything, so I'll probably just bump this topic over to next time. Does anyone else have any other discussion points that they want to bring up with the group?
A: Okay, hearing none, with that I'm going to give you back about 15 minutes for the first time. Thank you very much for today, and see you next time. Bye. All right.