
A

All right, yeah, hi everyone. So, as you know, this meeting is recorded and will be uploaded to YouTube. I was thinking that we could go through the initial roadmap that we had. Let me try to present.

A

Can you see my screen? Yes, you're presenting your entire screen.

A

Like my one entire screen, right? Or my whole everything? All right.

B

It's a whole window. Oh sorry, a whole screen.

C

Yeah, we can see Finder on it.

A

Oh okay, why did that happen?

A

You can see that on the wood version.

A

Stop sharing.

B

Now we can see the dogs okay, no, no, no, no.

A

Anymore, oh now I see an empty desktop. Okay, now yeah.

C

There you go.

A

All right, sounds good. So I tried to sketch a roadmap, and maybe roadmap is a strong word at this point; it's mostly headlines and high-level ideas of what we want to pursue. We want to fine-tune those, and as I discussed last time, we have three of what I was calling pillars, or work streams.

A

If you check the outline here, we've got the Job API, job management and specialized hardware, and one thing that we discussed last time was actually trying to determine the state of the art: where are we right now?

A

So that's one of the things that we could do as well, and specifically with regards to job management, the suggestion was to start by inviting people who already built frameworks on top of Kubernetes to manage jobs, like Volcano, YuniKorn and kube-batch, plus the plugins we have in the scheduling framework, to present their frameworks and inform us. The outcome here is to have some sort of collection of

A

what Kubernetes is missing, what exactly the gaps are that they're trying to bridge, and what things we believe should be pushed upstream.

A

So I don't know if discussing the state of the art applies to all work streams; I think in the third work stream it does as well. Marlo from Intel, I think, is already working on a doc to document the state of the art related to the current state of what kubelet supports.

A

I think stuff related to the CPU manager, memory manager and the like.

A

I think if this could apply to the Job API, we could try to discuss the state of some common custom operators like MPIJob, the TF operator, the Spark operator.

A

And try to understand what the missing features in the Job API are that forced them to build a custom resource, and in all cases, whether they built a custom resource or not, how they can use the Job API as a building block.

C

And I think.

A

That the.

C

Those different solutions that you showed in the second section would actually influence what's in here as well. As we learn about these different solutions, I think they will highlight some additional features that weren't there, why these systems were built, and how we can add to this new API, or enhance the API, which is probably a better way to see it.

A

As action items here, I'm going to write some rough notes; I don't know if I should write them here or in the agenda doc, trying to formulate this a little bit more. For example, here: invite

A

maintainers to present their frameworks and discuss missing features.

B

Maybe.

A

We can include some sort of.

B

Template.

C

Yep, I was.

B

Thinking.

C

About that too, that's a great idea.

A

Template of what.

B

Basically, I don't want this to turn into a show where everybody is trying to sell their product. We want to answer specific questions, right? So in the template we could have things like: why is X not enough? Why did you introduce this? Or maybe: what's your target audience,

A

uh

B

If, if there is any.

A

Right, I would say, more the modes of operation.

A

Gaps, and upstream gaps that we can work on, I

B

guess. Yeah, for example, one thing I'm thinking is that some frameworks introduce their own job, right? Then you come back to the original question of the other work stream: what is missing in the Job API. Yes.

A

I'm gonna switch to my other account.

D

Here, because.

C

I think it would be maybe helpful.

C

To have in there.

C

Kind of like the motivation and the architectural design of some of the features: the queuing piece, if there's queuing there; the scheduling pieces of it, and what policies they have. You know, just some

E

Insight.

C

into their design decisions.

A

So I think I'm going to assign this to Aldo: propose a template, and

C

um I would be happy to help in any way.

A

um What was the template for presentations.

A

Okay, that sounds good. I think this covers it for this initial step towards what we want to do with this work stream, but this is a first step that would inform the next ones. Do we want to discuss more here, beyond this, at this point?

D

I think another thing to include in the template is how they support the current job frameworks or job operators. As far as I know, Volcano did a lot of integration work to support current stuff, so in the evaluation we might want to evaluate how that kind of integration behaves.

D

So I think the design goal is that we don't want to do the integration for every single job controller; we want the integration to be much more smooth and extensible, because otherwise we have to maintain a large pool of supported operators, and that is not extensible.

A

Right, I think this is a good high-level idea: whether, for example, they propose a job API, and anybody that wants to use it has to somehow model their workloads using that API; or there is some sort of, again like Volcano, annotation that defines the group of pods, yeah.

E

After they are getting created and.

A

Yeah, that's a good point. I mean, support for custom workloads is extremely essential; as I mentioned before, the Job API is lacking a lot of features, and it's not enough to just say, okay, assume everybody is going to use it and that's it.
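
As a concrete sketch of that annotation pattern: Volcano (and kube-batch before it) group pods through a PodGroup object plus an annotation on each pod, rather than a field on the core Pod API. The manifest below is illustrative only; the exact field and annotation names should be checked against the Volcano documentation:

```yaml
# A Volcano PodGroup declaring that at least 3 pods must be
# schedulable together before any of them start (gang scheduling).
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: spark-pi
spec:
  minMember: 3
---
# Pods opt into the group via an annotation -- the pattern
# discussed here, where grouping lives outside the core Pod spec.
apiVersion: v1
kind: Pod
metadata:
  name: spark-pi-driver
  annotations:
    scheduling.k8s.io/group-name: spark-pi
spec:
  schedulerName: volcano
  containers:
  - name: driver
    image: spark:3.1.1
```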

A

Sounds good, then. This is a really good first action item on this work stream.

A

Although, are you happy with this one? I'm putting you on the spot here, but only because you proposed having a template: to collaborate with Diana.

B

Yeah, if Diana wants to start this, great, because code freeze is coming and I need to work on my feature, but okay.

C

Sure, I'll take a first pass and then reach out to you and kind of go through that.

B

Thank you.

A

Any comments or thoughts? Would you like to discuss more here? I can go through these; there's a few things that I had on my mind, but if you have something else, we can add it now, I think.

D

I think another angle to look at this, and a body of work to look into, is whether we can have a standardized metric, or something else that can evaluate how good this kind of integration is for each third-party operator, like the Spark operator: given this size of cluster, and given these kinds of resource claims, how good is it? What measurement methodology can we use, and what kind of tooling are we missing to measure

D

how good one scheduler is versus another? If we can have this kind of stuff, we will be, I mean.

A

I think... I'm not sure we are trying, as a working group, to grade schedulers.

A

It's more about trying to find common ground that can be pushed upstream without influencing too much what the different frameworks are trying to do. We're trying not to impose a specific model, because if we define a metric that says, okay, this is good and this is bad, it may well influence a specific design, and that metric may not apply to every single

A

type of workload that these frameworks are targeting. So I'm not sure

D

what we gain by... Yeah, I got your point. I say that just because, in terms of technical decisions, for example, when you evaluate different tools, one question coming from your upper management is: give me some sort of results, why you chose this over that. If there can be a metric proving that, that would be beneficial. But if it's not in scope, then each team evaluating these tools will have to work themselves to come up

D

with this kind of number. Yeah, I'm fine with that; if it's out of the scope of this working group, that's totally fine.

A

I mean, definitely, again, we can't expand the working group into too many different directions.

A

Yeah, I would suggest we start, again, by first trying to learn about these frameworks, try to document the different modes of operation that they employ, and then work our way from there.

D

That's true.

C

I do think it's a good idea to think about evaluation, only in the sense that eventually, when we come up with something we can upstream, we'll need to evaluate it through the normal Kubernetes performance evaluation, right? So it's good to think about how we evaluate the design, so that we can prepare for benchmarking as we move things upstream.

D

Yeah, I think eventually it would be good to have that. It's just that at this phase, as I mentioned, we have a lot of work to do; it's maybe not a good idea to expand the discovery too much.

A

So, I'm just trying to make it a little bit more concrete here: what is it that we're trying to do with this work stream?

A

I guess the group that joined today is more interested in this work stream. So here I'm saying: identifying patterns or proposing enhancements to job-level queueing, scheduling, provisioning and auto-scaling. Do we.

A

And I guess we can go through these points, and maybe that will make it a little bit more concrete. For sure, one is converging on a pattern that can potentially be supported natively by upstream Kubernetes. Do you feel that we can get to a point where we will have queuing support upstream in core Kubernetes, or do you feel this is a losing battle and the focus should be on.

D

I think it's technically possible.

C

I think it's possible, and I'm very encouraged about that. As you can see, there's already demand, people needing it, right? So I think if we do a good job, we'll be able to alleviate a lot of that demand.

B

So I guess the question is whether we think there would be one true answer to queueing in Kubernetes, or whether it would be: the job defines a queue, and however you want to implement the queue, you have multiple options, right?

C

Right, um that's what I'm hoping we can design something like that.

B

Okay right so yeah, I think at this point it's too early.

E

To.

B

to answer: where do we want to be?

A

Right, yeah, I'm just trying to get a sense: are people motivated to actually do something upstream with queuing, or is it more like, okay, let's just work with existing solutions that are built on top, replicating a ton of work?

C

So I can share just my experience, and just for clarification: I'm one of the lead developers on MCAD, which has some queueing capabilities, and we've been working with various people within the company and on-prem, and there's definitely a demand for queueing solutions.

C

Because the big motivation is that many times you'll have cluster-workload-focused applications on clusters, and many of the teams now want to colocate mixed workloads, solving the problem of capacity issues with or without cluster auto-scaling.

A

Right right.

C

So I've worked with many teams that are trying to solve this problem, where you can mix different types of batch workloads and have them coexist, setting policies around it, like priorities and quotas, and even using multiple clusters. So there's a demand, just from my own experience working with this, and I'm very excited to see this, because some of the features that we were using,

C

I see them already as we are talking about this, so it would be nice to have this upstream. If that helps, I'm just sharing that.

A

So the thing is, at least, again, wearing my personal hat here, not my co-chair hat:

A

this is also what we're trying to do with Kueue: not necessarily enforce a specific implementation, but reach a set of APIs, like a workload API and a queue API. That's why the thing we try to share is an API doc with use cases. You can implement it

A

multi-cluster or single-cluster, auto-scaled or not, it doesn't matter, because the way that I guess we want to model capacity is not by looking at existing nodes, but by saying: okay, here's an object that defines how many resources you have, right, and here is a workload API that you can queue and have consume that capacity.

A

Now, how do we implement those in the background? It would depend; it could be custom, you could have multiple implementations for it, I guess.
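
The capacity-object idea described here might look something like the following sketch. All the names in it (the group/version, `ClusterCapacity`, `QueuedWorkload`, `capacityRef`) are hypothetical illustrations of the shape being discussed, not an agreed or existing API:

```yaml
# Hypothetical: an object declaring available capacity, independent
# of the nodes that currently exist in the cluster.
apiVersion: batch.example.dev/v1alpha1
kind: ClusterCapacity
metadata:
  name: research-pool
spec:
  resources:
    cpu: "2000"
    memory: 8Ti
    nvidia.com/gpu: "64"
---
# Hypothetical: a workload queued against that capacity; it would
# start only when its requests fit within the pool's remaining quota,
# regardless of whether the nodes are auto-scaled or pre-provisioned.
apiVersion: batch.example.dev/v1alpha1
kind: QueuedWorkload
metadata:
  name: training-run
spec:
  capacityRef: research-pool
  podSets:
  - count: 8
    requests:
      cpu: "4"
      nvidia.com/gpu: "1"
```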

C

Yeah, exactly. I kind of think of it, and I haven't thought it all through, but kind of like the scheduler now: you have a plug-in model where you can add and implement multiple policies or multiple solutions.

C

So if we design something where you take these pieces, put them together, and then apply your objective based on the policies that you enable, I think it could solve a lot of folks' problems.

E

So one of the points I'm curious about is choosing between whether to implement something in-tree or out-of-tree. If some of the concepts basically correspond one-to-one to in-tree concepts and APIs, like quota, like the batch/v1 Job, so you're basically functionally equivalent, but you're hoping

E

this is something that's more suitable for batch workloads, then does it make sense, instead of introducing something brand new, to extend the current concept that's more geared towards serving applications, and put that in as its batch counterpart? For example, in some other conversations people were asking: hey, does it make sense to have, instead of StatefulSets, stateful jobs?

E

Does it make sense, just because they're very much aligned with the abstraction level of existing native Kubernetes objects, right?

A

So we had these discussions. There are two things that you mentioned here, two really good examples: the Job API and the quota system. Let me start with the Job API. What we do have now is a StatefulSet-like job, so to speak: the Indexed Job. It behaves like a StatefulSet, but the pods can terminate, so it better matches batch workload behavior. Now, even though we are trying to enhance the Job, again, that is the purpose of the first work stream, it felt that at some point it will not be possible to force every single type of workload to be modeled using only a single batch/v1 Job. There will be special cases, like Spark: Spark is going to be extremely hard to model as a batch/v1 Job.

A

It has some custom setup, it needs its own operator and lifecycle management, so we need to accommodate those. And those are not a small group of workloads.
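
The Indexed Job mentioned a moment ago is a real batch/v1 feature (`completionMode: Indexed`). A minimal sketch of how it gives pods a stable identity while still running to completion:

```yaml
# Each pod gets a stable completion index, exposed via the
# batch.kubernetes.io/job-completion-index annotation and the
# JOB_COMPLETION_INDEX environment variable -- StatefulSet-like
# identity, but the pods terminate when their work is done.
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-demo
spec:
  completions: 3
  parallelism: 3
  completionMode: Indexed
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo worker $JOB_COMPLETION_INDEX"]
```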

A

Our hope is that most of them would try to use the Job API as a building block, but I don't think we're going to get there 100%, so we want to provide that middle abstraction

A

that still allows us to start and stop these jobs and provide them resources, but not dictate how to deploy them, how they run, and whatnot.

A

I'm not sure if I'm being clear here, but it's not going to be one type of job API that fits everything. Now, with regards to quota:

A

the current quota system, it's called quota, but it is mostly designed around protecting the cluster from falling over. It's not designed to queue requests, drain them based on available quota, do fair sharing, and whatnot.

A

It is implemented in the API server itself, at admission: when you create the object, it decides whether to drop it or not.
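
Today's ResourceQuota illustrates that admission-time behavior; a sketch (the numbers are arbitrary):

```yaml
# A standard ResourceQuota: enforced synchronously at admission.
# An object that would exceed these hard limits is rejected
# outright -- it is never queued to wait for quota to free up,
# which is exactly the gap being discussed for batch workloads.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    count/jobs.batch: "50"
```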

A

It does not match at all the batch requirements of queueing, quota management and fair sharing, or the requirements related to fungibility and auto-scaling. That is a beast of a set of requirements to try and convince the community to push into the current quota system. But I'm optimistic that we could have something designed for jobs that we could push upstream.

A

I'm not sure it's going to be exactly the same, but I think those are two different sets of requirements.

E

Yeah, I agree, right. What I was alluding to is not necessarily extending the current Job API or the current quota; but rather, if these are things that are basically one-to-one equivalent to their non-batch counterparts, then you can think of them as core, and if they're core, it kind of makes sense to push them directly into core. I guess I've been using that as a criterion.

A

Yeah.

C

So, sorry, I'm not sure: is this a 30-minute call or an hour call?

A

It's supposed to be 30 minutes. Okay, I.

C

mean, I thought so, because I was planning on it; I have to join another call, but yeah.

A

No, I think we should always be on time. I can extend this meeting to 45 minutes, but it seems that this one is less popular than the other meeting. Last week we got, as I mentioned, 17 people on the call, so we'll see how this goes forward.

A

Okay, so I guess: please take a look at the roadmap, and again, make suggestions and edits on how we can move forward with this. Perhaps next week, in the other, longer time slot, we can go through the Job API and the specialized hardware again.

C

Okay, sorry, I have to drop now, but thank you so much.

E

Thank you.

C

Thanks.

E

See y'all take care.
From YouTube: Kubernetes WG Batch Weekly Meeting for 20220310
