
A

All right. Hi everyone, and welcome to the first meeting of the Batch Working Group. As you may all know, this meeting is recorded and will be uploaded to YouTube, so please adhere to the Kubernetes community standards in meetings.

A

Let me start sharing the agenda that I had in mind for the first meeting.

B

All right, can you all see my screen? Yep, we can see it. You can make the font a little bit bigger.

A

I can make it bigger, yeah.

B

It would be nice if you zoomed the page; it would be much more readable. Oh yes, it's like less than half of the page.

A

Right. There you go. Oh, perfect. All right, so welcome to the first meeting. As you may all know, this working group will be focused on enhancing batch support within core Kubernetes, and we have a charter. I hope that you've looked at the charter, which defines the scope of the working group. I thought we could discuss the charter today as well.

A

As you may all know as well, we've got a couple of meetings. We have a meeting every week, but they alternate between two time slots. The first one is like today, on Thursdays at 7:00 a.m., and hopefully this is friendly to Europe and East Asia. The other one is at 3 p.m. Pacific time, also on Thursdays. We've got the calendar here as well, and we'll be uploading these recordings to our YouTube channel.

A

We have the link here as well. We've also got a Slack channel within the Kubernetes Slack: it's wg-batch, so please feel free to ask questions there. The organizers:

A

Let me open this link. I'm just one of the group. We've got Wei; he's a SIG Scheduling lead. We've got Danielle; I don't think she's with us today, she couldn't join. She's an expert on node performance. Maciej as well; he's well known in

A

SIG Apps and did a lot of work on the Job API in the past, and CronJob as well. And we've got Swati, our expert on HPC in general; she's co-leading the effort to improve support for NUMA-aware scheduling within Kubernetes. So we've got a really nice group of organizers, but I want to emphasize here that the organizers are mostly moderators.

A

We want to try, as much as we can, to

A

move wherever the community wants us to focus. It's not that we're going to impose any direction per se; rather, we will try our best to understand where the community wants to tilt, and use our expertise in these various SIGs to help understand how we can move forward with the batch mandate, or batch charter.

A

Okay, so with that said: welcome and introductions. You probably know my name from the screen; my name is Abdullah, and I'm a SIG Scheduling lead. I've also introduced my co-chairs. I think we have a lot of people, so I don't think we can do a lot of introductions today; it would take a long time. But please, if you want, introduce yourself in the Slack channel. It would be nice to understand what your

A

interests are in joining the batch working group, so that it can influence our agendas in the future. In general, if you have any topics that you want to discuss, similar to typical Kubernetes meetings, please feel free to add an agenda item to the doc for the next meeting. So, how we operate: as I mentioned, we want to make sure, as much as we can, that we advance the batch use case.

A

We want to define high-level pillars that we want to advance so that we stay focused. Batch is a massive area, so we want to make sure that we are efficient in the way that we operate. We want to identify

A

a number of issues that we want to address at the higher level. As mentioned in the charter, our first exit criterion as a batch working group is to define recommendations for enhancements that we want to apply to core Kubernetes, so that the respective SIGs would execute on them. I do believe that we can start executing before we exit. We want to, for example, discuss enhancements to the Job API.

A

It's not that we're just going to write that enhancement proposal or open an issue, and that's it. It would be great if we could start making progress on that as well. At the end of the day, we want to be efficient and make progress, and that's why I would like to start by discussing a high-level roadmap, and then maybe we can even document it.

A

The charter just sketches very high-level areas of interest, but we need a much more concrete roadmap. Before I get to that, I want to pause here and see if anybody has any questions related to this long introduction.

C

I have one. Part of what I've been working on is trying to figure out where the kubelet should be going, because it's not very mutable as far as the underlying resources go, unless the features are specifically added to the kubelet. It looks like there's some movement, and a lot of interest, in getting a resource management plug-in model, as opposed to the current model, where any resource management items have to be put in tree.

C

Where would that fall? Is that of interest to this group, something that I should be bringing back?

A

So, are you referring to the dynamic resource allocation that Patrick is working on?

C

No, I'm talking CPU and memory, not just regular resources.

A

Is there an issue open, or a KEP already open, being discussed?

C

There's some discussion in channel. There's not a KEP at this time, because I'm still trying to get together the requirements and what sorts of solutions would be okay. When I spoke with Derek Carr and Elana Hashman, they were also interested in this, but it would require quite a bit of re-architecture of the current kubelet in order to do it.

A

Right. So this is a great topic to discuss under the pillar of removing friction in using special hardware, and even, in your context, probably... I'm assuming you're referring to some tunings at the CPU level, right?

C

CPU level, and that includes NUMA, power, and memory placement, right.

A

Yeah, I think it fits under this pillar. But before that, I wanted to try to get some sort of consensus or agreement on how we operate across these three pillars that we have discussed in the charter. What you mentioned could fit under this pillar, but I don't know if the group has the capacity to invest in that direction. I don't want to impose that, so it is definitely something we can discuss.

A

It would be better if we had at least an issue within core Kubernetes that you can reference, so people can get better context on what exactly you're proposing we pursue.

C

Yeah, I understand. Okay, there is a CPU management doc, but that's pretty all over the place. It's not concise.

A

Right. And in my mind, CPU management, memory... doesn't that also fall under that in some sense? Or it's probably more, right? It's also about pinning.

C

It's NUMA, and what types of CPUs, which will include the stuff that you are working on, because if you're trying to get high performance, certain CPUs are sometimes better for some things than others, right.

A

Yeah, I think it does fit squarely under that third pillar.

A

Any other questions?

A

Okay, sounds good. So, as I mentioned, one way of trying to work on that roadmap, which will give us a little more direction on how this working group is going to make progress, is to start with a classification of batch workloads. In the charter we already discussed three areas: the first is the Job API, the second is job management,

A

and the third is removing friction in using special hardware. Those are three areas of feature sets that we can work on. I'm trying to understand if we can have a batch workload classification that we can project onto these features and that guides us, for example, on the Job API enhancements.

A

We've done a couple in the past few releases, introducing Indexed Job, but is that enough to model MPI workloads? Is that enough to model machine learning or reinforcement learning workloads? Is that enough to model Spark workloads?

A

What other enhancements do we need to make sure that communities like Kubeflow can build on these core features that we are trying to develop here? Again, the idea is not to impose something on the overall community so much as to find common ground, to move people away from just creating plain pods and doing the lifecycle management that everybody is doing.

A

For example, with MPI, it seems reasonable to model it as an Indexed Job, where index zero is the driver and the indices define identities for the workers so they can address each other. Same thing, probably, with reinforcement learning: you have two groups of pods.

A

One is, I don't know what they call them, the parameter servers, and the other is the workers. You don't need to model everything as a single job; you can model it as two v1 Jobs, for example, and so forth. So one thing I would love to start with is to define the types of high-level batch workloads that the community is interested in. I've seen people, for example, interested a lot in Spark: how do we do dynamic

A

scaling of Spark workloads? So that is one thing. The other thing that I would suggest we discuss in the roadmap is projecting this categorization onto the three pillars that we want to address, which are the Job API, job management, and on-node enhancements related to special hardware.
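The Indexed Job modeling mentioned for MPI can be sketched roughly as follows. This is a minimal illustration, not the MPI operator's actual code: the helper names, the job name, and the image are hypothetical. It relies only on the `completionMode: Indexed` field of a `batch/v1` Job and on the `JOB_COMPLETION_INDEX` environment variable that Kubernetes sets in each pod of an Indexed Job.

```python
import os


def indexed_job_manifest(name, image, workers):
    """Build a batch/v1 Indexed Job with one driver (index 0) plus `workers` workers."""
    total = workers + 1  # index 0 is reserved for the driver
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "completionMode": "Indexed",
            "completions": total,
            "parallelism": total,  # run every index at the same time
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{"name": "main", "image": image}],
                }
            },
        },
    }


def role_for_index(env=None):
    """Derive a pod's role from the completion index exposed to the container."""
    env = os.environ if env is None else env
    index = int(env["JOB_COMPLETION_INDEX"])
    return "driver" if index == 0 else f"worker-{index}"
```

Inside a container, `role_for_index()` would be called without arguments; passing a dict makes the function easy to exercise outside a cluster.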

A

Does anyone have any questions? Does that make sense?

D

Okay, so for the Job API,

D

building batch workloads such as MPI, TensorFlow, Spark: what does it mean exactly? Would the idea be to enhance the Job API with labels or something, or to create new objects?

D

I was thinking about MPI; there's the MPI operator. So how do those things go together?

A

Right, that's a great question. The idea is not to actually implement MPI in core Kubernetes, but to make the Job API that we have a building block for the MPI operator. For example, again, we introduced Indexed Jobs. That was a step in the direction of having the MPI operator build on the Job API rather than creating its own raw pods.

A

Another enhancement that we can do, as Aldo suggested in the past, is in the status of the Job. For example, the MPI job could be modeled as a v1 Job, right? The MPI operator would create a v1 Job, but it needs enough features in the Job API to do everything that it wants to do.

A

So, as I mentioned, the Indexed Job is one thing. In the past they tried to model it as a StatefulSet, for example, but that didn't work, so we're trying to fix this and make sure that the Job API addresses these shortcomings.

A

Another thing is to improve the Job status with ready pods. Aldo can discuss this; he has an issue about this as well. For example, when you get a Job from the API, we want to track which pods are ready, and that is something the MPI operator can watch for, and only then, for example, would it create the driver. So, along these lines, we're not trying to implement MPI, and we're not trying to implement Spark or TensorFlow.

A

We just want to make sure that you can use the Job API to model all these things.
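As a rough illustration of the "ready pods" idea, the check an operator might run against a Job object could look like the sketch below. It assumes the Job is available as a plain dict and that the ready-pod count is exposed as `status.ready` on the `batch/v1` Job; that field may be absent on clusters where it is not enabled, so the sketch treats a missing value as zero. The function name is hypothetical.

```python
def workers_ready(job, expected_workers):
    """Return True once the Job reports at least `expected_workers` ready pods."""
    status = job.get("status") or {}
    # A missing or null `ready` field is treated as "no pods ready yet".
    ready = status.get("ready") or 0
    return ready >= expected_workers
```

An operator following the flow described above would create the driver only after `workers_ready(worker_job, n)` becomes true for the workers' Job.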

E

Are you also considering... and correct me if I'm wrong, but currently in the Job API you just have one pod spec, which essentially means you're dispatching one image with multiple replicas. Is there consideration also for multiple pod templates?

A

So we've discussed this before. So far,

A

there has been some resistance from API Machinery to doing that, but I think it's also on the table: if there is enough momentum to say, okay, we need a v2 Job API with multiple templates, I think that's on the table. But at the end of the day, again, we're not trying to push too much downstream. You could imagine that if the MPI operator itself wants two types of pods, the driver and the workers, it can model them as two Jobs, and so the MPI operator would be managing two Jobs.

A

So it would be left to the higher-level operator to manage multiple, let's say, templates, rather than trying to push everything down. We need to understand where the line is, right? That's where we can say no. We're not trying to impose too much on the overall community; we're just trying to build these smaller building blocks that are robust enough and generic enough to build on top of, and not rely on pods that actually...
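The "two Jobs instead of one multi-template Job" approach can be sketched as a higher-level controller emitting one Job per pod template. This is only an illustration under assumed names: the `example.dev/group` label, the naming scheme, and both helper functions are hypothetical and not taken from any real operator.

```python
def job(name, image, pods, group, indexed=False):
    """Return a batch/v1 Job manifest (as a dict) with a single pod template."""
    spec = {
        "completions": pods,
        "parallelism": pods,
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "main", "image": image}],
            }
        },
    }
    if indexed:
        spec["completionMode"] = "Indexed"  # gives each worker a stable index
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        # Hypothetical label that lets the operator find both Jobs of one workload.
        "metadata": {"name": name, "labels": {"example.dev/group": group}},
        "spec": spec,
    }


def driver_and_workers(group, driver_image, worker_image, workers):
    """Model one workload as two Jobs: a single driver and N indexed workers."""
    return [
        job(f"{group}-driver", driver_image, 1, group),
        job(f"{group}-workers", worker_image, workers, group, indexed=True),
    ]
```

The design choice here is that the core API stays single-template, while the grouping of the two Jobs lives entirely in the operator.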

E

Okay. The reason I'm asking is that I was curious about the scoping of using the Job as the only artifact for delivery in this case. Many of the workloads that I've worked with don't necessarily use Jobs; they use Deployments and StatefulSets, and they have multiple sets of those. I know Job is for run-to-completion, but they take advantage of these other Kubernetes artifacts of different types, and then internally they set up their workload to actually run to completion and delete these objects.

E

So I was just curious with regard to the focus here. Is it going to be just the Job? I mean, I would expand...

B

I wouldn't primarily focus on the Job itself. Since I'm representing the workload side of the Kubernetes project: if we see a general desire to improve additional APIs outside of just batch... Aldo and Abdullah added batch initially because that was the obvious answer, that yes, it will definitely be improved as part of the changes. But there's nothing stopping us from proposing changes to additional APIs within the workloads group. So yeah, that's great.

E

It's definitely.

B

On the table.

E

That's great to hear, thank you.

A

Right, but in my opinion it would be better if we first try to say, okay, what is missing in the Job API, before we say,

A

okay, we did that. Again, StatefulSet was being used to represent these types of workloads that need to identify workers by ordinal index, and so I say, okay, we should be able to do that with Job. But at the same time, with Job you're able to have some of these indices actually run to completion and finish, which with StatefulSet you couldn't.

E

Right.

A

Exactly so.

E

Yeah.

A

Exactly. So I think our priority should be actually improving the Job API to make sure it works well. That will also give a really powerful tool for operators, like cloud providers, to do more optimizations: if we understand that most of your batch workloads are in the end modeled as a Job, then it will be easier for us as a provider to do optimizations at that level.

A

Right, it's easier for us to do statistics, collect metrics, and do actually crazy stuff. If you're creating raw pods, it's hard for us to do anything with them. Yeah.

C

I agree. Okay, great, thank you.

F

I want to add another motivation for why external frameworks should be using Kubernetes APIs, I mean workload APIs, instead of raw pods.

F

Recently we fixed the Job API to be able to track all the pods that completed, even if they are deleted from the API server. We fixed it for Job, but now people are trying to use Airflow or Argo or these other frameworks that today use raw pods, and they have the same bug that we used to have in the Job API.

F

And it was not a trivial implementation; it was quite cumbersome to do. So if we already fixed the Job API, it's easier if the community can just use it instead of having to fix the same bug all over again. So that's a very important motivation. This is just one example, but there could be other pod management issues or bugs that we could solve once and for all.

A

Yeah, that's a great point. That's great, right?

E

No, I was just agreeing that that's great.

A

Yeah, and specifically what Aldo is referring to is Job pod tracking. If every controller creates raw pods and these pods just go away, because, you know, the node got preempted and the garbage collector removed the pod object completely, you lose tracking. The only way to solve this is to inject finalizers, and for your custom workload controller to manage these finalizers: when to inject them, when to remove them, etc.

A

They shouldn't be doing all of that. Just let them create the v1 Job; the v1 Job will do that tracking for them, and that should be universal across all types of batch workloads, I guess. So that's just speaking more concretely to Aldo's point.
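To make the finalizer argument concrete, here is a small self-contained simulation. It is not the Job controller's implementation: `FakeAPIServer`, `Tracker`, and the finalizer name `example.dev/tracking` are all hypothetical. It only models the general Kubernetes rule that an object with finalizers stays visible after deletion is requested until the finalizers are removed.

```python
TRACKING_FINALIZER = "example.dev/tracking"  # hypothetical finalizer name


class FakeAPIServer:
    """In-memory stand-in for pod storage with finalizer semantics."""

    def __init__(self):
        self.pods = {}

    def create(self, name, finalizers=()):
        self.pods[name] = {"phase": "Running", "finalizers": list(finalizers), "deleting": False}

    def finish(self, name, phase):
        self.pods[name]["phase"] = phase

    def delete(self, name):
        pod = self.pods[name]
        if pod["finalizers"]:
            pod["deleting"] = True  # object remains until finalizers are cleared
        else:
            del self.pods[name]  # raw pod without finalizer vanishes immediately

    def remove_finalizer(self, name, finalizer):
        pod = self.pods[name]
        pod["finalizers"].remove(finalizer)
        if pod["deleting"] and not pod["finalizers"]:
            del self.pods[name]


class Tracker:
    """Counts each finished pod exactly once, then releases its finalizer."""

    def __init__(self, api):
        self.api = api
        self.succeeded = 0
        self.failed = 0

    def sync(self):
        for name, pod in list(self.api.pods.items()):
            if TRACKING_FINALIZER not in pod["finalizers"]:
                continue  # already accounted for, or never tracked
            if pod["phase"] == "Succeeded":
                self.succeeded += 1
            elif pod["phase"] == "Failed" or pod["deleting"]:
                self.failed += 1  # deleted before finishing counts as a failure
            else:
                continue  # still running
            self.api.remove_finalizer(name, TRACKING_FINALIZER)
```

A pod created without the finalizer disappears from `api.pods` as soon as it is deleted, so a controller that only lists pods can never account for it; that is the bug described above for frameworks built on raw pods.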

A

The other high-level item here is job management. As you may all know, we've started a subproject called Kueue. It is a subproject under kubernetes-sigs. We had a proposal that we presented to SIG Apps before; it's published, and we've got a ton of comments and interest from the community. So that is the framework that we're proposing to start thinking about high-level job management: queueing, dynamic resource allocation, provisioning, and all of that.

A

So this is the second high-level item.

G

Right, I have a quick question. Sorry, Abdullah: will you be evaluating other solutions like this as well? I work on maintaining the Flink operator, and we have a custom resource which spawns a bunch of StatefulSets and other things, and we want them to be provisioned together or not at all. We were using the Volcano scheduler for that. Will you be evaluating other schedulers as well, or...

A

Right. So the high-level charter of the working group is to try to introduce new features that work well with core Kubernetes controllers.

A

I don't know, maybe someone else wants to... I don't know what you mean by evaluate here. Volcano already exists, right? I'm not sure how much traction it has gotten so far.

A

Volcano is a second scheduler, so it's something that competes with kube-scheduler, and that is something we mentioned in the charter: we don't want to do that at all. We don't want to introduce new components, or work with components, that replicate existing functionality within core Kubernetes.

A

We would like to work with the mindset of: how can we complement what we have to introduce new capabilities?

A

But feel free to disagree, and if you feel strongly about this, it's also...

G

It's also on the table. I mean, it would be convenient if the existing scheduler had this possibility of grouping resources and scheduling them all together or, for example, checking how much resource quota a namespace has and only then scheduling into that namespace.

A

So yeah, now I'm speaking with my SIG Scheduling co-chair hat on, and Aldo here can also disagree, and maybe Wei as well when he joins next time. But from my perspective, the kube-scheduler is a pod-to-node scheduler. It's a pod scheduler. Doing quota checking and all these other things at that level of scheduling is not going to be scalable.

A

That's why we are proposing this new approach. If you want to do job-level management, you need a controller that does job-level management and is specialized in that. So we have this separation of concerns: the other controller does job-level management and injects

A

scheduling directives that force the kube-scheduler to act the way that this higher-level job manager decided. Take a look at the Kueue proposal, which does touch on some of these aspects: the way that at least some of us see how these things fit together, and why we've been resistant to introducing these types of collective or bulk management of pods within kube-scheduler. All right, thank you.
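The separation between job-level admission and pod-to-node scheduling can be illustrated with a toy admission loop. This is a sketch under stated assumptions and not how Kueue is implemented: `PendingJob`, `admit`, the per-namespace CPU quota map, and the first-fit order are all invented for illustration. The `suspended` flag stands in for the `suspend` field of a `batch/v1` Job, which keeps the Job from creating pods while it is set.

```python
from dataclasses import dataclass


@dataclass
class PendingJob:
    name: str
    namespace: str
    pods: int
    cpu_per_pod: int
    suspended: bool = True  # mirrors a Job created with spec.suspend = true


def admit(queue, quota_cpu):
    """Admit whole jobs, in queue order, only if their full CPU demand fits the quota.

    A job is either admitted with all of its pods or left suspended, so the
    pod scheduler never sees a partially admitted job.
    """
    remaining = dict(quota_cpu)  # do not mutate the caller's quota map
    admitted = []
    for job in queue:
        need = job.pods * job.cpu_per_pod
        if remaining.get(job.namespace, 0) >= need:
            remaining[job.namespace] -= need
            job.suspended = False  # unsuspending lets pods be created and scheduled
            admitted.append(job.name)
    return admitted
```

For example, with a quota of 10 CPUs in `team-x`, a 4-pod job needing 2 CPUs per pod is admitted, while a following 3-pod job with the same per-pod request stays suspended because only 2 CPUs remain.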

F

Sorry, but nothing should stop us from enhancing the experience of using Volcano. If there is something in Kubernetes that we can do to make Volcano's life easier, I think we should do it. It's just a matter of finding the right set of features that would help. This idea of managing workloads instead of managing pods,

F

it's not the first time; it's not the only idea out there. I think Alex is here. Alex also has another scheduler, sorry, another job queueing system, which works in a similar way [unclear]. I don't want to put words in Alex's mouth, but if we can find common ground on a set of features within core Kubernetes that can serve all those purposes,

F

I think that would be a win for everybody. But again, I also have the idea that we should try to build on top instead of replacing stuff, or enhance some features so that you don't have to replace them anymore. That's my point.

H

I'd like to add a point here, if I can. Apart from kube-batch or Volcano, we have other schedulers that are managed by multiple companies and entities. In terms of batch, the community is already fragmented. I think what Abdullah is trying to say here is: what are the use cases? Why is another scheduler needed, and what are the use cases that we are trying to solve? If they can be solved via core Kubernetes,

H

we should try to have those features included in the existing workloads API. I agree with that part, but one thing I would like to add is that whoever owns those individual batch schedulers should also do some research and identify, for example, given the existing Kueue: can we make sure that Volcano's API can fit into the existing mechanism that we have? As a community,

H

we are not going to force anything, but I think understanding those gaps can actually enrich the existing workload API. That's something that should be driven by the individual project owners, but if there is feedback, we are, I think, ready to accept it. That's what Abdullah is trying to say.

G

Yeah, makes sense. I'm also a user of Volcano, just clarifying. I don't...

H

Really.

G

Yeah.

H

Yeah, and we have some schedulers that are owned by IBM folks as well, and they have some experience running those batch schedulers. So one of the things that we are trying to do is evaluate the existing Kueue API, see what pieces are missing, and see if we can make sure that those schedulers can conform to this API if possible. If not, what are the gaps? I think, as a community, if the individual project owners can do that, this group's charter will be much more successful.

I

Yeah, Ravi, I want to mention that Diana here has a job scheduler, right? So in one of those meetings we can look at the design choices there and see if some of them make sense to adopt or to enhance this proposal.

I

And you mentioned another person, I think, in the group who has another job scheduler,

I

or queue manager; I'm using job scheduler and queue manager interchangeably. So I think it would be a good idea to have those proposals, or actually those working solutions, presented, and see if there are any design choices that make sense to be adopted here.

A

I think this is a great idea. I think it could be one of the things that we do as part of our roadmap. Part of making progress in this specific pillar is to do something like a call for participation:

A

get these projects to present their schedulers to this working group. What do they see missing in core Kubernetes that they had to implement at their layer? What is their, I would call it, theory of operation? This is a term that Tim Hockin keeps using. How do they manage a job? Do they manage it as a group of pods, or do they manage it

A

as just a high-level job resource, with the pods managed by the specific workload controller, etc.? There are multiple ways; I think different schedulers have evolved differently, and we can start there. I think that's a great idea.

H

Yeah, I think the other advantage is that the community as of now is fragmented, but we can all rally behind a single API if we can come to common ground: this is the set of functionality that, at the core, all the schedulers would like to have, and it can be expressed via this particular API. I know this is hard, but I think that can actually ensure that the community is not fragmented anymore.

H

I couldn't agree more. Go ahead.

F

Yeah, going back a little bit to Kueue: we have a set of APIs that we share, but from my point of view, that set of APIs could even be split into something that is clearly common, like the workload API, and something that is maybe more business-dependent, like modeling capacities. Some people call them queues;

F

some call them quotas. So maybe that part can be split out and left to people to decide how to do it, but maybe at least we can agree on lower-level APIs such as the workload. Or maybe we can split the cake somewhere else. That's something we can discuss in this forum.

J

Just one comment. One thing I'd find very useful would be, for each of these focus areas, to narrow down the pain points and the challenges that people are encountering. That would feed into requirements and give us a more holistic view of what we want to achieve in each of these focus areas, because, like Ravi said, it is a bit fragmented in terms of what different people are working on.

J

That would help collate some of this information and give us more clarity for each of these focus areas. As an example, the third piece, hardware acceleration and NUMA awareness, involves CPU management and topology-aware scheduling, and then we can maybe have a statement corresponding to the topology-aware scheduling piece: this is the gap, and this is what we want to achieve and solve, maybe in the core Kubernetes API itself.

A

This is a nice segue into what I wanted to mention about these three pillars. I think we need a, quote-unquote, lead that pushes each of these pillars forward and tries to organize the effort, exactly as you mentioned, identifying

A

the key areas that we want to focus on. That would make this group more scalable. I'm not saying that these are going to split into different groups; it's going to be the same group, but we want a point of contact who is organizing the efforts and the progress we're making in each of these three pillars.

A

What does the community think about this high-level organizational approach?

J

I agree with that as well. I think one of the things we should do is maybe set a baseline in terms of what the current state is, because some people, for example myself, Francesco, and Marlow, have probably been more involved in the node side of things, so it would be interesting to understand the work that has been done from the Job API and job management point of view in the scheduling area. If we set that baseline, it will help the group overall.

A

So, for example, I could start by proposing a template for these three pillars, and then we can iterate over that template; basically, how we're going to start making progress in each of these areas. As you mentioned, we can start with defining the current state within core Kubernetes.

A

What are the main areas where we learned that there are limitations? We can list them, and then we take it from there. We develop this collectively as our roadmap.

J

Sounds good.

F

Bring in current initiatives too, whatever is in progress, yeah.

A

This is what I meant by the current state. Oh yeah, right: the current state within core Kubernetes, and initiatives as well.

A

Sounds great. Okay, we're out of time. Thank you so much; this has been great. I'm really excited about this working group, and I really hope we will be successful and that we can produce something that gets pushed into core Kubernetes and makes batch a first-class citizen of some sort, comparable to, you know, services.

E

This is great, thank you so much.

G

Thank you all. Thank you for organizing this, Abdullah. Thank y'all.

B

Bye.
From YouTube: Kubernetes WG Batch Weekly Meeting for 20220303
