From YouTube: Kubernetes WG Batch Weekly Meeting for 20220707
B: I was just — I wanted to try. Should we start? I'll wait for a couple of moments.
B: Hello, everyone. This is the Batch Working Group, July 7th. This meeting is recorded and will be uploaded to YouTube, so please adhere to the Kubernetes guidelines when commenting. Today we have Antanas, and he'll be giving us an overview of Slurm.
C: It's telling me the host disabled screen sharing. If you can enable it?
D: Okay, yeah. Thank you, everybody, for joining this talk today. I participated in some of the previous calls in the batch working group, and I heard interesting topics with respect to batch job submission on Kubernetes. My background, before coming to the cloud native side, was HPC, and in high performance computing we have similar problems to what people have to deal with in Kubernetes nowadays: we have distributed computing, a lot of the workloads are processed in a batch job manner, and there are different solutions for how to do the scheduling and resource management — a very popular one, of course, is Slurm. So I thought it would be a nice thing to share what was done in HPC with the community here.
D: If you are developing new tools, maybe this will give you an idea for nice interfaces, or an idea of how to extend existing tools, make them cloud ready, and so on. So I hope it will be valuable for everybody.
D: So let's jump into the details a little bit. What is Slurm? Slurm has a long history. It started development in 2002 at Lawrence Livermore National Laboratory, and you already see a little bit of the history in the name: it was the Simple Linux Utility for Resource Management.
D
Actually,
the
first
version
of
slurm
was
a
resource
management
tool.
It
was
still
not
having
a
lot
of
scheduling
capabilities,
but
this
was
the
start.
It's
a
c
code,
quite
a
good
size,
c
code,
500
000
lines
of
code.
D: If you compare it to other tools — I don't know how big Kubernetes is, for example — it's quite compact for the tasks it has to deal with, and very efficient. If you have used it in the past on HPC systems, you are most probably aware that the tool can scale to thousands of nodes: you can run jobs on thousands of nodes, and the time to start a job on such clusters is usually within a minute, from what I have heard. So it's really quite an optimal solution. You can find more details about the actual software in the link below, but I tried to capture some key points and give you an overview of what Slurm is and what it can do.
D: So, I just have to deal with PowerPoint, right. So, yeah, Slurm had a very simple original idea, coming from how normal applications run on a PC. When it was developed in 2002, this was just the time when parallel applications were getting more important. We were entering the era of the first multi-core processors and so on. So basically the idea is: how can we run parallel applications in a similar way to how we do it on a standard PC? You see an example: you have a Linux application, some application A, and you want to execute it on Linux. This is very simple — you just call the application. And in Slurm, you can execute the parallel version of such an application with a single command, srun, and you're running a parallel application; the 8 tells it that you want to run eight copies of that application. So it's quite an intuitive idea, let's say, and in HPC, behind the A there is usually an MPI parallel application.
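As a rough sketch of that idea (the program name `app_a` here is illustrative, not taken from the talk):

```shell
# plain Linux: run the application once
./app_a

# Slurm: run eight copies of the same application in parallel
srun -n 8 ./app_a
```

On HPC systems the application behind `srun` is typically MPI-aware, so the eight copies cooperate rather than merely running side by side.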
D: This was the main idea of what they wanted to achieve originally, and, yeah, the tool became quite big and offered a lot of additional features. It's a fault-tolerant, full-scale cluster management tool. Basically it does three main things. You can use it as a resource manager, so it can deal with all the resources available on the compute hardware — allocate them, free them.
D: Right. I found this information from SchedMD — SchedMD is the company, or community, which currently drives the development of Slurm — and they gave a short overview of the difference between resource management and job scheduling.
D: I think most probably everybody in this community is aware of that. Resource management is how you allocate resources on the compute nodes — resources can be cores, memory, caches, networks, GPUs, switches; many, many things are seen as resources. And job scheduling is, yeah: how can you schedule your jobs more optimally if you have complex network topologies, if you have time slices, if you want to limit the execution of jobs? Let's say we have that case in my company.
D: Actually, there are more solutions out there for that — you see a small table below. Slurm covers both the resource management and the scheduling aspects, and if you ask yourself where Kubernetes stands today: Kubernetes also covers both resource management and scheduling, so it's another bar in this table.
D: Looking behind the architecture: on this diagram, let's start with the box with the lilac cover. Those are the main components — there is a control plane.
D: Basically, this is the main component that the users speak to in order to issue commands, but it also speaks with the database. This is the Slurm database, which mostly manages the accounting values: if you want to introduce users, policies, time limits, those will be stored in the database, and the control daemon will ask for them, or they will be pushed back.
D: Then you can have multiple copies of the control daemon: if there are more users on the cluster and a single process cannot handle that, you can replicate it — have an optional backup daemon — to handle more work. And then the yellow boxes basically run on the compute hardware.
D: You have a Slurm daemon which is responsible for — basically, a job request comes from the users, it's registered by the control daemon, and the control daemon sends it to the actual nodes. The actual nodes running the Slurm daemon will then take care of resource allocation and job execution.
D: So the Slurm daemon is something like the kubelet, if you are searching for references or connections to the existing Kubernetes architecture. In terms of the commands the users are using, you can see them in this blue box. Speaking with the control daemon is done with scontrol. With squeue you can ask what queues are available on the system; you can view the queues and see whether the jobs are queued.
D: How long you have to wait until your job gets running — you can see that through squeue. Enqueueing a job happens with sbatch, so sbatch is just fire and forget: at some point you get an output from the job. And then srun is a more interactive, blocking process. So you actually enqueue your job and you block — or the control daemon is blocking the user request — until your job gets up and running.
D: It's more like an interactive experience. And there are some further functions for accounting, to monitor or to see the accounting values which are stored in the database.
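The user-facing commands mentioned above can be summarized roughly like this (the job id 42 and the file names are made-up examples):

```shell
sbatch job.sh          # enqueue a batch job: fire and forget
srun -n 8 ./app_a      # interactive, blocking: waits until the job runs
squeue                 # view the queues and the state of your jobs
scontrol show job 42   # speak with the control daemon, inspect a job
sacct -j 42            # accounting values stored in the database
```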
D: Right — they are residing on different servers. So the database can be on a different server, the yellow boxes are all on different servers, and you will have one daemon per compute node, right.
B: I guess what they're trying to say here is that in Kubernetes you don't have these calls being explicitly made between components — components don't call each other directly. Here they do: for example, srun, which is running on the client, actually makes an RPC call to slurmd — in our case the kubelet — explicitly, to start a job on a node.
D: Yes, so there are still some differences, as you see. So, yeah.
D: This is a slightly more detailed description — some bullet points for each component and what it does. The most important components are basically the Slurm controller, then the database and the Slurm daemon; we captured those already.
D: The basic utilities used to submit jobs are sbatch and srun. Jobs are basically text files — simple text files. I will show an example of what an sbatch file looks like, but you just call sbatch with this job file, then everything goes to the controller and the job is executed. It's very simple, and you can also have something like interactive jobs.
D: With a job allocation, basically, it's waiting until the resources are allocated; when you get the resources, you get unblocked. And this is the difference with sbatch: you shoot the job off — you enqueue it, put it in a queue — and the command line gets unblocked.
D: Right, right. You can request — to block — to allocate 10 nodes, let's say, and run srun with an argument to allocate the nodes. It will wait until the 10 nodes are available from the queue for a certain time span; you can provide the time information: I want to allocate those nodes for 10 minutes. So if the nodes are available in the queue and you're the next user who can get them, then you get unblocked at some point and you can do your work with them.
D: What's the state of my job? Through the squeue command you can see if it's running or if it's still in the queued state, so you might need to wait until it's executed.
B: One more thing here: you mentioned a lot of commands, and one thing with Kubernetes, for example, is that people sometimes build automation — they build controllers to do a lot of the work. I'm wondering if this is a usage pattern that you also have with Slurm, or is it mostly physical end users that are actually starting the jobs, etc.? Or is there a pattern where automation fires jobs at some higher level?
D: I did not see such a thing. Usually there are users submitting jobs, so you don't have some sort of connection to events — it's not really an event-driven concept. Most probably you could do some implementation which does that, but maybe Kubernetes and its components have better ways to do that, if you want to react to events and stuff like that.
F: I would just say users will use frameworks — you know, workflow engines and stuff like that — that interact with the batch system. So it's a different model than maybe Kubernetes, but, you know, they're sort of an analog.
D: The workflow engines are going a little bit in the same direction; they have some elements of cron jobs and stuff like that. Yeah, yeah.
D: Right. So there are some alternative commands: you can just allocate nodes and attach to them later. This can be done with salloc and sattach, and you see some additional arguments you can use. We have accounts associated with all the commands, so there are components for permission models and so on; you can give time spans and so on. So quite, yeah —
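A sketch of that allocate-then-attach pattern (the account name and the job/step ids are hypothetical):

```shell
# allocate 4 nodes for 30 minutes, charged to account "myproj"
salloc -N 4 -t 30 -A myproj

# later, attach to a running job step (job 42, step 0)
sattach 42.0
```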
D: A lot of features are available through the API. Very interesting is to look at the sbatch files. I saw some examples of how you are thinking of defining jobs for Kubernetes, and the nice thing about Slurm jobs is that they're very compact. You can see what you need to express a batch job through this example. Basically, you have the name of the batch job, you can tell how many nodes you need, and you can have a different number of tasks.
D: The tasks can be — you can also assign how many CPUs you want per task and how much memory you want per CPU. So you do similar things to what you do in your pods with requesting resources. It's a little bit different here: the resource requests are inside the batch job. So if you are thinking about how to implement this in Kubernetes, you will need to find a way to translate.
D: If you want a similar kind of interface, it becomes interesting: how do I translate that to pod requests? So it's a somewhat difficult problem. And again, you have the time you can specify after that. Usually this part is the job description part, where you have resource requests, some definition of time and so on. After that, it's usually shell scripting that follows, and you can execute any kind of Linux application inside.
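A minimal sbatch file along the lines described — job-description header first, then ordinary shell — with all names and sizes made up for illustration:

```shell
#!/bin/bash
#SBATCH --job-name=example     # name of the batch job
#SBATCH --nodes=2              # how many nodes you need
#SBATCH --ntasks=8             # number of tasks
#SBATCH --cpus-per-task=4      # CPUs per task
#SBATCH --mem-per-cpu=2G       # memory per CPU
#SBATCH --time=00:30:00        # time limit

# after the job-description header, ordinary shell scripting follows
srun ./my_app
```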
D: Basically, it's a normal shell script following that. In many cases it was an HPC application, and in the HPC community they usually use these module environments, which are a way to configure.
D: There are extensions for GPUs. And then, after that, the nodes also get different states — a node basically switches between those states: idle, mixed or allocated.
D
So
it's
a
little
bit
different
in
the
compared
to
cloud
world
where
you
have
virtualized
resources,
and
you
might
be
not
the
only
one
on
that
note.
So
then
the
other
interesting
things
are
partitions.
D
I
will
spend
a
little
bit
more
time
on
partitions,
because
I
find
this
concept
very
nice
how
you
could
distinguish
between
different
types
of
resources,
or
so
you
could
group
basically
pieces
of
your
cluster
in
partitions.
You
can
tell
this
part
of
the
cluster
is
partition.
One.
This
part
is
partition,
two,
where
it's
useful,
you
could
imagine.
Let's
say
you
have
a
hybrid
cluster.
D
You
have
some
cpu
only
part
of
the
cluster,
and
then
you
have
gpu
only
or
a
cluster
with
gpus,
so
you
could
build
two
partitions
and
instead
of
giving
access
or
scattering
to
all
the
nodes
and
basically
allocating
some
drop
on
which
which
uses
more
cpu
on
the
gpu
node,
you
could
control
that
through
the
partitions
and
actually
in
in
slurm.
This
can
be
done
automatically,
so
you
could
reset
a
request,
a
resource,
let's
say
a
gpu
resource,
and
then
the
scheduler
should
find
out
automatically.
D
What's
the
best
suited
partition
or
you
will
see
later,
you
can
specify
also
explicitly
what
partition
you
want
to
use
and
under
the
hoods
for
each
partition,
you
have
a
queue
which
is
basically
then
responsible
for
inquiring,
the
jobs
or
getting
the
jobs,
and
then
they
are
served
in
the
order.
Accordingly,.
D: Okay, this is a summary of the resource requests you will usually see in the jobs. This is more on the CPU side; there are further things you can control with plugins, for GPUs and so on, but this is how you usually control how many cores you get, how much memory per CPU and so on. Now back to the partitions.
So, for example, take this example, which I took from NREL, I believe.
D: It was with two islands. You had an island of roughly 100 nodes which had, let's say, hard drive capacities above one terabyte, so you could group them, and every job request which needs a file system bigger than one terabyte will be served by this partition. Another example: you could group by memory, so you could make a queue — make a partition, basically — which will cover all requests with —
D
For
jobs
having
more
than
requiring
more
than
96
gigabyte
of
memory-
and
you
in
that
you
can
have
two
clusters,
as
you
see
so
one
with
192.,
it
doesn't
have
to
be
homogeneous,
but
you
can
bind
them
in
one
partition.
One
queue
both
both
will
fulfill
the
requirements.
This
is
just
a
simple
example,
and
this
is
extended
further.
You
can
have
a
third
one
with
gpus
and
the
way
how
you
specify
your
your
requests.
D
You
add
these
parameters
to
the
as
patch
commands
or
s-run
commands.
You
tell
explicitly.
I
want
500
gigabyte
of
memory
or
I
want
20
terabytes
of
storage
and
then
slurm
will
find
out
which
one
of
the
partitions
is
best
suited
for
this
job.
Basically,.
D: Yeah, they actually recommend doing explicit resource requests, but you also have the possibility to choose the partition explicitly. You can list them — you can always see what the available partitions are and how many nodes you have inside — and then there is an argument, -p I think, that you can pass to sbatch to use an explicit partition. But they don't usually advertise that.
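For example (the partition name "gpu" here is hypothetical):

```shell
sinfo                  # list the available partitions and their nodes
sbatch -p gpu job.sh   # submit explicitly to the partition named "gpu"
```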
D: Yeah, this is the example. Basically, you have the standard srun command, you have a time parameter, how many nodes you want — four nodes — basically with that capacity of memory. So this is how you request it. Or if you want a batch job, what you need to change is srun to sbatch; then it will basically push it onto the queue and give you the control back, right. And here is another example.
D: Basically with GPUs: you want to use two GPUs on eight nodes — eight nodes with two GPUs each. So this is how you can request it, and in the background it will choose the partition; or you can specify -p here explicitly.
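Put together, those two examples might look like this (the exact sizes and file names are illustrative):

```shell
# interactive: four nodes with 500 GB of memory each, for one hour
srun -t 01:00:00 -N 4 --mem=500G ./app

# the same request as a queued batch job: just swap srun for sbatch
sbatch -t 01:00:00 -N 4 --mem=500G job.sh

# eight nodes with two GPUs each, requested through the GRES mechanism
sbatch -N 8 --gres=gpu:2 job.sh

# or pin the partition explicitly
sbatch -p gpu -N 8 --gres=gpu:2 job.sh
```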
D: Yeah, I think I'm being a little bit quick. So this is the basic interface — as you see, it's very simple, very intuitive. This is a bit of a difference from Kubernetes, where you have descriptions through YAML; those are a little bit verbose, and for batch job processing, at least, this kind of Slurm approach is very easy to use: it's basically a client, then you have your shell script, and it's executed. Yeah, and I have some open discussion at the end. So, yeah, maybe I have questions for the community — or if somebody has already had thoughts or some experience on whether both of those worlds can be combined, the whole batch side.
G: I have a question: can a job specify multiple partitions? Let's say I would say the GPU one and the memory one — and Slurm is going to, you know, take the intersection of the two partitions and find the common nodes, and stuff like that?
D: If you have multiple parameters — yeah, you can specify, basically, a minimum memory and combine resources. This is possible here.
F: Maybe — this is Shane Canon from NERSC, Berkeley Lab — I just wanted to comment that one thing I'm not sure was covered, and maybe it would be worth having a separate presentation on it, is that a lot of what Slurm is designed to do is give a policy framework where the resource provider can put priorities on how jobs are scheduled. So you might favor large jobs over small jobs, for example, or you may give certain types of jobs priority over others. You know, really what it's designed to do is —
F
Typically,
you
have
a
backlog
of
work
that
exceeds
the
amount
of
resource
you
have
at
any
point
in
time,
and
so
it's
trying
to
make
decisions
about
how
to
schedule
that
workload
and
where
it
can
get
challenging
is
you
might
have.
If
you
need
to
schedule
a
really
large
job,
then
you
have
to
start
putting
aside
resources
so
that
you
can
run
that
right.
D: Yeah, it has very sophisticated schedulers, and the reservation systems are also nice. Yes, you can basically specify as an argument that you want to use a reservation — this is one of the nice features, yeah. So there are a bunch of very, very good ideas inside Slurm: the partitions, the limits. I think the whole accounting is very, very sophisticated and already very mature.
B
One
question
I
have
here
is
in
a
typical
lucky:
you
you
did,
I
think,
get
examples
from
a
an
actual
deployment
like
how
many
partitions
do
you
have
like
again
in
my
mind,
partition
it
resembles
a
queue
usually
like
have
like.
I
don't
know
tens
or
thousands
of
these
partitions
or
a
lot
less
things.
D: Yeah, from what I saw in the past on some of the big HPC systems, they try to group the resources. Usually you don't get a lot of partitions — you have maybe five to ten partitions. On these big clusters they usually organize them in islands, because if you stay within an island you might be on a single rack or something like that. So in some cases you have the possibility to access one entity of the data center through a partition — so, basically, by taking this partition,
D
I
will
be
running
on
an
island
which
is
very
well
connected,
let's
say
and
will
will
give
me
very
good
latency
in
terms
of
communication
right,
so
you
can
have
this
as
one
one
thing,
then
you
might
have
different
types
of
hardware,
so
you
might
want
to
have
partitions
dependent
on
memory,
so
you
will
have
let's
say,
standard
nodes
with
having
128
gigabyte
memory
available,
and
then
you
have
a
fat
island
or
a
fat
partition
which
does
one
terabyte
memory.
D: That's another example. So there are not so many — usually it's within tens of partitions, not thousands or something like that. So —
D: The administrator has some level of control. There is most probably a default pool where the jobs will land, so you will have a default — the administrator has the power to define the default pool where all the jobs will go. You can also limit access to the partitions — not give access to everybody for the fat nodes or something like that. This is —
G
Possible,
okay,
but
the
job
still
might
can
get
scheduled
on
heterogeneous
kind
of
nodes.
Like
somebody
with
high
memory
and
somebody,
that's
all
that's
possible.
That's
not
explicitly
denied.
D
If
the
administrator
did
not
disallow
it,
it
can
be
scheduled,
but
yeah
you,
the
administrator,
might
turn
it
off.
So.
B
We're
almost
out
of
time
any
other
questions
from
the
community.
H: So I want to touch on these questions in the slides. The slides are suggesting that Slurm is the resource manager and Kubernetes is the interface, but I was thinking about whether that makes sense, or whether it would make more sense to do it differently. Yeah.
D: This is — I wanted to — I did not speak about it too much, but, see, in my company:
D
We
are
looking
at
benchmarking
and
and
how
to
approach
cloud
native
benchmarking
so
and
the
team
had
the
idea
to
look
to
classical
ci
tools,
white
jenkins,
ron,
jenkins,
job
and
stuff
like
that
and
yeah.
For
me,
I
don't
know
if
this
is
a
good
model,
because
benchmarking
usually
can
be
done
as
a
bachelor.
D
So
if,
if
the
benchmarks
are
mature-
and
you
know
that
they
are
stable
and
running,
you
can
shoot
them
in
a
queue
right,
go
to
sleep
and
get
back
come
back
and
get
the
results,
so
it's
very
well
suited
for
for
bad
job
processing.
D
So
I
was
thinking
basically,
how
can
we
do
benchmarking
in
my
company
for
kubernetes
by
maybe
reusing
some
ideas
from
slurm
and
the
idea
what
we
are
exploring?
Can
we
use
slurm
basically
to
spawn
kubernetes
clusters,
as
basically
the
benchmarks
can
vary?
D
We
want
to
benchmark,
let's
say
a
cluster
of
four
nodes,
eight
nodes,
so
you
can,
you
can
expose
all
the
nodes
make
them
available
through
sloan
first
and
then
slurm
allows
you
to
to
start
a
job
and
in
some
sort
of
prologue
script
you
can
provision
a
kubernetes
cluster
on
the
located
nodes.
This
was
the
idea
and
then
you
just
run
the
the
benchmark,
gather
the
results
and,
at
some
point
the
result
or
the
job
is
completed,
so
you
could
theoretically
combine
both
worlds.
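A minimal sketch of that provisioning idea, assuming k3s as the throwaway Kubernetes distribution; the helper scripts `provision-k3s.sh` and `run-benchmark.sh` are hypothetical site scripts, not anything from the talk:

```shell
#!/bin/bash
#SBATCH --nodes=4
#SBATCH --time=02:00:00

# expand the allocated node list and pick one node as the server
NODES=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
SERVER=$(echo "$NODES" | head -n 1)

# hypothetical site script: k3s server on one node,
# agents on the remaining nodes, joined to that server
srun -N 1 -w "$SERVER" provision-k3s.sh server
srun -N 3 -x "$SERVER" provision-k3s.sh agent "$SERVER"

# run the benchmark against the freshly provisioned cluster
run-benchmark.sh --kubeconfig /etc/rancher/k3s/k3s.yaml --out results/
```

When the Slurm job completes, the allocation — and with it the throwaway cluster — goes away.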
D
You,
at
least
in
this
example,
for
benchmarking
and
make
it.
But
this
is
in
the
case
where
you
want
to
test,
maybe
the
whole
kubernetes
system
in.
In
other
cases,
you
might
have
just
one
kubernetes
cluster
and
you
just
want
to
submit
bad
jobs.
You
don't
need
all
this
overhead.
So
most
probably
it's
not
very
efficient
way
in
in
terms
of
provisioning
and
so
on,
but
it
should
be
possible.
B
Are
there
like
times
where
you
would
have
resources
being
set
aside,
while
them
is
trying
to
accumulate
the
amount
of
resources
required
for
the
all-or-nothing
drug?
And
it's
like
for
how
long
does
it
do
that,
like
I'm,
just
trying
to
understand
how
it
implements
all
or
nothing?
By
trying
to
minimize
maximize
you
know
or
minimize
the
case
where
you
have
as
resources
set
aside,
it
can
be
used
while
accumulating.
F: Maybe I could comment on that. Yes, it'll definitely do that — that's what I was talking about earlier. If it's trying to schedule a large job, it can automatically create a sort of reservation out in time, when it thinks it's going to be able to have those resources available, based on the time limits that have been specified for all the different jobs. And so it'll say, you know: I think at six o'clock I'll have all the resources to run a thousand-node job, and it'll make that reservation.
F: A wall time — they don't have to, but typically a wall time is either specified by the job or a default is applied based on what partition they're going to. What is typical? It can be hours to days, but, you know, many days starts to get toward the limit of what a typical HPC center might allow.
F: They will try to game the queues to get their jobs in as fast as they can, but it's within the limits of how the policies have been configured, so there are constraints on that. You know, a smart user will sit there and say: I can fit my job in that big backfill slot, let me size it the right way so it'll jump in. Others just want it to run at some point; they'll submit it and wait for it to come out the other side.
F
The
resources
tend
to
be
very
consistent,
so
you
kind
of
have
a
feel
for
that,
but
you're
right
it
can
vary.
It
can
vary
more
because
of
things
like
io
congestion
or
things
like
that,
but
they'll
typically
put
in
you
know
the
expected
time
and
then
they'll
put
in
some
safety
factor.
They
won't
get
charged.
They
only
get
charged
for
what
they
use.
Typically,
so
they're,
it's
more
about
just
trying
to
get
the
time
right
so
that
it'll
it'll
schedule
sooner
than
otherwise.
D
Yeah
yeah:
this
is
another
thing
which
I
did
not
cover
the
charging,
so
they
are
usually
in
the
hpc
data
centers.
They
connect
that,
with
with
some
sort
of
budget,
the
users
have
budgets
of
hours
compute
hours,
so
you
can
make
it
automatically
that
the
compute
hours
are
then
deducted
based
on
the
on
the
actual
used
compute
types.
A
Question
does
slurm
save
the
state
of
the
job,
let's
say
I'm
running
a
job
and
by
this
time
the
job
needs
to
be
completed.
If
the
job
is
not
completed,
then
save
the
state
so
that
I
can
resume
tomorrow
the
same
same
time
and
finish
the
job.
D
Yeah
this
goes
into
checkpointing,
usually
what
of
the
applications
are
doing
or
the
hpc
applications
are
trying
to
support
checkpointing.
It
depends
a
little
bit
also
on
your
code.
If
your
code
supports
checkpointing,
you
can
make
checkpoints
in
time
and
basically
start
from
from
a
certain
checkpointed
state
later.
So
it's
not
completely
automatic
that
you
make
a
copy
of
the
memory,
and
then
you
can
restart
it's,
not
the
virtual
machine
or
something
which
is
running
so
your
application
has
to
support
it.
B: Yeah, thank you very much, Antanas — and Shane, also, for backing us up with some of these questions. We're five minutes late; I guess that's really great. If you have more questions, maybe tag Shane or us on the working group Slack channel. And maybe, as someone mentioned in the chat, we can also invite people from SchedMD. I did meet with them before and mentioned to them that we have a working group, so we will try.
B: Right — like, how is Slurm planning to support things like auto scaling, for example in cloud environments? I guess there is a ton of work happening there right now, so that would also be interesting, because everything that we've discussed so far kind of works well on an on-prem cluster.
B
All
right.
Thank
you
meet
in
a
couple
weeks,
thanks.