Cloud Native Computing Foundation KubeCon + CloudNativeCon Europe 2022, 2 Jun 2022

From YouTube: Accelerating High-Performance Machine Learning at Scale i... Alejandro Saucedo & Elena Neroslavskaya

Description

Don’t miss out! Join us at our upcoming hybrid event: KubeCon + CloudNativeCon North America 2022 from October 24-28 in Detroit (and online!). Learn more at https://kubecon.io. The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects.

Accelerating High-Performance Machine Learning at Scale in Kubernetes - Alejandro Saucedo, The Institute for Ethical AI & Machine Learning & Elena Neroslavskaya, Microsoft

Identifying the right tools for high-performance production machine learning may be overwhelming as the ecosystem continues to grow at break-neck speed. In this industry collaboration we aim to provide a hands-on guide on how practitioners can productionize optimized machine learning models in cloud native ecosystems using production-ready open source frameworks. We will dive into a practical use-case, deploying the renowned GPT-2 NLP machine learning model in Kubernetes leveraging the ONNX Runtime from the Seldon Core Triton server, which will provide us with a scalable production NLP microservice serving the ML model that can power intelligent text generation applications. We will present some of the key challenges currently being faced in the MLOps space, as well as how each of the tools in the stack interoperate throughout the production machine learning lifecycle.

A

All right, thank you everybody for coming today. We have a very exciting and interesting topic: today we're going to be diving into accelerating high-performance machine learning at scale in Kubernetes.

A

So, a little bit about myself and my co-speaker. My name is Alejandro Saucedo. I am Engineering Director at Seldon Technologies, a machine learning deployment and monitoring startup based in London. I'm also Chief Scientist at the Institute for Ethical AI and a governing council member-at-large at the ACM.

A

My co-speaker couldn't make it today. She's based in Vancouver, but she was able to send us some exciting videos of the demos, which you will be able to try out yourselves with the Jupyter notebooks and deploy on your own side. Elena is a Senior Cloud Architect at Microsoft, and today we're going to show you a collaboration on productionizing machine learning at scale. We're going to be taking a use case, which in this case is text generation with the exciting GPT-2 model.

A

We're going to be showcasing how to perform optimizations on machine learning models. Of course, this is a Kubernetes conference, not a machine learning conference, so we're going to cover more of the steps for productionizing those models, some of the nuances of how these practicalities change when you're dealing with machine learning as opposed to normal software, and then how we deploy and scale this in a Kubernetes cluster.

A

Finally, we're going to cover some cloud native best practices, things like GitOps and operational monitoring that you would introduce for normal microservices, adapted to the machine learning space. So let's get started with the what. We're going to be taking a machine learning use case that some of you may have come across before: GPT-2 text generation. What it basically does is take a text input and generate the next token, and this allows you to generate human-like text.

A

Here you can see that the input is the tokens "a robot may", and you can see that the model generates the next token. That's basically what we're going to be doing. We're taking this use case for a couple of reasons. The first is that it's quite intuitive: you can see some of the exciting value, not just in how it performs the predictions, but also in the use cases that people have deployed it on.

A

For example, you may have come across a dungeon crawler where you can choose your own adventure and interact with this AI model to say what you want to do as the next action. You can start as, I don't know, a wizard, and say "I want to go and grab the staff", and it replies with what happens to you. So it's quite interesting.

A

From the hardware perspective, it's also very computationally intensive, so it's going to allow us to show how to accelerate the performance of this model. It takes a couple of seconds to run, so we're going to be able to showcase how we can make it run faster.

A

The steps we're going to carry out today are the ones outlined here: fetching the model, optimizing the model, running it in a server locally, deploying it into our Kubernetes cluster, and then showing how to make this deployment much more robust through GitOps and monitoring.

A

You're going to find the code and examples in the resources, and I'm going to share the link for the talk so that you can access it later on. So let's get started with fetching the model. The first step is getting access to the artifact, and because in this talk we're covering productionization, we're going to skip the part about training the model and use a pre-trained artifact.

A

Fortunately, we are going to be using a collaboration that we have done with the Hugging Face team. This is a team that has collated and trained a broad range of machine learning models, and they have also trained the GPT-2 model that we can just make use of. The way we're going to do this is by using their transformers library: through the tokenizer we can fetch the preprocessor and the model (we'll talk about what that means), and then we're able to build the pipeline.

A

This just fetches everything from their model hub and simplifies the Python side for us. To provide an intuition for what happens under the hood: we provide a text input, in this case "I love artificial intelligence". We have to convert this text into something that the machine learning model can understand, so we tokenize it, converting the string into a sequence of tokens.

A

Then we can pass these tokens to the model. But, if you remember, the model generates one token at a time. So if we want the most reasonable prediction, we can take the most likely next token, or we can take the most likely series of tokens. That is the generate function that we leverage here. Once we get the output, we decode it back into a human-readable string and return it.

A

This is all happening at the Python level. As for the internals of the generate function, as I mentioned, it could be just a greedy approach of taking the most likely next token, but we can also use other algorithms, such as beam search, which looks ahead to find the most plausible series of tokens. We're going to skip through all of this and abstract it, primarily because we're going to interact with it as a black box.
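To make the difference between the greedy approach and beam search a bit more concrete, here is a minimal sketch in plain Python. It is not the transformers generate implementation: the probability table, the token names and both function names are made up for illustration, and a real model conditions on the whole sequence rather than only on the previous token.

```python
import math

# Hypothetical next-token probabilities keyed by the previous token only.
# A real language model conditions on the whole sequence so far.
NEXT_TOKEN_PROBS = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"robot": 0.55, "wizard": 0.45},
    "the": {"robot": 0.9, "wizard": 0.1},
}


def greedy_decode(start, steps):
    """Pick the single most likely next token at every step."""
    sequence, log_prob = [start], 0.0
    for _ in range(steps):
        token, prob = max(NEXT_TOKEN_PROBS[sequence[-1]].items(), key=lambda item: item[1])
        sequence.append(token)
        log_prob += math.log(prob)
    return sequence, log_prob


def beam_search(start, steps, beam_width):
    """Keep the `beam_width` best partial sequences at every step."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = [
            (sequence + [token], log_prob + math.log(prob))
            for sequence, log_prob in beams
            for token, prob in NEXT_TOKEN_PROBS[sequence[-1]].items()
        ]
        candidates.sort(key=lambda candidate: candidate[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


greedy_sequence, greedy_score = greedy_decode("<s>", steps=2)
beam_sequence, beam_score = beam_search("<s>", steps=2, beam_width=2)
print(" ".join(greedy_sequence[1:]), round(math.exp(greedy_score), 2))
print(" ".join(beam_sequence[1:]), round(math.exp(beam_score), 2))
```

In this toy table the greedy decoder commits to "a" because it is the best first token, while the beam keeps "the" alive and ends up with a more probable two-token sequence overall.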

A

The next step is the optimization. We have this PyTorch model under the hood (we could also fetch a TensorFlow model), and we can export it using a serialization format called ONNX. For this, fortunately, we have a library, still within the Hugging Face ecosystem, that simplifies this and that we have collaborated on. The only thing we need to do is use the Optimum framework and the Optimum class that gives us the ONNX quantized model, which basically means it's going to be much more efficient.
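As a rough intuition for what "quantized" means here, the sketch below maps floating-point weights to 8-bit integers with a single scale factor and measures the rounding error. This is an assumption-laden toy: the function names are hypothetical, and the quantization that ONNX Runtime and Optimum actually apply may differ (for example dynamic or per-channel schemes with zero points).

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integers in [-127, 127]."""
    scale = max(abs(weight) for weight in weights) / 127 or 1.0
    quantized = [max(-127, min(127, round(weight / scale))) for weight in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate reconstruction of the original floats."""
    return [value * scale for value in quantized]


weights = [0.12, -0.5, 0.33, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(original - approx) for original, approx in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight now fits in one byte instead of four, and the reconstruction error stays within half a quantization step, which is the trade-off that makes the model smaller and cheaper to run.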

A

It's going to run significantly faster, and we're going to see what that looks like in practice. So now that we have an artifact and we've seen how to run it in Python, the question is: how do we deploy it before putting it in our Kubernetes cluster?

A

To avoid bothering the DevOps team, we first want to make sure that it works locally: run it and ensure that it performs as you expect. For this we're going to be using two tools called MLServer and Seldon Core. The reason is that there are a lot of challenges when it comes to scaling that go well beyond the challenges you face with normal software. You have specialized hardware in play, things like GPUs or TPUs.

A

You have complex dependency graphs, so it's not just a single microservice that you consume, but multiple hops across potential inference pipelines. You have compliance requirements where your model, your code and your environment have to be reproducible. And you have all of the nuances that may require higher-level principles depending on the use case and industry domain, for example if you have to explain your predictions or keep audit trails. So today we're going to simplify this by introducing those technologies.

A

The first one is Seldon Core, which is a Kubernetes cloud native orchestration tool. Seldon Core allows you to convert your model artifacts or custom code into fully fledged microservices and run them in different runtimes. One of the runtimes can be Triton, it can be TF Serving, but today we're going to be using a Python runtime called MLServer. The only difference is that MLServer provides simple compatibility with Python-based libraries, and because Hugging Face is a Python-based library, it is very easy to integrate with.

A

Now that we've talked a little bit about the tools we're going to be using, let's talk about how we're going to use them. We're going to take this GPT-2 model and define it in our MLServer runtime.

A

We're then going to test it locally by running it as a microservice, so that we can consume the model by sending inference requests and getting the responses. If we're happy with that, we can then deploy it into our Kubernetes cluster with the Seldon Core scheduler, which uses that same runtime underneath. So first, as I said, we're defining the GPT-2 model pipeline. This is a programmatic way of defining it.

A

If you remember, in the previous Python example we selected the task. In this case it is the text-generation task, and it's going to use a pre-trained model from the hub, which here is DistilGPT-2. It's basically a smaller version of GPT-2, because the full model is huge and, if you have been using the conference Wi-Fi, we don't want to download that and get stuck, because we would be here all day.

A

We can activate the optimization by setting optimum model to true. Here we also specify the runtime, which is the Hugging Face runtime. If you're familiar with machine learning, you can also use things like the XGBoost runtime or the scikit-learn runtime. Now that we have the configuration, we can just run it: we do mlserver start with that config file.
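The configuration described here can be sketched as a model settings file. The snippet below builds it as a Python dictionary and renders it as JSON. The field names follow the MLServer Hugging Face runtime documentation as best I recall it, so treat them as assumptions and check them against your MLServer version; the model name "transformer" is an arbitrary label chosen for this example.

```python
import json

# Assumed layout of MLServer's model-settings.json for the Hugging Face runtime.
# "transformer" is an arbitrary name picked for this sketch.
model_settings = {
    "name": "transformer",
    "implementation": "mlserver_huggingface.HuggingFaceRuntime",
    "parameters": {
        "extra": {
            "task": "text-generation",
            "pretrained_model": "distilgpt2",
            "optimum_model": True,
        }
    },
}


def render_model_settings(settings):
    """Serialize the settings the way they would be written to model-settings.json."""
    return json.dumps(settings, indent=2)


rendered_settings = render_model_settings(model_settings)
print(rendered_settings)
```

Writing this output to a file called model-settings.json and running `mlserver start .` in the same folder should correspond to the step described in the talk.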

A

It runs a FastAPI server, and then we just send a curl request with the input "Seldon is very", and it returns the generated output "Seldon is very curious about the matter". So it works. That's basically what we have now. Fortunately, we do have a demo that my co-speaker recorded before coming here. Let's hope that everything is hooked up, because I'm going to press play.

B

Working with MLServer and the Hugging Face Optimum library is rather simple and fast, and we will do it all in GitHub Codespaces, online, with no local installation of Python or anything else. To start using the model, we need to create a model settings file directing MLServer to use the Hugging Face runtime.

B

We will use the text-generation task for this example, with the distilgpt2 model available from Hugging Face, and we will enable Optimum optimizations. Then we start MLServer.

B

It will serve the model on both the HTTP and gRPC protocols, following the KFServing V2 protocol. It will download the pre-trained model from Hugging Face and apply optimizations. Once the server starts, we can explore the OpenAPI docs and run inference. The docs show the structure of the expected input and output to help out. We will submit the text "I love AI"; let's see what it generates.

B

It looks rather cool for AI generation, and all that power without writing a single line of Python code. Now we could easily use this model from an application written in any language.
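As a sketch of what such a client call could look like, the snippet below builds a V2 inference protocol request using only the Python standard library. The model name "transformer", the input name "args" and the default port 8080 are assumptions based on the MLServer Hugging Face examples as I remember them, so verify them against the OpenAPI docs of your own server; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_infer_request(text, model_name="transformer", host="http://localhost:8080"):
    """Build (but do not send) a V2 inference request carrying one text input."""
    payload = {
        "inputs": [
            # "args" is the input name assumed for the Hugging Face runtime.
            {"name": "args", "shape": [1], "datatype": "BYTES", "data": [text]}
        ]
    }
    return urllib.request.Request(
        url=f"{host}/v2/models/{model_name}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request("I love AI")
print(request.full_url)

# Against a running server, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the protocol is plain JSON over HTTP, the same request can be reproduced from any language with an HTTP client.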

B

But that's not all. Many transformer models support different NLP tasks. We saw text generation; now let's look at sentiment analysis. We will change our model settings to the sentiment analysis task and restart the server. We didn't specify a model, so it downloaded the BERT model that it deems best for the task at hand. Let's resubmit the data with our "I love AI" sentence and see what it generates.

B

It generated a positive sentiment score. In this demo, we saw how simple it is to run various NLP models and tasks on GitHub Codespaces with MLServer and the Optimum library.

A

Awesome, that worked out, so that's great. We're literally pushing the release of MLServer as we speak, as we had to do some last-minute updates, but you can try it out, so please do check out some of the notebooks. Now let's dive into the next step. We've run it locally; now we want to productionize it and make sure we run this service as a microservice in our Kubernetes cluster.

A

We're going to be using our Kubernetes operator for Seldon Core, which provides a set of custom resource definitions that abstract machine learning concepts into custom resources, like the SeldonDeployment, which allows you to deploy your models. If you remember the parameters that we used in MLServer and in Python, there is a one-to-one mapping between those parameters and the ones that are passed downstream.

A

Here you can see that it's the same text generation task. You can define your pre-trained model and select the optimization. The key thing here is that we are accessing the pre-trained models from the Hugging Face hub, which means that you can download the large range of models that they provide.

A

Once you deploy it with kubectl apply, you can see that the pods are running successfully and you can send a request. One thing to mention is that under the hood we also had to install a gateway controller in the cluster, in this case Istio, which provides the routing to be able to access the models. Seldon Core integrates with both Istio and Ambassador.
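The one-to-one mapping between the local settings and the Kubernetes resource can be sketched as follows. The manifest is built as a Python dictionary and printed as JSON, which kubectl also accepts. The field layout reflects my understanding of the SeldonDeployment resource and its Hugging Face server; the resource name "gpt2", the graph name "transformer" and the helper function are illustrative assumptions, not taken from the talk's repository.

```python
import json


def seldon_deployment(name, task, pretrained_model, optimum=True):
    """Assemble a SeldonDeployment-style manifest for the Hugging Face server (assumed layout)."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "protocol": "v2",
            "predictors": [
                {
                    "name": "default",
                    "replicas": 1,
                    "graph": {
                        "name": "transformer",
                        "implementation": "HUGGINGFACE_SERVER",
                        # Same three parameters as in the local MLServer settings.
                        "parameters": [
                            {"name": "task", "type": "STRING", "value": task},
                            {"name": "pretrained_model", "type": "STRING", "value": pretrained_model},
                            {"name": "optimum_model", "type": "BOOL", "value": str(optimum).lower()},
                        ],
                    },
                }
            ],
        },
    }


manifest = seldon_deployment("gpt2", "text-generation", "distilgpt2")
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the output to a file and running `kubectl apply -f` on it would be the manual equivalent of the deployment step; the GitOps workflow discussed later replaces that manual command.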

A

So that's basically the Kubernetes deployment part. But remember, when we're dealing with Kubernetes clusters and deployments, it's not just about running a pod. It's also about reproducibility and the ability to ensure you have rollback, disaster recovery mechanisms, and so on. So we're now going to delve into some best practices that have been covered quite extensively in the general cloud native ecosystem, things like GitOps or operational monitoring, but adapted to machine learning deployment workflows.

A

The first one we're going to cover is continuous delivery via GitOps. GitOps can be summarized as deployment as code through version control: the ability to have a one-to-one mapping between your Kubernetes cluster state and the equivalent in a versioned Git repo. One of the benefits of GitOps, of course, is the ability to roll back.

A

That's one of the things that normally gets covered, but another important benefit of GitOps is disaster recovery: if your cluster suddenly gets into an inconsistent state and you want to recreate it somewhere else, GitOps gives you a robust mechanism to do so. Similarly, for migration, you're able to replicate the cluster. So what does that look like if we have our data scientist and our machine learning engineer interacting with the Kubernetes cluster?

A

You would have the data scientists training new models, perhaps doing transfer learning, and pushing them into the Hugging Face hub or, as we also support, into a Google bucket, an S3 bucket, and so on.

A

Then the machine learning engineer is able to programmatically deploy that model by pushing the specific YAML configuration into the Git repository. Then, as you will see in the next part of the demo, the GitOps integration syncs that change from the Git repo and makes sure that the cluster reconciles with those changes. What that means in practice is what we just saw, only not with kubectl apply but with the GitOps workflow.

A

That would run our machine learning model runtime. Seldon Core would be the reconciler component that sees: you requested a SeldonDeployment with this runtime, so I am going to run that specific runtime, which in this case is MLServer, and then MLServer will fetch the particular pre-trained model that you specified. Simple enough. Now let's see what that looks like in practice.
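The reconciliation idea can be illustrated with a small toy: compare the desired state (what is committed in Git) with the actual state (what is running) and derive the actions needed to converge. This is only a conceptual sketch with hypothetical names and data; it is not how Flux or the Seldon Core operator are implemented.

```python
def plan_reconciliation(desired, actual):
    """Return the actions that would bring `actual` in line with `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical states: keys are deployment names, values are simplified specs.
git_state = {
    "gpt2-gpu": {"pretrained_model": "distilgpt2", "gpu": True},
    "gpt2-cpu": {"pretrained_model": "distilgpt2", "gpu": False},
}
cluster_state = {
    "gpt2-cpu": {"pretrained_model": "gpt2", "gpu": False},
    "old-model": {"pretrained_model": "bert-base-uncased", "gpu": False},
}

actions = plan_reconciliation(git_state, cluster_state)
print(actions)
```

Because the plan is derived purely from the two states, running it again after the actions are applied yields an empty plan, which is what makes rollback and cluster recreation from the repository straightforward.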

A

To configure it on our side, we have direct integrations with things like Argo CD; in this example we're using Flux. You can see that the Flux config specifies which repo it's going to sync into which cluster, as well as the particular parameters that we want to create. We will then make sure that, once the model is deployed, we are able to leverage some of the observability richness that you would normally get out of the box in Kubernetes, extending it into the world of machine learning.

A

What that means in practice is that with Seldon Core, all of the models that get deployed not only have REST, gRPC and Kafka APIs, but also expose metrics.

A

These include operational metrics like requests per second and latency, but also more advanced metrics like GPU utilization. This is relevant not only for our use case, but also when you're using more advanced runtimes like Triton, where you really want to squeeze out every millisecond or nanosecond of latency. Similarly, although it's out of the scope of this talk, you can collect the inputs and outputs of the model so that you can get insights into what's being processed on the inference side of your deployments.

A

We're going to see some practical insights that we can extract in the next demo, which is basically going to showcase all of the things that I covered. So let's switch to the next video, hope that it all works, and press play.

B

Now we will see how to deploy the same transformer model to Kubernetes with Seldon Core, following the GitOps approach. We have an AKS cluster with two node pools, one with GPU and one with CPU-based compute. The cluster is enabled with the GitOps Flux add-on, and we onboarded our code repo manifests to be synced with the cluster.

B

We have installed Seldon Core with Istio ingress on the cluster and are now ready to create SeldonDeployments. The first one demonstrates running the model with the Hugging Face runtime on GPU nodes. We can see that we set the Hugging Face server as our runtime.

B

We set the task that the model performs to text generation, and we use distilgpt2 as our pre-trained model. We have also defined tolerations and NVIDIA GPU requests so that the model runs on GPU nodes. For the CPU version, we have removed the tolerations, and it will run on CPU nodes.
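The scheduling part of that manifest can be sketched as a pod spec fragment. The resource name `nvidia.com/gpu` is the standard one exposed by the NVIDIA device plugin, but the taint key and value below are hypothetical placeholders, since they depend on how the GPU node pool in a given cluster is tainted; the helper names are also made up for this example.

```python
import json


def gpu_component_spec(container_name, gpus=1, taint_key="sku", taint_value="gpu"):
    """Pod spec fragment that requests GPUs and tolerates a (hypothetical) GPU node taint."""
    return {
        "spec": {
            "containers": [
                {"name": container_name, "resources": {"limits": {"nvidia.com/gpu": gpus}}}
            ],
            "tolerations": [
                {"key": taint_key, "operator": "Equal", "value": taint_value, "effect": "NoSchedule"}
            ],
        }
    }


def cpu_component_spec(container_name):
    """CPU variant: no GPU limit and no tolerations, so it lands on untainted CPU nodes."""
    return {"spec": {"containers": [{"name": container_name}]}}


gpu_spec = gpu_component_spec("transformer")
cpu_spec = cpu_component_spec("transformer")
print(json.dumps(gpu_spec, indent=2))
```

In a SeldonDeployment such a fragment would typically sit under the predictor's component specs, with the container name matching the name used in the inference graph.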

B

We have committed the manifests to the repo, and we see the Flux controller syncing the latest commit with the cluster. Resources are being deployed, and it takes a few seconds for the readiness probes to turn green. The Seldon controller processed the CRD object and deployed two containers for each model, plus virtual services to enable routing to the model through the Istio ingress.

B

Our model pod has one container with the MLServer configuration, and the other is the Seldon sidecar performing orchestration tasks. Now we have both models running.

B

Let's compare how these models, deployed on separate nodes, perform. We will use the k6 load testing tool and define two scenarios running multiple iterations with a text prediction payload against both models. We see that the GPU outperforms the CPU by a large degree, running hundreds of iterations while the CPU processes just 12. Once the test finishes, we see that CPU-based runs take 2 seconds, while the GPU takes 10 times less, just 200 milliseconds. The MLServer Optimum library integration abstracted the complexity of model serving away from us, and we were able to utilize the underlying GPU infrastructure very efficiently.

A

Awesome. I think one of the things we can see from that demo is ultimately the comparison for a slightly more complex machine learning model, which may take a couple of seconds to process the input data. The interesting thing is that if we perform the inference one input at a time, the CPU and the GPU would perform at the same speed.

A

The benefit comes when we do batching, when we send multiple requests batched together to be processed by the GPU in a single pass. Similarly, one of the things that is outside the scope of this session, but that you can try out yourselves, is how to leverage adaptive batching, or predictive batching as it's also called: the ability to have the server itself do the batching.

A

So you can send a heavy load and the server will take some requests, let's say 100, run them together on the GPU, and then make sure that the results get returned over the relevant open connections. That makes sure that you are still able to leverage the optimizations within the GPU itself.
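The server-side batching described above can be sketched with a queue: the server drains up to a maximum number of pending requests, or waits a short time for more to arrive, then runs them as one batch and routes each result back to its request. This is a simplified, single-threaded illustration with hypothetical names and parameters, not MLServer's or Triton's actual implementation.

```python
import queue
import time


def collect_batch(pending, max_batch_size, max_wait_s):
    """Drain up to `max_batch_size` requests, waiting at most `max_wait_s` in total."""
    batch = []
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(pending.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


def serve_until_idle(pending, predict_batch, max_batch_size=3, max_wait_s=0.05):
    """Run batches until no request arrives within the wait window."""
    responses = {}
    while True:
        batch = collect_batch(pending, max_batch_size, max_wait_s)
        if not batch:
            return responses
        request_ids, payloads = zip(*batch)
        outputs = predict_batch(list(payloads))
        # Route every output back to the request it belongs to.
        responses.update(zip(request_ids, outputs))


batch_sizes = []


def fake_gpu_predict(payloads):
    """Stand-in for a batched model call; records how many inputs it received."""
    batch_sizes.append(len(payloads))
    return [text.upper() for text in payloads]


pending_requests = queue.Queue()
for request_id, text in enumerate(["a", "b", "c", "d", "e"]):
    pending_requests.put((request_id, text))

responses = serve_until_idle(pending_requests, fake_gpu_predict)
print(batch_sizes, responses)
```

The two knobs, maximum batch size and maximum wait time, trade a little latency per request for much better accelerator utilization under load.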

A

Again, the great thing about all of this, as we all love open source, is that if you find any issues or something that needs improvement, we would love for you to open an issue or even a PR; as always, those are very much welcome. So, to summarize, let's take a step back and see what happened from the big-picture perspective.

A

Let's see what the anatomy of production MLOps looks like. We can see all of our persistence areas: the training data, the artifact store, and then also the Git repo and the inference store. The first step, which we skipped in this talk, is experimentation, when data scientists are training machine learning models, using data and converting it into artifacts.

A

In the case of the Hugging Face demo, this is basically pushing them into the Hugging Face hub, but it can also be an artifact store such as S3, a Google bucket, Azure Blob, and so on. The next step is once you have a model that is ready to be productionized: you would either manually or programmatically ensure that it's pushed into the Kubernetes cluster.

A

We can cover that from the CI/CD side programmatically, with a CI pipeline or an ETL pipeline that is responsible for potentially packaging the model, potentially pushing the runtime if it's all encompassed in an image, or just pushing the model into the artifact store, and then pushing into the GitOps repo.

A

This is quite important, because in the previous slide we showed how the machine learning engineer would potentially push into the GitOps repo, but from our side we normally tend to discourage that. Pushing manually into a GitOps repo is not something that you should do, particularly given that the GitOps repo, at least in some contexts, is seen more as a data store. It is the state of the cluster, and it gives you the ability to roll back. If you can do that programmatically, it provides an extra level of security.

A

Now, as we also saw, that's when Flux comes in and performs the reconciliation with the cluster. That means that you have your real-time or batch models running in Kubernetes and then, of course, the operational metrics that we were showing: the monitoring, the observability.

A

Of course, again, this is not a monitoring talk. I have links to resources that cover things like drift detection and outlier detection, so that you can delve into proper data science monitoring with cloud native architectures. But this gives you an idea of what we call the anatomy of the MLOps lifecycle. So that's the main premise: taking a step back and getting a bit of an intuition without going too much in depth.

A

One last thing that I do want to highlight is the step that we showed about running MLServer locally. One of the key challenges we see is that, often in the MLOps lifecycle, data scientists or machine learning engineers go straight from experimentation to production. That's a challenge because if they have a container crashing, it introduces a very inefficient loop between the data scientist and the DevOps engineer or platform engineer, going: "Hey, can you send me the logs? Hey, why is this not working?"

A

So that part about being able to run it locally, making sure that everything works, sending some requests and debugging it, is quite key in this workflow. If you want further resources, we have other talks that we've given at previous KubeCon conferences: on CI/CD for production machine learning at scale; on production machine learning monitoring with explainers, drift detectors and outlier detectors; a similar one on accelerating ML inference at scale, but with ONNX and Triton; on machine learning security; and on the machine learning ecosystem and operations and the current state of that space.

A

You can find the slides at the bit.ly link at the top right over there. So if you want to access the slides, the resources and the notebooks, check them out there.

A

So just to summarize again: today we covered machine learning acceleration at scale, how to optimize your models, how to run them locally, how to deploy them to Kubernetes, and how to introduce production cloud native tooling. Again, thank you so much, and thank you for bearing with us with this juggling of video and presentation. I hope you enjoyed it. I'll take questions if anyone has them; if not, you can grab me for a drink later on for more questions. Thank you very much.

A

So, any takers for questions?

A

Any brave ones? Nice, we have one there.

A

Oh, I think, yeah, just him. He hasn't seen you. Oh, you have a microphone? Okay, if you tell me, I'll repeat the question.

A

Right, so the question is: do we have any methods to share GPUs across containers? That's interesting; we were just chatting about that, and the talk right before this one was also about that. Seldon Core operates at a high level, scheduling and building the pods, so we would rely on the schedulers themselves. There was a very interesting talk from NVIDIA that one of my colleagues was mentioning.

A

It was about how to introduce, at the scheduler level, the ability to specify fractions of GPUs and have containerd, or the lower-level magic, handle that for you without having to deal with it yourself. From our perspective, that would be the ideal. But of course you can leverage some of the runtime capabilities, like those in Triton. I didn't cover how you can use Triton instead of MLServer.

A

But when you deploy your models using Triton, you have access to that low-level configuration. Of course, it is at the mercy of your configuration, so you have to specify it at the pod level and handle it in your configurations. So it does get a little bit complex, but there are options. From our perspective, one is GPU sharing and another is multi-model serving, which is one of the easier ways of sharing one GPU across multiple models: just having one container running multiple models.

A

MLServer allows for multi-model serving, Triton allows for multi-model serving, and we've been doing a lot of collaboration across these projects to make sure that the APIs are consistent. The management APIs are the same and the inference APIs are the same across MLServer and Triton.

A

So it's more of a preference of which one you want to use. That would be the current answer, which is not very much a full answer, but good question.

A

Other questions?

A

Awesome. Oh, I think we have one over there.

A

And if not, you can grab me for deeper dives and questions later on. Hi.

D

Great talk, thank you. How do you deal with model versioning? I usually use DVC for versioning the models, and how does this affect the GitOps section of what you talked about?

A

Yeah, that's a really interesting question, and I'm always really keen on delving into that. The DVC team is awesome; we've done collaborations with them, and we have examples of how to deploy models and pipelines that have been trained using DVC. DVC handles the experimentation part. The reason I'm pointing this out is that in the experimentation part you may have a hundred experiments with a hundred artifacts.

A

When you move to production, you choose one artifact, one experiment; you say "I want to productionize this one experiment". Now you move into a new realm where the relationship between production models and experiments is different. You have a one-to-many relationship where one experiment can have multiple deployments: it can be deployed in a dev environment, in a production environment, across three namespaces, and so on.

A

So when it comes to versioning, we keep the principle that the experiments themselves must be consistent, making sure that each experiment has a unique identifier. Whenever you productionize it into your GitOps repo, there is a unique identifier in the YAML, so whenever you change the experiment, the YAML changes. Of course, there are some servers that allow you to just point to a bucket and have the server update whenever the bucket content changes.

A

That's something that is a no-no for us. We ensure idempotency: whenever there is a change in the YAML, there is a change in the server, and there is no magic underneath. So there are some considerations to take into account.

A

In summary, they are the relationship between experiments and production services, as well as the ability to ensure idempotence in the YAMLs and the GitOps components themselves, so that you can trace back all the way to the previous steps in the machine learning lifecycle. Yeah, awesome.
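One way to picture the "unique identifier in the YAML" principle from this answer is to derive the model location from the artifact content itself, so that a new artifact necessarily produces a different manifest. The sketch below uses a content hash for that; the bucket name, the path layout and both helper functions are hypothetical, and a team could equally use an experiment ID or a DVC revision as the identifier.

```python
import hashlib


def versioned_model_uri(bucket, model_name, artifact_bytes):
    """Derive an immutable, content-addressed location for a model artifact."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()[:12]
    return f"{bucket}/{model_name}/{digest}"


def manifest_for(model_uri):
    """Minimal manifest fragment that pins the deployment to one exact artifact."""
    return {"graph": {"name": "transformer", "modelUri": model_uri}}


uri_v1 = versioned_model_uri("s3://models", "distilgpt2", b"weights-v1")
uri_v1_again = versioned_model_uri("s3://models", "distilgpt2", b"weights-v1")
uri_v2 = versioned_model_uri("s3://models", "distilgpt2", b"weights-v2")
print(manifest_for(uri_v1))
print(manifest_for(uri_v2))
```

The same artifact always maps to the same manifest, and a changed artifact always changes the manifest, so every change to what is served shows up as a commit in the GitOps repo rather than as silent bucket content drift.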

C

Thanks for the talk. For example, if you want to go one step further: once you have done the inference and you detect something in a video or that kind of thing, and you want to trigger an action related to this inference, does Seldon offer any feature that can be used, like a pipeline or something like that?

A

Yeah, great question. The short answer is yes; the long answer is it's complicated. We have multiple different interfaces through which you can trigger events. One of them, let me see if I can just open it, is operational metrics. This maps one-to-one to your service level agreements: you can say, I want to set a service level objective.

A

I want my model to sustain this amount of throughput or stay under this maximum latency, and I want to set up alerts through Alertmanager or something like that. So that's on the operational level.
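The arithmetic behind such a service level objective can be sketched in a few lines. In practice this check would live in a monitoring stack as alerting rules rather than in application code; the function names, thresholds and sample latencies below are all made up for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer `pct` between 1 and 100."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]


def check_slo(latencies_ms, window_s, max_p95_ms, min_rps):
    """Compare observed p95 latency and throughput against SLO thresholds."""
    p95 = percentile(latencies_ms, 95)
    rps = len(latencies_ms) / window_s
    alerts = []
    if p95 > max_p95_ms:
        alerts.append(f"p95 latency {p95} ms exceeds {max_p95_ms} ms")
    if rps < min_rps:
        alerts.append(f"throughput {rps} req/s is below {min_rps} req/s")
    return {"p95_ms": p95, "rps": rps, "alerts": alerts}


# Hypothetical 10-second windows: a healthy one, and one with two slow requests.
healthy = check_slo([200] * 20, window_s=10, max_p95_ms=500, min_rps=1)
degraded = check_slo([200] * 18 + [2000] * 2, window_s=10, max_p95_ms=500, min_rps=1)
print(healthy)
print(degraded)
```

An alert produced this way is the kind of event that could then trigger a follow-up action, such as scaling the deployment or notifying the on-call engineer.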

A

On the data science metrics side, I talked a little bit about drift detectors and outlier detectors, which we didn't cover in this talk. You will see in some of the resources that I link that there are ways in which you can hook in extra components that listen to the various inputs and outputs of your model to perform advanced checks of the current state, and then you're able to trigger respective actions depending on the outputs of those.

A

So the answer is that Seldon provides you with the Lego blocks that your platform teams can then arrange accordingly. As an open-core company, that's more where we come in, providing that sort of management layer.

A

The open source provides you with all of the tooling that you would need, and that's why we have seven million downloads and people integrating in different ways, whether it's with Knative or whatever. So the answer is yes.