From YouTube: wasmCloud Working Group - Machine Learning 03/23/22
Description
wasmCloud is a platform for writing portable business logic that can run anywhere from the edge to the cloud, and that boasts a secure-by-default, boilerplate-free developer experience with a rapid feedback loop.
https://wasmcloud.com
Hello, this is Christoph Brewing from Munich, Germany, and today I'll show you around our machine learning inference capability provider and the application we built around it.
So in the architectural drawing you see here on the screen, there is a picture of the application. The big rectangle in the center illustrates our wasmCloud runtime.
There are multiple artifacts: the smaller rectangles are capability providers. We have got an HTTP server, which is also the endpoint to the application, on the lower left, and on the top right we have the capability provider which does the inference processing, so the machine learning processing. It is marked IE, which stands for inference engine. That is a bit misleading, however, because we already have more than one inference engine implemented in it; but now you know what it represents. Then, inside the wasmCloud runtime, in the form of circles, you have the actors; we have two of them here.
One is the inference API actor (IApi), which is directly connected to the HTTP server. The other one, marked I, you can imagine not being there: it is not implemented. We intended it for a first iteration of the application, so it is not important. Outside of the wasmCloud runtime you have the Bindle server. The Bindle server is a kind of blob store, and that is where we store the AI models as well as some metadata.
So what happens when you issue requests against that wasmCloud runtime is that they come into the HTTP server, are routed via the inference API actor to the capability provider, which does the processing, and the response goes back via the inference API actor and the HTTP server, so that you will hopefully receive a 200 OK response with the result of the inference processing.
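To make that request path a bit more concrete, here is a minimal Rust sketch of the routing idea. It deliberately does not use the real wasmCloud interfaces; every type and function name below is illustrative only.

```rust
// Illustration of the request path described above, NOT the actual wasmCloud
// actor code: real actors use the wasmCloud HTTP server and ML inference
// contracts. The names here are stand-ins.

/// Stand-in for the HTTP request delivered by the HTTP server provider.
struct HttpRequest {
    body: Vec<u8>, // serialized input tensor
}

/// Stand-in for the HTTP response handed back to the caller.
struct HttpResponse {
    status: u16,
    body: Vec<u8>, // serialized output tensor
}

/// Stand-in for the inference capability provider ("IE").
trait InferenceEngine {
    fn predict(&self, model: &str, input: &[u8]) -> Result<Vec<u8>, String>;
}

/// Stand-in for the inference API actor: it only routes the request to the
/// provider and wraps the result into an HTTP response.
fn handle_request<E: InferenceEngine>(engine: &E, model: &str, req: HttpRequest) -> HttpResponse {
    match engine.predict(model, &req.body) {
        Ok(output) => HttpResponse { status: 200, body: output },
        Err(_) => HttpResponse { status: 500, body: Vec::new() },
    }
}
```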
However, as I said, before the processing can start, the capability provider IE has to download all the models, and that is done during startup of the application.
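As a rough illustration of that startup behaviour, the sketch below pre-fetches every configured model into memory before any request is served. The plain HTTP download, the URLs, and the function names are assumptions for illustration; the provider's actual Bindle client code is not shown.

```rust
// Sketch only: pre-load every configured model at startup so that inference
// requests never have to wait for a download. Requires the reqwest crate
// with its "blocking" feature. URLs and names are hypothetical.
use std::collections::HashMap;

fn preload_models(
    model_urls: &HashMap<String, String>,
) -> Result<HashMap<String, Vec<u8>>, Box<dyn std::error::Error>> {
    let mut cache = HashMap::new();
    for (name, url) in model_urls {
        // Download the raw model bytes; a real provider would also fetch and
        // parse the metadata describing tensor shapes and which engine to use.
        let bytes = reqwest::blocking::get(url.as_str())?.bytes()?.to_vec();
        cache.insert(name.clone(), bytes);
    }
    Ok(cache)
}
```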
Don't worry about the repository URL; you can ask for it in the Slack channels, and I think you will get a quick response. One of the preconditions is that you have installed a Bindle server. As far as I know, you can install any version. Note, however, that with versions later than 0.7.1 you have to deal with keys, since they hardened the security aspects.
So for development purposes I use 0.7.1, and then you do not have to worry so much about security. What else do you have to install? Docker and Docker Compose.
I refer you to the wasmCloud page; they have a good guide on what you have to take care of and what you have to install. Otherwise, we may directly jump in and start the application now.
Let me see if I have it ready for us. Before you can run the application, as I said, you have to load the Bindle server with the models. Luckily for us, the models are pre-configured in the repository.
So let's see how you load them up into the Bindle server. Once you have cloned the repository, go into the deploy directory; there we have a start script called run. This is your friend: if you just type it, you get all the sub-commands, including the one you need in order to upload.
You have to do this only once: once you've uploaded the models and the metadata to your Bindle server, you do not have to do it again. Whenever you start or stop the Bindle server next, you can do it with the other sub-commands you see, bindle start and bindle stop.
So now that that's started, it's time to start the application, and that is done via run all.
You can follow the logs, so while the application is starting we may have a look at the washboard, which is getting more and more complete. What do we see on the washboard? We see the application starting up with all its technical components. At the top right we already have the ML inference capability provider; you see its status is healthy, so it's up and running. On the lower left you see two links; one link is between the HTTP server and the inference API actor.
This is a request against the ONNX inference engine; its model is called identity. The model is configured such that the output of the model is always the input, so it yields what it gets. That is not particularly interesting for a real-world use case; however, if it works, you know that the application is up and running and working well. If you trigger that, you see we get a 200 back, and we also get a result. If you look at that result, you see a field has_error which is false. That's good already!
If you compare the content, you see that it is in fact identical to the request. That's fine! So that's a request against the ONNX engine.
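If you want to script that identity check yourself, something like the following Rust sketch would do. The endpoint path, port, and raw-bytes payload layout are assumptions for illustration, and the real response also carries the has_error flag mentioned above.

```rust
// Sketch of the identity check: send a small f32 tensor to the ONNX
// "identity" model and verify the output equals the input. Endpoint and
// payload layout are hypothetical; requires reqwest with "blocking".
fn check_identity() -> Result<(), Box<dyn std::error::Error>> {
    let input: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0];
    // Serialize the tensor as little-endian bytes.
    let body: Vec<u8> = input.iter().flat_map(|v| v.to_le_bytes()).collect();

    let client = reqwest::blocking::Client::new();
    let resp = client
        .post("http://localhost:8078/identity") // hypothetical URL and route
        .body(body.clone())
        .send()?;

    assert_eq!(resp.status().as_u16(), 200); // hopefully a 200 OK
    assert_eq!(resp.bytes()?.to_vec(), body); // identity: output == input
    Ok(())
}
```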
Now, we also support a TensorFlow engine, so we've got a second model. It's also pretty simple; however, it is not another identity model. It is a "plus three".
So this is exactly what we got back, and if you deserialize that into f32, you see, down here, that it incremented every input by three. That's why it's called plus three. Yeah, that's it!
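Here is a small, self-contained Rust sketch of that deserialization step; the little-endian byte order is an assumption about the payload format.

```rust
// Sketch of the "plus three" check: deserialize the returned bytes into f32
// values and confirm every input was incremented by 3.0. Little-endian byte
// order is assumed.
fn deserialize_f32(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

fn main() {
    let input: Vec<f32> = vec![1.0, 2.0, 3.0];
    // Pretend these bytes came back from the TensorFlow "plus three" model.
    let returned: Vec<u8> = input.iter().flat_map(|v| (v + 3.0).to_le_bytes()).collect();

    let output = deserialize_f32(&returned);
    assert!(input.iter().zip(&output).all(|(i, o)| (o - i - 3.0).abs() < f32::EPSILON));
    println!("{:?}", output); // [4.0, 5.0, 6.0]
}
```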
Those were the two examples.