From YouTube: OpenShift Commons Briefing: Deploying KPMG Ignite on OpenShift - Kevin Martelli, Hongfei Cao (KPMG)
Description
OpenShift Commons Briefing
Deploying KPMG Ignite on OpenShift - Overview and Deep Dive Demo
Kevin Martelli and Hongfei Cao (KPMG)
2020-06-03
A
All right, everybody, welcome back to another OpenShift Commons briefing. We'll be live-streaming on multiple channels, Twitter, Facebook, and YouTube, and we'll collect your questions from there. Today we have a new, hopefully, Commons member, KPMG, joining us to give us a talk on their Ignite platform, and, well, we'll find out what it's all for. It's machine learning, data science, all kinds of good stuff. We have Kevin Martelli and Hongfei Cao.
B
Thank you so much, and thanks everyone for taking the time to go through this presentation. Quickly, before we dive into the details, I want to introduce myself: my name is Kevin Martelli, I'm a principal software engineer at KPMG, overseeing our big data software engineering team, with a lot of work related to cloud, containerization, and end-to-end deployments with machine learning. With me I also have a colleague, Hongfei Cao. Hongfei, maybe a quick introduction on yourself?

C
Yes.
B
Great, thank you. So, as was mentioned, we're going to give you a little bit of background on the motivation we had for building what we call KPMG Ignite, which is our data science platform and ecosystem for bringing use cases from POCs into production. We'll drill a little bit into what Ignite is, so the audience can get a feel for what we're dealing with.
B
The demonstration will be representative of how the business can interact with the platform, how data scientists can interact with it, as well as how engineers can interact with it, and how we can productionize pipelines on top of that, which we ship within containers. So, looking forward to this, and again, any questions, please feel free to type them in as we go.
B
So the first question many of you may ask is: why did we decide to build and invest so heavily in this platform that we use for our internal data science as well as client initiatives? As many of you could probably guess, one of the main drivers was just the explosion, if you will, of AI in the marketplace, with many clients looking at it and at leveraging it.
B
How can they use it? We saw that there was a need to bring some of these technologies together, and not only from the standpoint of production-ready use cases. We see a lot of people who are able to build cool POCs, but they're not always able to get them into a production-ready format. So one of the goals within our Ignite platform is: how can we bring these capabilities from POC to realized production?
B
In addition, making sure we have the right hooks: how does the business interact with the platform, and how do our scientists and engineers? Again, we'll drill a little bit into that detail. But as we looked at that, there are really about five or so areas where we see that enterprises need to be aggressive in order to support AI and to bring AI more seamlessly into the organization.
B
Some of these areas are covered by Ignite, and some of these areas have to be augmented with separate business processes. One of the important things here, number one, is around data literacy and building out your data expertise. We used to think about this more along the lines of something that an engineer, a scientist, or a business analyst would do, but this is now more holistic across the organization: how do business folks understand their data? Can they get the right training data?
B
The next is around technology, and I think we can all agree that there's been a technology explosion in this space. Every day you wake up, there are new technologies available to produce similar types of outputs. What technology should you use? What technology shouldn't you use? Where do you want to invest your R&D?
B
One of the bets we took for Ignite is that we realized this market was going to expand so quickly that we needed a very open ecosystem: microservices, containerization, easy to plug in and plug out as new AI tools and capabilities came to the marketplace. That's something we'll show you within this demonstration. Then there are business processes: how do business processes now change? Are people relying on the data from data science? Are people embedding this into their day-to-day work activities? How are enterprises adopting this? And then the workforce.
B
It's important that we enhance some of our legacy skill sets, that we hire new folks, and that our workforce supports this. How do we move from the legacy, monolithic approach into this more agile, quick development and productionization of machine learning pipelines? And then the final part I'll just briefly touch on is around risk and reputation. I think we all know there can be a lot of risk associated with some of this.
B
There's the importance of understanding your model and understanding the details that are going into your model. How are you managing this? We see a lot of organizations coming up with ways to have explainability and to make sure their models are free of bias, and how do you fix these things? So there is risk, and reputational risk, that organizations are facing in this era, and again, these are some of the capabilities we hope we can help support through the deployment of our KPMG Ignite platform.
B
With that, I'll jump down to: what is KPMG Ignite? We touched a little bit on these topics earlier, but essentially we have the who, the what, and the why. For the who: who was this platform built for? Who did we have in mind that could use it? It was a platform predominantly built for our data scientists and data engineers, so they could build the pipelines. However, without the business, and the ability for others, analysts, etc., to put input into the system...
B
...it just wasn't necessarily as useful, I would say. So there are business hooks: whether that comes from annotating and creating training data, or from validating model results, there are different areas where the business and business analysts can come in to work within the platform. Then the what part: what is it? As we talked about a little before, this is a global AI platform. We have a very modular, microservice type of delivery, so each module (and we'll dive into that) could be, say, an OCR job.
B
A module can be a model; a module can be a data extraction. We'll jump into what these modules are, but they're built in such a way that they're interchangeable. So if a new capability comes out in the marketplace and we want to take advantage of it, you can plug it into the platform seamlessly, and if its capabilities are deprecated, it can be removed from the platform.
B
Finally, the why. What we noticed, as we mentioned before, is that there's high demand for these types of capabilities. But more importantly, as organizations took their journey into the AI space, there was a lot of work around unstructured and semi-structured data sets: loan documents, contract documents, PDF documents, voice documents, etc. How could organizations get the right information out of them and make the correct business decision?
B
This is, I would say, the crux of what we built our platform on. We're really building it on the foundation of very small capabilities and services that can be interchanged with other services, and that you can string together into a pipeline to produce some type of output. For example, maybe a pipeline is: you need to OCR a PDF document; you then need to break down that PDF document so you can start making business decisions.
B
So maybe you add some spaCy into it to enhance it, you might do some sectioning of the document, and then ultimately you might make some type of business decision on it. All these different components are modular and can be executed by themselves, or you can call different components that may reside in different cloud providers, whether that's GCP, AWS, or Azure.
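The modular, interchangeable pipeline idea described here can be sketched in a few lines of Python; the step names below are hypothetical stand-ins, not actual Ignite components:

```python
# Minimal sketch of the modular-pipeline idea: each step is a small,
# interchangeable callable, and a pipeline is just an ordered list of steps.

def ocr_step(doc):
    # In a real pipeline this would call an OCR engine on a scanned PDF;
    # here a placeholder transformation stands in for text extraction.
    doc["text"] = doc.get("raw", "").upper()
    return doc

def section_step(doc):
    # Break the extracted text into sections for finer-grained decisions.
    doc["sections"] = doc["text"].split(". ")
    return doc

def decision_step(doc):
    # Final business decision derived from the earlier steps' output.
    doc["accepted"] = len(doc["sections"]) > 1
    return doc

def run_pipeline(doc, steps):
    for step in steps:  # steps are interchangeable and reorderable
        doc = step(doc)
    return doc

result = run_pipeline({"raw": "clause one. clause two"},
                      [ocr_step, section_step, decision_step])
```

Because each step only takes and returns a document, swapping one OCR engine for another means replacing a single callable, which is the "plug in, plug out" property described above.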
B
Another area was that it could be deployed rapidly as containers, and it supports RESTful services into the platform, based on the demand you need to push through. And finally, it's around reusability. All these components are reusable: if one person creates a component in the community, and that component has the capability of doing something, that component gets checked in, can be reviewed, and then rebuilt into an image and ultimately brought back into the platform.
B
Within the data science notebooks, you have the ability to build and own your own pipelines and workflows, to test them, to build them, and then to deploy them. The data science notebooks themselves are served through JupyterHub, and each user gets their own individual notebook. Then we talked a little bit about how the business and its users can get in there, and we have two main mechanisms. We have the annotation UI.
B
Understanding
the
statistics
associated
to
your
model
many
times,
feeding
into
your
model
governance
process,
as
well
as
serving
up
models
and
I
guess
one.
This
thing
I
want
to
show
on
this
slide
is:
if
we
sort
of
understand
the
platform
I
want
to
walk
through
from
the
bottom,
for
the
pools
are
persistent
volumes
up
through
kind
of
the
the
application
layer.
Just
to
give
you
a
feel
for
how
the
platform
itself
works.
B
So at the bottom we have persistent volumes for our OpenShift cluster, attached into the cluster. If you move one layer up, there's our infrastructure. We use distributed MinIO to facilitate object storage, which gives us better latency, faster read and write times. We also have a Postgres database that stores a lot of the metadata associated with the processing and the workflows. We have our logging and reporting within Kibana and Elasticsearch. We use Kafka as our message broker.
B
Kafka is really set up in a way that allows you to execute one component, and when that component finishes, its output goes onto the queue for the next component to pick up. And finally, everything is executed across the container orchestration platform, with orchestration through OpenShift. All right.
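The component-to-component hand-off Kafka provides here can be illustrated with an in-memory sketch; the topic names and message shapes are hypothetical, and a real deployment would use a Kafka client against the broker rather than local queues:

```python
# In-memory sketch of the Kafka-style hand-off: each component consumes a
# message, does its work, and publishes the result for the next component.
from queue import Queue

topics = {"ocr-done": Queue(), "ner-done": Queue()}  # stand-ins for Kafka topics

def ocr_component(doc_id):
    # ... run OCR on the document ...
    topics["ocr-done"].put({"doc": doc_id, "text": "extracted text"})

def ner_component():
    msg = topics["ocr-done"].get()   # next component picks the message up
    msg["entities"] = ["KPMG"]       # ... run entity extraction ...
    topics["ner-done"].put(msg)

ocr_component("contract-001")
ner_component()
final = topics["ner-done"].get()
```

The queue decouples the two components: the OCR stage neither knows nor cares which component consumes its output, which is what makes the modules interchangeable.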
C
So in this architecture, as Kevin covered, we have the persistent volumes and the infrastructure. On top of the infrastructure we have our microservices: each infrastructure component is deployed into a secured OpenShift environment and exposed as an API for users to access. And on top of that we have our machine learning pipeline components; you can see there are some pre-built modules.
C
So here is another view. I want to show you the high-level architecture of Ignite and how we deploy it on OpenShift; after that, I will show you the actual deployment on our own OpenShift cluster. As we mentioned, the whole deployment is container-based: everything is Dockerized and can be shipped to any cloud infrastructure, whether that's IBM, Azure, on-prem, or a private cloud, and we deploy the whole platform using a CI/CD pipeline.
C
That means tooling like Jenkins or Tekton, etc. Once we have the infrastructure components deployed, we can use a Jupyter notebook as our data science platform to customize and build any ad-hoc machine learning pipeline workflow using the predefined components, or images. Here we show one workflow we built using Ignite: it starts from raw scanned PDFs, and we can visualize it.
C
At the end, we have a prediction and extraction component. The whole model is a machine learning model, and it is also deployed and version-controlled using MLflow. For those not familiar with MLflow: it is an open-source project which offers version control and centralized storage for machine learning models. It supports Python scikit-learn models, as well as Spark MLlib models, in pickle format, and TensorFlow, PyTorch models, etc.
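MLflow handles this at production scale, but the core service it provides to a platform like this, a serialized model binary stored under a name and version so any component can load it back, can be sketched with just the standard library. The model class and registry below are toy stand-ins, not MLflow's API:

```python
# Sketch of model version control: serialize a trained model and store it
# under a (name, version) key, exactly as a central model store would.
import pickle

class ThresholdModel:
    """Toy stand-in for a trained model (e.g. a scikit-learn estimator)."""
    def __init__(self, threshold):
        self.threshold = threshold
    def predict(self, x):
        return x >= self.threshold

registry = {}  # in-memory stand-in for centralized model storage

def log_model(name, model, version):
    # "Log" the model: persist the serialized binary plus version metadata.
    registry[(name, version)] = pickle.dumps(model)

def load_model(name, version):
    # "Load" a specific version back, as a serving component would.
    return pickle.loads(registry[(name, version)])

log_model("start-date-extractor", ThresholdModel(0.5), version=1)
restored = load_model("start-date-extractor", version=1)
```

Pinning consumers to an explicit version is what lets a retrained model be rolled out (or rolled back) without touching the components that call it.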
C
I'll show how we can use our annotation UI to prepare the training data, generating the labels for our supervised learning model, all the way through model prediction and classification. We have an interface and annotation tools for any user to correct the model output. And in the last demo, I will show you how to use a Jupyter notebook to customize and create an arbitrary machine learning workflow pipeline using predefined Ignite components, for example spaCy, OCR, and entity detection.
C
Without further ado, let me jump to the first demo. As you can see, we deploy the whole Ignite platform onto a secured OpenShift platform. Let me quickly jump to the components we deployed. As mentioned earlier, we are using the CI/CD pipeline to deploy the various infrastructure components for the Ignite platform, and we also set up services and routes for any API or web application we expose to the end user.
C
This deployment pattern also helps us customize and configure the service accounts and role bindings for the namespace, and make sure the secrets are placed properly. As you can see, we have several pods holding up the endpoints, with multiple pods running at this point. And for the data scientists and data engineers, we have Jupyter notebooks for them to use to customize or develop their machine learning pipelines.
C
Each data scientist or engineer can log into our JupyterHub environment, which creates their own pod, so they have a segregated environment for testing and development. They don't need to worry about access control, or about other users accidentally changing or removing their code.
C
Thank you, yes. So let's quickly jump to the Ignite IQ demo. Here is the tool for the data scientists and data engineers to manage machine learning models. On the backend, we are using MLflow to store and version-control the trained models.
C
It actually includes the whole serialized binary of your model, so this tool basically uses MLflow as a model store. You can use it to check and review all the existing models in MLflow, and you can also manage and create new models with this web application. Here, I've already logged in as admin; in this UI you can add a model workspace, and we already have three models created.
C
Let me jump into one model here; this model is called "start date". What happened is, we were processing a bunch of financial-services contract documents, and this is a model that will extract the start date of the contract from the raw PDF content. We don't rely on any predefined template; this is a purely trained NLP model.
C
Okay, so the first thing you want to do is create the model using this Ignite IQ admin tool, by giving the model name. At the start of this model's lifecycle it is always in a "setup" state, meaning you give the model a name, set the target accuracy you want to reach for the model, and then you start to prepare the training and testing data sets.
C
At that point we move to the annotation stage. Once the training and testing data sets are ready, we'll move on to the modeling stage, which is where we train the model and validate it against the test data set. At the end, we have a holdout data set for you to really validate and test your model on data that never ran through the labeling. And at this stage we have a user interface for any user.
C
They
can't
go
to
the
model,
result
and
amenity,
garage
or
oxides
and
multiple
result.
It's
a
las
that
is
complete.
So
when
we
save
the
model
in
using
these
admin
tools,
what
happened
is,
in
the
back
hand,
the
ml
flow
or
it
will
create
entering
a
project
in
the
end
up
flow
for
this
model.
It
quickly
show
you
here:
this
is
the
back
and
enough
flow
engine,
as
I
mentioned
earlier.
This
is
used
as
over
centralizing
mode
of
storage
model
database.
B
One thing to note here: a lot of the datasets that are part of this MLflow store are ultimately feeding back into the governance processes that organizations may have. So a lot of these statistics and data sets that are coming out of your confusion matrix, etc., can then feed back into the overall governance process of your model management.

C
Yep.
C
So here, we use our MLflow model store to track model performance, as well as to save and manage the actual model binary in a serialized format; for example, a scikit-learn model will be saved in pickle format. Let me jump back to Ignite IQ. At this point we have already created a model using the Ignite IQ admin tool.
C
As you can see, we have three models already loaded; again, the backend is MLflow, and the model for this demo is the start-date model. The first thing is, we want to have a set of labeled data for our model training. As Kevin mentioned, this is often the part where the data scientist and the SME prepare the training data.
C
Since this is a supervised learning model, and for the start-date model in particular, we need a bunch of contract data as raw PDFs, and the SMEs on the business side, who understand the content, use this Ignite IQ tool to manually label our target result in each document. As you can see here, they have the label for the start date, and the actual text is shown here. The reason we can show that extracted text is that all the documents are raw scans.
C
The PDFs have already been OCRed, so we have OCR text with region detection. If I draw another bounding box in a different area, you will see that it returns the actual text data from OCR. In this way the SME sees what the content is and can quickly label the documents, just by drawing a bounding box around the information they need, and move on to the next one.
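The bounding-box lookup behind that annotation flow can be sketched as a simple containment filter over OCR output: given OCR words with coordinates, return the text inside the user-drawn box. The word and coordinate layout here is illustrative, not the actual Ignite schema:

```python
# Sketch of the annotation UI's box-to-text lookup over OCR results.
def words_in_box(ocr_words, box):
    """box and word coords are (x0, y0, x1, y1) rectangles."""
    bx0, by0, bx1, by1 = box
    hits = []
    for word in ocr_words:
        x0, y0, x1, y1 = word["bbox"]
        # Keep the word if its rectangle lies fully inside the drawn box.
        if x0 >= bx0 and y0 >= by0 and x1 <= bx1 and y1 <= by1:
            hits.append(word["text"])
    return " ".join(hits)

ocr_words = [
    {"text": "Start",      "bbox": (10, 10, 40, 20)},
    {"text": "date:",      "bbox": (45, 10, 75, 20)},
    {"text": "2020-06-03", "bbox": (80, 10, 150, 20)},
    {"text": "Other",      "bbox": (10, 40, 40, 50)},  # outside the drawn box
]
label = words_in_box(ocr_words, (5, 5, 160, 30))
```

Because the OCR pass already happened, the UI only has to intersect rectangles, which is why the labeler gets the text back instantly when they draw a box.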
B
Just to add a little business context here: what we traditionally see is that business users might want to get certain information out of an unstructured data set. For instance, if it's a contract, they might want to pull out the contracting terms, or the effective date or the completion date of the contract, and be able to make business decisions on that. Are they getting the right services for the contract?
B
Those questions have answers that are in the documents, and the annotation process allows us to annotate those answers, to specify where they are inside the particular document. That then allows the data scientist to figure out whether they want to use some type of machine learning model to extract the information, or some other technique, to consistently find the right information, so decisions can be made on the business side. That's one of the areas of it.
C
Okay, so once you finish the manual labeling for the training data set, we can execute the model training. On the backend, MLflow tracks the multiple experiments and multiple runs, and it also tracks the accuracy and its distribution for each run, from this history chart down to all the metadata.
C
Moving on to the testing: similar to the training data set, we also want to prepare labeled data for the test set. Again, the data scientist or SME can log into each document and draw the bounding box for the actual label. Once that is complete, we can generate the test data set accuracy for the trained model. Once the model is fully tested, we can move on to the last step, which is the holdout data set.
C
Once the SME prepares a holdout data set, it can be used to test the model: the model processes each document and generates the predicted result with a confidence score. At this point we sample the results to see whether the model output looks correct to us. For the sampled results there is a manual determination step, to go through each model result and either accept the result or correct it. For example, for this document, the result looks good to me, so I can accept the result.
C
Okay, moving on to the next one: here the detection is off, so we can reject it and manually draw the right start date. Let me pick this one up; the start date is here, so let's take this result. What happened is, we just corrected this document's result, which was generated by the pre-trained model, with the correct information, and for the next training, the next model update, we can use this corrected information as a label to retrain our model.
B
So when the data scientist gets the output of the corrections that the business is making, and sees the errors they had in their prediction, they get to feed that information back into the loop to then retrain their models, or update their rules, or whatever they're doing to try to extract that information. So again, one of the things we had under data literacy is: how do you have an understanding of data through its lifecycle? That's one of the concepts here.
C
At this stage we are pretty much done with the whole machine learning lifecycle, and we can go back to our admin tools to update our model status to complete, or keep it as-is if the model still needs more holdout testing via the manual verification step. And this whole tool is deployed on top of the Ignite platform.
B
Can you bring up the PowerPoint one quick time before you go back into OpenShift? I just want to show where you were in that lifecycle, right there. If you think about what we were doing: there was the OCR activity, which was done already, and that OCR then feeds into the business input functions, where you can start doing the annotations and start marking up your documents.
B
That then goes back to the data scientist, and once they build and train the model, there might be a smart sectioning model, for instance: they might want to break the document down into sections so they can make predictions more granularly on the data sets being highlighted. They could add some spaCy there to enhance the data, to better find the information. But these steps come along the path after you OCR, after the business comes in and does the annotations and the markups.
B
The next part, then, is to start breaking that problem down, to be able to get out the information that you ultimately want to use to make your business decision. So again, we've run one part of Ignite here to OCR it, so you can see the document on the screen; like you did, the business will do the annotation, and you start creating different models, whether it's a model for smart sectioning or you want to reuse this.
B
You enhance your intelligent domain engine, you add spaCy to get better enrichment across it, and then finally, what Hongfei will now show is that these components execute in a workflow that can scale up or down. We know OCR is a heavier process: if you have, say, a thousand documents to OCR, you could have a thousand components running. As each of those components completes, its output goes back into Kafka, to the Kafka node, and then the next component in the workflow picks it up.
C
We also have a number of persistent, deployment-type components shown here, including the ontology component, a load balancer, a Java HBase API for reads and writes from HBase, the annotation tools, the component builder, etc. So after using the CI/CD pipeline to deploy the complete Ignite platform on OpenShift, we can let data scientists or engineers use this platform to launch or create an arbitrary Ignite workflow for their machine learning pipelines.
B
Just one quick thing: you're going to be using a Jupyter notebook to show the ability to create a workflow, specifying each component you want to execute, and then executing it. As we've seen with some other clients, and as we do internally, you can also call our RESTful service, sending in your workflow, and that also executes the pipeline.
C
Yeah, so this is the Python SDK we created, called Ignite Connect. As Kevin mentioned, you can also define your workflow in a JSON format and submit your machine learning pipeline directly to the Ignite API. But this time I'm going to show you how to use our Python SDK to create a workflow and execute it from a notebook. The first thing is, you need to import several Python libraries for Ignite, and then we will define our workflow here.
C
First, we will ingest some scanned PDF images from local disk, and then we come to the workflow definition. We have several components we want to execute in this workflow. The first thing is, because the PDFs are scanned PDFs, we need to convert from PDF to images; we are using a PDF tool for that, and we call it the scanner component.
C
You need to specify the full package name, the component name, and the Docker image with the tag information. And, based on the number of documents to process, we can scale horizontally: you can set the number of instances you want to execute for this component in OpenShift, and also how many documents you want to run in one batch. It is very flexible and highly customizable for your own processing.
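A component definition of the kind described, package name, Docker image and tag, instance count, and batch size, might look roughly like the following; the field names are illustrative, not the actual Ignite Connect API:

```python
# Hypothetical component spec plus a rough sizing rule for horizontal scale.
scanner_component = {
    "package": "ignite.preprocess",                   # full package name
    "name": "pdf-scanner",                            # component name
    "image": "registry.local/pdf-scanner:1.2.0",      # docker image with tag
    "instances": 8,                                   # max pods to launch
    "batch_size": 25,                                 # documents per batch
}

def pods_needed(total_docs, batch_size, max_instances):
    # One pod per batch of documents, capped at the configured scale.
    batches = -(-total_docs // batch_size)  # ceiling division
    return min(batches, max_instances)

n = pods_needed(1000, scanner_component["batch_size"],
                scanner_component["instances"])
```

The cap matters: a thousand documents would want 40 batches, but the spec above limits the component to 8 concurrent pods, so the remaining batches queue behind them.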
C
Okay, the next component the workflow picks up is the Tesseract OCR component. We also need to define some parameters that we want to pass through to the Tesseract command, and, similar to the first component, you need to define the package name, component name, Docker image with tag information, instances, and batch size.
C
We have two more components: spaCy, which runs the NLP processing, and the intelligent entity engine, to extract the fields from the financial contract documents. Okay, once you define each component using the SDK, the next step is to create the workflow. You actually define the workflow pipeline by defining a directed acyclic graph, similar to Spark. At this point it is like a lazy evaluation: you just create the edges between the components.
B
The idea is that the infrastructure is the foundational piece the platform sits on, but none of these other containers are deployed onto the platform until the execution of the workflow. Then there's a determination of how many are needed: how many OCR jobs do I need to run, how many containers do I run? They'll run in parallel, complete their job, and then they'll all shut down. So it's only using the capacity when it needs it, and then it shuts down and produces the output. Yes.
C
Okay, so you'll see we just finished the first stage: the first component completed, its pods are terminating, and the workflow moves on to the second component, which is Tesseract, right here, and it's launching right now. Also, if you don't want to use the OpenShift portal to check the status, you can use our SDK, which has a workflow status call that tracks this for you.
B
Great. So again, just to recap on what we saw: we went over a little bit of why we decided to invest in the Ignite platform, and apologies, I did get kicked off a little bit there, my connection dropped partway through. But we saw why we built Ignite, and the kinds of capabilities Ignite has: an open ecosystem, built on containers and microservices, that can plug into other offerings, whether other cloud offerings...
B
Other
data
science
offerings
on
Prem
and
and
helps
to
manage
the
fall
and
lifecycle
deployment
and
production
on
station,
as
well
as
model
management
of
a
model
concepts
built
on
top
of
a
loom
so
luma
in
the
amount?
It's
easy
for
everything
to
communicate
with
that
and
then
at
the
end
of
it
it
produces
some
type
of
output.
That's
usually
then
fed
to
a
downstream
application.
B
So
there
could
be
like
an
exception
process
where
any
of
the
predictions
that
need
to
go
be
reviewed,
go
into
an
exception
queue
once
I
could
feed
through
feed
through
to
in
the
business
system
to
help
make
those
business
decisions.
But
that
was
I
think
everything
that
we
wanted
to
show
and
again
apologies
for
the
a
little
often
on
there
with
connectivity.