Description
AMA on Open Data Hub with Landon LaSmith, Red Hat, 07/20/2020
A
We are bringing in one of the upstream projects that is one of the more interesting workloads these days (not that your workload is not important, but this is one of the more interesting ones): Open Data Hub, the AI platform, and its team, to come and tell you a little bit about this project at Red Hat. We have a number of members here: Juana, Vaclav, Chad, Landon, and Beverly. Landon LaSmith is going to walk us through a little overview of what Open Data Hub is first, and then we're going to open it up to Q&A and have an AMA session on this, as we like to do, along with a little bit of a demo of it.
A
So queue up your questions wherever you're watching this, whether it's Facebook, Twitch, or YouTube, or if you're in the BlueJeans session, and we'll aggregate those questions and answer them, hopefully after the demo and lecture part, and have a conversation about what Open Data Hub is and how to use it. So take it away, Landon. Hi.
B
So I'm just going to give a quick overview of Open Data Hub, and hopefully we can answer all of your questions. In this slide we're going to cover what Open Data Hub is, give a brief introduction to Kubeflow, which is the upstream project that we're in sync with, tell you where Open Data Hub is used, and give you a quick demo of how you can deploy Open Data Hub.
B
So what is Open Data Hub? The original goal of Open Data Hub is to build a platform for data science. We want to make it as easy as possible for a data scientist to stay within their workflow; we know that they have many tools that they use for model training, model development, and model serving.
B
We want a team of data scientists to be able to work on shared data using some type of storage, and to use a development environment that they're comfortable with, in this case Jupyter notebooks, but also to allow data engineers and DevOps to work within that workflow to create the best solution possible.
B
So this began what we are now calling the Open Data Hub. Open Data Hub is not an official Red Hat product; it is a community project. We set out to create a reference architecture to provide best practices on how you can deploy these different tools within this data science workflow.
B
So, with this operator, we can deploy the different tools that will be used in the workflow by a data engineer or data scientist, and make it easy for DevOps to deploy this project. If you want to deploy Open Data Hub, you can find it on any OpenShift cluster under the OperatorHub on that cluster; look for "opendatahub". It is a community operator that's available to install for free, no Red Hat subscription required.
B
So the Open Data Hub ecosystem combines a lot of different parts from which we gather input on the best use cases and best practices for Open Data Hub. We work with a lot of customers, internal and external, to lay out how we want Open Data Hub to proceed, and we take public requests: you can contribute to Open Data Hub.
B
We work with Red Hat partners to see if their tool helps further the Open Data Hub, and we work with a lot of upstream components that have downstream projects within Red Hat. Our goal is to use completely open products within the Open Data Hub, and also to provide a path where you could substitute in these downstream products if necessary. But everything is freely available.
B
These are a few of the components that are in Open Data Hub, in this nice graphic. We focused on Jupyter notebooks for the development environment, object storage provided by Ceph, Apache Spark for data engineering, Seldon for model serving, and Argo Workflows as the core pipeline technology that we've used in the past, plus Prometheus, Grafana, TensorFlow, and Kafka.
B
With the release of Open Data Hub 0.6 (we're currently on version 0.7), we became an official downstream of Kubeflow. The Kubeflow project brings together all of these data science tools into an ecosystem that works on Kubernetes, and we do the work to make sure that this workflow also works on OpenShift. We also bring in a lot of products that aren't covered by Kubeflow, and all of this is available in OperatorHub. All right.
B
This graphic shows our original release, so a little bit of backstory about Open Data Hub. Probably a year ago we had our official release of 0.5. This contained a few of the components: JupyterHub; a data catalog that contains Hue, Hive, and Thrift; GPU support; and Argo, all in the Ansible operator. With the switch to being a downstream of Kubeflow, we refactored and updated our operator so that it's purely based on Go.
B
It works with the KfDef manifest and it fully supports Kubeflow products. So, using this Open Data Hub operator, you can deploy Kubeflow on OpenShift, in addition to the Open Data Hub components.
B
In the current release, 0.7, you can see a few of the components we have released. We have full support for Kubeflow version 1.0: you can deploy that with our operator on OpenShift. There is KFServing support with our operator (I think this might be mixed in; we could use this with Open Data Hub), and full CI testing on all of our updates and releases.
B
As soon as we submit any updates to Open Data Hub, we run a full battery of CI tests to make sure that a new component doesn't break any existing functionality, but also provides working new functionality. You can mix and match ODH and Kubeflow components.
B
Right now we're verifying a small subset, but with the 0.8 release we plan to verify and test all of the default Kubeflow 1.0 components mixed in with ODH components and OpenShift Container Storage.
B
The current operator for Open Data Hub is a phase one, basic-install operator.
B
This means that it will deploy Open Data Hub and do some minor updates, but for the most part we're doing a full install. We have plans, as time goes on throughout the year, to bring this up to a phase five operator, but those are long-term plans. As of right now, you can deploy your Open Data Hub ecosystem.
B
Kubeflow, for those that may not be aware of it, is an open source project dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.
B
A lot of the work we did to bring Open Data Hub in line with Kubeflow was to make sure that there are no issues when deploying from Kubernetes to OpenShift. We had to introduce a lot of updates and fixes to make Kubeflow more secure.
B
We want to make sure that not every container is running with elevated privileges, that you don't have to elevate any container privileges beyond the standard runtime permissions, and then we verify and make sure that model training and serving work on OpenShift.
B
These are a few of the goals for Open Data Hub in working with Kubeflow. We want to incorporate best practices.
B
Okay, sorry, and a simplified install. We want to use the UBI, or Universal Base Image, as the base for all of the Open Data Hub components. This provides anybody deploying Open Data Hub with the level of security that comes with using that UBI base image.
B
So you get a lot of the Red Hat effort on providing a secure base image in Open Data Hub, and we also want to make sure that we secure the deployment of Open Data Hub and, by extension, Kubeflow; that's done using well-defined permissions.
B
This is a quick graph of some Open Data Hub components that we are bringing to the new releases, 0.7 and 0.8.
B
We're working on allowing you to deploy storage along with Open Data Hub, based on Ceph object storage.
B
We have components that are using Postgres, and as of right now you can deploy Kafka and you can deploy Spark clusters. We're working on updates to provide data exploration, though we do have Superset, which allows you to do data visualization.
B
So you can work directly with your external databases or data sources to visualize that data. We're working on adding data cataloging with Hue, so that you can navigate your object storage but also run Spark SQL queries on that data; we're hoping to get that into the next release. And I think we currently do support the ability to mix Open Data Hub and Kubeflow components for TF Serving.
B
We fully support OpenShift authentication for those notebooks, JupyterHub being a multi-user notebook server.
B
If you want to join Open Data Hub or follow it, as always, feel free to go to our website at opendatahub.io; we are fully functioning on GitHub under github.com/opendatahub-io.
B
If you want to track any issues or progress that we're making in the project, all of our Open Data Hub projects exist under that opendatahub-io organization. Again, we're a community project, so feel free to take a look, file issues if something doesn't work correctly, or submit PRs. If you see an issue or you want to add a new feature, definitely go there and submit a PR.
B
If you want to track progress, we have an announcements list you can subscribe to, and then a contributors list if you go the extra mile to submit PRs and want to become a contributor. We also have bi-weekly Open Data Hub community meetings, whose archives you can track on our GitLab site. I want to clear up some confusion here: our old operator exists on GitLab, but we needed to make sure that we can stay in sync with Kubeflow updates and become a fully functioning downstream of Kubeflow.
B
So we migrated to GitHub, but a lot of our old projects are still on GitLab, the Open Data Hub community being one of those, and it's still current for the Open Data Hub community. You can see old meetings and get notes from any meetings where we have guests present use cases that are utilizing Open Data Hub, or volunteering and opening the discussion to add new features to Open Data Hub.
B
These are some examples of where Open Data Hub is being used. Originally, Open Data Hub was an internal project that started with the basic ELK stack, if I remember correctly, and we worked with internal customers so that they could work with their data in an easy fashion. We provided storage and Elasticsearch to interact with that data, and from that we got a lot of customer use cases that helped to form the Open Data Hub.
B
One of the early adopters of Open Data Hub is the Massachusetts Open Cloud. It's a collaborative effort of a few universities to run their data science and high-resource workloads on an open, high-availability cloud, so Open Data Hub is part of the backbone for some of this work, where professors, researchers, and even some students can get access to run their workloads.
B
So I'll give a quick demo. I just want to demonstrate how you can get access to Open Data Hub and deploy it within your workspace. Let me switch over to my OpenShift console. Here I have a basic OpenShift cluster; potentially you could deploy this on any OpenShift cluster. Right now I'm using a three-worker-node cluster, which is pretty standard for any OpenShift install.
B
We do have support for deploying on something as small as a CRC (CodeReady Containers) cluster. You could also use OKD, which I think just went GA (general availability) for OpenShift 4 clusters. Right now the current iteration of Open Data Hub supports OpenShift 4.x; the current version is 4.5, which I think was released a week or two ago.
B
And again, it's available as a community operator, which means it's freely available for anybody to deploy on any OpenShift cluster. You'll get a rundown of the current components that we deploy as part of Open Data Hub, with additional info about where you can track the project, the operator image we're pulling from, and other information. This also describes the available channels that you'll see in the next step.
B
The update channel is beta; beta is what you want to use right now, as that is where we're hosting our new operator. Legacy is the older namespace-bound operator, the older Ansible operator that still works, but we are providing only minimal support for it, so a lot of the components that are deployed there will not be receiving updates, since we're doing all our updates on the beta channel. We'll leave the approval strategy as Automatic.
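The channel and approval settings described here map onto a standard OLM Subscription object. A minimal sketch, assuming the community catalog source and operator package names commonly used on OpenShift (check the OperatorHub entry for the exact current values):

```yaml
# Illustrative OLM Subscription for the Open Data Hub community operator.
# 'beta' channel and Automatic approval match the settings discussed above.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators
spec:
  channel: beta                    # 'legacy' is the older Ansible-based operator
  installPlanApproval: Automatic   # new versions install as soon as they are released
  name: opendatahub-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
```

With Automatic approval, OLM replaces the running operator whenever a newer version lands on the beta channel, which is exactly the behavior described above.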
B
This means that whenever we release newer versions of Open Data Hub, they'll be available and installed; the operator will update automatically. And hit Subscribe. Now we're just waiting for the operator to be installed by OLM. OLM is the Operator Lifecycle Manager, and we utilize OLM a lot.
B
In the older operator, one of the issues that we encountered was that we had to recreate the deployment strategy for every component we deployed. So if we deployed Prometheus, we had to create the deployment objects, the roles, the service accounts, every single item that was required to deploy a component.
B
Now, if there is a component that Open Data Hub uses that is available in OperatorHub, so whatever component has put forth the effort to be listed on OperatorHub, we can easily leverage that OperatorHub entry for Open Data Hub. We're not recreating the deployment strategy or plan for every component; we can literally say, for Seldon version 1.2, reach out to OLM and deploy that operator.
B
That's good because we aren't required to stay in sync with their update strategy: as Seldon updates their operator and pushes that to the OpenShift OperatorHub, we automatically get those updates for that version, and OLM will handle the deployment strategy. So now that the operator has deployed, we'll just click on Open Data Hub and you'll get another overview of the deployment.
B
Any time you create a KfDef resource, which is essentially the Kustomize manifest format for Kubeflow, on an OpenShift cluster, the Open Data Hub operator will see it and, based on the information that's in there, deploy it. So we'll go ahead and click Create Instance.
B
Each entry has the same basic format: a Kustomize config with a repo ref and a name, then another Kustomize config, repo ref, and name. This determines what is getting deployed as part of this KfDef.
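The "Kustomize config plus repo ref plus name" structure described here looks roughly like the sketch below. The component names, paths, and repo URI are illustrative, not copied from the demo; the real example KfDef is linked from the operator's OperatorHub page:

```yaml
# Sketch of a KfDef: each application entry pairs a kustomizeConfig
# (with a repoRef pointing into a manifests repo) with a name.
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
  namespace: odh
spec:
  applications:
    - kustomizeConfig:
        repoRef:
          name: manifests        # the odh-manifests repo (Open Data Hub proper)
          path: kafka/cluster    # 'cluster' portion: CRDs, cluster-wide options
      name: kafka-cluster
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: kafka/kafka      # application portion: the Kafka deployment itself
      name: kafka
  repos:
    - name: manifests
      uri: https://github.com/opendatahub-io/odh-manifests/tarball/master
```

Editing this resource (adding, removing, or reordering applications) and saving it is what triggers the operator to reconcile, as shown later in the demo.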
B
Here you'll see that we're deploying ai-library-cluster and the ai-library operator. One of the things that we set out to do whenever we add a component to Open Data Hub is to separate out the cluster-wide permissions and cluster-wide actions, mainly things like deploying to a cluster-wide namespace.
B
Checking that required CRDs exist lives in that cluster component, and anything specific to the deployment of the operator or application exists in the operator deployment. Sorry, so a lot of these components will have two portions, or two configs. Here you'll see kafka-cluster and kafka: anything that's not named "cluster", so kafka, actually has the deployment files necessary for the Kafka deployment.
B
"Cluster" is generally the CRDs and any required cluster-wide options. As you look through this KfDef, you'll see all the components that we're deploying: Kafka, Grafana, the radanalytics Spark operator, Prometheus, JupyterHub (JupyterHub will be the entry point to a lot of the use cases for Open Data Hub if you watch any demos or examples), Airflow, Argo, and so on and so forth.
B
With the latest release of Open Data Hub, this is one of the new features we wanted to focus on. You'll see in this repos section that we have kf-manifests and the regular manifests. kf-manifests is a downstream fork carrying the fixes and updates that are required to deploy Kubeflow on OpenShift.
B
If you go to github.com/kubeflow/manifests, that is the pure vanilla Kubeflow deployment that will work on Kubernetes, and they do have support for additional cloud providers: Azure, IBM, Google Cloud. But in this opendatahub-io fork of manifests are all the files, updates, and fixes that you need to deploy on OpenShift, and the plain manifests, which is the odh-manifests repo, is the Open Data Hub proper.
B
If you see anything that references kf-manifests as a repo name, it is based on the upstream deployment of Kubeflow, to which we have added a few fixes to make sure that it deploys successfully on OpenShift. Right now I think everything just references manifests, but as the next versions are released, you'll start to see more and more mixing of Kubeflow and Open Data Hub components, so potentially you'll see the TFJob operator, the PyTorch operator, maybe even some pipelines work.
B
And now we just wait for everything to deploy. Slowly you'll see the different components come online based on that KfDef: the AI Library operators, the Seldon controller, Superset, and so on and so forth, and once these pods come online, they've deployed successfully.
A
Certainly, we always have questions. Here's one of them; maybe you can explain a little bit while you're doing this. One of the quick questions that's often asked is: is Open Data Hub available for generic Kubernetes? Which flows into the question about whether Open Data Hub is available on operatorhub.io.
B
Yeah, so there's a lot of confusion between the OperatorHub that you see in OpenShift and operatorhub.io. The operatorhub.io website is for operators that are certified to work on vanilla Kubernetes, so not OpenShift but the upstream Kubernetes server.
B
We are certifying that we work on OpenShift, which means that we are only available in the OpenShift OperatorHub that is deployed with all OpenShift clusters. So just because you don't see us on operatorhub.io does not mean that Open Data Hub isn't available on OperatorHub; it just means that we're certifying that we work on OpenShift, so any OpenShift deployment, whether it's OKD, CodeReady Containers, or OpenShift on AWS or OpenStack.
A
Yeah, and you did mention, and I'll mention this while we watch your screen scroll here, that OKD is now available. OKD is the open source distribution of OpenShift, and it went into general availability on July 15th. It's running on Fedora CoreOS, and you should be able to deploy from the OperatorHub.
A
The Open Data Hub operator should install easily from the OperatorHub on OKD, and I don't know if anybody's tested that yet, but if you have, let me know. I'm one of the chairs of the OKD working group; we'd love to get your feedback on that and help you through it if there are any issues whatsoever. I don't think anyone on the ODH team has done that yet; it's probably too soon, since that release was just last week, so we'd definitely have to get that tested.
B
Yeah, and just to build on top of what Diane just said: if you deploy it on OKD, or on an OpenShift cluster on any infrastructure provider, and you're experiencing issues, please, please submit the issue to any of our projects.
B
If you are deploying pure Kubeflow on OpenShift, then feel free to file that against the opendatahub-io organization's manifests repo, and if you file it against the wrong one, that's fine; we will definitely make sure it goes to where it needs to be. Definitely.
A
We'll straighten you out and point it in the right direction. And really, if you're listening to this and you are running this reference architecture, or want to, please do reach out. I'm seeing it pop up in lots of conversations across the ecosystem, from health care to COVID-tracking work to all kinds of interesting things, so it's definitely starting to get a lot of overflow into other spaces, market spaces, and use cases.
B
Everything's deployed; we're missing one key thing for JupyterHub, but I'll investigate that and we'll go from there, so we can open it up to other components. So, are there questions? Let's see. One of the things I'll say: whenever you deploy Open Data Hub, we make sure that everything's ready in a state where you can use it automatically.
B
If any of the components need to be accessible, so they're not just backing components where component A is just utilizing a service from component B, if it's something that the user needs to interact with, we make sure that there's an OpenShift route created to it, so that once Open Data Hub is deployed, you can just go to Networking, then Routes, and access that component. Here you'll see Superset.
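The route creation described here is ordinary OpenShift routing: each user-facing component gets a Route pointing at its in-cluster Service. A sketch of what such a Route looks like, with illustrative names (the operator creates the real ones for you):

```yaml
# Sketch of the kind of Route the deployment creates so user-facing
# components (e.g. Superset) are reachable from outside the cluster.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: superset
  namespace: odh          # illustrative project name
spec:
  to:
    kind: Service
    name: superset        # in-cluster Service fronting the Superset pods
  tls:
    termination: edge     # TLS terminated at the OpenShift router
```

Backing-only components (component A consuming component B's service) stay on plain Services with no Route, which matches the distinction drawn above.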
A
Okay, maybe while you're doing this, we can answer a few more questions, and I'll unmute some of the other folks from your team. You can debug it and just raise your hand when you figure it out, or not, and we can go from there. So let's see who else we have: Juana is here.
A
I'm sorry, and Vaclav is here. So, while he's debugging, a couple of other questions. I think you answered the one about where it stands in terms of vanilla Kubernetes versus OpenShift, and I think we do have a pretty strong, fully open source stack with the complement of OKD now.
A
So anybody who wants to do a full stack without licensing OCP could, if they would, and I'll see if I can get the OKD working group to find someone to test it out for us. But some questions came in, and Beverly is probably going to guide us through some of them, maybe starting with the first question, if you want to go through that.
D
Absolutely. So, Juana: are all components from Kubeflow available or included in Open Data Hub?
C
Actually, no, not all of them are. For example, I could say KFServing today doesn't work with the Kubeflow 1.0 that we have, and if you look at the example manifest that is linked through our operator's main page description, you'll see that some components are commented out. These are the components that we are actually still working on, to get them working on OpenShift.
C
I wanted to add something, but I forgot what it was. Anyway, that's mainly what it is: our community meeting is always busy with many different developers from different companies, and we do work really closely with many of the component owners, such as Seldon and Kubeflow.
A
Yeah, so I think, as this community expands, the end users become really important, because they're giving feedback on how it's being used, and the integration partners like Seldon and others become important too. So it will be interesting to see how the ecosystem grows around this, because you have incorporated a whole lot of partner and integration points there. That's going to be fun to watch as we go through.
C
From a use case perspective, from an industry perspective, we do have one use case that's already out there, which is the fraud detection use case. We have all the code and all the instructions on GitLab for it. Then we're working on a couple more: we also have AI on the edge, which Landon's working on, and then we're working on other industries; I think we have one in the banking industry and a couple more coming down the line.
D
I mean, that answers the question, but we could also look at it in terms of: do we maybe have clients that are already using Open Data Hub in their infrastructure?
C
Yeah, so we do have a couple of clients. We have ExxonMobil using it, and they did many presentations with regard to using Open Data Hub. We also have an internal implementation of Open Data Hub that is being used by internal data scientists and data engineers at Red Hat, and then we also have the MOC that Landon described, and I'm sure Vaclav can add a couple more about this and where it is today.
E
So with the MOC, we are working on support for Open Data Hub on POWER9 machines and clusters of OpenShift, and we have Open Data Hub deployed in the MOC, where it is being used by students for their research work. We had a couple of early adopter projects.
E
It's been kind of hard to keep it running there, so we are working with them; we have weekly syncs to see where they are, and when they have OpenShift 4 ready for us, we will come back to having Open Data Hub fully running there. Part of our roadmap, which you can find on opendatahub.io as well, is for the next release to have a plan for how we could do continuous deployment.
E
Landon mentioned we have an internal deployment of Open Data Hub running internally at Red Hat, and then we have that partially public deployment on the MOC, where the researchers that are part of the Massachusetts Open Cloud can use it. Our goal for the next release will be to come up with a plan for a reproducible, continuous deployment solution, or process, rather.
E
A process where our new releases of Open Data Hub would go to our internal instance and to the MOC-deployed instance, and it would hopefully also be reproducible for our users, where they can use that process to bind their own deployments to our releases, and so on.
E
Yeah, definitely, it is a big ask, and not only for Open Data Hub but also for Kubeflow, the upstream project that we pull components from. There are plenty of people that are running disconnected, be that with edge deployments or in remote locations where they maybe only have mobile connections or something like that, and they need to be able to make sure that they can control the traffic that is coming in and out of their clusters.
E
So we want to make sure that that is possible with Open Data Hub, and that when deploying it, everything goes smoothly: they can pre-pull the images and they can deploy to a disconnected environment whenever they are ready. We'll be looking at that probably this fall.
E
We've been looking at that for some time in our previous versions, which were based on the Ansible operator, but it was kind of hard, because that required a lot of parameterization with Ansible; it was all the repos, all the registries, all the images, and it was kind of a mess. So we hope that, with this Kubeflow-based solution, it will be a bit easier. Kubeflow was also working on that in the past; I'm not 100% sure if they were able to finish it.
E
But we will definitely look at the Kubeflow solution for that, if there is any, and if not, maybe we can help finish it or bring it back to the community and see what they have in mind for that.
A
Thanks. I know, because we had someone come to the OKD working group who wanted to run an ML use case on ARM64 using OKD in a disconnected fashion.
A
So I think maybe Open Data Hub is a bit of overkill for what they were trying to do, but I think it gives them a good roadmap and maybe a good collaboration point to work through, so I'll see if I can feed you that use case as well.
E
It's an interesting point whether Open Data Hub is overkill: you don't have to use all the components, right? If your only reason to run Open Data Hub is to deploy Seldon and something else, then maybe it's still good to use Open Data Hub, because we have verified that the components run well on OpenShift, and there are some integrations, with more coming. If it's just one component, and it's already on OperatorHub and we just depend on it, maybe it doesn't make sense to use Open Data Hub.
E
But if it's, like, three things that you would be running, it gives you kind of a single point where you just apply that one custom resource and it all comes up, all integrated and configured.
B
Yes. So, just so it doesn't look like magic or anything, here's what we did. I was playing around with the KfDef, and the operator was throwing an issue with the Grafana deployment. It was a timing issue: we're relying on OLM to deploy Grafana based on the grafana-cluster configuration, since we have it separated into grafana-cluster and the grafana application.
B
We needed a little wait time of a few seconds in between the two, because one of the dependencies that the Grafana deployment required wasn't present yet; it would have been deployed by grafana-cluster. So it was a small race condition.
B
So let me just go over what we did.
B
I moved the Grafana component to the bottom. This is pretty simple: I just cut this text from higher up in the KfDef YAML and moved it to the bottom. What happens is, once we save that, it triggers an update, which the operator will detect, and then it will reprocess that KfDef.
B
Because of the previous attempt to deploy Grafana, all the dependencies were already installed, and now we can deploy Grafana successfully. That's all I did, and what that did was unblock the dam: once that was resolved, all the components below Grafana deployed successfully. So you'll see we have a lot more deployments now; Argo is available, and Grafana is online.
B
Luckily, I didn't expose my password to the world. Again, we're using OpenShift OAuth for a lot of these components by default. And here is one of the customizations we've added to JupyterHub, where you can select your notebook from a list of notebooks. We have a minimal notebook, which is just bare bones.
B
I think it just has Python installed. Then there's a SciPy notebook for the SciPy library; a Spark notebook that has Spark 2.4.5 and Hadoop 2.7.3; the Spark SciPy notebook; and a TensorFlow notebook. Any user that deploys this has access to these, so if they have basic, minimal read access to the namespace, they can deploy their own notebook.
B
If you wanted to change the small/medium/large configurations to be, say, 10 CPUs and 256 gigabytes of memory, or even larger than that, or a profile with a small CPU count but large memory, you can do that. If the user wants to add any environment variables, they can, and then they just spawn the notebook. So this is under the full control of the user.
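Under the hood, those small/medium/large choices are just container resource requests and limits applied to the spawned notebook pod. A hypothetical sketch of what such a sizing configuration could look like; this shows the idea only, and the field names are not the exact schema the ODH JupyterHub customization uses:

```yaml
# Hypothetical notebook size profiles: each size maps to resource
# requests/limits on the spawned notebook container.
sizes:
  - name: small
    resources:
      requests: {cpu: "1", memory: 2Gi}
      limits:   {cpu: "2", memory: 4Gi}
  - name: large-memory          # small CPU count but large memory, as in the example
    resources:
      requests: {cpu: "2", memory: 96Gi}
      limits:   {cpu: "4", memory: 256Gi}
env:                            # user-supplied environment variables
  - name: S3_ENDPOINT           # illustrative variable name
    value: "http://ceph-rgw.odh.svc:8080"
```

The spawner then merges the chosen size and any user-provided environment variables into the pod spec it submits.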
B
At
this
point
they
don't
have
access
to
the
project
space
where
jupiter
hub
is
running,
but
they
have
full
access
to
their
notebook
pod.
B
D
Yeah, so we've got a question: since Open Data Hub is a platform, or a blueprint for building an AI-as-a-service platform, can you talk about whether it works with GPUs?
B
Yes, it does. We have full support for GPUs. Open Data Hub itself does not do the GPU enablement, but in the notebook spawner we just used, a user has the option of requesting enabled GPUs.
B
Okay, so since it's in Red Hat Operators, you will need a fully subscribed cluster, if I'm not mistaken, but access to the NVIDIA operator is "free" (I'm using air quotes). If your cluster has access to Red Hat Operators, then you have access to the NVIDIA operator, and the NVIDIA operator is responsible for doing the GPU enablement. You provide the GPU node and install the NVIDIA operator, and it will handle the rest; usually it requires one other dependency to be installed, the Node Feature Discovery operator.
B
The Node Feature Discovery operator is a dependency for NVIDIA; it will essentially catalog every node in the cluster and give you labels for the different hardware features that are available. Once it sees the right label, the NVIDIA operator will go out to that node and install the appropriate drivers for the GPU that's installed. Once that's done, the GPUs become requestable resources.
B
So at this point, Open Data Hub can request X number of GPUs, and from there you can spawn any notebook that will request the GPUs, and then you can use them in your model development.
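The "request X number of GPUs" step boils down to a standard Kubernetes resource request once the NVIDIA operator has exposed the `nvidia.com/gpu` resource. A minimal sketch (the pod and image names here are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: notebook-gpu                        # hypothetical
spec:
  containers:
    - name: notebook
      image: example.com/notebook-image     # hypothetical
      resources:
        limits:
          nvidia.com/gpu: 1                 # number of GPUs this pod needs
```

The scheduler only places the pod on a node with an unassigned GPU, which is the behavior described in the answers below.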
D
Yeah, that was great, Landon. And can you talk about what the AI Library is?
B
That's
a
good
question.
Maybe
chad
has
an
answer
for
that.
I
know
he's
done
some
work
with
the
ai
library.
B
So again, if you have any questions about any of the components that we provide in Open Data Hub, or you want to know more about them, feel free to go to opendatahub.io. We're always improving the docs and increasing the amount of documentation that's available for the different components. So as we add a new component, or update a component and add features, it will be available on the opendatahub.io website.
B
But I think the AI Library is a collection of models that you can use in your workflow, and we're using Seldon. Seldon is a dependency for the AI Library: any of these models will be deployed with it, and the API will be available for you to submit data to.
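For context on what "submit data to the API" looks like: Seldon's default JSON protocol wraps inputs in a `data.ndarray` field. A minimal sketch that only builds the request body (the endpoint URL is hypothetical and depends on your deployment name and cluster route):

```python
import json

# Hypothetical route; the real URL depends on the Seldon deployment
# name and your cluster's ingress configuration.
ENDPOINT = "http://<seldon-route>/api/v1.0/predictions"

def build_payload(rows):
    """Wrap a batch of input rows in Seldon's default JSON protocol."""
    return json.dumps({"data": {"ndarray": rows}})

payload = build_payload([[1.0, 2.0, 3.0]])
# This string would be POSTed to ENDPOINT with
# Content-Type: application/json to get a prediction back.
```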
A
Then, back to the GPU topic, we just got a question coming in from Oleg on YouTube: how do you automatically run calculations on a free GPU? Is there any spawner for that, for GPUs with some RAM available?
E
Sure.
How does it generally work in OpenShift with GPUs? Basically, when you spawn a container, it requests some resources, which can be CPU, memory, or special resources like GPUs. The container will run on a node that, based on that configuration, can give it those resources. So for memory, if you ask for 100 gigabytes of RAM and there is a node that can accommodate that container, it will get it. The same goes for GPUs:
E
If
there
is
a
node
that
has
a
free,
gpu,
unassigned
gpu,
it
will
get
that
gpu
right
now.
There
is
no
good
solution.
As
far
as
I
know
for
like
splitting
gpus
or
something
so
we
are
talking
about
using
a
gpu.
It
will
be
one
gpu
per
container
or
multiple
gpus
per
container,
but
cannot
be
multiple
containers
per
gpu.
There
are
hex
around
it,
but
it
doesn't
really
really
really
work
yet
so
for
this
we
cannot
automatically
run.
E
So,
if
you
just
say
in
your
code
running
in
a
container
hey
if
it's
there,
if
there
is
a
gpu
I'd
like
to
run
this
on
gpu,
that's
not
how
it
works.
The
container
the
pod
itself
has
to
specify
that
it
requires
the
gpu
to
run,
and
if
there
is
a
free
gpu,
it
will
be
assigned
to
that
node
and
it
will
be
run
on
that
node
and
it
will
get
the
gpu
kind
of
mounted
into
the
devices
of
the
container
and
the
code
inside
a
container
can
use
that
problem
with
that.
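Inside a container that did get a GPU, the device shows up under `/dev`, so code can branch on that at runtime. A hedged sketch (the device path assumes the NVIDIA driver's usual naming; frameworks such as PyTorch offer their own equivalent checks):

```python
import os

def pick_device() -> str:
    """Return "cuda" if an NVIDIA device is mounted into this container,
    otherwise fall back to "cpu".

    This is only meaningful after scheduling: as described above, the
    pod must have requested nvidia.com/gpu to get the device at all.
    """
    return "cuda" if os.path.exists("/dev/nvidia0") else "cpu"

device = pick_device()
```

This handles the in-container fallback only; it does not solve the cluster-level "run on a GPU only if one is free" scheduling question discussed next.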
E
As long as that container runs, the GPU is allocated to it and cannot be allocated to anything else. So there is no smart way right now to say, "okay, if there is a free GPU, run on the GPU; if not, do not run on the GPU." We don't have that, and I don't think anyone really does.
E
Potentially, you could write an operator that would take your code and inject some information about whether there is a free GPU or not, and then based on that the code would change its execution path, and based on that information from the cluster it would either get GPUs or not. But I haven't seen such a solution yet. In general, there is no automated way to decide this, in Open Data Hub or in OpenShift in general.
A
Cool, thanks for that. I'm looking at the time and we're almost at the end of the hour, so maybe, Landon, if you want to share that slide where people can find additional resources again while we blather on a little bit more, that would be a great way to end the hour. And I'm just wondering, because you have lots of different partners and different integrations into this: have you done anything, just because they are now part of our family, with the IBM Watson stuff?
A
Has anyone integrated any of that and used it from a Jupyter notebook, or is that something for a future briefing?
E
We actually have an issue created around that. IBM provides a CUDA-enabled container image, and we basically cannot redistribute those as Open Data Hub: as Red Hat, we cannot redistribute or put CUDA binaries in our images, we always have to build them on the spot.
E
So
if,
if
you
deploy
open
data-
and
you
want
to
use
gpus-
and
you
want
to
have
good
enabled
images-
you
have
to
build
them
in
your
cluster,
which
is
fine,
we
provide
all
those
build
configs
and
everything
it
just
takes
time
and
some
resources
for
the
build,
whereas
ibm
provides
these
images
in
actually
red
hat
registry.
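As a sketch of the in-cluster build approach (the names and repository URL below are placeholders, and the real Open Data Hub build configs differ in detail), an OpenShift BuildConfig that produces a CUDA-enabled notebook image looks roughly like:

```yaml
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: cuda-notebook                              # hypothetical
spec:
  source:
    git:
      uri: https://example.com/cuda-notebook.git   # placeholder repo
  strategy:
    dockerStrategy: {}      # build from the repo's Dockerfile on-cluster
  output:
    to:
      kind: ImageStreamTag
      name: cuda-notebook:latest
```

Building on-cluster like this is what keeps the CUDA bits out of redistributed images, at the cost of build time and resources.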
E
So
we
have
an
issue
for
looking
at
whether
we
can
use
that
image
as
a
base
for
some
of
our
jupiter
notebook
images
so
that
we
don't
have
to
rebuild
build
them,
but
we
can
actually
leverage
what
they
already
provide
on.
The
other
front.
Ibm
is
very
active
in
keep
flow
communities,
so
we
are
talking
to
them.
Often
in
our
community
calls
and
keep
flow
community
calls
and
coordinating.
E
Basically, the Kubeflow Operator, which is the base for the Open Data Hub operator, has been built by the IBM team with our guidance, and with our contributions of ideas, documentation, and some code, but they did the majority of the work, the open source team at IBM. So a really, really good collaboration there.
A
Awesome. Well, we're going to have to get them on again soon and see if we can't make that all work, and explain how that all works too. I really want to thank Beverly for stepping up and organizing this today and making it happen, and the whole team from Open Data Hub for coming, answering questions, and sharing your wonderful project. And congratulations: it's really come a long way since the last time I did an open upstream conversation on it.
A
It's really amazing to see all this. And I know I've been talking with folks like Guillaume Moutier about some of the work that he's doing up in Canada, with the COVID project that the Ontario folks are doing, and hopefully we can get him back on again to talk about that.
A
And we will upload this, and link it and the slides on the YouTube channel, RH OpenShift, and I'm sure the Open Data Hub folks will steal that video and put it out on their feeds as well. So look for that shortly.
Thank you all for taking the time today. Take care and be well.