Description
Laval University Case Study
Guest Speaker: Guillaume Moutier (Laval University)
Data Science Services (Valeria)
A
Well, that was a really interesting presentation from Shepherd, because in fact it covers mostly what I was about to say. You will see that the tools are a bit different, but the approach is very much at the same level.
Laval University is in Quebec City, Canada. We are ranked number six in Canada for research, which means we have data scientists all over the campus (here is a picture of the campus). That makes a population of about three thousand people who would qualify as data scientists, with our researchers, graduate students and so on, and those are the people we are trying to help with a data science platform that very closely resembles what has just been presented.
So, a few words about our data science services project. We call it Valeria; that is kind of a brand name. The idea is to provide professional services, meaning consulting, support and training, around all those data science and machine learning topics. Of course, we base those services on the technology layer that we have: our own data centres on campus, plus Compute Canada and Calcul Québec, which are resources freely available to researchers throughout Canada, and of course other cloud services that we can also leverage.
So the idea is, exactly as was just presented, to have a central data lake with some storage and compute capacity, on top of which we put the ingestion and management tools. Those tools are mostly for transferring and ingesting the data, because in research we rely more and more on sensors and on data scraping from everywhere, and the resources we currently have, whether at Compute Canada or on campus, do not really address this new side of data ingestion. So we have to provide it in our new platform. Of course, there are also tools for processing, visualization and analysis, and for storage, access and catalogues.
A
But
those
are
mostly
technological
solutions
and
we
want
to
abstract
this
layer
of
technology
to
our
researchers
so
that
they
are
able
to
use
it
in
a
much
easier
way
and
that's
where
we
will
provide
you.
The
service
access
to
data
scientists,
slash
data
engineers
who
will
help
our
researchers
to
take
the
most
of
everything
that
we
are
putting
up
in
trigger
the
project.
We
are
also
working
at
the
data
management
framework.
A
We
were
talking
earlier
about
access
and
security,
and
things
like
that.
That
will
be
part
of
this
management
and,
of
course,
of
course,
support
and
training.
So,
if
I
look
at
the
architecture,
we
are
quite
lucky
here
on
the
campuses.
We
have
four
different
data,
centers
ways,
of
course,
optical
fibers,
covering
the
campus
with
connections
from
up
to
100
gigabits.
That's
also,
we
have
our
own
wherever
on
network
I
burn
network
all
throughout
the
see,
that
is
to
connect
our
research
centers,
that
we
have
in
different
hospitals,
and
things
like
that.
We made some technological choices for this project. The first was to renew the network, with 100-gigabit links between the data centres and 40 gigabits where that is not necessary. We also chose Ceph object storage, especially because it lets us use several data centres and spread our data across different physical sites.
Those sites are completely isolated geographically, but in fact it is a single storage cluster, spread by the network across the data centres. We come from the HPC world, where things are mostly based on Lustre as the file system and the like, but here it is a complete change of architecture. We go with Ceph, which will make it much easier for us to interact with the data and move it around. Of course it poses other challenges, which I will talk about a little bit later.
It will be a completely shared service, which means we do not yet know which parts will grow, whether it is the storage or the compute. So from the start we had to decouple everything, and once that was done we chose to use Ceph for the storage. In fact there is no layer of Hadoop: we can use Spark, TensorFlow and all these new things directly, and we avoid the standard Hadoop stack.
If we look at the technology landscape, we have our data scientists at the top left, who will keep using their standard data analysis tools, JMP, RStudio, SPSS or something like that, but we want to offer them the applications, tools and services that you see at the top of the diagram for data ingestion and data discovery, which is very important for us.
It is not just about having data in a central data lake; it is about finding it. So we will add that kind of tool, such as Dataverse or CKAN catalogues, or iRODS for the data management side. Of course, we have Jupyter notebooks on demand, and we still have the HPC services, which are much more traditional. On the bottom part you can see the different applications and tools that we will use, and in fact we do not pretend to invent anything in this project.
Those are standard applications that you can find out there; our project is much more about integrating all of them into a seamless experience for our users. And here, if it works, I can give a small demo of the kind of experience we want to offer. So this is the portal, and please keep in mind that it is only a small proof of concept at this stage.
Here, once I am logged in through Keycloak, I have direct access to the portal, where I can find Jupyter, GitLab and, for storing the data, CKAN, of course, plus other, more standard services such as databases, S3 storage and so on. So here, as the researcher, I am able to launch JupyterHub, where I have a selection of different types of notebooks that I can choose from.
For example, let's say I want to do some SciPy work; I click, and everything is instantiated. Everything is based on Kubernetes on OpenShift, so as I launch my notebook you can see a new pod being created directly under my name because, with the integrated authentication, I can gather all the data that is necessary:
the username, of course, but maybe also some other restrictions and things like that. I cannot show it to you right now, but we have a new version of this where, depending on my credentials, I would be able to select how many CPUs I can run with, how much memory, and whether I am allowed to use GPUs, things like that. And here I have my Jupyter notebook, which right now is backed by S3 storage.
As we are running inside OpenShift, we had to decouple the storage, and we did not want to use persistent volumes in OpenShift for this kind of thing; it is better to have everything on the Ceph storage, which makes it much easier later to interact with other applications and tools. So there you have it, but those are just standard notebooks, which is not really that interesting on its own.
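As an illustration of what S3-backed notebook storage looks like in practice, here is a minimal sketch using boto3 against a Ceph RADOS Gateway endpoint; the endpoint, bucket and credential names are placeholders, not the actual Valeria configuration.

```python
# Minimal sketch: persisting notebook data to a Ceph S3 (RADOS Gateway)
# endpoint instead of an OpenShift persistent volume. Endpoint, bucket and
# credential variables are illustrative placeholders.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://ceph-rgw.example.ulaval.ca",   # hypothetical Ceph RGW endpoint
    aws_access_key_id=os.environ["S3_ACCESS_KEY"],
    aws_secret_access_key=os.environ["S3_SECRET_KEY"],
)

# Persist a result produced in the notebook...
s3.upload_file("results.csv", "my-project-bucket", "experiments/results.csv")

# ...and read it back from any other tool that speaks S3.
s3.download_file("my-project-bucket", "experiments/results.csv", "results_copy.csv")
```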
We are talking about petabytes of data and very heavy compute; we have some genomics computations that take a few weeks to run. So we wanted to make sure we would get sufficient performance with Ceph as the Spark backend, and the results of that joint benchmarking work are currently being published. The short answer is yes, it works. Of course there is some drawback in terms of performance, but given the versatility we gain by using object storage, and the cost efficiency, it is much, much better for us to run directly against Ceph than to stand up our own Hadoop.
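For context, this is roughly what pointing Spark at a Ceph object store through the S3A connector looks like; the endpoint, credentials and dataset paths below are illustrative assumptions, not the benchmark setup itself.

```python
# Minimal sketch of Spark reading directly from object storage via S3A,
# with no HDFS layer involved. All values are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("ceph-s3a-sketch")
    .config("spark.hadoop.fs.s3a.endpoint", "https://ceph-rgw.example.ulaval.ca")
    .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")
    .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")  # typically needed for Ceph RGW
    .getOrCreate()
)

# Read a dataset straight from the object store and run a simple aggregation.
df = spark.read.parquet("s3a://genomics-data/variants/")
df.groupBy("chromosome").count().show()
```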
As I said, the network has been redesigned, and, as I just showed you, we have a working proof of concept with JupyterHub on OpenShift, with self-provisioning of the different flavours of notebooks. I should add that we also have OpenShift working with radanalytics for Spark, plus another extension with Dask for scheduling other compute processes.
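A minimal sketch of the Dask side of this, assuming the platform exposes a scheduler service that notebook users can connect to; the service address is hypothetical.

```python
# Minimal sketch: a notebook hands work to a Dask scheduler, which fans it
# out to worker pods. The scheduler address is a placeholder for whatever
# in-cluster service the platform actually exposes.
from dask.distributed import Client
import dask.array as da

client = Client("tcp://dask-scheduler.valeria.svc:8786")  # hypothetical service name

# A computation too large for the notebook pod itself: a ~16 GB random array.
x = da.random.random((200_000, 10_000), chunks=(10_000, 10_000))
result = x.mean(axis=0).compute()   # executed on the Dask workers
print(result[:5])
```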
Finally, we have complete single sign-on across all the solutions, which really makes things very much easier for everyone. As for what is next: we are working on the CKAN integration with Jupyter, that is, the ability, directly from Jupyter, to interact with CKAN and preload all your datasets. The datasets you have bookmarked, or your favourite datasets, will be imported directly into your Jupyter notebooks.
We are also working on some other CKAN extensions. You will be able to say, for example: here is a dataset that I find interesting and I want to run some pre-analysis on it. It should be very easy, with only one button, to spawn a notebook, have it connected to the right dataset, import the dataset, and then interact with it directly without doing almost anything, along the lines of the sketch below.
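A rough sketch of the kind of CKAN-to-notebook preloading described here, using CKAN's standard action API; the CKAN URL and dataset name are invented for illustration.

```python
# Minimal sketch: look up a dataset through CKAN's action API and pull its
# first resource into the notebook as a DataFrame. URL and dataset id are
# hypothetical examples.
import requests
import pandas as pd

CKAN_URL = "https://ckan.valeria.example.ca"   # hypothetical CKAN instance
DATASET = "river-temperature-2019"             # hypothetical dataset id

meta = requests.get(
    f"{CKAN_URL}/api/3/action/package_show",
    params={"id": DATASET},
).json()["result"]

resource_url = meta["resources"][0]["url"]     # first file attached to the dataset
df = pd.read_csv(resource_url)
print(df.head())
```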
We still have to work on the Globus integration.
Globus is the kind of system used to exchange data between different data centres. For example, Compute Canada has big facilities in the West, near Vancouver for instance, or in Montréal, so whenever we want to leverage that HPC power we have to move the data from point to point, which we can do with Globus at very high speed.
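For reference, a transfer of this kind can be expressed with the globus-sdk roughly as follows; the endpoint UUIDs, paths and token handling are placeholders, and a real deployment would use a proper Globus Auth flow rather than a raw token.

```python
# Minimal sketch: request a Globus transfer between two endpoints, for
# example from a campus endpoint to a Compute Canada site. All identifiers
# and paths are placeholders.
import globus_sdk

transfer_token = "..."  # would come from a Globus Auth flow in practice
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
)

tdata = globus_sdk.TransferData(
    tc,
    source_endpoint="SOURCE-ENDPOINT-UUID",
    destination_endpoint="DEST-ENDPOINT-UUID",
    label="valeria-to-hpc",
    sync_level="checksum",
)
tdata.add_item("/project/raw/genomes/", "/scratch/genomes/", recursive=True)

task = tc.submit_transfer(tdata)
print("transfer task id:", task["task_id"])
```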
We also want to integrate Vault to store all the client secrets, because as we use S3 storage, those are usually not public resources.
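A minimal sketch of that Vault integration using the hvac client, assuming the S3 credentials live at a KV version 2 path; the address, token and secret path are placeholders.

```python
# Minimal sketch: fetch S3 client credentials from HashiCorp Vault so that
# secrets never live in the notebook image. All values are placeholders.
import hvac

vault = hvac.Client(url="https://vault.valeria.example.ca", token="s.xxxxxxxx")

secret = vault.secrets.kv.v2.read_secret_version(path="projects/river-temp/s3")
s3_creds = secret["data"]["data"]          # KV v2 nests the payload under data.data

access_key = s3_creds["access_key"]
secret_key = s3_creds["secret_key"]
```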
We are also working on Kubeflow, and we are developing a custom KubeSpawner to use inside OpenShift so that, as I said, we can gather different pieces of information and launch notebooks with different parameters, plus a few other things I do not remember right now; the general idea is sketched below.
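A rough sketch, in a jupyterhub_config.py style, of what such a custom KubeSpawner could look like; the subclass name, role names and resource limits are invented for illustration and are not the actual Valeria spawner.

```python
# Rough sketch of a custom KubeSpawner that adjusts notebook resources from
# information gathered at login time. Names and limits are hypothetical.
from kubespawner import KubeSpawner

class ValeriaSpawner(KubeSpawner):          # hypothetical subclass name
    async def start(self):
        # Claims forwarded by the authenticator (for example Keycloak roles)
        # are available through the user's auth_state before the pod exists.
        auth_state = (await self.user.get_auth_state()) or {}
        roles = auth_state.get("roles", [])

        if "gpu-researchers" in roles:      # hypothetical role name
            self.extra_resource_limits = {"nvidia.com/gpu": "1"}
            self.cpu_limit = 8
            self.mem_limit = "32G"
        else:
            self.cpu_limit = 2
            self.mem_limit = "4G"

        return await super().start()

c.JupyterHub.spawner_class = ValeriaSpawner
```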
But that is about it and, of course, everything we do, all those new technologies, integrations and so on, will be published open source for everyone.
B
This is amazing stuff, I promise you. This is not the Kubeflow meeting. I will say that the last point you had under what's next, particularly the part about integrating compute resources into Jupyter, is probably the most requested feature we have had in Kubeflow. I would love to get your thoughts on it.
A
Yes, I will come back to that at the end, but sure, definitely, those are things we are touching on; we want to test them and push them to their limits. This integration is very important because, you know, our researchers are not IT people, and they are not even data scientists yet. The mandate I have from the management of the university is clearly stated: we want our researchers to do research, not to lose time setting up infrastructure, setting up libraries and finding the right versions that work. That is a loss of time for us.
B
Being able to just describe the number of CPUs, the number of GPUs and your container image, and to create an entire new cluster for a job to run in: it spins it up, creates it, runs it, stores the data and then shuts down. And, you know, the bridge between that and doing it from JupyterHub should be very, very small, so I would love to get your thinking on how we help data scientists execute that very step.
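To make that lifecycle concrete, here is a minimal sketch of a describe-and-run batch job using the plain Kubernetes Python client; the image name, namespace and resource figures are placeholders, and this is only a stand-in for whatever Kubeflow or the platform would actually orchestrate.

```python
# Minimal sketch: describe CPUs, GPUs and a container image, submit it as a
# Kubernetes Job, and let the cluster clean it up after it finishes.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="demo-train-job"),
    spec=client.V1JobSpec(
        ttl_seconds_after_finished=300,     # auto-cleanup once the job is done
        backoff_limit=0,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="train",
                        image="registry.example.org/genomics:latest",  # placeholder image
                        command=["python", "run_analysis.py"],
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "4", "memory": "16Gi"},
                            limits={"cpu": "8", "memory": "32Gi", "nvidia.com/gpu": "1"},
                        ),
                    )
                ],
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="my-project", body=job)
```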
A
One thing that comes to mind, which might be of some interest to you: as we come from the HPC world, we have managed to integrate CVMFS inside the notebook images. That way we have direct access to precompiled modules for computation and the like, without having to bake them into the image. So we keep the images as small as possible while still having access to a huge number of different scientific libraries, without having to do anything.
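A small sketch of what that buys a notebook pod: if the CVMFS repository is mounted into the container, the software tree is simply visible on the filesystem, and the image itself stays small. The repository path below is the one commonly used for the Compute Canada software stack; the site-packages path added to sys.path is a hypothetical example.

```python
# Minimal sketch: check that the CVMFS software stack is visible inside the
# notebook pod. In practice software is activated through Lmod modules; for a
# pure-Python tree one could also point sys.path at it directly.
import os
import sys

CVMFS_ROOT = "/cvmfs/soft.computecanada.ca"

if os.path.isdir(CVMFS_ROOT):
    print("CVMFS software stack is visible inside the pod")
    sys.path.insert(0, os.path.join(CVMFS_ROOT, "custom/python/site-packages"))  # hypothetical path
else:
    print("CVMFS is not mounted in this container")
```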