From YouTube: 16 OpenShift and Machine Learning at ExxonMobil
Description
OpenShift Commons Gathering @ KubeCon NA, San Diego, November 18, 2019
Audrey Resnick: Okay, so I'll tell you our story. My name is Audrey Resnick; I'm a data scientist, and I work with the Computational Data Sciences group within ExxonMobil. As a data scientist, probably two years ago, there were just three of us in this group, and what we were trying to figure out was: how can we get our proofs of concept over to our customers? And this was actually a challenge for us, because we'd come up with these really great ideas, and then we'd sit there, look at them on our screens, and go, darn.
I mean, who wants to sit there and install a lot of software when you just want to see if the problem you have can actually be solved, or whether it's even a problem you should be looking at? So we started our search, and we first started with Jupyter notebooks, and we said, well, this is a good way.
Then, having lunch with one of my colleagues, Chad Furman (we're going to have to get "Friends of Chad" buttons, because he goes around ExxonMobil giving out all these ideas, and some of us end up here talking about the stuff we've come up with), he said: well, why don't you take a look at OpenShift? Because, basically, what you can do is take your entire environment and create a container.
So, instead of worrying about giving people local admin access, or worrying about the latest source code, or even worrying about some of the dependencies that you have, you can contain all of this in an atomic unit that you can go ahead and deploy. And we sat back and said: that's really a great idea; if this works for us, we would really have something.
It ties really well into the agile process, so that we could work through our code a lot more iteratively and really quickly push out those minimum viable products. And I'll give you a taste of that right now: last year at this time, we had pushed out two minimum viable products, because we had just started using OpenShift in earnest. This year at this time, we're already past seventy. Okay, that's seventy, seven-zero. That's huge!
So my goal last year, as a data scientist, was: I want a data science environment for myself, my colleagues, and my users that is interactive and reproducible and gives me some sort of collaboration. So what we ended up doing is saying: let's get away from this snowflake-type setup and create some sort of workflow where our code would be coming out of a Git repository. And for you guys still here, yeah: for some of the scientists, this was a new thing.
Their Git repository was their H: drive, okay? And that's a challenge: you have to get folks to determine, you know, how are we going to work together? And OpenShift actually allowed us to do this as well, because we could then put a workflow together: we would take the code out of Git, build the notebook into an image with source-to-image (S2I), push it to OpenShift, and then basically have our users or our colleagues be able to hit a URL. And that was huge for us, because that's the interactive, reproducible, and collaborative environment that we wanted.
So what we did last year is we took that one step forward. We said: hey, data scientists, we're going to use agile methodology. We want to be able to talk about the problem that you're developing.
We have this way that you're going to go ahead and deploy your product to your user, and, by the way, we're doing this using an OpenShift environment, and guess what: your user or your colleague has the ability, just by clicking on that URL, to go ahead and take a look at your product. For us, that's actually groundbreaking. It may not be for many of you, but for us this was huge, because before this, if we wanted to deploy a proof of concept, it would take us three weeks on something that we called "quick app delivery."
The other thing that we brought with us is that interactive feedback. As I mentioned, it tied in really nicely with agile development: we could then see, is this a solution? Okay, if it's not, we can go ahead, recycle, and keep on going, and for us, as I mentioned, that was just very nice.
And the other thing that we have with this data science delivery model: in the process of working through agile, if we determine that we have to bring in some sort of external package, or if we have to change direction a little bit, how are we going to add more packages or libraries? We can't just download them; ExxonMobil has this thing where you don't want to go ahead and download anything into our environment, because we're worried about malware and attacks. So what we have is an actual security portal, where our data scientists request packages.
Therefore, when you go ahead the next time and you want to actually use this package, guess what: we're not going to have 20 instances of it all over the place. It's going to be in one central location, where we can actually get the exact package that everybody will be using. Again, for us, that was another huge win, because I can attest to it: as a data scientist three years ago, with just the three of us, we had multiple versions of Python.
We had multiple versions of NumPy, and we could never agree on which one was correct; and some of you are laughing, yeah.
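The "one exact package for everybody" idea above comes down to pinning versions centrally and checking environments against the pins. A minimal sketch of that check, with made-up package names and versions (these are illustrative, not ExxonMobil's actual pins):

```python
# Compare the versions a team member has installed against the pinned,
# centrally approved versions, and report every mismatch.

def check_pins(pinned, installed):
    """Return human-readable mismatches between pinned and installed versions."""
    problems = []
    for name, version in pinned.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: pinned {version}, not installed")
        elif have != version:
            problems.append(f"{name}: pinned {version}, found {have}")
    return problems

# Hypothetical central pins vs. one scientist's laptop:
approved = {"numpy": "1.17.3", "pandas": "0.25.3"}
on_laptop = {"numpy": "1.16.0", "pandas": "0.25.3"}

for line in check_pins(approved, on_laptop):
    print(line)  # flags the NumPy version drift
```

In practice this is what a single internal package index gives you for free: everyone resolves the same name to the same artifact, instead of twenty ad hoc copies.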
So you understand that. So, having this data science delivery model, and then being able to get these external packages and pull them in nicely, made this environment something that was very valuable to our data scientists and optimization engineers. And I'll just quickly go through this, because the other thing that it did is it actually turned some of our data scientists and optimization engineers into developers.
We don't tell them that. We tell them that, with the OpenShift environment, we have a platform where we're helping you develop success skills. So now, when you see that figure with the developer, just think of that as the data scientist. We're saying: this way, we're helping you easily store your code somewhere where somebody can get at it, and it's being automatically built into an image with source-to-image, and guess what, you don't have to worry about finding images.
We have those images within an image registry, and you know what, we can easily deploy them. So, unbeknownst to them, they're working in this methodology. We have over 40 data scientists at this point in time who are actually using source-to-image and are actually becoming developers; but, as we tell them, you're developing these great success skills to make you a better data scientist, by being able to release your proofs of concept very quickly.
So let me just switch gears a little bit and talk about the machine learning that we have at Exxon. One of the examples that my group particularly works with is optimization and surveillance, and the chart that you're looking at right there is just a machine learning model to predict the well flow, within the OpenShift environment.
One of the things that we looked at, and kind of struggled with at the very beginning, was: if we were going to go ahead and work with the data scientists, we didn't want to have over 70 different containers. So what we quickly did, looking at the different types of problems that people worked on, is come up with three images that we use among our data scientists and that we also give out to the rest of the ExxonMobil scientists when they want to use them.
You can kick the tires on your model and see how well it works. And then, finally, we have an advanced container that we're building that only a very few of our data scientists are using right now, and that's because we use this container for some of the GPU work that we're doing, which is still a proof of concept. So, for those that are using PyTorch or TensorFlow and taking a look at some of our more advanced models, they're going to go ahead and use that third container.
So, while I'm talking about machine learning: for us, right now, we're doing our final setup. I'm really hoping this is the final setup, because we've been having fun with this since May. We have our final GPU cluster; we're using some NVIDIA V100s, and we also have some internal services with our high-performance computing center, where they also have an NVIDIA V100 cluster set up. And there are really two proofs of concept that we're working with, and I'll talk about them on the next slide.
So, for those of you who are not geologists (and I think I'm the only geologist in here, plus I guess I'm a data scientist, software engineer, jack-of-all-trades, whatever): that's rock data. We're looking at the porosity and the permeability, and we're determining: can we take a look at a number of our reservoirs from all over the world and see if we can make matches, and say, based on this type of reservoir, we see these types of characteristics? You know what, when we had a field in this type of reservoir, we produced X amounts of hydrocarbons.
This other reservoir looks similar in terms of the permeability and porosity and some of the other characteristics; it might be worth taking a bet on that one and seeing if it turns out the same.
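The matching idea described above, comparing reservoirs on properties like porosity and permeability and finding the closest known analogue, can be sketched as a simple nearest-neighbor lookup. This is a hedged toy version; the field names, numbers, and the choice of distance are illustrative assumptions, not the actual model:

```python
import math

def most_similar(candidate, known):
    """Return the name of the known reservoir closest to the candidate.

    candidate: (porosity_fraction, permeability_millidarcies)
    known: dict mapping reservoir name -> (porosity, permeability)
    """
    # Permeability spans orders of magnitude, so compare it on a log scale.
    def features(porosity, perm):
        return (porosity, math.log10(perm))

    cf = features(*candidate)
    return min(known, key=lambda name: math.dist(cf, features(*known[name])))

# Invented example reservoirs:
reservoirs = {
    "Field A": (0.22, 500.0),  # high porosity, high permeability
    "Field B": (0.08, 5.0),    # tight rock
}
print(most_similar((0.20, 300.0), reservoirs))  # closer to Field A
```

A real analogue search would use many more characteristics and a learned similarity, but the shape of the question is the same: which produced field does this new reservoir most resemble?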
So those are the two proofs of concept that we're working on right now, and with that I'm going to hand it over to Cori. I'm just going to mention: last year, there was myself and just one Red Hat contractor, and as of two months ago, we were actually able to create an actual enablement team for our computational data scientists. Cori?
Cori: Thanks, Audrey. All right, so yeah, one of the things that I guess Audrey sort of tricked me into was heading up this team that's called the enablement team; thank you. And so one of the things our purpose is to do is to create appropriately awkward conversations with the data scientists. We do peer reviews; the data scientists generally operate by themselves, focused very specifically on their machine learning or their models. So one of the things that we talk about with them is creating a pipeline of automation.
One of the things that we specifically go over with them is that one size does not fit all. We tried to actually give them a solution, "if we build it, they will come," and they did not like that; it didn't fit their needs. And we found out that a lot of these things are very quick and iterative; there's a huge wastebasket of ideas. So we simply enabled them to use webhooks with S2I. That was the simplest solution, and it was one that they were very happy with.
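The webhook pattern here is just: a Git push event comes in, and a new S2I build is kicked off only if the push touches the branch the deployment tracks. A minimal sketch of that decision; the payload field names mimic a generic Git push webhook and are assumptions, not a specific provider's schema:

```python
def should_trigger_build(payload, tracked_branch="main"):
    """Decide whether a push-webhook payload should kick off a rebuild."""
    ref = payload.get("ref", "")       # e.g. "refs/heads/main"
    branch = ref.rsplit("/", 1)[-1]
    commits = payload.get("commits", [])
    # Rebuild only for non-empty pushes to the tracked branch.
    return branch == tracked_branch and len(commits) > 0

# Invented example payloads:
push_to_main = {"ref": "refs/heads/main", "commits": [{"id": "abc123"}]}
push_to_feature = {"ref": "refs/heads/experiment", "commits": [{"id": "def456"}]}

print(should_trigger_build(push_to_main))     # True: rebuild and redeploy
print(should_trigger_build(push_to_feature))  # False: leave the app alone
```

In OpenShift itself this logic lives in the BuildConfig's webhook trigger, so the data scientists never write it; pushing code is the whole workflow.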
Another big thing that we learned:
We had one project with a month's budget that was burned through in about three days, using GPUs in a cloud provider. I won't tell you which one, but basically, this is one of those things where, when you're looking at GPU training or any kind of GPU use, you decide whether or not it makes more sense to actually just buy a rack of GPUs every two or three months instead of putting it in cloud space. That was one hard lesson that we learned.
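The buy-versus-rent decision above is back-of-the-envelope arithmetic: how many months of cloud GPU spend would pay for a rack you own? A sketch with placeholder prices (the rates and rack cost are invented, not real quotes or our actual numbers):

```python
def breakeven_months(rack_cost, cloud_rate_per_gpu_hour, gpus, hours_per_month=730):
    """Months of equivalent cloud usage whose cost would cover buying the rack."""
    monthly_cloud_cost = cloud_rate_per_gpu_hour * gpus * hours_per_month
    return rack_cost / monthly_cloud_cost

# Example: a $100k rack of 8 GPUs vs. renting 8 GPUs at $3/GPU-hour,
# assuming they run around the clock:
months = breakeven_months(100_000, 3.0, 8)
print(f"{months:.1f}")  # roughly 5.7 months to break even
```

With utilization that high, buying wins quickly; with GPUs that sit idle most of the month, the cloud side of the comparison shrinks and renting can still make sense.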
Another one: a lot of the data scientists are very focused, again, on specifically solving a very, very niche problem, so we've had to sort of bring them out of that mindset, and we do it by asking really simple questions. These are some of the questions we usually ask. Where's your data? We talk about data gravity, and we help them be more aware of that. We also ask them: where are your customers?
This is sort of an overall architecture that we're helping them to think about; this is sort of our stack, I guess you would say. Up in the top left, it's talking about cloud-ready applications. So we try to help facilitate the data scientists to think as developers, to break these apps up. A lot of times it's a single Python script that they've got, like 2,000 lines of code or whatever, and we're trying to break that out, make it more modular, and help them to think about collaborating with their peers.
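The modularity conversation above can be made concrete with a toy before-and-after: instead of one long script that does everything inline, break the work into small named functions behind an entry point, so each piece can be tested and reused by a peer. The pipeline steps here are invented placeholders:

```python
def load_data(raw):
    """Parse raw comma-separated sensor readings into floats."""
    return [float(x) for x in raw.split(",")]

def clean(values, low=0.0, high=100.0):
    """Drop readings outside the plausible sensor range."""
    return [v for v in values if low <= v <= high]

def summarize(values):
    """Compute the statistic the report actually needs."""
    return sum(values) / len(values)

def main(raw):
    # The 2,000-line script collapses into a readable pipeline.
    return summarize(clean(load_data(raw)))

print(main("12.0,15.5,-999.0,14.5"))  # the -999.0 sentinel is dropped
```

The payoff is exactly the "how are we supposed to support that?" question: a colleague can now swap in their own `clean` or `summarize` without touching the rest.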
Again, we create that awkward conversation of: well, how are we supposed to support that? And that really helps them to sort of get out of their own heads with this. So, specifically on the team, one of our personal focus areas is not necessarily trying to do something perfect. That is an ExxonMobil thing; we've always wanted to do something flawless, but this is a cultural thing we have to sort of break out of. So success is not about doing things perfectly.
It's about willingness to change and to be honest about where you are. Ultimately, this is far more important than anything else that you'll do. So, in the upstream enablement team, our focus areas, specifically around what we're delivering: consulting, which is probably 80% of our time right now, just because we're trying to bring people up (this is technical debt, but in the people and skill-set area), and education. A lot of these consulting engagements become education.
We do workshops that have been really helpful, just teaching them. We have various layers in Git: we teach them how to use Git as an individual, how to use Git as a team, and then how to use Git to collaborate externally, and to just look at the bigger picture. The other thing is that all of these things we're doing either lead to collaboration or to partnering with organizations internally.
Ideally, hopefully, we'll get to the point where we can collaborate externally as well; we're working on that. One of the things is JupyterHub: we're looking at Open Data Hub and JupyterHub as one of those enablers for self-service, and then bringing GPUs to the masses. So, right at the bottom here, I'm going to turn it back over to Audrey; she'll talk about sort of why we ended up where we are. And this is the user story, right?
Audrey: So again, I mentioned that last year at this time we had two proofs of concept, and these were for our clients up in Calgary, specifically within the Kearl mine. They had a number of trucks that would deliver material around the mine, and you can imagine, if you have 30 or 40 trucks on one road, and these trucks each weigh a couple of tons, that the road is going to degrade.
So one of the problems that we were given is: if we give you a starting time for these trucks, and we say when they're picking those loads up and where they're supposed to go, can you optimally create some sort of system where we can spread the trucks across different roads, making sure that when they get to the dump location or to an actual crusher location, they're actually taking the ore there, so we either get rid of the ore or we crush the ore finer?
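A toy version of the routing idea above: assign each haul to whichever road currently carries the least load, so no single road takes all the traffic and wear. This is a hedged sketch; the greedy rule, road names, and tonnages are illustrative assumptions, not the actual optimizer:

```python
def assign_hauls(hauls, roads):
    """Greedy load balancing: route each haul over the least-loaded road.

    hauls: list of (truck_id, tonnes) pairs
    roads: list of road names
    Returns (truck_id -> road assignment, tonnes carried per road).
    """
    load = {road: 0.0 for road in roads}
    assignment = {}
    # Place the heaviest hauls first so they spread across roads.
    for truck, tonnes in sorted(hauls, key=lambda h: -h[1]):
        road = min(load, key=load.get)  # least-loaded road so far
        assignment[truck] = road
        load[road] += tonnes
    return assignment, load

# Invented fleet and roads:
hauls = [("T1", 240.0), ("T2", 220.0), ("T3", 250.0), ("T4", 230.0)]
assignment, load = assign_hauls(hauls, ["north road", "south road"])
print(load)  # tonnage ends up split evenly between the two roads
```

The real problem also has pickup times, dump-versus-crusher destinations, and road conditions in it, but even this simple balancing shows why "buy more trucks" isn't the only answer.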
You might want to say, well, not really that, but you get the idea: we were able to say, no, don't buy more trucks; we're going to have a better way of actually going ahead and telling the trucks which location to go to. And I think another example I'll give is with graders. With some of the roads that we looked at, we said: you know, you say that you also need more graders.
Well, actually, we can show you that 60% of your graders are sitting in different locations and not working as efficiently as they could.
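That 60% figure is the kind of number that falls out of very simple fleet-status accounting. A minimal sketch, with an invented status snapshot standing in for the real telemetry:

```python
def idle_fraction(statuses):
    """Fraction of graders whose latest status is anything but 'working'."""
    idle = sum(1 for s in statuses.values() if s != "working")
    return idle / len(statuses)

# Hypothetical latest-status snapshot for a five-grader fleet:
fleet = {
    "G1": "working",
    "G2": "parked",
    "G3": "parked",
    "G4": "working",
    "G5": "parked",
}
print(f"{idle_fraction(fleet):.0%}")  # 60% of this toy fleet sits idle
```

Before buying more equipment, a chart like this answers the cheaper question first: is the equipment you already have actually being used?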
So those were some of the items that came out of the proof of concept. And one of the things I think was really important about this is that we did it as data scientists in the research center.
So
at
the
end
of
the
day,
I
think
that's
what
we
think
of
any
ways
as
democratizing
data
science-
and
here
are
my
colleagues
I
made
them
posed
for
this
picture
here
and
they're
very
happy
now,
because
we
don't
have
to
cram
everybody
into
a
room.
However,
we
can
deliver
that
proof
of
concept,
so
maybe
some
of
our
other
colleagues
in
in
Alberta
or
elsewhere
or
in
India,
can
actually
group
around
a
computer
and
take
a
look
at
a
proof
of
concept
that
we
deliver.