Description
Keynote: Baking AI Ethics into Your AI Infrastructure Today with Daniel Jeffries of Pachyderm.
Filmed October 28th, 2019 in San Francisco.
So we're going to talk a bit about algorithms making decisions in our lives, and how we know that we can trust the decisions that are being made. Algorithms are making more and more decisions for us every single day. Many of them are hidden away and we're not paying attention to them. A lot of the time, we don't even know that we're already using artificial intelligence.

When you talk to Siri or Google Assistant and you ask it questions, that's NLP: it's understanding what it is that you're saying. And when you use Google Maps to get from point A to point B through the various wildfires, you're using artificial intelligence in the background. We tend to no longer think of those things as artificial intelligence as they get baked into our lives, and in the future this trend is only going to accelerate. We're going to see more and more algorithms making decisions for us.

Some of those decisions are going to be trivial, and we don't really care whether they make mistakes, and some of them are going to be very serious. I noticed the other day that when I told Google Photos to look for people, it tagged a bunch of the paintings that I had photographed all over Europe as people.

These decisions become very serious, and we need to think about how they're being made and what we can do about auditing them. I'll give you a few nightmare examples of AI gone wrong. Sometimes these are easier to see. There was a terrible, painful example of Google Photos labeling people of color as gorillas.

Imagine if you wake up in the morning and you log in to find yourself tagged as an animal. This is an incredibly painful experience, and people don't understand how artificial intelligence works, so they might think that there's a rogue coder in your organization who did this deliberately. This is a PR disaster, but it's also a disastrous experience on a personal level. This is a painful experience for people, and sometimes life and death is at stake too, and entire business models are at stake.

An Uber self-driving car hit and killed a woman in Arizona last year. The sensors detected the person first as an unknown object, then as a vehicle, then as a bike, and the car sped up. There were a number of mistakes and assumptions within the model and within the decision-making process they had.

The National Transportation Safety Board discovered in its report that Uber had disabled the self-driving system's ability to emergency brake, because they didn't want to see erratic behavior; they didn't want the car to make a mistake and suddenly jam on the brakes. So they were relying on a human, a safety driver, to make that split-second decision, what we call human in the loop, or HITL. That can be a very effective model, but it's not a panacea.

If you've ever used cruise control, you've probably noticed yourself getting drowsy and paying less attention. That's actually been studied: we get something like 25 to 30 percent more drowsy. I've noticed this in my mother's car, which has adaptive cruise control with lidar and radar, where it speeds up and slows down; I start paying a lot less attention. So imagine that a completely autonomous vehicle has been driving for an hour or two.

You have no feedback from it whatsoever that it's made a mistake, and you have to wake up out of this trance within a split second and make a decision to stop the car. It's a flawed system altogether, and it knocked Uber's program off of public streets to this very day, a very costly mistake. And then there's this last example: researchers were testing a classification system, and they tested stop signs with graffiti and stickers on them.

When artificial intelligence algorithms make mistakes, they may be superhuman in their ability to make choices more accurately than humans, but when they do make mistakes, they make very different mistakes than human beings, sometimes mistakes that we would find trivial. If you train a child to recognize stop signs, they're not going to mistake either of those signs as anything but stop signs just because they have graffiti on them. The one on the right was detected by the visual classification system as a 45-mile-an-hour sign, simply because they put stickers in different locations to confuse it.

These systems can be incredibly brittle, and we're going to have to deal with that in the future. But the longer-term problems are subtler; they're sometimes harder to see. We as human beings are very focused on the big, flashy problems. It's hard to see the problems that play out over a long period of time. We've spent three trillion dollars since 9/11 fighting terrorism, which kills fewer people than lightning strikes.

We spend ten billion dollars a year on heart disease and cancer, which kill one in four of us, and it's because those are diseases that play out over a very long period of time. In terms of subtler problems: the COMPAS recidivism score in Florida gives a score to a judge about whether somebody should get bail and whether they're likely to commit a crime again, and it was a black-box algorithm that was sold to the state.

These systems exist in about eighteen to nineteen states, and when people finally started to peel the box back on them and look at the methodologies behind them, they found that only two of them had been certified, and they'd been certified by the very people who sold them to the state. They found that they're only about sixty percent accurate at predicting anything, and on particular cohorts they're terrible at it.

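To make that concrete, here is a minimal sketch of the kind of audit that catches this: measure accuracy per cohort instead of only in aggregate. The records below are fabricated purely to show the mechanics; none of this is drawn from the actual COMPAS data.

```python
# Per-cohort accuracy audit: an overall number can hide a cohort
# where the model is terrible. Data here is made up for illustration.
from collections import defaultdict

# (cohort, predicted_reoffend, actually_reoffended)
records = [
    ("A", True, True), ("A", False, False), ("A", True, False),
    ("B", True, False), ("B", True, False), ("B", False, True),
]

totals, correct = defaultdict(int), defaultdict(int)
for cohort, predicted, actual in records:
    totals[cohort] += 1
    correct[cohort] += int(predicted == actual)

for cohort in sorted(totals):
    rate = correct[cohort] / totals[cohort]
    print(f"cohort {cohort}: {rate:.0%} accurate over {totals[cohort]} cases")
# An audit would flag any cohort whose accuracy falls well below average.
```
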
You can imagine what happens if you put a deep learning system on the history of the American justice system: you'd bake those historical biases right in. The same goes for lending. A model trained on historical loan data might not give loans to as many women who could pay them back, because historically that bias was in the data set, and you might not be able to see this right away, because it's not denying loans to all women; it's just giving them out to a lower percentage of the ones who could actually pay them back. These are very difficult problems to spot, and as we use more and more algorithms in life, we're going to see more and more of them become challenging. So how do we fix this? Well, one of the things coming down the pike is explainable AI.

Can the artificial intelligence tell us in plain language what it was doing? We're starting to see methods like LIME, which can look at a visual classification system and say: we think these clusters of pixels were involved in the decision when the AI chose to label this as a bird or a plane.

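As a rough illustration of what that looks like in practice, here is a minimal, self-contained sketch using the open-source lime package. The image and the classifier are random stand-ins so the example runs on its own; swap in a real model and image in practice.

```python
# Minimal LIME-for-images sketch (pip install lime). The classifier
# below is a fake stand-in, purely to show the mechanics.
import numpy as np
from lime import lime_image

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # placeholder image, not real data

def classifier_fn(images):
    # Stand-in for a real model: return per-class probabilities
    # for a batch of images, shape (n_samples, n_classes).
    batch = np.asarray(images)
    bird_score = batch.mean(axis=(1, 2, 3))       # fake "bird" probability
    return np.stack([bird_score, 1.0 - bird_score], axis=1)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, classifier_fn, top_labels=2, num_samples=200
)

# Recover the superpixel clusters that pushed the model toward its
# top label: the "clusters of pixels" described above.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5
)
print("pixels flagged as influential:", int(mask.sum()))
```
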
But those methods are not developed far enough for us to rely on, and even then explainability is going to be a moving target. Think about something like AlphaGo, which had four algorithms involved in it; each of those needs to explain itself in some way. So we'll incorporate explainable AI as we go, but it's not enough.

So what do most organizations do? Well, they form an ethics committee, and the ethics committee is usually some powerful people, or people who are good at politics, plus some people in human resources who tend to be more interested in this kind of thinking than anyone else. And what they inevitably do is get together, talk, and put out a report with the words "inclusive" and "fair" in it, right?

If you look at the EU Commission's work on this, their framework says AI is going to be inclusive and fair and transparent, and this sounds great; everybody feels really good about it. Except none of it is actionable. These are abstract human concepts that are not translatable into models and machine code. And so what happens? Nothing. The group gets disbanded, or nothing really changes within the models, because it can't. And people love these reports; somebody just sent me one this morning, saying, "I know you were giving a talk on ethics."

So what we really need to do is take the middle path. I'm not going to spend too much time talking about how you would actually formulate ethics into a series of statements that your data scientists could use; that's a totally different talk. But I am going to talk about how we can audit these systems today, utilizing tools and best business and IT practices that we've been using for many, many years.

We essentially don't need an ethics committee or explainable AI to be able to deploy these systems effectively. I'm going to talk about building what I call an AI anomaly response team, and that really consists of two teams. One is a customer- and public-facing team, and the other is what I call QA for artificial intelligence. The customer- and public-facing team is there to liaise with upset customers, or to step in when there's a PR disaster.

This is inevitable. We have to accept the fact that these systems are not perfect and are going to make mistakes; nothing in the world is perfect. If we could have pure determinism, we could build perfect systems and have a perfect life. But people are going to be upset when they don't get a loan, or when they're hired or fired based on an algorithm's output. They're going to ask questions, and your answer had better be a good one; that's just the way it works. You're going to need to train people.

They're going to need templates, they're going to need to understand how these algorithms make decisions, and they're going to need to be able to talk to the public, to reassure the public and customers that we're on top of this, we're fixing it, we're moving forward, and to give them regular updates. This, by the way, is actionable.

The second team is something that you're going to have to build, and expect multiple organizations to build over time. That is what I call a QA-for-AI team. This is a group of coders, engineers, testers, and data scientists who specialize in breaking artificial intelligence and finding these various edge cases, and their job is to come up with triage solutions and long-term solutions. This is going to be a very creative team. In artificial intelligence we often hear: oh my god, what's going to happen to all the jobs?

We're going to lose some. It's very easy to see all of the jobs that we're going to lose, but it's very hard to see all the ones that we're going to create, and in fact we always, inevitably, create new jobs. It's hard to explain a web designer to an 18th-century farmer, because that job is built on the back of twenty other technologies: computers, the web browser, digitization, Photoshop, and so on. You can't explain all of those things, so we can't see all the jobs that are coming.

This is going to be a very elite team in any organization. Take the example of Google Photos labeling folks incorrectly. They got a lot of backlash that day, and what they did is simply stop the system from labeling anything as a gorilla. There are a bunch of articles in Wired and Forbes about how they didn't really fix the problem. That is correct: they did not really fix the problem. But this actually is an effective stopgap solution. It's a triage solution.

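As a sketch of what that kind of triage can look like in code: suppress a known-bad label in post-processing while the real fix is worked out. The label names and prediction format here are illustrative; this is not Google's actual implementation.

```python
# Triage by label blocklist: the model is untouched, but a known-bad
# label is filtered out of its output before anything reaches users.
BLOCKED_LABELS = {"gorilla", "chimpanzee"}  # illustrative blocklist

def triage_predictions(predictions):
    """Drop blocked labels from a list of (label, confidence) pairs."""
    return [(label, conf) for label, conf in predictions
            if label.lower() not in BLOCKED_LABELS]

raw = [("gorilla", 0.91), ("person", 0.88), ("outdoors", 0.64)]
print(triage_predictions(raw))  # [('person', 0.88), ('outdoors', 0.64)]
```
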
How do you roll backwards and forwards? We already have a lot of these systems in place; we're going to have to adopt them and adapt them for artificial intelligence in particular.

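Here is a toy sketch of what rolling backwards and forwards can mean for AI: every deployed model keeps a pointer to its weights and to the exact data snapshot it was trained on, so a bad deployment can be reverted instantly. This is an in-memory illustration with made-up paths; real systems such as Pachyderm or MLflow persist this lineage.

```python
# Toy model registry: version every model with its data lineage so we
# can roll the live deployment backwards and forwards at will.
from dataclasses import dataclass

@dataclass
class ModelVersion:
    version: int
    artifact_path: str   # where the trained weights live (illustrative)
    data_commit: str     # the dataset snapshot it was trained on

class Registry:
    def __init__(self):
        self.versions = []
        self.live = None   # version number currently serving traffic

    def register(self, artifact_path, data_commit):
        v = ModelVersion(len(self.versions) + 1, artifact_path, data_commit)
        self.versions.append(v)
        self.live = v.version
        return v.version

    def rollback(self, version):
        # Move the live pointer back; weights and lineage stay intact.
        self.live = version
        return self.versions[version - 1]

reg = Registry()
reg.register("s3://models/v1.pt", "data@a1b2c3")   # hypothetical paths
reg.register("s3://models/v2.pt", "data@d4e5f6")
restored = reg.rollback(1)  # v2 misbehaves in production: revert to v1
print(restored.artifact_path, restored.data_commit)
```
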
We're also going to need this sort of creative problem solving: how would you solve a difficult problem like the stop signs being detected as 45 miles an hour? You're potentially going to have to go out and build or buy a new data set.

And that means you're going to need integration that's much deeper than the kind of CI/CD we have now, or what happened with containers, where you were uniting storage and code and networking and all of these things together. Now you're going to need to unite the business units and the legal teams as well. Think about it: if you need to go buy a new dataset, you first have to find that data set.

Maybe you have to go to procurement to purchase it, or you need to involve the legal team, because you need to test whether that data set is going to be effective for you, and there might be a legal agreement in place that says: hey, you're not allowed to use this in production until you pay for it. There's a lot of coordination that happens. Or maybe you just need to retrain a bunch of models, which means you're going to need GPU cloud time, or time on your OpenShift infrastructure, and that costs money.

So again, procurement is going to have to be involved, and budgeting is going to have to be involved in these things. You're also going to have to think very creatively. This is almost an elite, Special-Forces-like team; they're going to have to come up with solutions. Maybe the triage solution is the best you can come up with and there is no simple answer, or maybe you've got to build a synthetic data set.

Maybe you need to generate a whole series of synthetic profiles of different women and economic models in order to continually test over time whether the loans are being given out fairly, or you need to develop a method that shows that that stop sign is not being detected as a 45-mile-an-hour sign. Test, test, test: unit tests, all of these concepts. It's no longer enough to test the accuracy of a model.

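A minimal sketch of that synthetic-profile idea: generate comparable applicant profiles that differ only by gender, score them with the model, and assert that approval rates stay in parity. The scoring function, fields, and threshold below are all made up for illustration.

```python
# Continual fairness check on synthetic loan applicants. Run this on a
# schedule against the live model; it fails loudly if a gap opens up.
import random

random.seed(0)

def loan_model(profile):
    # Placeholder standing in for the real credit model.
    return profile["income"] > 50_000

def synthetic_profiles(n, gender):
    return [{"gender": gender, "income": random.randint(30_000, 120_000)}
            for _ in range(n)]

def approval_rate(profiles):
    return sum(loan_model(p) for p in profiles) / len(profiles)

women = approval_rate(synthetic_profiles(10_000, "F"))
men = approval_rate(synthetic_profiles(10_000, "M"))
assert abs(women - men) < 0.02, f"approval gap: {women:.2%} vs {men:.2%}"
print(f"parity OK: women {women:.2%}, men {men:.2%}")
```
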
Say the self-driving car is accurate 98 percent of the time, but it detects a stop sign as a 45-mile-an-hour sign. That's not good enough. You're actually going to need to build a unit test for each of these edge cases, and this team is going to have to know how to do these things. The tests are going to get more and more complex, because we're always going to run into these anomalies over time. AIs will always make mistakes.

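A sketch of what such an edge-case unit test might look like with pytest. The classifier and image names are placeholders; the point is that every known failure mode becomes a regression test that runs against every new model.

```python
# Edge-case regression tests: overall accuracy is necessary but not
# sufficient; specific known failure modes must never come back.
import pytest

def classify(image_path):
    # Placeholder for loading an image and running the real model.
    return "stop_sign"

DEFACED_STOP_SIGNS = ["graffiti_1.png", "stickers_45mph.png"]  # hypothetical

@pytest.mark.parametrize("image_path", DEFACED_STOP_SIGNS)
def test_defaced_stop_sign_still_reads_as_stop(image_path):
    # A stickered stop sign must never regress to "speed_limit_45".
    assert classify(image_path) == "stop_sign"
```
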
The various groups coming out and saying these things need to be flawless are basically asking for unicorns and fairy dust. It won't happen. We're going to have to get comfortable with algorithms making mistakes. Think about a self-driving car: self-driving cars will likely be much better than humans at driving. Humans are terrible at it, by the way; 1.2 million people are killed on the roads every year and 50 million people are injured, by human drivers. Driving requires absolute, perfect concentration.

You know how it goes: you drop the cellphone and start digging around for it, you're fighting with your girlfriend or your significant other, your kids are having a bad day at school, and you're not paying attention. The cars are going to make decisions, and there's a famous MIT test about which person the self-driving car should kill: is it the old person or the baby? You make the choice, and that tells us more about humans than it actually tells us about how these decisions are made by the algorithm.

There's a famous example of a visual classification system shown a baby holding a pencil, and the system labels it as a baby holding a baseball bat. Every human being has the sense to know that a baseball bat would be too heavy for the baby and too big, and so on. The algorithms don't have that kind of contextual awareness, and so we're going to have to build these tests.

There's even a one-pixel attack that's been demonstrated, where a research organization was able to alter a single pixel in images from the ImageNet database, at different points, and completely destroy the system's ability to detect what it was looking at. A single pixel. So these things are going to keep happening, and we're going to have to build proper testing solutions over time to be able to know that these systems are working.

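A toy sketch of probing for that kind of fragility: flip one pixel and see whether the prediction changes. A real one-pixel attack searches for the worst pixel, for example with differential evolution; this version just brute-forces a grid, and the classifier is a deliberately fragile stand-in so the probe has something to find.

```python
# Brute-force one-pixel fragility probe against a placeholder model.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))

def predict(img):
    # Deliberately fragile stand-in classifier for illustration.
    return int(img.sum() > image.sum())

baseline = predict(image)
flips = 0
for x in range(0, 32, 4):
    for y in range(0, 32, 4):
        probe = image.copy()
        probe[x, y] = 1.0                  # overwrite a single pixel
        flips += predict(probe) != baseline
print(f"{flips} single-pixel edits changed the prediction")
```
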
This is a universal problem. We're going to see artificial intelligence saturate everything; every organization and every person on the planet is going to be affected by algorithmic decision-making. Research in explainable AI is going to continue to accelerate, but it's not enough. We're all going to have to deal with risk and get better at it as a society, and we're going to have to get better at how we communicate about these things and how we detect these various anomalies as they're happening. And lastly, I wouldn't be a futurist if I didn't look ahead a little.

All of these high-stakes decisions are coming, whether it's trading money or a visual classification system for a mole on your arm. I talked about that one two or three years ago, and we're already starting to see companies trying to get approval to bring it to market. If I take a picture of this mole and it tells me it's benign, but it turns out to be malignant, whose fault is it? The doctor could have made a mistake too, and maybe I should have gotten a second opinion, but I didn't. Who's at fault?

We're going to see more and more of these types of dilemmas coming forward. Everything is going to be open. I think my friend Daniel likes to say that if it's not open, don't let it think for you. I like that; it's very funny. But we're not just going to see openness in the tools, the infrastructure tools like TensorFlow Extended and Kubeflow and these things; we're going to see open datasets, open algorithms, open models. The open source methodology has absolutely won in the world.

I lived through it in my wonderful days at Red Hat, when we first went in and talked about Linux to the curmudgeonly UNIX engineers who said, "This is going to get me fired; it can't possibly be as good as the proprietary Unix." That looks kind of foolish nowadays, but open source will eat the artificial intelligence community as well, and it should. We want to have these synthetic datasets, open datasets, open algorithms. That's how we actually get to transparency; that's how we turn it from a platitude into something that's useful.

We talked a bit about explainable AI. Contextual AI is something we're going to see develop over the next 10, 20, 30 years as well, and that's where these machines are able to make better decisions with a larger context. It's not a general artificial intelligence, but it has more generalized intelligence; it has more context about the decisions that it's making.

An early example of that might be the capsule network from Geoffrey Hinton. It's still in the research phase and not able to outperform a traditional convolutional neural network, but they taught it a bit about geometry. Look at a lot of the classification systems now: they can detect a face.

We talked about facial recognition earlier, but if I take the eyes and move them off the head so they're floating, and move the nose over here, the system is still going to detect it as a face. It has no understanding that the nose and the head should go together, that the eyes should be here and the lips should be here. So if we can give it more of an understanding of geometry, then we get closer to a system that has some level of common sense and understanding. We'll continue to see development along these fronts over the next few years.

Contextual AI is actually another initiative of DARPA's as well, a contextual layer. I thought I made the term up; sometimes I make terms up and then find out that they're already being used widely, so I can't take credit for it, sadly, unless you want to give me credit for it. That's fine. And that's about it: we're all going to have to deal with these systems in the future.

This is a societal-level problem, and it's a problem that we can deal with right now. It's going to be a moving target, but it's something that we're all going to have to focus on in the coming years if we're going to bring artificial intelligence into our organizations. Thank you very much for your time.