From YouTube: CDF SIG MLOps meeting 2020-08-13a
A
Yeah, so I'm looking forward to getting stuck into that and making sure that all the MLOps stuff is moving ahead on that.
A
So, you know, I've got some idea of what the structure is like now, which is good. I just need to make some time to actually fire an instance up and start playing with it.
D
Hi, hey, sorry, I was chatting to James Rawlings and James Strachan just before, so that always goes over time. So how are you folks doing?
D
Yeah, that's doing it right here. I've got a t-shirt on because it feels like springtime here, so I'm pretty happy about that. It was a pretty short winter this year, so I can't really complain, but the days are getting longer and it's getting warmer. I think you're just getting over your heat wave, so I think autumn is about to start, yeah.
D
Yeah, that's unpleasant, there in the heat, so yeah. I guess this week... did you have anything you want to bring up, Cara or Terry?
A
So, from my perspective, we've just finished off the last few items in the technology requirements section.
A
I've done another three this week; we just need to fire through the last handful.
D
I forgot to take a bunch of notes last week on some of those points we were talking about, so maybe we could tear into those last three today.
D
Maybe if I shared a screen and typed, we could do that. Before that, I had an interesting discussion about Metaflow and Netflix. Metaflow was something I think I mentioned last time, and I spoke to, I guess, the lead developer of it at Netflix, and they were very interested in the MLOps roadmap. In fact, he said he was working on an internal memo at Netflix, to sort of pass around, on their principles of MLOps, which overlap very much with this. So he's going to take another look at it, and if he can share it with me, that would be good too, because it's kind of a large-scale validation of some of those ideas.

D
There was a really good blog they put out recently: they did an integration with Amazon Step Functions, so they use Step Functions as a backend for orchestrating the workflows. Metaflow is more like a high-level flow config that just glues different tools together, and behind the scenes at Netflix they have sort of a closed-source orchestrator, but they wanted to make it work with something that was off the shelf, so that was Amazon Step Functions. So that could be something else. He was interested in Tekton from that point of view, and he mentioned it's a sister project of Spinnaker, so he's very aware of the CDF. So it could well be, if it becomes successful the way Spinnaker did, that they could become interested in the CDF, which I thought was interesting. But we'll see; at this stage they're kind of thrashing pretty hard on it, and they're getting outside contributions and stuff. So they want that to settle down, and to see it has some life to live outside of Netflix, and then they'd be interested in perhaps looking at a place to host it.
D
So I thought that was interesting, because to me I view it as almost an instantiation of a lot of the principles in the MLOps roadmap. Maybe not all of them, but I guess that's part of the idea of the roadmap: it gathers a bunch of challenges and requirements and possible solutions, and this is one of them that, you know, solves a bunch of these things. Yeah.
D
It definitely comes from the angle of building stuff for people who are interested in data science and machine learning; not necessarily only data scientists, but heavily on the data science side. So, building tools that guide them to do it the right way, but without being too prescriptive.
D
So it was pretty interesting, and they run things at quite a scale there. He was saying, for multiple countries they would maybe train a model for every language in that country, so you can imagine the fan-out of that. And they're retraining those; some of it is on cron schedules, some of it is when the data becomes available from some upstream data source.
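A minimal sketch of that kind of per-language fan-out as a Metaflow flow. The flow name, language list and scoring are hypothetical stand-ins; FlowSpec, @step and foreach are real Metaflow constructs:

```python
from metaflow import FlowSpec, step

class PerLanguageTrainFlow(FlowSpec):
    """Hypothetical sketch: fan out one training branch per language."""

    @step
    def start(self):
        # In practice this list would come from the upstream data source.
        self.languages = ["en", "es", "fr", "de"]
        self.next(self.train, foreach="languages")

    @step
    def train(self):
        self.language = self.input                 # one branch per language
        self.score = len(self.language) / 10.0     # stand-in for a real metric
        self.next(self.join)

    @step
    def join(self, inputs):
        # Collect the fanned-out branches and keep the best-scoring one.
        self.best = max(inputs, key=lambda i: i.score).language
        self.next(self.end)

    @step
    def end(self):
        print("selected model:", self.best)

if __name__ == "__main__":
    PerLanguageTrainFlow()
```

Run locally with `python flow.py run`; the Step Functions integration mentioned above lets the same DAG be executed on AWS instead.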
D
So I know this is recorded, so I can't share everything that we talked about, because, you know, Netflix is Netflix. But it was fascinating: they're applying a lot of these principles at scale, and doing it in a way that everyone at Netflix that does machine learning and data science goes through this platform. So I thought that was a good validation of some of these ideas, and there was one blog post...
D
I just put it in the show notes, the minutes: the Metaflow and Amazon one, which I thought was a really good blog post on the philosophy of it from a data science point of view.
D
On their philosophy of things, and why they use a DAG, a directed acyclic graph, to do things. So that's worth a look. Hi, is that Almog? I'm not sure if I'm saying your name right. I'm not sure if you've been here before when I've been here; I don't know if we've met.
E
Hi, we've met on the other meeting, with Terry. I'm a technical entrepreneur; I founded a few companies, and now I'm investigating and researching the innovation in the MLOps world. So I'm here to listen and help with whatever you want to do. I can help, so feel free, please, to ask me to do stuff; that way I'll do it, you know, because if you ask, I can't say no. Yeah, I already volunteered.
D
Yeah, well, it's great to have you along. It's an interesting area, the whole MLOps thing. The more time I spend on things, the more I realize a lot of it is about just handling the data. I must have spent most of the past week just trying to... I'm working on a conversational sort of user interface right now, and there's a bunch of different models doing things behind the scenes that can take their time.
D
They prepare the data, but then the more real-time stuff I'm finding a bit frustrating, because everything's time-sensitive and it's all about getting the data in a timely fashion. Yeah, most of MLOps seems to be about handling data. It's almost like, by the time you get to training the model and deploying it, that's the fun part. It's the dangerous part, but it's also the fun part, so yeah.
D
I guess the next thing: last time we mentioned CFPs and conferences and things like that. Has anyone done any submissions, or heard of anything interesting, to start sort of evangelizing or socializing the roadmap?
A
I think we've probably missed most of the events this year; things tend to be...
D
Yeah, yeah. So I guess one to keep an eye out for: has KubeCon closed? I don't know whether this is within the remit of that, but they certainly would get a bigger audience. I'm pretty sure their main event is late in the year. I don't know if they are...
D
So I guess the next thing is to dive into those final few items. Did you want to show your screen, Terry, or do you want me to?
D
So I made a change to the online learning section: I changed the wording to say it's not out of scope, but it's not something we're looking at at this time, or words to that effect.
A
Yeah, well, what we do is flag certain things as difficult challenges, which are typically on a long time horizon. And that's the point of this, which we're going to come on to next: this is basically giving an indicator of when you can expect certain capabilities to be available, and for something like that, which is very challenging...
A
You would flag it as "research required", and then you would potentially not see any development activity on this time horizon, but you would leave it as an outstanding item that somebody needs to pick up. And those are typically the things that some people will be very interested in, because they'll be looking at the longer-term wins, and where it's worth investing in a big bet for something downstream.
A
Managing assets and security. So this is the latest PR, and then again, yeah.
A
Yeah, so this is an interesting one, because, you know, the practicalities of models are that there is nearly always some sort of trade-off between different factors, and there are often going to be conflicting drivers on customers as they're trying to build solutions.
D
So these trade-offs are in terms of almost design decisions. Currently, I imagine, if it's sort of a human factor, then it would be a note, or some comments, in a notebook or a document somewhere, saying: maybe we couldn't do things at the zip or postcode level; we had to reduce the resolution. The trade-off is that we couldn't be as fine-grained, because of some legal constraints. Or, one thing I was looking at was tracking sentiment on individual comments that could be traced back to individuals, which is not always what you want, because that sort of information could be used the wrong way. So would a trade-off be that you deliberately almost blunt the system, or blunt the data in some way, as a positive trade-off for privacy? Or, in some cases, you go:

D
we require these pieces of personally identifying information to train the model, but it's not going to affect... you won't be able to go backwards, and you have to explain. Am I on the right track in terms of what you mean by trade-off?
A
Yeah, I mean, often different types of compliance actually introduce competing trade-offs. For example, if you're faced with GDPR, but also some new AI legislation about explainability, those things will be in direct conflict, because the more explainable you make the solution, the less privacy.
A
So there are certainly going to be big trade-offs in a triangle between accuracy, privacy and explainability, but typically there are multiple other areas where you will also have these sorts of trade-offs, where there are basically different features in a dataset and you can't optimize for all features. So you're ending up with a balance between weights on certain features that will give you a certain behavior out of the system.
A
So, for example, what you might want to be able to do is tune parameters in such a way that you end up with several models as points on a continuum between different endpoints in the trade-off space. Say you've got a triangle represented by the extremes of accuracy, privacy and explainability.
D
And yeah, there are ensemble sorts of things, and AutoML tools can do things like that, because they just try a bunch of things and you just pick the... In fact, Netflix do that; that was one of the examples they used with Metaflow. They would, you know, do a hundred different variants of it in parallel, sort of almost brute-force it out, and then part of the flow pipeline would be to pick the one

D
that's good enough, based on some constraints. And some of those options might, in this case, have more privacy challenges than others, and then you could weight it accordingly and pick the one... You know, if you value privacy over all else, you want to be as explainable as possible, but... yeah, I like that triangle idea. So could we just take some notes on this, or should we open a pull request and just put some dot points of what you just said? Because it's easy to lose track.
D
So the next one... yeah, the triangle thing is great. I know we can't really include a visual in there; it would look a bit weird. But if we could have that somewhere, that's kind of an interesting idea. It's like... what's the distributed computing version of that? The CAP theorem: consistency, availability and partition tolerance.
B
So, with model training as it's done now, usually these formulations that are checking many different variants of a model, like with different parameters, they are looking at precision versus recall. Often those are the two standard ones, so this would be beyond that: whatever system you have in place to compare your different potential models.
D
I don't know for a fact what Netflix do, but yeah, they try all those permutations of hyperparameters. I'm guessing they're just looking for accuracy, so that's simpler, but you could have some others. I don't know how, and maybe this is part of what Terry's thinking for the requirement: how do you sort of quantify the privacy trade-off?
D
Maybe it's how many... you know, maybe if there are 10 features of personally identifying information, something that uses three scores better on privacy than something that uses seven. And maybe, if all seven features are used to train it, the best model is the one that has better accuracy; although accuracy and precision have a specific meaning in models, so say the "goodness": the best model by whatever your metric is.

D
You kind of have to... that's the trade-off then. If you could say "I'm using three of these, but I get this score", or "I use seven of these and I get that score", the trade-off is either "well, that score is actually good enough, so I'm going to use fewer of them", or "that score's not good enough, so I'm going to take more of the PII features in there". Is that right?
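A minimal sketch of that kind of weighted pick, assuming each candidate variant already carries an accuracy score, a privacy score derived from how many PII features it used, and an explainability score. All names, weights and numbers here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    accuracy: float        # e.g. held-out accuracy in [0, 1]
    privacy: float         # e.g. 1 - (PII features used / PII features available)
    explainability: float  # e.g. a score from an explanation tool, in [0, 1]

def pick_best(candidates, w_acc=0.4, w_priv=0.4, w_expl=0.2, min_accuracy=0.7):
    """Weighted compromise across the accuracy/privacy/explainability triangle.

    Candidates below the accuracy floor are rejected outright; the rest are
    ranked by a weighted sum encoding how much we value each corner.
    """
    viable = [c for c in candidates if c.accuracy >= min_accuracy]
    if not viable:
        raise ValueError("no candidate meets the accuracy floor")
    return max(viable, key=lambda c: w_acc * c.accuracy
                                     + w_priv * c.privacy
                                     + w_expl * c.explainability)

# Hypothetical fan-out of variants, as in the brute-force example above.
candidates = [
    Candidate("uses-7-pii-features", accuracy=0.91, privacy=0.3, explainability=0.5),
    Candidate("uses-3-pii-features", accuracy=0.86, privacy=0.7, explainability=0.6),
]
# With privacy-heavy weights, the three-feature variant wins despite lower accuracy.
print(pick_best(candidates, w_acc=0.3, w_priv=0.6, w_expl=0.1).name)
```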
A
Yes. I think the existing situation is, to a large extent, that people set thresholds and then they're aiming to get one parameter above a threshold, whereas the practical application in the future is going to be much more about having to tune for the best compromise against multiple parameters.
A
And try and get them to resolve against different points in our triangle, and then you'll compare the behaviors of those against a set of overarching tests, and then select from that group the one which is the best compromise for the application.
E
If I can ask: so currently, for what it looks like, the measurement, the KPI, for a good model is how accurate it is, and what we're saying is we need to add more measurements, to see it like a triangle: fairness, accuracy and privacy, and you need to take all of these into account.
D
Yes, it is, and I don't think anyone's really... yeah, maybe people are already doing this, but it'd be surprising. I imagine you could come up with... the way I think of it is that you have a set of so many features that you may or may not use in one of those permutations, and then, when you look at the output of that, you've got all the precision and accuracy of the model itself, and then you've got how many features of that set were used.
D
Yeah, the biases come into it at that point, based on the data, and I guess the role of MLOps there is to have that sort of trail or record of how you got there. People will ask "how come it doesn't work for this group of people?", and then you can go back and go: well, here's the training set of data; it's from this nation or this city; they're not represented. And then you can go: well, that's a problem. What do we do about that?
D
And then you could inject extra data, or you could add, I guess, unit tests or acceptance tests, if you like, to go: we won't accept a model that doesn't at least work for these cases, and fix it up that way. There's definitely a role for MLOps to look at that.
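A minimal sketch of that kind of acceptance test, assuming per-group predictions and labels are available; the grouping and the accuracy floor are hypothetical:

```python
def per_group_accuracy(preds, labels, groups):
    """Accuracy broken down by a sensitive grouping column."""
    by_group = {}
    for p, y, g in zip(preds, labels, groups):
        correct, total = by_group.get(g, (0, 0))
        by_group[g] = (correct + (p == y), total + 1)
    return {g: c / t for g, (c, t) in by_group.items()}

def test_model_works_for_every_group(preds, labels, groups, floor=0.75):
    """Reject the model if any group falls below the accuracy floor."""
    scores = per_group_accuracy(preds, labels, groups)
    failing = {g: s for g, s in scores.items() if s < floor}
    assert not failing, f"model rejected, groups below {floor}: {failing}"
```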
But yeah, this is something that people are discovering every day. There was that... I think it was a Kickstarter startup the other week that was guessing people's gender based on their email address, is that right? It was all over Twitter, and it was pretty obvious it was a bad idea, because you put in such-and-such a name, which could be anything, and it would say a 60% chance of whatever.
D
Yeah, it's codifying the biases of the data. And then, you know, the project got cancelled, but people were saying "oh, but this is just reflecting what the data is". But I think we're at a point where people aren't accepting that. It's like: yes, the data says that, but we should be doing better than just echoing what the real world has, so yeah.
A
So that actually brings us on to the next item, which is pretty much the law of unintended consequences: if we actually have to validate all of these models for fairness and bias, then that means we actually have to hold special category data.
A
Data that actually indicates things like race and religion, gender, sexual orientation, stuff like that, because we need that data to actually detect the bias.
A
So what we're going to see is a whole period in which there are knee-jerk legislative attempts to fix the problem, and those knee-jerk responses will actually make things worse in other areas, because we'll get more data breaches as a result of having to store more data to address the previous concerns.
A
So this piece really needs to address data classification and data protection, and also, you know, probably some degree of auditing, in terms of managing the day-to-day level of sensitivity of the information that's flowing through your MLOps system.
D
So, when you say special category data: some of that, in different countries, is protected data. There must already be some exceptions in place around that, like, for example, in insurance policies for vehicles or other things, even life insurance.
D
Here, certainly, they find out quite a lot about you; in fact, you're obliged to tell them. But normally, in a normal workplace, professional setting, they can't ask that. In my country you can't even ask someone how old they are, professionally; you can't ask about their marital status, or religion, or ethnicity. Any of that stuff, you can't actually ask as an employer.
D
They can ask how old you are? No, they can't; you're actually not allowed. But everyone volunteers it. I mean, there are certain forms where you write down your date of birth, but there are rules around age. Obviously you need to know that for... that's a very trivial example, but for getting car insurance, for example. So I assume there's already some...
D
They do that, that's what I'm saying: they do know. So they must have some special exception, or... yeah. So that's more on the legal aspect, but I guess my point is more that this isn't the first time this handling of sensitive data, protected data or special category data has come up. Banks and insurance companies all over the world do this every day. Maybe lawyers too, but then it's probably filed away in a cabinet, or on a USB drive that they just leave at a bar or something; anyway, that's how you hear about this stuff. But I guess what I'm saying is, in terms of the data security, it might not be that new solutions or anything are required; it's more that considerations for the data might matter more here than in, say, a typical software system.
D
You could argue the data is more sensitive, so someone building one of these systems should be more aware of encryption at rest. Google even brought out that system that has things encrypted, you know, almost all the way, so that the data stays encrypted as long as possible.
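Not the Google system being referred to, but as a minimal illustration of keeping training data encrypted at rest, here is a sketch using the Python cryptography library's Fernet API; the file names are hypothetical:

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice this lives in a KMS/secret store, not in code
f = Fernet(key)

# Encrypt the training data before it ever hits shared storage...
with open("train.csv", "rb") as src:
    blob = f.encrypt(src.read())
with open("train.csv.enc", "wb") as dst:
    dst.write(blob)

# ...and decrypt only inside the training job, keeping plaintext off disk.
rows = f.decrypt(open("train.csv.enc", "rb").read()).decode().splitlines()
```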
D
These are things you might not care about if you were building an e-commerce system, for example. But if you were building an e-commerce system that was tracking every click and every transaction ever, and a whole lot about the customer, and, you know, maybe the mobile app that they're using tracks their location, then suddenly you've got, I guess, an escalation of data that you would have to handle more carefully than you would have in the past.
B
Can I just ask, it's more of a, I guess, slightly technical question, just checking something. The way we have it structured in the roadmap: the more sensitive data needs to be collected and kept in order to check that your models don't have bias.
B
That's how we have it structured here, the unintended-consequences kind of thing. And would it... that is because you should split your test data from your overall data set and keep it separate from your training data.
B
This would be a difference from that standard practice, which is probably best practice for checking your model. But if you had data that had been cleansed, or more anonymized, so we weren't collecting this data, and if it existed it was being taken out of the data set, and you trained your model with that...
A
The challenge is that the bias is in the data rather than in the model, so you're actually needing to add information into the source data, to flag the fact that some of the items have bias, so that you can then train against those to even out the model that you get.
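One common way to "train against" that flagged skew (an illustration, not necessarily what's being proposed here) is to reweight samples so that under-represented groups count more in a weighted training loss; the grouping column is hypothetical:

```python
from collections import Counter

def balancing_weights(groups):
    """Per-sample weights inversely proportional to group frequency.

    Samples from under-represented groups get larger weights, so a
    weighted training loss treats every group as equally important.
    """
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]

# Hypothetical grouping column alongside the training rows.
groups = ["city_a"] * 900 + ["city_b"] * 100
weights = balancing_weights(groups)
print(weights[0], weights[-1])  # city_a samples ~0.56, city_b samples 5.0
```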
B
That's assuming you know the degree of skew, and a lot of the time, I mean, partly it's just people not thinking about it. But when you think of things like the mortgage scandal, where certain postcodes were really penalized under the AI because they were predominantly black postcodes... okay, so in that case, yes, we have bias in reality, and it will be shown in the data even if you're not collecting for it. But what you could have is a test data set that was checking for this. Like: the model you've used, does it actually, in a real-world, but synthetic real-world, setting result in all these side effects that we don't want, say racism or sexism or whatever? So that would enable you to keep some sensitive data out. And of course it differs between models: if you're in medical care, you probably need a lot more of that data, but, you know, for a lot of models...
D
Perhaps, you know, specific examples might be easier to synthesize; it's not uncommon to synthesize data to balance some training set, but you have to know a lot about what you're synthesizing, because it's always risky to train on data that isn't real. But people still do it: people will balance data sets and drop a certain percentage of things. In this case, Terry, how would... if there was some bias detected?
D
It's one thing to sort of fail it and reject it, but how is that corrected? This sensitive category of data that you've got your hands on: does that have to be fed back into the model itself, if you want to correct it with real data? Or can you adjust weights on things? Maybe, with enough data... surely it's not completely black and white, where the source data is completely missing.
D
Yeah, I mean, in some cases you might not, because you might want your model to reflect reality, but you just hand-code heuristics from the output of that. So, in the mortgage case, you might go: well, if the person's from this protected category, or we know these postcodes, based on the data, are problematic...
D
You know, almost hacking around it: you're using the model for where you want to use the model, and then you're using... yeah, ensemble approaches work that way, effectively; they just have different models for different sections of the data, rather than one big bundle. So I'm sure there are lots of technical solutions there. But if reality is biased in a way that's not suitable for your business and its decision-making, then you basically have to either bend reality or bend the decisions, to ensure fairness and equity.
A
Yeah, but again, from this perspective it's actually irrelevant what the solution would be to rectifying the underlying...
D
Yeah, so even if you could synthesize things to validate it for some parts of your pipeline, or retrain it to correct the balance, at least initially, and probably regularly, you would need to be getting data into the system that is sensitive, because you wouldn't know... You could go a fair way with synthetic data, creating fake personas that match different things, but at some point that will drift from reality, and then you'll end up back where you started, with it miscategorizing under-represented people, because your synthetic personas are not filling the gaps. Right, so yeah, I guess that's...
D
Yeah, I mean, in traditional software, if you're building on a system that has a database of some sort, which pretty much everything does, then it's not uncommon for people to want to have some subset of the production data to work on. But in general people won't have a whole copy of it, or they'll have a scrubbed version of it, or it'll be a very small portion and it won't at all reflect real production. Typically, in the machine learning development and data science realm, that doesn't really cover it.
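For contrast, a minimal sketch of the kind of scrubbed subset that works for traditional software; the column names and hashing scheme are hypothetical:

```python
import csv, hashlib, itertools

PII_COLUMNS = {"name", "email", "postcode"}  # hypothetical sensitive columns

def scrubbed_subset(src_path, dst_path, n_rows=1000):
    """Copy the first n rows, replacing PII values with stable hashes."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in itertools.islice(reader, n_rows):
            for col in PII_COLUMNS & set(row):
                row[col] = hashlib.sha256(row[col].encode()).hexdigest()[:12]
            writer.writerow(row)
```

As the discussion notes, a subset like this is usually enough to exercise application code, but not representative enough to train a model on.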
D
So you're going to be handling what would be production data just to get anything done. It might be a subset of it, scaled down to get some algorithms right, but essentially it will be the real, sensitive stuff, and it'll be a lot of it too; even a small subset is still a lot more. Whereas in normal software development it would be fine to have a barely populated system, with just enough to exercise the functionality you would really work across.
A
I think that's the point that we want to get across in the technology requirement: that all MLOps systems will be expected to work with highly sensitive data, and need to take that into account at a design level. Because otherwise we're going to go into a period where there'll be multiple data breaches, and every time it's going to be somebody attacking the MLOps system to get access to the data.
D
That's a great point. I mean, yeah, DevOps-type infrastructure is sometimes the target of fairly advanced attacks, because it has its fingers in everywhere, and this just magnifies that. Because if you've got every ML engineer with a copy of this sensitive data, that's a huge attack surface: all their laptops and things like that.

D
And laptops, you know, are not as uniform and securable as a server. It's just the way it is; they're very customized. So I guess there's one... sorry, sure.
D
So the black box approach: is that to protect the IP, or is that to protect the data, or both? I guess it's the...
A
So that means you actually need to provide protections that can detect large amounts of data being fired at particular model services, because that might indicate that someone is trying to reverse-engineer your model.
D
I mean, that would be a similar pattern of attack to the other ones, like extracting data about individuals, membership inference attacks, by just throwing data at it and seeing if it does something different, yeah. So...
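A minimal sketch of the detection described above: a sliding-window counter that flags clients firing unusually many queries at a model service; the window size and threshold are hypothetical:

```python
import time
from collections import defaultdict, deque

class QueryFloodDetector:
    """Flag clients whose query rate suggests model-extraction probing."""

    def __init__(self, window_seconds=60, max_queries=500):
        self.window = window_seconds
        self.max_queries = max_queries
        self.history = defaultdict(deque)  # client_id -> recent query timestamps

    def record(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.history[client_id]
        q.append(now)
        while q and now - q[0] > self.window:  # drop timestamps outside the window
            q.popleft()
        return len(q) > self.max_queries       # True -> suspicious, warn or throttle

detector = QueryFloodDetector(window_seconds=60, max_queries=500)
if detector.record("client-42"):
    print("possible model-extraction attempt, raise an alert")
```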
D
I guess the analogy here, in, for example, web apps, would be protections against, you know, request forgery and injection attacks and things like that. There's a bunch of... there's the OWASP Top 10, or whatever it is. There are obviously a lot more ways to attack things, but that's a good, generic sort of baseline. So it's analogous to that.
A
Yeah, so you would expect there would be a need for validation-level tooling, to allow you to exercise your models against things like adversarial attacks during the integration phase of your development.
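A minimal sketch of such an integration-phase check, assuming the model is callable as a plain predict function; a real suite would use proper adversarial attacks (e.g. FGSM) rather than random noise, and the tolerance here is hypothetical:

```python
import random

def adversarial_smoke_test(predict, inputs, epsilon=0.01, trials=20, max_flips=0):
    """Fail the build if tiny input perturbations change the model's predictions.

    predict: callable mapping a list of floats to a label.
    inputs:  representative feature vectors from the integration test set.
    """
    flips = 0
    for x in inputs:
        base = predict(x)
        for _ in range(trials):
            noisy = [v + random.uniform(-epsilon, epsilon) for v in x]
            if predict(noisy) != base:
                flips += 1
                break
    assert flips <= max_flips, f"{flips} inputs flipped under +/-{epsilon} noise"
```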
E
More like a firewall protection, or more like a sandbox, where you put your model in and you run a lot of tests before production?
A
So again, using the OWASP model: you're going to have OWASP Top 10 penetration testing done during your build, but then you would also want some sort of OWASP RASP-style deployment in your application that is actively monitoring incoming traffic, and either warning about or reacting to additional attacks on the service.
D
If it follows the analogy of other software, most attacks are fairly generic; usually software is compromised by casting a very wide net, and then one out of ten thousand servers will respond with something. There's also spear-phishing and more targeted things, and social engineering (social engineering is outside the scope of this), and there might be a very specific model that someone wants to attack, in which case the generic things won't help a lot. But that's the same with websites today. It's like...
D
I had a bunch of friends that worked on the Blogger platform (I guess it's still around) at Google, and they would come under DDoS all the time, because there were political blogs on there and there were state actors wanting to attack them. Their strategy was just to make things scale so much that the DDoSes weren't DDoSes anymore. They couldn't really prevent it, but they could make it strong enough, harden it enough, that it didn't actually matter.
D
So I'm sure there are analogies here as well. Although in this case it's not so much a protection of... I mean, there might be protections around availability, although generally models are fairly efficient at runtime execution (maybe it's more the training of things), but it's more the IP protection and defending against inference attacks. So, you know... I guess... sorry, you go.
A
No, I think that's actually the point. The point is that what we're going to find is that, from a legal perspective, governments are going to force us down a route that leads to having to capture more information, and that, if we're not going to then create a big privacy problem, we're actually going to have to ensure that the implementations of these solutions are built to an appropriate standard to mitigate the risks of that happening.
A
So we have to think ahead of all of this, and plan for the situation where we end up left trying to reconcile a set of poor decisions that are imposed on us in law, and make sure that we're not creating a technical vulnerability under those circumstances.
D
Like, for example, GDPR, whilst well intended, has had an unintended side effect of strengthening the position of Facebook and Google in the face of other, more diverse competitors. That certainly was not intended, but it's one of the effects it has had, because no one else can really afford to operate in certain markets because of it, and that actually makes things worse for privacy, because it drives more data into Facebook's and Google's hands than would have gone there before.
D
So that's one example of an unintended kind of consequence of well-meaning legislation. So I guess, Terry, you're saying that there could be a similar thing here, if governments... so.
A
What we're seeing is a large number of efforts to introduce...
D
That's a really good point, because so many people talk about explainability, and I don't know what they mean by it. Are they thinking there would be something like how a human would explain something? Because how a human explains something is never really the truth: you'd have to go back to how they learned it, who taught them, and no one ever tells the truth of how they really know something.
D
So there's this false assumption that you can't explain without giving away the whole chain of things. So yeah, I think that's a great example of...
A
Well, the only way to demonstrate that from a legal perspective is to actually record the gender and the sexual orientation and the religion and the race of everyone in the data set, and then show that your model is not discriminating against any of those factors when you run data through it. So you actually have to, by law, record information that you otherwise don't need, in order to prove that you're complying with the legislation that has been put in place. You actually make everything worse by trying to make it better.
D
Well, I think I was going to look at the last one, the intrinsic protection, because that sounds interesting. The middle one, the escalation of data categories, I could have a run at and then pass it around, and we can... or maybe we need to tag each other in the document and then flesh things out there, because it sounds like there's more discussion on that. But certainly the last one we could definitely knock out pretty quickly, and then we've just got this one.
D
But other than that, I think I'm pooped, so we'll call it a day. This is really good stuff. If we need to, next time we can dig more into the escalation of data categories, because I think this is an interesting topic. I did see in the news this week that in the UK, I think, they have banned police departments from using facial recognition in crowds, which is fascinating, like basically a blanket ban. Or is that incorrectly reported?
D
In the general case it's too tempting. People want that sci-fi future; you've seen the Hollywood movies where it picks someone from the crowd. But I did see that in the news, and yeah, I thought that was interesting.
D
All right, well, yeah, thanks everyone, and good conversation as always: some good notes, and some good stuff to look at next time. Then we can sort of go on to the final bit of the road mapping, filling in squares of when things can happen and what might be out there. All right, keep the ideas flowing, and chat next time.