CDF SIG MLOps Meeting - 2021-04-22

A

Hello.

B

Hello.

A

Welcome.

B

Thank you, you too. Hi, how are you?

A

Yeah, fine thanks. I do like your backdrop.

B

Thank you. Yeah, I've been missing home a bit, you know, the cherry blossom season. I'm from Tokyo originally.

A

All right, beautiful.

B

And, wow, is that your background, or is it the real background?

A

It's real, it's real.

B

Okay, that's a great background by the way.

A

So we'll just chat for a few minutes to see if anyone else is going to join today. There's quite a lot going on this week, so I'm not sure how many people we're going to get in the session today. Have you had a chance to take a look at the roadmap document?

B

Yeah, I did have a look, the 2021 draft version. There's quite a lot to digest, but mainly I wanted to get to know what you do in the meetings. I think there's a meeting twice a month, every other week on Thursdays.

A

So for the roadmap meetings we've switched to once a month at the moment, just because we're fairly early in the year.

A

We publish towards the end of the year.

B

um

A

And then we aim to do an update as we go through the year. So right now,

B

We're.

A

We're still relatively up to date, and there's not much going on in terms of new changes.

B

um

A

But as we ramp into trying to add new stuff to the document and generally tidy things up, we may go back into the twice-a-month schedule again.

B

I see okay.

A

So we're completely collaborative here, so

C

We.

A

we welcome people coming along, or, you know, contributing to the document.

B

You know.

A

Generally, having input. But right now the focus really has been on trying to capture a record of what we think the challenges really are in the space, and then start to look at what needs to be done to manage that longer term.

B

Yeah.

A

Obviously we're in a very new space and everyone's making it up as they go along, but that will mature, and we want to try and make sure that it matures into something that is going to be viable in the medium to long term. Otherwise, we're going to have to get a lot of people to back up and go in a different direction.

B

And.

A

It can take decades to get that to happen.

A

A lot of this is about trying to get in front of it, understanding where the whole discipline needs to be, and what tooling we might need, those sorts of challenges. Hi there.

B

Hello.

C

Hi, how are you guys? Good to see you.

A

Good to see you too. We've just been chatting over the general role of the roadmap and doing a bit

C

Of an introduction.

A

So let me just post a link in the chat.

A

That's our agenda document. If you could just sign in on the document, you'll see near the top of the agenda there's this month's session and a list of attendees. It just gives me a feel for who's involved and what input we're

A

Getting.

A

So, just to get everybody up to date in terms of conferences:

A

We've got cdCon coming up in about a month's time.

A

We are probably going to be doing an MLOps SIG birds-of-a-feather session, which will just be a sort of joint feedback session where everybody can get involved, ask questions and share information.

C

Terry, there are two conferences, two places where we can maybe participate. The first one is the MLOps World conference.

C

Maybe we can do a quick presentation there, I think.

A

Yeah.

C

So I've.

A

I've got a list of upcoming conferences that I need to apply to as well, so I'm going to fire off a bunch of proposals for the latest ones.

A

We've also got a slot at the DevOps Summit in

A

November, so we're starting to get a few opportunities in there to communicate.

B

That's great.

A

Now, if anyone wants to get involved in the cdCon session, then I can include you in the...

C

I'd love to help. I think I can help create some slides to help you sharpen it.

A

It's not really going to be a presentation, it's more of a panel chat, so it would be a matter of being able to attend in real time, and then we just have a sort of off-the-cuff discussion about some of the work that we're doing, hopefully prompt some questions from the attendees, and then get into more of a discussion.

A

Is there a date? There's a broad date for the two days of the conference, which I can share with you. I'm still waiting to find out what time slot we might have within that. I've asked for a sort of mid-to-late afternoon, UK time, because that seems to map reasonably well to other time zones.

A

As soon as I know what the slot is I'll let you know, and then we can check and see if you're available.

A

All right, so.

A

Other than that, probably the big news this week is the leaking of the proposed EU regulations on AI.

A

I don't know if anyone's actually had a look at the document.

B

No, I haven't.

A

It's quite concerning. It really would not be possible to be compliant with the regulations that they're proposing. It's

C

It's it's.

A

Built on a fundamental lack of understanding of how machine learning works.

A

And so there are lots of things in there which you could never hope to practically comply with.

A

So, for example, there's a requirement for everyone working with the software to be trained to understand exactly how the system makes decisions.

A

So for deep learning, you're expected to be able to know and understand a neural network, which is clearly not possible.

C

My take on that is basically that there are three points related to this. The first is, I think that we, as a subgroup of the CD Foundation and the Linux Foundation, should try and participate in some official meetings related to the regulation, as experts in the field, and give our expertise over there. I think it would also be good for public relations.

C

That's A. B, I don't see any way this can be anywhere close to final, because, as you said, there are a lot of concerning problems with this document. C, I think it will in some way push companies to adopt machine learning tools or platforms, because otherwise it will be hugely expensive to implement, you know, tracking data and managing data. So maybe we can try, you know, to juice that lemon.

A

So, responding to your first point, I will raise that with the CDF committee to get their feeling on whether it's appropriate for us to get involved at that sort of level.

A

Certainly, I think that's something that needs to be discussed with the TOC, and then I'll feed back to you.

A

I think the approach will be interesting on this one, because the drivers for this regulation may not be entirely what we expect. This regulation may be more focused on trying to deliberately block American and Asian companies from being able to compete in Europe, rather than explicitly doing what it says on the tin, and if that's the case, there will be a rush to push it through, and then there will be a lot of publicity about

A

the EU being a world leader in AI regulation.

A

In practice, a lot of what's in this document is very similar to what went into the GDPR regulations, and again, there's a lot of GDPR which is not actually practical to implement, especially

D

for small businesses.

A

Yeah. Oh, welcome, I see we've got Artem joining us; feel free to join in the conversation. We're very friendly, we don't bite.

B

I'm a first-timer as well. So, in terms of our concerns as the MLOps special interest group about these kinds of regulations,

B

is there a channel of communication that is already established, or how could we feed our concerns back?

B

Is there a way we can express them in a more direct manner, rather than just as a kind of, you know, "here are our concerns and recommendations"?

A

So, to this point, the group has really been focused on a more technical audience.

A

The idea really has been to try and capture the nature of the problem that we face when we start to work with machine learning solutions.

A

And then, if you like, document an end-to-end requirement set which we can communicate to anyone who's working on building solutions in that space, so that they're up to speed with the full scope of what the problem domain looks like, rather than getting narrowly focused on the bit of the problem that they can see right now and then potentially getting drawn into a blind alley because they were unaware of something that was about to come and bite them.

A

Now.

A

The idea was that that would be very beneficial in terms of supporting teams who are building tooling for DevOps and MLOps, because it would help to steer people away from a number of blind alleys which exist in the current domain.

A

That has built an asset which is quite useful in general terms for understanding some of the problems associated with machine learning, but it does still assume quite a technical understanding of a lot of the issues involved.

A

It's not an ideal artifact for communicating with a non-technical audience in a way that will be easily understood. So it would be quite a lot of work for us to shape that into

C

Into.

A

something that would be better suited to having the political conversations. That's not to say it isn't something we could do collaboratively.

A

I imagine that there will be a lot of conversations going on between the larger players in the space, because clearly this is going to exclude all of the digital businesses that the EU has been trying to encourage over the past few years.

A

So that's obviously going to cause a lot of concern for a lot of people.

A

But again, it depends whether this is a genuine attempt to provide consumer protection, or if it's more of a political stick which is intended to create trade barriers and then become a negotiating point in the future.

B

Yeah yeah, that makes sense. Thank you.

A

So.

A

Other things that we should probably have a chat about:

A

In general, our focus this year is on communication, because we've now got this asset which we feel is quite valuable in terms of painting a picture of

A

What's going to happen when you go down the path of trying to build a machine learning solution and what the pitfalls are and where there are gaps in terms of being able to solve some of those problems today.

A

But it's also quite clear that most people working in the space are not actually fully aware of all of these challenges right now.

A

I don't know if anyone else is attending the ML conference that's going on at the moment, the one with Tecton and Feast, but

C

There's been.

A

a lot of talks there where people have been proposing solutions, and actually we already know that the solutions they're proposing don't work, because of the requirements that we've gathered in the roadmap.

A

So it's a little bit concerning, in that there's a lot of work going on worldwide at the moment, and lots of people are dedicating a lot of time to doing things, and they still don't know what they don't know in terms of what problems they're going to run into when they try to implement them. It's quite important for us to find as many ways as possible to get the roadmap circulated more widely this year.

C

Can you give us an example of what you're talking about? The Feast implementation? Are you talking about something maybe more specific, like pushing to promote Feast and Tecton?

D

Marketing conference.

A

So, for example, there was a discussion yesterday about what a third generation of machine learning systems might look like, and really what it was describing was some of the pipeline-based machine learning solutions that were already built last year, using some of the practices that we've put in the roadmap.

A

So people are investing time in doing something that there's actually already a good generic solution out there for, which they can just download and use as open source. There is definitely a big communication problem in the space at the moment, in making sure everybody has access to a full picture of what the problem space looks like.

A

And then we also face the problem of how we keep that picture up to date. How do we make sure that we are capturing new problems as they're discovered and factoring them into the roadmap, so that other people can also understand what those challenges are?

D

Okay.

C

That's interesting. Maybe.

C

I'm not sure what the way is to, you know, try to solve them. If you're referring to the concept of feature stores, I think it's only a partial solution.

A

So yeah, obviously that particular event will be heavily focused on feature stores, because that's the nature of that event, but I think there's a bigger picture here. How people see the problem depends very much on which discipline they've come from, so people who've come from a data science background tend to see machine learning as a database problem, and they tend to build solutions that work like database

C

servers. To be honest, I don't think that's the data science people; it's more the engineers' vision over there, because the guys at Tecton, Mike and, I forget the name of the other one, the CTO, they come from engineering.

C

However, I think we maybe want to answer what the problem is, because machine learning operations and infrastructure, that's way too big. It's not one problem but many problems. You have monitoring,

C

you have feature management, you have data processing, you have model training, and then you have operationalizing all of this. So I think the question is: what are you trying to solve, and what is considered to be part of the MLOps problem?

A

So I think that's actually where I'm seeing the biggest gap at the moment, because the actual problem is that you are trying to build a product that includes some machine learning capability.

C

The.

A

thing you're trying to build is the product, not the machine learning, right? So if you're going to succeed commercially, the focus has always got to be on the product and how you manage and maintain that asset across its life cycle; that's the key thing. And the reality is that when you look at one of these products, it will be promoted as an AI product.

A

But when you break down the overall product itself, about five percent of the overall effort goes into the machine learning bit and the other 95 percent is in managing the rest of the product, and that includes a lot of conventional software assets, because a model on its own can't talk to anything.

A

So your product is going to have a user interface, it's going to have integrations, it's going to be connected to things, it's going to be managing data, and so machine learning assets never exist in isolation in the real world; they're always just another asset as part of an overarching product.

A

So the path that a lot of companies have gone down in the ML space at the moment is to treat the machine learning assets as if they exist in isolation and deploy them as an atomic unit through a dedicated system that only does machine learning, and then that leads to a situation where you have a lot of challenges

A

trying to cost-effectively manage your overall asset, because you've got one big chunk of it, which will be all of the web-facing stuff, the UI-facing stuff, the customer-facing stuff and the integration stuff, which can all be deployed with DevOps and will be part of a CI/CD system. And you

C

Know.

A

You can work on a very fast cadence, doing a release every few hours if you need to, and then you've got this huge monolith, which is your machine learning stuff, which acts a lot more like a single database server instance rather than a distributed, component-based model like the rest of your architecture.

A

So I think a lot of companies are struggling when it comes to getting their models into production, because the paradigms that are being used to manage the machine learning assets are not a good fit for the production world.

C

So, I don't know if you had time to read the article that I published a few days ago, but I totally agree with what you said. I think that there is a very big problem in deploying models to production, and that's because when you do want to deploy models to production, you can't just, you know, click a button and it's in production.

C

You need to work closely with different roles in the organization, and you need to build this productization overhead together. So let's say I'm an engineer and you're a data scientist. Now you need me to have time to sit with you to understand your problem, to have multiple meetings, to plan a new design for productization, to write some code, to do QA,

C

to realize that, hey, I didn't understand how you defined age, and age is an integer, not a double, and then I need to re-implement it. All of this cycle is very complex, because there is no single ownership of the productization, or of the product. And the way I see it, in order to solve that,

C

we need somehow to mediate between these two layers, and the technical solution I propose is to split the initiative into model development and data development, in a similar manner to frontend and backend, you know. So there will be a tight relationship between these two processes, but they are two different processes.

D

So.

A

I think there are multiple architectural ways that you can decompose the problem, but I think what's significant here is that this is not a new problem.

A

This is actually something that we went through in the traditional software space some time ago, and the solution to that problem was DevOps as a methodology.

A

So really, DevOps is driven around the idea...

C

I think that DevOps solved only part of the big problem. The big problem is: how do you build applications fast?

C

So there were two solutions: one was to split the frontend and the backend, and the second solution was DevOps, which is how to operate, or how to operationalize, these products, right?

A

Yeah, so let's split out the technical architecture from the conceptual approach to managing an asset, because those are actually different things.

C

Agreed. I think there is MLOps and there is AI infrastructure, and nowadays people just call everything MLOps or call everything ML infrastructure, and there is no differentiation between these two terminologies.

A

Yeah, well, I think there's actually a deeper problem here, in that right now the phrase MLOps is being used to describe something which is unrelated to DevOps, and that is misleading people. Because, you know, within the CDF,

A

we strongly believe that MLOps needs to be an extension of DevOps if it's actually going to deliver the value that we expect from it. So DevOps is, on the surface, a bunch of practices, but those practices all exist for a very explicit reason.

A

But those reasons are hard-won knowledge from people making mistakes in production environments and losing their companies as a result.

A

And, of course, a lot of the machine learning projects haven't got that far yet, so there's a blind spot in terms of understanding why DevOps exists and what its driving principles are. So we've got people who are trying to develop

A

ways of working within MLOps without understanding what the purpose of DevOps was in the first place. So we've got a communication role there, which very much falls to us, to say: look, DevOps exists because of these problems that you will face when you try to manage your asset in a commercial sense, and DevOps as a practice

A

does this, this and this,

A

because if you don't do those things, your costs will spiral out of control and you will eventually lose your asset. And machine learning just adds a new class of asset into that same problem space, which you are also playing

C

In.

A

So you need to be doing DevOps and MLOps, rather than either DevOps or MLOps, which I think is the situation we're in right now in a lot of cases.

A

So we're currently working on a best practice guide within the CDF, which spells out a lot of the fundamental drivers for DevOps, and hopefully that will help to communicate what some of these challenges are.

C

If I can extend what you just said, I 100 percent, actually 1000 percent, agree with you. So MLOps should be the extension of DevOps to handle ML-oriented problems.

C

However, that might not be enough, because if we take a look at DevOps, let's say you need to deploy a complex application where a user uploads an image and the image should be uploaded to a server. If you need to manage that yourself, from a DevOps perspective that's very complex, because you need to handle persistent storage and all of this stuff.

C

However, let's say you put S3 in there. That reduces the complexity of the DevOps, because you've used an infrastructure tool that solves all of the persistent storage problems for you as an operational function.

C

So I think that in order to overcome the challenges of deploying machine learning, or AI in general, we need to solve these two problems: how to simplify the technical solution, and how to simplify the way we operationalize these new technologies.

A

Yeah, and this is where I think we have to be quite careful, because many of the proposed simplifications actually come as a result of discarding certain things and saying, right, we'll simplify by not doing this stuff. But the stuff that's being discarded is actually essential to addressing the overall problem domain, and so you get over-simplifications

A

that then paint you into a corner, and you get a product that can do certain things really easily but can't actually do all that it needs to do in order to be a viable product. That's pretty much a good description of where the MLOps tooling market is right now, in that a lot of people have made assumptions about what they can simplify out and have built tools to optimize for those scenarios.

A

But as a result, we don't have a good generic tool that allows us to cope with all of the problems that actually exist within the domain.

A

So this is where there's a lot of hidden complexity, and what we need to do is provide ways to consistently manage the complexity so people can understand it. We can't magically make the complexity go away, so instead of pretending it doesn't exist, what we have to do is build tools that label it and standardize it so it becomes easier to understand. In many ways we're looking at what's effectively a parallel to containerization in the real world.

A

We went from deploying things onto physical computers, which works, but it wastes a lot of resources, and every computer you deploy gets configured slightly differently, so there's no consistency and there's a lot of complexity.

A

So then we went, oh well, actually we're not using most of the resources on these computers that we've already got, so why don't we split them down into virtual machines and deploy into the virtual machines so that they can share the physical resources of one machine? That gets you a bit more compute efficiency.

A

But then everyone was still building their VMs by hand and doing it differently, so every VM running on the machine was a bit different. That made the VMs hard to manage, because you never quite knew how each one had been set up and how you needed to maintain it. So then we went to containerization, which says, right,

A

let's just create this idea of a completely virtual set of computing resources which spreads across lots of physical machines, but which all has the same configuration and the same way of setting things up. And then it doesn't matter how many resources you need;

A

somebody will plug some more hardware in at the back end and your application will just spread onto that new hardware. So that gave you a level of abstraction that made it conceptually simpler to work with, but at the same time it created this massive amount of complexity within the infrastructure, where you actually have to be able to manage all of these things in a consistent way using consistent tooling, and so the complexity still exists.

A

But it's buried in an abstraction layer where only a limited number of team members actually need to engage with it.

A

So this is...

A

This is where we're sitting with a lot of the ML problems right now. Yes, you can build bespoke solutions for doing MLOps to solve your problem in the short term, but if we want to be able to do MLOps cost-effectively, like we do with other types of software, then we need a lot of standardized tooling that all plugs together in a consistent way, takes the complexity away from some teams and buries it in a separate layer, so that we're not all getting dragged into all of the complexity.

A

But at the same time we have the flexibility to manage all of the problems that exist within the problem domain itself.

B

Yeah um that.

A

Actually, um oh sorry,.

B

Sorry, go, go. Oh yeah, okay, I'll be quick. So that's quite an interesting point, and I was reading the roadmap as well,

B

around how MLOps has to be a discipline that is language, framework, platform and infrastructure agnostic, and that's kind of interesting. And on the point that you raised, Terry, about how to increase exposure to this MLOps roadmap: what MLOps has to be kind of points towards things like Kubernetes, which I think is quite well aligned, in that it is quite an agnostic platform for allowing deployment of containerized applications. So is there scope for this group to either endorse or collaborate with technologies such as Kubernetes, in a way where we recommend them, and then vice versa?

B

So on the Kubernetes side, if they agree with our objectives and recommendations, they could point users towards these recommendations, or some kind of mention of the work that's been put into this group.

A

Yeah, so that's a good question. What we're trying to do is communicate with teams who are building solutions in this space and get them to understand the implications of the problems that are in the roadmap, so that they start to come up with technical solutions that actually address the full scope of the problem domain rather than just part of it. Now, Kubernetes is a potential approach to solving some of these problems.

A

With my other hat on, I lead the MLOps work within the Jenkins X CI/CD solution, and we have an MLOps component which actually does use Kubernetes to do all of the machine learning training and deployment. So yeah, there are some solutions out there already that are going down that path, and they don't actually need to be directly supported by Kubernetes.

A

The Jenkins X solution is leveraging the other Tekton, with a k rather than a c, which is a standard pipeline component for Kubernetes.

A

And so we just turn machine learning assets into things that can align with a standard Tekton pipeline, and then we just use standard build capabilities to distribute the training and to run the model inferencing.
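
To make the idea concrete, here is a minimal sketch, in Python, of what treating a training run as just another pipeline step with explicit inputs and outputs might look like. The script name, flags and the toy "model" are illustrative assumptions, not the actual Jenkins X or Tekton integration.

```python
# train_step.py - hypothetical sketch; not the actual Jenkins X / Tekton integration.
# The idea: the training code is just another build artifact with explicit inputs
# and outputs, so a generic pipeline step (Tekton task, CI job, etc.) can run it.
import argparse
import json
import statistics
from pathlib import Path


def train(samples):
    """Stand-in 'model': predict the mean of the training values."""
    return {"type": "mean-predictor", "mean": statistics.mean(samples)}


def main():
    parser = argparse.ArgumentParser(description="Containerizable training step")
    parser.add_argument("--data", required=True, help="Path to the pinned training data (one float per line)")
    parser.add_argument("--out", required=True, help="Directory to write the model artifact into")
    args = parser.parse_args()

    samples = [float(line) for line in Path(args.data).read_text().splitlines() if line.strip()]
    model = train(samples)

    out_dir = Path(args.out)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "model.json").write_text(json.dumps(model, indent=2))
    # Emit metadata the rest of the pipeline can pick up (versioning, promotion, etc.).
    (out_dir / "metadata.json").write_text(json.dumps({"n_samples": len(samples)}, indent=2))


if __name__ == "__main__":
    main()
```

Because the step only depends on its declared inputs and outputs, any pipeline engine that can run a container and mount a workspace could schedule it alongside conventional build steps.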

A

So that work is going on, but I think we've still got a bit of a void between the teams who are build-focused and the teams who are ML-focused, and there's still work to be done to get everybody understanding the full scope of the problems that they're facing.

B

Yep.

A

So, Artem, back to you.

E

Yeah, I had a maybe stupid question, because I'm partially not in the context of the SIG MLOps roadmap, so sorry for that if it is.

A

There are no stupid questions, there are just good questions.

E

All right, so my question is from the perspective of infrastructure, and as you were telling the history of the progress from virtual machines to containers: how are ML and MLOps so different from the already-solved problems of software engineering? Is it the large amount of data? Because from my point of view, most of the complexity is concentrated in the ML layer, not the infrastructure layer.

A

So that's actually one of the most important questions in MLOps.

A

You're bang on the money there with that one. Broadly speaking, CI/CD to date has been focused on easily packageable assets that can be lumped together and deployed atomically onto a system.

A

So you've got some code, you build the code, you test it, you deploy it somewhere, you run some integration tests on it and then you switch it live. That's a solved problem, and there are lots of good solutions out there in that space.

A

What's different for MLOps is that in order to do a training run, you need a training script, which is effectively a software asset that you can deploy in the same way as with CI/CD, but that training script will be tiny.

A

But then you need four petabytes of training data, which you also need in the system.

A

Now, under the current model, you basically have a big database, effectively, which has got all of your data in it, and then you point the training script at the database, run it, and it spits out a model.

A

But the problem is that you don't have any versioning over the data that you used for training your model. So if that data is, for example,

A

Data that's being collected by autonomous vehicles that are in the field, then that pool of data is being updated.

A

You know by petabytes a day potentially, and it's constantly changing, so you can't re-run the same training twice and get the same result.

A

if you've got an uncontrolled set of training data that you're training against.

A

Now, that doesn't stop you from coming up with good models, but it will stop you from coming up with compliant models, because what the legislation is going to say is that you have to deploy a model which has full traceability back to all of the original source data.

A

If you change anything, you have to be able to run a regression test to prove that your model still solves the problem that the original model solved, and you have to be able to allow every person whose data exists in your pool of data to object and have their information removed, not just from that data set, but from any model that was trained using that data set.

A

So this is how bad the actual problem is going to be compared to what we think we're solving right now, in that we need a completely reversible chain of custody

C

Over.

A

over all of the data that we're using to produce our models, and when we deploy something into production, we need to know exactly which set of data was used to train that model, and we potentially need to be able to recreate that exact training set in the future.
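
As an illustration of the kind of chain of custody being described, here is a minimal Python sketch that pins a training run to a content-addressed snapshot of its training data. The manifest format and field names are hypothetical, not part of the roadmap itself.

```python
# Hypothetical sketch: pin a training run to an immutable, content-addressed data
# snapshot so the exact training set can be identified (and in principle recreated) later.
import hashlib
import json
import time
from pathlib import Path


def file_sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot_manifest(data_dir: str) -> dict:
    """Hash every file in the training set so the snapshot is uniquely identified."""
    files = sorted(p for p in Path(data_dir).rglob("*") if p.is_file())
    entries = {str(p.relative_to(data_dir)): file_sha256(p) for p in files}
    snapshot_id = hashlib.sha256(json.dumps(entries, sort_keys=True).encode()).hexdigest()
    return {"snapshot_id": snapshot_id, "files": entries}


def record_training_run(data_dir: str, model_path: str, out: str = "training_manifest.json") -> dict:
    """Write a provenance record linking the produced model to the exact data it saw."""
    manifest = {
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "data": snapshot_manifest(data_dir),
        "model_sha256": file_sha256(Path(model_path)),
    }
    Path(out).write_text(json.dumps(manifest, indent=2))
    return manifest
```

The manifest is the piece that would let an auditor, or a regression test, ask "which data produced this model?" long after the training run has finished.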

A

I mean imagine: you've launched a product, you've put it out there in the world and 10 people died.

A

um You know, there's going to be an investigation, and at least one court case.

A

And the onus is going to be on you to prove that you did everything to the best of your ability to build a safe system, so you're going to need to be able to demonstrate exactly how you built the system and what data was used in that process.

E

But as far as I understand, this still relates to the problem of managing resources. So imagine we have a super-huge, super-flexible database that allows us to fix, let's say, immutable slices, commits of data, subsets of data, train on them, version them, change them, maybe delete some rows from them, delete a user's data, trace back. If we had this fully managed data solution, then, first question, is the problem solved? And the second question is: are there any other infrastructure or

E

Kubernetes-related problems at this level?

A

So yes, that's one of the technical solutions that we suggest is needed in the roadmap. Partial bits of that solution exist today, but there's nothing you can go to off the shelf and just say, right, here is a data lake that is completely versionable and that integrates into a CI/CD system in such a way that you've got full traceability.

A

So that's a product that somebody needs to build, and it is a critical dependency for deploying any high-risk AI system within Europe, as an example.
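
For a flavour of what such a versioned data store would need to expose, here is a minimal Python sketch in which every change, including a right-to-erasure request, produces a new immutable snapshot. The class and method names are illustrative assumptions, not an existing product or a proposed standard.

```python
# Hypothetical interface sketch for a versioned data store: every change yields a new
# immutable snapshot, so training runs can reference an exact version, and erasing a
# user's data creates a new traceable version rather than silently rewriting history.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Snapshot:
    version: int
    rows: Dict[str, dict] = field(default_factory=dict)  # row_id -> record


class VersionedStore:
    def __init__(self):
        self._snapshots: List[Snapshot] = [Snapshot(version=0)]

    def commit(self, updates: Dict[str, dict]) -> int:
        """Create a new snapshot with the given rows added or replaced."""
        head = self._snapshots[-1]
        new = Snapshot(version=head.version + 1, rows={**head.rows, **updates})
        self._snapshots.append(new)
        return new.version

    def erase_user(self, user_id: str) -> int:
        """Right-to-erasure: drop a user's rows, producing a new version.
        Models trained on earlier versions can then be flagged for retraining."""
        head = self._snapshots[-1]
        kept = {k: v for k, v in head.rows.items() if v.get("user_id") != user_id}
        new = Snapshot(version=head.version + 1, rows=kept)
        self._snapshots.append(new)
        return new.version

    def read(self, version: int) -> Dict[str, dict]:
        """Read an exact historical snapshot, e.g. to reproduce a training run."""
        return dict(self._snapshots[version].rows)
```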

A

So yeah, that's the flip side of the roadmap: it's actually a long list of problems that people need to be working on, which will have high value, because if you build one of these things and sell it, you'll make a lot of money, because there's a captive market.

A

This has to exist for people to be able to sell AI products.

A

So one of the selling points of the roadmap really is that this is

A

one way of working out what to work on next, because we're making some pretty strong predictions about what the market is going to look like over the next five years and where there are commercial opportunities to be had.

A

And in answer to the second part of your question, yeah, there are other aspects to this that will need to be solved. There are lots of governance challenges that sit within this space, and the expectations of regulators are a long way away from what's practically possible right now with the tools that we have.

A

So there will be lots of things that we could insert into the release pipeline which would allow us to automate a bunch of the governance problems, but solutions don't yet exist for them.

A

So, for example, whenever you train a model, you want to be able to do bias testing on that model.

A

Now, there are some tools out there for building your own bias testing approaches, but really what we need is a standard component that says: these are the types of bias that we expect to encounter in the customer domain that we're working in; apply those generically to each model we produce and give us a score for that model, and then you just report those scores as part of the quality metrics for the models you're generating.
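
As a sketch of what such a standard bias-scoring component might report, here is a minimal Python example using demographic parity difference as the metric. The function names and the pass threshold are illustrative assumptions, not a reference to any particular tool.

```python
# Hypothetical sketch of a reusable bias-scoring step that a pipeline could run
# after every training job and report alongside other quality metrics.
from collections import defaultdict
from typing import Dict, List, Tuple


def demographic_parity_gap(predictions: List[int], groups: List[str]) -> Tuple[float, Dict[str, float]]:
    """Difference between the highest and lowest positive-prediction rate across groups.
    0.0 means every group receives positive predictions at the same rate."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates


def bias_report(predictions: List[int], groups: List[str], max_gap: float = 0.1) -> dict:
    """Package the score as a quality metric the pipeline can publish or gate on."""
    gap, rates = demographic_parity_gap(predictions, groups)
    return {"metric": "demographic_parity_gap", "score": gap,
            "per_group_rate": rates, "passed": gap <= max_gap}


if __name__ == "__main__":
    # Example: the pipeline would call this with the model's predictions on a held-out set.
    preds = [1, 0, 1, 1, 0, 0, 1, 0]
    grps = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(bias_report(preds, grps))
```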

A

So all of these are effectively individual products that could come from independent vendors, but if they're built in a standard way, they could easily be integrated into anyone's overall pipeline to allow them to improve the quality of what they're building.

E

Can I have one more question, please? (Yeah, of course, carry on.) What do you mean by "a standard way"? Is it that the standard comes first and then the implementation, or is it being easily pluggable into some widespread platform such as Kubernetes, for example?

A

Really, what we're looking at is collaborative standards for pluggability, rather than formal standards of "this is how you shall work". So the idea is that we want to encourage a level playing field, but with lots of opportunity for lots of vendors.

A

So if you look at some of the CDF projects, you'll find that there are actually multiple CDF projects which are effectively competitors to each other, because they do the same thing, but we have an overarching interoperability group that negotiates some standards for the way we name things and the way we represent them.

A

So where there's commonality, those products actually become interchangeable, and the customer gets to decide which one they want to use in their infrastructure. We're trying to avoid too much proprietary lock-in. Instead, we're saying, look, here's a whole suite of tools.

A

Each of these solves a particular problem in a different way, but they're mostly interoperable in the broader sense that you can have a pipeline for deployment and you can plug these things into it.

A

So in that way the customer is free to build something that meets their needs, and there's plenty of room for everybody to compete on quality and features.
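
To illustrate the pluggability idea, here is a minimal Python sketch of a shared step interface that two hypothetical vendor tools both implement, so a pipeline can swap one for the other. The interface and vendor names are invented for illustration; this is not a CDF standard.

```python
# Hypothetical sketch of a shared "pipeline step" contract. If competing tools agree on
# names and contracts like this, a pipeline can swap one vendor's implementation for
# another's without being rebuilt.
from abc import ABC, abstractmethod
from typing import Dict, List


class PipelineStep(ABC):
    @abstractmethod
    def run(self, context: Dict) -> Dict:
        """Take the shared pipeline context, return an updated copy."""


class VendorABiasCheck(PipelineStep):
    def run(self, context: Dict) -> Dict:
        context = dict(context)
        context["bias_score"] = 0.05  # placeholder for vendor A's real analysis
        return context


class VendorBBiasCheck(PipelineStep):
    def run(self, context: Dict) -> Dict:
        context = dict(context)
        context["bias_score"] = 0.07  # placeholder for vendor B's real analysis
        return context


def run_pipeline(steps: List[PipelineStep], context: Dict) -> Dict:
    for step in steps:
        context = step.run(context)
    return context


# Because both vendors implement the same interface, either can be plugged in:
print(run_pipeline([VendorABiasCheck()], {"model": "m-1"}))
print(run_pipeline([VendorBBiasCheck()], {"model": "m-1"}))
```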

C

Thanks. All right guys, I have to cut off. Terry, I guess I'll see you at the second meeting.

A

Yeah, I wouldn't worry about that one too much; nobody's been attending it for months.

C

Michael wanted to join, but then he said he couldn't, right?

A

Oh no, that was this session.

A

All right, is

E

he going? He's

A

in Australia, so this is his night time.

E

I didn't get that. Is it a weekly meeting or a monthly one?

A

This lot is monthly.

A

There's another US time zone slot, which is later on today, but that one was being used for another purpose and has wound down, so it doesn't have much attendance right now.

A

If we get more collaborators coming in from that time zone, then I'll continue to run both sessions, but right now this is the one that's driving most of the work.

C

All right, so I guess I'll see you next time.

A

Okay, well, thanks everyone, it's been great to have your involvement, and I hope to see you again. Thank

B

you. We'll see you soon. Happy

E

to see you next time.

A

Feel free to reach out to me offline with any questions or anything you want to contribute. Just send me a message, and join the Slack.

B

Okay, bye-bye. Great, thank

A

You.

B

Bye.