Cloud Native Computing Foundation Network Service Mesh, 15 Jun 2018

From YouTube: Network Service Mesh WG - 2018-06-15

A

Okay, let's go ahead and get started. Item one on the agenda, as always, is agenda bashing. If anyone has anything to add to the agenda, now is the best time to do so.

B

[Name unclear] speaking here. I'm new to the group. Thank you for inviting me on short notice; I'm happy to join. The idea was to have five to ten minutes about a use case we are working on, [inaudible] network service mesh, if anybody is interested.

A

So we can create a spot under use case mapping for you to talk about it.

C

I think that was already added, right? I think I put that on there.

B

It was a hack, just four minutes ago, before the call, to add the use case to the use case document. Obviously it's just sketched in there, but if we want, we can walk through this use case on the call here and talk about what we are thinking, or wait on it.

C

That would be good, I think.

D

That would be good yeah.

D

And of course, last week we thought we would probably have a detailed discussion on the use case document, so I think this comes in handy. It would be great for you to present it here, and I'll talk about it too.

C

Prem, I agree. I was hoping that we might spend the majority of this meeting on the use case stuff.

A

Yeah, I personally think that, right now, the most important thing we can focus on in network service mesh is working out what those use cases are, because they really will drive the development. So I'm happy to devote the majority of the time to the use cases and the use case discussion.

A

Okay, so is there anything else that needs to be on the agenda, or should we continue on?

E

I'm good everyone else is good, nothing after.

A

I think silence is okay. So the next thing: for those who are attending the Open Source Summit at the end of August in Vancouver, there is a cloud native network function seminar that is going to be held the day before the summit starts. The summit itself is on Wednesday through Friday, August 29th through 31st.

A

They hold a couple of mini sessions, I guess you could call them, or workshops, on the Monday and Tuesday before, and the seminar is on that Tuesday, so feel free to join in. I believe you have to register for it when you register for your Open Source Summit pass. One of the topics is going to be network service mesh, or at least there'll be some discussion of it.

A

So I'm trying to understand what this is on the agenda: AIs or ALs review, one or the other.

C

Yeah, AIs. So for those first two, I actually opened up issues. I looked this morning and it wasn't clear who was going to do that, so I just did it, and I put links to them there too.

E

To follow up, I have not gotten to the AI assigned to me from last week about getting the namespace from inside the pod. I will go ahead and get that up there on the wiki. I just haven't had a chance to do it this week.

A

It's so much better with it saying action items.

E

Given that we do have an Al on the call, I do feel [inaudible].

A

I apologize. Right, so anyway, on to the action items.

A

So, at least as far as I know, nobody has gotten to verifying the ability to change the [unclear] driver. There certainly have been no changes on it in the past two hours. The issue was created, so I'm pretty sure someone will get to that. If anyone wants to take ownership, by all means do so.

A

Actually, it would probably be a good idea: if anyone wants to take ownership, add yourself on the agenda and we'll assign it to you on GitHub as well.

E

To be clear, this is an open-source community. We all get that all the action items you take are aspirations, so don't feel like you're signing in blood when you sign up for a thing. Just do it if you can, and if you can't, let folks know it's up for grabs again.

A

I'm perfectly fine with that. In fact, don't be surprised if you take an action item and then someone else completes it on your behalf.

E

That's also totally cool so.

A

That probably won't happen for the more complex ones, only for the simple ones. Okay, so: opening a GitHub issue to verify container runtimes. That issue was created. Then Ed, to document in the wiki getting the namespace from inside the pod. I don't recall seeing anything on the wiki yesterday.

E

That's what I said just a few minutes ago: I have not gotten that done yet, and I apologize. I will attempt to get to that. I've got a bunch of things backed up for this afternoon that people have been asking for around this stuff, so yeah, definitely.

A

So we'll circle around to that later on, no problem. Next: Prem sent a poll off to the mailing list.

D

So I can quickly give an update on the current status, for the benefit of all. Until now we have 17 responses, out of which seven said no. The remaining 10 people have said yes, which means, from a majority perspective, it is still leaning towards the current time slot.

D

So I'll probably send out an email to announce that.

E

We should probably talk about that a little bit more, because I think the whole point of getting a yes and no was not to decide on the time slot per se, but to see whether we actually have a problem. And I would tend to feel that 7 out of 17 responses saying there is a problem probably means there's a problem.

E

Does that match how other people are reading the situation?

F

Yeah, I would say 7 out of 17 is a big enough fraction to be concerned about.

D

Mr. party person get.

E

To get concrete: I know you'd made a comment, Frederick, about people not getting to weigh in on times if they said they were okay with the time slot. Do we want to just run a quick Doodle poll for a new time slot? We could include the current time slot on that Doodle poll, so that we can get a sense of where everything stands. Yeah, that's what I thought we were going to do. That would make the most sense.

A

I feel like we're losing information with that. Personally, I'm open to this time, but I'm open to other times as well, and that is what we lose.

F

So I think some kind of weighted voting would be best, like give everyone 3 votes or something. That's the way this kind of problem is usually solved.

A

Doodle will do that, but...

E

Doodle definitely does that for us, effectively. It lets everybody mark the times they can make, and then we figure out what's the best solution for the group. I guess it should be weighted, since for some people some times are doable but more or less painful. Do we have a tool that we could use for that?

F

So far the numbers are small enough that just manual analysis of the results would be doable.
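A minimal sketch of the weighted voting idea discussed above, where each participant gets three votes to spread across candidate time slots. The function name, the participant names, and the time slots are all made up for illustration; nothing here reflects the actual poll or any Doodle or Google Forms feature.

```python
from collections import Counter


def tally_weighted_votes(ballots, votes_per_person=3):
    """Sum the votes each participant spreads across candidate time slots.

    ballots maps a participant name to a dict of {time_slot: votes}.
    Returns a list of (time_slot, total_votes), highest total first.
    """
    totals = Counter()
    for person, allocation in ballots.items():
        if any(votes < 0 for votes in allocation.values()):
            raise ValueError(f"{person} cast a negative vote")
        if sum(allocation.values()) > votes_per_person:
            raise ValueError(f"{person} used more than {votes_per_person} votes")
        totals.update(allocation)
    return totals.most_common()


# Hypothetical ballots: names and slots are illustrative only.
ballots = {
    "alice": {"Fri 08:00": 2, "Tue 10:00": 1},
    "bob": {"Fri 08:00": 1, "Tue 10:00": 2},
    "carol": {"Tue 10:00": 2, "Wed 15:00": 1},
}
ranking = tally_weighted_votes(ballots)
for slot, votes in ranking:
    print(f"{slot}: {votes}")
```

Unlike a plain yes/no count, this keeps the information that someone can live with several slots but prefers one of them.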

D

One other thing: we also had another, earlier Doodle poll. I can probably do a quick correlation with the times that were there and then try to see if it was...

E

It was a little more restricted in time. My recollection was that it was restricted enough that literally none of the slots proposed in the last Doodle poll were times that I could make. But I think that the set of times that you were suggesting more recently on the poll, if you actually did click no and go through, were a little bit broader in terms of the possible times.

D

I see what you're saying. One thing I can do is remove that restriction, so that irrespective of a yes or no, people can go ahead and fill it in. That way, everyone expresses their times.

A

I just got a comment as well from the chat, pardon me if I get the name wrong, from Lucina, suggesting Google Forms and saying the [unclear] group had used it for weighted voting. So that might be another solution to look into.

D

Sure, we can look at that.

E

Prem, you've been extremely generous with your time in trying to coordinate this. But if I were in your position, I would kind of feel like I keep getting pushback on it. Would you be okay if we can find another volunteer to pick this up?

D

Sure, I'll be happy, yep.

E

Is there someone else on the call who's highly organized (I am not) who might be interested in picking this up?

F

Well, okay, I've made enough noise, so I probably should pitch in and do some work here. [Another speaker:] Awesome, thanks, Mike.

A

And Lucina volunteered as well, so I suggest the both of you.

F

We don't need both. If Lucina wants to do it, that's fine with me.

D

Saying yeah yeah.

A

E

Fine with either either.

A

Or one of you so I.

E

I appreciate both of your willingness to help. Are you okay picking it up, Lucina?

D

E

Excellent. Thank you so much.

A

Yeah, and just to clarify, we're very happy with the work that you...

E

Absolutely. You're doing so much for the team at this point, Prem. [inaudible]

D

No yep sounds good.

A

Next: John to crisply express the invisible network and [unclear] problems to the mailing list and/or at next week's meeting, in the network service mesh document.

A

E

Frederick, do you want to share the agenda as you're walking through it?

A

Sure. Actually, I'm dialed in on the phone, so if someone could share the agenda on my behalf, that would be very useful.

A

E

Is somebody willing to volunteer?

D

Frederick, the agenda for today, right? Okay.

D

Sorry guys, I think the formatting is going for a toss, so I hope it's okay.

A

So, for the invisible network and [unclear] problems: is that something we want to talk about now, or do we want to add it to the use case discussion? What do you want to do with that, John?

H

I had some comments inline in the document, and I think it's fine for now just to have people look at it and comment. I think Ivan from Intel made some related comments, and we tried to narrow it down. I mean, the problem is in the data plane: what gets exposed in the data plane to and from NSM. I do have a solution, but perhaps, as we do the use cases, something may jump out at us.

H

Any other thoughts.

A

So my recommendation is for people who are interested in this to read the document and comment on it. Then we can schedule some time in one of the following weeks, once you're ready to discuss it, if you feel that the topic is something we should discuss in the meeting. Will that work?

A

Okay, so moving on. We've had another week of active development, as you can see from some of the GitHub issues that we ran through and the discussions that we've been having.

C

Did we lose Frederick? Oh, you're on mute, Frederick. That's what happened.

A

[inaudible] was not behaving as expected, and so that bug was squashed. There was also work done to refactor some of the code, to reduce some of the code duplication that we had and make it a little bit more robust, and there was some more work done around handling errors.

A

So a lot of it, I would say, is ensuring that we don't accumulate too much technical debt in the process, plus things that were uncovered through some additional testing.

A

As for the pull requests that were merged, I think that pretty much covers the majority of things. The only other thing is that we also updated the Kubernetes dependency version and the client-go dependencies to use semantic versioning. Kubernetes, for some reason, releases multiple versions; some are semantic versions and some of them are not, and it depends on which ones you pull in.

A

Unfortunately, dep defaults to the non-semantic version. So, just as a heads up, if you want to know how to do semantic versioning with Kubernetes properly, go look at our dependency files in this particular repo. So we've worked through those issues.
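A minimal sketch of the distinction being described, assuming tags are plain strings: it separates tags that follow the semantic versioning shape (MAJOR.MINOR.PATCH with an optional pre-release and build suffix, and an optional leading "v") from those that do not. The tag names in the example are hypothetical placeholders, not a list of what any repository actually publishes, and this is not how dep itself resolves versions.

```python
import re

# MAJOR.MINOR.PATCH, optional "-prerelease" and "+build", optional leading "v".
SEMVER_TAG = re.compile(
    r"^v?(\d+)\.(\d+)\.(\d+)(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def split_tags(tags):
    """Return (semantic, other): tags that look like semver and tags that do not."""
    semantic, other = [], []
    for tag in tags:
        (semantic if SEMVER_TAG.match(tag) else other).append(tag)
    return semantic, other


# Hypothetical tag names, for illustration only.
tags = ["v7.0.0", "kubernetes-1.10.3", "v8.0.0-beta.2", "release-7.0"]
semantic, other = split_tags(tags)
print("semantic:", semantic)
print("other:", other)
```

A dependency constraint written against the first group can be compared and ranged; the second group can only be pinned by exact name.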

G

Sorry, I have a question with regard to the dependency here. With 1.11, there was a change in client-go: with the latest 1.11, like beta 2, client-go version 7 doesn't work, so you really have to use the release 8.0 branch. I was curious what the plan is. Right now you're using Kubernetes 1.10, so are you planning to move to 1.11 at some point, or what's the plan?

A

That's a good question. At this point, my recommendation is that we wait until 1.11 is released. On the semantic versioning specifically, I believe that they cut a release of client-go after 1.11 is released, and so it really depends on what the state of the system looks like at that particular point. Do you know if there's any intention within client-go to make it backwards compatible with 1.10, 1.9, and so on?

G

I don't think so. At least based on the change between release 7 and release 8, it seems that it's kind of a breaking change. One of the reasons why I looked into client-go 8 is that with 1.11 they introduced a new dynamic client. When you create a Kubernetes client to talk to the API server, you get back a Kubernetes interface and then a REST interface, basically three types of interfaces. In version 8 there is a dynamic client that can be used to access pretty much any type of object. That's why I think it could simplify the code a little bit when you deal with a client. So that's all.

A

That makes sense. Right now their documentation (and the documentation may be wrong), specifically their compatibility matrix, claims that the HEAD version will work with Kubernetes 1.9, 1.10, and 1.11. But yeah, I think we have to wait until they do their release and see if there's any issue. My gut feeling at this particular point is that, if it's incompatible, it would probably be best to just jump through that hoop now, since we don't have any production clients at this point, as far as I know.

E

Which is, of course, to say: if you're using this in production, you're a braver man than I right now.

A

Yeah, so my suggestion would probably be: let's move now and not wait, so that we're working on what our customers will more likely run on, and that likely will not be 1.10 by the time that we cut our first release.

A

So thank you for bringing this up. We'll spend extra time to focus on this issue, to try to make a good technical decision when more information is available.

C

Thank you. I think it's something we should take an AI on, to open an issue for this to make sure we track it, and we can use that to discuss it.

A

That's a fantastic idea.

C

A

C

Just put a generic one down there, and one of us will get to it so.

A

All right, so that's added, and someone will get to it. [inaudible] So, since we're done with that, the last one was network service mesh on GKE. So if anyone wants to speak up on that...

C

I think that was John, I believe, right?

H

Sorry, I was on mute. So yeah, with help from Kyle and Frederick, I just copied down the steps and kind of documented them today. I wasn't quite sure where to put it. So if anyone else is running GKE, there are a couple of gotchas too.

C

John, this is really cool. Would you mind creating a markdown file with this, so we can check it in to the docs? This would be super useful.

A

Exactly what I was thinking.

H

Just give me an AI.

E

It's just not a meeting until John has taken an AI.

A

So, let's move on to use case mapping. I will pass it on to... [inaudible]

C

Hey Frederick, just one quick thing before we move to that. I just wanted to bring up that I would love to get some reviews on PR 91, because currently some of the stuff is broken because of that. So getting that one merged would be great, if anyone has a chance to take a look at it.

A

Sure, I will hop on and take a look at that after the meeting.

A

So, use case mapping. I will pass it on to Prem, Fabian, and John. The link to the use case document is on the agenda and in the chat.

D

Open the chat.

A

And just a heads up, the link to the use case document is also available on the repo. So if you can't see the chat, feel free to go to the GitHub repo.

A

If you have a problem with accessing Google Docs, I don't think that it'll help for you to go through that path, because that's also a Google Doc. So sorry about that one.

D

Sure. So, a quick update on the use case document. There are a bunch of comments, and I will incorporate the comments for the use cases related to cloud networking. I also added the distributed mesh, just to reflect how the use case looks with respect to the distributed bridge model. I just want to briefly cover that and then pass it on to John to share his updates.

D

This is a continuation of the presentation that I touched upon last time. The idea is basically to use the route rules...

E

Can you share what you're talking to? Oh, sorry.

D

Could one of you share? My connection's a bit flaky.

E

I can, sure. It's a lot easier to follow these things with the pictures in front of me. I'll get it up there. You'll have to tell me which part of the document.

D

Sure, I'll mention that. Cool, so let me give you the page number: page nine.

E

Page nine? Well, there are no page numbers on the document, so I'm not sure...

D

No, further.

E

D

Oh sorry, I think you've just passed it. Yeah, okay. So the use case basically talks about how we build the same use case using a distributed bridge. There can be two types of meshes. One is the persistent full mesh. What is meant by persistent is that, as a prerequisite of this particular use case, you need VXLANs between all the compute nodes.

D

The only downside to this particular approach is that when the number of compute nodes increases, the number of mesh links between them increases as well, because of the numbers involved. So the idea mentioned here is: how about an on-demand full mesh? To play with the use case, let's assume that one of the applications exposes an L2 channel and the others want to connect to it.

D

In this case, the information is available in service discovery, and when any other client or pod that wants this channel requests it, the VXLANs are connected on demand. This happens on the first request for such a channel, and then it continues until the defined teardown policy applies. What happens by this is that you avoid the full mesh problem that you have in the case of the persistent full mesh. So these two use cases would probably fit in well with network service mesh, and the recommendation would be based on how you want to have your mesh created.
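A minimal sketch of the scaling argument above, assuming one VXLAN tunnel per pair of compute nodes. A persistent full mesh over n nodes needs n(n-1)/2 tunnels up front, while an on-demand mesh only needs a tunnel for each distinct pair of nodes that has actually requested a channel. The function names and node names are illustrative and do not come from the use case document.

```python
def persistent_full_mesh_tunnels(node_count):
    """Tunnels needed when every pair of compute nodes is connected up front."""
    return node_count * (node_count - 1) // 2


def on_demand_tunnels(requests):
    """Tunnels needed when a tunnel is only created on the first request.

    requests is an iterable of (client_node, serving_node) pairs; repeated
    requests between the same two nodes reuse the existing tunnel, and a
    request within a single node needs no tunnel at all.
    """
    return len({frozenset(pair) for pair in requests if pair[0] != pair[1]})


# Hypothetical cluster: 100 nodes, one bridge pod on node-1, three requests.
requests = [("node-2", "node-1"), ("node-3", "node-1"), ("node-2", "node-1")]
print(persistent_full_mesh_tunnels(100))
print(on_demand_tunnels(requests))
```

The sketch ignores teardown; with a teardown policy the on-demand count would also shrink again once channels are released.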

D

So that's what I've captured here, and the diagram explains the interaction between the various components. Here you have the bridge pod, which exposes the channel, and then a subsequent pod that wants a connection requests the connection, and then it continues on in the life cycle. So this is the update I have added, in addition to the conventional model. I also incorporated the other comments that people have given until now.

D

Thanks for the comments. Also, I just saw the CAPWAP use case. I think it seems to be similar to the BGP EVPN use case. Probably when we discuss the CAPWAP use case, we can further discuss whether we want to have some collaboration between both use cases.

D

That's all I had from an update point of view. I'll pass it on to John, if he has any updates. John?

H

Okay, so the thing I was doodling with this week is trying to tease out some design patterns from all the various cases, to try and categorize them. Is this really useful, or is that just the way my mind works?

E

As somebody whose mind also works that way, I would find it super useful, because I tend to go up and down the abstraction tree. I do tend to look at a lot of concrete things and try to squeeze out what the common patterns are. So it would be really helpful for me. I don't know how everybody else's brain works.

D

+1 to that. I would also be interested, John.

D

We do appreciate all that you do, John, very much.

D

Just to add to that, John: I've been doing a lot of work around microservice patterns, so that is also one of my favorite areas.

H

Let me put my doodles in the document, and we can comment on them. What I have so far doesn't feel quite right, but I think having other people make suggestions will probably be the better way to get it done and gain different viewpoints.

D

[inaudible] Yes, so I think, I'm sorry, I missed the name of the person who added the CAPWAP use case.

B

Bosnia, basically by oh.

D

Nice meeting you. So do you want to present the CAPWAP use case?

D

More than happy to have you share.

E

You can drive it.

B

Fine. [inaudible] I need to change that. So basically, it was just a quick hack to bring the information in here.

E

I'll let you share so you can drive. That ends up working better, when it works.

A

We need a name for that action item that we just discussed.

E

The patterns one? That's the one for John.

E

The action item belongs to John.

D

John, I'm actually traveling, so I'll probably join you soon and then work with you on that.

B

Okay, I'm done with that, so we can quickly talk about the use case we added before the call. Basically, as already mentioned by Prem, what we have here is mostly a bridge or layer 2 use case, which at this moment in time is absolutely not related to network service mesh from an implementation perspective, only from the use case perspective. What we need to do here is to transport CAPWAP-encapsulated frames coming from the field, basically from outside the Kubernetes cluster, to a CAPWAP controller. CAPWAP is basically a UDP-encapsulated protocol for the control plane and the user plane. I think you guys probably know CAPWAP, or do we need to go into it?

D

I'm not familiar with it, but yeah, it would be good if you could go into it.

B

I can give a quick overview.

B

Basically, it's a Wi-Fi control protocol for wireless access points, which is standardized in the IETF. As always, you have a CAPWAP control protocol, which tells the yellow box, which is a WTP, a wireless termination point, to bring up a Wi-Fi network so that the user equipment, the UE, can attach to it. All the control plane elements for authentication, authorization, channel management, etc. go over the CAPWAP-C channel; it's a classic control plane. The user plane is encapsulated in CAPWAP-U, which is a fairly usual tunnel protocol with a tunnel ID that relates it to the control plane, etc.

B

And both of those protocols need to be terminated in a Kubernetes cluster, which already creates some problems, because...

E

I think we just lost your audio. Yeah, I hear you now. You were saying they create some problems because it's UDP, and then you broke up, at least for me.

B

Okay: bringing UDP traffic into the...

E

Apparently, the world does not want you to express what it is about UDP, because you said "bringing the UDP traffic into" and then I lost you. Is anybody else losing his audio, or is it just me?

B

Sorry about this. I tell you what, I think we do away with the camera and see if it works.

B

So can you hear me now?

B

Okay, so basically, the problem is that it's quite often hard in the cloud to bring UDP traffic into a cluster, to a pod, because of layer 4 load balancers, etc., and sometimes timeout issues on UDP. But besides that, this can be managed.

B

We have done it in a bare-metal deployment, with some special configuration. At the end of the day, the UDP traffic reaches the pod where the control plane and data plane are running as containers; this is represented by the CP and DP boxes here. The CP and DP talk to each other over the normal Kubernetes network, over CNI, to share information; it's basically an internal protocol.

B

So now we're coming to layer 2, because now the data plane decapsulates the CAPWAP-U traffic, and after decapsulation you have the layer 2 frames of the device, which, from the MAC addressing perspective, are foreign to the Kubernetes network. So basically they would not be forwarded at all. That matters if we want to give this traffic to another network function, in this case the CGW IPsec GRE, which is what we call, in our terms, the connect gateway.

E

This is a good use case, because you've got a classic illustration there of what we mean when we say that your payload is L2. Yeah, exactly right. So it's a beautiful use case.

B

That's the problem. You'll notice that when you put an L2 payload into the cloud, you're dead, because the security settings will not allow payloads from foreign MAC addresses. Any number of crazy things could go wrong. Absolutely, nearly everything. I'll buy that.

E

We lost your audio again, which is really sad, because I'm excited about this use case. [Another speaker:] Can you hear me now? Much better.

B

Okay, cool. Sorry about that; next time we'll make it better. Anyway, what we have done is to say: okay, as we have dynamic pods, there are no stable endpoints, etc. So we have started to label the pods with a simple label, call it vxlan: true, and there's another controller watching for the labels and some annotations. The controller then execs into the pod, which is a little bit hacky, and spawns a dynamic VXLAN between two pods. That is basically represented here by the VXLAN endpoint (VTEP), which is a pod or sidecar container that creates a VXLAN link and pushes the interfaces into the pod. So from the use case perspective it has different naming, but it is exactly what you're proposing with network service mesh, just done differently.
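A minimal sketch of what carrying an L2 payload over VXLAN means on the wire, assuming the standard 8-byte VXLAN header from RFC 7348 (a flags byte with the VNI-valid bit set, a 24-bit VNI, and reserved bits, carried over UDP port 4789). This only illustrates the encapsulation; it is not the controller described above, which configures kernel interfaces rather than building packets, and the function names are made up for this example.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # the "I" flag: the VNI field is valid
VXLAN_HEADER_LEN = 8


def vxlan_encapsulate(vni, inner_frame):
    """Prefix an Ethernet frame with a VXLAN header (UDP/IP layers not shown)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits); word 2: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    if len(packet) < VXLAN_HEADER_LEN:
        raise ValueError("packet shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", packet[:VXLAN_HEADER_LEN])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8, packet[VXLAN_HEADER_LEN:]


# An illustrative Ethernet frame: broadcast destination, a locally administered
# source MAC that the cluster network has never seen, EtherType IPv4, dummy payload.
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload"
packet = vxlan_encapsulate(42, frame)
vni, inner = vxlan_decapsulate(packet)
```

Because the foreign MAC addresses sit inside the UDP payload, the underlying pod network only ever sees ordinary IP traffic between the two tunnel endpoints.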

B

From there, the layer 2 traffic, the payload, traverses the VXLAN to the next pod, which then creates an IPsec GRE tunnel that is terminated externally, at the service control box. This way we leave the cluster, because we need to hand over to a foreign system here. So basically what we have in the use case is, in our Kubernetes cluster, L2 payload which needs to be distributed across pods, and then we need to leave the cluster and forward the traffic over a GRE channel to a remote system.

B

This has been in production for a couple of months, or even a year now, but there is no intention to keep it in this way or to push this implementation upstream; it's just to demonstrate the use case. Ten minutes before the call, with the help of Antonin, who also joined the call here, we made the VXLAN controller public in the open CNF group, which we created on GitHub, for anyone who wants to look at how it is done.

B

Yeah, basically that's the use case. If there are any questions about the implementation, I think Antonio here is happy to answer them as well; on the use case, I can.

G

I have a quick question, if you don't mind. You're saying that your controller builds the VXLAN tunnel between the pods, so basically you're leveraging the same interfaces which are provided by the CNI, right? You just build on top, like a tunnel between the two pods, right?

B

Exactly. There's nothing to do with CNI; CNI is not involved here at all. Basically, the controller execs in and runs some commands to create the interface with a given name, which comes from the manifest of the pod.

G

Oh, so your pod then has multiple interfaces: one, the regular one from CNI, and then another, the new VXLAN one?

B

Exactly. For instance, the [unclear] interface, which is a distribution protocol that we use internally, goes via CNI. The payload is totally separated, over the VXLAN object.

B

Maybe, if you want, we can go into the manifest here.

E

This is actually very good. You're right, this is very much up the alley of the kinds of things we're thinking about here in network service mesh, because effectively what you've done is come up with a way to hack standing up what we call a connection, for L2 payloads, between your CAPWAP pod and your CGW IPsec GRE, and then likewise to stand up a connection of type IPsec to your external service control, if I'm understanding correctly.

B

Exactly. We have two different implementations of a link, let's say: one goes external, one is internal. One is VXLAN based, the other one is IPsec GRE based.

E

I presume that the VXLAN is carrying an L2 payload and the IPsec link is carrying an L3 payload?

B

No. In this case, the IPsec link is carrying the GRE payload, which then still carries the L2 payload, because it has to go to the...

E

So I'm used to note that I've got my my very long list of ways that people carry all payloads because Julia was not armed a list yet, but it should be yeah. It's.

B

It's to the cake so especially.

E

In this use case yeah, so although.

B

Among the other use cases we have in mind, we have a quite similar one that just carries GTP (GPRS Tunnelling Protocol) payloads and brings up pods implementing a GGSN in mobile. If it fails, it comes up on a different pod and gets, via the VXLAN controller, the same endpoint IP address and so on. But these are all different use cases, if we try to put in the edge use case, because, as you say, everything will break with an L2 payload.

E

It's even worse than that. Everything breaks with an L2 payload in the cluster, and I've occasionally had this conversation with people: Kubernetes actually has no concept of L2 segments. So if you try and stick a MAC, you know, an L2 frame out there, God only knows what would happen, right? Even if nothing broke, it certainly has no guarantee of getting where it's supposed to go.

B

Exactly. I've seen that in nearly all virtualization environments; it's not Kubernetes related. We have deployed the same thing in OpenStack, not in containers, not in pods, and even in VMware. If you are not careful and you put layer 2 payloads in there, it either breaks things or you need special security settings, and it's all a mess.

E

No, this is very cool. I appreciate your bringing this to us. So what are your areas of interest? It sounds like you're interested in, you know, making sure, number one, that we can meet your use case with Network Service Mesh overall. Are you interested, in the medium to long term, in using Network Service Mesh? Yeah.

B

That's because Network Service Mesh, from this perspective, is not really a use case; rather, we want to leverage the native environment, Kubernetes as it is. We have a strong opinion that Network Service Mesh, as you guys call it, has very similar functionality, from a pattern perspective, to a classic service mesh, that is the TCP and HTTP ones, but for L2 and L3 payloads. So the medium-term goal would be to join Network Service Mesh here and one day phase out our VXLAN controller, as homegrown stuff, and go there.

B

We are also in a currently running project, a plugfest around this stuff, which is a research project. So we would like to promote bringing up this lab environment with Network Service Mesh, because a number of L2 and L3 payload CNFs will come up in this lab, and that's why we are.

E

That's hugely exciting, frankly, on a lot of fronts. We're delighted to have you guys involved. And just don't be at all shy about asking anything that might help you move that forward. We would love to see you guys do this at the plugfest there. Yeah.

B

Well, we don't try to be shy. It's just about timing sometimes. But the interesting part for us would be to learn how your activity, your working group, is received in SIG Networking. Is it something which sits beside that, or does it have a chance to get, let's call it, upstream, or to move forward here? Where do you stand at the moment?

E

I think I can tell you my viewpoint. I would actually encourage others on the call to express theirs. We did present, I think two weeks ago, roughly, to SIG Networking, and I think we got, overall, a pretty positive reception there from folks.

E

One of the benefits of the Network Service Mesh approach is that we don't actually need any changes in Kubernetes proper, which is really helpful to us, because we don't have to go and try to convince three or four or five different groups in Kubernetes to change something for us. But it's also seen as a good thing by SIG Networking. I know Tim made a comment that he really liked that about this particular approach.

E

You know, since then, we've actually been trying to have a conversation with SIG Networking about whether it makes more sense for Network Service Mesh to be some project of SIG Networking, or a Kubernetes working group, or what the right formal structure is there. We were going to have that conversation yesterday at the SIG Networking meeting, but the turnout at SIG Networking was very, very low.

E

This past meeting, that is, and so we didn't quite get to have that conversation, but we're actively working to line up with SIG Networking, and so far they seem pretty warm to us.

B

I can give you my experience on that. If you go back to the meetings in SIG Networking, there were three attempts, and even in the channel on Slack, to say: hey, SFC, service function chaining, and service mesh are basically the same, why not think in this direction? But there was mostly no response and no one was following up, since usually the HTTP folks are there and networking is not highly represented in this group. In my.

E

Experience. Well, that's fine. I mean, the thing is, the SIG Networking guys have solved a very important class of problem really well, just like the Istio and Envoy guys have solved a very important class of problem very well. And even though I'm hugely in favor of borrowing, by analogy, the cool things they've done, I don't think trying to ram L2 and L3 payloads into a system that is already functioning really well for them is a happy experience for anyone. Yeah.

A

To further that: the Kubernetes use cases are primarily around enterprise, which primarily calls for a very specific L3/L4 pattern. I know it sounds a little bit negative, but this actually was the right approach for Kubernetes in order to keep it simple and to grow. It just affects us by not having anything like L2 and so on. So I.

A

This is an attempt to lift those use cases. But from a Kubernetes perspective, if you go up to them and ask them for a feature, and that feature has wide enterprise use cases, then there's a good chance it will get in. But if it doesn't, or it complicates those use cases, then getting it in is still not impossible, but it'll be significantly more difficult, because telco and so on is not the main.

A

The main use case. But I think that as we gain more traction, we'll have the ability to add some of that influence around areas where Kubernetes may not be as strong. So I.

E

Would second everything you've said, with one exception, which is that it's the perception of broad enterprise use cases. I think at various points there are things enterprises will discover they need that may not be perceived as being a need yet, and I hope we can help with some of those. Oh, absolutely.

A

Container startup is an excellent example, or pod startup, I should say, rather than container startup, for how long it takes. But yeah. Sorry, go on.

D

Thirty good, okay and just I did operation.

A

Yeah, I think from my view as well, with this particular approach, we went with the term service mesh because it makes it easy for people to latch on to, but the goal was not to stop there, but to build something that was a lot more flexible, so use cases like yours can be built. So this is totally in scope; don't feel like you're diverting us or anything like that by bringing up these types of use.

A

Cases. This is actually a really great example of something that I believe we'd like to support. Okay,

B

This leads to the next question. As I said, starting next week we will bring up a plugfest environment here, which will run for about two years, adding more and more features and.

G

B

Different use cases. So how ready for prime time is NSM at the moment? When do you think we should start to get our hands dirty? What do you expect, a steep or a flat learning curve here for people who have already done this? Does it make sense to start working with it already, from an implementation perspective, or should we wait a little bit more? This basically goes back to Antoine on the call, because he has written the VXLAN controller on our side.

B

Is this already usable as an experiment, or what's the status of NSM? And.

A

We're still very new. In fact, just from a timeline perspective, the first conversation that Ed and I had about this was in mid to late March, and so all the work that you've seen from then to now has literally been within the past 70 days. So from an implementation side, we're building up the primitives at this particular point.

A

In order to describe this, we've added Kubernetes CRDs and we've built up protocol buffer APIs, which would actually be really good for you to review as well, just so you can get a sense of what the core functionality is that we're working with. And.

A

But yeah, there's still some more work that needs to be done before I would be comfortable saying that, yeah, this is ready for prime time.

E

Two things come to mind for me. Number one is, obviously, you guys need IPsec over GRE, and people with real concrete needs at hand who want to try things tend to bump up the priority of things. As I mentioned, IPsec over GRE was not on my list before; it definitely is now. The other one that I actually want to throw out there is that we would really welcome your participation in the development community.

E

You have an opportunity here to shape things, making sure that we meet the kinds of needs that you see, and that you see from other folks, by participating in the community. And I can tell you, having often arrived in communities after things have hardened, after stuff is already in deployment, it's really nice to have that opportunity for early influence, and so we welcome your participation. And then also, you guys look like essentially prime beta customers for us, in that you've got a use case.

E

You know, when NSM does actually get to a point where you can try it out, you guys kicking the tires would be extremely valuable to us. So.

A

But before we continue on, let me just wrap up the meeting, and we can continue the discussion afterwards if you're both interested. So first, thank you, everyone, for attending. Is there any last-minute stuff that we didn't get to that we should add to the agenda for next week? I think meeting time planning was really the only one.

A

Okay, so anyway, I'm going to stick around for this conversation. I don't know if Ed has time or has to drop off. I actually.

E

Do have time on this occasion and I'm delighted that I do so.

A

So I'm willing to stick around for more conversation as well. So let's say this concludes the meeting, and then we'll continue this particular conversation. I.

E

Will take the opportunity to remind folks that the meeting is recorded in its entirety from the very beginning, and that will include this after-meeting conversation. I don't think that's a problem; I think in many ways it's a good thing, but it's always nice to make sure people are aware. That's.

A

Right. So I'll add that to the preamble. I'll write a preamble to say at the start of every meeting, and it'll include that.

E

We, the people of the cloud native networking world, in order to form a more perfect mesh. Hey.

A

Actually, when you talk on ham radio nets, many of them have set preambles that they say at the beginning of every net, first asking if there are any emergencies and then describing what it is they're there for. They say it every time. Cool.

A

Sorry, I'm diverging. So.

A

Okay, yeah. So as I was saying before, about these types of use cases: when we talk about Network Service Mesh, we've given some examples, but the examples that we've given are by no means set in concrete as the rails that we're setting. We want to make something.

A

Ideally, my viewpoint is that we want to make something that is flexible enough to handle these types of use cases. If you can think of it, and it aligns with the patterns of the primitives that we've provided, then there's no reason why you shouldn't be able to do it, barring some technical reason. If you say we want to send L2 over some environment that doesn't support it, then of course not.

A

But ideally we want to have the SDN and the services and clients all worked out, and they essentially negotiate the transports, so that you can build whatever it is you want to build, like this particular use case, and get things working. So we definitely appreciate this particular use case.

B

Basically, it's a question of when and, let's say, how to join activities here, which I think is easy, because it doesn't affect CNI, the underlay, et cetera. And as I said, we are not starting in production now. We do not intend to say: hey, please make NSM stable next week because we need to migrate services over next week.

B

That's not the point here. But as I said, next week we are starting with a group of developers, including students and junior developers, who can dive into whatever we decide, and if NSM is a good fit, then they can dive in there and use this framework, this technology, to solve the use cases.

B

That's where the question was coming from. If we would be in a situation where a bring-up of the system breaks every second time, and it's very, very early, at a stage that only the core developers understand, then maybe it would not yet be the right moment to go in here. But if the environment is already in a shape where we can start with it, adapting and helping with the use cases and bringing back the issues, problems and ideas.

B

We see, then I think we are happy to join rather sooner than later. Maybe you should know that we are also part of the VPP community. We know it quite well, working with it already for the VNF itself, and therefore having a Network Service Mesh which is based on the same path,

B

Would be a premium as well, because otherwise, usually you end up in too.

E

Many tools. So, quite honestly, NSM itself is agnostic as to the data plane that you choose. There's the data plane inside your CNF, your cloud native network function, and obviously we're agnostic as to that; that's whatever you've got to do. NSM is also agnostic as to what you might call the underlay data plane, in other words, the thing that is carrying the connections.

E

That said, as you might imagine, there are quite a few people in the NSM community who care a lot about VPP, and so I expect that to be one of the early data planes supported, so you're going to get basically what you're looking for. I have a question, just out of my own curiosity. You're dealing with wireless traffic right now, and one of the interesting things that we've gotten from some other folks is what I would call exotic L2s.

E

Exotic L2 protocols, that is. So, for example, in talking to the cable guys, they have use cases where they would like to be able to pass DOCSIS frames as the L2 payload. Do you have exotic L2s like that in the wireless space that it might be interesting to pass over an L2 connection in Network Service Mesh? I.

B

Would say, from a frame perspective, it starts with classic Ethernet when the traffic arrives. DOCSIS goes beyond that. Maybe there's wireless framing as well; if you mean 802.11 frames, yes, of course. But from what we need to carry, I don't see it for the current use cases we have in production. So.

E

That's completely fine. Just tuck in the back of your mind that one of the design intents here is to be able to support exotic L2 and L3 protocols, if need be, because there are a bunch of them around, and if you just leave the architectural white space for them,

E

Supporting them is super easy, and if you don't, then you make it very hard for people. I've got similar things from talking to Fibre Channel guys, where they've got their own L2 and L3, and if your attitude is that there are kinds of L2 and L3 payloads, then it's a very easy game to play. Mm-hmm.

B

So what do you mean by exotic? Do you mean from a framing perspective, going beyond the L2 frame, or from the payload itself, or from a.

E

You're on mute. Oh, sorry. When I say exotic I mean non-Ethernet, because a very large percentage of the world thinks there exists one and only one L2 protocol, and that L2 protocol is Ethernet.

B

On the radar here, as we go into the mobile core network elements, there is non-IP transport coming for data in narrowband IoT, which is a 128-byte frame that is encapsulated somehow and will then arrive somehow, and you need to forward it somehow. Yes,

E

Exactly, exactly. You see precisely my point, and that's entirely why we're trying to keep it generic, because it's cheap to keep it generic if you think to, but a whole lot of people think Ethernet is the only L2 there is.

B

Yeah, at least from the IoT perspective, yes, we are in this area. And, as you say, maybe DOCSIS is coming around; there is no current customer need at the moment, but our customer base, which is carriers, usually also has interests in this area, though not directly at the moment. Cool.

B

What I always think, just throwing out ideas here, and what we have discussed in our offices, is that Network Service Mesh, or the principle of Network Service Mesh, which is basically encapsulating any kind of traffic in an encapsulation, call it VXLAN, and bringing it to the next pod, could also be a transport primitive for a classic service mesh, because what a classic service mesh does is very expensive. Yeah.

B

What you do there is you need to parse protocols and you need to put in HTTP headers, and then you create a new packet and you send it forward. Doing it the other way around, you encapsulate the traffic and put an NSH around it, and there is no need to even decode that packet and encode a packet again, because you have such trace IDs, or whatever IDs, around the frame. This way you could even transport TLS frames or whatever you want.
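
The idea of carrying an ID around an opaque frame, rather than parsing and rewriting the inner protocol, can be sketched as follows. This is only an illustration: the 12-byte header layout below is invented for the example and is not the NSH wire format, and the function names are hypothetical.

```python
import struct

# Hypothetical outer header: version (2 bytes), payload length (2 bytes),
# trace ID (8 bytes), all in network byte order. NOT the real NSH layout.
HEADER = struct.Struct("!HHQ")


def encapsulate(trace_id: int, frame: bytes) -> bytes:
    """Prepend the outer header; the inner frame is never inspected."""
    return HEADER.pack(1, len(frame), trace_id) + frame


def decapsulate(packet: bytes):
    """Return (trace_id, inner_frame) without decoding the inner frame."""
    _version, length, trace_id = HEADER.unpack_from(packet)
    frame = packet[HEADER.size:HEADER.size + length]
    if len(frame) != length:
        raise ValueError("truncated packet")
    return trace_id, frame
```

Because the payload is treated as bytes, the same two functions work whether the inner frame is Ethernet, an encrypted stream, or something else entirely.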

E

You should absolutely talk to John McDowell he's actively trying to write down things in that direction. I think he was the guy who got all the items on the call.

B

Getting X neither but exactly.

G

That's that other ideas: how.

B

How to make this happen, or even put trace IDs for OpenTracing or Jaeger somehow into the overlay, and do it at high speed; or, if you are not doing it at high speed, for whatever tracing scenario, you can do it just as well.

E

A

This might be an interesting use case as well, where perhaps you have something that doesn't support that tracing, but you want to add it in, and that's all that you're going to look for, except when maybe you get a new flow.

A

You have to initiate a new ID. But it might be an interesting use case to add as an example, where perhaps you encapsulate, add these particular headers in the encapsulation, and then transfer and make your decisions as you would. It should be trivial to do this in the architecture that we're proposing. But at the same time be able to.

A

Basically, I think it'd be a really great way to also demonstrate some of the flexibility, because we're showing: here's an L2 frame that no one else in the world has ever seen, but here it is, being handled without any issues. Does that make sense? Yeah.

E

It totally makes sense, and the thing is, we've got some really fascinating tools for that as well, because not only can we do the sort of thing that essentially encapsulates you in a way that you can get tracing, but we've already got, built into things like VPP, stuff like the iOAM protocol from the IETF, where we could not only trace what's happening above the tunnel, we can actually trace where the tunnel is going, because, to the degree that you have iOAM support, which is starting to come online

E

In your networking devices, you will get per-hop information, including latency information, for where the tunnel transited. So the amount of tracing you could inject into the Network Service Mesh becomes, frankly, insanely huge. It's really cool. Next.

B

And then you put it on the normal OpenTracing API and put it away, but you have at least the metadata in your hands, based on the iOAM stuff. Absolutely. Just.

A

Just remember, we have to define the APIs in a way that other SDNs can eventually implement such functionality. Oh, no.

E

Absolutely, I totally get it. For example, if you were going to do tracing, you'd probably want to negotiate tracing the same way that you do tunneling, so that you're actually doing tracing in a way that both sides can deal with. But this actually brings up a new matter, which is: we've talked about negotiation of tunneling, and it's all fine and dandy to wave your hands at being able to do something similar for tracing, but as we're building out the gRPC between NSMs,

E

We probably do want to be able to express a preference on tracing, and I think tracing is a little orthogonal to tunneling. In.

B

Distributed tracing, you need a correlated trace ID; that's what you need, and so at least you need to bring that. So.

E

Both ends need to support whatever the tracing mechanism is, and then there is an exchange of parameters. Right now, the way the negotiation between two NSMs is mostly shaking out is: the requesting NSM says, "I can do these." The NSM on the far side basically comes back and says, "Okay, of the things you suggested to me, in my preference order, this is the one I picked, because of my preferences, and here are the parameters related to it." And for the tracing,

E

It would be your trace IDs, right? And so effectively the requestee is in charge of the parameters in these situations, because it's the one who ultimately has to make the decision. Okay,

E

One round trip. Yeah.
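
The one-round-trip negotiation described above (the requester lists what it can do, the far side picks in its own preference order and returns the parameters) can be sketched like this. The function name, the mechanism strings and the parameter dictionaries are all hypothetical; this is not NSM's actual API.

```python
from typing import Dict, List, Optional


def select_mechanism(offered: List[str],
                     responder_preference: List[str],
                     responder_params: Dict[str, dict]) -> Optional[dict]:
    """Far-side NSM: pick the first mechanism, in ITS OWN preference order,
    that the requester also offered, and attach the parameters for it."""
    for mechanism in responder_preference:
        if mechanism in offered:
            return {
                "mechanism": mechanism,
                "parameters": responder_params.get(mechanism, {}),
            }
    return None  # no overlap: the connection cannot be set up
```

For example, a requester offering `["vxlan", "gre"]` to a responder that prefers `["gre", "vxlan"]` ends up on `gre`, with whatever parameters the responder chose for it.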

A

One of the other use cases we're going to have to think a little bit about as well: I can see potential use cases where you might have one NSM that's managing, let's say, VPP, and another that manages ODL, and you want to do tracing across both of them, as an example. So what would that use case look like?

E

Is that even something.

A

We want to do in the first place. You.

E

Know, I would say that something that looks like that is very likely, because, again, in that scenario, whatever NSM is controlling or talking to ODL would basically have to have some set of things it's capable of. So if the NSM that's using VPP can do iOAM, and it comes across and says, "Okay, I'd like to trace with iOAM as part of this connection," and the far end comes back and says, "Well, that's nice!

E

I've never heard of this iOAM thing," that needs to fail gracefully. And my suspicion is that failing gracefully means you just don't get tracing, but you do get a connection. Yeah.
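
A minimal sketch of that graceful failure, assuming tracing is negotiated as an optional extra on a connection request. The dictionary shape and the `"ioam"` string are illustrative assumptions, not an NSM interface.

```python
from typing import Dict, List, Optional


def respond_to_request(request: Dict, supported_tracing: List[str]) -> Dict:
    """Accept the connection regardless; enable tracing only if a requested
    method is also supported on this side."""
    tracing: Optional[str] = None
    for method in request.get("tracing", []):
        if method in supported_tracing:
            tracing = method
            break
    # An unsupported tracing method does not reject the connection.
    return {"accepted": True, "tracing": tracing}
```

The design choice is that tracing is a preference, not a requirement: a far end that has never heard of the requested method still returns a usable connection.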

A

And I think something that we may end up being able to do, to use a higher-level example: suppose that we were to define a protocol buffer that describes tracing and that could be encapsulated into a Network Service Mesh connection, which is then handled across all NSMs. So there may be interesting ways that we can utilize our own architecture to smooth over the differences between different SDNs and still get the same functionality. And so, yeah.

A

We have options.

B

Yeah, this is very exciting. Leaving the tracing area a little bit, we have another use case, which is a little bit pressing, that I'd like to address or discuss, which is about, let's call it, dependency across the path. Yeah.

B

Let's say you have microservices with two different things: the right-hand side is establishing an IPsec, or whatever routing path, connection to the remote end, and on the left-hand side you have your CAPWAP termination or GTP termination. So the.

G

Basic if the right.

B

Hand side loses the connection, then the left-hand side is not allowed to receive traffic anymore, because otherwise you're stuck. As this is a more or less disconnected, microservice-based environment, you need to somehow tell the left-hand side that it should shut down or not accept any traffic anymore, because the next one in the path is broken somehow. Which.

E

Means, and I think Frederick has thought a little bit about this already, my suggestion would be that you inform the left-hand CNF, in some manner, that you can no longer service this connection. Maybe it's that you disconnected the connection, or.

G

E

You can't service it anymore, and then it's up to the left-hand CNF to decide what the right response to that is, right? In fact, we.

A

Can borrow a lot from enterprise use cases in this scenario, because they have that same pattern in many scenarios. I'll give an example from Netflix: someone wants to watch a video, and suppose they have a failure that prevents them from being able to service customers in a given region, or maybe to serve a specific set of videos. Rather than allow the customer to continue on and press play and then buffer and wait for something that never arrives, they want to return an error fast, and so there are a number of techniques.

A

The one that comes to mind that would probably be most helpful is what they call circuit breaking, but there are numerous techniques that we can borrow from them. So it's just a matter of picking the ones that we think would best suit, and then seeing whether, modified for our use cases, there's a good way we can take them in easily.
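
For readers unfamiliar with the term, a minimal circuit breaker can be sketched as below. This is a generic illustration of the pattern, not anything from NSM or Netflix code; the class name, thresholds and the injectable `clock` are assumptions made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        # While open, fail fast instead of waiting on a broken downstream.
        if self.opened_at is not None and self.clock() - self.opened_at < self.reset_after:
            raise CircuitOpenError("downstream marked as failed; failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

After `reset_after` seconds the next call is let through as a trial; if it fails, the circuit opens again, and if it succeeds, normal operation resumes.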

A

That being said, I get the feeling as well that we have to be a bit careful with some protocols, because some protocols may be sensitive to.

A

This thing as well. There may be a specific way that a certain protocol deals with an error and passes that up, and if we can follow those channels, that would probably be better in some scenarios. But basically.

B

I'm pretty aware of circuit breaking, et cetera. We are developing other applications which have supervisor trees, et cetera. But it's always inside an application. So the principles are quite clear: if something on the right-hand side fails, you should signal that it has failed. And so the question here is what happens if you're leaving your application environment, which might support that, and say: hey, we have independent parts which are created in a different language, in a different environment.

B

Let's say you run your IPsec on the right-hand side using the Linux kernel and some scripts, just to give an example, and on the left-hand side you have the running part implemented in C or VPP, doing another thing. But these parts have a path, a connection. Maybe there are readiness probes, liveness probes. So you need to coordinate somehow, a little bit, in the orchestration environment, which tells... Let.

E

Me ask you a question, since you're a good person who sees the real problem, right? One of the things that I've occasionally mused about is a possibility like the following, for situations in which the thing you're doing is effectively stateless. I'm going to say some things that may be incorrect about your use case,

E

Just for illustration, so correct me at the end. Say, for example, that your left-hand CAPWAP CNF is getting a bunch of data that's coming in as L2 frames, and it wants to pass them on to your gateway.

E

Now, you may have five replicas of that gateway, and we happen to have routed your connection to one of them. Since it's just a VXLAN connection, my presumption is that you don't have any magic state; you're just shoving frames at somebody who will then be able to shove them into an IPsec GRE tunnel to where they need to go. Say we connect you at first to replica number one, and replica number one dies. We discover via a liveness probe that it's gone.

E

Maybe the node caught on fire, for God's sake. There are some scenarios in which just seamlessly and quietly connecting you to replica number two is probably doing you a favor, and it seems we should be able to do that in NSM: quietly and seamlessly reconnect a stateless connection to another replica that provides the service you're looking for. Is that something that would be useful to you? Yeah.

B

That's exactly the case. And I think usually, with such protocols, and even with CAPWAP, it's much simpler, because the client makes a retry, so the request comes again anyway. Keeping the state, or moving it to another stateless element, would be even better. But that's more or less easy, because a failing pod knows it has failed, and the problem is sometimes.

E

Sometimes it knows it's failed. It knows it's failed for certain kinds of failures, right? So in your example of "my IPsec tunnel is down on the gateway," there, you know you've failed. But if something went wonky on the physical server where your node is, and so the node went down ungracefully and the pod just disappeared, then it strikes me that it would be a favor to you, if you were truly stateless, that when your CAPWAP sends another frame,

E

We just, having realized that that one is down, put you onto a new VXLAN tunnel that takes you to another replica, and you continue to get serviced. So instead of having to think inside the CAPWAP server and do a lot of logic, you essentially get a very brief period where it's just not working, followed by working again, and it just works, right? It just looks like a blip of packet drop, followed by packets no longer dropping.

A

Kubernetes does provide some of the primitives that we can rely on. So, for example, if it's a pod, we can use readiness and liveness probes to.

G

B

I have liveness, readiness or whatever other state from the application on one pod on the right-hand side. Let's look at the picture. I'll share again. Yep, I've shared again.

B

This part here, or this connection: let's say it's failing, maybe not because the pod itself fails, but for whatever other reason this connection is not there anymore.

B

This should lead to "I don't accept more frames" on the left-hand side, because otherwise they will send traffic and more traffic, and they can have the impression everything is right, but the state of this connection is gone. That's actually the practical use case, even with the stock equipment at the moment: it will still try to send traffic even if there is a redundant second copy in another data center, etc., because they don't know that this connection on the right-hand side has failed. They still put traffic in here. So basically...

E

What you're saying is: think about this from the point of view of the CAPWAP pod. I find it very useful to think about local points of view. From the point of view of the CAPWAP pod, it does not matter why the connection it has to a gateway service is not working. Why it's not working is not its problem. What matters is that it is no longer a good connection.

E

It needs to know that, right, because it may have steps that it needs to take in order to behave properly, and in your case you're saying the step is to refuse to accept more incoming Wi-Fi or wireless frames, which is fine. But I'm a firm believer in localizing intelligence wherever possible and limiting the need for global vision wherever possible, and so I think all the CAPWAP pod needs to know is "I no longer have a connection to a gateway", right?

B

You say "I'm not ready anymore", whatever the reason for that, let's say the gateway dying as well, and "I don't have a connection anymore" on that account.

E

If the gateway, for whatever reason, is no longer showing as live, which can include it declaring itself not live because it's lost its outgoing connection, or could include somebody taking a sledgehammer to the node it was on so that pod just isn't there anymore, it doesn't matter which one. The NSM essentially notes that it has failed its liveness check, and having failed its liveness check means that we need to look at the connections it has and either notify the pod that they are gone or reconnect

E

those connections to someone who is passing their liveness check and who provides the same network service that gateway was providing. Yeah.
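
The behavior described here can be sketched roughly as follows. This is only an illustration under assumptions: `Endpoint`, `Connection`, and `handle_liveness_failure` are hypothetical names, not part of any NSM API, and the liveness result is taken as an input rather than probed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Endpoint:
    """Hypothetical record for a pod providing a network service."""
    name: str
    service: str
    live: bool = True


@dataclass
class Connection:
    """Hypothetical record for one client connection to an endpoint."""
    client: str
    service: str
    endpoint: Optional[str]
    stateless: bool


def handle_liveness_failure(
    failed: Endpoint,
    endpoints: List[Endpoint],
    connections: List[Connection],
) -> List[Tuple[str, Connection]]:
    """For each connection on the failed endpoint, reconnect or notify.

    The reason for the failure is deliberately ignored: only the failed
    liveness check matters.
    """
    actions = []
    for conn in connections:
        if conn.endpoint != failed.name:
            continue  # connection is not affected
        # Look for another live endpoint offering the same network service.
        replacement = next(
            (e for e in endpoints
             if e.live and e.service == conn.service and e.name != failed.name),
            None,
        )
        if conn.stateless and replacement is not None:
            conn.endpoint = replacement.name  # quiet, seamless reconnect
            actions.append(("reconnected", conn))
        else:
            conn.endpoint = None  # tell the client the connection is gone
            actions.append(("notified", conn))
    return actions
```

In this sketch a stateful connection is always notified rather than moved, which reflects the idea that only a truly stateless client can be reconnected without its knowledge.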

A

Yeah, I think the way that I'm looking at it, there are multiple areas, and this would be something that whoever is designing this particular path would have to decide on. So if the IPsec connection goes down, is it possible for it to resolve the connection issues itself, open up a new IPsec channel, and then silently deal with it? That's one option.

A

Another option would be to signal its upstream connection, saying, "I no longer have connectivity, I'm going to go away now, and you make a new request." Making the request for a new connection or passing the error upstream would be the decision of the next hop up. In essence, imagine that the IPsec path itself that you had there was like another NSM. In the context of that connection, it doesn't know anything about

A

what's above it, other than a limited amount of state and metadata that's been passed to it, and so it doesn't actually have the context to deal with it. But the next hop up may have that context, or the next one up from there. So you have to pass that information up until you get to a point where something can make the decision, like "I want to retry and reconnect", or "we should fail up the entire chain".

A

We should fail up the chain, and eventually you hit the customer, where you might fail the connection in the worst-case scenario. I think we still want to capture statistics on all this stuff. If we see IPsec tunnels dying all the time, we definitely want to know this, and so the tracing is still important but, like you said, orthogonal. But the actual decision itself, to me, it sounds like

A

we want to hand the decision to whichever service has the best context to deal with that, and I think the most effective way is to just keep handing it up until such a service is found.

E

It's almost analogous to how you throw exceptions up the line until you meet the person who actually knows what they're doing. Yeah.
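
The "keep handing it up" idea can be sketched as a chain of hops, each of which either handles a failure or passes it to its parent. This is a minimal sketch under assumptions: `Hop`, `report_failure`, and the event strings are hypothetical names chosen for illustration, not NSM primitives.

```python
from typing import Callable, Optional, Tuple


class Hop:
    """Hypothetical element in a chain of network services."""

    def __init__(
        self,
        name: str,
        parent: "Optional[Hop]" = None,
        can_recover: Optional[Callable[[str], bool]] = None,
    ) -> None:
        self.name = name
        self.parent = parent            # next hop up the chain, if any
        self.can_recover = can_recover  # says whether this hop has the context

    def report_failure(self, event: str) -> Tuple[str, str]:
        """Handle the failure here if possible, otherwise hand it up."""
        if self.can_recover is not None and self.can_recover(event):
            return (self.name, "recovered")
        if self.parent is None:
            # Top of the chain: worst case, the connection itself fails.
            return (self.name, "connection failed")
        return self.parent.report_failure(event)


# Example chain: ipsec -> capwap -> client
client = Hop("client")
capwap = Hop("capwap", parent=client,
             can_recover=lambda event: event == "ipsec-tunnel-down")
ipsec = Hop("ipsec", parent=capwap)
```

Here `ipsec.report_failure("ipsec-tunnel-down")` stops at the `capwap` hop, while an event nobody recognizes travels all the way to the client, much like an uncaught exception.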

B

One approach would be: the IPsec pod is not ready anymore, it loses its readiness. It's still live, but not ready. And then, if you can propagate "I'm not ready" or "the readiness is gone", then, yeah, as I just said.

A

So with the management channel that we're using at this particular point, we've built up, through protocol buffers, the management path: how do you make a new connection, and so on. One of the things that we need to build out as well is passing some of this state information back up about a connection. We haven't added any primitives for that just yet, so, I mean, this is exposing a hole in that area.

A

I think it's not that we didn't think about it; it's that we haven't gotten to it in our development cycle. It's on the agenda. What we're using is gRPC, which has a bidirectional streaming mechanism, and so in this particular scenario it sounds like what we need to do is ensure that there's some way that we can communicate this information to and from each service pod,

A

I guess we'll call them, in order to make sure that this information can get propagated up and that it's understood what those errors are and what they mean. And perhaps we may even be able to add a little bit of additional functionality as well.

A

If the client says, "I want to connect to an IPsec network," and that's handled by another network service mesh, and it dies, we could potentially even add some functionality to say, "Hey, if this thing dies, don't even bother returning back; just try to connect to another one, and only return to me if that fails."

E

Yeah, that's kind of the point I was suggesting for stateless connections, because for a stateless connection it's entirely possible the network service mesh itself can handle the error without disturbing the pod that has its connection to the network service that went away. Yeah.

A

But ultimately, that decision needs to be made by the client requesting the service, and the client should specify that: "I want one of my failure strategies to be to retry with a new connection on failure."

E

That's absolutely true, right, because the client is the one who knows whether a reconnect on failure is going to be a problem or not. Again, it's about locally available knowledge. CAPWAP understands whether this is okay or not, and so it should indicate whether that kind of thing is okay when it requests the connection, right?

A

And because it's on a per-connection basis as well, you can have multiple connections from a single client. That means you could set up, based on your SLA requirements and so on, exactly what you need. Even if it's the same data path, perhaps for one customer you have a different recovery strategy because of contractual obligations, and that could be added into your intent, so you can drive the service accordingly.
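
A per-connection failure strategy, chosen by the client at request time, might look something like the sketch below. Everything here is an assumption made for illustration: `FailureStrategy`, `ConnectionRequest`, and `plan_recovery` are hypothetical names, and no such fields existed in the NSM API at the time of this discussion.

```python
from dataclasses import dataclass
from enum import Enum


class FailureStrategy(Enum):
    """Hypothetical recovery options a client could ask for."""
    NOTIFY = "notify"        # tell the client the connection is gone
    RECONNECT = "reconnect"  # quietly retry against another replica


@dataclass(frozen=True)
class ConnectionRequest:
    """Hypothetical request carrying the client's intent per connection."""
    client: str
    network_service: str
    on_failure: FailureStrategy = FailureStrategy.NOTIFY


def plan_recovery(request: ConnectionRequest, live_replicas: int) -> str:
    """Pick the recovery action for one failed connection."""
    if request.on_failure is FailureStrategy.RECONNECT and live_replicas > 0:
        return "reconnect"
    # Either the client asked to be told, or there is nothing to reconnect to.
    return "notify"


# Two connections from the same client with different recovery strategies,
# for example because of different contractual obligations per customer.
gold = ConnectionRequest("capwap", "gateway", FailureStrategy.RECONNECT)
basic = ConnectionRequest("capwap", "gateway")
```

Because the strategy lives on the request rather than on the client or the pod, the same CAPWAP pod can hold both `gold` and `basic` connections over the same data path.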

B

That would be what we promote anyway. So in the intent you already know the SLA, etc., and it would drive some behavior. So that's exactly what I would like: it's not on a per-pod basis; we have it on a per-connection basis. So it's "okay, this connection fails" from NSM, with information on this level, and not, let's say, "a VPP port fails" or something global.

E

The other thing that I would suggest, and it sounds like you're already starting to think this way: one of the things that I've found really profound here is that the very dynamic nature of these L2 and L3 connections in network service mesh opens a whole world of possibility that we just haven't thought of before, because the world was too static, right? So is it useful, for example, to have a connection per client coming out of your CAPWAP? Yeah, why not?

E

It's unthinkable in the way we used to do things. You literally couldn't have imagined it. Exactly.

B

That's what's happening with soft GRE. You may say, per client, you make a soft GRE connection, which is not preconfigured; you're just bringing it up because this client needs to go to another hop, and you can establish that dynamically. That's entirely possible now, and before it was only possible at bring-up time and not at runtime. So absolutely.

E

We've run fabulously over, but I think very productively. So we look forward to having you guys get more involved with the community going forward, and thank you so much for bringing this use case, particularly on four minutes' notice, because I think it's been a very productive one. Okay.

A

And just to circle back to your first question in terms of involvement: I think it may be a little bit too early for junior developers who want to implement a service to jump in at this particular time. I mean, if they want to help with building out the network service mesh, yeah, we have plenty of tasks, and part of what I want to do is help people, even if they're very junior, to learn how to contribute and become effective contributors.

A

If you have the bandwidth to help build this out, fantastic. If you don't have the bandwidth to help build it out, the use cases alone are invaluable, so don't feel bad if you don't have that at this particular time. On timelines, I don't have a good answer at this particular moment, other than that, to me, this is a high priority.

A

I want to get this up and running as fast as possible, but I want to temper that with making sure that we get it right. That's why I can't really commit to say, "Hey, this should be usable by October or November," or some sort of specific date. So I apologize about not having something concrete on that side.

B

So that means, for the time being, in the current state, the environment is not even usable, and it's just about defining the APIs at the moment? Or is it already usable in a very limited scope? Can you push a packet from A to B already? That's the question, as you can imagine.

A

So we're targeting VPP as the first SDN. At this point we haven't built the backend for that just yet, so that's where we're at. From a production perspective, unless you're willing to build out such a component, it's not really ready yet. Right now

A

we do have the capability to push information into network service mesh in terms of some of the intent and some of the state, and we're doing that through what are called Kubernetes CRDs, which are basically an extension to the Kubernetes API. So we're going through the CRD path in that scenario, but we also have some protocol buffers that have been specced out.

A

We have code generated, and we have a service that's up and running and injecting things into queues, but there's nothing on the other side of those queues yet. So we're still in the process of actively filling out the core functionality.

E

I mean, the net thing I would say is that there are some placeholders right now for the APIs. Don't take them too seriously; we just needed something as a placeholder. We're building out the infrastructure to be able to handle those APIs, and my suspicion is that once that infrastructure is in place, there will be some rapid iteration as we get very serious about those APIs.

A

And part of what's going to happen as well is that once we're happy with the data injection, and we're just about done with that particular path in terms of the CRDs, the next step immediately after that is: let's start building out a VPP provider and demonstrating it.

A

So we should have the ability to push packets very soon, especially with the expertise that we have and the access to resources that we have in the team. We should have something relatively quickly, barring any major stoppers that we find. It's just not quite ready to be demoed.

E

The other thing I will mention to you is the way NSM looks at the actual CNFs. You can sort of divide them into two classes immediately. There are what you might call, from NSM's point of view, smart CNFs. These are the CNFs that are intelligent enough to participate in the conversation with the NSM. And then there are what you might call the dumb CNFs. These are the CNFs that are not smart enough to participate in the conversation with the NSM.

E

They expect a static set of interfaces presented to them, and for those we intend to write an init container that would take a config map and set up whatever the dumb CNF needs. It sounds like you guys are wanting to be smart CNFs, I think.

B

We definitely are. It's basically like building smart cloud-native applications.

E

I totally get it, and I'm happy you guys want to be smart CNFs. The reason we have to make sure we support both is that we get a lot of feedback from some of the larger operators basically saying, "Our vendors' CNFs? They aren't even dumb yet."

B

It's like having a proxy: if you have an NSM-unaware VNF, you need to have a proxy in front of it. So I think the need is there, but as you see with the open VNF group, we are about the VNFs themselves. So basically we see ourselves as a VNF or CNF provider, so we can make them as smart as we want.

B

There is no vendor, you know; there shouldn't be a vendor CNF there, because we are about to build the CNFs. Obviously, if something comes in from a system integration perspective, if you say, "Hey, but we have this vendor CNF or VNF still here," then from a system integration perspective, okay, this can be represented by an interface. But if you talk about our CNFs and VNFs, they would always be smart. They are already directly consuming Kubernetes resources, they are pushing metrics in Prometheus

B

format, so they do all the cloud-native things already, and if it's possible, they do NSM, of course, as well. Otherwise it doesn't make sense to have a CNF at all.

E

And just really quickly, you were talking about a plugfest. Was that the OPNFV plugfest, or a different plugfest?

B

No, it's a local research project which is state funded here, and it's about bringing together partners, creating plugfests for themselves, all six partners: universities and other vendors, etc.

E

Is there something I could Google to find out more about it?

B

Again, Malaya and.

B

And with OPNFV we have not much contact. We contacted Dan Cohen about one and a half years ago, and due to a lot of circumstances

B

we lost a bit of traction here because of real projects to deliver. We would like to influence or work with OPNFV as well, but we don't like to orchestrate OpenStack, so it should be...

E

I feel you. All right, cool. I do have to run; it's been such a pleasure, gentlemen. Okay.

B

So I have to run as well. Thank you guys, it was a pleasure to meet you, and we'll go forward from here and discuss what that will mean for us, and thanks a lot for the information. Yeah.

A

Thanks for stopping by on such short notice, and just so you know, I had a conversation with them about 10 hours ago for the first time, and I thought...

E

That's why I poked him, you know, because I saw your conversation where you poked me. So I do appreciate it. Okay.

A

Yeah, so we definitely appreciate the short notice and the amount that you've done up to this point. It's strongly appreciated. And with that, thank you both, and you both have a good day or, in your case, a good afternoon or night. Yeah.

B

Weekend time, yeah. Okay, and see you on IRC then, guys. Definitely. Take