Description
Leveraging Cilium and SRv6 for Telco Networking - Daniel Bernier, Bell Canada
In this session, Daniel Bernier will demonstrate how Cilium and its eBPF data plane were extended to support telco networking requirements in a cloud-native way. He will show how Cilium can provide network segmentation and multi-VRF support with or without the use of multiple interfaces. With this new approach, he will also explain how to build a simple multi-cluster VPN, or a simple integration with an MPLS provider network, by natively leveraging IPv6 and SRv6.
Daniel Bernier: First, I'm going to start with some problems with telco networking. I think our colleagues from Swisscom were talking about it before, but telco networking, although we would like it to be, is not something we can call typical. Most telcos' networks are built through mergers and acquisitions, and although we would like them all to get consolidated, sometimes that never happens. So we keep legacy devices ranging from even the 70s.
Most telco networks are plagued with legacy technologies which never seem to go away. We still sell the DS1s and the DS0s; the gear is still there, and the tools that come with it are still there. So even though you would like to go completely cloud native, that doesn't change, and the systems that you need to manage them with need to cope with that.
Most telco networks are also bound by regulations, Sarbanes-Oxley, all the legal agreements, which do impose limitations on how you would like to do networking and force you to keep networks segmented and isolated; otherwise it doesn't work. So the big flat network, the single VLAN that covers everything, doesn't work in those kinds of cases.
Most operators also offer differentiated services, and that too implies some reality checks around how you would like to do networking. Sometimes your TV delivery system will not be able to run on the same network as your mobile core; the systems will have to change.
Oh, and there's still that 5G thing that everybody keeps talking about. It's requiring a bunch of new things in our network, and it's also coming with something we call network slicing that we need to think about. So the conclusion is: although we would like them to be, telco networks are not typical, and they're not like enterprise networks.
The question is how we do it. Today I'm not talking about how cloud native is affecting the telco; I'm going to talk mostly about multi-networking and multi-interfaces. Even with all the potential alternatives that exist within Kubernetes, multi-interface is still not a pleasing experience. If you have to deal with it, you either get static route sprawl and forwarding rules that go really crazy on your pods, or you can use VRF CNIs, but most apps still don't understand how to use VRFs properly in the Linux kernel.
There are lots of third-party add-ons available. That's one of the beauties of Kubernetes: it's so malleable that you can add extensions, new features, new modules to it. But that comes with other problems. Now you have to deal with support model challenges: I have a Kubernetes platform with its CNI, then I want to add a third-party mesh, which is not the same, and I want to add another CNI because it does some cool things.
Then you end up with four licenses to support, you lose your single point of contact, and you also have to deal with the compatibility matrix. You want to upgrade your version of OpenShift, but that new shiny mesh you integrated is still not able to follow, so you have to wait. That still isn't fixed. And just for fun, although it does work, have you ever tried to run a VPP-based and a native CNI together on Anthos, EKS or OCP?
So let's try to fight it the other way, because in the end this is what we want: developers should be able to create their applications and their pods, connect them to the network, and not have to worry too much; they expect this to be black magic. How do we make the networking, even if it's complex, simple for them, so that it doesn't become a challenge? That was the part about the multi-interface problem. Now I'm going to talk about technologies that exist in the networking protocols.
You're going to see how they link together and why this became so interesting for us. I'm going to talk about SRv6. Those who know me know I have a pet peeve with SRv6; I've been through this for a long time in the IETF and other initiatives. It's basically a new way of doing routing based on IPv6.
You basically encode the path through your network into IPv6 addresses at the source. It's based on a few drafts and a few standards, one of them being the network programming framework, where you encode instructions in your network based on your IPv6 address scheme, and you can actually create programs or small policies.
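To make the network-programming idea concrete, here is a minimal sketch (not from the talk): a SID is an IPv6 address whose leading bits are a locator that routes you to a node, with the next bits naming the function that node should execute. The /48 locator, 16-bit function field, and the fc00:0:1:: block below are invented for illustration; the actual split is operator-defined.

```python
import ipaddress

# Sketch of an SRv6 SID: LOCATOR (routes to the node) + FUNCTION (instruction).
# The 48/16 split and the fc00:0:1:: block are illustrative assumptions only.
LOCATOR_BITS = 48
FUNCTION_BITS = 16

def make_sid(locator: str, function: int) -> ipaddress.IPv6Address:
    """Pack a function ID into the bits right after the locator."""
    base = int(ipaddress.IPv6Address(locator))
    shift = 128 - LOCATOR_BITS - FUNCTION_BITS  # bits left for arguments
    return ipaddress.IPv6Address(base | (function << shift))

# One node with two hypothetical behaviors bound to function IDs:
END_DT4 = 0x0100   # decapsulate, then IPv4 table lookup (End.DT4-style)
END_DX2 = 0x0200   # decapsulate, then L2 cross-connect (End.DX2-style)

print(make_sid("fc00:0:1::", END_DT4))  # fc00:0:1:100::
print(make_sid("fc00:0:1::", END_DX2))  # fc00:0:1:200::
```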
SRv6 is also not prescriptive, so it doesn't really care how you do the control plane. It does BGP, because it came out of the IETF, but you can create your own SDN controller, you can use a PCE, whatever you feel like. It provides a single encapsulation for both overlay and underlay, and in the end it still leverages plain IPv6 routing, so you can simplify a lot of your network by just doing raw IPv6 routing.
You create various policies, which in IETF terms are called segment routing policies, that help you define how your network is going to treat the packets. The application traffic is steered into those policies via physical or virtual interfaces, 5-tuples, or even richer flow-based mapping such as GTP header remapping. This is how you create what they call segment routing network policies: based on those parameters you create constructs. I can do a Layer 2 pseudowire over an IPv6 header by using End.DX2. I can do Layer 3 VPNs with End.DT46, which is a decapsulation followed by an IP table lookup. And I can also do something that is getting quite a lot of traction right now: remapping GTP into IPv6 SIDs, so you can completely remove the GTP problem, simplify the encapsulation in your network, and then reconstruct it later on. For those who were not at Mobile World Congress, SoftBank did quite a good demo on this.
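On a stock Linux box you can experiment with these same behaviors through iproute2's seg6 support; the following sketch assumes that. It is not the Cilium data path (which implements the equivalent logic in eBPF), and the interface name, prefixes, SIDs, and VRF table number are invented placeholders.

```python
import subprocess

def sh(cmd: str) -> None:
    """Run an iproute2 command (needs root and a seg6-enabled kernel)."""
    subprocess.run(cmd.split(), check=True)

# Headend: steer traffic for a remote prefix into an SRv6 encapsulation.
# A real policy could carry a whole segment list after "segs".
sh("ip -6 route add 2001:db8:beef::/64"
   " encap seg6 mode encap segs fc00:0:2:100:: dev eth0")

# Tail end: bind an End.DT4 behavior to a local SID, i.e. decapsulate the
# inner IPv4 packet and look it up in VRF table 100 (the L3VPN case above).
sh("ip -6 route add fc00:0:1:100::/128"
   " encap seg6local action End.DT4 vrftable 100 dev eth0")
```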
So what we did is look at how we could leverage SRv6 with Cilium, because of the eBPF-based data plane it has, and at how we could construct a new way of doing networking for our clusters. Working with the Isovalent team, we looked at various scenarios; I'm going to do a short walkthrough and afterwards show the demo.
The thing that really makes it work well is if you're able to run in something we call flat mode; in the case of Google Cloud they call it VPC-native, and AWS has the same. You don't constrain your IPv6 addresses inside your pod network: you let them loose, as I would say, and based on this, v6 routing doesn't need to think about anything, not even segment routing, v6 or v4.
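A tiny sketch of what "letting the addresses loose" buys you, assuming each node is delegated a routable /64 by the fabric (the prefix below is invented): pods get addresses carved straight out of it, so plain v6 reachability needs no overlay and no NAT.

```python
import ipaddress

# Hypothetical /64 delegated to one node by the flat-mode / VPC-native fabric.
node_prefix = ipaddress.IPv6Network("2001:db8:a:1::/64")

# Pods get routable addresses straight out of the node prefix: no NAT and
# no overlay; the fabric routes the /64 to the node, the node to the pod.
for i in range(2, 5):
    print(node_prefix[i])   # 2001:db8:a:1::2, ::3, ::4
```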
On the other hand, you can still encapsulate if you want to, or you can use the Cluster Mesh mechanism of Cilium if you want. But as a paradigm, we actually looked at whether a v4 pod can talk to another v4 pod using an SRv6 encapsulation over Cilium; that way you make your underlay network completely v6.
We also looked at a cluster talking to an L3VPN PE, a physical PE in the network, and how to do this from Cilium. The beauty of this is that I have a single interface; I have no multiple interfaces to talk to VPNs. I can still associate my pods with multiple VRFs based on address policies; remember, I was talking about SR policies earlier.
In Cilium we constructed the notion of SRv6 egress policies. Based on matching criteria and a path definition, you say "I need to be attached to VRF 0 and VRF 1" (in the case of our demo), and from the routes that Cilium learns from its BGP neighbors, it is able to construct dynamically the egress policies associated with each VRF. So: single default route, single interface.
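As a rough mental model of that dynamic construction (this is not Cilium's actual code or CRD schema; the types and values are invented): each VPN route learned over BGP carries a prefix, an SRv6 SID, and a VRF, and the agent turns each one into a rule saying "traffic from this VRF toward this prefix gets encapsulated toward this SID".

```python
from dataclasses import dataclass

@dataclass
class VpnRoute:
    prefix: str   # destination prefix learned over BGP
    sid: str      # SRv6 SID advertised with the route
    vrf: int      # VRF the route belongs to

@dataclass
class EgressPolicy:
    vrf: int
    dst_prefix: str
    segment: str  # SID to encapsulate toward

def build_egress_policies(routes: list[VpnRoute]) -> list[EgressPolicy]:
    """Turn BGP-learned VPN routes into per-VRF SRv6 egress rules."""
    return [EgressPolicy(r.vrf, r.prefix, r.sid) for r in routes]

# Two hypothetical routes learned from a PE, one per VRF:
learned = [
    VpnRoute("10.3.0.0/24", "fc00:0:2:100::", vrf=0),
    VpnRoute("10.4.0.0/24", "fc00:0:2:101::", vrf=1),
]
for p in build_egress_policies(learned):
    print(f"vrf {p.vrf}: {p.dst_prefix} -> encap {p.segment}")
```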
You can still have a default route to the internet, or default behavior, with the standard construct of communities, but when you need to go somewhere deep in your network through VRFs, you can use the egress policies of Cilium. It just works: in our case, as I showed there, the pod needs to talk to an application behind an L3VPN PE in the SRv6 domain, and it just goes across. And then the beauty comes with the rest of the integration.
When you look at SRv6 overall, Cilium still does its magic: single route table, single interface, and the egress policy is still there. But, oh, by the way, the destination PE is not SRv6; it's legacy, classic MPLS. In that case the SRv6 architecture allows for gateways that can translate from an SRv6 or IPv6 domain to any other model.
And then I can do service insertion. For those who were at MPLS Paris this year, I did a remote talk about how we integrated physical devices into an SRv6 domain using a service proxy, kind of the logic of an Envoy proxy, or like a proxyless mesh for those who are able to do it, but for physical devices as well. In that case Cilium does the same magic, single interface, but now you can construct the policies either dynamically, through BGP advertisements, or statically, through policies created in Cilium.
And if you look at this, this is the simplicity of interconnecting the clouds in an environment. Today, I need to map VRFs or VLANs from a cluster, make them go to our network, redo the mapping to whatever VRF I need in my network (and I still need VRFs in my physical network), then go to my interconnection, map that back to VLANs, and then I get to the clouds. The clouds don't have multi-VRF, so I need to create a large number of VPCs and try to interconnect them.
If instead my clusters can do a basic IPv6 overlay end to end, the only thing missing is to be able to say: well, public cloud, can you just provide me IPv6? In that case I can do a single flat IPv6 address scheme with my cloud providers, and whatever VRFs or isolation I need will be done directly by the CNI. So, with that, let's go to the demo.
[Demo]
And now you see the encapsulation happening. You see the destination and the source address, 10.1-something-156, going to 10.3.0.1, being encapsulated into SRv6 by Cilium. So I can now use an L3VPN: in that case we did an L3VPN going to a remote PE using SRv6, but again, no multi-interface, no secondary CNI, purely a single one. It's just a matter of having the right BGP policies and the eBPF code to map that back into a dynamic egress policy.
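For readers who want to picture what that capture shows on the wire, here is a hedged Scapy sketch of the same construct: the original IPv4 packet rides inside an outer IPv6 header carrying a segment routing header. All addresses and SIDs are placeholders, not the demo's real values.

```python
from scapy.all import IP, IPv6, IPv6ExtHdrSegmentRouting

# Inner packet: plain v4 pod-to-pod traffic (placeholder addresses).
inner = IP(src="10.1.0.156", dst="10.3.0.1")

# Outer encapsulation as the node would emit it: IPv6 plus a segment
# routing header; nh=4 marks the SRH payload as IPv4 (IPv4-in-IPv6).
outer = (
    IPv6(src="fc00:0:1::1", dst="fc00:0:2:100::")
    / IPv6ExtHdrSegmentRouting(addresses=["fc00:0:2:100::"], nh=4)
    / inner
)
outer.show()
```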
I'm going to stay this way, because, for some reason... So, the next steps. Right now FRR only works with SRv6 using the base IPv6 segment IDs (full 128-bit SIDs), and the next step is working through the community with FRR to also support the micro-SID instruction, a 32-bit block with 16-bit IDs.
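To illustrate the difference (my own sketch; the 32-bit block and 16-bit IDs match the uSID flavor mentioned above, but the block and values are invented): with micro-SIDs, several hops share one 128-bit address instead of each consuming a full SID.

```python
import ipaddress

# Pack a 32-bit uSID block plus a list of 16-bit micro-SIDs into a single
# 128-bit IPv6 address; the remaining space is left as zeros.
def usid_carrier(block: str, usids: list[int]) -> ipaddress.IPv6Address:
    value = int(ipaddress.IPv6Address(block))  # e.g. an fcbb:bb00::/32 block
    shift = 128 - 32
    for u in usids:
        shift -= 16
        value |= u << shift
    return ipaddress.IPv6Address(value)

# Three hops (0x0100, 0x0200, 0x0300) ride in one address:
print(usid_carrier("fcbb:bb00::", [0x0100, 0x0200, 0x0300]))
# -> fcbb:bb00:100:200:300::
```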
The good thing with FRR is that it's part of the big SONiC community, so a lot of people are already working through that effort on SRv6; we're not alone in our world. The second item is optimized SID allocation.
How do we do a good, clean IPv6 addressing scheme? This is purely about figuring out the right model for the addressing scheme, based on how we do IPv6 assignment in Kubernetes. And then there's finishing the work with the Isovalent team on the BGP integration, making it crisp and production-grade; I would say that's the part with the Cilium project, with the team.
If you really want to go crazy and try to burn 10 million containers per second, it will still take you 58,000 years before you actually max out a /64, which is how we do /64 assignments to pods right now. So I think we can afford to burn a few IP addresses to build the network properly, rather than doing NAT, port forwarding and all those things.
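That figure checks out; a quick back-of-the-envelope, assuming one address per container at a steady 10 million per second:

```python
# 2^64 addresses in a /64, consumed at 10 million per second.
addresses = 2 ** 64
rate = 10_000_000                      # containers (addresses) per second
seconds_per_year = 365.25 * 24 * 3600

years = addresses / rate / seconds_per_year
print(f"{years:,.0f} years")           # roughly 58,000 years, as claimed
```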
I think this way we can actually do this. I also added a link to something that came out of the IPv6 innovation work at ETSI, around the design of IPv6 data centers. So, a big thank you to the Isovalent team; they have been ninjas with me up until three o'clock this morning to make the demo crisp and clean, although I messed it up dealing with the screens.
Audience member: Right, okay. First of all, thank you for the great talk. I think we did something similar within Deutsche Telekom, not with segment routing but differently.
Daniel Bernier: You're right, we still do see this, and the question I would ask, though, is why, except for high-speed interfaces, which we can actually do with AF_XDP right now. Actually, with Cilium code we did something like 23 gigs out of a 25-gig interface.
Daniel Bernier: On multi-NIC, it would seem kind of weird to me to have to defend them, but anyway: most of the time it's really about cases with protocols not supported inside Kubernetes.
But if you look at something like this, with a pod talking directly over an overlay, whatever protocol you need to pass through to the pod, Kubernetes doesn't see anything at that point; the pod still needs to handle it. If it was v4, I understand you need to go through services because of the limitations, but with v6, right now, I wouldn't have that. So I think it's more cultural than a really big technical reason.
Audience member: Okay, so my question is: how much do the platform infrastructures support this technology? Because if we rely on this, then it should be everywhere, basically.
Daniel Bernier: In AWS, funny enough, I did this in 2018 and it worked like a charm: from Montreal to Toronto, going through Ohio and us-east, SRv6 end to end, and I never had any problems. So the question is more around the way Cilium is supported.
Actually, Cilium is the default CNI for both Google Cloud now and Amazon EKS, so I would have trouble understanding why this would not work in that case. The only trouble I might mention, though, is the MTU size, but that's going to be the same for everybody: if you have an instance that now has a large MTU because you run telco workloads, but the interconnect doesn't support large MTUs, you're kind of in your own world. So that would be the only case right now.