From YouTube: CNCF Research End User Group: Cilium and eBPF, Raphaël Pinson, Isovalent (September 21, 2022)
A: All right, so welcome everyone. We are finally restarting the regular sessions for the research user group, and this was a topic, Cilium and eBPF, that we had had on the list for a while and that had been suggested before. We got Raphaël from Isovalent, who was nice enough to join us today, so we'll have a presentation. Usually the format is about half an hour, and then we leave plenty of time for questions, but I guess if people have questions during the presentation, that should also be okay. So yeah, thanks again, Raphaël.
B: All right, is that fine? Yep, okay. So yeah, let's talk about Cilium. I'll put Cilium in context first: eBPF in general, Cilium and eBPF, and the tools around Cilium. My name is Raphaël; I'm a Solutions Architect at Isovalent, based in Switzerland. As part of my job as a Solutions Architect, I do customer support on Cilium, Hubble and Tetragon, and I've been involved with Kubernetes and the CNCF for several years. I actually co-organize a CNCF meetup in the French-speaking part of Switzerland.
B: The different subjects we can touch on today are Cilium and eBPF in general, then different features of Cilium (networking, cluster mesh, security, observability), we'll move on to service mesh, and I'll leave a little bit at the end for Tetragon, which is not strictly tied to Cilium itself, but we usually ship them together, at least in our enterprise solution.
B: So first, Cilium and eBPF, a quick introduction. I'm part of Isovalent, and Isovalent is the main contributor to Cilium. Cilium is part of the CNCF, an incubating project at the moment, and is based on eBPF, a technology that has been in the Linux kernel for years and is now actually being ported to other kernels; Microsoft is doing some work to port it to Windows, for example.
B: So what is Cilium these days? Cilium is first and foremost a CNI, though it actually started before Kubernetes was widespread, or even announced, I think. It provides pod-to-pod, pod-to-service and node-to-node communication, but it can also implement Kubernetes services as a replacement for kube-proxy, as well as extra features such as multi-cluster, or VM integration, which is an extension of the multi-cluster feature.

On top of this, it implements network policies, using either the standard Kubernetes NetworkPolicy resource type or the specific network policy CRDs provided by Cilium for advanced features. It is identity-based, and this is something very important: there is a deduplication of identities that happens in Cilium, which helps both performance and the observability layer. It supports encryption, pod-to-pod encryption using IPsec or WireGuard, and there's also observability, mainly linked to the Hubble component, providing metrics, flow visibility, and service dependencies in the form of a service map.
B: eBPF essentially makes the kernel programmable: currently, in our case, the Linux kernel, and in the future it may extend to other OSes. The idea is that you can have many programs that are typically injected into the kernel as bytecode. eBPF bytecode is usually written in some subset of C, like the example here, which is compiled into bytecode and then injected into the kernel, and the kernel will verify that the program is acceptable: there is a strict verifier that checks both the security and the stability of that program.
B: And then, once the program is accepted into the kernel, it is compiled into machine code, so it's as performant as the kernel itself, and you can attach it to different events in the kernel, which makes a lot of sense given that the kernel itself is essentially an event-driven program. So, for example, with eBPF, a process would call the execve syscall to start a new process, and that syscall would trigger the eBPF program attached to it.
B: There are several fields of application that are quite natural for eBPF. Observability is obviously one of them, because you get to observe anything that happens in the kernel in the form of syscalls, kprobes, uprobes, lots of different things, or network events. Another field of application is security: if you can track and analyze syscalls, and even block them, that makes a lot of sense. And the last one is networking. I say the last one, although in the case of the Cilium project this is where the project started.
B: Actually, with networking it's a little bit less obvious, but with eBPF you can also bypass some native networking stacks in the kernel and actually re-implement them. There are different ways in which eBPF programs can act on events in the kernel: you can typically attach yourself to events such as a file being read, or the I/O reads happening directly on the disk, and do some observability or take some action on them.
B: Uprobes, system calls, tracepoints, sockets and so on; and there are even ways to interact directly with the hardware using XDP, for example (eXpress Data Path), which allows you to program hardware network devices to do load balancing. So you could capture packets arriving on an interface in the kernel and redirect them directly to a network card for hardware load balancing, which can give amazing performance compared to what the kernel itself can do.
B: So Cilium is used by a lot of different companies. This is just an extract from the USERS.md file in the Cilium repository, which lists lots of names of organizations that use Cilium and what they do with it.
B: At the moment, when we talk about Cilium, there are essentially three products that are part of the Cilium project and, in this scope, part of the CNCF. There's Cilium itself, the CNI and networking solution; there's Hubble, which is the observability component; and Tetragon, which is the most recently open-sourced component and is responsible for runtime security as well as security observability.
B: The Cilium networking component, the CNI itself, is responsible for the networking side of things: IPv4, IPv6, integration with cloud providers, or BGP if you're doing on-premise. Typically there are two options: either direct routing, using BGP, or an overlay using VXLAN or Geneve. There's SRv6, there are special features such as egress gateway and multi-cluster, the possibility of doing NAT46/NAT64, and the possibility of applying network policies.
B: So L3/L4, that's kind of the standard, all the way to layer 7, using Envoy as a proxy and filtering on DNS entries, either specific DNS entries or with wildcards. There's encryption, which I mentioned a bit before, and load balancing, which can actually be done even outside of Kubernetes: either Kubernetes load balancing as a replacement for kube-proxy, or standalone load balancing outside of Kubernetes, using Cilium as a container in Docker, for example. And on top of this, because we see all the traffic going through, we can actually add a lot of observability, so we observe traffic in the form of flows, but we also get a lot of metrics from all these components.
B: So Cilium itself, the Cilium operator, Hubble, the Envoy proxy, Tetragon and so on.
B
I'll
talk
a
bit
about
tetragon
here.
Tetragon
allows
to
plug
to
a
lot
of
different
kernel
events
and
actually
observe
what's
happening
there.
You
can
get
a
lot
of
observability
flows
from
there,
which
can
be
exported
to
the
same
of
your
choice
same
for
Hubble.
So
essentially
you
get
a
flow
of
Json
events
that
you
can
export
and
process.
B: ...in your tool of choice, or you can use the tools that are provided. Typically, in the enterprise solution of Cilium provided by Isovalent, we have ways of correlating these events. And then there's the service mesh layer, which essentially takes features that we already have in Cilium and that people expect in a service mesh, such as encryption, cluster mesh and observability, and adds to them the features people would also expect in a service mesh, such as an Ingress controller, authentication and traffic management. There's quite a bit of work still being done on this layer, and it will come in Cilium 1.13 and following, so the next release and following.

So let's dive a bit into the networking side of things. Typically on Kubernetes there is no default network layer; there's only a standard, which is CNI, and here Cilium can replace both the CNI layer and kube-proxy. As the CNI it provides pod-to-pod communication, intra-node and inter-node, and by replacing kube-proxy it actually implements the services, which are usually implemented using iptables or IPVS in kube-proxy.
B
So
we
can
have
psyllium
take
care
of
all
of
this,
and
typically
instead
of
iptables
on
every
node
sodium
will
inject
evpf
programs
and
maps
to
to
implement
the
same
features
except
they
they
might
be.
It
might
be
a
little
bit
more
featureful,
because
evpf
programs
are
have
more
possibilities
than
just
iptable
rules,
so
there
is
sorry
I'll
get
back
to
this.
B
There
is
an
agent
a
silly
magent
writing
on
each
node,
and
this
agent
is
responsible
for
injecting
ebpf
programs
and
ebpf
maps
that
are
used
either
to
configure
these
programs
or
to
retrieve
information
for
this
from
this
program.
It's
typically
for
observability,
for
example,
so.
B
And
the
possibility
to
replace
QT
proxy,
typically
on
kubernetes,
when
you
have
a
service,
you
have
a
virtual
IP
that
works
as
an
lcl4
load
balancer
to
pause
in
the
back
end
points,
and
this
is
typically
implemented
using
iptibles
in
Q,
proxy
or
ipvs
and
with
IP
tables.
What
you
get
is
essentially
a
system
of
sieve
where
you
will
go
through
all
the
rules
before
you
find
the
role
that
is
proper
for
you.
That
applies
in
your
case.
B
So
typically,
if
you
have
a
service
with
three
pods
and
in
the
back,
you'll
have
free
role
and
three
rules,
and
the
first
one
will
say
redirect
to
this
part,
to
this
IP
and
33
of
the
cases
and
then
redirect
to
the
second
IP
in
50
of
the
cases
and
then
redirect
to
this
IP
in
100
of
the
of
the
remaining
cases.
B
So
you
see,
this
is
a
very
an
approach
that
is
based
on
filtering
and
when
you
get
to
hundreds
or
thousands
of
services,
it
can
get
really
slow
to
get
to
the
rule
that
you
need
to
apply,
let
alone
just
applying
the
the
rules
whenever
there's
a
new
part
or
whenever
there's
a
new
service
or
whenever
there's
a
modification
Network
policy,
the
whole
stack.
B
The
whole
list
of
iptables
need
to
be
Rewritten,
whereas
with
an
evpf
based
approach,
we
can
have
hash
tables
that
will
link
directly
the
identity
instead
of
the
IP
address
the
identity
of
a
pod,
based
on
a
set
of
labels
that
are
known
from
kubernetes
by
the
edpf
program
and
based
on
this
identity,
you
can
link
directly
to
the
way
it
should
be
routed
to
another
Identity
or
if
this,
this
traffic
should
be
allowed
or
not.
So
typically,
both
routing
and
network
policies
can
be
implemented
in
a
much
more
performant
and
scalable
way.
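As an illustrative sketch (not from the talk), enabling this kube-proxy replacement in a Helm-based Cilium install comes down to a few chart values like the following; the API server address is a placeholder, and exact value names can vary between Cilium versions:

```yaml
# values.yaml for the Cilium Helm chart (hypothetical minimal example)
kubeProxyReplacement: strict   # let Cilium's eBPF datapath implement Services
k8sServiceHost: 10.0.0.1       # API server address, needed once kube-proxy is removed
k8sServicePort: 6443
```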
B: One example of an interesting feature provided in Cilium is the possibility of having an egress gateway, to access a workload, typically a database, for example, outside of the Kubernetes cluster. So we have a CRD, called the egress network policy (the Cilium egress NAT policy, actually), that allows you to target some pods in the Kubernetes cluster and say: when traffic comes from these pods and accesses this CIDR outside of the Kubernetes cluster, I want it to go through this specific IP, or this specific node of the cluster. This allows the application outside of the cluster to recognize the IP the traffic is coming from. So typically, if you have a firewall in front of this application, you can filter on this IP. Otherwise, you don't know exactly which IP the traffic would be coming from, right?
B: It could be from the node itself, or it could be directly from the pod if you're using direct routing, but it would be really hard to identify which application is reaching out to this external application outside the cluster. There's actually a mode provided in the Isovalent Cilium Enterprise distribution which allows for a highly available egress gateway: instead of having one IP to exit the cluster, we have several, and Cilium can load-balance between them, and fail over as well in case one of the nodes becomes unavailable.
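As an illustrative sketch (not shown in the talk), such a policy can be expressed with the egress gateway CRD; in recent Cilium versions the resource is called CiliumEgressGatewayPolicy (older releases used CiliumEgressNATPolicy), and all names, labels and CIDRs below are placeholders:

```yaml
apiVersion: cilium.io/v2
kind: CiliumEgressGatewayPolicy
metadata:
  name: billing-to-legacy-db        # hypothetical policy name
spec:
  selectors:
    - podSelector:
        matchLabels:
          app: billing              # pods whose egress traffic is matched
  destinationCIDRs:
    - 192.168.100.0/24              # network outside the cluster, e.g. the database
  egressGateway:
    nodeSelector:
      matchLabels:
        egress-node: "true"         # node whose IP the external firewall allows
```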
B
In
order
to
integrate
with
rules
as
well
like
security
groups
or
other
metadata
from
the
cloud,
so
typically,
we
have
a
component
called
the
sodium
operator,
and
while
the
agent
is
a
demon
set
running
on
every
node
in
the
cluster,
the
operator
is
a
deployment
that
has
a
credentials
to
connect
to
the
cloud,
and
it
will
allow
typically
to
play
a
role
in
the
ipam
for
cilium.
B
This
operator
also
has
a
role
to
do
some
garbage
quality
in
the
cluster.
In
the
case,
for
example,
one
node
is
removed.
Obviously
the
agent
on
this
node
is
not
there
anymore
to
remove
stuff
from
the
from
the
API
server,
so
the
operator
can
do
this,
there's
a
possibility
to
do
cni,
chaining,
so
using
one
cni
as
the
base
layer
and
then
deploying
psyllium
on
top.
Some
people
want
to
do
this.
B
It's
more
and
more
rare
I
would
say
the
typical
cases
because
they
don't
want
to
lose
possible
support
from
the
cloud
provider
by
using
something
else.
Fortunately,
psyllium
is
getting
more
and
more
common
and
supported
by
Cloud
providers,
so
it's
not
necessary
to
use
cni
chaining
anymore
in
a
lot
of
cases.
B
That's
what
the
networking
part
one
specific
feature
that
is
interesting
is
the
possibility
of
meshing,
different
clusters
at
l3l4.
B
So,
typically,
we
can
have
clusters
that
have
different
commences
distributions
different
versions
even
and
you
want
to
match
them
directly
at
the
l3l4
layer.
B
This
is
possible
using
psyllium,
and
the
way
this
is
done
is
that
when
you
activate
cluster
mesh
on
a
cluster,
psyllium
will
create
a
new
API
server
on
on
this
cluster,
and
this
CPA
server
will
be
used
by
the
agents
on
the
other
cluster
to
get
information
read
only
from
the
first
cluster,
so
typically
they'll
be
able
to
share
service
Discovery
information,
the
the
services
and
the
back
ends
the
the
end
points
for
the
services
and
to
perform
load
balancing
between
the
different
clusters.
B
You
can
also
share
Network
policies,
so
typically,
you
can
have
Network
policies
with
labels
that
labels
that
typically
Target
one
or
the
other
clusters.
So
you
can
say
this
backend
from
this
cluster
is
allowed
to
talk
to
this
front-end
or
the
other
way.
Actually,
this
runtime
from
this
cluster
is
allowed
to
talk
to
this
backend
on
the
other
cluster,
and
this
allows
for
cross-cluster
Network
policies.
Encryption
can
also
be
extended
between
the
two
clusters,
as
well
as
obviously
routing,
otherwise
it
wouldn't
work.
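As an illustrative sketch (not from the talk), a cross-cluster rule can reference the cluster a peer lives in through the io.cilium.k8s.policy.cluster label; the app labels and cluster name here are placeholders:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-from-cluster1       # hypothetical policy name
spec:
  endpointSelector:
    matchLabels:
      app: backend                  # pods this policy protects
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
            io.cilium.k8s.policy.cluster: cluster1  # only frontends from cluster1
```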
B: Another option is to have several clusters with shared services. So here I have a service that is global across the three clusters, but I don't actually have pods on cluster 1 and cluster 2 to implement it; the pods are only on the shared-services cluster. Typically, one use of this is having a stateful cluster where I have, say, a database running and I can't easily scale this cluster, so I'm keeping it...
B
As
a
as
a
buff
in
my
architecture,
whereas
I
might
have
stateless
clusters
or
clusters
that
are
dedicated
to
stateless
applications,
that
I
can
scale
I
will
on
different
availability
zones,
for
example,
or
even
different
providers,
and
then
I'll
have
a
virtual
service
here
that
provides
access
to
my
database
and
this
service
will
be
made
Global
in
a
cluster
mesh
approach
so
that
it
actually
points
to
the
stateful
cluster
foreign.
B
It's
also
possible
to
integrate
with
service
mesh
so
typically
having
an
Ingress
in
front
of
each
cluster
and
then
making
sure
that
the
backend
service
is
always
accessible
using
either
the
local
pods
or
the
remote
pods,
and
in
the
latest
version
of
selenium
you
can
actually
specify
if
you
want
a
local
or
remote
Affinity.
So
if
you
have
a
local
Affinity,
you
will
say
preferably
one
access.
The
service
I
want
to
access
the
local
pods
on
my
cluster.
If
there's
no
such
bus
then
go
to
the
other
cluster
to
fulfill
the
service
access.
B: And that's the same for remote. It looks like this: it's actually a normal Service definition, except it has annotations for the global service and for the service affinity. There are actually other options available for this; this is just to give an example.
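As an illustrative sketch of such a definition (not copied from the slide; the service name and port are placeholders), a global service with local affinity looks roughly like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: rebel-base                        # hypothetical service name
  annotations:
    service.cilium.io/global: "true"      # expose this service across the cluster mesh
    service.cilium.io/affinity: "local"   # prefer local backends, fail over to remote
spec:
  selector:
    app: rebel-base
  ports:
    - port: 80
```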
B: Let's dive into security. Like I said, security is very important in Cilium, and it's clearly based on identity. I've shown you how Cilium can use a concept of identities that is native to it and that associates the labels of Kubernetes workloads directly with their identity.
B: This identity, in the case of VXLAN, can actually be transmitted encapsulated in the network packets going from one node to the other, so that the identity is propagated between nodes, typically for observability and network policy applications. In the case of direct routing it works a little bit differently, but the idea is that every time a network flow arrives on a node, there is knowledge of this identity, which allows for advanced observability and network policy enforcement based on it.
B
Based
on
this,
we
can
have
three
layers:
three
levels
of
Network
policy
application,
either
l3s
or
just
connection
between
two
pods
L4
connection,
plus
the
port
or
protocol
TCP,
UDP
and
L7,
which
allows
to
actually
parse
the
the
application
networking
layer
and
extract
the
the
protocol.
The
application
protocol
and
filter
on
this,
so
typically
you
can
allow
HTTP
get
on
slash
public.
That's
an
example,
and
in
this
case
it
will
actually
go
through
an
Envoy
proxy
provided
with
cilia
money
on
every
agent.
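As an illustrative sketch (not from the talk), that GET /public example would map onto a CiliumNetworkPolicy roughly like this; the app labels are placeholders:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-public            # hypothetical policy name
spec:
  endpointSelector:
    matchLabels:
      app: my-api                   # pods this policy protects
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend           # allowed client identity
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:                   # L7 rules are enforced by the per-node Envoy
              - method: GET
                path: /public
```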
B: So typically, when traffic is exiting the cluster, at the moment we have a toFQDNs rule, and you can actually use either exact DNS names or even wildcards. The way it works is that Cilium will cache the answers from kube-dns or CoreDNS and allow connections to the IPs that are known based on the resolution that was cached. There's a high-availability option for the DNS proxy, so that when the agent gets upgraded, or if the agent crashes, which shouldn't happen, the DNS proxy continues to work; this option is available in the Isovalent Cilium Enterprise distribution.
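As an illustrative sketch (not from the talk), a toFQDNs egress rule typically pairs a DNS rule, so the DNS proxy can observe and cache the lookups, with the FQDN allowlist; labels and names are placeholders:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-github-egress         # hypothetical policy name
spec:
  endpointSelector:
    matchLabels:
      app: fetcher                  # hypothetical workload
  egress:
    - toEndpoints:                  # allow DNS lookups via Cilium's DNS proxy
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    - toFQDNs:                      # only IPs resolved for these names are allowed
        - matchName: api.github.com
        - matchPattern: "*.github.com"
```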
B: So, what can policies match on? The standard things like pod labels, namespaces, service accounts and service names; cluster names as well, when using a cluster mesh; DNS names, which I just talked about; CIDRs, either external CIDRs or, in this case, cloud-provider constructs (one role of the Cilium operator is to actually resolve instance labels, subnets or security-group names into CIDRs); and logical entities. A logical entity can be the host on which the pod, the container, is running; typically, it can also be another host in the cluster, which would be the remote-node entity.
B: We can have great observability thanks to eBPF, and we have several ways of accessing it: typically through a CLI, the Hubble CLI, or the Hubble UI, and both of them use a component that we call Hubble Relay, which gathers all the observability flows from all the nodes in the cluster and allows you to filter them and view them, either on the CLI or with the UI. And on top of this, every one of the components (the Cilium agent, the Cilium operator, Hubble, the Envoy proxy, the DNS proxy) provides metrics, which you can send to your favorite observability platform, because they're essentially Prometheus metrics.
B
So
this
is
an
example
of
the
Hubble
CLI.
So
here
we
have
some
Buzz
running
and
with
the
Hubble
CLI,
you
can
see
traffic
going
through
your
cluster.
So
here
we
see
the
DNS
lookup
that
a
pod
is
doing
to
core
DNS.
We
see
the
reply
in
UDP
and
you
see
the
IDS
that
are
associated
the
https
requests,
and
here
we
see
we
have
DNS
visibility.
B
The
the
IPS
are
replaced
with
the
DNS
names
that
are
known
to
be
associated
in
the
cache,
and
we
can
even
see
here
when
traffic
is
blocked.
So
that's
a
network
policy
being
applied
and
here
the
traffic
is
dropped
at
the
same
request.
Tcp
request.
B
So
this
is
the
the
CLI.
This
is
the
UI
for
Hubble.
This
is
the
open
source
version
of
the
UI.
The
isoville
slim
Enterprise
has
a
slightly
different
version
of
it
with
a
few
more
features,
and
essentially
this
uses
the
same
source
of
information
as
the
CLI
except
it
actually
builds
a
graph
of
dependencies
between
the
Bots
between
the
pods,
the
services
that
shows
how
they
actually
communicate.
So
here
you
only
see
gray
lines,
they
could
be
red
lines
as
well.
B
If
traffic
has
been
dropped,
so
we
can
see
where
it's
being
dropped
and
you
can
actually
click
on
the
boxes
to
get
more
information,
and
here,
in
this
case
in
this
namespace,
you
can
see
that
HTTP
visibility
using
the
Android
proxy
has
been
activated
because
you
see
actually
which
HTTP
requests
were
performed
on
which
services-
and
here
you
have
the
flows,
and
you
can
click
on
the
flows
to
get
more
information
on
every
one
of
them.
B
I'll
finish
the
psyllium
side
with
service
mesh.
This
is
the
extension.
This
is
where
we're
going
at
the
moment,
because
we
know
that
people
are
interested
in
this
part
and
psyllium
is
quite
low
level,
but
a
lot
of
people
are
interested
in
features
that
are
a
bit
higher
level
and
the
idea
is
that
synonym
already
has
a
lot
of
the
features
that
people
expect
from
service
mesh.
B
So
one
thing
we've
added
in
stone:
112
is
the
possibility
of
programming
the
envoy
proxy
that
is
already
provided
with
the
psyllium
agent,
using
a
crd,
so
cm112
provides
a
psyllium,
Envoy
config
crd
that
allows
to
program
your
the
envoy
proxy
on
every
node
using
logical
identity.
So
you
can
see
from
the
spots
to
this
Parts
I
want
to
apply
this
as
Envoy
configuration
and
one
layer
of
control
plane
that
we've
added
in
cm112
is
an
Ingress
controller
that
bases
itself
on
it.
B: So essentially, you can now use the cilium Ingress class (if you've activated the Ingress controller, obviously), and with this Ingress class, Cilium will essentially implement the Ingress by dynamically creating an Envoy configuration for that specific Ingress.
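As an illustrative sketch (not from the talk), using it is just a standard Ingress with the cilium class; names are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress                # hypothetical name
spec:
  ingressClassName: cilium          # handled by Cilium's built-in Envoy
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo-service  # hypothetical backend service
                port:
                  number: 80
```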
B: In the future, what we're currently working on is support for the Gateway API, and support for SPIFFE, for a form of mutual authentication directly implemented in Cilium. This is planned for 1.13, the next major release of Cilium.
B: You can already use these two with Cilium as, sorry, as an add-on on top of Cilium, but we're also planning in the future to integrate Istio, so that Istio uses the Envoy config CRD natively as the implementation for the Istio abstraction, instead of injecting an Envoy proxy into every pod like it does currently.
B: All of these also allow you to get metrics and observability in the form of flows, and in Isovalent Cilium Enterprise we actually provide a Fluentd service which allows you to easily export all the flows that we have, either from Hubble or from Tetragon, to your favorite SIEM platform.
B: So the way we're seeing this is just the way that, some decades ago, TCP went from being an external library, to being a standard library, to being embedded in the kernel, so that everyone can use TCP without even thinking about it.
B
We
think
that
a
lot
of
the
features
and
service
meshes
today
that
started
at
as
libraries
that
you
had
to
use
in
your
application
for
instrumentation
are
now
at
the
external
implementation
in
the
form
of
sidecar's
step,
and
we
think
it
could
go
even
lower
into
the
kernel,
obviously
not
patching
the
kernel.
The
way
it
was
done
for
TCP
back
in
the
day
by
using
ebpf
to
inject
this
feature,
so
it
becomes
totally
transparent
for
the
users
and
it's
just
there,
observability
encryption,
Mutual,
authentication
and
so
on.
B
We
also
gain
a
lot,
obviously
in
performance,
because
instead
of
having
one
one
proxy
one
avoid
proxy
per
pod,
which
means
a
lot
of
Android
proxies
running
on
your
nodes.
If
you
only
have
one
per
node,
you
gain
a
lot
in
CPUs
and
CPU
and
RAM.
Typically
so
the
The
View
that
the
idea
is
whatever
we
can
do,
natively
any
BPF
would
try
to
do.
Native
Laney.
C
B
Including
observability
security
traffic
management,
because
we
gain
a
lot
in
performance
and
whatever
we
can't
do
natively
in
BPF,
we
can
still
use
an
external
proxy
for
it,
and
evpf
can
still
allow
to
directly
route
into
this
Android
proxy
provided
per
node.
So
are
we
still
getting
performance
compared
to
a
sidecar,
the
performances?
This
is
graph
that
Compass
performances
based
on
a
proxy
or
the
visibility
directly
in
a
kernel
using
an
EPF
note
that
some
of
the
the
sodium
projects
actually
use
directly
eppf
based
HTTP
visibility.
B
This
is
the
case
of
tetragon,
for
example,
and
it's
totally
possible
in
the
future.
We
might
actually
use
these
libraries
that
already
exist
to
parse
HTTP
or
other
layer,
7
protocols
directly
in
the
kernel
to
gaining
performance
again
so
last
point
on
tetragon
and
system
observability
and
enforcement.
So
tetragon
is
kind
of
a
compliment.
B
The
the
really
nice
thing
about
this
is
that,
because
of
ebpf,
we
can
do
correlation
and
aggregation
of
events
directly
in
the
kernel
using
Maps
so
that
we
export
from
the
kernel
into
user
space
only
what
you
want
to
see
already
organized
already
aggregated-
and
this
is
a
huge
game
in
performance
and
then
based
on
this
in
user
space
you'll
have
the
tetragon
agent
agents,
sorry
that
will
read
from
the
different
structures
in
which
the
BPF
programs
have
written,
extract
the
information
and
make
them
available
as
metrics
or
as
flows
of
information
flows
of
events
logs
or
traces.
B
Based
on
this,
you
can
observe
a
lot
of
things:
our
file,
access
network,
namespace
escapes
privilege,
escalation,
access
to
data
on
disk
Network
protocols
and
so
on
and
so
forth.
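As an illustrative sketch (not from the talk), Tetragon is driven by TracingPolicy resources; this one, adapted from the pattern in Tetragon's documentation, hooks the fd_install kernel function to watch access to a specific file, and the file path is a placeholder:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: watch-sensitive-file        # hypothetical policy name
spec:
  kprobes:
    - call: "fd_install"            # kernel function: a file descriptor is created
      syscall: false
      args:
        - index: 0
          type: "int"
        - index: 1
          type: "file"              # extract the struct file argument
      selectors:
        - matchArgs:
            - index: 1
              operator: "Equal"
              values:
                - "/etc/passwd"     # hypothetical file to watch
```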
B: One of the uses that we have (this is an example from Isovalent Cilium Enterprise, where we have a process-tree view that has existed for many years) is the correlation between what's actually running inside a container and the network flows that result from it. So maybe you see in Hubble: hey, there was a connection to this weird thing here, something like not-reverse-shell.com, which is definitely not a reverse shell, right? And you want to see where it came from.
B
You
know
in
Hubble
that
it
came
from
this
pod
and
so
and
this
view
here,
you
can
actually
look
into
this
box,
see
exactly
what
was
executed
in
the
spot,
that
she's
a
citric
on,
and
you
can
see
that
five
minutes
after
this,
the
the
server.js
standard
application,
the
container
started.
There
was
a
shell
that
was
started
and
an
NC
and
then
a
curl
that
connected
to
elasticsearch-
and
you
can
say:
okay,
there's
something
fishy
here,
but
at
least
you
have
a
trace
of
what
was
executed
and
what
gave
what?
D: I don't have any specific questions, but I've seen that before and it's really, really exciting. We are planning, I think you probably know already, actually, to use it in our group, and we've had to implement some very basic features ourselves previously, and this will allow us to get rid of all of that sort of accumulated stuff, I suppose. But yeah, it looks really, really powerful.
D: Well, there are a few things we want it for. I mean, we haven't implemented all of this, or we've got a solution for it, but really basic stuff (well, not basic) like being able to see real source IPs and that kind of thing for our traffic across clusters. The Tetragon stuff's really exciting, because we can use that; we've got a lot of requirements around it. The things we have implemented ourselves are more around the sort of service-meshy kind of stuff.
D: So, concepts like egress gateways and that kind of thing; we use Envoy ourselves and configure it to do this kind of thing, and I've also seen the whole, almost like a sort of global network policy, thing working across clusters, which is really exciting. But we've had to come up with, I guess, we've pieced this together ourselves out of existing Kubernetes primitives and CNCF stuff, and having something like this, which wraps it all up for us, is an attractive proposition.
D: We've used Calico a lot of the time, and it's been okay, but it's exciting to try something that just opens up a whole load of new capabilities. I don't know if you guys have looked at anything like that.
A: Yeah, so actually we discussed this with Raphaël very recently as well, and one of the main things we've been looking at is the possibility of doing cluster mesh. This is because we still push for people to have these kinds of disposable clusters, and to have applications deployed across multiple clusters instead of having clusters that need to be upgraded in place, and things like this, and for stateful workloads this can be problematic.
A: So we've been looking at cluster mesh to kind of expand the boundaries cross-cluster, which should allow us to potentially use this model even for things that have stateful workloads running, like databases, or, I don't know, we have some, even for batch or ML, where you have long-running jobs.
B: So you're considering specifically the shared-services model, like a stateful cluster and then several stateless clusters that would access the services on the stateful cluster in a transparent way, using global services?
A: Yeah, so that's one option, and the other one is even to run the stateful workloads that have multiple replicas across multiple clusters. Even with no connectivity, say, across ten premises, we could consider having replicas in multiple clusters, and eventually just make them also kind of disposable as we add more.
A: Who knows how easy this is to do, of course, but by having a mesh for the pods, and even, one thing, I don't know if you mentioned it or not, but even this restriction that exists today, which I know will disappear, about having non-overlapping subnets across multiple clusters: this is really a big deal for us, because we orchestrate everything centrally. But yeah, it will become simpler.
E: I have a quick question around how it compares to Admiralty, in that sort of feature space.
E: It's sort of like, yeah, so Admiralty is sort of the ability to federate clusters, you know, run one workload from one cluster into another, and it sounded like there were a lot of those types of features, to provide some sort of federation. The use case I'm looking at is sort of on-prem cloud-bursting type things, where you have an on-prem cluster, and then you provide a cloud-bursting capability and sort of stitch those together in some manner.
D: I'm not going to answer that, actually. I think Admiralty was more around sort of federated deployment of applications, rather than accessing them. So that would be your single pane of glass to deploy something that's replicated across different clusters and environments, whereas what we're seeing here is more around, I suppose, the CNI and the network access to applications, decisions that get made post-deployment. I don't think Cilium does any kind of deployment of applications for you.
E: But from what we're trying to accomplish, could you do that bursting type of federated thing across multiple clusters?
A: So actually, that was the other point I wanted to mention that we are looking at, and I already mentioned it to Raphaël, which is exactly that: the ability to do sort of cluster mesh across regions or data centers, which means that you won't necessarily have node-to-node connectivity, because you might not have a VPN or something that would allow this. The way Admiralty works, I'm also not an expert...
A: ...but my understanding is that it kind of has gateway pods that then communicate with the remote ones. So it's kind of similar to what...
A: Yeah, but it's really targeting not only the services, but also the kind of batch workloads, where you can submit pods to one cluster and then the actual workloads run in a remote cluster, and they're kind of tracked through proxy pods in the local cluster. Potentially we could do this transparently at the cluster level, like we do with cluster mesh, but we would need this kind of cross-boundary communication without necessarily having a VPN or something like that. I think that's kind of the way Admiralty tends to do it, so yeah.
A: Yeah, but maybe we take this, I had one more question for Raphaël, but maybe we take this as a kind of action item on the group, and also on Cilium, which is to track how feasible this model is with Cilium: to kind of burst not just across clusters, but across network boundaries.
A: The other question I had was: Jamie also mentioned that they're using Calico. Is there anything that is worth knowing if people were to move from Calico to Cilium? Is there something that would be lost, or that should be considered?
B: No, at the moment it's just different images and different Helm charts, but there's no license per se.
D
I've
seen
in
the
past
with
products
where
actually
it's
sometimes
the
most
annoying
thing
about
using
the
Enterprise
Edition,
even
though
you
beat
them
apart
from
paying,
for
it
is
oh
and
then
you
have
to
provide
license,
keys
and
set
up
some
complex
infrastructure,
especially
if
you're,
on-prem
and
don't
have
internet
access.
Then
it
all
gets
a
bit
difficult.
But
that's.
A: All right. Otherwise, I think, yeah, I guess so. Yeah, Alex, it was a late, late announcement.
A: So I guess I was just trying to summarize what the main motivations are from this community to move to Cilium, and I guess the ones we got here were the ability to do cluster mesh, potentially doing hybrid bursting things, and then, Jamie, you mentioned, because you have quite a lot of clusters as well, is this something that you're also looking at?
D: Things like access to the more global kind of network policy, and then, I was curious, the other thing is the Tetragon stuff, right. That's okay.
C: Yeah, and just to note something on this: my team has started looking at that directly, Tetragon and Cilium, as of last week, from a security standpoint.
A: Nice. Maybe I have a related question, which is regarding our load balancers.
A: We have a different solution there, because you mentioned tracking the source IP, and we always have this issue that if you have, say, TLS that just passes through the load balancer to the backend, you lose the initial IP, and we've started enabling the proxy protocol to be able to propagate the source IP of the client to the backends.
D: I've tried, and failed, to do that, actually, using the externalTrafficPolicy.
A
We,
okay,
that
that's
what
we
are
doing
and
actually
that
works.
What
we
were
losing
was
the
source
IP
in
the
lb,
because.
E
A
Of
TLS,
but
with
proxy
protocol
and
using
something
like
engine
X
for
for
the
Ingress.
This
is
all
working
and
for
dlbs
we're
actually
using
aha
proxy
with
a
kind
of
active,
passive.
B: With the external VIP, I couldn't say at the moment. One thing I've seen is keeping the source IP, but that's kind of a different situation: what you have there is the possibility of actually doing the routing directly, or keeping the IP address when accessing the service, and there are two ways of doing this. There's either DSR or... what was the other one? I don't remember.
B
Actually
so
there's
ways
to
so
so
DSR
will
make
it
so
that
the
backend
will
actually
reply
directly
to
the
to
the
source
safety
with
a
routing
without
doing
an
S9
so
going
back
through
the
node
in
which
and
entered,
but
that
doesn't
really
solve
your
situation
right
because
you
want
to
go
through
an
L7
proxy
right.
B: So we have HTTP or HTTPS visibility, but Cilium itself, using eBPF, as far as I know, doesn't handle the proxy protocol at L7, but...
A: But in this case it would actually be enough, because this is, like, pure TLS, and the proxy protocol is really just a blob at the start of the binary packet, and this blob is understood by some implementations of...
A: Thank you. All right, I don't have anything else, so last chance for a couple more questions for Raphaël.
A: Otherwise, thanks a lot again, Raphaël, and especially for the immediate reaction to the call; that was pretty awesome. Yeah.
A: Yeah, and we'll keep track of how people start using Cilium, also in the research departments.
A: We'll have the next meeting in two weeks, I think. I'm always confused about this, first and third, but I think it's two weeks. So yeah, we'll circulate the topic in a bit, and then maybe next time we also prepare the topics for at least the rest of the year.
D: Yeah, we need to refresh the backlog; we tried with Robert recently.