From YouTube: Deep Dive: Linkerd - Zahari Dichev, Buoyant
Description
Don’t miss out! Join us at our upcoming events: EnvoyCon Virtual on October 15 and KubeCon + CloudNativeCon North America 2020 Virtual from November 17-20. Learn more at https://kubecon.io. The conferences feature presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects.
Deep Dive: Linkerd - Zahari Dichev, Buoyant
In this session, Zahari Dichev will focus on lessons learned, how-tos, and what the future of Linkerd holds.
https://sched.co/Zexn
Hello everyone, my name is Zahari Dichev and I'm a software engineer at Buoyant, the creators of Linkerd. Today we're going to be doing a deep dive into Linkerd, and we are going to focus on the multi-cluster support that has been much anticipated and was released recently in 2.8.
So the agenda for today is the following: we're going to do a quick overview of what service meshes are, why you might need one, and how Linkerd works internally. We're going to cover the multi-cluster concepts that govern the implementation of this feature, and we're going to look at the architecture and how we designed it. And of course we are also going to jump into the terminal.
That way I can show you what using this feature feels like, and then we are going to dive even deeper and trace the life of a request across clusters. I think that's going to be particularly interesting, because it's going to give you a bit more in-depth knowledge of how the whole system has been put together.
So much has been said about service meshes, what they are and why you might need one, and I will try to summarize it with a quote that I think is both precise and concise.
So now one of your services is slow, and that reflects on the runtime characteristics of your entire system. Or, for that matter, you might lose connectivity to the outside world and not be able to hit your external third-party API. When any of these problems arises, you want to be able to make an informed decision.
So now this proxy is running alongside your application containers and all your traffic goes through it. Instead of looking like that, your traffic pattern starts to look a bit like this: all the traffic is shifted through the proxy.
In addition to that, there is the control plane component, which is really a set of components, a set of workloads that Linkerd runs. You can think of it as a management console for the proxies and for the service mesh in general.
It gives you automatic proxy injection, so there's a proxy injector component that takes care of that. The control plane also ships with a dashboard with a bunch of useful metrics that you can look at in order to reason about your traffic, and it exposes an API interface.
There is a number of CLI commands that we ship with Linkerd, such as tap, stat and others, and these allow you to look at real-time statistics about your traffic from the terminal. And then you have the proxy, which is often referred to as the data plane. That's an ultralight, transparent proxy written in Rust, so it's very performant and secure.
It
provides
latency,
aware
layer,
7,
load,
balancing
and
automatic
tls
out
of
the
box,
and
what
is
a
particular
feature?
That's
actually
that
I
think
is
is
pretty
cool
is
that
it
exposes
an
on-demand
diagnostic
tap
api.
So
you
can
use
a
cli
to
to
tap
into
any
proxy
and
see
the
live
requests
that
are
going
through
the
proxy
in
real
time.
A
So oftentimes the question arises: well, how does that proxy end up in my workloads?
Well, that happens through a process that we call injection. It's usually accomplished by the proxy injector component, which is part of the control plane, but it can be done manually as well. The essence of it is that your pods are modified to include an init container, which is responsible for setting up the iptables rules for the pod so all the traffic can go through the proxy, and then there is the sidecar container that runs the proxy itself, intercepting all of that traffic.
So if you had to draw it, the injection sort of looks like this: you have your pod running with the application container in there, then there is an init container added, setting up the rules, and then there is the proxy as a sidecar container. So now your incoming traffic gets shifted: it first hits the proxy and then goes to the application container.
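To make that concrete, here is a rough sketch of what an injected pod spec ends up looking like. The container names, image references and the annotation key are illustrative and vary by Linkerd version; this is not exact injector output.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: backend
  annotations:
    linkerd.io/inject: enabled       # asks the proxy injector to mutate this pod
spec:
  initContainers:
    - name: linkerd-init             # sets up iptables rules so all pod traffic
      image: <linkerd-init-image>    # is redirected through the proxy
  containers:
    - name: app                      # your application container, unchanged
      image: my-backend:latest
    - name: linkerd-proxy            # the Rust sidecar proxy (the data plane)
      image: <linkerd-proxy-image>
```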
So Linkerd is designed around a few core concepts: observability, being able to collect actionable traffic metrics; security, which is encrypting all the traffic between services; reliability, which is ensuring that services are available; and traffic management, routing traffic to services and using advanced patterns such as traffic splitting.
We thought about all of this, and we wanted to bring these guarantees and these core concepts to the multi-cluster support as well, so you get all of that not only within your cluster but within your whole ensemble of clusters, so to speak. Which kind of raises the question: why multiple clusters?
Well, there are many reasons that people have pointed out, but most notably it's about traffic management and traffic migration.
People want to be able to do canary deployments across clusters, or to use a set of services that are in a pre-production or prod environment for local development and make them appear as if they were in their local cluster, for example. People also want to do failover, shifting traffic from one cluster to another when a certain cluster fails or needs some kind of maintenance. So we thought about all of that, and we decided on some core concepts for the multi-cluster feature that build upon Linkerd's core concepts. Our solution is secure, much like Linkerd itself.
Then, the solution is Kubernetes-first: remote services appear as normal Kubernetes services. Any service that's exported and available in your local cluster, but is actually a remote service, is just a normal service; it's not some special kind of service. And there is no single point of failure: no single cluster is blessed or magical. Each cluster is running its own Linkerd installation with its own control plane.
So if a cluster fails, that doesn't bring the whole system down. The solution is also transparent, so applications don't need to know whether a service is remote or local. And the solution is network independent: the only requirement is that there is gateway connectivity between clusters, but the underlying network infrastructure or network hierarchy doesn't matter at all.
If we had to draw a diagrammatic representation of what our multi-cluster solution looks like, it would look like this. Imagine you have clusters east and west, which you're going to see in the demo in a bit. Both are running Linkerd, and it's important that Linkerd on both of these clusters is installed with certificates that share the same trust root, in order to enable mTLS across clusters.
And then you have the Linkerd multi-cluster set of components, which consists of the service mirror, the gateway and the cluster credentials. And then you have your local services and your remote services, which are effectively proxies to services located on other clusters.
The service mirror is responsible for monitoring the exported state of the target cluster and replicating it. It uses Kubernetes informers and the Go API clients to continuously monitor the set of services that are exported on a target cluster and to create proxy services locally. Then there is the gateway, which is responsible for routing incoming traffic to the appropriate services on the target cluster.
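As an illustration of what the service mirror produces, the mirrored service is an ordinary Service whose Endpoints point at the remote cluster's gateway rather than at local pods. The names, IP and ports below are assumptions for the demo's backend service, not exact controller output:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-svc-west          # local mirror of backend-svc on cluster west
spec:
  ports:
    - port: 8080                  # the port clients use locally
---
apiVersion: v1
kind: Endpoints
metadata:
  name: backend-svc-west          # endpoints resolve to the remote gateway,
subsets:                          # not to any local pods
  - addresses:
      - ip: 203.0.113.10          # external IP of the gateway on cluster west
    ports:
      - port: 4143                # the gateway's inbound port
```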
That component exposes an external IP that receives all of that traffic, and the traffic is then routed to where it should go, to the internal services in the cluster. And then there are the credentials, which consist of a service account located on the target cluster; it allows the service mirror controller to spin up a Kubernetes API client and monitor the state of that cluster.
And then there is a secret living in the source cluster, containing the Kubernetes API config for that service account, so this config can be deserialized and the Kubernetes API client can be spun up.
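Roughly, that secret looks like the sketch below. The secret name, type string and key are illustrative and differ across Linkerd versions; the point is simply that a kubeconfig for the remote service account is stored in the source cluster for the service mirror to deserialize:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-credentials-west       # credentials for watching cluster west
  namespace: linkerd-multicluster
type: mirror.linkerd.io/remote-kubeconfig
data:
  kubeconfig: <base64-encoded kubeconfig for the remote service account>
```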
Now, with all of that said, it's actually pretty informative to do a demo and show you what this feels like and what you can do with it. In our demo we have two clusters, east and west, and each cluster has a backend service installed.
So now you can see that we have the backend service here on cluster east. The next thing we're going to do is install the multi-cluster components onto cluster east, and that command should install the gateway and the service accounts, as well as the service mirror controller.
What we want to do now is provide the credentials that are needed for cluster east to be able to replicate services from cluster west, and we're going to do that with the link command. What this link command does is put together these credentials. And that's actually the wrong command: that was the export-service command, which is for later.
So we are using the link command, which grabs the credentials for the service account from cluster west, packages them as a secret and deploys that onto cluster east.
Now we can use the gateways command to see that cluster east knows about a gateway on cluster west, which is exhibiting a bunch of traffic characteristics: this is the latency to this gateway, and it's alive. There is a probe internally that probes this gateway and tells us whether the gateway is alive at the moment or not. Currently, however, there are no services being mirrored.
What we want to do is export the service from cluster west so it becomes available on cluster east, and this is done pretty much by adding a few annotations on the service itself, so this command will do that. Now the service is exported.
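In the 2.8-era implementation the export is driven by annotations in the mirror.linkerd.io namespace on the Service, roughly as sketched below. The exact keys are an assumption and have changed across releases (later versions use a `mirror.linkerd.io/exported` label instead), so check the docs for your version:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-svc
  annotations:
    mirror.linkerd.io/gateway-name: linkerd-gateway      # which gateway fronts this service
    mirror.linkerd.io/gateway-ns: linkerd-multicluster   # namespace of that gateway
spec:
  ports:
    - port: 8080
```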
If we look at the gateways now, we can see that there is one exported service.
And if we look at the services on cluster east, we can see that there is a backend service, which we know we deployed, and there is a backend service suffixed with -west, which is a proxy for the service located on cluster west. So what can we do with that?
We can actually shell into our pod here.
We can curl the backend service that's located locally, and it says "hello, east". And if we curl the backend service that's located on west, we're going to get a different response, right?
So now we can step back a bit and think about what we can do. We can define a traffic split here that splits all the traffic going to the backend service in half: half of the requests will go to the backend service, which is the local service, and half of the requests will go to the service that's located on cluster west.
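The split described here can be expressed as an SMI TrafficSplit resource, which is what Linkerd consumes. The resource name and service names are assumptions based on the demo; the weights reflect the 50/50 split:

```yaml
apiVersion: split.smi-spec.io/v1alpha1
kind: TrafficSplit
metadata:
  name: backend-split
spec:
  service: backend-svc            # the apex service that clients address
  backends:
    - service: backend-svc        # local backend on cluster east
      weight: 500m                # half the traffic
    - service: backend-svc-west   # mirror of the backend on cluster west
      weight: 500m                # the other half
```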
Now, if we shell into the container and run the same commands, we see that we get responses from both cluster west and cluster east, and the reason for that is that half of the requests are routed to our local service and half of the requests are routed to the service that's located on cluster west. We can actually look at our traffic split, and we can see that both of the services are exhibiting a 100% success rate.
The dashboard command gives us the dashboard. Here we can look at our traffic splits and get a visual representation of them, and again you're going to see the same characteristics: half of the traffic goes to the local service, half of the traffic goes to the service on cluster west. What's also interesting to observe is that we can go into Grafana, where there is a multi-cluster dashboard.
It shows you a bunch of statistics about the traffic that's going to remote services. You can see here that there is a request rate for the Linkerd gateway, so this covers all the traffic going to this gateway, no matter the service. But then there is a breakdown by service, which would be a bit more interesting if we had more services; we can see that all the traffic is going to the backend service on cluster west at the moment.
So this is all quite interesting, and you might ask yourself: well, how does this actually work? The way it's been designed is the following. Imagine you have your client pod located on cluster east, right? The moment you fire a GET request through curl to the backend service...
...this GET request is actually routed by the traffic split to the backend service for west, which is the proxy for the service located on cluster west. Now, what's going to happen is that the proxy resolves this service through the control plane.
It's going to return the expected identity of the gateway on the other side, so it can enable mTLS, and it's also going to return the fully qualified domain name of the target service on the other cluster. That's important, because the other cluster might very well have a different cluster domain, and the name there is not suffixed with -west either: the name on the other side is just the backend service, not the backend service with the -west suffix.
When all of that happens, the request's authority is rewritten, the proper TLS identity is set, and the request flies off to cluster west. Once it hits that IP on that port, the request is intercepted by the proxy, and the gateway will route it to the correct backend service.
Because the correct destination is encoded in the request, the gateway knows where to send it and on which port. You might be hitting the proxy service on different ports, and all of that traffic will go to one single port on the gateway, but because we carry metadata indicating which port was the original one, we can route the request to the correct service.
That being said, this was sort of the first iteration of the multi-cluster support that we designed, and it works quite well. However, there are a few improvements that we have in mind and are actively working on at the moment. First of all, right now there is a single service mirror controller running, and it's responsible for pretty much all the linked clusters.
We want to split that up and have a service mirror controller per target cluster. I think that's going to simplify things greatly, make the cognitive load lighter, and make it much easier to debug whenever problems occur. We also want to introduce a CRD to better represent the target cluster information: right now a bunch of that information is encoded in a secret, and other bits of it are encoded in arbitrary annotations on the services that are mirrored, and whatnot.
We want to consolidate all of that into a CRD that represents this information, so it's all in one place. We want to support traffic policy and much finer-grained permission control over what traffic can go where. And of course we want to support plain TCP traffic; currently it's all HTTP.
We want to support plain TCP across clusters, and this is part of a larger refactor and a larger piece of work in the proxy, but we have an idea of how to do that, and work is actively happening at the moment. So this is something that's going to be coming out soon.
With all that said, I'm happy to answer any questions if you have one. And as always, you can go to my GitHub account and take a look at my repo that contains the talk, so you can look at the slides again. Thank you a lot for your time.