B: Hey, good morning, good afternoon, good evening, everybody. My name is Salman Iqbal — thank you very much for the introduction there. I work for a company called Appvia. We are a cloud native consultancy, and we work in the cloud native ecosystem.
B: So today I'm going to talk to you about how we scale our websites running in Kubernetes. It's not only restricted to websites — it applies to all workloads that you're running in Kubernetes. When we talk about Kubernetes, there's a lot of talk around autoscaling: things that can automatically scale based on the traffic that's coming in, or some metric, or whatever it might be. But that doesn't happen by default. There's a lot of work.
B: There's a lot of configuration we have to put into our system. What are those things, and how can we do it? That's what we're going to look at today.
B: It's going to be all demo, and hopefully everything will work. If it doesn't work, don't worry — I have a video of the whole demo as well; we can go through it and pretend it was all live. But I've tested it a few times before, so I think it should.
B: It should be all good. If you have any questions, anytime, feel free to ask, and we can answer as we go along. If I know the answer, I'll try and answer it. If I don't, then we can search up the answer the normal way, which is using Google — or, if you're feeling adventurous,
B: we can ask the question in ChatGPT-4, how about that? You can decide whether we should search ourselves or what we should do, but it should be a good laugh, I think.
B: Whatever you prefer, we will do. Excellent. So what you're seeing in front of you is my screen; whatever I'm doing, I'm going to be sharing together. So here's the scenario: you're running a website, and that website could be anything — your blog, an e-commerce website, whatever it is — and a lot of requests... actually, I'll put some diagrams up in a second.
B: If you give me a second, I shall open this diagram here so we can all see it together. One second, let me just double check — yeah, and this is where we are. One minute.
B: So I hope you can see this. I know we all know the various components and how things work in Kubernetes, but just a bit of a recap for all of us of what we're going to be focusing on today. Imagine this is the current setup of our website that's running. At the bottom you can see we've got some pods; let's say the red ones are the pods which are serving the traffic.
B: It's your website that's running, and then we have some Services sitting in between — in the middle, at the bottom there, in the gray bar, that's the Service. A Service is basically a thing that sits on top of the pods. It is an interface — think of it as an internal load balancer.
B: If you have multiple replicas of your pods running in a deployment, you can use a Service, and the Service acts as an internal load balancer, routing traffic to one pod or the other. Something needs to do that — that's where the Service sits. A Service is all internal; we don't usually expose anything outside the cluster using a Service.
B: What we use for that is an Ingress, and an Ingress is what I would call an external load balancer. The purpose of the Ingress is to take requests coming from outside of the cluster and send them over to the container that's running our website. The good thing about the Ingress is it understands HTTP requests. So if somebody says, hey, I want to send a request to www.cncf.com/checkout, it can understand the request.
B: You can look at the headers, understand a request, and make the routing decision based on that. So we'll look at the Ingress, we'll look at the Service — because we'll create these — and we'll look at the pod. Now, the thing about the Ingress, which we're going to focus on today, is here — here you go, next diagram. These aren't really supposed to be slides, but I thought we'd have some photos of what we're going to see.
B: The thing about the Ingress is that it usually comes in two parts. There is the Ingress pod, as you see in here — the Ingress is basically a normal deployment. You can pick any Ingress controller you like; in this case we're gonna look at NGINX. Everything we talk about today is open source, and we also have a link where you can try all of this yourself.
B: All the projects are open source, and the NGINX Ingress we're talking about is the open source NGINX Ingress pod. The point of this is that when it runs in the cluster — we install the Ingress controller; the cluster doesn't usually come with it — it's attached to a Service. So this now runs inside the cluster, and anytime a request comes in from outside of the cluster, it goes to this pod, and that's absolutely fine.
B: The problem is: what if you get millions of requests? This can become a bottleneck. Would you agree with that, Annie? I hope you would, right? This could become a bottleneck, because all the requests are going to this one pod. So if that happens, there's two things you can do. Number one, you can always have many Ingress pods running. The questions that are coming in, I'll answer in a few minutes, so keep them coming — please do, I will answer in a few seconds.
B: You can do that, but it will take up resources: you will have to allocate those resources, and these pods might not be doing anything during the times when there's not much traffic coming in. So it's a bit of a waste, and we don't like waste. What we should do is scale up when there's lots of requests coming in and scale back down when there are no requests, and that's what we want to do. So that's what we're going to look at today.
B: How we can do this, how we can look at all of this, and then how we can scale up. So before we go any further, there's a couple of questions in here. Can we go through them — is that okay with you? Should we go through some
A: questions? That's perfect, we should go through them, yeah. I think the first one that came in is: can you route traffic other than HTTP or HTTPS using Ingress — for example, psql requests?
B: So yeah, thanks Shadi for the question. Usually you route HTTP and HTTPS traffic using Ingress. I'm not sure if you can do psql, but we can check that later on in the stream — we can do a quick search ourselves together. Usually we deal with HTTP and HTTPS. There might be some extra controllers that can — because these are different Ingress controllers, some controllers might be able to.
B: Actually, you know what we can do — I'll share some references with you, which might be handy. So I work with Appvia, and I also work with a company called Learnk8s — shout out to Daniele from Learnk8s. There are different kinds of Ingress controllers, and different controllers provide different capabilities. You can see in here, this is a comparison sheet. All of this is open source; you can check it out.
B: You can contribute yourself as well. You can just search for the Learnk8s Ingress comparison and this will come up. In here there is nginx ingress, and on the left you can see different types of routing mechanisms. Maybe some of them might support it — you can check it out, but I'm not sure; it doesn't look like we can here. So different kinds of Ingress controllers could or could not support it. I hope I've kind of answered that question. Hopefully.
A: And Shadi, if you want to ask any more or you have extra questions, feel free to pop them in there. And then we had another one from Laurentina, who is curious how they should define high traffic.
B: Oh, very good question. So the question is: how do you define a high-traffic website? I don't even know when something counts as high traffic — that all depends on your application itself. So, for example, you can say: look, high traffic is if I get, I don't know, 100,000 requests per minute — that's high traffic. It all depends on what your setup is and what it can
B: deal with — how many requests it can deal with at a time. For example, today — I'll share the metrics in a second — we're going to say: if you have 100 active connections inside each NGINX pod... And you know, you can test this out yourself and trial it, and you can see how much memory and CPU it consumes. 100 requests per second is just an example.
A: There's a comment: ports other than 80 and 443 — they used it with port 3100 UDP there.
B: Okay, cool — so I think that should be good. Excellent. All right, sounds good to me. Shall we carry on then, Annie? Yeah.
A: There was a comment in the beginning which was: it is like an L7 load balancer. There's not a question there, but obviously, if you want
B: to say — yeah, it is, that's correct. I think what they're saying is absolutely right. It's a layer 7 load balancer — that's what it is doing. Services do the layer 4 stuff, and then Ingress does the layer 7. So what they're saying is absolutely correct.
A: Perfect. And then there was a request from Dennis to share the materials — you showed the Google Doc here before. If there are any materials, we'll get them linked to everyone attending, and maybe you can share some learn-more resources at the end and so forth. But yeah.
B: What you'll see is: we'll deploy an application — a normal application, as you see in here — and then we're going to throw a lot of traffic at it, and what we want to do is scale up this pod. That's how we're going to do it. There are a few steps to it, which we're going to talk about as we go along. But first things first, what we should do is deploy our pods, right?
B: So let's deploy the application that we're going to run on the cluster, and then we can see what we can do in terms of scaling. Here we go. Let's just make sure it's all nice and big and everybody can see it — if you can't, please let me know, and if you want me to change the background from dark to white, we can do that too. But here's what we're going to do initially. What I need first is a Kubernetes cluster.
B: Luckily for me, and for all of us — you can use any cluster you like; in this case we are going to use a minikube cluster. Nothing is running on the cluster right now. So let me just make sure we have enough space. What we have is kubectl get nodes. This is minikube, right, so this is a local cluster that's running on my machine, and it's got nothing running inside — well, just the components to run Kubernetes. That's all it is.
B: We have nothing in there. So I'm gonna go in here — cd demo — just to make sure we're all in the right place. Yeah, excellent, that's where we are. Okay, cool. So first things first, we need to deploy an application, deploy a website. Let's just pretend this is our website — our super fancy website that's going to get lots of traffic — and let's start here.
B: Let me just stick this thing here. I'm sure you've seen deployment files plenty of times in your life. We're using Stefan Prodan's podinfo — shout out to Stefan Prodan. I'm sure you might have seen this; basically, it's just a website that's going to run — a static website — and we'll deploy it on our cluster. It is a Deployment.
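A minimal sketch of the kind of Deployment being described — the exact manifest from the demo isn't shown, so the names, labels, and image tag here are assumptions:

```yaml
# Hypothetical sketch of the podinfo Deployment from the demo;
# labels, names, and image tag are assumed, not the exact demo file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
spec:
  replicas: 1                # the demo runs a single replica
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
        - name: podinfo
          image: stefanprodan/podinfo:6.3.5   # version assumed
          ports:
            - containerPort: 9898             # podinfo's HTTP port, used later
```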
B: A deployment can be used to scale replicas if we want, but in our case we're only going to run one. So we'll take this and deploy it — there's some other information in here, like the labels for the pods which are running, but the point is we're going to create this deployment first. Let's go: kubectl apply -f — we'll create the deployment. Hopefully, if everything is all good, we can do a kubectl get pods.
B: ContainerCreating — there's going to be a lot of that during the demo, of course. We'll just make sure the container comes up; it's taking a lot longer than I expect. Luckily, it's all come up. Okay, so the pod is now running. But how do I know — just a tip to share with you all —
B: how do I know if anything's running correctly? Usually, when something is running and I want to make sure the pod is working — and because I'm going to be deploying a number of components: I want to deploy a service, then an Ingress, and then we'll try and access it — instead of just jumping through everything, I test everything out as I go, right? Let's just do a kubectl port-forward: because this cluster is local, I can port-forward.
B: I can say which resource I want to port-forward to, which is a pod. So this is the pod; let's stick this in here. Then what I have to give it is the port the container is running on. The syntax goes: I pick any port on my machine, and then the port the container is running on — 9898 in this case. Let's just run that. So 9898 — now this is forwarding.
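As a sketch, the command being run looks roughly like this — the pod name is generated by the Deployment, so the one here is a stand-in:

```bash
# Forward local port 8888 to port 9898 inside the podinfo pod.
# The exact pod name suffix is generated, so this one is an assumption.
kubectl port-forward pod/podinfo-7d8f5d6b9c-abcde 8888:9898
```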
B: So if I send a request from my laptop onto it — here you go — if I go to localhost:8888,
B: this is the website that's running inside the pod, right? So the first step is all good: we have a pod that's running and it's serving a website. This is the website that people really want to visit, because they want to see this cute little creature on here — that's what they want to visit, right? What we're going to do next is make this available outside of the cluster, so you can send requests to it.
B: What we have is another YAML file — there are gonna be quite a few of those today, as is always the case, I'm sure you agree, Annie: lots of YAML files. We will create a Service, and a Service, as I said, is an internal load balancer. In here there are a few things: we need to make sure that some of the configurations match up so we pick the right pod, and that's all we're doing in here. So the deployment that we created
B: has these labels — app: podinfo — and then we have this bit here, which is the target port. So when I deploy this service, the service will be attached to the pod.
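A minimal sketch of a Service like the one being shown, assuming the same podinfo labels and ports as above:

```yaml
# Hypothetical Service matching the podinfo pods via their labels;
# the LoadBalancer type mirrors what the demo deploys, port numbers assumed.
apiVersion: v1
kind: Service
metadata:
  name: podinfo
spec:
  type: LoadBalancer
  selector:
    app: podinfo        # must match the Deployment's pod labels
  ports:
    - port: 9898        # port the Service exposes
      targetPort: 9898  # port the container listens on
```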
B: Right, so that's what we are doing. Now the service should have been created — kubectl get service — I can do that just to make sure everything's correct. Now, this is the type: the service type is LoadBalancer, and there are different types of services in Kubernetes: there's ClusterIP, there's LoadBalancer, there's NodePort, and there's one more which I can't think of right now — and then headless, which is the smaller service. That's right. This is using a LoadBalancer-type service, and this is running locally.
B: If this was running inside a cloud provider, it would actually go to the cloud provider, provision an actual load balancer, and attach it to the nodes which are running these workloads. And you can imagine, if you create multiple services with the LoadBalancer type, you end up with so many load balancers — it could become quite expensive.
B: You could have a lot of load balancers, and this is why, in order to expose your applications outside, we use an Ingress. But we still need the Service, because we could have multiple replicas running. Now I've deployed this; in order to check everything is correct, I can do the same thing as before, or —
B: sure, let's just do that right now — I'm just looking at the other screen. Here you go, I shall stick this in the chat, Annie, so we can share it. So that's the Ingress comparison, and there are actually quite a few other things in here you can check: the Ingress controller comparison, the managed Kubernetes comparison, comparisons of service meshes — and yes, it's all open source as well. Of course, please feel free to send pull requests. All the information is on that page.
B: Okay, excellent. All right — so this is the service that's running. I just want to make sure the service is configured correctly, so I can do the same as what we did before: we can use port 9000 and test it out. Let's just quickly do that — localhost:9000 — and as long as we see the same page, that means we've done everything
B: correctly, and so far we're looking good. So we've done that, but what we really want to do is install an Ingress. Well — I'm going to deploy a few components first, and then we'll break and take questions in a few minutes, if that's all good. I'll just deploy all the components, then we'll take some of the questions in a second.
B: So here's the thing, though. What I want to do is deploy this Ingress, and the Ingress is something like this, right? I'm going to show you some rules in an Ingress. Let's just go back here: you saw the service file; now I'm going to show you the Ingress file, and the Ingress file looks something like this.
B: All it says is: if somebody sends a request to this path — just a base path here; you can put anything you like, blah blah, any path you'd like — then send that request to the service podinfo, the one that we just deployed. That's where we want to send it. Now, there's also the host the request should be coming for: example.com.
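A sketch of an Ingress with the host and path rules being described — the class name and metadata here are assumptions:

```yaml
# Hypothetical Ingress: route requests for example.com at the base path
# to the podinfo Service. The ingressClassName value is an assumption.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: podinfo
spec:
  ingressClassName: nginx
  rules:
    - host: example.com          # host-based routing
      http:
        paths:
          - path: /              # base path; any path rules could go here
            pathType: Prefix
            backend:
              service:
                name: podinfo    # the Service created earlier
                port:
                  number: 9898
```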
B: So if somebody sends a request, this is what we are actually asking for. So this is not just path-based routing; this is host-based routing. If the request is for example.com, then we send the request to the actual service and then down to the pod. But here's the thing: I can apply this to the cluster right now — so let's just stop this, we don't need this —
B: we're going to do kk — that's the alias I use on my machine for kubectl — apply -f ingress.yaml, and deploy this on the cluster. It just goes in and stores that information inside the etcd database, but Kubernetes doesn't know what to do with it, because there's nothing inside the cluster that tells it what to do. There's no controller running inside; we haven't installed anything — we haven't installed this bit yet; this Ingress pod is not running yet. So how do you do that?
B: Well, we can install it on the cluster — we can go to their website; there are a number of ways of installing it. What we're going to do is install it using Helm. Helm is a package manager, as you might already be aware, and the good thing about Helm is you have what's known as Helm charts, and the charts allow us to install all the components necessary. That's what we're going to do. So let's just quickly install this — I'm going to do helm; I have the command in here.
B: I'm gonna copy it from my other screen, because I have it there. This is just adding the repository, which I already have, and then I can just use the helm install command. Let's just stick it in here: I use the helm install command, and from that repository I can install the NGINX Ingress on my cluster.
B: It'll install a bunch of things in my cluster: the controller that needs to run inside a pod, and also the services and anything else it might need to install. You can set some variables at the same time — I'm just telling it to also watch Ingresses without a class. You can define the class that the Ingress uses, but I'm just setting that right now.
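As a sketch, the Helm steps look roughly like this — the repo URL, release name, and flag are assumptions following the open source NGINX Ingress docs rather than the exact commands from the demo:

```bash
# Add the NGINX Ingress chart repository and install the controller.
# The --set flag mirrors "watch Ingresses without a class" from the talk;
# the release name "main" matches the service name seen later.
helm repo add nginx-stable https://helm.nginx.com/stable
helm repo update
helm install main nginx-stable/nginx-ingress \
  --set controller.watchIngressWithoutClass=true
```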
B: So once I run that, it should go ahead, get all the bits it needs, bring all the configuration down, and apply it on the cluster. What I'm doing right now, just for demo purposes, is installing everything in the same namespace — everything is going in the default namespace, just for demo purposes. Usually you would deploy things in different namespaces, and that is the right way of going about it.
B: So this is just for demo purposes. If I do kubectl get pods, what I should see is this bit here: what we've installed is, next to podinfo, an NGINX Ingress that's running — this is the controller that we've deployed. So far we're all good: we've deployed the Ingress. And here's what we're going to do, because this is minikube, running locally, and the name of this Ingress is nginx.
B: I can see if I can get to the pod that I have deployed. In order to do that, I can do a number of things. Because this is local, I can get the IP of the minikube node, and that will basically let me access it. I'll answer some of the questions — I see some good questions coming up.
B: I will answer them in a second; I'm just going to quickly show you the website that we can access. Our rule was: if somebody sends a request to example.com — though this is running locally — send it to the pod. But I can use a minikube command. We have kubectl to get the service — as I said, the Ingress is attached to a normal service, which is in here; you can see this LoadBalancer service.
B: It's called main-nginx-ingress, right? So, you know how we did port-forwarding — this is what I can do instead: minikube service, similar to that — main-nginx-ingress --url. This will just give me a URL locally that I can use to access the website, just to make sure what we're doing is absolutely correct. So it's going to run — there's a spelling mistake in here — and it should run in a second and give us the URL.
B: Here you go, it's given us two URLs here. So I can access this and see if the website is there — but the problem is I can't do that directly, because I have this host property inside. So what I can do is curl it. Let's just make sure this command keeps running; as long as it is running, I can curl it and pass in a header.
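A sketch of that request — the address below is a stand-in for whatever URL the previous command printed:

```bash
# Send the request with a Host header so the Ingress's host-based rule
# for example.com matches. The IP/port is a placeholder for the URL
# that `minikube service main-nginx-ingress --url` printed.
curl -H "Host: example.com" http://192.168.49.2:30080/
```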
B: We're building up to the scaling part in a second. So: example.com, and then I stick in the URL that's been given to me. I know this isn't as exciting as seeing that cute little creature that we have at the URL — let me see if I can pop back onto it — it's not as exciting as this, but — well, there you go.
B: That's the message: "greeting from podinfo" — this is where that response is coming from, I hope you all agree. So far, what we've done is deploy our application, and for the bit we're going to do next, we'll take a couple of minutes to answer some questions real quick and then move on to the next part. We've got this setup complete.
B: We've deployed our application, we've deployed the Ingress, we know the Ingress works — we've sent a request to it and we can see it running — but that was only one request. What we're going to do in a few minutes is not pretend: we're actually going to send a lot of requests and see how we can scale up. Before that, I'm going to do a few things.
B: I need to decide how I scale up, so I need to pick a metric to scale on, and then I need something else to help me scale up. And I think, Annie, this may be a good point to answer some questions. There's one question I'll start with, if that's okay, because we can answer it really quickly. There's a question: can I add multiple Ingresses? If so, are there any precautions when using them?
B: Yes, you can add multiple Ingresses in the cluster. What you have to do is in here: you can have multiple Ingresses, and you have to define what class name each one is using. What do you have to watch out for? Well, just make sure you use the right ingress class name for the Ingress to use — I've seen some examples in the past.
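For instance, a sketch of how an Ingress pins itself to one of several controllers via the class name — the class values and backend here are assumptions:

```yaml
# Hypothetical: this Ingress names the controller class that should handle
# it; a second Ingress could set a different class (e.g. traefik) to be
# served by a different controller in the same cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: site-a
spec:
  ingressClassName: nginx       # handled by the NGINX controller
  rules:
    - host: a.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: site-a
                port:
                  number: 80
```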
B: People have run multiple Ingresses in the same cluster because of different requirements from different kinds of applications. But there's not a lot you have to watch out for when you set up an Ingress — for example, don't declare the same host twice. I don't think there are that many pitfalls. I wanted to answer that because it's related. Annie, what other questions have we got that we can quickly answer? Yeah.
A: There are links to check out as well — that's always great to see. Also, there was a question before to get that link to the previous resource onto LinkedIn as well, and Jillian helped out with that one — so thank you so much there.
A: And then we have the questions. So there was a question which goes: how does Kubernetes know that I'm running my Kubernetes in a cloud provider such as AWS, and how does it know what type of load balancer to provision? There was some helpful information already provided by another commenter, but obviously let's answer the question here as well.
B: Yeah — I think Avinash also answered the question on there. When you deploy inside Kubernetes in a given cloud provider, what happens is: when you create resources, you have things called controllers running inside the cluster — built-in controllers, like the replication controller.
B: So if you create a deployment, the replication controller is watching etcd to see if there are any changes for it, and it creates the resources. The same thing happens with a cloud provider: they've got their own controllers, which are an extra bit of logic running inside the cluster. So when you deploy something, it acts upon it and says: hey, I need to create a load balancer. And you already know, if you want a particular kind of load balancer,
B: this is the configuration you have to pass in order to create it. You can see in here, in this spec, there's metadata information; there's also an annotations section, which we haven't included here. In the annotations section you might have to give it some helpful hints if you need something additional — each cloud provider will ask you to do something slightly different in the Ingress configuration. So the Ingress is the only part which could be slightly specific to a particular managed provider, but everything else stays the same.
A: Yeah, perfect. And then the last question so far comes from Alejandro, who asks: is it necessary to deploy a service in LoadBalancer mode if I'm using my cluster within a public cloud? In this case, wouldn't it be enough to configure an Ingress pointing to the service, and the load balancer would assign the external IP to the Ingress instead of the service?
B: Oh yes, a very good question again. Yeah — I just did it that way to show it, but definitely, it's not necessary to deploy the service in LoadBalancer mode. We don't want to do that either, because we're gonna have an Ingress that will take care of everything. So you don't have to deploy a LoadBalancer-mode service — that's absolutely correct — because on a plain LoadBalancer you can't do any authentication, you can't do any of that stuff. So yeah, that's what you're gonna do. Perfect.
A: And we got a comment from one of the earlier question askers that it makes perfect sense — thank you. So perfectly done there.
A: Yeah, and then we have Pearl asking: if this repo is public, can you please add the GitHub link as well?
B: Sure, I will do — I'll share the link in a few minutes when we take a break. But yeah, there is a whole blog that Daniele put together on this stuff. I'll share the link in a few minutes and then you can check it, if that's all good. I'll have to dig it up — it's somewhere here — but you'll have to give me two minutes to do that.
B: Thanks — thank you, Annie, for keeping it ticking along; this is excellent stuff. Okay, cool. So what we've done so far is deploy our application, but here's the bit: what we want to do is scale our Ingress. I'm going to talk about the Ingress, but you can assume all of this also applies to anything — not just the Ingress; you can apply this to any kind of pod that's running inside the cluster.
B: You can pick any pod you like and do the same, but we're going to talk about the Ingress just because it's a good use case. Before I can do that — how do I scale up? That is the question. And the way you scale up is — this is what we want to do; I'm going to put this diagram up. Here you go!
B: Imagine we have a deployment, which is our Ingress deployment, and it's running multiple pods right now — or let's just say it's running one — and then what we want to do is query some metric.
B: So we want to query a metric — I think there are some questions on service meshes; we'll answer those later. But how do we get this metric? Well, there are a few things you have to do to get these metrics. Number one: your application has to provide them. Usually, if it's a website, you would create an endpoint in your application — /metrics — and add those metrics in there.
B: Oh, this one: Prometheus monitoring — that's what we've got to do. I don't know if people have seen this movie; I haven't, but apparently it's a very good movie.
B: So what we're going to do is use another open source project called Prometheus, and the good thing about Prometheus is it comes in different parts. We can install it on the cluster; you have what's known as a Prometheus server, and that's basically the central component for everything — it scrapes the metrics and stores them in its own format. And the other good thing about Prometheus is it can talk to the right components.
B: Well, when I say the right components, I'm talking about the Kubernetes API: it can discover all the services that are running in the cluster, and all the pods, so it can go ahead and find them. But the main thing is: for all the containers running in the cluster, if they're exposing metrics on an HTTP endpoint — usually on the /metrics path — it can scrape all those metrics regularly and store them.
B: That's what it does, and it's really good. It also gives you a really cool dashboard, which we're going to use in a second. So basically, what we need is somewhere we can store these metrics — and then, once we have a metric, there's another component inside Kubernetes, another kind of resource, called the Horizontal Pod Autoscaler. Really, in Kubernetes you have three kinds of autoscalers: you've got the Cluster Autoscaler,
B: and there's a component called the Horizontal Pod Autoscaler, and that's what we are going to use. So there's a bit we're going to query, and we are going to use the Horizontal Pod Autoscaler. So how do we store these metrics — how do I get them? Well, again: lovely Helm. We're going to use Helm and install this. So, if I can — I think the screen is big enough — I do helm install.
B: Let's just do this — I have it ready, for the sake of saving a couple of seconds. We're just going to install Prometheus. Once we install Prometheus, we'll have a bunch of pods spun up for us. So let's just make sure everything is good — kubectl get pods — and you can see in here, as I said, it installs a number of components.
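A sketch of that install, assuming the community chart rather than whatever exact command the demo used:

```bash
# Install Prometheus from the community Helm chart; the release and repo
# names here are assumptions, not necessarily the demo's exact command.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/prometheus
```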
B: The bit that we really care about — well, we care about pretty much everything; there are a couple of bits in here doing different things. You have Alertmanager, if you want to send alerts. If you haven't tried Prometheus, I would highly recommend trying it out if you're looking for a solution — there are many solutions out there, of course; check out landscape.cncf.io for a bunch of monitoring solutions — but this is a really good one to get started with.
B: As you can see, for me it was quite simple to start with: I can install it inside the cluster using Helm, and once it comes up — I'll just wait for it; everything's almost there, the server is almost there. Once the server comes up, we're going to start collecting some metrics. But what metrics can we collect?
B: I'll show you in a second. Luckily for us, I don't have to modify the NGINX deployment itself, because it already exposes some metrics — this is the NGINX docs page — and it exposes a number of metrics on the /metrics endpoint, and one of them is the active NGINX connections. So if there are active connections in there, I can scale up — and we can define that ourselves. We can define our metric.
B: We can say: hey, if there are more than 100 active connections inside each pod of the deployment, scale up, because that's a lot for one pod to take — that's what we're saying, right? So we don't have to do anything here. But if this was your own application, you'd have to go in and make sure that you're exposing the right metrics. For example, let's just go in here and make sure we've got our pods up and running. Luckily, so far we are good; everything is running. Let's just check.
B: If I do minikube service prometheus-server, we'll get that lovely UI, in which we can go and have a quick look and see some of the stuff that's running. So here you go, it's going to start up in a second... here we are. So this is what it looks like. I can query something — let's say I want to see how many CPU cores I'm running. So let's just execute machine_cpu_cores.
B: It looks like there's just a line here, but really all of these are the labels that go with the request we made, and you can see: oh, this machine has eight cores — that's the value it's giving. And we also have a lovely graph in here; we can see this graph, it looks good. So we've got the graph — but how about the bit that we actually care about? Well, the metric we care about is the NGINX one.
B: Like magic: all I did was install Prometheus, and Prometheus managed to grab the metric — it understands it already, because the metric is already exposed; it just picked it up. So let's execute this and go to the table real quick. You know we sent one request — remember, we have this podinfo page open that we sent the request to — so basically there's just one active connection. That's kind of cool, but we'll pump a lot more in there.
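As a sketch, the kind of query being run — the exact metric name depends on the controller build, so the one here is an assumption:

```bash
# Query Prometheus's HTTP API for the controller's active connections.
# The metric name is assumed; check your controller pod's /metrics output.
# localhost:9090 stands in for the URL minikube printed for prometheus-server.
curl -s 'http://localhost:9090/api/v1/query?query=nginx_ingress_nginx_connections_active'
```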
B: How are we going to pump a lot of requests in? Before that — what we're going to do is use Locust.
B: Locust is another open source tool that you can use, written in Python, for load testing, and it's really great. You basically have the Locust file — if you haven't checked it out, definitely check it out if you're looking for a load testing tool. I'm just going to briefly show you; there's an example in here. If you've written Python before, it might seem familiar; if you haven't, it's kind of straightforward. In this case we define a locustfile.py, and in there we write all the configuration: go to this URL, go to that URL.
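A minimal sketch of a locustfile like the one being shown, assuming the task simply hits the base path:

```python
# Hypothetical locustfile.py: each simulated user repeatedly requests "/".
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 2)  # seconds between tasks per user

    @task
    def index(self):
        # The Host header makes the Ingress's example.com rule match.
        self.client.get("/", headers={"Host": "example.com"})
```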
B: You can also say how many users I want to have, how many concurrent users I want to have — you can write all of this. And the good thing is we can run this in Kubernetes as well. How can we do this in Kubernetes? Well, I'm glad you asked — this is how: we can write this configuration first; we can deploy a YAML file for Locust itself — because it's basically a Python package, which you can install using pip — and we can run it.
B: Locust will run in our cluster, and then we have a service for it, because we need the UI to drive the test. And then the locustfile.py that I was talking about — the one where we write what it should be doing — is this bit here: the Locust ConfigMap. A ConfigMap is something we use in Kubernetes to store some information, and in the ConfigMap here, what we're saying is: just send the request to example.com. That's what we're saying.
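A sketch of that ConfigMap, wrapping the same hypothetical locustfile so the Locust pod can mount it:

```yaml
# Hypothetical ConfigMap carrying the locustfile; the Locust Deployment
# would mount this as /home/locust/locustfile.py (mount path assumed).
apiVersion: v1
kind: ConfigMap
metadata:
  name: locust-script
data:
  locustfile.py: |
    from locust import HttpUser, task, between

    class WebsiteUser(HttpUser):
        wait_time = between(1, 2)

        @task
        def index(self):
            self.client.get("/", headers={"Host": "example.com"})
```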
B: That's all: send the request to example.com, and then we're going to ramp up. So let's just deploy Locust — is it making sense so far, Annie? I hope so. So that's what we're going to do. Excellent — this will keep running, and let's just go to the right place and make sure we're all good.
B: The thing is, we have the Horizontal Pod Autoscaler, and the Horizontal Pod Autoscaler is very good — if you search for Kubernetes HPA, that is the Kubernetes Horizontal Pod Autoscaler, which you can check out yourself later on. The thing with the horizontal autoscaler is I can define things in here — if we could scroll down for a sec to the right place — we can say stuff like... okay, where are we, am I in the right place? Yeah.
B: We can say stuff like: hey, if the application's containers are consuming more than 50% of the CPU, scale up; or if you're using this much memory, scale up or scale back down. You can do custom metrics too, but it's a little bit more involved. There is a somewhat easier way of doing all this, and this is where something called KEDA comes in — yet another open source project: Kubernetes Event-Driven Autoscaling.
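For comparison, a sketch of a plain CPU-based HPA like the one being described — the target name and replica bounds are assumptions:

```yaml
# Hypothetical CPU-based HPA: keep average utilization near 50%,
# scaling the target deployment between 1 and 10 replicas.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```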
B: So what we can do is deploy this in the cluster, and it comes in multiple parts too — I have a diagram to show. Let's go in here. Once you install it in the cluster, it has a few things: it has a metrics API, so it can consume metrics for us. We're using Prometheus, but that's all good — it can consume metrics from other sources too, instead of Prometheus.
B: You can use those, and then it has the adapter, to make sure the metrics get to the right places, and it has a controller — the bit that runs inside the cluster and says, okay, this is what I need to do. It installs a bunch of custom resource definitions, which we're going to touch upon. And the thing we're not going to cover today, but something you should really check out, is the scalers for KEDA.
B: The good thing about it is — if I go to the scalers — you can have a bunch of scalers. For example, you might have something like a Kafka queue sitting outside the Kubernetes cluster, and you might say: hey, I want to scale up if there are more than 100 messages in the Kafka queue — and you can define that. Or you might scale based on something different, like a SQL query: run a SQL query and scale up on the result.
B: Well, that's what KEDA is. Excellent — so definitely check out the open source project KEDA. Installing it is the same as before, really: we're just gonna install it using Helm — and we'll answer some questions in a second; I just want to make sure we get to where we want to get to. So I'll install this inside the cluster, give it a second, and it should bring up all the components that we need.
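A sketch of that install — the kedacore chart is the usual route, though the exact command from the demo isn't shown:

```bash
# Install KEDA from its official Helm chart (namespace choice assumed;
# the demo installs everything into default for simplicity).
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda
```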
B: So here you go: you can see this operator starting up. That's the bit that's going to figure out when to scale. But how does it know when to scale? Well, this is where we use — let me just go here — something called the ScaledObject. This is not Kubernetes-native, but once you install KEDA it installs these custom resource definitions. In this case, what I'm telling it is: hey, your target is this main-nginx-ingress deployment, right? That's the one that we want to scale.
B: We can give it some information around how many replicas we'd like to have — minimum, maximum; what's the cooldown period after which it can go back down; what's the polling interval — one minute or whatever it might be. But the main thing is: what is the trigger? And the trigger is: talk to Prometheus. We want to go to the Prometheus server and look for this metric name — the NGINX active connections one.
B: That's the metric, and this is our query: we're going to look for the NGINX Ingress active connections that we talked about before, match it to the deployment name, over one minute, and the threshold is 100. So if the requests for a pod go over 100 active connections, scale, and KEDA will decide how many replicas it needs to have. That's what we need to do, right? So that's what we're going to do.
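A sketch of a ScaledObject along those lines — the Prometheus address, metric name, query, and replica bounds are assumptions reconstructed from the talk, not the exact demo file:

```yaml
# Hypothetical KEDA ScaledObject: scale the Ingress deployment when the
# active connections per pod (from Prometheus) exceed 100 over one minute.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scaledobject
spec:
  scaleTargetRef:
    name: main-nginx-ingress          # deployment created by the Helm release
  minReplicaCount: 1
  maxReplicaCount: 10                 # assumed ceiling
  pollingInterval: 30                 # seconds between metric checks
  cooldownPeriod: 300                 # seconds before scaling back down
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.default.svc:80
        metricName: nginx_connections_active
        query: sum(avg_over_time(nginx_ingress_nginx_connections_active{app="main-nginx-ingress"}[1m]))
        threshold: "100"
```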
B: So if I go in here — let's just make sure we've got the right pods; the pods are all running, everything is good — I can deploy this ScaledObject: apply -f scaled-object.yaml. So the ScaledObject is going to sit in the cluster. Let's just give it a second — it's taking longer than... oh, there you go. Even when it takes a split second longer during a demo, it feels like an eternity. But there we are: now everything is set up; we have everything in our cluster.
B: Let's just quickly scale up, because I know we're running out of time. But there's one more thing I'm gonna do. What we want to see is — let's just go in here — I'm gonna bring up Locust.
B: Let's bring up Locust. Because it's deployed with a service as well, it's going to bring up Locust in a second, and then we'll have a UI from which we can start sending traffic. So let's just put this to the side real quick. How long have we got, Annie? How are we doing for time?
A: We have 10 minutes left, but we already have five questions to answer, so yeah — okay.
B: Okay, excellent. So what we've got is a couple of things in here. We have Prometheus, which is running. Let me just quickly do this — hey, how about this, right? Look at that, nice. What we're going to do is start sending requests — and this is the UI — but what we want to do is see the pods scale up.
B: We're not gonna just do a kubectl get pods — that's a bit boring — so I'm going to do something to give you a little bit more fantasy. Let's just quickly go to the UI, the place where we are — yeah, scaling. This is all also in the repository; we've got this dashboard that Daniele put together, so I'm going to run this dashboard and you'll see it in a second.
B: Oh, that's not great.
B: How about that, right — that's not what we want to see. This is basically pulling information from the cluster itself: all the pods that are running inside the cluster, that's what it's showing you. So these are all the KEDA operator pods and so on — that all matches up — and we have one NGINX pod. Once we start sending requests, what we should see is the number of requests increasing, and then eventually the pods scaling up, because we've defined everything: we've got metrics that we're collecting, we've got the ScaledObject.
B: We have the pods that are running, and here's what we're going to do: we're going to send requests to the Ingress directly via the host here, which is the service itself, and we'll do something like this. Let's have a peak user count of 2,000 — why not; as you can see, I've tested this — spawning 10 users per second. So we're going to start swarming in a sec, and then Locust will start giving us some information, as you can see.
B: It'll start ramping up. What you can see is the total number of requests — I'll have to zoom out slightly so I can show you all the information; let me make it a little bit bigger — the total number of requests that are going in, and the response times. It's taking a little bit long, because what we want to do is — if I execute this again — we should see a bunch of requests coming in in a second.
B: What we should see is — look, the Ingress is already starting to spin up, right? We didn't do any of that. You saw, I didn't do this — KEDA came in and started to say: hey, you have a bunch of requests going up. So the requests are going up, response times are going up too, but active connections are also going up. So let's just quickly execute this.
B: You can just about see this faint line in here, but you can see that active connections have shot up real quick, and what we have now — if I can hop back onto this — you can see three pods that have come up. I'm not making this up, because this is all happening in the cluster: if I do a kubectl get pods, we have a bunch of these pods that have come up, and more will come up as required. What KEDA does
B: is it actually creates a Horizontal Pod Autoscaler — it creates that autoscaling for us — and then it updates the deployment directly and starts scaling it up as needed. And this is what's happened: we've gone from one to three, and if I let this run for a while — you can see, three pods are dealing with it: 100 requests per second, and they're all dealing with it fine.
B: Usually when I do this, I'll let it run, and then we can watch the Horizontal Pod Autoscaler scale up a little more. And you don't just have to do it for this — you can do it for anything you like.
A: Perfect — I'm glad that the demo worked; there's always a bit of nerve-wracking about that. So there's a question that came in, which is the first in line: please explain, how does all of this come together with a service mesh like Istio, for
B: example? So — you can inject a service mesh; a service mesh is used for a number of other things. This autoscaling is not part of a service mesh; it's not one of the features of a service mesh, it doesn't do that. But you can still have a service mesh running — you'll just have to make sure you're using the right Ingress gateway, or the Ingress itself.
B: That's the configuration that you have to do, which doesn't touch any of the autoscaling stuff. But yeah — the Istio service mesh does not have these autoscaling capabilities; for that you have to use this. I hope that answers the question a little bit, but apart from that, everything else is the same — it's not much different.
A: Great. There was another question: I thought the default installation of the NGINX Ingress controller using Helm will install the Ingress controller as a DaemonSet — that's one pod per node.
B: Yeah — I think it's a deployment; I can't remember, you could be absolutely right that it will start one pod per node. But the thing is, I have one node in here, which is minikube, which is fine, so there'll be one pod — and it's not necessarily the case that one pod will be able to deal with all the requests that are coming in.
A: Yeah, that's good. And then Jose asked: can we configure the time interval in which Kubernetes scales our application in the cluster based on usage metrics? Let's say, for example, we have some deployments that need to react faster to an increase of incoming load than others.
B: All right, okay — so that's the polling interval. Yes, you can change that polling interval; I can't remember what the lowest value is that you can set, but yes, you can change it and go much faster. That's very well picked up there, and yeah, that's it.
B: Oh yeah, you can — and you could use anything you like for visualization. Usually Grafana goes well with Prometheus. I know they both work — Kibana, Grafana — but Grafana goes quite well for visualization; I've never used Kibana when I'm using Prometheus, but you can, yes, you can use that.
A: Great. And then Jules asked: is it possible to scale up across different cloud providers instead of relying on one?
B: Very good question. So the question is — I'm going to assume the question was: can I use KEDA to scale across multiple cloud providers? Is that correct, do you think?
B: I think it's a little bit involved. There is some stuff that was done — I'm just going to think about it. Let me have a quick look, or we can quickly pop into ChatGPT and ask it. I think if you're doing hybrid cloud across multiple cloud providers, there is more work that you have to do. There are probably some projects out there that can help you with it; I can't think of one off the top of my head.
A: Well, the cloud is definitely a very big topic, so we should see a lot of content around there, yeah. And then there was another question, one from Diego: if there's a DDoS attack that sends a lot of requests, how could we avoid autoscaling burning the budget? A WAF in front of the...
B: Yep, a web application firewall — that's an excellent suggestion. You can definitely use that to make sure you protect yourself from DDoS, and there are a few more options there. So yeah, Diego, very good point. Right.
A: And then we had Oliver asking: what is the URL of the git repo that has this code? I think, yes, I'm the same as the viewers on that one — if we can share it.
A: What is the difference between the autoscalers provided by cloud service providers and KEDA? Is KEDA compatible with Traefik Ingress and an on-prem cluster?
B: Yeah — so, as Clarkson said, I've seen the cloud service providers integrating KEDA into their solutions, but I think the cloud service providers also don't provide this by default, so you have to add it on top. There's nothing the cloud service providers give you to scale your application like that. I'm just sending the URL as well, yeah.
A: Perfect. And then to the last question — oh, there we go, it's sent. So for people who are on YouTube, the link is bit.ly slash kcd, and then a dash —
B: — scaling: kcd-scaling. So let me put it up in here — here you go, that's the link; you can screenshot it, do whatever you like. So here are the links: Appvia, where we have a bunch of blogs — check it out — and learnk8s.io. All the resources are in here, and the demo at the bottom is bit.ly/kcd-scaling, so you can check that out on there. Perfect.
B: It all depends. If you can handle the requests that are coming in, that's fine; otherwise, if you have control over it, you should scale up. I mean, Ingresses are usually very good at handling the traffic if you have the right number of instances running; sometimes you might have to boost it up, or not. But this is just one example — you should really look at whether you need this for your workloads too, but you can, depending on what's happening in your cluster.
B: If your cloud provider's controller can handle it — usually they are good, usually they are good. Perfect.
A: And well answered there, and quickly as well, because we are out of time — so let's start wrapping up. Thank you, everyone, for joining the latest episode of Cloud Native Live. It was great to have a session about operating high-traffic websites on Kubernetes, and as always — and particularly this time — I also really loved the interaction and questions from the audience. So many questions — great that we got through them all. And as always, we bring you the latest cloud native content every Wednesday, so in the coming weeks, stay tuned for more great sessions.