Description
Sri Lanka has a growing group of Cloud Native enthusiasts, students, professionals, and technology leaders. KCD Sri Lanka offers a platform for this community to come together and connect with other tech communities in India and neighboring countries. It provides an opportunity to experience conferences like KubeCon / CloudNativeCon together with the rich cultural heritage of Sri Lanka.
Oh, thank you for joining this session. My name is Rohini Gaonkar and I'm a Senior Developer Advocate at AWS. Today we are talking about Karpenter, an open-source Kubernetes cluster autoscaler. If you have more questions, feel free to reach out to me via LinkedIn or the given website.
So let's quickly look at the different ways we can do Kubernetes scaling. Remember, the goal here is to use the infrastructure efficiently, have less wastage, save cost, and ensure a more highly available application.
There are three main concepts, so let's look at each of them. With Horizontal Pod Autoscaling, or HPA, you scale by adding more pods based on resource metrics. It is pod-level scaling: you simply keep adding more pods as your demand increases, and if your demand decreases, the pods are automatically stopped to free the resources.
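As a concrete illustration, a minimal HPA manifest might look like the sketch below; the Deployment name `inflate` and the 50% CPU target are assumptions for the example, not from the talk.

```yaml
# Hypothetical HPA: scales the "inflate" Deployment between 1 and 10
# replicas, aiming for ~50% average CPU utilization across its pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inflate
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inflate
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```

Note that HPA relies on the pods declaring CPU requests; without them, utilization cannot be computed.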
So you scale out and scale in as per your need. Vertical scaling, as the name suggests, means adding capacity to the same resource. The Kubernetes VPA automatically adjusts the CPU and memory reservations for your pods to help right-size your applications.
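For comparison, a minimal VPA object from the autoscaler project's CRD might look like this sketch; the target Deployment name is again an assumption.

```yaml
# Hypothetical VPA: lets the Vertical Pod Autoscaler adjust the CPU and
# memory requests of the "inflate" Deployment's pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: inflate
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inflate
  updatePolicy:
    # "Auto" evicts pods and recreates them with updated requests.
    updateMode: "Auto"
```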
And finally, there is the Kubernetes Cluster Autoscaler, a popular cluster autoscaling solution maintained by SIG Autoscaling. It automatically adjusts the number of nodes in your cluster. When your pods fail to schedule or are rescheduled onto other nodes, it is responsible for ensuring that your cluster has enough nodes to schedule your pods without wasting resources. It watches for pods that have failed to schedule and for nodes that are underutilized, and it then simulates the addition or removal of nodes before applying the change to your cluster.
Now, the AWS cloud provider implementation within the Cluster Autoscaler controls the desired-replicas field of the EC2 Auto Scaling group. EC2 Auto Scaling groups are an AWS feature that is used by the Cluster Autoscaler.
The Cluster Autoscaler works with HPA. The Horizontal Pod Autoscaler changes the deployment's, that is the application's, number of replicas based on the current, let's say, CPU load. If the CPU load increases, HPA will create new replicas, for which there may or may not be enough space in your cluster. If there are not enough resources, the Cluster Autoscaler will bring up new nodes so that the HPA-created pods have a place to run. Now, if the load decreases, HPA will stop some of these replicas, which will leave some nodes underutilized or even empty, and that's when the Cluster Autoscaler will actually terminate these unneeded nodes.
So, as we just saw, the Cluster Autoscaler relies on the concept of node groups and EC2 Auto Scaling groups to manage the cluster capacity.
It's recommended that the nodes within a node group all have the same instance type; otherwise, resources might be wasted or insufficient during a scale-up. To support different instance types, you need multiple node groups. Also, as I mentioned, it's recommended that each node group span only one availability zone. So if you want your workload to span multiple availability zones for high availability, you need a node group per instance type per availability zone.
Well, the Cluster Autoscaler was not originally built with the flexibility to handle hundreds of instance types across multiple availability zones. It loads the entire cluster state into memory, including the nodes, the pods, and the node groups; identifies unschedulable pods in the cluster; and simulates the scheduling for each node group. When you have lots of node groups, this gets very complicated, and when run at scale it often takes up to five minutes to actually scale the capacity in your cluster. This can have a significant impact in use cases where the speed of capacity scaling is critical.
Karpenter looks for opportunities to terminate underutilized nodes as well. Running fewer, larger nodes in your cluster reduces the overhead of DaemonSets and Kubernetes system components, and provides more opportunities for efficient bin packing. The central concept in Karpenter is the Provisioner. We define this using Kubernetes custom resources, which is the modern, standard way to write controllers.
A Provisioner is where you define how Karpenter will manage unschedulable pods and expired nodes. The Provisioner comes with some smart defaults, but these are fully configurable; the defaults include the configuration of instance type selection, launch template generation, subnets, security groups, and so on.
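A Provisioner along the lines described in this talk could look roughly like the sketch below. It uses the `karpenter.sh/v1alpha5` API of that era (newer Karpenter releases replace Provisioner with NodePool and EC2NodeClass), and the discovery tag value `my-cluster` and the specific requirement values are assumptions.

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    # Launch Spot capacity only.
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
    # Restrict to an instance-type family and CPU architecture.
    - key: karpenter.k8s.aws/instance-family
      operator: In
      values: ["c5"]
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
  # Cap the total CPU this Provisioner may launch.
  limits:
    resources:
      cpu: "1000"
  # Terminate a node 10 seconds after it becomes empty (a demo value;
  # production clusters would use something higher).
  ttlSecondsAfterEmpty: 10
  provider:
    # Discover subnets and security groups by tag.
    subnetSelector:
      karpenter.sh/discovery: my-cluster
    securityGroupSelector:
      karpenter.sh/discovery: my-cluster
```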
So you could think of two personas: there's an administrator and there's an application developer. It is expected that a cluster administrator would install and update Karpenter and define the Provisioners to segment the infrastructure space as needed. They can define the Provisioners based on purchase options, capacity type, instance types, availability zones, and so on. Then there's the application developer, who is actually deploying the pods that might be evaluated by Karpenter; they write the pod manifests. As long as the requests are not outside the Provisioner's constraints, Karpenter will look for the best match.
Karpenter matches the request by comparing the same well-known Kubernetes labels defined by the pod's scheduling constraints. Note: if the constraints are such that a match is not possible, the pod will remain unscheduled. The Kubernetes features that Karpenter supports for scheduling pods include node affinity and node selectors. It also supports pod disruption budgets, topology spread constraints, and inter-pod affinity and anti-affinity as well.
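For instance, a pod that constrains Karpenter through well-known labels might look like the sketch below; the name, zone, and image are illustrative (the pause image is the one the Karpenter getting-started guide commonly uses).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inflate
spec:
  # Well-known labels that Karpenter compares against the Provisioner's
  # requirements when deciding what capacity to launch.
  nodeSelector:
    kubernetes.io/arch: amd64
    topology.kubernetes.io/zone: us-west-2a
  containers:
    - name: inflate
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
      resources:
        requests:
          cpu: "1"
```

If these selectors fall outside every Provisioner's constraints, the pod simply stays Pending.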
So, let's quickly look at our demo. For this demo I have already set up a Kubernetes cluster, and it also has Karpenter installed.
You can find all the steps in the Karpenter documentation; I'll provide the link towards the end of this presentation. Okay, so I've already set that up, and I've also defined a default Provisioner. This is something your administrator could do. In the default Provisioner I've specified that any capacity you launch should be Spot and should be of this instance type family, and it could also be restricted to a certain instance size.
Right now I have just commented that out, but you can pin it to a certain instance size. I've also specified that these should be amd64-based instances. You can also set a limit on the total number of CPUs that we would want. So how can Karpenter understand where to launch these EC2 instances?
Well, I've also specified where the subnets and security groups are. I've already tagged them, so Karpenter will go ahead and discover: hey, these are the subnets where you want to launch your EC2 instances, the nodes that you would need. And there's an important point here: I have set ttlSecondsAfterEmpty to 10. What does this mean? Once a node is empty, with no pods running on it, Karpenter will wait for 10 seconds.
Karpenter will wait for those 10 seconds before terminating that node, that EC2 instance. I've kept it low because it's a demo and I want to show things quickly, but you can keep it higher if you are running a production workload. So I've already applied the default Provisioner. What I'm going to do next is scale a deployment so that it will create more replicas.
Before we do that, you can see that there are no pods running right now, there is only one node running, and these are the Karpenter logs. Generally I start with one replica and scale up from there, but now, to save time, what I'm going to do is just ask for maybe four pods to be created.
The moment I say yes, there will be four pods in a Pending state, and you can see it in the logs. Let's go up a little bit: it says, hey, create a node with four pods requesting a certain capacity. Okay, yes, it is now waiting for this EC2 instance. So it has already created that EC2 instance, and you can see that it launched an EC2 instance 23 seconds ago. What is this EC2 instance's size?
The size is, obviously, anything that would fit all these four pods, but it is a c5, it is amd64, and if you move a little bit here, you can see that it is Spot and the instance is already running. So it's at 39 seconds, and you can see the status has changed from Pending to ContainerCreating. As we talked about in the presentation, when Karpenter is creating those EC2 instances, it's not only considering that it needs to find capacity for these pods.
It's not only creating those EC2 instances, but also making a scheduling decision. When it creates these EC2 instances, it bypasses the kube-scheduler and directly binds these pods to these nodes (there are two of these nodes now). So you can see that within a few seconds, I think it was 58 or 60 seconds, all these pods are actually running on these EC2 instances. Let's escalate it a little bit: instead of four, let's say I want 100 pods, and we'll see how quickly Karpenter is able to compute how many nodes it needs for all these hundred pods and then launch all these EC2 instances. You can see that within seconds it has calculated the EC2 instances it needs. Let's go up and see: okay, it says create a node with 85 pods, since it could fit a few pods on the existing EC2 instance. It has gone ahead and deployed those, and that part is something the kube-scheduler will handle quickly.
If you want, we can also check right away how many pods are actually in the Running state. Right now, 15 are actually running on the EC2 instance, the node that was already ready; the one that is not ready yet is where the other 85 pods are going to be placed. Okay, and you can see that it's already at 75 seconds, and this EC2 instance will be ready in a few more seconds, or a couple of minutes, before it can actually have all these pods placed and in a Running state. So, by bypassing Auto Scaling groups and talking directly to the EC2 API, we are able to save 30 to 35 seconds, which matters when you are trying to schedule a lot of pods. And as you've seen here, in 108 seconds our EC2 instance was up and running. Let's see how many pods are up and running right now: you can see that, yes, 20 pods are up and running. There are some in ContainerCreating mode, so they are downloading the image and getting ready, and if you want, you can keep checking how many of them are getting created.
So you can now see that the number has quickly started climbing. It's been two minutes since that EC2 instance was launched, and you can see that already most of the pods have been deployed. That's how quickly Karpenter can actually get the EC2 instances up and running. Great, so all the hundred pods are up and running. What we'll do next is just go ahead and remove all these pods.
Okay, so I'm just going to say: hey, go down to zero, and you can see how quickly it scales down. The pods will go away instantly, but for the EC2 instances, you can see in the logs that I've highlighted that Karpenter has added a TTL: it says it added the TTL to the empty node, and because it was just 10 seconds, it then triggered the terminations. Within 10 seconds all my EC2 instances have been deleted. If I want to make it more interesting, I can also go ahead and, let's say, patch the deployment and say: hey, instead of amd64 I want Arm-based EC2 instances. Once that is done, I'm going to ask for, let's say, two pods that need a node that is Arm-based. Now, in this case, let's scroll down: we also got Arm-based EC2 instances. But you might ask: didn't you specify amd64 already? Yes, but I have also applied another Provisioner.
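What this part of the demo describes, a second Provisioner constrained to arm64, might look roughly like the sketch below; the name, the v1alpha5 API of the demo's era, and the Spot requirement are assumptions.

```yaml
# Hypothetical second Provisioner that only launches arm64 capacity.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: arm64
spec:
  requirements:
    - key: kubernetes.io/arch
      operator: In
      values: ["arm64"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  ttlSecondsAfterEmpty: 10
```

The deployment is then patched so its pod template carries `nodeSelector: {kubernetes.io/arch: arm64}`, which is the request Karpenter matches against this Provisioner.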
You'll be able to see that here, one second... okay, so you can see that it already found a Provisioner for arm64. The request that I just made matched the arm64 requirement; it was there in one of the Provisioners, and so Karpenter went ahead and deployed that EC2 instance with arm64.
So you can have multiple Provisioners, and these Provisioners can have different constraints and different requirements; Karpenter will automatically figure out that, hey, there is already a matching Provisioner.
If there had been no Provisioner for arm64, it wouldn't have allowed the user to actually go ahead and deploy this particular application.
So that's it; that's the simple demo. Let's go back to our presentation and wrap up the section with the key takeaways. Use the default Provisioner for diverse instance types and availability zones, and you can add additional Provisioners as you need.
You can also control your scheduling based on topology spread constraints, taints and tolerations, Provisioners, and so on. Use HPA with Karpenter to scale in and out, and you can schedule these pods on Spot if you need to save cost. If you want to install Karpenter, play with it, or contribute to Karpenter, do check out the documentation and the GitHub link I have mentioned here.
There are some best practices that we discussed about how to use this with EKS. There are also certain workshops if you want to do more hands-on work with Karpenter, and you can find all those details in these resources.
So that's it, and that's me. Thank you for joining me for this quick demo and discussion about Karpenter. I hope this was insightful and useful, and I hope we all experiment and continue innovating in the way our Kubernetes clusters are scaled today. So thank you again; see you next time.