Description
GCE and EC2 provide a great platform to run your own isolated Kubernetes cluster in the cloud. With the Kubernetes cluster autoscaler, on-demand scaling of your GCE and EC2 instances can even be driven from within your Kubernetes cluster. This session will introduce the cluster autoscaler concept and discuss how it is implemented for GCE and EC2. Finally, we will have a look at the cluster autoscaler backend for KubeVirt, a drop-in virtualization add-on for Kubernetes which brings virtual machines to Kubernetes and lets you run isolated workloads on your bare metal Kubernetes installation.
So, in order to understand why we need the cluster autoscaler, I want to go through four main questions with you. First, how do public clouds in general help you with scaling your application at all? Second, how does Kubernetes fit into that picture, and how does it help you get more out of the resources which you are buying on the public cloud? Then we will see why we actually need the cluster autoscaler, and we will ask how the autoscaler helps us with combining the public cloud and Kubernetes. Finally, I want to tell you a little bit about KubeVirt and the cluster autoscaler, and how they can help you deploy isolated workloads on your bare metal faster.
When we look at the first question, how the public cloud helps with scaling in general, the overall answer is pretty easy: it gives you virtual machines. It gives you a lot of them, basically as much as you need; you just have to pay. That's how it starts. To make use of that, you have to make your application scalable. With a simple application you typically do that by taking the resource-intensive parts of the application and making them scalable, which means that they can run in parallel. You put a load balancer in front of these parts, and when your nodes get under pressure, you scale up; when the pressure is gone, because no one is visiting your web page or whatever, you scale down again.
In theory, if your cloud provider gives you a public API, you can just script everything: monitor your virtual machines, create new virtual machines, stop old ones. But typically public cloud providers help you there a little bit. For instance, Amazon Web Services EC2 has the auto scaling group concept, which means that you can define a virtual machine template and assign scaling policies to it. A typical scaling policy would track a specific CPU metric: if one trigger is activated, it scales up, and if another trigger is activated, it scales down.
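As a rough sketch of that concept (not shown in the talk), such a group and policy could be declared in CloudFormation along these lines; all names, sizes, and the AMI are made-up placeholders:

```yaml
# Hypothetical CloudFormation sketch: a VM template, an auto scaling
# group, and a CPU-based target tracking policy. All values are made up.
Resources:
  WebLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        ImageId: ami-12345678          # placeholder AMI
        InstanceType: t3.medium
  WebAsg:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "1"
      MaxSize: "10"
      AvailabilityZones: !GetAZs ""
      LaunchTemplate:
        LaunchTemplateId: !Ref WebLaunchTemplate
        Version: !GetAtt WebLaunchTemplate.LatestVersionNumber
  WebCpuPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WebAsg
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 65.0              # add VMs above, remove VMs below this average
```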
You already see here that there are a lot of resources which you are paying for but which you are not using. When we look again at the example with the three running nodes, all three of them have just 65% CPU usage. If one hundred percent is the amount a full node has, that means you are already paying for one hundred and five percent of a node's worth of resources which you are not using: three times the unused 35 percent. So it's tempting to try to pack the workload tighter onto the VMs, but that's actually not very easy to do.
You would need an extra scheduler to help you with that, and if you also want to share your machines with other people, so that they can make use of the leftover resources, you would need some kind of multi-tenancy on top, which is normally not there in the cloud.
So how do you make use of the public cloud resources you paid for? The solution here is to just deploy Kubernetes inside your nodes on the public cloud and then schedule everything with pods and containers. That's what you see here in the picture: the outside is the Amazon Web Services or the Google Compute Engine node, on it you run your Kubernetes nodes, and in there your pods are finally running. So, a fine solution: you take your simple web application.
Let's just consider a very simple example. We still have the three-node workload, the three backends running with a CPU usage of 65 percent each, and then we schedule a new pod which needs fifty percent of one node. We don't have that much left anywhere, so the Kubernetes scheduler tells you: oh, I can't schedule this. But the Google Compute Engine auto scaling group says: hey, I'm fine, there is no pressure on the nodes, so I will not scale up.
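To make that concrete, the pending pod from the example boils down to a resource request that no node can satisfy; a minimal sketch, with assumed values:

```yaml
# Minimal sketch of the example pod: on two-core nodes whose CPU is
# already mostly claimed, a request for one full core (half a node)
# cannot be placed anywhere and the pod stays Pending.
apiVersion: v1
kind: Pod
metadata:
  name: big-backend            # hypothetical name
spec:
  containers:
  - name: backend
    image: nginx               # placeholder image
    resources:
      requests:
        cpu: "1"               # 50% of a 2-CPU node
        memory: 1Gi
```

Note that the scheduler decides based on resource requests, not on live CPU usage, so it is the requested resources that have to fit on a node.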
And that is exactly where the cluster autoscaler finally helps you. It's just a small application which you can deploy inside your Kubernetes cluster. It's in principle agnostic to the cloud provider, so you can use it for Google Compute Engine, EC2 and others, and it makes use of the auto scaling primitives which the public clouds offer you, for instance the auto scaling group or the managed instance group you have already set up.
When pods can't be scheduled, it scales such a group up and then gives the VM some time to register itself as a Kubernetes node. Once the resource pressure isn't given anymore, it will just scale the nodes down again, and that's pretty much it already. One side note here: this is really just about this one use case. It's not about pod autoscaling; there is an extra pod autoscaler for that. And it's also not about rebalancing the scheduling, for instance taking pods from one node and putting them on another node; there are other projects which are doing this, for instance the descheduler.
Here you see a typical YAML file of a node, which contains a few fields that influence the autoscaler's decision. In the metadata you have the label section, in this case with just one label, kubernetes.io/hostname, and you see the capacity of the node: CPU, memory, lots of stuff like that.
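Stripped down to the fields just mentioned, such a node object looks roughly like this:

```yaml
# Abridged Node object; only the label and capacity fields which feed
# into the autoscaler's decision are shown.
apiVersion: v1
kind: Node
metadata:
  name: node-1
  labels:
    kubernetes.io/hostname: node-1
status:
  capacity:
    cpu: "2"
    memory: 4Gi
    pods: "110"
```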
And that's it: your application works again, it can scale up and down again. But if you want to use this with, say, OpenStack, you are right now out of luck; it just supports public clouds at the moment. So I created a backend for KubeVirt, and that offers very interesting use cases. In practice it works like this: you have a bare metal Kubernetes or OpenShift cluster with KubeVirt installed, in there you create a nested Kubernetes cluster, and you tell the cluster autoscaler to monitor that nested Kubernetes cluster.
In order to make that possible with KubeVirt, we had to think about implementing something like the auto scaling group or the managed instance group ourselves, and we decided to create a virtual machine replica set. The name is not chosen by accident: it really is pretty much the same as a replica set for pods, just that it creates virtual machines instead of pods.
We need a little bit of metadata there, so that the autoscaler has all the information it needs to scale up: for instance, a label which allows selecting the replica set as a node group, so that the autoscaler knows it will create nodes which will register with the cluster, and a few annotations which tell it how many pods will be able to run on such a node and how much storage will be there. Below that, in the spec, you see replicas: 3 in this case, which means: create three VMs, or rather, make sure that three VMs are running. That is pretty much the field the cluster autoscaler will manipulate at runtime to scale virtual machines up and down.
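Pieced together from that description, the head of such a manifest might look roughly as follows; the API version matches early KubeVirt releases, and the exact label and annotation keys of the experimental backend are assumptions here:

```yaml
# Sketch of the VirtualMachineReplicaSet head; the label and annotation
# keys are hypothetical stand-ins for those of the experimental backend.
apiVersion: kubevirt.io/v1alpha1
kind: VirtualMachineReplicaSet
metadata:
  name: worker-nodes
  labels:
    cluster-autoscaler.kubevirt.io/node-group: workers  # lets the autoscaler select it
  annotations:
    cluster-autoscaler.kubevirt.io/pods: "110"          # pods a resulting node can run
    cluster-autoscaler.kubevirt.io/storage: 10Gi        # storage a resulting node offers
spec:
  replicas: 3        # the field the autoscaler manipulates to scale up and down
  selector:
    matchLabels:
      kubevirt.io/vmReplicaSet: worker-nodes
  # template: the virtual machine specification follows (second screen)
```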
But that's not all. On the second screen the specification goes on, and here you see the actual specification of the virtual machine it will create. As you can see, you can specify resources on the virtual machine, how much memory it will have and how many CPU cores it will have, and you can add disks. What you see here is that I've added two disks: the first disk, the boot disk, references the boot disk volume in the volume section below, which just references a Fedora 27 image; and we have a second disk here, a cloud-init NoCloud data source, which we can use for bootstrapping the node and auto-registering it with the nested Kubernetes cluster.
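The second half of the manifest, the virtual machine template itself, could then look something like this; field names moved around between early KubeVirt releases, so treat the exact structure as approximate:

```yaml
# Continuation of the replica set sketch above: the VM template under
# spec. The exact field layout is approximate for early KubeVirt.
spec:
  replicas: 3
  template:
    metadata:
      labels:
        kubevirt.io/vmReplicaSet: worker-nodes
    spec:
      domain:
        cpu:
          cores: 2                      # CPU cores of the virtual machine
        resources:
          requests:
            memory: 4Gi                 # memory of the virtual machine
        devices:
          disks:
          - name: bootdisk
            volumeName: bootvolume      # references the boot volume below
            disk:
              bus: virtio
          - name: cloudinitdisk
            volumeName: cloudinitvolume # the cloud-init NoCloud data source
            disk:
              bus: virtio
      volumes:
      - name: bootvolume
        registryDisk:
          image: kubevirt/fedora-cloud-registry-disk-demo  # Fedora image; exact Fedora 27 tag assumed
      - name: cloudinitvolume
        cloudInitNoCloud:
          userData: |
            #cloud-config
            # join/auto-register this VM as a node of the nested cluster
            # (actual bootstrap commands elided)
```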
Then you create a very simple config file which points to the kubeconfig of the bare metal Kubernetes cluster, and you start the autoscaler: via --kubeconfig you point it to the nested cluster's kubeconfig, via --cloud-provider you select the KubeVirt backend, and via --cloud-config you point it to the INI file I've written, here at the top of the page. And --node-group-auto-discovery is now the part where the labels from before on the virtual machine replica set get interesting again.
A
There
I
put
the
cube
root
of
the
autoscaler
label
there,
and
here
with
that
expression,
we're
telling
the
the
autoscaler
that
it
should
look
for
returned
machine
replicas
with
exactly
that
labor
and
selection
strategy,
which
put
no
template
it
should
use.
We
were
using
I'm
using
here
list
with
least
waste,
which
means
it
looks
it
go
through
all
the
no
templates
and
just
checks
which
one
fits
best
to
workload
and
leaves
least
CPU
and
memory
unused,
and
that's
pretty
much
it
now.
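Putting those flags together, the invocation looks roughly like the container spec below. The flag names are real cluster-autoscaler flags; the kubevirt provider value and the auto-discovery expression follow the experimental backend and are assumptions here:

```yaml
# Sketch of the cluster-autoscaler container spec for the KubeVirt backend.
# Flag names are upstream cluster-autoscaler flags; the "kubevirt" provider
# and the auto-discovery expression are assumed from the talk.
containers:
- name: cluster-autoscaler
  image: cluster-autoscaler:kubevirt-dev   # hypothetical image tag
  command:
  - ./cluster-autoscaler
  - --kubeconfig=/config/nested-kubeconfig # the nested cluster to watch
  - --cloud-provider=kubevirt              # the experimental KubeVirt backend
  - --cloud-config=/config/kubevirt.ini    # points at the bare metal cluster's kubeconfig
  - --node-group-auto-discovery=label:cluster-autoscaler.kubevirt.io/node-group  # hypothetical expression
  - --expander=least-waste                 # pick the template leaving least CPU/memory unused
```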
Now your bare metal cluster will scale up and down, and I've already mentioned the first interesting use case; that is what you are seeing here. Like in the public cloud case, you can have a completely nested Kubernetes cluster which can scale up and down by asking for more resources, but with a bare metal Kubernetes installation at the bottom. That's basically the typical public cloud scenario: in that use case, you don't have to take care of what the customer is doing in there.
A
So
you'd
have
to
basically
have
two
different
types
of
nodes:
the
bare
metal
nodes
and
virtualized
nodes,
and
in
the
virtualized
nodes
you
can
then
run
isolated
workloads
typical
example
for
berries,
for
instance,
if
you
want
to
do
CI
for
github
in
github,
it's
typical
on
every
pull
request.
You
you
run
some
tests,
but
what
what's
run?
There
is
pretty
much
much
opaque
for
the
CI
system.
It's
you
just
execute
arbitrary
code.
There.
You
have
no
idea
what
people
are
doing
there,
so
you
probably
make
sure
that
you
don't
schedule
such
workloads
directly.
A
Next
to
your
sensitive
high
performance
pots
on
the
bare
metal
nodes,
we
want
to
isolate
them,
and
this
is
exactly
where
this
scenario
can
help
you.
If
you
want
to
go
into
the
details
there,
a
little
bit
more,
you
can
read
about
the
cluster
out
of
scale
in
their
github
repo.
There
is
a
port
autoscaler
to
which
I've
mentioned,
and
a
disk
a
doula.
These
three
projects
kind
of
make
a
whole
story
out
of
the
whole
scaling
in
the
cluster.