From YouTube: SIG - Performance and scale 2022-07-14
Description
Meeting Notes:
https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.tybh
A: So, okay, this is July 14th, 2022. This is SIG Performance and Scale.
A: I'm going to link the... I'm reading out some chat. All right, here we go. Okay, so today we'll just do an abbreviated agenda. I think we need some more folks to talk about the bottom two items, so we'll just do the first one.
This is an issue that's actually been open for a while, but we've been testing it internally at NVIDIA for a little bit. This is a concept that's been added in Kubernetes; I think it was added in 1.21.
A: It focuses on making sure that traffic has a fair chance at getting access to the API server. As part of that change you get some other features too, like some rate-limiting ability, and you can also do other things like protecting the Kubernetes API server and protecting the control plane, which is really important for a lot of reasons: say you have multi-tenancy in your cluster. Lots of use cases. So, anyway.
A: It makes sense for KubeVirt to integrate with this API, because KubeVirt does generate a lot of traffic. We want to make sure that, one, those requests have a good shot at reaching the API server, and in the same vein, we also want to make sure we don't overwhelm the API server. And, I mean, from all the testing we've done, KubeVirt is not really the culprit.
A: When it comes to sending a lot of requests to the API server, it usually comes in some other form, but it also makes sense that we protect ourselves from anyone else that's very noisy. So there's a lot of good reasons to add this.
So, to give some context: I don't have any more data today, but I've done a presentation on this in the past where I go through what the different settings mean.
A: What this does is per user. KubeVirt's just got one service account, and we'll do rate limiting for all of the APIs in the kubevirt group, all the verbs, and the kubevirt namespace. What it's going to do is take the request and put it in the workload-low queue. This is where I found it made the most sense.
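
(A minimal sketch of the kind of FlowSchema being described, for reference. The object shape is the upstream flowcontrol.apiserver.k8s.io API as of Kubernetes 1.21; the metadata name and the service account name here are illustrative assumptions, not necessarily the exact manifest KubeVirt ships.)

apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: FlowSchema
metadata:
  name: kubevirt-control-plane        # hypothetical name
spec:
  # Route matching requests into the built-in workload-low priority level.
  priorityLevelConfiguration:
    name: workload-low
  # Match after the Kubernetes system defaults (lower numbers match first).
  matchingPrecedence: 1000
  # Treat each user (here, the service account) as its own flow.
  distinguisherMethod:
    type: ByUser
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: kubevirt-controller     # hypothetical service account name
        namespace: kubevirt
    resourceRules:
    - apiGroups: ["kubevirt.io"]      # all of the APIs in the kubevirt group
      resources: ["*"]
      verbs: ["*"]                    # all verbs
      namespaces: ["*"]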
A: This is where a lot of others go: anyone who has a service account, or any application, whenever they want to be in a queue, they usually go into workload-low. That's what I saw, so it made sense to enqueue KubeVirt here. I think this is just an easy way to start. We could also create our own, but I thought this was a very simple way to get started.
A: We can always change it, but in terms of what we saw in results, it was very promising. In the cases where we observed high amounts of pressure on the API server from some other application, we would still see that KubeVirt's requests were able to get through, and the number of rejections from the API server was small. So that was really good to see, and it's what we wanted.
A: So I think, in terms of a starting point for API Priority and Fairness, this is something that fits; it makes sense to start with this, and we can always optimize. And this is also per cluster: I really think people will want to edit this over time based on the performance they're expecting in their clusters. So really the way to look at this is: in our default installation of KubeVirt, what would we expect to have, and what would we expect to work well? And I think that's workload-low, because it's a default priority level configuration that gets deployed. And then this last field is the matchingPrecedence. This just means that we're going to be below the Kubernetes defaults; the control plane system precedence, we're just right below it.
A: I think 900 is the last one, so we'll be pretty high up there as an add-on, but we'll be below Kubernetes in terms of the amount of shares that we'll get of the API server.
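
(For reference, workload-low is one of the priority-level configurations Kubernetes deploys by default; it looks roughly like the sketch below. The numbers are the 1.21-era upstream defaults as best I can tell, so treat them as approximate and check a real cluster rather than relying on this.)

apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: PriorityLevelConfiguration
metadata:
  name: workload-low
spec:
  type: Limited
  limited:
    # The "shares" mentioned above: workload-low's slice of the API server's
    # total concurrency, relative to the other priority levels.
    assuredConcurrencyShares: 100
    limitResponse:
      type: Queue
      queuing:
        queues: 128            # flows are shuffle-sharded across these queues
        handSize: 6
        queueLengthLimit: 50   # past this, requests in a queue get rejected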
B: Yeah, it makes sense to have some priority over users and to allow our API to make the calls we need. So do I understand it correctly: we don't get rate limited, we just get more priority over other users, right?
A: So, technically you can get rejected. Your request can get rejected by the API server, but when you integrate like this, in the way that we're doing it, it's not likely. Here, I'll back up and put it this way: if there's no API Priority and Fairness in place, it's just a free-for-all, so whoever gets in first wins.
A: The whole idea is: if there's someone really noisy, we're not going to get access and we're going to get rejected. If we have this queue, what it does is focus on making sure that the person who is really noisy gets rejected more often than the people who are less noisy. But you're still affected, because resources are ultimately finite.
A: So, to your question about being rate limited: we technically can get rate limited, because there can be someone who's really, really noisy. But we'll be more protected, because we have our own flow schema and a priority level config, so there are sort of some guarantees that we'll have a good shot at the API server. Does that answer your question?
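
(A note on checking this in a live cluster: the objects can be listed with "kubectl get flowschemas" and "kubectl get prioritylevelconfigurations", and the API server counts rejections per priority level in its apiserver_flowcontrol_rejected_requests_total metric, which is one way to verify the low rejection counts mentioned earlier.)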
B: Yeah, yeah.

A: All right, well, I'll follow up here on this and we'll go from there. Okay, cool. All right, Lubo, I don't know if you have anything else you want to add, or other topics, but if not, we can push these to the next meeting.