From YouTube: 2016-10-13 Kubernetes SIG Scaling - Weekly Meeting
B
Yes, I'll start. So, a reminder: I've been following what this group is doing since August, and I introduced myself in one of those SIG Scale meetings. Recently we, I mean our Scale R&D team, started some basic research around Kubernetes, and around OpenStack on top of Kubernetes, and we have published the results we have so far. Basically, what we did (I just posted the links to the chat) is run the e2e load.go tests, which as far as I know you guys use as well, on top of a bare-metal-installed Kubernetes. So where you would use Google Compute Engine or whatever other provider, we install Kubernetes on bare metal using the Kargo deployment tool, which is basically a set of Ansible scripts that installs Kubernetes, and we used Calico for the overlay networking.
B
Simply
because
we
had
such
requests
to
check
how
kalica
will
behave
as
a
ultra
networking
tool
and
we
simply
run
the
load
that
go
tests
from
ET
test,
sealed
against
up
to
three
hundred
fifty
five
nodes,
which
was
maximum.
We
had
at
the
lab
on
the
moment
and
just
shared
the
results.
1050
m
/
155
notes,
so
I
just
posting
to
the
chair.
So
most
probably
you
can
take
a
look
on
it.
If
it,
you
find
it
interesting,
that's
cool
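For reference, a minimal sketch of how the upstream performance e2e tests (density.go / load.go) were typically run against an already-built cluster in this era. This is not from the meeting; the provider setting and the focus regex are illustrative assumptions, and exact flags varied by release.

```sh
# Sketch: run the upstream performance-tagged e2e tests against an existing
# cluster instead of letting a cloud provider bring one up. The "skeleton"
# provider and the focus regex are assumptions, not from the meeting.
export KUBERNETES_PROVIDER=skeleton
go run hack/e2e.go -v --test \
  --test_args="--ginkgo.focus=\[Feature:Performance\]"
```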
And there's one more test that we have done; well, not really a test, just an experiment.
B
We created a small test suite that checks pod startup latency, meaning real container workability: when containers actually become available and are reported to the master. We checked what the startup latency would be if we pushed up to 100 pods per node.
Sadly, we forgot to configure the kubelet at the time to allow more pods to be scheduled, because as far as I know there's a limitation of about 100 pods per node if it's not configured otherwise. But we'll rerun it at higher density; I would like to see how it behaves when we have really high density. Basically, we had about 50 milliseconds per container startup at the 100-pods-per-node density, so pretty good results, I believe.
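The per-node limit mentioned here corresponds to the kubelet's --max-pods flag; a hedged sketch of raising it for a high-density rerun (the value and the kubeconfig path are illustrative):

```sh
# Sketch: lift the kubelet's pod cap for a high-density test run. The default
# was on the order of 100-110 pods per node in this era; 200 is arbitrary,
# and the kubeconfig path stands in for the deployment's real flags.
kubelet --max-pods=200 \
  --kubeconfig=/etc/kubernetes/kubelet.conf
```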
There was also a comment that Calico would greatly affect the results.
B
I believe yes, it could, but we don't have data on that yet, so if anybody has information about how much it influences things, that would help.
B
For instance, we frequently observe issues with freezes in Docker under load, or during the OpenStack installation, where we start lots of containers at once, like when we deploy OpenStack on top of Kubernetes. Docker freezes somewhere on the syscalls, without even writing anything to its logs that would help us debug it. So it's a strange situation, and we're currently trying to figure out what's going on. This happened with various Docker versions; we tried several of them, without luck.
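Not discussed in the meeting, but one standard way to debug a wedged Docker daemon of that era was to make it dump its goroutine stack traces, which usually shows the syscall it is blocked on:

```sh
# Sketch: the Docker daemon dumps all goroutine stacks to its log on SIGUSR1.
# The binary is `dockerd` from 1.12 on (`docker daemon` in 1.11), and the
# journalctl unit name assumes a systemd host.
kill -USR1 "$(pidof dockerd)"
journalctl -u docker --no-pager | tail -n 200
```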
B
In these tests we were using Docker 1.11.2, but we also ran 1.12.1, and right now we're checking 1.12.2. And this is not happening with equal frequency: sometimes it happens more often, sometimes less, so I'm not sure, in fact, what the influencing pattern is here. Basically, we ran this workload repeatedly, and every time we observe these issues, but sometimes we have around ten percent of nodes freezing, and sometimes one percent.
B
Scheme
we
used
that
is
basically
installed
by
cargo,
so
this
is
DNS
attack,
I,
just
posted
the
link
to
the
diagram
that
is
used
for
this
installation.
Sometimes
DNS
stopped
working
as
it
is
supposed
to
so
I
have
observed,
like
couldn't
resolve
names
of
OpenStack
services
and
the
huge
load,
but
most
public.
This
is
it
as
expected
and
we're
trying
to
find
the
bottleneck
right
now.
B
But
if
you
we
say
about
synthetic
zest,
slack
or
running
just
api
testing
or
against
can
burn
notice.
It
was
pretty
okay.
I
mean
very
enlightened
with
the
numbers
you
guys
in
court
already
on
virtual
environments,
so
basically
the
same
a
bit
faster,
but
it's
less
nose
and
and
bare
metal.
So
it
should
be
faster.
B
We are currently installing it in a small environment, yes, because we have plans for a huge one. Basically, the Scale R&D team has lots of tasks beyond Kubernetes itself; it's not the only thing we need to work through. Right now we'll install on 140 or 150 nodes, and we will continue debugging the Docker freeze issue on this small environment, just emulating high-density load. So yeah, we're on it right now, but not at that scale.
B
I
mean
in
fact
one
point:
4.0
was
installed
or
one
150
notes,
mera
metal
by
us,
and
we
ran
poor
little
sister
top
measurements
exactly
against
140
I
need
to
put
this
information
to
the
results.
I
forgot
to
mention
that
there
was
other
version
used
to
post
our
applicants
in
measurements.
Yes,
so
we
will
try
to
get
150
notes.
E
Awesome: a time-series graph of my single-master/minion cluster coming up over time. It seems like a good validation of an end-user workload on this cluster. Now, I think last week Jeremy Eder presented some concepts around this cluster loader, or workload-generator tool, that I think other folks within Red Hat are working on, and Marek, I'm sure you probably know more about this than I do. We're trying to get some of this work to go into these new perf tests.
B
It's really worth talking to each other, because this tool was written really quickly by us to check pod startup latency. But I think we need to coordinate here for sure, and I'm really eager to see the tool from Red Hat; I would look at it and just use it, or improve it somehow and use it, etc., because the tool that we used was really simple.
C
Because the performance at 5,000 nodes against your VMs is very optimistic, right? And most of the tests that I run are very pessimistic, because the optimistic tests don't have your other controllers firing and a whole bunch of services running. If you're measuring density by itself, there's a lot less busy work going on in the system.
C
Right now, I mean, I think maybe for next week I should probably run through how we do our analysis; I mentioned that last time. Like I said before, we're pretty pessimistic, because we want to make sure that our customers, even on the low side of things, are well within the bounds of a fully saturated type of environment.
C
Because
what
often
has
a
tendency
to
happen
is
we
find
the
large-scale,
abusive
customers
who
then
take
your
numbers,
ignore
them
and
go
beyond
them
and
in
those
environments,
then,
as
long
as
we
give
up
pessimistic
numbers,
were
we're
kind
of
fudging
it
a
little
bit,
but
it
it
works.
It's
working
better
I
should
say,
but
I
would
be
remiss
if
I
worded,
like
give
a
number
like
5,000
and
a
customer
gets
close
to
that
with
certain
profiles
that
we
have
and
the
cluster
explodes.
That's
that's
when
I'm
literally
being
shipped
off
somewhere
yeah.
If.
E
You
reminded
them
it's
the
phrase
at
the
most
5,000,
not
least
5,000
yeah,
but
my
other
question
I
guess
was
I
think
we
were
talking
at
one
point
in
time.
Maybe
America's
helping
lead
this
discussion
around,
like
sfo's
for
individual
controllers,
for
saying
that
controller
sort
of
thing
that
actually
implement
and
user
facing
functionality
and
the
cases
that
I've
heard
you
bring
up
most
frequently
Tim
are
hey,
guess
what
happens
when
you're
actually
running
a
bunch
of
controllers
trying
to
handle
real
user
workloads.
F
We also need to figure out which metrics we actually care about and quantify them. For example, network propagation, kube-proxy endpoint propagation, is an important one. But, for example, do we really care how quickly a deployment can update its replica sets, as long as it's reasonable? I really want to push the throughput higher. So we actually need to create a list of the things we think are important, because not everything is, but some things beyond what we are measuring now are.
H
Basically, throughput is an important thing because, obviously, at any scale there is some throughput that the cluster supports, and beyond some boundary it won't support that throughput. So it's pretty important to be able to say that we support a given throughput, but not anything higher.
E
I feel like we chatted about this a little while ago, and I'm not sure if it stalled out, or if we decided it wasn't worth the effort at that point in time, or what. I mean, this is a great idea, but I'm trying to understand: are we trying to land any of this in the 1.5 timeframe, or are we talking about loftier goals that we're in for the long term?
C
Think
it
was
a
priority
problem
right.
It
I
think
everybody
agrees
that
this
is
something
we
need
to
do,
but
it's
it
becomes
kind
of
difficult,
sometimes
because
we're
almost
like
an
overlay
group
right.
So
sometimes
we
have
these
conflicting
priorities
that
that
prevent
us
from
doing
that,
so
I
agree.
We
should
almost
set
aside
like
a
set
of
p0
priorities
and
P
ones,
because
if
that's
not
something,
we've
done
as
a
cig,
you
know
fundamentally,
like
other
saves
literally
have
the
devoted
resources
towards
the
p0
p1
items.
C
But
if
we
as
a
group,
decide
to
do
that,
I
think
we
would
happily,
you
know,
have
a
resource
go
along
with
defining
those
metrics
and
start
to
do
to
create
tests
to
measure
against
them,
but
I
think
we
need
to
start
to
maybe
clean
house
so
to
speak
so
that
we
have
deliverables
for
given
release
and
that
that
those
deliverables
have
fly
cover
from
p.m.
that's
another
problem
right
as
Google's
p.m.
and
RPM.
You
know
sometimes
that
they
choose
priorities.
D
Can we have a quick discussion about the proposal for configuration dumping? I know we've talked about this in the past; I just want to kind of get the ball rolling. Basically, one of the reviewers had a look at the proposal, which was built based on the discussion we had two weeks ago on config dumping, and his comments were along the lines that the proposal will really only work for the kubelet; it won't work for the other components in the control plane. He suggested that, instead, we just get configurations back from ConfigMaps, flags, etc. at the pod level, as opposed to getting component configs from the components themselves, which is kind of not what we agreed to, not really what we discussed. So I'd like some thought, some discussion, about where we should go, which approach we should take, what's the right direction, etc., just to show a proof of concept that we can actually get this.
D
No, no worries. It doesn't seem like it shouldn't work; it just seems like there's a lot of work that hasn't been done yet to get the component-config ConfigMaps for the components, and I think that's what's guiding that suggestion. I'm just speaking from what I know right now.
G
You know, we've been talking about making it easier to configure a cluster as you're bringing it up, as part of kubeadm and other tools of that sort, and we're looking at component config as one of the key pieces there. Finishing that is on the board; I'm looking at the OKRs, and it just hasn't bubbled to the top in terms of what's happening this quarter.
D
The
intentions
of
my
PR
is
actually
show
like
hey
look.
This
is
really
cool.
Complaining
things
is
the
way
to
go.
Maybe
we
should
do
this,
and
do
this
not
only
know
kind
of
motivate
people
to
go
forward.
Yeah
just
finish
it
off
exactly
anyway.
So
that's
two
PRS.
Please
have
a
look
I'll
post
them
on
the
chat
here
and
let's
even
get
this
rolling.
E
Yeah, I kind of want to see code actually happen and see if we can move forward here. My main question: as long as we don't think this is exposing any security concerns, then great, let's move forward. If it is raising security concerns, I think getting sig-auth involved was mentioned, to see what work had been done there.
D
Dude,
if
you
mention
that
my
PR
and
I'm
using
actually
the
same
framework
that
the
slash
metrics
and
point
is
using,
so
you
would
have
the
same
security
already
in
place,
but
you
know
I
need
some
wins
to
golf
to
just
go
ahead
and
get
that's
fine.
Let's
go
yeah,
that's
we
could
really
use
some
attention.
There.
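For context, the endpoint under discussion follows the same pattern as the kubelet's existing /configz; a sketch of reading it, assuming the kubelet's default read-only port of the era:

```sh
# Sketch: read a component's live configuration from a /configz-style
# endpoint. 10255 was the kubelet's default read-only port at the time; the
# secured port (10250) would go through the same auth as /metrics.
curl -s http://127.0.0.1:10255/configz
```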
A
All
right,
I,
don't
think
know
that
we
hit
the
etsy
d3
stuff
hard
enough
here
she
is
Jen.
Did
you
want
to
tee
up
the
specific.
C
Question
or
what's
the
current
issues
you're
seeing
why
tech
you
see
in
the
500s.
H
Now,
it's
ed
free
is
significantly
better
than
at
CD.
Do.
H
So, let's see. There is an in-flight PR that will enable etcd3 for the PR builder, but we need to, yes, there needs to be some tweaking done of resources, parameters, and things like that. Hopefully it will be done today, so hopefully today or tomorrow we start running etcd3 for the e2e tests and the PR builder.
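The switch being tested amounts to changing the apiserver's storage backend flag (etcd2 remained the default in this timeframe); a sketch with an illustrative endpoint:

```sh
# Sketch: point kube-apiserver at etcd v3 storage. --storage-backend selects
# the etcd API version; the endpoint URL is illustrative and all other
# apiserver flags stay as deployed.
kube-apiserver --storage-backend=etcd3 \
  --etcd-servers=http://127.0.0.1:2379
```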
C
That's where Wojtek had uncovered the watch performance issue, with 3.0.1... it was actually 3.0.10, I believe, 3.0.10. So, you know, the previous versions are not something you'd want to use at a larger scale, because there are known deficiencies and issues, and the fix was only backported in 3.0.12.
Can you hear me now? Better? Okay. So, we fixed a bunch of issues in the watch path in some places, and we also updated the upstream to 3.0.12, so the client will use 3.0.12 as well. It would be better to upgrade to 3.0.12.