From YouTube: Kubernetes SIG Testing 20180605
Description
SIG Testing meeting, June 05.
A
Yes, thanks Eric. So, to give some context to this discussion: we've been discussing this for a few weeks now, I guess with Ben, Eric, Mitra, and a few others. So I added a couple of scalability presubmit jobs that now run against every single PR, but they are non-blocking presubmits.
A
So the plan, and what I've now actually proposed, is to make those presubmits blocking. The reason behind this is that we've been catching way too many scalability regressions using those tests, and it turns out that not having those presubmits blocking is significantly hurting our schedules, and in terms of engineering productivity as well. Because what happens?
A
So it takes a while to debug these, and sometimes we kind of fall short of manpower, and this ends up leaving these regressions in for a long time, close to the release, and then we end up pushing the release by a couple of days or something while debugging these issues. We were seeing this pretty much every release. We usually managed to get these issues fixed on time, but this is happening time and again.
A
So because of this, we came up with a proposal to add a set of formal processes which will actually make the scalability testing much more streamlined, so that we don't have to waste so much time on this, and we don't block the entire release on pretty much two people from SIG Scalability debugging these issues. Also, given how important scalability is for the project, this is a really serious production concern.
A
So one of the lessons we learnt there is that having presubmits would actually stop about 60% of these regressions, which has more effect than just stopping those regressions, because it also means our large-scale 5,000-node tests don't see these regressions, and those runs are costly: if we lose one of those, we've pretty much wasted a 17-hour-long run of these performance tests. So yes, we want to catch them early on, and so I added a couple of these presubmits.
A
One of them runs our scalability test, or performance test, on a 100-node GCE cluster, and the other runs it on a 500-node Kubemark cluster (Kubemark is our way of running simulated clusters). So yes, over the past month or so, I and a few others were working on making these jobs really green, and I've spoken to Ash and a few others, and it seems like the level of health is quite good, and the recent signal of these jobs is in line with their long-term success rate.
B
The bigger concern for us was around the timing of these tests, especially if you're going to make them presubmit blocking, right? The slowest job right now takes about an hour. We spoke about this offline with Ben, and there are efforts to make these faster so that they meet the 40-minute time that our presubmits could take. Just adding more tests that are longer is going to add so much more time for every single run. That's all. Yes, so we just wanted to...
A
Not necessarily. So the way it works is that our 5,000-node test, which is the highest scale, usually ends up catching way more regressions, because, of course, it's at a higher scale. Oftentimes there are regressions which are not caught at lower scale, because things are not stressed as much at lower scale. There are a few regressions which happened in the past which were just not possible to catch at the lower scale of 100 nodes; you actually need to have it running at the 5,000-node scale.
A
And yes, but of course we cannot make the 5,000-node cluster run as a presubmit, because it's super long and you just don't have the quota and so on, so we just have CI running for it. But that's our highest officially supported limit according to Kubernetes, so that's the highest we test for. But yes, that one is a CI job.
A
So maybe I can quickly explain. In general, the doc proposes adding these processes in three phases. The first one is for presubmits, having blocking presubmits; the second is having post-submits block on failure; and the third is actually dealing with scalability issues potentially early on in the development process, which is right during feature development.
A
So that is where design reviews come in. This is also the order in which we were thinking of introducing this, but we don't want to actually add all of them at once, because that can end up being significant red tape for developers. So there is an exit criteria mentioned in the doc. I don't know if you're actually looking at the same doc right now, but let me also share this.
A
Blocking merges... well, we have the scalability jobs as release-blocking already, so blocking merges is... but the thing is, I would not prefer discussing it right now, because it's kind of dependent on this step. If we see that just having these blocking scalability presubmits is already putting us in a good enough shape, we may actually not need to do that. So that's something we would want to do at a later point, if things are still really bad after these presubmits. So that is exactly it.
A
If you open that doc that I shared, with the scalability stories, there is also a column there showing which SIGs are involved. Scalability is hugely a horizontal effort, in that we do catch these issues, and we kind of narrow them down enough that we know where the cause is, and then we try to redirect them to the right SIGs, and usually those SIGs are then following up on these issues, and we are from time to time communicating with them.
A
A regression went in, and then these tests start failing; the large-cluster tests start failing. In the meanwhile, there is some other regression which comes up, and we don't have a good point to know where this regression came in, because the large-cluster tests were already failing because of the other regression that came in earlier. For this reason it's not working. And trust me...
A
This happens quite often: we see multiple scalability issues stacking on top of each other, and then this actually makes debugging those issues exponentially harder, because sometimes we need to run bisections on these large-scale clusters, and we need to run multiple rounds of those, and things like this. So yes, so basically...
A
This is actually letting us performance-test every single change, which is of quite a lot of value to developers, because often, as I've seen in various discussions on various PRs earlier, people are wondering about what the performance impact of their change can be on really large clusters, what effect this change can end up having. So I think these presubmits also empower developers by actually knowing what the performance impact of their changes is.
E
That's valuable, but I would want to know what actually takes so long in the hour-and-20-minute test, and whether we can cut that, because that's at least a 33 percent gain in the time to run the tests. And we don't just run the tests on each PR; we also have to potentially rerun them before merge, in batch merges. So that's going to impact the merge rate, maybe not massively, but at least somewhat, and the feedback time for developers. Can we cut that down somehow?
A
We probably can try to cut it short by a few minutes, but there are some bottlenecks. The way it works is: you basically have to start this big cluster, and it takes some time to start the cluster itself, and that is not something we can really optimize much further, because we've actually already done a lot of things there in the past, like parallelizing the startup and so on for large clusters. That is one front on which we can improve, but I think that is not the limiting factor here.
A
The limiting factor is actually our tests. So we run two kinds of tests, and each test has its own value. The first one is called the density test, where we basically densely fill a cluster with pods at a very high rate, and that is basically for measuring our pod startup latency SLO. We have two SLOs in Kubernetes right now that we officially support, and one of them is pod startup latency: it has to be within 5 seconds.
A
The 99th percentile of pod startup latency has to be within 5 seconds. The other test is the load test, where we actually exercise a different aspect of the system, in that we basically create many different kinds of API calls and we measure the API call latency, and that's our second SLO. So yes, these two tests are needed for measuring those two SLOs.
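A minimal sketch of what these SLO checks amount to, assuming the tests boil down to comparing a percentile of measured latencies against a threshold; the helper and the sample values below are hypothetical stand-ins, not the actual kubernetes/perf-tests measurement code:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// percentile returns the p-th percentile (0 < p <= 100) of samples using
// the nearest-rank method. Hypothetical helper, for illustration only.
func percentile(samples []time.Duration, p float64) time.Duration {
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(float64(len(sorted))*p/100.0+0.5) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

func main() {
	// podStartup stands in for the per-pod startup latencies the density
	// test collects; apiCall for the per-request latencies of the load test.
	podStartup := []time.Duration{2 * time.Second, 3 * time.Second, 6 * time.Second}
	apiCall := []time.Duration{50 * time.Millisecond, 200 * time.Millisecond}

	// SLO from the discussion: the 99th percentile of pod startup latency
	// must be within 5 seconds.
	if p99 := percentile(podStartup, 99); p99 > 5*time.Second {
		fmt.Printf("pod startup SLO violated: p99 = %v > 5s\n", p99)
	}
	// The load test checks API call latency the same way; the 1s threshold
	// here is a placeholder, not the real SLO value.
	if p99 := percentile(apiCall, 99); p99 > time.Second {
		fmt.Printf("API call SLO violated: p99 = %v\n", p99)
	}
}
```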
C
Could we separate them, and run each one separately?
A
We can try to do that, but that has a shortcoming, in that this means, basically, we would need 200 nodes for every single PR, and that means we cannot have the same parallelism that we have right now. Right now, I made sure that we have as much parallelism as we need, so that this job doesn't reduce the number of parallel runs, and splitting would actually basically mean doubling the quota for these tests.
A
I think for the hundred-node GCE job, the test itself actually runs for, I think, something like 25 or 30 minutes; I need to check.
I can try to speed it up, but there are some hard limits to that, because in general the Kubernetes control plane has, at different places, some rate limits, specifically with respect to the control-plane QPS for talking to the API server. So, for example, the scheduler has a limited QPS, so you cannot schedule faster than that.
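For context on these limits: Kubernetes components rate-limit their own clients when talking to the API server, and components such as kube-scheduler expose this via flags like --kube-api-qps and --kube-api-burst. A minimal client-go sketch of the same knob; the kubeconfig path and the numbers are placeholders rather than any component's actual defaults:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client config from a kubeconfig file (placeholder path).
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}

	// client-go throttles requests on the client side; these two fields are
	// the knobs behind flags like --kube-api-qps / --kube-api-burst.
	// Placeholder values, not real defaults.
	config.QPS = 50    // sustained requests per second to the API server
	config.Burst = 100 // short-term burst allowance

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	fmt.Printf("created %T throttled to %v QPS\n", clientset, config.QPS)
}
```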
A
Yes, that's a good question. For the 100-node test, it's not really too different from our normal pull GCE job, which runs the normal set of e2e tests. But for Kubemark: Kubemark in general is like an overlay cluster. You start a base cluster, and you create this simulated cluster on top of it. So there is this extra overlay step in Kubemark, which is, I think, 7 minutes extra for creating the Kubemark 500-node cluster on top of the base cluster.
A
That's a very good question, Ash, and I would say that we would need a really intelligent way to do that. We can probably do this at some point. At this point, I can see some obvious PRs that we can cut this short for, like, I don't know, non-code changes; but for code changes it's really non-trivial, because I've seen cases in the past where a very simple change, that no one would suspect, created an effect. Just for example, I can actually recollect one...
A
Yeah, so I agree with you that Kubemark-500 is probably something you'd want to improve, or think about what to do with. But at this point, what do you think about adding just the other presubmit? Because that's still really useful for us, and at the moment I don't think it's going to slow down our merge rate. I will try to improve this time, I'll try to cut it short, and yeah...
E
To be clear, I think these are valuable tests, and we should block on them if they're catching scalability failures. I just want to make sure that we're not going to let the test time just spiral out, and it is a little worrying that this one is already starting considerably higher, at least a third higher, than our worst-case presubmit right now.
A
I think at this time of the release, when we are... have we already entered code freeze? Well, yeah, anyway: I see that the merge queue is not actually that big right now, so I think we might want to kind of introduce these as a beta test: introduce these and see how it works. And another thing which actually just came to my mind is something I brought up earlier in some discussion...
E
Yes, except that it still ends up in as bad a situation, right?
A
No, because most of these regressions actually come in during the development phase. For example, I've been debugging the regressions which came in during the active code period, which is the first one and a half months or so. So yeah, they come in in the first half of the release, the first half of the quarter, and then we debug them for the remaining half of the quarter, slowly removing them one by one.
E
As Steve brought up, there are things we can do to improve the merge throughput, though they might affect quota again. But not getting feedback in a reasonable time is probably going to affect developer experience a lot more than a variation in the merge rate, which already happens at times due to other things, like a job just being totally broken for some sudden reason. One question about this, though: I think there's a pretty common developer workflow where you push changes very quickly to GitHub. On a couple of our repos we're seeing developers make changes on the order of every 3 minutes or every 5 minutes, so they do that 10 times, and so they'll see a whole slew of jobs get started and then very quickly cancelled.
A
Yes.
D
Well, but still, cancelling a hundred-node job is non-trivial, right?
A
So, Steve, this was exactly the problem. This is exactly the reason why this job was failing almost all the time when I initially added it: what happened was, there was no proper garbage collection mechanism. We didn't have a janitor for these presubmits, because they are all running in a single project with this huge quota, so we cannot really clean it up well, and because of that, such aborted runs were leaking clusters, and I had to manually delete them all the time, and that ended up creating a big mess.
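For what it's worth, the general fix for this class of problem is a periodic janitor that reaps resources leaked by aborted runs; kubernetes/test-infra grew tooling along these lines (e.g. Boskos and its janitors). A minimal sketch of the idea, with entirely hypothetical types and helpers standing in for real cloud-provider calls:

```go
package main

import (
	"fmt"
	"time"
)

// cluster is a hypothetical record of a test cluster created by a CI run.
type cluster struct {
	name    string
	created time.Time
}

// listClusters and deleteCluster are hypothetical stand-ins for real
// cloud-provider calls (e.g. listing GCE resources in the shared project).
func listClusters() []cluster {
	return []cluster{
		{name: "pr-12345-density", created: time.Now().Add(-3 * time.Hour)},
		{name: "pr-67890-load", created: time.Now().Add(-10 * time.Minute)},
	}
}

func deleteCluster(c cluster) error {
	fmt.Printf("deleting leaked cluster %s\n", c.name)
	return nil
}

// janitor deletes any cluster older than ttl. Aborted presubmit runs cannot
// clean up after themselves, so a periodic sweep like this keeps a shared
// quota pool from filling up with leaked clusters.
func janitor(ttl time.Duration) {
	for _, c := range listClusters() {
		if time.Since(c.created) > ttl {
			if err := deleteCluster(c); err != nil {
				fmt.Printf("failed to delete %s: %v\n", c.name, err)
			}
		}
	}
}

func main() {
	// In CI this would run on a schedule; the 2-hour TTL is arbitrary.
	janitor(2 * time.Hour)
}
```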
C
Since we're hitting the end of the meeting, I don't think we're going to be done today in terms of making a decision here. The biggest reservation that I think I'm hearing is that at least the Kubemark-500 job is slow. So maybe we could identify, for the GCE-100 job, out of the current timing, what is spent doing what, and what our options are to speed it up in the future?
A
What I would suggest is: maybe let's add these jobs for now, because we are actually not under merge-queue pressure right now, and there are not many PRs in the submit queue, and then I will in parallel work on trying to improve these times. And if it really becomes a pain in the ass, we can just disable the Kubemark-500 one later. But at least let's start catching these regressions and stopping them, because that is a lot of saved engineering productivity.
J
It was very opaque as to what the problem actually was: the logs were not clear, and the JUnit reports that were shipped up to Gubernator were not clear as to what the actual problem was. There was just a bunch of pods that were going into CrashLoopBackOff. So that's a difficult thing for somebody who's not a scalability engineer to figure out: okay, what is actually the failure here?
A
The first point is that we are not going to fail this test for most trivial or simple PRs or simple changes. Given the scalability signal, for the ones which actually failed, it means that something is actually wrong with respect to scalability in that PR, and someone from the scalability team might need to look into that PR to see what is wrong, because there might actually be some serious scalability consequence or impact from that PR.
A
But even if someone from scalability doesn't look into it, we still should stop that PR from going in. And with respect to understanding this: yes, that is something I personally feel is going to develop over time once we introduce this process, because then people will start caring about more than just their normal presubmits passing, and actually about the quality of solutions and the quality of the changes, because scalability will then actually be something that people start caring about.
J
This was the particular thing that was failing to start, and that was caused by the regression. So I agree we would catch regressions if we make this blocking, but on the flip side we need visibility, either so that individual PR authors can identify what the problem is and correct it, or by having scalability engineers who are able to really dig into it themselves. The other thing that I wanted to bring up, which is kind of related, is the timing. As far as making them blocking, I think really...
J
We would need to engage with the release team there, and, you know, we've obviously got some people here who are members of that release team, but that would be my other concern, in particular as we're going into code freeze. Yes, the merge queue is smaller, but if there's a critical fix that needs to go in for something else, and then all of a sudden it gets blocked, that would also be a bad situation to run into so close to the release.
A
E
A
E
B
A
B
F
The
to
the
point
about
like
allowing
developers
to
better
understand
failures
and
their
tests
or
failures
and
their
jobs,
there
is
some
work
being
done
on
sort
of
better
understanding
of
logs
and
better
sort
of
viewing
of
things
that
went
wrong
and
the
things
that
went
right
with
certain
job.
That's
going
to
be
coming
out
very,
very
soon
for
review
by
the
community.
So
if,
if
you
guys
are
interested
in
and
like
a
better
understanding
of
how
those
things
are
happening,
take
a
look
at
that.
K
I'm also really doubtful that we have enough quota for this. Even if you just flipped it to a non-blocking, skip-report job, you'd probably find we don't have the quota for hundreds of nodes, because you'd need to run this presubmit over 200 times a day, and I think you might expose a lot of issues before.