From YouTube: Monthly Testing Internal Customer Call - July 2020
Description
An in-depth discussion on the next steps for TestFileFinder and the design for Test History.
Links:
https://docs.gitlab.com/ee/user/project/merge_requests/fail_fast_testing.html
Test History Design: https://gitlab.com/gitlab-org/gitlab/-/issues/223737/
A
This is the internal customer meeting for the Verify Testing group for July 2020. I have the first point on the agenda, which is just a general update to the roadmap deck. There are really no changes of note in the progress on the epics. We have some longer-lived epics right now that we've made a ton of progress on in the last month.
A
We haven't delivered anything of note that I wanted to call out, so mostly I wanted to say thanks again to Grant for contributing the load performance testing MVC, which introduced a whole new category for us, which is just amazing, so I really appreciate that. Then I really wanted to take some time with Kyle, Joanna, and Mack to talk through how things have gone with our identify-failures-fast template, the TFF gem. I really think we could dedicate some time to that.
A
I do want to, before we adjourn today, show some designs that we're starting to do solution validation on for the test history, both in the JUnit report and at a project level, to make sure that this group is aware of those as we're making progress in iterating on them. So don't let me forget about that before we adjourn. But let's go ahead and jump in: how are things going with identify-failures-fast, Zeph? Do you want to vocalize your point?
B
Yeah, so I tried to do a simpler integration, one that Albert had identified as a possible target environment to run this in, and that was the customer portal. We did get it enabled and ran it for several days, and captured quite a few data points.
B
In the end, it was burning up quite a few more CI minutes than everyone was comfortable with from a value perspective, mainly for two reasons. One, they weren't actually having that many failures: of all of the pipelines that were executed, it only caught three failures, so it was somewhere under five percent. The other issue was that they had just optimized and streamlined their test pipeline.
B
For this I had a generic cost of five to six minutes just from the environment perspective, and then the number of RSpec tests was obviously minimal, usually measured in seconds as to how many of those were executed. So we did end up backing it out. Currently we're trying to figure out how to integrate it into the specific job itself.
B
We can't do anything direct; it's basically going to be stealing the logic and trying to do it in a bash script, to check to see if it fails. If it does, we'll go ahead and fail the job; if not, we'll pass on to the rest of the RSpec tests.
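The short-circuit described here could look roughly like the sketch below. This is hypothetical, not the actual script: `map_to_spec` is an illustrative stand-in for the file-to-spec mapping that the tff gem performs, and the `CI_*` variables are GitLab CI's predefined ones.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the short-circuit described above: before the full
# RSpec suite, run only the specs that map to the changed files; if any of
# them fail, fail the job immediately.
set -euo pipefail

# Naive stand-in for the tff gem's mapping:
# lib/foo/bar.rb -> spec/foo/bar_spec.rb
map_to_spec() {
  case "$1" in
    lib/*.rb)  echo "spec/${1#lib/}" | sed 's/\.rb$/_spec.rb/' ;;
    app/*.rb)  echo "spec/${1#app/}" | sed 's/\.rb$/_spec.rb/' ;;
    *_spec.rb) echo "$1" ;;
  esac
}

# Only meaningful inside a GitLab CI merge request pipeline.
if [ "${CI:-}" = "true" ]; then
  target="origin/${CI_MERGE_REQUEST_TARGET_BRANCH_NAME:-master}"
  specs=$(git diff --name-only "$target" | while read -r f; do
    map_to_spec "$f"
  done | sort -u)
  if [ -n "$specs" ]; then
    # A failure here fails the job before the rest of the suite runs.
    bundle exec rspec $specs
  fi
fi
```

In a pipeline this would run as the first step of the RSpec job, so a failure in the mapped specs fails the job before the full suite starts.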
B
So that's where we've landed at the moment. I'm on rotation this week for our pipeline triage, so I'm trying to spend some time on that, but it's been a little difficult.
A
Sure. I just want to make sure I understand: for the project, implementing the template just as is was taking up more minutes than it was saving, and is that as against a scheduled pipeline?
B
Yeah, that was for MRs and merge trains, right. That's how the template is actually set up; it's not for just your regular scheduled runs. It's basically for when anybody submits a change. So what we're learning from that is that there's a threshold at which this becomes valuable enough, and I think in long enough running pipelines, and pipelines with enough changes where we have a better chance of capturing issues, it definitely could provide value.
B
The RSpec jobs, I think there were 270 tests, off the top of my head.
B
And I think it looked like a promising target to begin with, because their pipeline was almost twice as long initially, but they had some caching issues that they figured out were taking up a lot of time, and they got that cleaned up just as we were integrating this in.
C
I just had one follow-up: were you able to look for the reverse of what you mentioned happening? You said that there were three times where the job caught a failure and killed the pipeline. Were there any times where it didn't catch a failure and it failed later in the pipeline?
B
Right, and funny enough, all of the failures were with a particular developer, so this is the other thing that I learned from it as well: I think it's also going to depend on the development process in general that all the developers are following. If everyone's running their RSpec tests regularly, locally, before they're submitting changes, we're not going to catch a bunch of failures, especially if they're running all the jobs similar to the way we are. Either all the other developers were just excellent and this one developer is not, or this one developer was just going ahead and pushing it, letting the pipeline catch the failures instead of running the tests locally.
D
Yeah, so I was just going to ask in the comment, and I've linked to that down below. I didn't realize you're on rotation, so I could just add it to the feedback issue, because I thought you did a great job summarizing it. But when it says six minutes was added, that's just because there was a new stage? So the total time for the pipeline to complete was six minutes longer for the majority of the cases, just to begin?
B
Well, I mean, it was six additional minutes compared to how long their pipeline was running previously. Yeah, okay.
B
Okay, should I reword that in a way to make it clear?
D
No. Sometimes I talk about job minutes, but there are other jobs that are running simultaneously, so it may not impact the total pipeline runtime. The pipeline runtime might be 50 minutes, but the six minutes might be consumed within a stage where there are other jobs running.
A
In addition, you found that there were at least three cases where it didn't catch something it should have, and three cases where it caught something we didn't expect it to catch.
B
There were three cases that caught something we didn't expect it to catch. I don't know if it should have caught the other three cases, because we're talking about running the entire test suite for those errors to have been captured.
D
So what I was thinking, and I can bring this up: there are Danger, frontend tests, linting, RSpec, and RuboCop jobs all in the test stage for this project, and really we just want to short-circuit the RSpec job, because those are the tests that are being failed fast. For the other jobs, in theory, you could use DAG to start them earlier, so Danger could provide a review comment; things could happen to get some of the feedback. But you're still going to block some jobs, which were previously not blocked, to get that feedback.
C
Yeah, it would be kind of neat if, almost like if you had a context with cancel in Go and you spun up a bunch of goroutines, you could pass the cancel context to all the other jobs that were running and cancel them mid-flight with the fail fast. It'd be neat if you could run that job.
D
I'm sure we could, so that's interesting. I didn't think about that with what we were looking to do with the next evolution of TestFileFinder, but that might be something we consider with Engineering Productivity, because we should be able to make an API call to stop the other jobs, at minimum. That might be something to start with. Yeah, it's interesting; I didn't consider that.
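The "API call to stop the other jobs" idea could be sketched with GitLab's jobs API roughly as below. This is a hypothetical sketch: the `API_TOKEN` variable and the `RUN_CANCEL` guard are assumptions, `jq` is assumed to be available on the runner, and `CI_JOB_TOKEN` may not have permission to cancel jobs, so a personal or project access token is assumed instead.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: when the fail-fast job fails, cancel the sibling jobs
# in the same pipeline through the GitLab jobs API.
set -euo pipefail

# Build the cancel endpoint for a job id (CI_API_V4_URL and CI_PROJECT_ID
# are standard GitLab CI predefined variables).
cancel_url() {
  echo "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/jobs/$1/cancel"
}

if [ "${RUN_CANCEL:-}" = "1" ]; then
  # List running jobs in the current pipeline and cancel every one except us.
  curl --silent --header "PRIVATE-TOKEN: ${API_TOKEN}" \
    "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CI_PIPELINE_ID}/jobs?scope=running" |
    jq -r '.[].id' |
    while read -r job_id; do
      [ "$job_id" = "${CI_JOB_ID}" ] && continue
      curl --silent --request POST --header "PRIVATE-TOKEN: ${API_TOKEN}" \
        "$(cancel_url "$job_id")" > /dev/null
    done
fi
```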
A
Next up, Albert is actually picking up in 13.3 the issue that we were going to start in 13.4. I need to jump back into TFF with Drew and see where we're going next with it. Really, our map is to support Kyle and his team in making use of the gem, so they're going to really be helping us drive the revamp for this.
D
And the reason why I volunteered Albert is just because it'll help us advance on OKRs faster, versus waiting until 13.4. Albert was looking for some product development experience too, so things kind of matched up, I think, from a need and desire perspective.
A
Yeah, so we'll be looking for the next thing, then, in 13.4 to pull forward. I just haven't looked at the epic today, or recently, to know what that next thing is, and definitely we'll be leaning on Kyle to get the feedback of: hey, how can we help next? Where's the next problem that we can go help you solve?
E
Okay, so then what is the Ops team's effort on this in Q3? Is there any other iteration that we can do versus waiting for the EP team? It sounds to me like the EP team is going to be working on it, and it's not going to be a product-facing thing yet. Is that right?
C
The product iteration on the TFF is to get an example in the docs; that's the first MVC. It's like: this is how you could use this thing reasonably. Then, if the EP team is working on the configuration mapping, in parallel we could work on examples for other languages, or we could work with the Runner team, because I know Elliot is interested in this a little bit.
C
So if we could get that working for his Go projects, that would be another small iteration that we could take toward furthering the TestFileFinder specifically.
E
Okay, Kyle, let's catch up offline, because I'm a bit concerned about the headroom of the EP team, and mostly on our side. Because both the Ops quality team and your team have had some experience in this, I just want some clarity on who should be owning and doing what. Then I'd like us to move closer to a functional ownership, if we can. But yeah, just some concern about headroom on our side. Thank you.
D
Yeah, and the short version is we're facing a point where either we diverge more from TestFileFinder, where Albert or someone on the team would spend time building up what's called a gitlab-projects test file finder to do the same function, or we just partner and help add the feature to the product that we would then leverage. So that development work was going to be done by the EP team one way or the other, to work towards our Q3 OKRs.
E
Yeah, it sounds to me like our dogfooding momentum will be on pause for a bit, because we're waiting for improvements that the quality team has to ship, right? Is there anything that we can be proud of, that this thing has added value to customers, that it reduced the test time? The first MVC gives us feedback, and it's totally fine; it's great that we get feedback now. But how can we...
E
How can we expand this to more projects and not have to wait? Because we only tried it with one project. Are there any other areas where we could just drop this in and it's good, it's helping reduce the time to run tests?
B
I think I can help with that going forward for other projects, but I haven't given up on the customer portal. What I want to know is: if we're not using the template as is, we're not just including this as perhaps a customer would try to do, then maybe we're just wiring in the gem on the back end for short-circuiting this RSpec job, as Kyle described it aptly.
E
So dogfooding means it's available to customers out of the box. If it's in the documentation and customers can configure it the same way we are using it, I think that counts. It might not be a one-switch thing, but if it's close to one switch, where you add these three statements from the default documentation and get immediate value out of the test optimization, I think that's a win. Then we slowly reduce the gap in the next iterations: instead of a switch plus, like, four lines of customization, we slowly remove those customizations, and in the end you just flip a switch and it's automatically baked into your pipeline. You can proudly say that if you don't use this feature, you're going to have, like, 20% more runtime, and just be proud of that. That's where the end goal should be.
B
Also, would it make sense for me to continue with the customer portal and work on that RSpec job with that integration, and maybe supplement the documentation from a customer perspective, so they could understand how to use the gem? Or should we look for another project that would be an easier fit?
E
I'll leave that for you and James. Because this is the product direction, as long as we're moving towards progress there, I'll leave room for creativity in how you want to do it. But it should be part of the release, and it should be in the documentation. I think that's the most important thing.
D
Oh, I just linked to the summary, so we kind of just talked through everything that was in there. I will port that over to the feedback issue that you linked to above. And then I did find a short-circuited pipeline example that I linked to as well, just for reference, and that can show you; you can look around in the project and see the change in the pipelines before and after, if need be.
A
All right, so I just want to make sure, Nick. It sounds like next steps are: you and I are going to sync and pick a direction for continuing to go down the project with the template, versus pulling out the gem and documenting how to use that to short-circuit.
A
I think it's worthwhile to also look at other projects that maybe have bigger suites, where this might be more helpful, and to just better understand our use case and where it's applicable. We'll then circle back with the next steps within the TFF project of where we're going roadmap-wise, which I'll work on between now and our next call.
A
Did I miss anything in there that we talked through that just flew by me? Great. Anything else that we didn't talk about regarding TFF, or the efforts that we're making there?
B
That'll probably be covered with how you're thinking about expanding in future discussions. I'm assuming we are going to discuss applying this to other test frameworks as well. Yeah.
A
I'd say that in the customer interviews we've done, and the recruited interviews we've done, there hasn't been confusion about this, but my take is that they're really not sure what this is, so they don't know to be confused. They don't know what to expect on this screen yet. For the MVC, we may just pull this out and then really focus on history on the next screen.
A
When did they fail, in sequence? Some of the resounding feedback we got was: that's great, but which one failed when? It wasn't clear to a customer, or to a user, what they really needed to care about. They were really confused about the 9 out of 10 and the green, like: wait, one of those is red and one of those is green.
A
So: I don't know if I need to care about this or not. As we talked through it, we realized that what this really shows you is that this one failed, the previous nine, or sorry, backwards: this one passed, the previous nine all failed, so you had a failing test that just got fixed, and you don't really need to care about this one. But this one, maybe, if this was 1 out of 10: this one failed 10 times previously and it's passed now, so you really should care about that. So Juan's looking at a couple of different options to display this. One is even combining the two, so that it would almost look like a pipeline, with check marks and x's showing the past history.
A
We may condense how many appear on this screen, or even pull it into another detail screen where you would see the history; we're looking at a couple of different options here. The vision, though, is that if you have a test that failed, you could see: what are some of my past runs? Is it a flaky test? Is it a brand new failure of what's been a rock solid test so far?
A
Correct. I know we talked about that quite a bit on this call, and for our quality folks, pipeline would be the most helpful, I believe, was my takeaway from that. On the technical side, we haven't decided yet where we're going to grab the history from, so I'm not sure.
D
Yeah, sorry, what I was thinking of is the pipeline context. So if it's a merge request pipeline, it should be: how many times did this test run in different pipelines within that merge request? Or if it's a pipeline that ran on master: what are the last N number of master pipelines, and how did this test behave there? Not the last pipeline, because there should just be one, maybe two, executions of that test within the pipeline.
D
There might be multiple pipelines for an MR, or multiple pipelines for a branch, so I'm a little confused on that.
C
Yeah, we haven't settled on the actual technical implementation, because we kind of need to know what people are expecting. It's going to be a challenge either way, just because there's a ton of data here. Take storing a JUnit report for this parsing: each pipeline run right now on gitlab.com produces something like, I don't know, 30 megabytes of JUnit report that we need to parse in order to make this page. So if we store historical information about test executions, and we try to do that in the database...
C
...we're going to have a problem right away with that. So it's really going to be a matter of: how can we do this in a clever way, where we can maybe only store failures and test runs? And again, it depends what we want. Like Kyle was saying, if we're just really looking at merge requests, then: okay, in this merge request, this pipeline ran three times, and this test failed one out of those three times. We could do that.
C
But if you look at the last five times it was run, or the last 10 times it was run, that might be more valuable, because that's a better indication of whether that test is flaky or not than just having a smaller sample set. You could even eventually boil that down to summary statistics: overall, this test fails 33% of the time, so maybe it's not super important that it failed right now, right?
E
Right. Joanna, maybe it would be good to run the whole team on the test session thing that we did. I mean, maybe we wouldn't have to save the stack trace anymore in our issues, and could just use this, if it lands.
B
Well, I think that'll work if we're capturing it at a large enough context, because part of the point of the stack trace captures that we currently have in test cases is to give a more complete history, to enable us to really dig down and figure out where this actually started.
E
Yeah, so what I'm saying is, I think the team is on the right track. The next iteration: in addition to saving the results, the historical stack trace would be really valuable, and we can add on from there. So yeah.
A
I think where your historical stack trace, and being able to track that data, gets interesting is this view at the project level. So this would be at the project level; this is kind of that first iteration towards: here are your flaky tests, here are tests and the last 10 times that they've run, here are the most failures, here the most skips. Potentially there could be another view, as Juan has on the design here. We haven't talked through it, but there's probably an interaction here, maybe not in the MVC, but in a follow-up.
A
That could be a really slow interaction for the user, because you're going to have to load up all of those individual JUnit reports as you click in. But if you're doing the research, that's probably a penalty you're willing to pay, to individually load those JUnit reports and figure out: hey, when did this start failing?
C
It shouldn't be that slow, because since we did the refactoring where we're only loading one report at a time after you click in, it really takes, depending on the size of the JUnit report, one to two seconds to load it in, and then you can interact with it from there. It's not great; ideally you'd like to have 100 milliseconds or something when you click in, in order to load it, but we're parsing a file on the fly, so it's a little bit expensive.
E
Okay, James, would you mind clicking on the test case issue 613 under your point? Yeah. Oh, here, let me just put it in the chat for us.
E
So this is our historical view. Would you mind expanding the label bar? Great, so you just go all the way down. The reason I think it should be by project is that you see the results we track here: each of these environments is actually grouped by project CI structure.
E
So if I want to know the flaky tests in production, it should be scoped towards that historical context, and if you want to look at the flaky tests in staging, you scroll down a little bit; there should be some more, right? The next thing is staging, correct. So this would decouple this altogether, because right now everything is in one place, and this test case is really long.
E
Since we just keep adding discussion points: then you scroll down, and I think there are, like, 100-something comments that are automated by the bot. So I think separating this out into different buckets of staging and production, which is by project, kind of confirms that it should be at the project level, the historical context.
E
We don't use environments in our own structure right now; that's kind of where it's blocking. We're not using the concept of environments in our deployment, so each project is tied to a GitLab environment: you see canary, you see production, you see staging.
B
From a scope level, I think it's better to see the tests across environments as well, because if it's passing in one environment and failing in another, you've got an indicator that we have an issue with a particular environment, as opposed to a test. So I like the broader scope there, and not limiting it to just how many times this test has passed or failed in nightly alone.
E
I think they could end up being the same place. We just can't use that granular grouping of environments yet; we will just be using the different projects to look at different deployment environments on our end. But I think for the broader use case, grouping it by environments makes sense. Do we have a concept of environments in CI right now, or in the release?
E
I think that can be, like, two or three iterations in the future. I think if you solve this at the project level, that addresses like 80 percent, and then, when we have headroom, you can think of how you want to glue it together into, like, a CI analytics dashboard at a group level in the future. But in the immediate form, I think scoping the historical result of a test to the project makes sense, because, I guess, as Juan said, pipelines can run multiple times, right?
A
And I think where we're going to end up (Ricky's not going to hear this, probably) is that this design is going to split into two different issues, or it needs to split into two different issues pretty soon, because these designs are really targeted at two different users who have two different problems. It sounds like yours is more of the team lead user, who's looking at the project and, historically, what everything looks like overall, versus the developer who's maybe looking at just my branch, and maybe back to my target branch: how is this test behaving? Did I break it, or is this historically just a flaky test? That's in the context of my merge request, not in the larger context of how this test is performing historically, because they just want to know: is this always a flaky test, or is this something I broke and need to fix? So we may end up storing the data differently for the two different use cases. I don't think we're going to land in a place where we can store the same data that's used in both views. I don't think so; the smarter engineers may, you know, find a great way to do that.
C
I think we could, but we'll want to think about it some more.
F
Yeah, either way, it seems that we all agree that flakiness is a concept that lives at the project level, right? It's not a concept that lives at the branch level. You know, if you break tests in an MR, that's expected; but if they are consistently failing throughout the project history, then that means the test is flaky, and that's likely what we're trying to expose here: is this test flaky or not? So that makes a lot of sense.
C
Yeah, where my head's at right now is: when we're running partial sets of our tests dynamically, depending on the changes in the pipeline, how does that affect the statistics we're gathering about the runs for each test? So if this test only ever gets run when we run the whole suite, and it usually never gets run when we're doing partial-pipeline MRs, how does that affect our statistics for test runs?
C
I think we ran into this problem already, Kyle, when we were talking about coverage, because we're not generating complete coverage on every MR pipeline anymore: because we don't run all the tests, we're not going to spend the extra cycles to get the coverage report. So that problem, I think, is going to be pretty interesting, because maybe this test failed once over the last 10 times, but maybe it only ran four times today when all the other tests ran a hundred times. You know what I mean?
D
Yeah, I think where we're moving to is: the full test suite usually runs in master, and it's going to be as limited as possible in the MR. So in that context of project, as long as you're looking at where the pipeline is the most complete, which in a GitLab project, I think, would be the main branch, the master branch, that sounds good; you'll be able to detect these there.
E
I think these are really great points, and I don't think we should think of it in terms of missing data. I think historical test data should be treated as, you know, a glass half full: you didn't run it, that's fine, we just won't have the data. So we would just list whatever we have, and the branches can just be a filter later on. Where: hey, this test ran, like, a thousand times last month...
E
...and these are the branches that it existed in, and then you can add more dimensions later on as a filter mechanism. But the primary dimension, or character, of the test lives at the project level. So: unit test one ran, like, a million times last quarter, and it passed 80 percent of the time, and 20 percent of the time it failed, and these are the branches that it failed in, which product area, which product team is failing this test.
B
Just a quick question on that. I agree at the project level, especially for us; it makes sense. James, when you were mentioning the group level, were you thinking of maybe a company that uses microservices, with each microservice in its own project, kind of situation?
A
Yeah, we've heard from customers that they want, and this is even more of an aggregated view of all of our tests: how many are passing, how many are failing, and so rolling it up to that level, but also then being able to dig into that in between. So if you have a project setup where: this is our project for this environment, this is our project for that environment, or this is our project for Linux, this is our project for Windows.
C
So one thing we heard from Elliot is actually the reverse of what we're talking about with breaking things out. That was specifically about coverage, but I think this is applicable to tests too, because we're talking about grouping, and grouping is important.
C
So how can we facilitate both of these things, when we're trying to group things by project, but now we also maybe want to group things by, like, directory inside of a project?
A
We wrote up an issue, and I can link that back in here as well, or back into the discussion for that issue that Elliot brought up.
A
We have a few more minutes. Any other topics that we should cover while we have the group together?
A
Today? All right, I don't see anything else in the agenda, and I see everybody else on mute, so I'm going to say no. Thank you, everyone. Thank you, Ricky and Joanna, as always, for taking notes; appreciate it. This will get uploaded, unfiltered, a little bit later today.