From YouTube: 2021-04-29 GitLab.com k8s migration APAC
A: Yeah, it's just scheduled over the scalability demo. Great. Did you have anything that you wanted to show for this, honestly?
D: So I think we're all here, so, Graham, over to you.
C: Look, I'll be really quick; Andrew and John have got a meeting that clashes. Basically, we're continuing the Consul investigation, and I'm pretty confident at this point. We've made a few changes that I'll actually look at rolling back, because they haven't made anything better; they've allowed us to debug things further. But for the next step, I think I now know what is happening.
C: Basically, it's very difficult to debug, because the story around what happens to a DaemonSet pod during node scale-down is very different to a normal pod. A normal pod just goes through the normal termination process; DaemonSet pods get largely ignored. But essentially what we are seeing, or what I now think I'm seeing and would like to make a change to prove, is that Consul is getting sent the signal to deregister itself from the cluster before it's removed from the Kubernetes Service endpoints.
C: So DNS requests are still being sent to the pod, even though it's in a state where Consul itself thinks it's basically shutting down or not available. Typically how we solve this is we put a preStop hook in that sleeps for 10, 20, 30 seconds, so that before the pod gets the signal we have time to make sure the service is deregistered.
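As a rough sketch of the preStop-hook pattern being described (the names, image, and sleep duration are placeholders, not the actual Consul chart):

```yaml
# Illustrative only: a DaemonSet container with a preStop sleep, so the pod is
# removed from Service endpoints before it receives SIGTERM.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: consul
spec:
  selector:
    matchLabels:
      app: consul
  template:
    metadata:
      labels:
        app: consul
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: consul
          image: consul:1.9   # placeholder image/tag
          lifecycle:
            preStop:
              exec:
                # Sleep before SIGTERM is delivered, giving the endpoint
                # controller and DNS time to drop this pod first.
                command: ["/bin/sh", "-c", "sleep 20"]
```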
C: The problem is the upstream Helm chart actually hasn't got any support for this, and it's a change I want to test soon. So it goes back to this whole song and dance of: how do we actually do this? There are, you know, like four lines of YAML I want to add, but they're not in the upstream Helm chart. Do I put a merge request up to the upstream Helm chart, and then all of this? To be perfectly frank, it's hell we have to go through just to try and add some stuff.
C: So I've got a merge request up that leverages something I've talked about before: using helmfile's built-in patching functionality to patch these files. Basically, helmfile can take the Helm chart with all the settings you want and then patch extra lines over the top. You just say, patch these settings on, and it applies a patch on top of the rendered YAML files to put the settings you need in. And there's a larger discussion I'd like to have about this.
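For reference, this is roughly what that helmfile patching looks like; the release name, chart, and patch content here are illustrative, not the actual MR, and the exact field names should be checked against the helmfile docs.

```yaml
# Illustrative helmfile release using built-in patching support.
releases:
  - name: consul
    namespace: consul
    chart: hashicorp/consul
    values:
      - values/consul.yaml
    # A strategic merge patch layered over the chart's rendered output,
    # adding the preStop hook the upstream chart doesn't expose.
    strategicMergePatches:
      - apiVersion: apps/v1
        kind: DaemonSet
        metadata:
          name: consul
        spec:
          template:
            spec:
              containers:
                - name: consul
                  lifecycle:
                    preStop:
                      exec:
                        command: ["/bin/sh", "-c", "sleep 20"]
```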
C: Should we use this functionality? Obviously, using it on something like the GitLab chart is, I suspect, a very big no-no, because we should be pushing those changes upstream. But for something like this third-party chart, where I can just patch it and move on, it beats having to put a merge request up to their GitHub, wait for them to merge it, wait for them to cut a new release and put it in their chart repo, or else fork the whole chart; there's a lot of rigmarole around that.
B: How often do you think that patching just completely breaks? It's like...
C: I don't think it will break very often; it's pretty resilient. It supports strategic merge patches and JSON patches, and, I didn't actually realize this, there's an RFC that defines the format for JSON patches. So if, for example, you're patching a setting, the worst thing that can happen is they add that setting upstream and all you're doing is overriding it with yours, instead of taking the upstream setting.
C: So Helm renders the whole thing out into one giant list of manifests, your list of Kubernetes objects, and then your patch is not just a text diff. It's more like: find this object with this apiVersion and this name, say an Ingress at a given API version, and then please patch that. It's not impossible for them to rename things or change their Roles or ClusterRoles, so it could break, that's definitely possible, but it is a little bit more intelligent, fortunately.
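The object-targeted patching being described looks roughly like this in helmfile's JSON-patch (RFC 6902) form; the target and path here are illustrative only:

```yaml
# Illustrative only: the patch is addressed to a specific rendered object by
# group/version/kind/name, rather than being applied as a text diff.
releases:
  - name: consul
    chart: hashicorp/consul
    jsonPatches:
      - target:
          group: apps
          version: v1
          kind: DaemonSet
          name: consul
        patch:
          - op: add
            path: /spec/template/spec/terminationGracePeriodSeconds
            value: 60
```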
C: We already do this in Tanka; Tanka actually makes heavy use of this already, like the Tanka deployment stuff, because it's all jsonnet. They can just say, here's a Helm chart for Prometheus, and then basically patch all of the bits as JSON, because our stuff is, you know, not standard; it's a little bit trickier than the default setup. Anyway, I don't want to derail the conversation about that too much.
C: I can put the MR up there and ping you guys on it to have a look, but the end result is I would like to try and test this setting for Consul, because I'm confident it should make things better. It's just that there are two parts to this: the Consul problem itself, and the whole workflow we have for trying to make simple manifest changes when we have a third-party upstream chart. I'm absolutely not against forking their chart either and keeping a local copy, temporarily even, until we figure this issue out; then, once we've figured it out, I could put a merge request up to consul-helm and do the proper thing. I just want to close this quickly, because I know it's going to start holding up the API migration.
D: Yeah, that sounds good. Do you want to confirm it locally? Is that the most efficient way, confirming it locally and then putting the chart change in?
C: Yeah, so there are two parts. I've confirmed I can put an MR up with something that works, and I've confirmed it works in pre. The problem is that testing whether this fix will solve the Consul problem is hard, because the only environment exhibiting the problem is production, so I'd need to actually roll it through pre, staging and production before I could definitively say it fixes it. I won't go into too much detail, but thinking about it more and more, the model they have for deploying Consul in the Helm chart, which is the DaemonSet, works well for the use cases they prescribe, but for the DNS use case we have, I don't think it's the best deployment strategy. For example, in production we run 70 Consul pods. Yes, we're serving DNS traffic to a lot of pods, but DNS traffic is not heavy.
C: It does seem like a lot of Consul pods just to serve DNS requests, but if you're using the actual Consul HTTP API and service registration and deregistration and all those other bits and pieces, it makes more sense. In that model they say: use a DaemonSet and only talk to the Consul pod on your own node, which is the next step if this change I have doesn't work.
C: That's what we'd look at doing altogether, and that is blocked on a GitLab chart change happening, and would potentially also be blocked on a GitLab application change as well.
C: At the moment I can inject the host IP of the Kubernetes node via an environment variable, but we expect that setting to be in a text file. So we'd either have to change the GitLab app to read the environment variable, if it can't already, or I just have to add a new init container, which I think I could probably do, and maybe that's the simplest: an init container that copies the environment setting into the proper location and file.
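A minimal sketch of that idea, with hypothetical names rather than the actual GitLab chart structure: the node's host IP is exposed via the Downward API, and an init container copies it into a file the application can read.

```yaml
# Illustrative only: Downward API env var plus an init container that writes
# the value into a shared file for the main container.
apiVersion: v1
kind: Pod
metadata:
  name: example-webservice
spec:
  initContainers:
    - name: write-host-ip
      image: busybox:1.35
      env:
        - name: NODE_HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP   # IP of the Kubernetes node
      command: ["/bin/sh", "-c", "echo -n \"$NODE_HOST_IP\" > /config/host_ip"]
      volumeMounts:
        - name: config
          mountPath: /config
  containers:
    - name: app
      image: example/app:latest          # placeholder
      volumeMounts:
        - name: config
          mountPath: /config
          readOnly: true
  volumes:
    - name: config
      emptyDir: {}
```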
D: If there is something, fine, we can call people in to help. I was just curious about what we might want to ask for help with.
C: Look, I think I'm okay at the moment, because I'm pretty sure one of these two things is going to fix it. As I said, they're such simple changes; it's just about doing the right thing to get them in. I could go into the environments right now and hack those changes in by hand, but trying to do the right thing and get these changes properly deployed is, I guess, the tricky part.
D: Yeah, absolutely. Well, we don't have to rush to get this done today; it's good to know what approach we need to take. But in terms of where Scarbeck is with the other side of the API work: I think he's still working through the nginx request buffering work, and certainly that would be the first piece we need to resolve. Then we need to run some traffic on canary and see how that looks.
C: Sure. Oh man, I'm really a bit nervous about even going to canary with this issue. I mean, maybe not, I don't know. I know we've just had other Postgres issues recently, and I'm nervous about putting undue load on the Postgres cluster, but yeah. Look, that's fine, we'll just take it as it comes, I guess.
C: So I guess for my next steps, I'm happy to capture all this on the issue itself and we can make a decision. I can go through the different options I have for actually rolling out at least some of these changes and what they're blocked on, and then we can work out what the best option is.
A: No, I don't have anything, sorry. I was sort of multitasking, so I didn't really follow.
A: Hey Graeme, just so I understand where we are from a very high level: we first added resource reservations to the Consul pods, and that seemed to clear up most of the errors we were seeing, and now the remaining errors we're seeing will hopefully be eliminated entirely if we keep traffic on the same node. Is that where we are?
C: You're right. I was just going to say I'm pretty confident that, yeah, the last series of errors is just due to sending traffic to nodes that we know we should probably not be sending traffic to, because they're due for deletion.
A: Right, okay.
C: And with the error rates, it's hard to tell how big or small this error rate is, but for a service like this we're still getting, sometimes within a 30-minute period, a hundred or a thousand of them. So it's enough that I think this should be a bulletproof service. We really don't want things flip-flopping between the slave and the master; we really don't want that, in my opinion anyway. So we really should get this error sorted as much as we possibly can.
D: Nice, okay. And then can I just ask as well about this MR that Scarbeck opened up yesterday for the ingress stuff? Just, Graham, as we've got you here, it'd be good to get your thoughts on where we are and what our next steps might be.
C: Yep, so I put a comment on that today. Myself and Jason, I think, both had the same realization overnight that there's actually a better way we can do that, so I've created another MR with a different approach, which I think gives us the same solution in a way that's a little bit more manageable. I think I mentioned it to you briefly in our one-on-one.
C: Basically, what we're trying to do is fight against the Ingress specification: what ingress-nginx can do as part of the specification and what it can't. But the good news is that ingress-nginx configures one set of pods running nginx with the configuration combined from every Ingress object it is given inside the entire Kubernetes cluster. So we can leverage that by creating multiple Ingress objects with the same domain, like gitlab.com, but with different paths.
C: So, the paths that we need to change settings for. Because with the annotations you can only turn them on or off for an entire Ingress object, and that's the problem. But we can create multiple Ingress objects for gitlab.com with different paths in them: one for, say, /api, one for the LFS endpoint, or whatever the endpoints are.
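A minimal sketch of that multiple-Ingress idea; the hostnames, paths, backend names, and annotation values are illustrative, not the real MR.

```yaml
# Illustrative only: two Ingress objects for the same host, so the
# ingress-nginx buffering annotation applies only to specific paths.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitlab-webservice-api
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: gitlab.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: gitlab-webservice
                port:
                  number: 8181
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitlab-webservice-api-unbuffered
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/use-regex: "true"
    # ingress-nginx-specific: disable request buffering just for these paths
    nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
spec:
  rules:
    - host: gitlab.example.com
      http:
        paths:
          - path: /api/v4/jobs/request
            pathType: ImplementationSpecific
            backend:
              service:
                name: gitlab-webservice
                port:
                  number: 8181
```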
C: Yeah, so I've got the MR up, and basically I just create an extra Ingress object that has the regular expressions for all those problem endpoints, with the buffer settings set to off, or whatever the changes were. Then I tested that in pre, and the nginx configuration that was generated looked fine: it had that location with proxy buffering off and the standard API location with proxy buffering on, and nginx obviously works off the most exact match, so requests should fall through correctly. We should test this, obviously, but it looks to me like it will solve the issue. It's still a little bit of weird management, because we have these extra Ingress objects that we've created outside the chart, which I'm managing in that extra Helm release. But we can push that up to the chart easily enough: we can figure out a way for them to create multiple Ingress objects in the chart, and when they've done that work we can just change over.
C: It is good. The only thing we do need to be aware of is that it might be, likely is, an implementation quirk of ingress-nginx. Well, let's be honest, all those annotations we're using for these settings are ingress-nginx specific; they are completely ignored by any other ingress technology, which is why the Ingress spec is problematic.
C: So the Ingress specification got ratified to 1.0, and Kubernetes have already realized it was anemic, so they've started work on the new spec, which is the Gateway API, modeled off Istio and Envoy. That's the proper implementation, where you can specify every single setting down to the path, and it's all designed with different implementations in mind. It's the proper solution, but, you know, it took them this long to get it right.
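For context, the Gateway API attaches routing rules and settings per path rather than per Ingress object; a rough illustration (resource names are placeholders, and the API version shown is from the later stabilized spec):

```yaml
# Illustrative only: per-path routing rules attached to a shared Gateway.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: gitlab-api
spec:
  parentRefs:
    - name: shared-gateway
  hostnames:
    - gitlab.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: gitlab-webservice
          port: 8181
```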
D: Yeah, really nice, excellent, good progress there. Awesome. Is there anything else anyone wants to discuss? Demo?
B: I mean, it'll be very rough; it's all just Thanos at the moment, there are no pretty dashboards yet. Let me share my screen, because this will be a very rough demo.
B: So the first problem I was trying to take a look at last week was whether we could monitor the node pools to see whether they would reach their limits. Then I hit the problem that that information is not exposed through any metrics; the only place we have it is in Terraform. The first way I tried to fix that was with a change that put all that stuff into Terraform vars.
B: But I realized that even with the Terraform vars, every environment has different sizes, and then between the regional and the zonal clusters we had different sizes, and sometimes we even had different sizes between different zonal clusters. It was a bit of a mess, and I started putting comments everywhere like, if you change this number, remember to change it in the runbooks as well, and I was like, that's just not going to work. So I built this real horrible hack, which someday someone's going to come across.
B: It runs in CI and basically asks Terraform to generate JSON, then uses jq to pull things out of that JSON. The one thing that's kind of interesting about it is that there might be other things in future where we need this, or we might want to do other things with it as well. But effectively what we do is push that to the Prometheus push gateway, and the only thing I'm unhappy about is that it goes to the ops push gateway. So we have this production information, like here you can see this cluster is gprd gitlab-gke, but because we pushed it to the ops Prometheus push gateway, the environment labels are all ops. And there's one other problem with this: we do all of our saturation monitoring in Prometheus, not in Thanos, and so the Prometheus that needs this information doesn't have it, because it's in the ops Prometheus. The two ways I can fix this are: I can either do the evaluation in Thanos, which is kind of against best practice, but I can do it, though I'll have to change some of the infrastructure around that; or the other option I was thinking about is, can we get to the gstg and gprd push gateways somehow from the CI job that runs on ops, because that would be a cleaner operation, if we could do that.
C: This looks good, especially for being able to get arbitrary metrics out of Terraform. But my understanding is that when we configure Terraform to set the node pool sizes in GKE, they actually use the cluster autoscaler project to implement that, and I'm just seeing here that it's got a metrics endpoint. Have we actually tried looking at the cluster autoscaler metrics endpoint?
E: See, these are... yeah, I wonder.
B: Cluster autoscaler, yeah, so they kind of come back to me, but yeah, I looked around quite a bit and I didn't find it.
B: Yeah, okay, cool. I'll definitely take a look at that a bit and try and understand what's going on there; that looks super interesting, thanks for the link. So then, effectively, what we can do is count up how many nodes we've got, and now we have these numbers here, which are the numbers we set in Terraform, and either we can do...
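The comparison being described, actual node count against the maximum exported from Terraform, would look roughly like this as a Prometheus recording rule; the metric and label names are hypothetical placeholders, not the real saturation rules.

```yaml
# Illustrative recording rule: node pool fill ratio per pool.
groups:
  - name: node-pool-saturation
    rules:
      - record: node_pool:size:ratio
        expr: |
          count by (node_pool) (kube_node_info)
            /
          max by (node_pool) (node_pool_max_size)
```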
A: Yes, we do; it's running, I think, on the blackbox VM, yeah.
B: What I wanted is one per environment, because of the problem I was describing: I need to evaluate this in Prometheus, but I've only got half the metrics in Prometheus right now; the other half is in the ops Prometheus. So I actually need one per environment, but I also need to be able to access it from ops.gitlab.net.
A: Yeah, oh well, that should work. I think ops is peered.
A: It might already be peered, I think it is, but I'll have to double-check to see if you can reach the IP. If you can, what we have typically been doing is setting up DNS entries in Route 53 using gitlab.net. It sucks, because it's public, but yeah, that would give you access.
A: Yeah, make sure it's a statically allocated internal IP, not an ephemeral IP. It should be statically allocated internal, okay.
B: In that case, I might actually switch over to pushing to the appropriate push gateway, because then it's just less technical debt. At the moment I was thinking of doing all this weird stuff in Thanos Ruler, and there are places where we break the rules of Thanos Ruler, but we know very much why, and this would be one of the cases where we'd be breaking their recommendations, going against their rules. So okay, cool, thanks for that, I appreciate it.
B: The second thing that I did was, I can't even remember, oh yeah, the node pool labeling. So basically, this is also really boring at the moment, because we don't have any dashboards for it, but we've got a whole bunch of things like node_load15, for example. Now, obviously, load is not a particularly useful one, but there are other ones.
B: I was looking at the OOM stuff this morning, so we can now say here type equals sidekiq, which we could never do before, and get the load on the nodes that run Sidekiq. What I was looking at this morning was OOM kills.
B: I've been calling it affinity, like, generally, if we're getting OOMs on this fleet, it's probably Sidekiq, but this morning we noticed that it's actually some Google metrics agent that is OOMing. So here you can see that in the last 10 minutes this node, which we treat as Sidekiq, has had two OOMs.
B
You
know
so
and
that's
that's
what
I
was
looking
at
this
morning,
but
we
can
start
tying
in
like
the
health
of
those
nodes
to
to
different
clusters,
and
also
we've
got
like
you
know
whether
they're
canary
or
not,
it's
interesting
that
it's
also
happening
on
the
registry
nodes.
I
wonder
if
it's
the
same
thing
so
can
we.
A: Well, I mean, I don't have a good alternative for that without the overhead. I mean, what would you prefer us to do? What about...
B
So
so
what
about?
If
we
just
had
a
fourth
like
a
zonal
cluster
which
was
the
canary,
so
I
mean
I
there's
two
options:
the
one
option
is
we
just
run
them
on
the
same
node
pools.
You
know
because
there's
no
there's
no
real
reason
to
separate
them,
but
I
know
that
from
a
helm
point
of
view,
that's
difficult
because
there's
problems
with
home
in
doing
that.
As
far
as
I
understand.
A
But
there
is
an
advantage:
you
see
that
like
we
get,
if,
if
we
deploy
to
the
regional
cluster
first,
that
gives
us
also
the
advantage
of
deploying
the
canary
first
for
configuration
changes
right.
Like
isn't
that
good,
I
thought.
B
It
was
a
good
thing,
so
I
I
just
I
I
the
the
reason
why
I
was
surprised
by
it.
Let
me
rather
put
it
that
way
than
dislike.
It
is
that
the
the
canary
is
is
a
regional
and
then
the
and
then
all
the
workloads
are
zonal.
So
it's
kind
of
like
it's
different
and
there's.
First
of
all,
it
was
kind
of
surprising
to
me
because
I
was
like:
oh
you
know:
why
have
we
still
got
git
in
there,
but
that's
maybe
just
my
naivety.
I.
C: I do remember. I think, going off your point about configuration changes in canary: at the moment we conflate things; we use the, quote unquote, helmfile environment to actually mean cluster. Really, what we should have is an environment which is canary, an environment which is production, and then a cluster which is, you know, zonal, regional, whatever.
C
We
should
be
having
two
environments
with
multiple
clusters
inside
them
and
be
modeling
around
that,
whereas
at
the
moment
our
that
repo
we've
got
just
models
off
an
environment
and
canary's
an
environment,
then
the
regional
zonal
and
I'm
happy
to
change
I'd
love
to
go
back
and
turn
that
model
around,
because
it
is
starting
to
like
we're
seeing
already
it's
starting
to
be
tricky.
But
it's
it's.
A: The only reason I was thinking of keeping it on the regional cluster was because, one, it's kind of nice for config changes, right: we deploy to the regional cluster first and the blast radius is much smaller. So that's one reason. The second reason is that canary doesn't have a lot of traffic, and there is a bit of an overhead in duplicating our canary resources by three, right? But maybe with autoscaling it's not a big deal, I mean, but...
A: You're probably right, I mean, with the way things are packed, it might be cheaper to do it this way. Another reason, you could argue, to do it is network traffic, to keep it in the same availability zone. Right now for canary, for HTTPS, yeah, it crosses AZs because, okay, you know how this works. So yeah, I'll open up an issue.
D
Can
we
get
an
issue
to
like
think
about
like
what's
involved?
What's
the
what's
the
benefits
of
doing
it,
like
aside
from
it
being
a
bit
surprising
that
canary's.
B: Yeah, I mean, from a metrics point of view it's not a big deal. I guess from my point of view it's just that it's running in a different way, but that's not a really, hugely persuasive argument, right? I don't think it's urgent in any way, certainly, but those cluster sizes are, yeah, I mean...
B: It would be nice if it was running in the same way, and then potentially it could pick up some more problems: if there was a problem with the way we deploy with the zonal deploys, we'd get that in canary as well, where at the moment we wouldn't. I mean, I can't imagine there'd be a huge number, but yeah.
D
Okay
yeah,
maybe
it
fits
well
with
like
some
point
in
the
future.
I
guess
once
we've
got
through
the
kubernetes
migration.
There's
like
canary
can
totally
change
right
like
it
doesn't
have
to
be
set
up.
How
it
currently
is.
D: Like, for example, we could do more gradual traffic rollouts as we go through a regular deployment.
B: Cool, yeah.
D: Yeah, definitely, that would be awesome. Okay, well, let's get an issue, and then we can at least work out whether this is something we want to plan in for the future, whether it's an approach we'd actually want to take, or whether there are other things we need to watch for.
D: Cool, awesome. Is there anything else anyone wants to get through?