From YouTube: Kubernetes SIG Network Bi-Weekly Meeting for 20200819
A: Hello, everybody. Today is Thursday, August 18, 2022. This is the Kubernetes SIG Network meeting. As usual, we are under the governance of our code of conduct, which distills down to: be good people. We have a pretty thin agenda today, and I just got back from a vacation, so I haven't had a whole lot of time to catch up on what's been happening in the last two weeks. I apologize for all the PRs in these emails.
A: Let's do some triage. Actually, there's nothing on the agenda other than triage, as far as I can tell, so if people want to talk about stuff, now is your moment to throw it on the agenda while we do some triage. We have 15 issues for triage, although we've pre-triaged a few. Where did this window go? There we go. I will go ahead and share a window.
A: Excellent. Everybody got that? Yes? All right, going from most recent to oldest: "DNS does not work correctly." This seems like a misconfiguration issue. I asked for a little bit more information.
A: It's pretty clearly not a Kubernetes bug; I could probably just close the tab. I assigned it to myself, and I'll try to help the person figure out their config issue. If anybody who's super familiar with the DNS side of things wants to jump in, they're welcome to; I'm certainly not going to be greedy. Next, network policy: "Host could not be resolved when adding network policy." It sounds, if I'm reading it correctly, like they have set up a Service with no selector pointing to an external object, an external IP address, and they're trying to use some selector in a network policy, but there's no pod to select. NetworkPolicy is written in terms of pods, not in terms of services. So I've asked for follow-up. If that's the case, then we just have to close this, unless we want to consider handling this use case more elegantly. It seems pretty niche, but.
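
For reference, a minimal sketch (not from the meeting) of the situation being described, using the Kubernetes API types; the names "external-db" and "allow-db" are hypothetical.

```go
// Hypothetical sketch of the case above: a Service with no selector
// forwards to whatever its manually managed endpoints point at (e.g.
// an external IP), so a NetworkPolicy podSelector has no pods to match.
package example

import (
	corev1 "k8s.io/api/core/v1"
	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// A Service with no selector: no pods back it.
var externalSvc = corev1.Service{
	ObjectMeta: metav1.ObjectMeta{Name: "external-db"},
	Spec: corev1.ServiceSpec{
		// Selector intentionally unset.
		Ports: []corev1.ServicePort{{Port: 5432}},
	},
}

// NetworkPolicy is written in terms of pods. This podSelector matches
// pods labeled app=external-db, and since no such pods exist, the
// policy selects nothing; it cannot "select" the Service above.
var allowDB = networkingv1.NetworkPolicy{
	ObjectMeta: metav1.ObjectMeta{Name: "allow-db"},
	Spec: networkingv1.NetworkPolicySpec{
		PodSelector: metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "external-db"},
		},
	},
}
```
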
A: So yes, at least on GCP, and I thought it was on other platforms too, the load balancer IP is configured as a local address, because in the VM case it would be a local address; you'd want to receive traffic on it, right?
A: Okay, I'll read the comments here, but I thought it was an interesting corner case. The slippery slope is that we could end up with as many rules in the filter chain as in the NAT chain, because we would have to recognize allowed IPs, which could perhaps argue for the mark-drop approach being the simpler model, which Dan just ruled out. At least then it's clear: just add one rule at the end of that chain, right? Although it's not clear from the code point of view where to put that, because we iterate on ports, not on IPs. All right, I'll look at the updates there. I'm going to go ahead and triage-accept this, though, because it is a bug.
A: I would presume that anything that acts like a VIP will have the same problem. Yes? Okay, moving on. I'll read the follow-ups here.
A: Yeah, we'll have to follow up on this one; issue filed anyway. This one is basically saying: hey, IPVS doesn't support node ports on localhost, which we know. They're asking for documentation, which is fair. I think the thing to document is that iptables mode supports localhost node ports, probably shouldn't have, and here's how you can disable it. There's a PR still open, I think, that disables it, so we should encourage people to disable that.
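
One existing knob in this area, for reference: kube-proxy's nodePortAddresses setting restricts which node IPs serve NodePorts, and leaving loopback out of it effectively disables localhost node ports. A minimal sketch, assuming the k8s.io/kube-proxy config types; the CIDR is an example value.

```go
// Minimal sketch: serve NodePorts only on non-loopback addresses.
package example

import (
	kubeproxyconfig "k8s.io/kube-proxy/config/v1alpha1"
)

func restrictedNodePorts() kubeproxyconfig.KubeProxyConfiguration {
	return kubeproxyconfig.KubeProxyConfiguration{
		// NodePorts are served only on addresses within these CIDRs;
		// 127.0.0.0/8 is not listed, so localhost node ports stop
		// working.
		NodePortAddresses: []string{"10.0.0.0/8"},
	}
}
```
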
C: Yeah, well, and also we have this allocateLoadBalancerNodePorts flag, which defaults to true for historical reasons, but on GCP it's not needed. If there were a way to set that as the default on GCP, then I wouldn't have filed this bug, because I would have tested it and said: oh, it doesn't work.
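
For reference, a minimal sketch of the field being discussed, opting out per Service, assuming the standard corev1 types:

```go
// Minimal sketch: a LoadBalancer Service that opts out of node port
// allocation via spec.allocateLoadBalancerNodePorts (defaults to true).
package example

import corev1 "k8s.io/api/core/v1"

func lbWithoutNodePorts() corev1.ServiceSpec {
	alloc := false
	return corev1.ServiceSpec{
		Type: corev1.ServiceTypeLoadBalancer,
		// On providers whose load balancers deliver traffic to pods
		// directly (the GCP case mentioned above), node ports are not
		// needed and need not be allocated.
		AllocateLoadBalancerNodePorts: &alloc,
	}
}
```
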
A: We have a bunch of evolved things that we can't change the defaults on, or haven't changed the defaults on. We should think about that. I actually have, literally, a note on my whiteboard on my desk here to think about a core v2 proposal where we could bake in some of these default changes. It's not just Service; there's a bunch of things where it would be nice to move the defaults toward more secure defaults.
B: Plus one, big plus one on changing the default, especially when the default is basically a load-bearing flag. As long as we make it very clear and, you know, give people a path, I think that would be really nice.
A: Yeah, yeah. Okay.
A: Maybe we should open another issue. I'll take a note: at least on GKE, I should be able to get them to change some service flags, change kube-proxy flags, to make some of these defaults saner. Like, maybe we should add a kube-proxy flag that is... no, never mind. I've got to think it through. It feels like I should be able to change this default for GCP somewhere, because we know we don't need node ports for load balancers. Okay.
A: That was this one. So sorry, Antonio, were you going to write the docs, or was it somebody else? Why did I think it was you who signed up for this?
A: You touched it, now it's yours. Okay, no. It would be great if somebody volunteers to write some documentation on this. I'm just going to go ahead and triage-accept it, because it's real.
A: Yeah, I mean, if we just put it in the comments around kube-proxy, the references get regenerated periodically, don't they, every release? I thought they do. I would not put it in the API docs; I would perhaps put it in this one.
A: Yeah, we're serving multiple things from one binary. All right. I also filed this bug. I haven't read any of the follow-ups on it yet, but as I was looking at the changes around the service controller, I realized it's not clear what the "being deleted" state means. In fact, this is why I went off and wrote a doc in the community repo on controllers and this intermediate state, because we have a real bug. So I'm going to go ahead and triage, oops, accept.
A: It's not... that's a good point, I'll find it. I tweeted about it this week too; you can find it on my Twitter. It's really short. It just says there exists a third state between "does not exist" and "exists," which is "going to not exist," right?
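
A minimal sketch of that third state as controllers see it, assuming the standard apimachinery types; the helper name is made up:

```go
// Minimal sketch: the "third state" is an object whose deletion has
// been requested but which still exists (finalizers pending). The
// helper name isBeingDeleted is hypothetical.
package example

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// isBeingDeleted reports whether obj exists but is on its way out.
// Controllers that only check exists/not-exists miss this case.
func isBeingDeleted(obj metav1.Object) bool {
	return obj.GetDeletionTimestamp() != nil
}
```
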
D: This is the thing, Alexander. So we have this on Endpoints and EndpointSlices, so endpoints have the terminating state. The problem is we had a bug in the endpoint slice controller, because the node wasn't present but the pod was present. So we have one, two, three controllers that depend on their behavior, and the three of them handle it differently.
A: Yes, I mean, each resource probably needs to define what it means when it's being deleted and how controllers should treat that. Maybe the right answer is that it's not a per-controller decision; it's a per-resource decision.
A: Yeah, I mean, this came up on Slack this week too, with someone who was asking why their service was experiencing connection-reset errors while they were doing upgrades. They thought: well, I set a graceful termination period, and when I receive a SIGTERM I call http.Shutdown, so why isn't Kubernetes doing the right thing? Shouldn't all the traffic have been drained before I get the SIGTERM? And I had to explain: no, actually, that's not at all how it works.
A: Don't do the shutdown immediately, because everything is asynchronous to everything else, and there's no check-in after all the load balancers are configured. It's just wall-time oriented, and it's unsatisfactory, but here are the reasons. And this goes right to the same topic of, like, end of life.
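
For reference, a minimal sketch of the pattern that does work: keep serving for a wall-clock grace period after SIGTERM before draining, since endpoint and load balancer updates happen asynchronously. The durations are arbitrary example values.

```go
// Minimal sketch: keep serving after SIGTERM, then drain. The sleep
// gives kube-proxy and load balancers wall-clock time to stop sending
// new connections; both durations are arbitrary example values.
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	go srv.ListenAndServe()

	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGTERM)
	<-sig

	// Do NOT shut down immediately: nothing has checked that the
	// endpoints and load balancers have been updated yet.
	time.Sleep(15 * time.Second)

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	srv.Shutdown(ctx) // now drain in-flight requests
}
```
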
C: Okay, yeah, if it's somewhere in Slack, that'd be great. We had some confusion over on our side regarding that behavior.
A: Okay, yeah. You know, the truth is I should probably write a blog post or something, because there's enough there that it's probably useful for people to know about. All right, I took a note to see if I can grab that and post it somewhere, but it is on Slack. I think the discussion was in SIG Apps.
A: Does somebody want to look at it and see if they can verify that it is in fact an issue? Whether or not we're going to fix it doesn't matter, but just to analyze the report.
A: Yep, okay. Next: "kube-proxy should report events on proxier failovers." What is a proxier failover? Oh, sorry, I did look at this one. This was the one where they ask for IPVS and it falls back to iptables, or something like that.
A: Awesome, cool. So I guess there's the bigger question, though, of whether we should send events more readily for stuff. There's this other discussion that you were on, Antonio, about sending events for node port collisions. It's dangerous whenever we send events from kube-proxy, right, because we can easily flood the system.
A: The problem with events is they age out relatively quickly compared to, you know, the time frame that people tend to detect things in. So, like, if we send an event at startup that says:
A: "Oh no, you asked for IPVS, but I couldn't do IPVS, so I gave you iptables instead," then two hours later that event goes away. Will anybody have seen it? There isn't a place to decorate permanently. You guys remember, way, way back, there was this ComponentStatuses API that we were trying to add to, like: hey, kube-proxy component status, you can go and note that something is abnormal about kube-proxy. And it turned out...
A: It was bad for a lot of reasons, not the least of which is that I could have thousands of kube-proxies. Am I going to have thousands of component statuses?
A: But if it doesn't fail to boot, like if you fall back... I know we're going to get rid of the fallback, I agree with that. But if we do something one time at startup, like "oh no, I've noticed I can't configure this particular sysctl," we throw an event, and then nobody ever sees it. Was it worthwhile? Or are we going to set up a timer that repeatedly throws that event, and then we're back to the thundering herd?
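
For reference, a hypothetical sketch of what a one-shot event like that might look like, using client-go's event recorder; the reason string and function name are made up, and the aging-out concern above applies to anything recorded this way:

```go
// Hypothetical sketch: a one-shot warning event from kube-proxy via
// client-go's event recorder. Events recorded this way age out after
// a TTL, which is exactly the concern raised above.
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

func reportFallback(recorder record.EventRecorder, node *corev1.Node) {
	recorder.Event(node, corev1.EventTypeWarning, "ProxierFallback",
		"asked for ipvs but fell back to iptables")
}
```
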
D: Well, that's for sure. My experience is that logs are for someone who knows where to look; I mean, you go to the kube-proxy logs because you already know something's happening. When you want to get the first impression of what is going on, the first thing you look at is the events. So that's, for me, the rule we should follow: when we want to point people somewhere, you know, to give them a starting point.
A: No, but again, looking at failover: it's going to work, right? We'll just have switched; we won't have given you what you asked for, but it's actually going to work pretty much okay, until you hit some weird corner case that is different between IPVS and iptables. Only then will you notice that you didn't get what you wanted. And you can make the argument, I guess, that if it worked, why would we bother telling people?
A: But if that were true, why did we bother implementing IPVS? It's there for a reason, and if they're not getting what they wanted, there should be a way to tell them. But we know that nobody looks at the logs, and especially nobody looks at the logs when things aren't on fire.
A: I agree with you, it's interesting. I mean, it's better than nothing. And in fact, you know, I know some people take all the events and archive them, so they can go back and look at them later. So it's better than nothing.
A: All right, let's move on. This is not the topic here, so this is going to get fixed; close it. We're halfway in. Is there more on the agenda? Nobody's put anything else on the agenda. All right, we can keep going then, to the dual-stack ones. "Node addresses show only one IP if the node-ip parameter is passed." Didn't we look at this one last time? Yes, we did.
C: Oh yeah, I don't know if we did look at this. So this is possibly, and I made a comment, an API break; we might decide. Basically, node-ip was always useless when you were using an external cloud provider, so somebody changed it to make it useful, and it turns out that this breaks people who were using it for its side effects.
A: Okay, there's a lot of history here that I haven't read. Like, is it a break? Is it something we need to roll back?
D: I think... go ahead. The thing is, when I saw the change last time, that it was implemented with the annotation, it just was one of those things where you could see something was going to go wrong, because I remember Andrea and Dan discussing the node-ip piece, and it took a long time to get this behavior stable.
A: I'll give this a read later today, but Dan, I'm going to tag you as maybe a proposal maker: do we fix it? Do we not fix it? Can I...
C: After you've read it and weighed in on it... I don't have a sense of... you know, like I said, it's really an edge case. If you take the position that we can never change behavior, then it is an API break, but it's an API break in using a feature that wasn't working the way it was supposed to, in order to get a side effect that wasn't documented.
C: We'd just lose the new, better functionality in 1.25 that mdbooth, I think, had added.
A: Okay. I mean, what you just described is where I was going in my brain too. Let me have a read over it today and I'll weigh in. All right.
A: Oh, wasn't that... that was linked in here somewhere. Too many windows.
A: All right: "API server crash." This is the dual-stack service versus advertise address one.
A: Aside from the fact that it panics: if they've set up an advertise address and a service address and those two things are in conflict, what should we do?
D: The problem is that, by definition, the cluster is not going to work, because the kubernetes.default service is not going to work. So everything that uses in-cluster configuration, all the pods, controllers, and everything, is going to fail, but they're not going to fail obviously; they fail with connection timeouts.
D: It will take hours until you find out that that is the problem.
A: I mean, tell me if you think I'm wrong, but it seems like it should be a simple fix: at least in this case it won't panic; it will just fail with at least a useful error message.
D: Because those values are derived... well, I don't remember, I'd have to check. But the problem is how the API server builds the configuration: it goes through different stages and has different layers, one on top of the other. So the options get derived along the way, and I don't know at what point you have that information.
A: Right, I mean, the code snippet here is exactly it. It looks like we do report it somewhere, but then later we crash. So we should probably report it earlier and just say: I'm not even going to try to spin up the controller. Yeah.
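
A minimal sketch of the fail-early check being suggested: validate IP-family agreement up front and return an error instead of panicking later. The function and parameter names are hypothetical.

```go
// Hypothetical sketch of an early validation: refuse to start if the
// advertise address and the service CIDR disagree on IP family,
// instead of panicking later in a controller.
package example

import (
	"fmt"
	"net/netip"
)

func checkIPFamilies(advertiseAddr, serviceCIDR string) error {
	addr, err := netip.ParseAddr(advertiseAddr)
	if err != nil {
		return fmt.Errorf("bad advertise address %q: %w", advertiseAddr, err)
	}
	cidr, err := netip.ParsePrefix(serviceCIDR)
	if err != nil {
		return fmt.Errorf("bad service CIDR %q: %w", serviceCIDR, err)
	}
	if addr.Is4() != cidr.Addr().Is4() {
		return fmt.Errorf("advertise address %s and service CIDR %s are different IP families", addr, cidr)
	}
	return nil
}
```
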
A: We're still working on these PRs and KEPs. Thank you for all the excellent work on this. I was out last week and this week has been crazy, but I have them all open and I will look at them ASAP.
A: Yes, man, that cherry-pick makes me anxious.
A: They probably will take a look at the PR; they should. What I'm afraid of is, sometimes, you know, if I say "hey, approve this," then they will, even if they don't think it's the right idea. And I don't want to be in the position of telling the branch managers to take risks they're not comfortable taking, because it's them that's on the line, not me. Right? Sure.
A: So maybe we should have a... I haven't looked at the cherry-pick PR, but maybe we should just actually loop them into a conversation and say: this is the risk, this is the bug. Do you think the risk is commensurate with the bug? Or do we just tell users they have to wait for 1.25, which, practically speaking, won't land in managed providers until Q1?
B: Yeah, that's kind of what my company is facing. I mean, the problem... I don't know. In any case, I guess we can have a discussion after this meeting, offline somewhere, maybe with the branch managers. But I created this issue in any case to track the work: both the PR that went in in 1.25, and also the KEP for 1.26 and future improvements that we'll make.
A: Oh, they did respond. Okay, all right, then I'll look at it. Sorry. All right: "Worker nodes are not showing external DNS address."
A: Okay, I don't think there's a fix here, right?
A: The resolution is as cited: this is not a guaranteed API. And then there's Cal's thing, which I don't even remember why we left open. Oh.
A: All right, I'm going to stop my share now. We still don't have anything else on the agenda, so rather than waste more time, as much as I enjoy all of y'all's company, I'm sure you can do something more useful with the last 13 or so minutes. So, last chance: anybody have anything they want to talk about?
A: We have a lot of open KEPs, and more incoming, so we're going to need to pay close attention and maybe do some prioritization about which ones we're going to spend our energy on. Personally, in between the code freeze and the KEP window opening, that several-weeks period, I like to try to find tech-debt issues: long-standing, little, ugly things that I can spend some time on.
A: So I will encourage anybody who finds themselves with a few free minutes: go peek at one of the code bases we own and see if there's not some nasty little bit of work you can pay down, closing one of these old issues, documenting some weird behavior, helping fix a flag, or whatever. We have plenty of debt. If you have trouble finding some, let me know; I'll be happy to help you.
A: We need more docs, yes, and the docs we have are not always coherent. I was recently looking at the debug-services doc, which I link people to all the time; it could use a refresh to cover a lot of the newer stuff, especially the traffic policy fields. So there's plenty of debt to work on.
A: So I was asking someone this week whether we have a label that sort of indicates tech debt, and the answer is no, we don't. But we can probably assume that the combination involving priority/backlog, or maybe just priority/backlog on its own, is the closest thing we have. So.
A: Okay, everybody gets 10 minutes back. Thanks, all, for your time. We will see you in two weeks. Don't forget: KubeCon is coming up shockingly fast. See you soon. Bye.