From YouTube: Istio Environments Meeting 2021-09-29
A: All right, good morning, folks. I think we should be good to get started, so let's pop right into it. The first item on the agenda is something we discussed, I think, two or three weeks ago: the proposal for prioritized leader election. Basically, for those of you who aren't aware, we want the default revision to handle all of the things that are guarded by leader election.
A: We talked about this before, and I wanted to go back and change a few things about the proposal, but in the end it actually stayed basically the same.
A: The main discrepancy last time was whether we needed to change the actual Kubernetes leader election library that we rely on to make this happen, and after working on it since the last meeting, I'm pretty convinced that we actually do. So the proposal is pretty much in good shape. If we could get formal approval on it, that would be great, and we also have a PR that's up and ready for review. So, yeah, this is just a request for comments, a request for feedback.
A: Yes, yeah, absolutely. So, from a few months back, we made it so that the istioctl tag command controls which revision handles validation. You want to change the default revision, so you do istioctl tag set default with whatever revision, and then that revision handles validation.
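[For reference, the flow described above looks roughly like this; the revision name is an illustrative placeholder, and in some releases the command lives under "istioctl experimental":]

    # Point the "default" tag at a new revision; that revision then takes
    # over validation (and, per this proposal, the leader-elected duties).
    istioctl tag set default --revision 1-11-0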
A
So
the
only
change
now
is
that
when
you
do
that
same
thing
to
change
the
default,
it
will
also
make
it
so
that
it
handles
like
ingress
status,
controller
updates
and
gateway
status,
controller
updates
and
all
the
things
that
only
one
revision
should
be
responsible
for
and
it'll
just
happen
automatically.
So
it's
kind
of
cool
like
if
you
change
the
revision,
but
it's
your
control
tags
at
default.
A: You can look in the logs and see that the controllers on the previous default kind of wind down, and the controllers on the new default, you know, spin up and start doing their thing.
B: Yeah, from a user perspective, things will basically work as expected, in the sense that the controllers that matter, I mean ingress control, upgrade and so forth, will be controlled by the default revision, which is, you know, what you would expect if you're doing a canary upgrade. You don't want the canary to suddenly start picking up that core functionality.
B: So it's more a bug fix for, you know, a random workload picking up the critical work.
A: Okay, thanks everyone. Is everyone good with everything on that proposal, then?
E: Hey Sam, just very quickly: is there anything that would have to change for Helm, if Helm's using default revision labels?
A: However you want to do it, however we decide the best way to change the default with Helm is, it'll just work with the default leader election. It also works well with older revisions, and it works well when there is no default, so it won't block. If you have three revisions and none of them are default for some reason, it'll still pick one of them; one of them will still win the leader election. So nothing breaks.
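[A sketch of the Helm side of this; the chart reference and revision value are illustrative, and the exact way a default is marked with Helm was still being decided at this point:]

    # Install an additional istiod revision via Helm
    helm install istiod-1-11-0 istio/istiod -n istio-system --set revision=1-11-0
    # Whichever revision is marked default wins the prioritized leader
    # election; if no revision is default, one of them still wins.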
B
Breaks:
yeah,
okay,
some
one
thing
I
want
to
to
add
to
this
proposal.
If
possible,
we
discussed
last
time.
It
is
possible
to
run
issue
without
injection
as
we
discussed
and
it's
a
proposal.
A: Right, yeah, I agree.
B: Can we do it in 1.13? It's not really critical, but we definitely need to track it, because there are cases where users, for security reasons, do not want to have mutating webhooks, and they use kube-inject or other things. It's a perfectly legit use case.
B: When you have, you know, injectionless setups, they will use this to select the control plane, because there is no mutating webhook.
B: Yeah, we call it mesh, because it's also supposed to be a bit more independent of what mesh it is. I mean, it can be Istio, it can be other people who are using other meshes, and, you know, it's easy to have a consistent way for auto configuration. Okay, let's take it offline if you want; I think we had an agreement, mostly. Okay.
A: Okay, the next item, yeah. So this is with respect to the ProxyConfig CRD. Last time we discussed it as a working group, we had kind of settled that if you have a ProxyConfig CRD present, it should just replace whatever is set in the annotation and in the existing mesh config proxy config stuff.
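[For context, the resource under discussion would look roughly like this; the API was still under review in the PR, so the group/version and fields shown are an illustrative sketch rather than the final API:]

    apiVersion: networking.istio.io/v1beta1   # hypothetical; per the proposal
    kind: ProxyConfig
    metadata:
      name: ratings-proxy-tuning
      namespace: team-a        # namespace-scoped; mesh-wide if in the root namespace
    spec:
      selector:
        matchLabels:
          app: ratings         # illustrative workload selector
      concurrency: 2
      environmentVariables:
        ISTIO_META_EXAMPLE: "value"   # illustrative variable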
A
So
this
is
an
ongoing
discussion
in
the
pr
for
the
api
change
is
that
maybe
we
shouldn't
do
just
a
straight
up
place
and
we
should
do
some
kind
of
a
smarter
merge
strategy
here,
which
I
think
I
agree
with,
and
that
raises
some
other
questions.
So
yeah.
B: But remember, there was a reason we decided, we chose, to have ProxyConfig take over, and that was about security, and I don't think we have any approach that will address this if we merge, because...
B: If we move, I mean, there are plans to move the proxy out of the pod, you know, into CNI or whatever; I think there are all kinds of efforts to do this. If that is the case, then the pod will be completely powerless to mess with security or with other things, so the admin can enforce that. You know, it cannot be bypassed, basically.
B: And then, if Envoy is running outside of the pod...
B: We could do that as well, but why would you take on the complexity of merging, and then have to explain to the user: wait a minute, if you're running in a mesh where security is enforced, then the annotations are not merged, but otherwise they are merged? I mean, what benefit do we get?
D: Well, I don't think the setup that you described is going to be ready in at least a year, which is part of my rationale there, but I think that it's not great migration behavior if one user creates a ProxyConfig at the global level and suddenly all the annotations stop working.
B: No, no, no, I don't think that was the plan. These are separate issues: the global level is taking over the mesh config proxy config defaults, and the namespace level is taking over the annotation for the pod. So it's not...
B: What exactly is the migration? I mean, we are not in a hurry to force them to move. If they keep using the annotations, they're perfectly fine. There is no reason for them to move unless they want to.
C: Well, the migration challenge is a conflict. What if there's a conflict, right, between the custom resource in that namespace, or the custom resource that's selecting the workload, and what if they are different from the annotation? Which one is going to win? The workload might be broken, right, because you might be selecting a proxy config that's not what the user wanted, or the admin maybe selected something that conflicts, and that could break the service.
B: But then it's very simple: if you use the annotation, you keep using the annotation; we are not removing it. There is no problem with that. So if you have a pod that is using annotations, there is absolutely no incentive to add the ProxyConfig. If you start fresh with a new pod, then you just start using ProxyConfig.
A: We're just promoting a tiny, tiny subset of what is configurable through the annotation. So if the logic is: if there's a ProxyConfig CRD in the namespace, just totally ignore the annotation, then how can the user...
B: If the annotation is present, we have a flag that controls it, I mean, like we did with other things. If the annotation is present, we keep using it; otherwise we use ProxyConfig. Then we don't break anyone.
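[The per-pod annotation being referred to is proxy.istio.io/config, which carries proxy config overrides as inline YAML; a minimal example with an illustrative value:]

    metadata:
      annotations:
        proxy.istio.io/config: |
          concurrency: 2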
B: With a flag somewhere, because if we start enforcing an enforce mode, with, you know, non-bypassable proxies and other things, then we want the annotations to be ignored. You know, kind of a strict mode versus permissive: in permissive you can override with annotations; in strict mode we use only CRDs.
D: I think that should be reasonable. One thing I want to make sure we do support, though, is that right now, with an external control plane, you have to configure a bunch of stuff like addresses and root certs and whatever. You should be able to set that at the top level and then not let the lower levels override it.
B: Didn't we... I think Rob or someone had concerns about discovery address, and the proposal was to not have discovery address in the CRD.
B: Are those in this API? No, no, I mean, the API has the ability to specify environment variables, but CA address is not really part of the API; it's an internal.
A: Okay, so basically, as part of the API, we should document, at least for this first iteration, all the variables that we need to set as part of the setup for external istiod. Those are not going to be overridable; that's kind of part of the smart merge. I think that's very reasonable.
A: One thing that's still not clear to me: I see that if there's an annotation existing, then we just don't look at the CRD, which makes sense for backwards compatibility. So is it going to be the same idea for the top-level meshConfig defaultConfig? Like, if anything is set there, then we totally ignore the global ProxyConfig CRD?
B: Yeah, that's what I was thinking, but then we have the risk on upgrade, because if you have an upgrade and... well, probably not. Okay.
D: The one difference... I would agree we probably should align with what they did. You should double-check what I said, I'm not certain that's the case, but the one difference here is that with telemetry there's no need for, like, the control plane having a hard override, whereas here I think we do need that.
D: If people don't have any clue what I'm talking about: with external istiod, what we're doing is, it doesn't make sense for the user to control things like discovery address or the root certificate for xds or a few other fields, because that's the control plane's decision, and so those are kind of hard-coded at the control plane level, whereas other settings, you know, can be configured by the user. So I think that's important to keep.
B: John, it's not entirely clear, I mean, that it's a strict requirement that they cannot be overridden. I mean, the requirement is that if you create a pod it will work as expected, but if you explicitly set discovery address to something else, as you said, I mean, you can just not inject, or do other things to bypass it.
A: Coming back to the beginning of the meeting, I think it mostly makes sense. I think Mandar, actually, from his comment, kind of brought up the telemetry API as the case: you know, they're already doing a merge that protects certain keys that shouldn't be messed with. So I have no idea what the telemetry API needs protected there, but how are we misaligned with them? I'm not 100% clear there, because I thought this...
B: It is not clear what keys we want, what we need to protect. I mean, it's, again... John was saying that if you set environment variables, we should not just replace the entire dictionary, but...
A: Okay, I can also reach out to Mandar and get some clarification on what the telemetry API is doing, and just make sure this isn't totally going to confuse people.
E: What are we doing with the security CRDs? Are they having some of the same concerns? Is anybody aware? Because I know they're starting to make those, right? Or wasn't there an effort to kind of make a more robust CRD for, I think, certificates? I wasn't sure.
B: What kind of efforts? And for security, there is also the issue of the new Gateway APIs and delegation, which has yet another way to control attachment. But security doesn't use annotations too much.
E: Okay, the reason why I'm bringing this up is I'm wondering if there's some kind of directive that all teams can be working through, so that we make sure we aren't doing things a little bit differently from each other, and users don't end up with a different experience from, you know, one layer to another.
B: Yeah, I agree. The common experience is to use CRDs, I mean virtual services and what controls virtual services. So the discussion here is how you migrate from the current status, with annotations, to the desired status with CRDs, where everything is consistent.
A: I think Sven made a comment on the PR that we kind of re-document, for basically every CRD, the same structure of precedence and hierarchy: global resource, namespace resource, workload selectors. Maybe we should have a central place that just says: this is the Istio configuration model.
A: You know, maybe some CRDs behave slightly differently, but, you know, look at this really good guide, instead of all of the one-offs.
B: But annotations are the ones that we're supposed to deprecate at some point, to kind of move people to the consistent one. So how about: the CRD as defined follows exactly what Sven said and what we discussed, with no mention of annotations except one, either "if the annotation is present, it's going to take over" or whatever, but not doing merging, because that's not part of the general Istio strategy; the others don't have an annotation except this one.
A: Okay, that makes sense too. One thing: I was actually looking through some implementation stuff, and one thing that confused me was the presence of a proxy config discovery service, and it looks like it only functions with certificates, like, some value with certificates. I don't think I have the context on that. Does anybody know what that is?
B: I think the agreement was to remove discovery address from proxy config for now.
A: No, sorry, there's like a proxy config discovery service that makes it so that you can change some values in proxy config without a pod restart, I think, is what I got from it. But I don't know if there's a design doc on that.
B: Yes, the idea when we discussed this is that, since proxy config can be pushed by the xds server, we have the ability at runtime to change those fields without restarting the pod. So, ideally, all the fields in proxy config would be pushed dynamically, so you would never have to restart the pod. But implementation-wise, some fields, unfortunately, require a restart. So that's why we kind of try to document one or the other.
B: No, no, I agree, it's not worth it. But having some fields that are dynamic and some fields that are static, because...
A: Okay, cool, that's everything. Does anyone have any other comments before we move on?
A: Yeah, yeah, I'm hoping to get this API change in ASAP; I just want to make sure it's lined up and it's the right change.
A: Okay, Sven, yeah.
F: This should be quick. I mostly just want to get consensus; I think I already brought it up. I just want to double-check that everybody agrees on the items needed to call multi-cluster stable.
F: The first thing I brought up, and there was a slight bit of contention on it, was that I wanted to split out the promotion so that multi-cluster didn't encompass, like, every version and topology, and instead just captured, you know, the one good install method, which will be multi-primary, and then just the core cross-cluster service discovery feature, rather than covering the multi-network case.
F: I just want to make sure that those things are all split out; I need to make enhancement PRs, but right now we don't actually have, like, a separate multi-network feature status as far as I'm aware. So that's the first thing; I don't know if anyone disagrees.
F: Cool. And then the second thing is just a few lists of items and docs. Probably the most commonly asked question is: how do I make a service only talk inside of one cluster, or how do I make it so I'm only talking to a service in a specific other cluster if I have more than two clusters? And we have several ways to do that: you know, we have this old mesh config setting, which is just a global setting.
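[The old mesh config setting mentioned is the alpha serviceSettings block; a minimal sketch with a placeholder hostname:]

    meshConfig:
      serviceSettings:
        - settings:
            clusterLocal: true    # keep traffic for these hosts in-cluster
          hosts:
            - "mysvc.team-a.svc.cluster.local"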
F
You
can
always
just
create
separate
service
resources
that
have
different
names
in
each
cluster.
If
you
really
want
things
to
be
completely
isolated
and
you
have
control
of
that
and
then
we
do
now
have
a
synthetic
label
that
you
can
use
in
your
destination
rule
selector
to
control
this
as
well,
and
so
these
are
all
useful
and
they
all
have
their
own
cases
where
they're
better
than
others,
and
so
I
just
want
to
make
sure
that
people
are
actually
aware
they
exist.
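[The synthetic label is topology.istio.io/cluster, which can be matched in DestinationRule subsets; a sketch with placeholder names:]

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: mysvc-per-cluster
    spec:
      host: mysvc.team-a.svc.cluster.local
      subsets:
        - name: cluster-1-only
          labels:
            topology.istio.io/cluster: cluster-1   # illustrative cluster name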
B: One comment: the mesh config serviceSettings is not an option, because it's, you know, an alpha API, and it's part of mesh config. So if we want to have an option, we could put it in proxy config, maybe, though I don't know how. But maybe we should just keep fewer options, really: I mean, just the destination label, which is needed anyway, and per-service is, you know, part of what the user can do without anything. Yeah.
F: Sounds good. So that's, like, the biggest thing I want to cover in docs. The next most common mistake is trying to have two services of the same name that are very different. The Kubernetes position on this is that if you're doing multi-cluster, then you need your namespaces to kind of be vaguely the same, and so I want to link to their definition of sameness and kind of expand on it for any cases that Istio has, and then just highlight troubleshooting documents, and that should be mostly it.
F: And then the next thing would be tooling. The bug report command: right now we don't really tell users, and some of them get confused, that you need to actually run it separately against each cluster. It might be possible to add support to just make it automatically run against all the clusters, but we don't have that today. And then I also want to just add something to it...
F: ...so it actually collects a few multi-cluster specific data points. And then we have a bunch of multi-cluster commands sitting under, like, experimental that have been around for a while. The remote-clusters one is new, but it's pretty much just grabbing data from a debug endpoint, so it's pretty safe, and create-remote-secret has been around since, like, 1.5 or 1.6.
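[Roughly, the tooling in question; contexts and cluster names are placeholders, and at the time the newer commands still sat under "experimental":]

    # bug-report currently has to be run once per cluster
    istioctl bug-report --context cluster-1
    istioctl bug-report --context cluster-2

    # New: list the remote clusters an istiod is connected to (reads a debug endpoint)
    istioctl experimental remote-clusters

    # Around since ~1.5/1.6: register a remote cluster with the mesh
    istioctl experimental create-remote-secret --name cluster-2 \
      | kubectl apply -f - --context cluster-1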
F: And then, as far as upgrade testing goes, that is tricky. We don't really have automated upgrade tests in general as part of our integration tests, outside of one Helm upgrade test. I tried to start porting that to multi-cluster, but it's kind of tricky.
F: Yeah, it is bigger than multi-cluster, but I remember somebody mentioned it as, like, a feature promotion requirement. If you don't have it for everything, that might be a little looser, but I at least want to get some manual test run with that if I can, or just manually trigger our integration tests against a setup that has the multiple versions installed.
F: Yeah, I mean, the install part is really tricky, especially in a way that's, you know, runnable in CI. I've talked to Sam a lot about this, and we are working on it, but there's just no way I'm going to be able to do that and these other items in the 1.12 time frame.
F: So the thing that is not well tested, and might have bugs, is removing clusters from an existing mesh. Using the cluster-local controls and the destination rule label, you're able to move traffic away from a cluster, but right now there's no way to fully remove a cluster from pilot's view without an istiod restart. It will try to purge all the services, but I've observed a couple of cases where it doesn't do it correctly, and so that's my biggest concern as far as a bug that would block us from promoting.
B: More common is unintentional: one cluster goes down because, you know, the zone is not available, or something is happening.
F: What we ended up doing is, you know, if informers and reachability to the remote API server start to fail, we have a bunch of metrics that users can monitor to know that they need to start moving traffic away from that cluster as well. But tying those two things together seems pretty dangerous.
F: We should use locality load balancing with failover and other mechanisms to deal with this, rather than having the control plane explicitly start dropping endpoints.
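[A minimal sketch of that recommendation; the host, regions, and thresholds are placeholders, and the outlier detection block is what actually triggers failover:]

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: mysvc-failover
    spec:
      host: mysvc.team-a.svc.cluster.local
      trafficPolicy:
        loadBalancer:
          localityLbSetting:
            enabled: true
            failover:
              - from: us-east    # illustrative regions
                to: us-west
        outlierDetection:        # required for locality failover to take effect
          consecutive5xxErrors: 5
          interval: 30s
          baseEjectionTime: 30s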
B: At some point we do have to drop endpoints. I mean, if a cluster is... if a cluster...
B: And, you know, you're having an inconsistent state, and again, that's the infrastructure we have today, where we get the endpoints. If we had synchronization, like what I think Nate or someone else was working on with endpoint slice synchronization and then persistence, then we could have a consistent approach, but that's a new feature that is probably not even close.
F: Okay, and then testing is pretty good; it's been pretty good for a long time. There are a few things that have just gotten skipped that need to be, you know, unskipped, or we need to find out if they're real bugs, so I'm going to look into those. And then the only real feature that we want to make sure we support better is headless services and StatefulSets. StatefulSets are a little weird because you could end up with name conflicts across clusters, and right now we do have kind of partial support.
F: The way it'll work is, when there's a name conflict, it will choose the more local endpoint, but that's still not very good, and I don't really recommend people use it. But we do have a couple of users who have tried out the experimental multi-cluster headless support using the DNS proxy and DNS auto allocation.
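[The experimental DNS proxying mentioned is switched on through proxy metadata; a sketch of the commonly documented settings:]

    meshConfig:
      defaultConfig:
        proxyMetadata:
          ISTIO_META_DNS_CAPTURE: "true"        # route DNS lookups through the sidecar
          ISTIO_META_DNS_AUTO_ALLOCATE: "true"  # auto-allocate VIPs, e.g. for ServiceEntries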
B: How does it work if you have, you know, three replicas in cluster A and three replicas in cluster B, or four replicas in cluster B? Do we randomly select one? Because the whole definition of a StatefulSet is that, you know, they are...
F: Shards, right. Which is why I don't really like the MCS version: rather than, you know, doing hyphen one, two, three, it has a first hyphen for which cluster and then a second hyphen for which instance, which works a bit better.
B: It won't work, because, again, when you deploy a StatefulSet in a cluster, it's, you know, presumably sharding the data into the three StatefulSet pods, and, I mean, Istio needs to be aware of this. I mean, it's something the application has built in that is doing the sharding; you cannot shard for it.
B: My point is, it's not a feature that we can probably support as stable, you know, because it's not well defined, and we don't even know... Supporting replicated StatefulSets in one cluster only, that's okay, that's reasonable.
A: Do we have any, like, guidance on that right now? Or do we have a way that's untested, or do we just need to figure out even what we want to do?
F
So
I've
done
a
few
proof
of
concepts
where
I've
used
the
destination
rule
label.
To,
like
you
know,
do
traffic
shifting
and
find
ways
to
slowly
move
traffic
away
from
a
cluster,
but
there's
nothing
that,
like
truly
drains
long-lived
connections
that
we
have
and
then
once
that's
done,
you
can
delete
the
remote
secret
and
stop
watching
that
other
cluster
and
when
you're
doing
traffic
like
that,
it
seems
to
work
pretty
well.
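[A sketch of that drain-then-remove flow; names are placeholders, and the traffic-shifting step is only outlined in comments:]

    # 1. Shift traffic away from the cluster being removed, e.g. with a
    #    VirtualService that weights the topology.istio.io/cluster subsets
    #    (cluster-1-only: 100, cluster-2-only: 0).
    # 2. Once drained, stop watching the cluster by deleting its remote
    #    secret (create-remote-secret names them istio-remote-secret-<name>):
    kubectl delete secret istio-remote-secret-cluster-2 -n istio-system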
F
But
I
know
recently
john
sent
me
a
case
where
an
extra
remote
secret
that
was
like
malformed
was
added
and
then
deleted,
and
then
the
endpoints
were
never
cleaned
up
and
I'm
not
sure
if
that
bug
is
reproducible
outside
of
a
case
where
you
know
like.
I
don't
know
if
I
can
reproduce
that
case
where
the
remote
secret
was.
You
know
correct,
because
essentially
what
had
happened
is
there
was
two
remote
secrets
with
different
cluster
names,
but
they
pointed
to
the
same
cluster
when
the
bad
cluster
name,
one
was
removed.
F
Every
endpoint
for
every
cluster
was
wrong.
It
was
all
pointing
to
just
one
cluster,
even
though
there
was
like
four
clusters,
and
I
haven't
been
able
to
replicate
that
yet
and
so
that's
why
I'm
somewhat
concerned
about
the
ability
to
just
delete
a
remote
secret
and
assume
everything
is
going
to
be
cleaned
up
properly.
F: How do you shift traffic away? You have the topology label.
F: Yeah, so, I mean, I think, for our, you know, operational guide on "here are the things you should do if you're going to bring multi-cluster to production": should we start, like, heavily recommending the topology-aware load balancing stuff with failover?
F: All right, yeah, I think we have a few mechanisms to do that in our kind stuff; we can just, like, remove the IP routes and see things start to freak out.
F: Agreed. Are there any other things in this doc that you think need to be added or clarified before bringing it to the TOC?
F: Oh, that's a good point, yeah. It's only supported on a flat network; we don't really have a way to make it work with multi-network right now. And, yeah, as for StatefulSets, I wouldn't really claim support for cross-cluster StatefulSets at all.
F: No, headless should work across clusters. Even if you have, like, two headless services foo in each cluster, they should be able to call each other.
B: Everything, both of those, is on our flat network. Everything that involves a multi-network will probably require, you know, changes for each single one.
A: Okay, and John, I think we've got the rest of the time.
D: Yeah, my two were quick. One was just: I have the PR and doc for the automated gateway controller stuff we've been talking about; I just need review on the PR and kind of official approval on the doc. I think it sounded like we had agreement there; we just don't have the checkbox.
D: Yep, mine too, cool. The other thing was just distroless. We talked about this a bit in the Test and Release meeting, but I just wanted to bring it up here because there's some relationship. I think we're planning to ramp up on this a bit: making it more supported, adding better tooling and docs, making more tests run on it, etc.
D: We'll still keep it as an option, not the default, for now, yeah. So for now it will be an opt-in option; in the future I think we want to move in the direction where it's the default. We'll always keep the other one as an option, I think, but that would be a wider discussion; I mean, there are a lot of implications there in terms of backwards compatibility and debugging.
B: We discussed in the past having some flag that says production versus development mode, and based on that we can have different settings. For example, in production we have mTLS as the default, we can have distroless as a default, we can have, you know, the annotations that we discussed earlier ignored, and so forth. I mean, kind of strict mode versus development mode, however you want to call it.
D: Yeah, yeah, in terms of the debugging, that's probably the biggest blocker for us ever making this the default. But, and I know we've been burned by KEPs in the past, I believe that they're actually going to have ephemeral containers on by default in Kubernetes 1.23, and so real users may have it in their clusters by, say, March, if they're early adopters. So it may actually finally happen. I don't have a whole lot of faith, but it is...
D: All right, yeah, that's all I want to mention. We'll have bigger discussions in the future if we try to make it default, but for now we're just trying to make it better, perhaps beta. Except I realized that 1.12 is way sooner than I thought, so there's a very minimal chance that will happen in 1.12, but perhaps in 1.13.
B: John, one question: you did this wonderful work with Helm, uploading the charts, you know, to the repository. Do you think it's possible to do the same thing for the RPMs or Debian files, to have, you know, a repo-add equivalent, where we don't tell people to curl some, you know, unsigned files from the internet?
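[For reference, the Helm flow being referred to; this is the published Istio charts URL, and the packaging idea in the comment is illustrative:]

    helm repo add istio https://istio-release.storage.googleapis.com/charts
    helm repo update
    # The ask: an equivalently signed apt/yum repository for the sidecar
    # .deb/.rpm packages, instead of downloading unsigned files directly.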
D: Yeah, I don't know what it's like to host our own thing. You're probably more familiar with this stuff than I am, but from my understanding it's very hard to get into, like, the official repos, so yeah.