From YouTube: Kubernetes kops office hours 20200228
Description
Recording of the kops office hours meeting held on 20200228
A: Hello, everybody. This is kOps office hours; today is February 28, 2020. I am your moderator/facilitator, Justin Santa Barbara; I work at Google. A reminder that this meeting is being recorded and we put it on the internet, so please be mindful of our code of conduct, which boils down to being nice to each other and being a good person. I am pasting a link to our agenda in the chat. Please do feel free to add your name and any agenda items you would like to discuss.
B: kubectl apply breaks if you're updating a service which has two port entries with the same port number. I think I found the solution while researching the agenda item, and that's to use server-side apply. So, assuming that works, does it make sense to switch, for Kubernetes 1.18, to using server-side apply for our channels?
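For context, server-side apply is a kubectl/API-server feature rather than a kops one; a minimal illustration of applying a manifest that way (the manifest path here is hypothetical):

```shell
# Server-side apply: the merge runs in the API server and field
# ownership is tracked in managedFields, which sidesteps the
# client-side strategic-merge problems with duplicate port entries.
kubectl apply --server-side -f dns-service.yaml

# If another manager owns conflicting fields, ownership can be
# taken over explicitly:
kubectl apply --server-side --force-conflicts -f dns-service.yaml
```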
A: My concern is, I don't want us to be the first ones to do things. I call it server-side apply "v2", because there are two things it is not: it is not the kubectl apply logic put onto the server, and it is not a REST endpoint you can POST to that then causes a kubectl apply, like, you know, Fisher-Price engineering. It is a totally different way of doing apply with totally different semantics, and critically, once you apply v2, you cannot go back; once you pop, you can't stop. I think it's at sort of a per-object level, so it's not the end of the world to say that certain objects in the kube-system namespace are now going to use apply v2 with kops and you can't use apply v1 on them, but it is a decision we should not take lightly.
B: Yeah, the thing is, we want to get CoreDNS to stop running as root, and this is kind of blocking that. So there's a desire to change, and I can't think of any other way to do it. What if we replaced the service and didn't use apply; would that work?
A: You'd have to figure out how to get channels to replace the service.
B: Yes.
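For reference, the "replace instead of apply" idea corresponds to kubectl's replace verb, which swaps the whole object rather than merging (manifest name hypothetical):

```shell
# Replace the Service object wholesale; no merge is attempted, so
# duplicate port entries in the existing object don't matter.
# Note: replace requires the object to already exist, and
# "kubectl replace --force" will delete and recreate it instead.
kubectl replace -f coredns-service.yaml
```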
A: I think that's fair. I think we have made good progress on the other side, on releases, and getting that into a good state. As of yesterday we have a stable release that's not too old, we have a beta and an alpha, and that's where I think we want to be. The alpha is the one that's coming up in the next weeks, I guess, so that's roughly where we're gonna be.
I think, as you say, the ability to merge PRs and stuff rapidly has been good, and I think we should keep that going. I actually had a look; we talked about, I think it was 8446, I can't remember which one's which, but the one with the surge upgrade I think is really great, and a nice way around the things that caused us problems before.
Cool, so 8313 I think it is: if you're interested in surge, which I think a lot of people are, please do have a look at that. I think it's a nice way to get around some of the problems we had before, where we weren't sure what the state was; John, do jump in later if I'm misstating things.
A: ...and then you don't count those instances, and that means that the state is effectively in the cloud provider, and we don't have to mess with the target sizes of the instance group, which is sort of where we were before, and that got sort of challenging. So that's really nice. I was inclined to merge it, but I felt like other people should have a look, and currently it only supports AWS.
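The surge behavior being reviewed is configured per instance group; a sketch of roughly what the spec looks like (field placement as it eventually landed in kops; details may differ from the PR under discussion):

```yaml
# Illustrative InstanceGroup spec enabling surge rolling updates:
# extra instances are created first, then old ones are drained,
# so the instance group's target size never has to be edited.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
spec:
  rollingUpdate:
    maxSurge: 2
```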
B: Cloud providers can implement the API, right? There's a testing issue which I'll be bringing up at some point, which is that I'd like to have a test that cloud providers could run to make sure they implement the API correctly, and that's not quite an e2e test in the sense of running on e2e infrastructure.
A: Okay, but you've seen the mocks, and yeah, this we can talk about in two weeks. We also have a work-in-progress PR I put up to do tests, which might actually make it easier to do an integration test that includes a rolling update with a surge. So I'm optimistic there.
B: Okay. Also, I think the approvers might want to prioritize reviews that have LGTMs, because I've LGTM'd PRs and had them sit for weeks.
A: That's a good approach, and it kind of reduces the load on approvers, because the reviewers take some of the lower-hanging fruit.
E: So, I think I was the biggest proposer of this. There are a ton of PRs that affect the API; that's the cost of writing code here. Even now we have a couple of etcd metrics and settings that apply to the cluster, and probably people want to at least backport them to 1.17. So I think we either look at these and merge them, or whatever, and try to get 1.17 out as fast as possible. Otherwise we will keep backporting stuff, and it will be a pain because automated cherry-picks won't work anymore.
A: It does seem, actually, from that point of view, like we are perhaps at the optimal position: 1.17 has just gone beta, so we shouldn't be adding a lot more API changes to it; 1.18 is hopefully gonna go more stable soon; 1.16 is released, so we shouldn't be adding a lot of API changes to it either. I think this could be the perfect time, where we have active PRs that make API changes.
A: Okay, so I think this is a good way forward. Unless anyone objects to the idea of removing v1alpha1, which it sounds like we've talked about in the past; it sounds like we had previously said we could always see if people objected, and it doesn't sound like anyone is or has objected, so we will release it.
A: And this is another thing: I think eventually we should get to the point where no one can do that accidentally, which is what I was aiming for, but yeah, we're not there currently. Alright, it doesn't sound like there's anything else on that topic. So next up, is it Michael? I don't see you here.
A: I would certainly encourage the idea that even if you're running Kubernetes 1.16, you can run kops 1.17. Now, I've lost track of where we are on the releases within the month; let's see. Yeah, so I think it's not ridiculous to backport it to 1.16 if that's the most recent stable, if it's the current stable branch, and this does look fairly low risk, so I think we can have a look at it from that point of view. I don't know if other people feel particularly strongly either way.
E: I didn't get to it; I'm waiting on Justin to build the etcd-manager image, and once that is all done, then we'll figure out which release we want to land it in.
A: I can build that today.
E: Okay, sounds good. Once that's done, I'll come back for it and land it on 1.18, and then we can cherry-pick it to 1.17 or not.
A: Thank you for the ping, and sorry for not doing so earlier. I think we can group this under the previous discussion of the need for timely reviews. On the Docker health check: I did promise to circle back with the people that are in favor of the Docker health check, and I did update the issue with what I found.
E: I think I addressed your last comment there.
A: Great, thank you.
E: One question regarding the health check: this week I had a bit of time and looked at the node-problem-detector, and installed it on some clusters to see how it works. I see that it's pretty easy to install as an add-on; it even has a Helm chart. You were saying that we may want to have it in kops, so I was curious: do you still feel that it should be in kops?
A: We have kops-controller, which can do that sort of thing, or we can build a separate piece. But that's where, maybe, do we want to treat it as more of a core piece, or do we just put that second piece, the thing that reacts to it, the node problem fixer, you know, in a separate piece as well?
H: We use node-problem-detector, you know, via Helm, so I also think it could be a worthwhile thing to find a good space in our docs to say we also suggest you run this, and, you know, point them to the Helm chart. We could add it as an add-on now, but personally, you know, I know not everyone uses Helm, but at least it's an easy way, and then it's one less thing we have to worry about until we're ready to actually suggest deploying it.
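As a concrete illustration of the Helm route (release name and the then-current "stable" chart location are assumptions):

```shell
# Install node-problem-detector via its Helm chart instead of a
# kops add-on; this is the docs suggestion discussed above, until
# kops is ready to ship it itself.
helm repo add stable https://charts.helm.sh/stable
helm install npd stable/node-problem-detector --namespace kube-system
```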
E: Yes. Right now the e2e tests have a pretty simple readiness-detection system: they check if all nodes are ready, and then, if they are, they go forward. After that I added a kops validate cluster command with a wait of five minutes, and I was proposing to wait for the validation to pass a few times; I'm not sure if that's the best way. I had some discussions in the PR with John about it.
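The validation step being discussed looks roughly like this (cluster name hypothetical; --wait retries validation until it passes or the timeout expires):

```shell
# Validate the cluster, retrying for up to 5 minutes; exits
# non-zero if the cluster never becomes valid in that window.
kops validate cluster --name my.cluster.example.com --wait 5m
```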
E: So, do we want to replace that node status check completely? Because we can even now say, hey, wait 15 minutes, the same way it waits on the check for 15 minutes: if it succeeds, fine; if not, consider the cluster broken. Or do you want to keep it as is and not do anything? I was proposing to do this because I was hoping that our checks would be more thorough before it goes to the testing phase.
E: ...So it's already there. The thing is, I would start with 15 minutes. That's a totally separate thing from the PR, so we can do it even now: just delete that section, and with what we have in current master we can do it. But it will validate only one time, so if it sees all pods in status Running it will go forward with the tests; I'm not sure if the check is more complicated than that. But in my experience you could have a pod that says Running and fails 10 seconds later.
E: The kops e2e tests don't run just kops with, let's say, kubenet. They run kops with whatever flavor of network plugin, or whatever else we decide; CoreDNS configured in some strange way, maybe. And it could behave unpredictably. This is why waiting longer before saying "hey, it passed": validating a few times could catch cases where these plugins misbehave, like starting early, saying Running, and then just failing a bit later.
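One way to sketch the "pass a few times" idea as a wrapper around validation (purely illustrative; the real change lives in the e2e harness, and the cluster name is hypothetical):

```shell
# Require 3 consecutive successful validations, spaced out, so a
# plugin that reports Running and then crashes shortly afterwards
# is caught before the tests start.
for i in 1 2 3; do
  kops validate cluster --name my.cluster.example.com --wait 5m || exit 1
  sleep 30
done
```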
A: A suggestion on the test-harness side: I was thinking, if we ran the kops validate cluster step before we waited for the ready nodes, so we just swapped the order, that would get around the wait-for-ready-nodes step going into backoff, because we would expect the ready-nodes check to pass the first time once validate cluster had succeeded. So the wait-for-ready-nodes step would effectively pass first time; it's never gonna back off. And then, I think...
A: The function that you've implemented in terms of the count I think makes a ton of sense to run in e2e, and then we could also, optionally, with a flag perhaps, run it twice, and we could say: if it validates and then fails, that is a test failure of kops validate cluster. It's a little weird because we're failing validate, not, you know, the PRs, but yeah, if there was a bug in validate, that's how we'd catch it. Yeah.
A: We'd have to make sure we have those on there, but okay, yes. The thing I'm aware of is, if a deployment is slow to schedule, if its pods haven't been scheduled yet, haven't even been created for whatever reason, you could theoretically have a deployment at 0 of 3 with no pods yet, and it would pass validation, because we haven't checked that the deployment has created the pods it says it's creating. Possibly there are more things, but if there's anything a cluster needs checked, I would like to know about it.
G: After some testing, it turns out that there is effectively a breaking change in it, in that previously you provided a config file into the container via a mount, and they have added support for also recognizing config through CRDs, as well as detecting ConfigMaps. And rather than having the default behavior still look for that config file, it's now trying to look for CRDs and ignoring the config file, even if we specify it. So that means, without changing any of the command arguments...
G: ...it's now no longer functional. So I was looking at possible solutions, like adding the new backend-mode argument, but it'll only work on 0.5.0, so we would need to... and through the API we allow users to specify an image, so that allows them to use older versions if they want. That made me cautious of just hard-coding the new backend-mode argument.
G: Another option would be trying to do some sort of semantic-version parsing of the image, but because people can provide their own, I don't think that would be sufficient. So another option I talked with Rodrigo about was adding a new API field that would allow users to specify it, but, for that same reason about hard-coding it, I don't know if we could effectively set a default value for the API field.
A: ...in the API field, and then probably in the release notes as well. That doesn't mean we can't upgrade them later; if they're running an existing image and haven't specified it, then... yeah, we need to think this through very carefully, I guess.
A: Yes, the approach we took elsewhere is how we've basically dealt with this so far: for existing clusters, create a field where the unset value is the legacy behavior, and then create cluster will set the new value on newly created clusters, so that users don't have to do anything.
B: So currently our docs site is publishing the docs that are checked in on master. So when someone adds a feature on master and then documents it, someone sees that and wonders why it doesn't work, because they're using a GA version of kops. So now people are putting in the documentation that this feature is new as of 1.18, and the end state of that is that your docs ridiculously document which version each feature got added in, which is kind of hard to read.
H: Yeah, I talked to a couple of groups on this. I know that for kubernetes/kubernetes they actually have independent sites: every time they do a release they spin up a new site, and then it's linked to the release branch. I think that's a little too much overhead for kops, personally. I mean, it would be simple, you know, tying things to a release branch and just serving off of a specific URL for that.
A: That's true. Although, I think you did describe, and I don't know how other people feel, you did describe the per-version docs as being a sort of anti-pattern. I personally find it a little nicer than when I click through onto, like, a certain CNI provider's website and land on some random version that was whatever Google decided to index at that time, and it's not the latest version.
A: Yeah, like some sort of callout block, and then we can also drop them as we do start deprecating. Like, saying "available in kops 1.6" shouldn't be in the release notes anymore, right, or in the docs anymore; but "available as of kops 1.18", that's interesting information now.
A: We could probably start by not dangling the fancy shiny features of 1.18 in front of users that are not on master, by pegging to a release branch, I guess. Yeah, that's a good first idea. We have one item left and five minutes left, so I will move us on, if that's all right.
C: I'm working on some of the e2e testing. The goal, which is not important to this particular question I have, is to get GCE e2e tests enabled and get that going for PRs. However, I realized that all of our e2e tests are under sig-cloud-provider AWS, and there are GCE tests in there, which is a little weird. I proposed a change, but I don't think anybody's gonna do anything until Justin thumbs-up or thumbs-down on it.
C: It's basically just a lift-and-shift right now, to move it into sig-cluster-lifecycle, which I think more properly describes what we're doing at this point. That one I think is easy; I don't think there's going to be much contention on it, but I could be wrong. My second question is about consolidating the presubmit checks, and I don't know if we can do that, because we now have tests in kubernetes/kops and we have them under sig-cloud-provider AWS in test-infra for everyone.
A: I don't think it's blocking us on the release side; that's what's important. I think we might actually get to create some Buster AMIs this week; I've told the image people, and I will do another 1.18 alpha now if there's anything. Otherwise, does anyone have any last topics or releases they would like to see?
E: Not releases, but maybe on the same topic of testing: maybe someone from our side, let's say, could get added to the OWNERS file where the kops jobs are in test-infra. Me and Peter have been doing some changes there, and it's pretty annoying waiting a week or so for an approval on something that's only related to kops.
A: Okay, yes.