From YouTube: Kubernetes SIG Apps 20220627
A: Okay, hi everyone, welcome to the June 27th SIG Apps meeting. I'm your host, Janet, and with me on the call as co-host is Maciej. Today we have two things on the agenda, and the first one is that Mitch is going to talk about Kubernetes progressive rollouts. I made you a co-host, so you should be able to present.
B: Thank you. I just noticed that I have to exit Zoom in order to give it sharing permissions, so I will be back in 30 seconds.
B: Awesome, thanks. So, quick intro: hi everybody, I don't think most of us have met. My name is Mitch Connors; I'm a software engineer at Google. Most days I work on the Istio project, which has been a lot of fun, but I've been playing around with progressive delivery in Kubernetes for quite some time as part of my experiments in how best to support Istio and how to enable safe rollouts, and I wanted to share some findings, and a recommendation, here with SIG Apps around progressive rollouts.
B
Let's
go
to
the
next,
so
if
you're
not
familiar
with
progressive
rollouts,
they
are
similar
to
the
existing
deployment
concept
in
kubernetes
you're,
going
from
one
version
of
software
to
another,
but
rather
than
shifting
as
soon
as
a
new
instance
becomes
available
spin
up
a
new
instance
tear
down
an
old
instance,
you
spin
up
some
instances
and
intentionally
move
slowly.
B
You
move
a
little
bit
of
the
work
onto
that
set
of
new
instances
and
rather
than
simply
waiting
for
a
health
or
liveness
signal,
you
watch
how
well
those
instances
do
the
work
and
I'm
being
intentionally
vague
with
that
word
work
for
many
kubernetes
apps.
The
work
is
handling
network
traffic
for
others.
B
It
might
be
processing
jobs
out
of
a
queue
or
something
along
those
lines,
but
at
a
high
level,
what
we're
doing
is
we're
shifting
progressively
shifting
work
onto
new
instances
and
then
deciding
whether
to
proceed
not
based
on
some
arbitrary
health
signal
from
the
app
itself,
but
on
how
well
it
does
the
work
on
an
http
service
you're
going
to
be
looking
at
latency
at
error
rates
at
success
codes.
If
those
things
are
improved
or
better
than
the
previous
version,
or
not
worse
than
some
arbitrary
threshold.
You
continue
rolling
out.
B
You
shift
more
work
onto
the
new
workloads
over
time
until
eventually,
hopefully,
you've
either
got
100
of
your
work
on
the
new
workload
or
you
found
that
there
was
something
bad
with
this
rollout.
Some
reason
that
it's
not
doing
work
as
well
as
the
old
one
and
you've
rolled
back.
That
rollout
has
effectively
failed.
B: This is roughly what the workflow of a progressive rollout for a service looks like. Here the work is traffic. We're going to run two versions of an application concurrently. Well, sorry: we run one in a stable state, and when a new version is specified, we schedule the new version. We don't deschedule the old version yet; they're running concurrently.

We increment traffic to the new version, then check some measure of the health of the work that has been shifted. If it's healthy, we check to see if we're done, whether we're at 100% or not. If we're done, then we unschedule the old version, and now the new version is latest. If we're not done, then we just increment again: we're going to keep incrementing until we get to done, or until our is-healthy signal says no.
B: Kubernetes does have a concept of rollouts today that is not entirely dissimilar to a progressive rollout. The biggest difference is that the signal for progressing a rollout today in Kubernetes is liveness and readiness of the pod, which is a very simple on-or-off flag set by the pod. Usually we're asking: is the pod able to respond HTTP 200 to a request at port 80 /status, or something along those lines? At that point, the pod hasn't done any work.
B: So you can sort of see the progression today: you get a pod scheduled, its startup probe succeeds, then its liveness probe succeeds. At that point, we are going to start tearing down the old version.

Eventually, later, its readiness probe will succeed, at which point it starts serving traffic. So the "doing work" is two full steps after we've made the decision to progress with the rollout. What we'd like to do is delay that decision to progress with a rollout until traffic has actually been served, work has been done, and we can get a better idea of whether the pod is doing work well or not.
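For reference, these are the three probes being described, as a minimal sketch (the image, port, and /status path are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: nginx:1.25            # placeholder image
    ports:
    - containerPort: 80
    startupProbe:                # gates the other probes until the app has booted
      httpGet: {path: /status, port: 80}
      failureThreshold: 30
      periodSeconds: 2
    livenessProbe:               # kubelet restarts the container if this fails
      httpGet: {path: /status, port: 80}
      periodSeconds: 10
    readinessProbe:              # adds/removes the pod from Service endpoints
      httpGet: {path: /status, port: 80}
      periodSeconds: 5
```

Note that none of these probes says anything about how well the pod is doing real work, which is exactly the gap the talk is pointing at.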
B: There's a lot of prior art. I see Alexis has joined us today; his team at Weaveworks has done a great deal to move the state of the art with their Flagger tool, and that's where most of my experience with progressive delivery comes from. There's a lot to like about Flagger: it uses existing Deployments, creating two side-by-side Deployments in order to get both versions executing concurrently.
B: It uses the horizontal pod autoscaler, and then creates this new CRD called a Canary that gives you all the information you need to check on the progress of a rollout. You can define what healthy looks like, how quickly it's going to take steps, and how much traffic is shifted in each step. Actually, the way that I got involved with this is that one of the ways Flagger can work is built on top of Istio for traffic shifting.
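A minimal Flagger Canary looks roughly like this, sketched from Flagger's public docs (the resource names and numbers are illustrative):

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
spec:
  targetRef:                      # the existing Deployment Flagger manages
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 8080
  analysis:
    interval: 1m                  # how often to evaluate and take a step
    threshold: 5                  # failed checks tolerated before rollback
    maxWeight: 50                 # stop shifting at 50%, then promote
    stepWeight: 2                 # shift 2% of traffic per step
    metrics:
    - name: request-success-rate  # built-in metric check
      thresholdRange: {min: 99}
      interval: 1m
    - name: request-duration
      thresholdRange: {max: 500}
      interval: 1m
```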
B: So, the Canary CRD is a little bit undesirable in that Flagger supports a lot of traffic-shifting implementations under the hood. It's incredibly flexible, which is fantastic, but that does mean that the CRD has a lot of fields that are dedicated to a single implementation of traffic shifting. We'll come back later to why SIG Apps would not need to implement so many different underlying support mechanisms for the traffic shifting, which explains why I think the timing is right for a progressive delivery API.

Intuit has also developed a progressive rollout thing, sort of a service, called Argo Rollouts.
B: The downside there is that because it doesn't use Deployments, if you want to onboard to Intuit's Argo Rollouts, you're going to have to update all of your Deployments to now be Rollouts. It's a pretty small update, because they share pretty much identical schemas, but it is a pretty big change in terms of onboarding. Whereas with Weaveworks you don't need to change any of your existing Deployments: you just add a Canary CRD on top, and you magically get those progressive rollouts.
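For comparison, converting a Deployment to an Argo Rollout is mostly a matter of changing the kind and adding steps; a sketch (the image and weights are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout                      # was: apps/v1 Deployment
metadata:
  name: my-app
spec:
  replicas: 5
  selector:
    matchLabels: {app: my-app}
  template:
    metadata:
      labels: {app: my-app}
    spec:
      containers:
      - name: my-app
        image: example/my-app:v2   # hypothetical image
  strategy:
    canary:                        # replaces the Deployment strategy field
      steps:
      - setWeight: 20
      - pause: {duration: 1m}
      - setWeight: 50
      - pause: {duration: 1m}
```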
B: However, the fact that Argo has copied the Deployment API for Rollouts signals to me that the Deployment API is actually the right place for this to live. They were able to take a pretty much off-the-shelf Deployment, add a field or two, call it something new, and achieve progressive rollouts.

I'm not sure that anyone's using this other tool; I haven't seen any development in the last three years. But it is one of the other available Kubernetes tools in this space, and it clones the Deployment API with two specs within the spec field. I can't remember what it's called, but it's like an old spec and a new spec, to define what you're coming from and what you're going to. That's not especially declarative, right?
B: When you use an API, you want to say what you want the state to be, and ideally Kubernetes would take care of annealing towards that state. You don't need to specify the from-state with a top-level declarative API. So there are a few reasons that's not desirable, but I think we can take the best from all of these solutions, learn from what we see in them, and come up with something that is excellent for Kubernetes users.
B: This is what that might look like. We already have a Deployment spec strategy field. There are two values currently supported: Recreate, which does what you would think (it deletes one ReplicaSet and then creates a new ReplicaSet with the new version), and then, actually, what is the other one? I can't remember the other value. The default for strategy spins up a number of new instances and then tears down a number of old instances once those new instances are marked as ready. This would add a third value to the strategy field, and I've named it here "progressive rollout". I'm really bad at naming things.
B: So please don't get hung up on the particulars there, but the idea is to introduce this third enum value that allows us, within the strategy field, to specify that we want a progressive rollout. Similar to other values in the strategy field, it has its own settings under the progressive rollout heading, and the settings should look fairly familiar: maxSurge and maxUnavailable are carry-overs from existing strategies. I really should have written down the name of that default existing strategy.
B: But then there are also these new fields: interval, threshold, maxWeight, step. These define how to take steps towards the new version: every minute we're going to shift two percent of traffic, up until we get to fifty percent of traffic, and then we consider it a success and roll it to a hundred.
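Put together, the proposal sketched on the slide might look something like this. To be clear, this is hypothetical: the enum value and field names are the speaker's proposal, not an existing Kubernetes API.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels: {app: my-app}
  template:
    metadata:
      labels: {app: my-app}
    spec:
      containers:
      - name: my-app
        image: example/my-app:v2   # hypothetical image
  strategy:
    type: ProgressiveRollout       # hypothetical third enum value
    progressiveRollout:
      maxSurge: 25%                # carry-over from RollingUpdate
      maxUnavailable: 0            # carry-over from RollingUpdate
      interval: 1m                 # evaluate and step once per minute
      threshold: 5                 # failed health checks before rollback
      maxWeight: 50                # at 50% of traffic, promote to 100%
      stepWeight: 2                # shift 2% of traffic per step
```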
B: RollingUpdate, I think, is the strategy that I was unable to think of, so thank you. For what it's worth, there are three already, so this should be a fourth. Oh, is there? Okay, I missed that.
B: So these fields here are describing the steps that will be taken in order to get to production, and then there's another set of fields, which we'll cover in a minute, that define health: what does it look like for this new version to be doing work in a healthy way? And again, that's configurable by the user.

Before we look at those fields, though, there are a couple of advantages here. One is that the user is only declaring the desired end state and the steps, or how quickly to anneal and reconcile to get to that desired end state. They're not defining what the current state is, and they're not responsible for shifting anything from a new spec to an old spec, or anything along those lines.
B: Also, if you're using Deployments today, you could onboard to this API just by updating your rollout strategy type (sorry, your Deployment strategy type) to be progressive. There's no need to install a new CRD. There's no need to write to a new rollouts API or canary API. It's all packaged up in the Deployment API that our users are already familiar and fairly happy with.
B: So those are some of the advantages here. The other bits of API that we need, again, are the ways to define what success looks like in a progressive rollout. So in this case, we're going to look at request success every minute and look for 99 percent success. We're also going to look at request duration every minute and make sure that we're never seeing a value over 500. If either of these standards is violated during the rollout by the new workload (by the traffic on the new workload), then the rollout fails.
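As YAML, those health criteria might extend the hypothetical strategy above roughly like this (the field names are illustrative and deliberately echo Flagger's metric checks):

```yaml
  strategy:                        # under the Deployment's spec
    type: ProgressiveRollout       # hypothetical, as above
    progressiveRollout:
      metrics:
      - name: request-success-rate
        interval: 1m
        minValue: 99               # percent of requests that must succeed
      - name: request-duration
        interval: 1m
        maxValue: 500              # milliseconds; exceeding this fails the rollout
```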
B: It's going to roll back and mark itself as in a failed state. This is fairly heavy in terms of the amount of information that we would need to stuff into the strategy field. Ideally, there would be a CRD, or sorry, a resource, specific to service health, say an SLO resource that we could write to. I've talked to SIG Instrumentation; it doesn't sound like anything like that is on the horizon for them, but they did point me to horizontal pod autoscalers, which already support custom metrics as a decision driver for autoscaling.

I would think we could more or less take the implementation from horizontal pod autoscaling custom metrics and, with a few modifications for how thresholds work, use that for progressive rollouts. Again, it's still a lot of data to get stuffed into your Deployment; it might be ideal to have this defined elsewhere.
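The HPA precedent being referenced: autoscaling/v2 can already drive decisions from a named custom metric served by a metrics adapter (the metric name and target below are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second  # served by e.g. a Prometheus adapter
      target:
        type: AverageValue
        averageValue: "100"
```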
B: Let's see, other rough edges. There are multiple types of progressive rollouts. What I've been talking about and describing is mostly the canary process, where you have a new version of your software that you want to roll to as long as it doesn't violate some basic foundational rules you've set out. There are also A/B deployments, where you might want to move to the new version, or you might want to stay on the old version.

Another rough edge here is that this is very specific to applications whose work is traffic. If you're looking at jobs that pull from a queue, or something along those lines, the API would really need to be modified pretty drastically to support shifting their work in that direction.
B: There are other Envoy-based service mesh products that have supported this, but each of them came with their own APIs, and it really wouldn't have been suitable for the Kubernetes Deployment API to take on dependencies on 17 different kinds of CRDs for traffic shifting. But today, with the development of the Kubernetes Gateway API, we now have effectively one abstraction for traffic shifting that most or all of those implementations support.
B: This is what a given HTTPRoute looks like in the Gateway API, if you haven't played with it. You can define a particular hostname and say it's going to be backed by these two services at a weight of 20 and 80, respectively. And you'll notice that within this you don't see anything about Istio; you don't see anything about Envoy or nginx. That's because the implementation is defined in a completely different resource, the Gateway resource, which is user-defined and specifies exactly what implementation is going to be executing the shift of this traffic.
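A weighted HTTPRoute of the kind being described, sketched against the v1beta1 Gateway API current at the time (names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: my-app
spec:
  parentRefs:
  - name: my-gateway        # the user-supplied Gateway that actuates the shift
  hostnames:
  - "app.example.com"
  rules:
  - backendRefs:
    - name: my-app-canary   # new version
      port: 8080
      weight: 20
    - name: my-app-stable   # old version
      port: 8080
      weight: 80
```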
B: You could build a gateway on Istio. You could build a gateway on nginx. Google Cloud Load Balancer supports the Gateway API, and I think AWS does as well. So the implementation is really up to the user. What the Deployment would create is this here: we would create this HTTPRoute and allow the user to create a Gateway object that selects it, which would actuate the traffic shift, this 80/20 split that we've got right here.
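The user-supplied half might be as small as this; the class name is whatever implementation the cluster operator has installed:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: istio   # could equally be an nginx or cloud-LB class
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```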
B: So it's nice timing, in that this dependency is finally available and finally at a level of readiness that makes it appropriate. We did have the Ingress API before, but traffic shifting was not a part of it, and the amount of break-glass configuration that you had to do with implementation-specific annotations, etc., was pretty heavy. So this is a pretty big improvement coming from SIG Network.

The cluster operator defines a Gateway, which refers to a GatewayClass that might be Istio, Envoy, nginx, etc., and those three resources together separate concerns, so that each of us is not stepping on the other's toes and we can actuate this traffic shift in a pretty well-defined and isolated manner. And that is the last slide. So, with that: questions.
B: What I'm looking for at this point is: is this a great idea? Does this belong in the Deployment API? I certainly wouldn't want to take something as big a change as this and send it as a drive-by pull request.
F: All right, the slide beside the previous one, this one here. Yeah, I'm curious: does that match the Prometheus metrics?
B: If you look into the horizontal pod autoscaler: by default, if you've used one of these, it scales on, like, memory or CPU consumption of the pod, right? That's the standard way of defining an HPA. But it also supports custom and external metrics, where you can spin up what SIG Instrumentation calls a metric server within your Kubernetes cluster. That could be Prometheus, which is probably the most common metric server to spin up in Kubernetes, but there are others, and then these will pull those named metrics.
F: So, interesting. I mean, that's a little weird, putting the Prometheus metrics into the Deployment API, since the Deployment API is a core resource of Kubernetes. How about adding a basic canary update API into Deployment, and then we could have a new sub-project under SIG Apps to do the progressive rollout on top of the Deployment canary update?

I think the progressive rollout API may change in future, since it is a new API and some users may have their own specific requests, and I don't think, if the API is put into the Deployment, we can change it easily in the future.
B: That's a great point. I don't really know how the API maturity model would work here, if you were considering, in the abstract, a large change to Deployment.
F: I mean, if Deployment supports the canary update, then we can control the deployment and canary update on top of the Deployment, such as with a new CRD in a sub-project. The CRD can control the steps of the rollout: as a first step, it can tell the Deployment to update a percentage of pods, then it can check some metrics, or some custom hooks defined by users, and it can decide whether to block there or to go to the second step.
E: We always said that we would allow for the expansion of update strategies without migrating the entire API version to a new version, so you could release this update strategy, like RollingUpdate, as a feature. I think I would echo some of the points that were brought up, the major one being about pulling the metrics API into the Deployment API: having the deployment controller interact directly with the metric server, when a metric server isn't actually a requirement for having a Kubernetes cluster, as far as I'm concerned.
E: You can't mandate that. Putting in an update strategy that requires the cluster to have a metric server installed might be a bit controversial, and having that as part of the API would probably be a non-goal too, just because it's failing the test of separation of concerns: now the deployment controller is going to have to do a whole bunch of things that really are outside of its design constraints. But one thing I've noticed: you brought up Argo.

These projects don't seem to be saying, "God, this is painful, I really wish you guys would take this in-tree." But if there's some value we can add to make it easier for them to offer it, that's super interesting to me: if there's something I can do to make those projects more successful and their users happier, as opposed to saying, "well, do it this way, because this is what is built into the system, and forget all the other ones" that have been in production for many clusters, for many users, for a couple of years now.
B: They will use the Kubernetes APIs; they see those as the standard. If there are best practices that are not specified in the Kubernetes APIs, very few of our users will pick them up. And so adoption of progressive rollout, whether with any of these providers, across all of our customers is incredibly low. Very few of them have done this, and I think it's because they see the Kubernetes API as the standard for how to safely roll out software, and as soon as they start looking beyond the core Kubernetes API, there's an overwhelming number of choices, and it's not clear which one is best or preferable. From a customer perspective, I've heard a lot of them say "we're waiting to see which one wins," which may not be the best model to be thinking about in terms of providers, but from a customer perspective, they just want to know the right way to roll out software safely.

So my motivation in presenting it this way (and I would totally support spinning some of this stuff out into separate resources if it made sense) was that the default Kubernetes way of rolling out software ought to be basically best in class.
G: Yeah, maybe to jump in here. Sorry, Alois here; I'm with app delivery in the CNCF, and also working on Keptn. I think a lot of what you've built here is what we've dealt with as well, more or less orchestrating tools like Argo and others.

I think one key point in the design of the API is also the separation of concerns, and who builds what. The Deployment is usually built by someone who doesn't necessarily know which rollout controller or deployment controller, like Argo or Flux, is going to be used. That's why the Flux version is a bit more convenient, because they reproduce the Deployment, versus relying on a dedicated resource as in Argo.
G: The SRE team might decide how they want to roll out some applications in different stages, and we see differences between stages, so that would more or less mean you would have to adjust this across different stages. And you don't even want to give other people control over how this is supposed to happen: if you're responsible for the environment, the developer obviously will tell you what they want to deploy, but the actual process is kind of separated.

So I think there are some separations of concerns there. The same with validation; I could share it afterwards, we have worked on some of the SLO and SLI work as well. The SLIs, again, might be something coming from the developer, but some of the rules for SLOs might actually be provided by somebody else, like the owner of the environment, for example, along with historic values. So that would be a good reason to spin these out into a separate resource.
E: The reason, in my opinion at least, that you see a proliferation of tooling for CI/CD, both in the CNCF and outside of the CNCF, is because the software release process tends to be very close to an organization's heart, right? It's how you get product into production, and finding one universal right way that fits all customers, or even a large fraction, is just very tricky to do right. So it would be different if there was something that the ecosystem had already developed that was the universal way to do it: okay, we have an industry-standard best practice for how to do this, and it's worth looking at it, adopting it, and potentially bringing it in-tree, right? But that doesn't seem to be the case when I look more broadly across the industry today. And I get that you want to be able to advise your customers, "this is the right way to do it," but depending on how the customer has deployed their software, how many regions it's in, how many availability zones, whether you're on public cloud, the right rollout strategy is going to differ. And it also depends on the nature of the software and the level of risk tolerance of the business that's rolling that software out. You know, if your app is, like, yo.com, use an exponential rollout strategy; who cares if it's down, the user is not affected. If you're a financial institution or a bank, you're probably going to be a little bit more conservative, right? And maybe you do blue-green, or red/black as some call it, because you want to do a fast rollback, as opposed to doing canary analysis on a progressive rollout.

So, working toward the goal of supporting, out of the box, a great way to roll out software sounds like a goal. It's just, you know, I would definitely want to see it tested out-of-tree before starting work on bringing it in and shipping it as the de facto standard.
B: So what would that out-of-tree work look like? How would that differ from, say, the Argo Rollouts and the Flagger from Weaveworks?
E: None of those other organizations that have demonstrated a large degree of success have even seen a need to come back to SIG Apps and say, "we'd like to bring it back in-tree as a built-in." Which raises the question: what is the goal of the built-in, really, if you can be successful doing it out-of-tree and allow your users to adopt it, or not adopt it, as they see fit?

They should just be resources, right? So the idea of a built-in resource has actually become a non-goal of the community at large. Which is why I was asking: what can we do to make this easy for anyone who wants to build a resource that encodes the business logic of the custom release strategy that's best fitted to their organization, or the group of organizations that they represent? What can we, as SIG Apps, do to help expedite that work and make it easier?

The complexity there would be very high. And yes, it can be done without violating v1: you could do this as an alpha/beta release in-tree, with an adoption strategy that didn't violate the existing compliance of the v1 API. But the question there is, with all that heavy lifting, where is the value of doing it in-tree, as opposed to offering it as something on top? And I wonder if that's not why Argo, or Flagger, or Spinnaker haven't come back and said:
E: "This is so great that we want to offer it to the community by patching Deployment to implement it directly." The complexity is high, and is the value actually there to just build it in-tree? And if you're working for Google and you're working on GKE, there'd be nothing stopping you; actually, anyone who is releasing a Kubernetes cluster is in charge of the CRDs that they ship by default with that cluster. I mean, if you look at OpenShift, for instance: they ship Kubernetes, but they ship a bunch of other resources along with it. You can just do that. So if it's "I want my customers to have a sane default, and I'm Google," nothing's stopping you from doing that today; you don't need us to help you do that.
B: It's the ease of use of installing a third-party library (installing CRDs, familiarizing yourself with their particular APIs, their particular support policy, etc.) versus the ease of use of going into your existing Deployments that you're already making use of and changing your strategy type from, say, Recreate to progressive, and then you've got some sane defaults right off the bat. So it's really about ease of use, in my opinion.

If we were to do this as GKE, we would necessarily be producing a new API, not the Deployment API that our users are already familiar with, that we've already got tons of integration baked in around, across all of the UI and the various systems that support GKE. Instead, we would have to implement something new and ask all of our users to migrate, which is a very different story in terms of onboarding, but...
E: Is it, though? Because if you implemented a new API and asked people to migrate, you could at least find a set of lighthouse customers that would work with you and give you some feedback. If you implement it as alpha: number one, when it goes alpha, nobody uses it, right? Except people who are willing to turn up alpha clusters to test with; it's not going to get any production utilization at all. Then it goes beta, and a lot of times people are kind of unsure of it.

So the path to get to GA is just harder to do in-tree. Which is kind of why, you know, I don't want to seem like I'm opposed to doing it. It's just that if I wanted to get this to my customers inside of my company as fast as possible, I would use a custom resource, because I can get it out immediately, I can get early feedback from product teams, I can iterate on it rapidly, and then, when it's mature enough, I can release it to GA and say, okay, this is the supported version. And then, if I really wanted to bring it back in-tree after I had evidence that this suits a large number of organizations, okay, then I would do that, right? But, I mean, yeah, that's my feedback, I guess.
B: If it were to have taken off, let's say hypothetically that one of these implementations gained dominance, say Flagger got 80% adoption compared to even Deployment, at that point there's really no reason to bring it back in-tree, right? It's already successful independently; there's not really a value-add. If everyone's already done the onboarding work of installing CRDs and installing a new controller, just leave it as it is; everybody's happy, and there's no reason for Kubernetes to own it.

In this case, I think we've seen broad consensus that this is a great way, a very safe way, to operate software, that a lot of users want. But we haven't seen that broad adoption, specifically because of some of those barriers involved with installing a third-party controller and CRD.
C: Yeah, so I think one of the things that I want to think about, and this is what I have been discussing with Mitch as well, is that the API that we have currently for the workloads is pretty much tied to the controllers that we have in the core. And part of the problem is: say we can separate that out, like have some sort of hook mechanisms for the various stages of the workloads, say at the rollout time or at a particular phase.

So if we can have some sort of mechanism where people can say, "this is a phase within the workload controller, and I would like to outsource it to another controller that I already have," and if you can provide that hook mechanism in the core controllers, perhaps it's going to solve the problem, because I think most of us agree with the API.

It's the implementation, for example in this case the health check, right? And this is a recurring theme that I'm noticing in the past two to three calls, where people are telling us: I would like to have this new API field with this particular implementation within the controller. But as a community, we have been telling them: no, do not do it here; go ahead and work on it outside.
D: Yeah, I just linked the issue that was discussed very early in the project, and, if I remember correctly, Thomas even put together a proposal, but we never moved it forward. But to also illustrate a slightly different point: we did approach this a couple of years back, specifically when we finished writing jobs, and you did mention that you're not sure how to support the work shifting for non-service resources.

After further discussions with Eric Tune, with Brian Grant, Clayton, myself, and Daria at the time, we figured out that forcing users into one particular way of approaching workloads doesn't work. As you're probably aware, with workloads there are, as you mentioned, several different approaches, and the thing is that each of those external resources is significantly better at delivering features, very different variations of the features, and it's also clear that it's not just one solution that exists.

So I tend to agree with what Ken said earlier. Having very tight control over the lifecycle of the application, I know that it is problematic, but the ability to build on top gives a lot of flexibility to everyone, and we had similarly tough discussions around various different pieces of Kubernetes itself.

Take export: people just wanted to have the entire resource as-is, but what happened is that each and every single person that looked at the export resource requested different fields being removed from the retrieved resources. For one, export was being used as a backup mechanism, so they wanted a full JSON or YAML of the resource. Someone else was using the export functionality as a templating mechanism, so they would want the metadata removed.
D: The rest, the status and so forth, should be removed. Each and every single person was giving a very different use case. The same situation is with this particular approach that you're presenting: the fact that you face this particular problem and you approach it this way, it's not that it's bad in any way, it just suits you in your particular case. It's very possible that next month or next year, when you are struggling with a different problem, you will start looking back at the stuff that you wrote and you will realize that, oh, it's very limiting, because now I don't want to do it this way, I would prefer to do it that way. And that's pretty normal from what we've been seeing across Kubernetes.
D: I'm fully aware that we have limitations inside of the core, but at the same time we have certain responsibilities to ensure interoperability for various approaches, and adding features is always hard. It always goes through a lot of scrutiny, because we need to make sure that the current use cases are not broken in any way and are maintainable. At the same time, adding additional update strategies makes the controller harder and harder to maintain, and the group of people maintaining it is slowly shrinking, because everyone is chasing the next new thing, while the maintenance burden on the people left behind is growing.

So we, as the maintainers, need to weigh the options between one and the other: sometimes not accepting a particular approach, waiting to see how it can be solved outside, and, if it gains enough popularity outside of the core, then bringing it back is a viable approach.
D: We did that with the Job API. A lot of the recent work that has been happening around jobs: we said we completed jobs three or four years ago, and we didn't do any development around it, and we said, everyone, go and try things on the side. There are solutions such as Volcano and Kubeflow, and across all of those different approaches, we are looking at what they are trying to bring back.

There is a separate working group that was devoted to looking at how we can improve things in a more general way, and have a general approach towards what we can do and improve to make any kind of HPC-related workload simpler in Kubernetes, rather than solving one particular case.

I hope that me talking for the past 10 minutes makes at least a little bit of sense.
E: You know, I think I understand. I do have one more question. The presentation was given as a proposed API change that doesn't have prior art other than from Argo and so forth. Is this something that's actually been implemented in GKE, that you put in front of a large number of customers, where it's "this has been great for us and we want to offer it back as a built-in"? Or is it something that we're proposing to kind of collaborate on and build from scratch?
B: This would be more the latter, a collaboration proposal. To the best of my knowledge, GKE is not interested in forking the Deployment API, right?

Istio runs as a sidecar to every pod, so in order to be able to safely upgrade the proxy, I need better signals around: is the app still healthy? Should this rollout be proceeding? Should we be rolling back?
B: I can do all of that by driving all of my users to a third-party library like Flagger or Argo Rollouts, but it is a harder sell to get users to adopt something that is third-party, that has less clear support, where it's less clear that that's the correct direction, than when it becomes a core Kubernetes API.
E: I think one challenge, thinking of it from the perspective of the workload controllers (and those are all real problems), is that you can't laser-focus, because, okay, you're thinking about Istio, but there's also Linkerd. You mentioned nginx as the gateway, but nginx also has a service mesh implementation, right?
B: The model is appropriate for anything that uses sidecars, which is the vast majority of service meshes today. The one standout that comes to mind is that Cilium is working on a service mesh product that is not sidecar-based, so it's not clear to me how this would help with that. But if you're talking about Linkerd, nginx service mesh, Open Service Mesh, Gloo Mesh, all of those might benefit from this model.
E: And more still, I think it's hard to figure out what the best way to integrate something like this in the core would be, in a way that would both serve the immediate customer, working out of the box with a built-in resource, and enable users of mature third-party software that's been around for many years to leverage it as well. I think Ravi's point that he brought up about lifecycle hooks is one thing we can do now: maybe revisit that and try to figure it out.
E: The challenge with readiness gates is that, contrary to what you might take from looking at the slide, he wants the pod to get added to the network but the rollout not to progress until you have a better measure of healthiness. And my understanding of readiness gates is that they actually block readiness, which would have the effect of keeping the pod outside of the load balancer, right? So the pod will be marked as unready and won't receive traffic, which would not allow you to do the canary analysis.
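For context, a readiness gate is an extra pod condition that an external controller must set before the pod counts as Ready; a minimal sketch (the condition type is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-pod
spec:
  readinessGates:
  - conditionType: "example.com/rollout-approved"  # hypothetical; set in the
    # pod's status.conditions by an external controller
  containers:
  - name: app
    image: nginx:1.25
```

Until that condition is set to True, the pod stays NotReady and out of Service endpoints, which is exactly the objection here: an unready pod receives no traffic to analyze.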
E: It's more than that, right? The problem that a lot of people have is that our notion of what health is, is basically "do you pass your health check?", but the health check just means "don't kill me," right? Please don't shoot me down; let me continue to run. When you're doing canary analysis, what you really want to understand is: how good is this new version versus the previous version of the product, and should I progress with the rollout? So there's the idea of pausing deployments at a particular gate, and then there are different strategies you might use; you might want to do a geometric rollout with an operator in a loop, where it pauses after each step. It's a very complicated space of things, which is why it's hard, for me at least, to wrap my head around this.
B: Yeah, the overall idea is that the way you know something is good is that it does good work. It's a skepticism that is baked into the Google SRE concept, at least: you're never sure that a change you're deploying is good until it's done good work, and done good work for a while.
G: So it would be hard to measure this. A pod can be unhealthy but the deployment can still be healthy, and, to the other point, the entire deployment can be unhealthy but you can't nail it down to a specific pod. Assuming one of your "is this release good?" metrics is whether users are able to log in, there can be multiple pods causing an issue, or you're even taking other metrics into account. So this would relate more or less to the entire deployment, and not to a single pod.
B: I would think of it at the ReplicaSet level rather than the Deployment, because you want to compare old versus new. So if the old version, version A, is misbehaving and not meeting your SLOs, and you're not in a rollout, there's really nothing to do about that. But when you start rolling out version B, version B will need to meet those SLOs. Does that make sense?
G: Yes, if you still think of a service, and not of an end-to-end application use case. We have some of our project users, for example, that really use things like cart-value metrics and other things that you can't really tie to even a specific ReplicaSet; it's really about one deployment as a whole versus another.
C: So what I'm understanding is: if we had something similar to the pod readiness check, but at the workload spec level, would that solve the problem? Having that custom check, where, again, the custom check has to be done externally, not within the core controllers.
E: One way to think of it: if you leveraged Deployments, or ReplicaSets, or whatever (let's say you leverage Deployments directly), imagine if the progressive rollout was a top-level resource, as opposed to a rollout strategy, and we had something like workload hooks that were able to pause the progress at particular points, right?

You could have this controller that sat outside of the Deployment, the HPA, and the metric service, that interfaced with all three of them and managed the orchestration across all of them, in order to control the progress of rollouts based on external signals. And you could add things like metrics; you could even add things like black-box tests. You're talking about user logins: is that dropping? Are my synthetics going down? You could do arbitrary things in this controller, outside of the core.
E: It's not that SIG Apps, or any other SIG to my knowledge, is averse to taking more things in-tree. We're trying to enable people to build a larger and better ecosystem leveraging what's in-tree, and to focus our efforts on making the built-in pieces as good as they can be, to enable those extensions and the growth of the ecosystem and the CNCF as a whole. So that would be one approach I could see where it would really be something you can do.

You can leverage the existing ecosystem around it; you can have your own release schedule outside of the in-tree one. Because, again, remember: whatever we do in-tree, it's alpha, and there are fairly few users in alpha. Then it's beta, and you can keep messing around with beta, but once it's beta, it's never going away, right? The only path is towards stable. So getting that good signal to make sure is just a lot.
B: Okay, well, if there are no other questions: I really appreciate everyone's comments and time. There's a lot of feedback for me to go through here, and a lot of different directions to consider and evaluate, so I really appreciate everything.
A: Cool, thanks everyone for joining the call today, and I'll end the call now. Thanks, bye. Thank you.