From YouTube: App Runtime Platform Working Group [July 5, 2023]
A: Okay, I'm going to start from the top. I thought Max was going to be here, so I want to say congratulations on being our newest contributor. He's not here yet, but still, congratulations to him; our newest approver, I mean. And in case you haven't seen it, I just wanted to warn everyone that we found this Expect: 100-continue Gorouter issue. Of course, with Go 1.20 we knew there must be something wrong that was going to come up, and it did come up.
A: So we have a fix out; make sure to get on the fix, or else your access logs will have many, many status code 100s when they shouldn't, and certain clients don't like receiving multiple 100 Continue responses and may error on you. So I'm happy to talk more about that if anyone runs into this, or you can read the doc.
A: Let's see. Jeff added something; he's not here today, but he submitted this PR. This comes out of a conversation we've been having about making sure that we think deeply about changes to Gorouter's routing algorithm, and that for those changes we probably want configuration flags when we make them, at least initially, so that in case anything goes wrong it'll be easy to turn the change off. So he's provided a change to the PR template here, just so that when people are submitting PRs we're more likely to catch these issues, and also just to update it; this thing references bosh-lite. Maybe you all still use bosh-lite; we don't use it a lot anymore. But it'd be great to have a review from some others if you have thoughts on it.
B: Yeah, hi, thanks. I just wanted to raise awareness for this proposal, which was brought up by SAP, and also wanted to take the opportunity to introduce our colleagues from SAP who are responsible for the logging service and who raised the issue. Maybe there are open questions from anybody, or maybe I'll give them the opportunity to briefly explain the background.
C: Yeah, so basically, okay, the latency is a bit high on my side. So basically Nicholas and I are working in a development team that is also developing an external log management system, and we wanted to improve our ingestion performance, because we're also ingesting logs and metrics from Cloud Foundry over the syslog channel.
C: So far we've been waiting for syslog over TLS to arrive to basically switch to TCP routing internally, but we failed to ever check whether we can really do layer 4 routing internally in our external log management system, and we figured out it comes with a lot of problems on our side, while we would like to have improved ingestion performance.
C: This is why Nicholas basically came up with the idea of implementing some kind of batching approach. This was already discussed with Stefan and his team internally, and what developed out of it is a combination of both: providing a writer that supports HTTP/2 and also supports batching for external log management systems.
C
We
are
aware
that
this
is
an
issue
regarding
the
protocol
specific
specification,
but
we
don't
have
a
specification
for
HTTP
anyways,
so
we
thought
about
proposing
this
solution
to
improve
the
ingestion
performance,
because
we
have
pretty
good
experience
with
using
Flume
bit
as
a
log
shipper
which,
by
default
batches
Json
blocks
and
simply
sends
them
to
us,
and
we
can
reach
ingestion
numbers
of
like
25
K
locks
per
second
with
that
easily.
C: Also, there's a bunch of other stuff that could be added on top here. For example, we thought about, later after the meeting, that if you ever want to use an external log management system as a Cloud Foundry landscape operator, you will most likely be concerned about egress costs for using this external log management system as well, and this is where HTTP/2 and even HTTP/1 play nicely with compression, because logs look mostly the same, especially at the metadata level.
C: So you can save a lot of costs here by simply using batches instead of sending messages one by one. So basically we get something for our log management system, and you also get something as a community by saving a bit of cost, because for some use cases this might be significant.
C: I hope that was not too much. Does anyone have questions regarding this? I didn't follow up on the actual issue yet, because I wanted to discuss it in this room first, before writing more information into the issue description or comments.
D: I think I have some context that I'd like to add at some point; I'll probably just leave it as a comment in the doc. I just became aware of this as a GitHub issue today, but I'll try to add some context and some thoughts, if that helps.
C: That helps, awesome, great, yeah, for sure. Actually, internally we want to do a first spike and simply try it out, because adding batching on top, at least non-configurable batching at first, would be pretty simple. So we can do some playing around with the whole mechanism, then check the actual performance and whether this results in any benefit for us, and then we'll simply continue with the implementation and also with the communication over the issue for the community.
E: I had a previous question; maybe I can ask that now, if that's cool. What you were suggesting before I asked this question all sounds good. Stefan Lee had mentioned that HTTP/2 could fill the gap for batching. I was wondering if y'all looked into that, since there weren't responses to that on the GitHub issue.
C: Yeah, we could simply start with that. We want to do both in combination if it's possible, but maybe we can start with just HTTP/2, do some performance testing on our side, see whether the results are convincing, and if not, we can simply throw batching on top, because of the aforementioned advantages.
F: Yeah, it might also be an issue if you pipe very many requests through very narrow proxies. If you do HTTP/2, you also probably have some open connections, and if you get too many open connections, it's probably better to have a batching solution and not a connection-based approach, because then you can get congestion even if your throughput is not that high already.
F: It will improve our performance for sure, I think. We didn't try it, but it's pretty easy to guess that it will improve, because currently it's ping-pong between the services, and that's utterly slow. If you can at least do multiple requests and then get one ack back, it's faster for sure, but batching will probably beat it by 10x still, if it's done correctly. That's why we want to try it out: to give correct numbers here, so that you can also decide on more of a factual basis, and not just:
F
We
think
it
will
be
faster.
So
we
can
actually
say:
okay,
we
tested
it.
That's
what
we
got.
What
do
you
think
of
it.
A: That'll be great, to have data and numbers. I know you're always good about testing and sharing those; I remember that from working with the networking team.
C: It's in the actual issue on GitHub as well; we just haven't managed to find time yet to do this, but I'm going to take care of that together with Nicholas.
G: It's a good question about the batching part. If we do batching, do we want to do only one way, and then say that we're highly opinionated on the batching part: let's say this is the best way that we've tested on the networks? Or do we do something like, I don't know, some kind of plugin mechanism, or some kind of callbacks, so that we enable the operators to add some custom code which will do the batching before the requests are sent over the wire, practically.
G: Yeah, we will support you in any way possible, so to speak, because we find it an interesting improvement to the whole thing. From our experience so far, we always say that syslog is the protocol to go with when we have high log load, but we also see that some people are selecting HTTP as an option, because it's an easier implementation on the log consumer side.
G: A technical question: do we need to create an RFC for this, or can we continue working and documenting things on the GitHub issue?
A: I agree. I think RFCs are good when we're coordinating between working groups. A good question.
J: Sure, this is a follow-up to Amelia's excellent CF Day presentation, but I think we're both curious to know, fairly broadly, if people are observing any problems in their CF environments related to kind of unoptimized traffic going across availability zones. We've had a few different customer requests on the VMware side that could be related to this, but we kind of want to know if there are broader problems around this. Some examples could be:
J
Are
you
seeing
cases
where
you're
seeing
greater
response
times
to
app
requests
because
like
go
router
to
app,
is
going
across
availability
zones
and
that
latency
hop
is
introducing
a
lot
of
additional
time
or
you
know?
Are
you
seeing
a
lot
of
just
log
volume
from
like
the
log
logging
system,
components
Crossing
across
azs,
although
I
think
there's
a
maybe
some
initial
easy
awareness
in
there
so
again
not
expecting
to
get
any
necessarily
synchronous
responses
here,
but
you
know
have
think.
J
Let
me
or
Amelia
know
if
there's
anything
you've
observed
that
you
could
attribute
to
that.
We
definitely
like
the
information.
I: Sorry to cut you off, Eric, I'm totally short on time now. Okay: Harold the Herald, who is that guy? This is what this is about: 502 Bad Gateway. Every time I see that error code I die a little inside; I hate it. Back then, if an instance failed, you had an easy way of telling that it actually didn't respond: you just had this very low-level error message, dial error, connection refused, or whatever it is. The downside was no encryption, and you could potentially route to the wrong app on the retry. We all know that, so we added Envoy to the picture as our intermediate proxy that took care of all of these problems. We solved the encryption problem and we solved the misrouting problem, but maybe inadvertently added a new problem. I call it the retry problem, because Envoy basically never sleeps: it always accepts your TCP and TLS connections, no matter what, even if the backend app is down, and it just closes the connection to Gorouter, which leaves Gorouter in a state where it has no idea if anything of this HTTP request had actually made it to the backend app or not. That's very unfortunate, because it basically renders all of those retries unusable.
I: Now we sort of worked around it, trying to fix it on the Gorouter side, by adding idempotency checks and httptrace. Idempotency basically means a request can be repeated, so GET, HEAD, OPTIONS, and so forth. So if we find an end-of-file error that we got from Envoy and we had a GET request, we just say: okay, that's a repeatable request; we see an error, but we just figured we could retry, because it's harmless. Not so much for POST and PUT requests.
I: For these, we very recently added httptrace to the picture, which is a little library that allows you to track the state of HTTP requests inside Gorouter. It calls callbacks, functions that we provide, whenever a certain state is hit. So, for example, if Gorouter had written the headers of the HTTP request, it would call us back, and we knew, okay, we made the assumption at least that the app already got our headers; so if it failed after that, we couldn't retry the request. The problem is, it's still very unreliable. The main issue is that the Go standard library calls these callbacks not just when the headers have actually been sent, but whenever it wrote them to its outgoing buffer, so this callback basically gets called always, even though no bit of this request has ever made it to the backend.
I: So, before we take a look at the solution, let's dive a little into this end-of-file error. The basic problem we face is: Envoy takes your TCP connection, you then send the server name indication in the TLS Client Hello, and Envoy then tries to look up the backend for this server name.
I: In this case it's just one backend, because there's only one app per Envoy. It tries to connect, gets a reset because the app is down. Then it makes a mistake, in our opinion, because it finishes the TLS handshake, leaving Gorouter in a state where it thinks: okay, I have a TCP connection, I have made a successful TLS connection, I'm actually talking to the correct application, because the certificate is correct.
I: Gorouter doesn't get the memo, basically; at the same time, in the same millisecond, it sends out the POST request, which is then of course reset by Envoy again. In this case Gorouter has no idea why this connection failed, and it just can't make any distinction or decision about whether it can retry. Okay, so the problem statement is basically this: Envoy always finishes the TLS handshake, even if there's no backend. This changes the client and puts it in a state where it has the initiative.
I: So the client then thinks: okay, it's my turn now, I'm about to send HTTP, because I have a live connection; and that's not the case, of course. So if we wanted to make this more reliable, we had to basically drop this TLS handshake; we had to make it fail early enough that Gorouter would not be in a state to proceed, because it would be waiting for the TLS handshake to finish first. Envoy can't help us here, because it will always behave like it does now.
I
It's
basically
what
it's
just
doing,
what
it's
supposed
to
do
it's
a
very
specific
requirement
that
we
have
okay
who's
going
to
help
us
here
we
just
figured.
We
could
do
it
ourselves
at
this
point,
because
Envoy
won't
budge,
but
they
won't
change
their
code.
I
Why
not
replace
an
Envoy
with
a
Herald,
so
they
they
are
basically
doing
the
same
thing,
but
not
quite
the
same.
Hence
we
call
them
Harold,
The
Herald.
Why?
Because
why?
Not
it's
a
very
simple
and
plain
layer
for
proxy,
but
it
knows
how
to
behave
so.
The
difference
is
unlike
Envoy.
I
Harold
will
will
abort
the
TLs
handshake.
If
there
is
no
back-end
available,
it
won't
finish
TLS.
Unless
it
also
knows
there's
a
backend,
that's
basically
it
that's
the
whole
difference.
One
of
the
major
differences
like
in
in
the
size
of
of
these
two
is
that
Envoy
is
huge.
It
has
like
a
plethora
of
of
features
of
which
we
use
basically
none
right.
We
use
TLS
termination
and
we
use
forwarding
and
proxying
of
TCP
connections.
So
why
not
just
have
a
custom-made
tailored
solution
that
just
does
just
that
for
us.
I
Instead
of
having
this
huge
200
megabyte
Envoy
that
doesn't
do
what
we
need,
so
the
main
difference
going
back
to
this
diagram
between
Herod
and
Envoy
is
that
if
Harold
has
gotten
your
TLS
client
hello
and
tried
to
make
the
backend
connection,
it
then
immediately
aborts
the
TCP
connection
without
sending
the
TLs
server.
Hello
in
this
case
go
root,
isn't
in
a
different
state
and
will
not
think
that
it
already
sent
HTTP.
I
It
won't
start
sending
http,
that's
the
major
difference
and
it
allows
us
to
safely
make
the
decision
to
retry
or
not
to
retry.
I: I have a little demo. Just looking at the time, four minutes, that should be enough. It's very simple: I have a little Cloud Foundry deployment here, with a bunch of apps deployed. I deployed an app that's called "don't speak"; it's very simple, it has 10 instances, out of which nine don't respond. So only one instance will talk to you; the other nine are not listening at all, so you will get a connection refused from them. When I call this app in the classical way, using a POST request with Envoy in between, I get a 502, with basically a one-in-ten chance across these 10 instances.
I: Hopefully you can see all of those nice retries. So here you see, for example, connection refused, and then the next guy is the correct one. So it will retry until it's finished and has found the working instance; the other lines it will just ignore.
I
So
it
depends
on
like
the
the
random
selection
of
the
of
this
endpoint.
Sometimes
it's
fast,
sometimes
a
little.
It's
a
little
slow,
but
you
can
see
it
here.
All
of
these
connection
reviews
come
in
a
row
until
it
finally
hits
the
correct
one
and
returns
a
successful
response.
So
that's
it
for
the
demo.
I: I have one more slide, I think. Yeah, just for the outlook: Harold is very much a proof of concept at this point; it's very recent, I'd say. Kudos to Max, by the way; he wrote like 99% of the code. It's very early, but it works, as we've just seen, and we feel that in general, if there is no strong argument for Envoy, why not replace it with a more fitting solution that actually works for us. We still have some work to do; we didn't do performance tests or whatever, and the way we integrate into Cloud Foundry is a little bit hacky.
I
We
still
keep
Envoy,
but
we
shut
it
down
and
replace
a
Herod
and
Harold
would
just
step
into
amway's
shoes.
Basically,
the
question
from
us
is
like:
is
this
interesting
to
you
is?
Are
there
any
strong
arguments?
Pro
Envoy?
Are
we
missing
something
like?
Is
it
strategic
to
use
it
or
whatever?
So
we
that's
our
question,
we
don't
know
we
just
have
the
tech
to
like
improve,
but
we
don't
know
how
to
proceed
from
here.
So
that's
why
I'm
here.
A: Yeah, that's very exciting; of course there's clapping. I appreciate the very detailed presentation about exactly what's going wrong; that really helped. Can you please make sure to share these slides; I'm sure you've opened something to share more broadly. I think my strongest argument pro Envoy, what I'm thinking in my mind, is: we've wanted to use other features of Envoy. We were like, wouldn't it be great if we could do this or that, and there's a handful of things, but we've never actually got it working, and as far as I know we don't have any current features that we want that we plan to use Envoy for. For example, at one point we thought we might do fully TLS container-to-container networking, and we would have all C2C networking traffic exit via one Envoy and then enter via the other Envoy.
J: Yeah, when we'd been looking more seriously at Istio, we were looking at reusing that Envoy, but again, that seems like it's not really feasible at this point.
J
I
I
do
recall,
like
one
of
the
things
that
was
attractive
about
Envoy
when
we
first
introduced
it
was
the
relatively
low
memory
footprint
that
we
had,
so
that
might
just
be
something
to
validate
about
any
other
proxy
that
might
replace
it.
For
this
job
is
that
we
could
keep
it
within
a
similar
envelope
just
to
account
for
the
overhead
on
all
the
Diego
cells.
J: Yeah, and I know that will grow over time with the number of connections it's maintaining, but I recall that, at least in a kind of idle state, we had it benchmarked at around, I think, between 16 and 32 megabytes, and we had some estimate of the linear scaling factor. It's in some Diego notes somewhere, I believe.
A: And in the comments, I think one question I have is: would this work on Windows? Because currently Windows is using its own non-Envoy thing; I think it's using nginx, maybe, that's in front or something, I'm not sure exactly.
A: Yeah, that could be nice, Carson.
E
Yeah
one
thing
that
comes
to
mind
is
that
maybe
this
opens
up
some
possibilities
for
the
HTTP
protocols
we
use
as
well
I'm
thinking
back
to
when
we
implemented
HTTP
2
as
an
option
for
apps
and
some
limitation
of
envoy
that
we
never
quite
figured
out
prevented
us
from
being
able
to
magically
select
the
protocol
for
the
apps
I.
E
Think
one
of
the
proposed
Solutions
was
like
magical
Health
Checker
running
in
the
background
that
obtains
apps
to
determine
what
protocols
it
could
use
and
sets
the
protocol
automatically
and
Envoy
had
some
weird
thing
where
it.
It
would
always
accept
like
an
H2
request,
meaning
we
couldn't
successfully
like
determine
what
protocols
an
app
could
respond
upon,
perhaps
with
some
kind
of
custom
solution.
A: Yeah, it's unfortunate right now that you have to specify H2, and then the Envoy is set up with H2 and its ALPN right ahead of time. It would be nice if it just knew your app: oh, your app works with H2, it starts doing H2. Yeah, cool. Dominik, I'll look out for the RFC; please post it, and these slides, they're great, when you do.