Description
CacheFilter: Flexible HTTP Caching in Envoy - Josiah Kiehl, Todd Greer
Web traffic relies extensively on caching proxies, and Envoy needs robust HTTP caching support to perform that role, but scaling and feature requirements vary too much for a "one size fits all" implementation. CacheFilter is an Envoy filter that handles the many caching-related request and response headers and directives, with the customizability and extensibility to support anything from single-server deployments to planetary-scale caching systems with extensive bespoke needs.
A
Good morning, and thank you for your interest in Envoy-based caching. I'm Todd Greer, and today I'll be describing the implementation of Envoy's HTTP caching filter. But first I've asked my colleague Josiah Kiehl to say why you want caching and how to enable it. Josiah, why does Envoy need a caching filter?
B
We have all of these clients out on the wide internet connecting to our infrastructure through an Envoy, which picks a backend service and returns the content from those services to the clients. We want to reduce the load on those backend services, so we can scale them further, and to reduce the latency of retrieving the content in the first place.
B
So we want that Envoy to cache the content where possible. Whenever cacheable content comes back through the Envoy in response to a client request, we insert it into the cache via the cache filter as well as proxying it back to the client. That way, subsequent requests will go to the cache filter, get a cache hit, and go straight back out to the client without incurring the backend service cost.
B
This is particularly useful in widely distributed architectures, where the services could be in different data centers, different cloud regions, or however you might imagine. We want the content to be as close to the requesting client as possible, so we can deploy Envoy instances way out in satellite locations, which may or may not have instances of the requested service deployed there.
B
That Envoy would then route the traffic to the data center where the services exist. The request would be processed, the content retrieved and sent back through the internal infrastructure to the Envoy where the client requested it, and the content returned. At that point the content gets cached locally, as close to the clients as possible, making all future requests substantially faster, because we don't have to make those long-distance remote service calls.
B
Another situation where this might be useful is if you have Envoy deployed in a service mesh, where Envoy handles the inter-service communication within your backend infrastructure. That isn't the first architecture we're considering when designing the cache filter, but I can imagine, especially with an in-memory cache, it could be useful to cache the content that one service requests from another, to reduce the traffic passing between the services.
B
So how do I use this cache filter? That all sounds great; now we can see how it will help. The simplest way is to take a look at the cache filter sandbox, which exists so that cache filter developers can spin up a quick Envoy instance with caching enabled. The config that turns caching on simply adds the cache filter to the HTTP filter chain; cache lookups happen at the place where the cache filter is inserted into the chain.
B
A request coming through will make a lookup to the cache that's configured there (in this case the simple HTTP cache) and retrieve the content from it. Anything else that affects cache behavior, such as which Vary headers we respect from the backends, that is, how the content may differ from request to request, also gets configured here, and that config will very likely grow as feature development continues.
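As a rough illustration of that config (a sketch only; the exact type URLs and fields vary across Envoy versions, and the sandbox config in the Envoy repo is authoritative), enabling the filter looks something like this:

```yaml
http_filters:
# Insert the cache filter ahead of the router so lookups happen
# before the request is forwarded upstream.
- name: envoy.filters.http.cache
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.cache.v3.CacheConfig
    # The storage backend is itself a typed extension; here, the
    # example in-memory SimpleHttpCache.
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.http.cache.simple_http_cache.v3.SimpleHttpCacheConfig
- name: envoy.filters.http.router
```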
A
Thank you, Josiah. So how does the cache filter work? If you're watching this presentation, you probably have some familiarity with how Envoy manages its HTTP filters. Envoy has a chain of filters: when a request comes in, the filter manager iterates through the chain in order, notifying each filter. When the response comes back, it goes through the chain in the opposite order.
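That ordering can be sketched in a few lines (a toy model in Python with made-up names; the real filter manager lives in Envoy's C++ core):

```python
class Filter:
    """Toy stand-in for an Envoy HTTP filter."""
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def decode_headers(self):  # called on the request path
        self.log.append(f"decode:{self.name}")

    def encode_headers(self):  # called on the response path
        self.log.append(f"encode:{self.name}")


def run_chain(filters):
    # Requests walk the chain front to back...
    for f in filters:
        f.decode_headers()
    # ...responses walk it back to front.
    for f in reversed(filters):
        f.encode_headers()


log = []
run_chain([Filter("cache", log), Filter("router", log)])
# The cache filter sees the request first and the response last.
```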
A
This allows HTTP cache plugin implementers to focus only on storage, or whatever other value-added behavior their plugin needs to provide, which enables a wide variety of plugins for divergent needs. Those plugins can be HTTP-aware if needed, but they can also be simple key-value stores.
A
We have an example, the SimpleHttpCache, that is in fact just a wrapper around a hash map. When Envoy has parsed an HTTP request's headers, it calls the decodeHeaders method of each filter. When it gets to the cache filter, if it's a GET request, we look in the cache for a matching response.
A
While the lookup runs, we tell Envoy to pause filter iteration; otherwise the request would get sent upstream while we're busy checking the cache, which would cause a problem if we got a hit. When the cache plugin completes the lookup, it invokes our callback with the results. In the case of a hit, those results include the cached response's headers, which we pass on to the filter manager by calling encodeHeaders.
A
We certainly intend for most lookups to be cache hits, but those that aren't are referred to as cache misses. A miss can happen because the entry is literally not in the cache, or because something was found in the cache but is too stale to serve, or for some other reason the entry can't actually be used. Either way it counts as a miss.
A
If we back up to the point where we're getting the response from the lookup context: in the previous scenario we got a result that said, this is a cache hit, here are the headers. In this scenario we get a result that says, sorry, this is a miss. When that happens, instead of calling encodeHeaders and giving Envoy headers to send to the client, we simply call continueDecoding, which tells Envoy: you know how we had you pause earlier? Sorry about that.
A
Just keep on going, nothing to see here, proceed as usual. And of course Envoy does: it iterates through the remaining filters, and on we go. When that happens, the request will presumably generate a response that comes back into the cache filter in the other direction, and we'll see those headers in the encodeHeaders call from the filter manager. In encodeHeaders we've actually got quite a bit of logic to do.
A
To
figure
out,
we've
got
to
look
for.
You
know,
look
at
the
different
rules
for
whether
something
is
cacheable
you
know.
Is
there
an
authorization
header?
Is
there
a
cache
control
header?
What
are
the?
What
are
the
directives
all
these
different?
Is
it
a
response
to
conditional
headers,
all
the
all
these
sorts
of
different
things
that
need
to
be
evaluated?
We
evaluate
them
and
once
we've
done
that,
if
we
determine
that,
in
fact
this
is
a
cachable
response,
we
will
of
course,
then
cache
it.
A
Now, we don't really care what the results of the insertion are in terms of affecting our behavior. We probably need to report some stats (stats are one of the outstanding items), but we're going to respond the same way regardless; we let the response pass through unchanged, so we don't actually wait for a result when inserting headers.
A
We call the insert, fire and forget, and keep going. When Envoy eventually tells us "here's a body" (assuming there is in fact a body in this response), via the encodeData callback from the filter manager, we then, as you'd expect, turn around and insert that body into the insert context. We fully expect that it should be able to deal with it, and if it can't, that again won't affect this response, because the primary thing happening here is passing the response along.
A
Perhaps a server is overloaded, or there's some non-standard header that the cache looks at; for whatever reason, if it wants to, it can simply refuse to insert, and that is fine. See the comments on the InsertContext class for more details.
A
We are going to be making a few changes there in the near future to better report statistics. So, to write a plugin for the cache filter, these are the four classes you need to implement: HttpCache, along with HttpCacheFactory, the LookupContext, and the InsertContext, which is the analog on the insert side.
A
So you don't need to worry about anything beyond those for now. With that, I'll hand it back to Josiah to talk about the current state of development on this project. Josiah?
B
So, is the cache filter production ready? From a cache-semantics standpoint, as in, is the cache filter RFC compliant: in many cases, yes. Basic cache requests, including Cache-Control and Vary, and validation request flows with ETags and Last-Modified, are all implemented and ready to go. Some of the more advanced validation logic, like If-None-Match and so on, is another matter.
B
Like
those
listed
there,
that's
not
yet
implemented
and
we'll
actually
just
skip
caching,
if
those
are
present
and
the
cache
control
extensions
like
immutable
and
these
others,
those
are
also
not
yet
implemented,
but
they're
not
as
commonly
used.
B
If
you're
asking
will
it
work
with
the
cash
that
I
have
in
my
infrastructure
today,
the
answer
is
no.
We
do
not
have
any
production-ready
implementations
of
http
cache.
The
only
cache
implementation
that
exists
today
is
the
example.
B
One
simple
http
cache,
and
that's
really
just
there
so
that
if
you
wanted
the
envoy
cache
filter
to
work
with
ignite
or
with
memcached
or
whichever
then
you
would
have
to
write
an
implementation
of
http
cache
so
that
the
cache
filter
could
use
it
and
serve
content
from
that
remote
from
that
remote
cache,
there's
a
whole
list
of
issues
on
github
that
we
know
we
need
to
have
done
before.
We
can
declare
this
thing.
Production
ready.
B
One
of
the
most
important
is
that
the
in-memory
cache,
which
I
mentioned,
the
simple
http
cache-
is
not
scalable.
It
currently
doesn't
do
any
memory
management
it
will.
You
can
spin
up
envoy,
have
it
cache
your
content
and
it
will
very
quickly
run
out
of
memory
because
it
doesn't
do
any
sort
of
management
on
the
back
end.
B
There's also some other basic functionality, like serving HEAD requests, and important things like gathering stats on cache requests, plus a whole list of other things that need to be done, all filed under the area/cache label in GitHub. If all that sounds great and you're ready to dive in and help, one of the most important things we need people to contribute is plugins for the various caches.
B
So
if,
if
you
have
expertise
in
any
of
these
caches
and
want
envoy
to
work
with
them,
please
write
an
implementation
for
the
http
cache
interface,
so
that
envoy
can
talk
to
it.
The
interface
is
ready
to
go
and
it
would
be
great
to
have
these
implementations
to
test
the
cache
filter
itself
against,
and
so
we
would
happily
support
that
effort.
B
We are almost always logged in there, because this is part of our day job, and the list of issues that we know need to be done is currently filed under that label I mentioned. If any of those catch your interest, you can either post some comments in the issues or tag us on Slack, and we'll get you started.
B
So that does it for our presentation. Thanks for following along, and thank you even more if you're looking to get started contributing to the cache filter. Our contact information is right there, and we will take questions from here.
A
Okay. I think you may have said something, but I didn't hear anything.
A
Okay,
I
wanted
to
mention
something
about.
There
was
a
question
earlier
about
cash
purge.
One
of
the
things
that
we
need
to
figure
out
is
the
is
the
approach
is
used
for
catchphrase
because
different
style
caches
have
different
needs,
you
know
so
for
some,
you
can
do
it.
What
is
literally
cashback
you
go
and
delete
the
entries
that
you
want
to
be
gone
for
others.
You
do
a
invalidation
approach
where
you,
where
you
record
entries
that
say
hey
if
you
find
the
thing
in
the
cache,
don't
serve
it.
B
Just another mic check: can you hear me now? Yes? Excellent. I think one of the other questions we haven't addressed in the chat is: is it possible to cache just one route match from the list? I'm assuming that means something like cache key configuration: deciding what parts of the path contribute to the cache key, whether or not to include query params, whether to include the protocol, those sorts of things.
B
Okay, so to answer: if that's the question, then that is a planned feature. It's not currently supported, and it would be one of those things, I might have mentioned it in the slide about things we would add to the config, like how the cache decides whether to split entries or not.
A
Yeah-
and
some
of
that
is
already
in
the
config,
just
not
it
doesn't
have
any
effect
yet.
A
Yeah
another
thing,
so
so
what
I
think
the
question
was
was
talking
about.
You
know
the
fact
that
filter
config
is
per
listener,
and
that
tells
you
what
filters
are
in
stack
and
then
you
wanted
to
have
different
routes
have
different
config.
That
is
something
we
definitely
need
to
add,
is
per
route
configuration
and
that
just
that's
just
a
matter
of
getting
that
done.
B
So
we
say,
like
does
the
interface
to
http
cache
plug-in
allow
for
coalescing?
I
believe
it
does,
but
I
think
todd
you
would
probably
have
a
bit
more
insight
on
that.
A
Yes,
it
absolutely
does
so
all
you
need
to
do
for,
for
coalescing
is
basically
have
multiple
things
come
in
and
if
they're
misses
just
don't
respond
to
the
the
second
third,
whatever
one
telling
us
that
that
it's
a
miss
just
just
let
it
go,
I
just
let
it
sit
and
wait
and
that
works
fine.
Now,
probably,
we
would
need
some
configuration
around
like
you
know,
maximum
delays
and
stuff
like
that,
but
fundamentally
yeah
you
could
do
it
in
a
plug-in.
B
Yeah, the next question is: how do items get pushed out of the cache? The short answer is that's up to the plugin, and the plugin currently implemented, the SimpleHttpCache, just doesn't do it. Depending on how the cache we're talking to works, whether it's a remote cache like Redis or something else, or an in-memory cache written completely within Envoy, how that's managed is going to be plugin-specific. But the SimpleHttpCache just doesn't do it.
A
Yeah
and
just
to
be
clear,
that's
just
because
we
haven't
gotten
around
to
it.
B
Yeah
sure
I
mean
we
are
not
going
to
go
into
production
without
a
feature
like
that,
like
that
this
is
just
like
the
simple
http
cache
is
good
for
development
and,
it's
good
to
say,
hey
my
caching
semantics
work,
but
it
is
not
good
to
put
in
front
of
live
traffic.
A
Yeah,
I
do
think
that
we
are
going
to
need.
There
are
more
configuration
options
that
will
need
to
be
added.
So,
like
I
assume,
any
cash
plug-in
is
gonna
need
a
mac
space
option
a
max
time,
yeah,
probably
so
as
well,
and
you
know
so.
There
are
probably
some
other
things
that
are
universal.
We
also
in
in
standard
envoy
fashion.
Have
you
can
specify
you
know,
opaque
to
cat
stuff,
that's
okay,
to
catch
filter
that
is
just
handed
to
the
plugin,
for
whatever
configuration
you
need,
yeah.
A
One thing I wanted to explicitly mention, because I don't think we've mentioned it before, is cache admittance policy. We might expand the cache filter so you can have different policies there, but another option is this: just because we call your plugin and say "here's something, please insert it", you don't have to actually insert it. You can say "gee, thanks, nope, I'm going to pass" and not insert it. So you can do whatever you want there.
B
Yeah
to
answer
shakti's
question,
so
the
plan
is
to
to
say
yes
that
you
can
use
redis
as
a
remote
cache
with
http
cache
filter
or
with
the
cache
filter,
but
there
is
not
currently
a
plug-in
implementation
that
implements
redis's
api.
So
once
somebody
gets
inspired
and
says:
hey
I'd
really
like
to
use
redis
with
envoy
and
writes
the
plugin
for
it
then
envoy
will
support
talking
to
it,
because
the
interfaces
are
all
there.
A
The idea is to make it a plugin, so that whatever's special lives in the plugin. I don't think we're going to be contributing the Redis cache ourselves, just because that doesn't happen to be relevant to Google's business needs, but we are absolutely going to do anything we can to hold your hand while you add it.
B
Right
and
like
to
build
on
that,
we
are
not
red
as
experts,
we
don't
use
redis,
and
so
it's
probably
not
good
for
us
to
be
writing
that
plug-in
anyway.
B
We
wouldn't
be
very
good
at
keeping
up
with
releases
and
making
sure
that
it
and
that
sort
of
thing
like
it's-
it's
not
good
for
us
to
own
things
that
we
don't
use,
but
it
is
in
our
best
interest
to
have
somebody
else
contributing
those
so
that
we
have
users
adding
requirements
to
both
the
cache
filter
itself
and
the
http
cache
interface
like.
If
we're
missing
a
piece
of
the
interface
that
something
like
redis
or
memcached
and
stuff
need.
B
Then
we
want
to
be
extending
the
generic
portions
of
of
the
code
to
to
support
those
things,
and
so,
if
somebody
comes
along
with
the
specific
needs
that
redis
has
we're
happy
to
support
those
needs,
we
just
don't
want
to
own
the
redis
http
cache
implementation
itself.
A
Yeah
so
yeah,
please,
you
know
file
bugs
prs
questions.
I
think,
which
I
mentioned
earlier.
You
know
a
lot,
often
on
we're
routinely
on
on
slack.
You
know
we
answer
email,
all
that
stuff,
so
we
we
are
motivated
to
help
any
efforts
on.
B
In the main repo, yeah. And in fact, if you just add the cache config that I mentioned earlier in the presentation, then it'll load into your filter chain, because it's merged into mainline right now.
B
Yeah. We definitely don't think it should be used in production, but if you have the time and ability to bulletproof it, then by all means.
B
We plan to get it to production ready, but it's not there yet. What is missing to run it in production is an HttpCache implementation that is scalable; the only implementation we have right now is not production ready. That's the primary thing. The basic cache semantics are ready: it supports Cache-Control and all of the basic TTL-type headers. There's some more advanced stuff it doesn't support yet, like some of the more unique validation flows, but for basic caching it'll work.
B
To describe "scalable" in this context: the clearest way to point out that the SimpleHttpCache, the only HttpCache implementation, is not ready for production is that it does absolutely no memory management. It will keep adding entries to the cache until you run out of memory and Envoy probably crashes. That's the most obvious flaw, but it also doesn't do sharding or other things that impact performance; you're probably going to get lock contention. It's just written as an example.
A
Yeah
and
we
think
we
can
turn
it
into
a
production
quality
thing,
while
still
being
a
good
example.
You
know
if
that
person's
wrong,
maybe
we'll
split
it,
but
that's
the
plan.
B
We
actually
had
a
slide
on
that.
How
many?
How
many
like
it's
it's,
a
relatively
simple
interface.
I
wonder
if
we
have
this
an
easy
way
to.
A
Show
that
less
than
a
dozen
much
less
than
it
doesn't,
where
is
this
yeah?
If
you,
if
you
check
with
slides
it'll,
be
in
there,
we
don't
have
time.
B
Yeah
we've
got
about
30
seconds
left,
but
actually,
if
you,
if
no
it's
fine,
it's
like.
If,
if
you
look
at
the
http
cache
class
in
the
in
the
github
like
search,
then
then
you
should
be
able
to
see
it
like.