From YouTube: Episode 17: Envoy + Rate Limiting
Description
Join Yuval Kohavi, Chief Architect at Solo.io, on our next Live Hoot Episode on June 1st. We'll explore and demonstrate how to configure and use Envoy with its rate limit service.
About us https://www.solo.io
Questions? https://slack.solo.io
Code Samples: https://github.com/solo-io/hoot
Suggest a topic to cover here: https://github.com/solo-io/hoot/issues/new?title=episode+suggestion:
We are back in the Solo offices today in Cambridge, Massachusetts. Massachusetts is opening up, so we're back in the office. I'm doing this in a slightly different setup, so I'll be looking out for the chat. Feel free to drop your questions as I talk; every few slides I'll go back to the screen to see if there are any questions.
We do these hoots every couple of weeks, and this is also a new microphone. So if there are any audio issues, please let me know. From what I'm seeing it looks fine, but feel free to tell me otherwise. Let me just check that nobody said anything. Okay, I think we're good. All right.
We do this episode every few weeks to try to spread the knowledge. If you have any suggestions, there's a link to suggest an episode in the description, so go ahead and do that. Subscribe, like, and share as usual to help this content reach more people. All right, so let's talk about Envoy and rate limiting. We've even prepared a little agenda here. Let me start the slideshow and just make sure the slides are visible. I think they are. Yes, perfect. Okay.
It's a new setup today, because I'm in the office, so I'm learning the ropes here. Okay, first we're going to do an introduction: what global rate limiting means, what alternatives we have for different use cases (you might actually not want to use rate limiting for your use case), how it works, the architecture, the components, the configuration, scaling considerations, and we'll do a demo, of course. All right, so let's start with an introduction.
What and why is global rate limiting? The motivation we usually see is either business driven or operationally driven. Either you have a business need to limit the rate, say: for every IP address, I only want five requests per second. And obviously, when we say rate limiting, we mean limiting the rate of requests that the Envoy proxy processes. If you're new to Envoy,
we have a whole series on hoot explaining Envoy from the start; here I'm assuming that people already know what Envoy is, so I'm not going to go there. So, a business use case: you don't want a certain user to go above a certain rate, you know, API plans, that kind of thing. And there are operational use cases where, let's say, you have a database with a relatively small number of hosts, and you don't want your entire distributed system, your entire cluster, to overwhelm the database.
So you want to count access to that database globally. That's the motivation for why global rate limiting is useful. Let me just check this question. Okay, we're good. All right. Now, before I go into how this rate limiting is implemented, how a truly global rate limit system is architected with Envoy and the rate limit server, let's talk a bit about the different approaches in Envoy. First, we have local rate limits. It's essentially a token-bucket system that limits incoming requests.
Now, it's not global: each instance of Envoy has its own token bucket and its own rate limit view of the system. So it's good if you want to limit the rate, but not so good if you want to limit the global rate of your entire fleet to a certain value. So that's the local rate limit; there's a filter in Envoy that can do that, and it's good
if you want to control load and you have a predictable number of Envoys; it might work for your use case instead of the more sophisticated global rate limiting. So, the local rate limit applies to incoming requests. Besides that, we have a circuit breaking mechanism in Envoy, and that applies to outgoing requests. The goal of circuit breaking is to limit the load Envoy generates on upstreams. So you can tell Envoy:
if you have this many active requests, stop sending new ones, because clearly the upstream is not keeping up with the load. So that's circuit breakers. There are various things you can tune there; I'm not going to go into the details in this talk, but you can limit active requests, pending requests, retries. You can read the documentation, and if you like, put in a suggestion; if we see there is demand, we can do an episode on that.
In addition, a relatively newer addition to Envoy is adaptive concurrency, which you can think of as essentially dynamically setting the circuit breaker values. Currently, the circuit breakers are a constant number, you know, a thousand requests; adaptive concurrency allows you to adaptively change the concurrency limits in response to measured load. All right, so these are all alternatives to global rate limiting, and global rate limiting means that we have a global rate limit across all our clusters. Oh, and I see I have my signature typo on the slide; that should say Envoy.
Sorry about that. All right, so here we go. What is the problem with Envoy and global rate limiting? The problem is that Envoy doesn't have any state. It's meant to be dynamically scaled up and scaled down, used as a sidecar; it doesn't have its own state. In addition, Envoy is distributed, so even if you tried to spread the rate limit across the various Envoys, you'd have a hard time synchronizing between them, because distributed systems are always eventually consistent. So how do we solve that?
We introduce a component to hold the state, and have Envoy query that component to achieve a truly global rate limit. So we create a rate limit service that Envoy will query and ask: should this request be rate limited right now? And based on the answer, Envoy will either rate limit or allow the request. So far, so good.
The way it's architected in Envoy: there's a rate limit service, and the official rate limit service from Envoy is a stateless service that stores its state in Redis, or clustered Redis; we'll discuss this a bit later. The way it works, the rate limit service is a gRPC interface: you have a gRPC protocol definition, and you can implement your own rate limit server. The one provided by Envoy is at github.com/envoyproxy/ratelimit.
So next, let's talk about the configuration model, and this is the thing I've noticed gets really confusing for people. I'm going to talk about it a bit, and then in the demo I'm going to review the YAML, so you'll see how it looks in action.
The way a rate limit is modeled in Envoy is something called descriptors, and each descriptor has descriptor entries, and there's an action that generates each descriptor entry. A descriptor entry is essentially a property of the request; well, not necessarily of the request, but a certain property that you want the descriptor to carry as part of the rate limit decision. So you specify a bunch of actions.
These actions generate descriptor entries, and if an action doesn't generate its descriptor entry, then the whole descriptor is not generated. The end result is an ordered list of these descriptor entries, and that's the descriptor. Each descriptor represents a logical operation. Envoy sends the rate limit server one or more descriptors and asks:
should these descriptors be rate limited? If one of them needs to be rate limited, then Envoy declares the whole request over limit and limits it. And all the other descriptors, even if they're not rate limited, are still counted towards their limits. So if you have two descriptors, one with, you know, one request per second and another with two requests per second, and you send a should-rate-limit request with these two descriptors, both of those counters will be reduced by one. All right, so far so good.
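As a rough sketch of that two-descriptor example (illustrative values; the exact field layout is in the Envoy route API docs), the route entry could carry two `rate_limits` stanzas, each producing one descriptor:

```yaml
# Sketch of a route's rate limit config: each rate_limits item becomes one descriptor.
rate_limits:
- actions:                         # descriptor 1: [("generic_key", "some-value")]
  - generic_key:
      descriptor_value: some-value
- actions:                         # descriptor 2: [("remote_address", "<client ip>")]
  - remote_address: {}
```

With both descriptors sent on each request, each one's counter on the server is decremented, and the request is denied if either is over its limit.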
This is how Envoy looks at things, and you'll see it in a second in the YAML: we'll have an entry in the route, and you'll see these actions and how the descriptor is generated from them. We'll also show how the server receives them.
I know it's not all clear by now, but hopefully it'll be clear by then. All right, so the configuration model on the server: the server configuration essentially specifies a tree that matches the descriptors sent from Envoy. Again, it may not be super clear now; hopefully it'll be clear when I get into the demo. So let's say Envoy sends a descriptor, and as we said, the descriptor has the shape of an ordered list of descriptor entries.
The configuration in the server needs to match that ordered list. So if I'm going to send a descriptor with three descriptor entries, the server configuration's tree depth will also need to be three, to match the shape of the descriptor sent by Envoy. Each descriptor entry has a key and a value, and the keys, and potentially the values, also need to match in the server configuration; I'll show those configurations side by side so it makes sense. If the server cannot match, then it does not consider this something that needs to be rate limited, so it's important to get this right. All right.
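To illustrate the tree shape (with made-up keys and numbers), a server configuration that matches a two-entry descriptor needs two levels of nesting; this is a sketch in the format used by the envoyproxy/ratelimit server:

```yaml
domain: example            # must match the domain in the Envoy filter config
descriptors:
- key: key1                # matches descriptor entry 1: ("key1", "value1")
  value: value1
  descriptors:             # second tree level matches descriptor entry 2
  - key: key2              # value omitted: any value of key2 matches here
    rate_limit:
      unit: second
      requests_per_unit: 5
```

A descriptor with only one entry would stop at the first level and, since that level has no `rate_limit` here, would not be limited.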
Some notes before we go to a live demo. By default, if the rate limit query fails or times out, the request is allowed, and you can change that default; there's a failure mode setting in the filter settings. The rate limit filter also emits metrics; you can see them attached to the cluster.
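For example, to flip that default so requests are denied when the rate limit service is unreachable, the HTTP ratelimit filter has a `failure_mode_deny` flag (a sketch of two fields from the filter's typed config; see the filter docs for the rest):

```yaml
# Part of the http ratelimit filter's typed_config (sketch).
failure_mode_deny: true   # deny requests if the rate limit query errors or times out
timeout: 0.25s            # how long to wait for the rate limit service's answer
```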
As far as observability goes, you probably want to watch whether the error count increases, things of that nature. All right, so far so good. As far as scaling: we talked about Envoy; Envoy is stateless, you can scale it up or down as much as you want. The rate limit service is stateless; you can scale it up as much as you want. You could even couple it as a sidecar to Envoy if you want.
Let's talk about Redis. Redis natively has a method of supporting scaling out called cluster mode, and you can use that. The way you use it, you actually send the rate limit service's traffic through its own sidecar Envoy, and that way you can use Envoy's Redis cluster abilities.
The alternative would be to implement it inside the rate limit service, which I don't think is implemented as of today, but feel free to correct me if I'm wrong here; it changes pretty fast. So you can have Envoy calling the rate limit service, and the rate limit service currently has, you know, a bunch of Redis instances it can pool to. But if you want to do something more sophisticated, you can actually have the rate limit service call through an Envoy sidecar to leverage
Envoy's Redis features. One of them is supporting Redis cluster mode; the other is client-side sharding. With client-side sharding you can have a bunch of Redis instances that are not configured in cluster mode; they're just simple Redis instances, unaware of each other, and Envoy can take the rate limiting keys and do consistent hashing, like Maglev load balancing, and always send the same keys to the same Redis.
You know, if your use case is limiting at, say, a per-second resolution, it might be okay that you might lose a second of the state. It really depends on your use case whether you're okay with that or not. Now, another option that we're not going to show, but are still going to talk about, is having a separate Redis cluster for the per-second limits. A rate limit is usually X per unit of time: two per hour, one per second, one per minute. The rate limit service by Envoy can actually use a separate Redis for the per-second limits, so you can scale that cluster independently, because there's obviously a lot more activity in the per-second one. All right.
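Per the envoyproxy/ratelimit README, the separate per-second Redis is enabled through environment variables on the rate limit server; roughly like this (addresses are made up for the sketch):

```yaml
# Environment for the rate limit server process (e.g. in a container spec).
env:
- name: REDIS_URL              # main Redis, holding the longer-unit counters
  value: "redis:6379"
- name: REDIS_PERSECOND       # enable the dedicated per-second pool
  value: "true"
- name: REDIS_PERSECOND_URL   # separately scalable Redis for per-second limits
  value: "redis-persecond:6379"
```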
So with that, let me see if there are any questions. I know I covered a lot and I know I talked fast. So far we're looking good. Let me check the other window. All right.
So let's start with the filter configuration, and then we'll move on to the route configuration. One sec. Here we go. We have here the filter configuration; you can see we have the name and the typed config. The main important part here is the rate limit service: we set it to a gRPC service with a cluster, and the domain. Now, I didn't talk about the domain until now, because it's not that important; it's mainly a means to do multi-tenancy. The server will match a configuration based on the domain, so this domain needs to match the domain in the server configuration, and the server can have multiple configurations with different domains. So this is how the filter is configured. The main takeaway: you've just got to set the domain correctly and the service correctly, and that's basically that. And after the hoot, I will push everything to GitHub.
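Putting those pieces together, the filter configuration looks roughly like this (the domain and cluster name here are placeholders; the important parts are the domain and the gRPC service):

```yaml
# Sketch of the http filter entry for rate limiting.
- name: envoy.filters.http.ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
    domain: example-domain          # must match a domain in the server's config
    rate_limit_service:
      transport_api_version: V3
      grpc_service:
        envoy_grpc:
          cluster_name: ratelimit   # a cluster pointing at the rate limit server
```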
So this route has three sets of actions, and each of these sets translates to a descriptor. So this is a descriptor, this is a descriptor, this is a descriptor. The order of the descriptors doesn't matter, but the order of the actions within a descriptor does matter. Remember, we talked about an ordered list: this order here matters, but the order among the different descriptors doesn't matter. All right. And the /one route has a very simple setup:
you know, a generic key and remote address; a single descriptor, a single rate limit, pretty simple. Now, let's see how this matches up with the server configuration. The server configuration, if you recall, I mentioned is kind of like a tree: you have the top level, and each level can have a sub-level. So let's just talk about the simple case for /one, which is a single-level configuration.
If you remember, we have the descriptor; each descriptor has an ordered list of descriptor entries; each descriptor entry is a key and a value. So the server configuration matches that. And let me just check that I'm not talking too fast and that everybody is happy. So far so good, all right. So the server configuration matches that: you can see here, key, value.
And here there's another level, and potentially another level. But I actually want to talk about this one; this is the single-level one. So: every key and value. So this means you have a descriptor.
So again, we have this descriptor, and maybe I'll just write out here how it looks: the ordered list of descriptor entries, where each descriptor entry is a key and a value. So we have key1: value1, key2: value2, etc. So in our case, Envoy sends us a descriptor with the generic key.
So this configuration will do a one-per-second limit for every unique remote address; every unique IP gets its own one-per-second rate limit. I could have written a value here, and then it would apply specifically to that value, to that remote address, and all other remote addresses would have been ignored and not limited. All right, hopefully this makes sense.
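The two variants just described look like this on the server side (a sketch; the IP is a hypothetical client address, and both forms are shown together only for illustration). Omit `value` to give every unique IP its own counter, or pin a `value` to limit only one specific address:

```yaml
descriptors:
# Every unique remote address gets its own one-per-second counter.
- key: remote_address
  rate_limit:
    unit: second
    requests_per_unit: 1
# Alternatively: limit only this specific client; all others are not limited.
- key: remote_address
  value: "203.0.113.7"
  rate_limit:
    unit: second
    requests_per_unit: 1
```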
So let's just see how this all matches up and lines up nicely. Let's first start with this one, the second route.
You can see that I first have here a generic key with descriptor value "client", so this matches this; it'll get here. Now, as I said, there has to be another level after the generic key "client", because this one doesn't have any rate limit on it. So you can see I also tell Envoy to send the remote address, and there's no configuration needed here, because Envoy automatically knows the remote address.
You do need to configure some things for Envoy to know that correctly, but let's not talk about that right now. So it'll automatically get to this second part, and Envoy will provide the remote address of the client as it sees it.
Let me check if there are any questions. All right: "header value match, is it mapped to the header_match key automatically?" Yes, that is correct, and the way you know this is to go to the Envoy docs. I have them somewhere here... here we go. Sorry, no, not this one. Here we go. So you go to the docs; you see that I used header value match, you go to header value match, you click this, and it says another descriptor entry is appended, as follows: ("header_match", "&lt;descriptor_value&gt;").
So that's how you know. Other options: obviously, look at the source code, and another option is to run everything in debug mode, which I will demonstrate in a second. Hopefully this answered the question; let me know if you have any follow-ups. So yes, header value match automatically gets sent with the key header_match. Each action generates a descriptor entry, and you have to read the docs to know exactly how the key and value map from the action to the descriptor entry. And if you really want, Envoy allows you to extend this and add your own capabilities here, and it has a built-in one that uses CEL to rate limit on custom properties of the request; CEL is the Common Expression Language from Google.
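For example, a `header_value_match` action like the one in the demo, together with the descriptor entry it generates per the docs quote above (values here are illustrative):

```yaml
# Route action (sketch): only generates an entry when the request is a POST,
# and that entry is ("header_match", "post_request").
- actions:
  - header_value_match:
      descriptor_value: post_request
      headers:
      - name: ":method"
        exact_match: "POST"
```

If the header matcher does not match, no entry is produced, and so the whole descriptor containing this action is dropped.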
First descriptor: we have one action with a generic key whose value is "resources", and that means this will generate a descriptor with a single descriptor entry inside it, and it must match, therefore, this limit; we only have one level in the tree here. The second action set, this one, you can see has the same generic key as before, "resources", but this time I also do a header value match, and I only add the "post_request" descriptor value
if the method is POST. So this thing will match this one. Essentially, I go down the configuration tree: I start with the generic key "resources", good; the header value match generates the key header_match, good; and the descriptor value will match "post_request". So if I hit this route and the method is POST, I will get here, and we'll get 10 per minute. Now, what happens if the request is not POST? What if it's a GET request?
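The server side of this walkthrough, sketched as a two-level tree: the "resources" generic key at the top, with the POST-only limit nested under it (the top-level number is illustrative; the 10 per minute is the one from the demo):

```yaml
domain: example-domain
descriptors:
- key: generic_key
  value: resources
  rate_limit:                # level 1: matched by the single-entry descriptor
    unit: minute
    requests_per_unit: 100   # illustrative limit for all requests on this route
  descriptors:
  - key: header_match
    value: post_request      # level 2: only POST requests descend to here
    rate_limit:
      unit: minute
      requests_per_unit: 10  # the "10 per minute" for POSTs
```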
Oh, too big, boring. Okay. Right now, let's just start some server so we can send requests to it; it really doesn't matter what's going on in there. We start Redis, because we need Redis to back the rate limit service. Oh, I already have one running; let's kill the other one, because I don't know what's in there, and start this one. Okay, perfect.
Perfect, so you can see that my rate limits were loaded from here. Another thing you can do, if you're not sure they were loaded: the rate limit server has a debug page. Let me go there. So this is the rate limit server; if I go to its admin page, there's an rlconfig endpoint you can go to, and there you can see that the rate limits were correctly loaded. All right, so this is good.
I have here an Envoy from Gloo, just because it's a pretty recent Envoy; ever since the change to v3, you could use the upstream Envoy just as well, I'm not using anything Gloo-specific here. We'll start it with the Envoy config we just reviewed. And obviously I'm in the wrong folder. Good. Now Envoy has really started. One more terminal before I do that.
Oh sorry, I'm getting that output because my Python server is listing the directory contents. So, okay, that's good. All right, let's clear the terminal so we can see what's happening. Okay, good. So we send a request to the regular route; you can see Envoy called the rate limit server, the rate limit server got the descriptor, and this is exactly the ordered list that I mentioned Envoy sends.
You can see that I get a distribution now: I get more 429s. Yeah, I also get 503s; curiously, I assume the upstream app can't handle the load, but we'll explore that another time. You can see that I get some 200s and some rate limited responses. And if I look at the Envoy stats... there we go.
Okay, so "over limit" is 97. This should match, more or less, the 97 responses that we saw getting 429. So far so good. Let's see if there are any questions about what we talked about so far.
"If I change the rate limit config value, will it..." Okay, so that's actually a very good question: do you need to restart? The answer is no, but it also depends how you do it. If you go to the rate limit config, there are two ways to do it.
The old way uses the runtime; it involves a symbolic link, and they actually specify exactly how to do it. Either way, you don't need to restart the rate limit server, but you have two options. The first: you update a symlink. Basically, you prepare your new configuration in a new configuration root folder, and you keep a symlink pointing at the subdirectory containing all your config. So your config is in runtime/subdirectory/config, and you point the server at the symlink. Now you want to update the config:
you prepare a new config, and nothing happens, because the server doesn't see it yet; then you atomically update the symlink to point to the new config. So that's the classic way, I guess; they call it the runtime, and you atomically update all those runtime values.
The other option, the one I used in this demo, which is simpler for me right now, is to turn on the runtime watch. Then you don't need to do all the symlink stuff; you just update the configs in this directory, and the rate limit server will watch them. So you can see, for example, if I update this one to two and save it... oh, I should have cleared my screen first. Let's do it again and save it.
You see that... it's hard to see, but you'll have to believe me that it reloaded the config, because it watched it on disk. And of course I'll push all this stuff to GitHub, so you can also try it yourself and see that it works as expected. So the short of it is that you don't need to restart the rate limit service. All right.
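For reference, the knobs involved here are environment variables on the rate limit server (names per the envoyproxy/ratelimit README; the paths are made up for the sketch, and the exact watch semantics are worth double-checking in the README):

```yaml
env:
- name: RUNTIME_ROOT           # root of the runtime tree (the symlink lives here)
  value: "/data/ratelimit"
- name: RUNTIME_SUBDIRECTORY   # config is read from <root>/<subdirectory>/config
  value: "config"
- name: RUNTIME_WATCH_ROOT     # "false": watch the config files directly instead
  value: "false"               # of the symlinked root, so no symlink dance needed
```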
This is exactly what we set here. You see, the method wasn't POST, so the header value match didn't produce a descriptor entry, and the whole descriptor was excluded; but this descriptor, from the generic key action, did produce a descriptor entry, so it was included. So you can see that I got two descriptors.
We got all of those descriptors generated and sent upstream, and the rate limit server does the rest of the decision, looking to see if it needs to rate limit. And you can see that if we send a bunch of these, we're eventually going to hit the limit.
I hope this was useful. Again, I'm going to upload everything to GitHub, so you'll have it as a reference. Let me just summarize what we talked about; maybe I'll go back to the agenda slide. We talked about an intro, a little bit about global rate limiting. I assume most people here know what global rate limiting means, and that is to limit a rate in a global way, consistently across the cluster. We talked a little bit about alternatives.
It really depends on your use case. You may not need global rate limiting, and if you don't need it, it's just easier not to use it, honestly, because global rate limiting requires state, and state is always harder to manage. So we talked a little bit about the architecture: we have Envoy,
we have the rate limit server, both open source under github.com/envoyproxy. The rate limit service, because it is stateless, needs to be backed by Redis (or memcached) to store the actual state, the actual counters; otherwise you couldn't get a global rate limit. This only works because every Envoy in the cluster effectively asks the rate limit service, which asks Redis, for the current counter.
So Redis knows to increment those counters and can give you a truly global rate limit. And we talked a little bit about the configuration and how it's modeled in Envoy: it's modeled as descriptors. Each descriptor has a list of actions, and each action generates a descriptor entry. A descriptor entry is a key and a value, so we end up with an ordered list of descriptor entries: we have one or more descriptors, and each descriptor has an ordered list of keys and values.
The descriptor entries, this list, are matched against the configuration tree of the server. The server's configuration looks kind of like a tree: you start from the top, see if the first descriptor entry in the descriptor Envoy sends matches the configuration at the top of the tree; if yes, you go to that branch of the configuration and descend, and now you compare the second level of the tree with the second descriptor entry.
So let me just check for any final questions. If there are no more questions, I will wrap it up for today. Thank you, everybody, for joining. We will see you again in two weeks with another episode of hoot, covering and educating on things cloud native. Please feel free to propose topics; to do that, you just go here, to the GitHub
issues page: new issue, "episode suggestion". I see we have a new one from 38 minutes ago, thank you; I will look at it right after we wrap up here, and that's going to give us an idea of what you all want to see. And I see one more question coming in.
And of course, if you see an episode suggestion that you like, just give it a thumbs up; you don't need to open another one, because that will confuse us. While Harish is typing his question: "Is Redis mandatory, either Redis or memcached?" So the rate limit server does not have its own state... oh, for a custom one? No, no. Okay, let me actually go over that, it's very important: Envoy doesn't care about Redis.
As far as Envoy is concerned, the rate limit service is a gRPC service; Envoy doesn't know if there's Redis, and doesn't care. If we look at the protobuf, all Envoy cares about is that you have a rate limit gRPC service with a ShouldRateLimit function that takes a RateLimitRequest and returns a RateLimitResponse. In fact, in previous hoots, specifically the one about external services, you can see a dumb server that I implemented, let me open it up here, that does not use Redis.
Okay, and in this case the rate limit server always returns OK, so there's no rate limiting. All the Redis stuff is the responsibility of the rate limit service; that's up to the rate limit service. In addition, the rate limit service can be configured to use a local cache for over-limit requests.
"Is an in-memory DB a good option?" So, assuming you mean an in-memory DB like Redis, or do you have a specific product in mind? Oh, my suggestion: an in-memory DB inside the rate limit server itself is generally not a good idea, because it doesn't scale. If you want to scale up, you've got to use a technique like client-side sharding. The reason the in-memory cache works is because they only use it for over-limit requests; they don't record in the cache requests that have not yet been limited.
Redis will tell you "no" anyway, so you can cache that part locally, but only the over-limit requests. The problem with H2 is that it is in-memory: once you reach capacity, you don't have a way to scale it out to another instance, whereas with the Redis solution you can use client-side sharding, for example, or Redis cluster mode, to scale out Redis.
Okay, I hope that answers the question. I'm glad you found it useful, and if you have any questions, I'll be looking at this page: leave a comment with your question, and I'll look at the comments on this page for the next few days. So let me know what you're thinking, if you have any questions or follow-ups. If you have episode suggestions, again, the best place is the hoot repo; that way everybody can see them. And with that, I'll wrap it up for today.