From YouTube: Puma timeouts and threads
Description
An exploration into how Puma timeouts are configured in GitLab and how its threading model differs from Unicorn.
Hi everyone. With Puma becoming the default in version 13, I want to go over how timeouts work in Puma, how its threading model works, and the general differences from Unicorn.
Okay, so with Unicorn, each worker is its own process, so they're isolated, and there's a single thread for each one. It's technically able to spawn threads, but there's only one real thread doing the work. With Puma it's cluster mode, so basically you have a number of worker processes. If you look in the gitlab.rb, we have a setting for the number of worker processes.
In this case we set two, and then each worker process will have a number of worker threads. Those are the ones that will actually go and do the processing of the requests. Here we have a min and a max of four threads per worker; our testing has found that's most efficient, but you could set the min and max to be different if you wanted the thread count to scale up and down. So we should have four threads in each process, and two processes.
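For reference, here is a minimal sketch of what those settings look like in the Omnibus config, /etc/gitlab/gitlab.rb, assuming the standard puma[...] keys; the values just mirror the example above and are not a recommendation.

```ruby
# /etc/gitlab/gitlab.rb (sketch) -- apply with `gitlab-ctl reconfigure`.
# Two Puma worker processes, supervised by a single master process.
puma['worker_processes'] = 2

# Each worker runs a pool of request-handling threads. GitLab's testing
# found 4 per worker to be most efficient, so min and max are kept equal;
# set them apart if you want the pool to scale up and down with load.
puma['min_threads'] = 4
puma['max_threads'] = 4
```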
There we go. Okay, so you can see we actually have a fair number of threads here, beyond just the four that we're expecting: the worker actually has 19 threads, which is way more than four. But if we strace it, then we can see that, very interestingly, only four of these threads really take traffic. So we'll just grep for Puma, print the PID, strace it to spit out the threads, and here we'll kick off a bunch of API requests in parallel.
It's just the four HTTP threads... yeah, okay. So, one thing about this strace: the process ID that you see here is not necessarily the process ID you'd see in ps aux; it's the thread ID tied to that process ID. So in this case each Puma thread will have a unique thread ID, like we saw here, and it's actually that number that we're seeing in strace.
Okay, so for cluster worker 1, even though we have 19 threads, only these four, the 16500 and the three after it, are actually doing web requests. So that's one thing to be aware of: you'll see a lot more threads in ps than are actually taking traffic.
The other thing, the big change from Unicorn, is how timeouts are handled. In Unicorn, the worker timeout sets a maximum duration for a request.
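As a sketch, that Unicorn setting in gitlab.rb looks roughly like this; 60 seconds is the usual Omnibus default, and the exact value used in the demo is just an example.

```ruby
# /etc/gitlab/gitlab.rb (sketch) -- Unicorn only.
# A per-request limit: a worker that spends longer than this on a single
# request is killed by the master and the request is dropped.
unicorn['worker_timeout'] = 60
```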
A
So
if
the
worker
has
spent
more
than
10
seconds
on
that
request,
then
it
gets
killed
and
the
request
drops
with
Puma
worker
timeout
works
a
little
bit
differently,
so
a
worker
timeout
is
if
the
worker
process
does
not
respond
within
whatever
time
out
you've
set.
So
in
this
case,
I've
said
it
to
be
arbitrarily
large
300
seconds.
But what will happen is, as long as the process itself is healthy, it will pass those health checks. That means that if you have a single thread that is taking a long time, that will not trigger the process to be killed by the timeout. If the process got stuck, if the process itself became unhealthy, then you'd see this come into play; but for a long-running request from a user, that doesn't happen.
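The Puma counterpart, again as a rough sketch of the Omnibus key rather than a prescription:

```ruby
# /etc/gitlab/gitlab.rb (sketch) -- Puma only.
# Not a per-request limit: it fires only if the worker process stops
# responding to health checks from the master. One thread stuck on a slow
# request does not make the process unhealthy, so this never triggers for
# a long-running user request, which is why it can be set arbitrarily high.
puma['worker_timeout'] = 300
```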
So how do we handle requests that take a long time? Well, as it turns out, we just hard-code a time limit. What I've done, as you saw in the gitlab.rb, is set the worker timeout to 300 seconds, just for this example, and I've set my Gitaly timeout to be 250 seconds, so really long. Now, what I want to do is run a query against a repo like the GitLab repo that will take a long time. So let's take version 6.
Version 6.0. I don't know how many years ago that is, but this should take a long, long time to process.
So if we run that and then wait 60 seconds, we'll see that the request times out. What else can we talk about while we're waiting on that? I guess we can look at the previous timeouts I said I had set.
Alright, so now I got a 502 error, but you'll note that was after only a minute, even though I set my timeouts to be, you know, five or ten minutes in both cases. And if you look here, you'll actually see something interesting.
We're at about two thousand, and at that point we get killed. What actually happens is that this limit here is the sum across all the workers, plus some other constant for the master process. So the number you see here when it gets killed is not going to be the number that you see in the gitlab.rb, but that's normal and okay. Alright, so that's strange; I had removed the limit here so we'd get a nice timeout instead, but now let's just set it to 2000.
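The limit being referred to here is, as I understand it, GitLab's Puma worker killer memory setting. A sketch, with an illustrative value:

```ruby
# /etc/gitlab/gitlab.rb (sketch) -- memory limit, illustrative value.
# This number is *per worker*, but the check that actually kills workers
# looks at the whole cluster: roughly worker_processes times this value,
# plus a constant allowance for the master process. So the memory figure
# you see at kill time will not match this per-worker setting in gitlab.rb.
puma['per_worker_max_memory_mb'] = 2000
```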
So here, while Puma is still coming up... yeah, you see the master process come up first, but you don't see the cluster workers yet. This is an easy way to tell whether Puma is actually available to take traffic: if you only see this one process and no cluster workers, then you know there's nothing available to actually take the requests coming in. That's unlike Unicorn, where it's like, well, I see a process there, but I don't really know what it's doing; with Puma you can tell.
Do the threads ever time out? Yeah, well, so, you know, for customers who have a slow instance, where their storage is not great, or maybe they have a big repo, and they just need to be able to run longer requests: right now, if they need a request that takes longer than 60 seconds on Puma, the only option they have is to not use Puma. They have to use Unicorn, and in that case they can set their worker timeout like normal and that'll be fine.
There we go, okay, so now we have exceptions. Where are the exceptions... this is funny... here we go. Alright, so here we can see, with this Rack::Timeout, a request timeout exception: the request took longer than 60,000 milliseconds. That is the actual error handling that we've added to Puma, because by default Puma itself doesn't have a way to kill a long-running request. So we've kind of jury-rigged this in to make sure that we don't have infinite requests running.
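If I have the wiring right, that 60-second Rack::Timeout limit is controlled by an environment variable rather than a puma[...] key, so raising it would look roughly like this (values illustrative):

```ruby
# /etc/gitlab/gitlab.rb (sketch) -- per-request limit under Puma.
# GitLab adds the rack-timeout middleware when running Puma and reads the
# limit (in seconds) from this variable; 60 seconds is the default.
gitlab_rails['env'] = {
  'GITLAB_RAILS_RACK_TIMEOUT' => '600'
}
```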
So that's it. The timeouts are the big thing to be aware of, other than that.