From YouTube: Scalability Team Demo 2021-04-28 - Call 2
A: There we go. So this is the second demo of today. I threw something on the agenda, and I'm the only one who threw something on the agenda, so I'll just go ahead and present this thing to you, my audience. Yeah — it's actually something we've talked about, or something you're familiar with already, I think: it's about a bottleneck I discovered in gRPC and the way we use it in Gitaly.
A: I got thinking about this because of the pack-objects cache. When we tried to turn it on — which is a change you made at some point; no wait, that did not get reverted — but then I tried to bypass the pre-clone script for gitlab-org/gitlab, and that got reverted because the apdex of one of the Gitaly nodes started going down. Then I tried to understand what was wrong there, and at the end of that investigation I suspected that we were hitting a bottleneck in gRPC, but I couldn't really prove it, so I just had to close it and we wrapped up the epic. But I kept thinking —
A: I would really like to know if this is true or not. So then what I ended up doing — yeah, but one reason why: the graphs just showed Gitaly doing a lot of stuff. I think that's one of them. Okay, maybe I should go through the reasons — or I should try to stick with my story; I have a bit of a story. Let me share my screen: there's an issue where I try to explain this. So one thing is that we never had —
A: This is the Git process activity graph, and we never had Gitaly in here; we only added it at the end of the pack-objects cache project. Then I started noticing that Gitaly itself is the top thing most of the time. We were always tracking Gitaly CPU utilization, but it was in a different metric, and we never saw the two things on the same graph — so it didn't jump out at you as much until now.
A: The other thing I noticed while working on the project was that Gitaly allocates a lot of memory when it's serving traffic, and it all happens in the functions that are responsible for transferring Git HTTP data. In this profile the top line is hard to see, but it says 40 gigabytes, and if you click these traces, it's gRPC stuff. So it looks like gRPC is allocating memory proportional to the number of bytes we're transferring — and if you're just copying data —
A: — that's not good, because that should be a constant allocation per request, not O(n) in the number of bytes you're proxying. I found that suspicious. Just because of the allocations, I thought something fishy was going on here, and I wondered: if we made this constant in allocations, and not linear in the request or response size, what would that give us? Yeah — and then the other thing, you know —
B: Go ahead — yeah, I've got some interesting, possibly related data to share, but I don't want to interrupt your story. So, all right.
A: Yeah, so the other thing besides the allocations — which in themselves are fishy, and in retrospect they're super fishy, but we'll get to that — I also noticed, looking at CPU utilization, that if you look at the write activity: these are the threads — this is the CPU time spent doing writes, and these are actual syscalls.
A: But all of this is stuff where we're putting messages on the queue inside the gRPC runtime, to deliver them to the threads that are doing the writes. So we're actually spending more time telling gRPC to send something across the network for us than it takes to send it across the network — and that also points at an —
A: This is a CPU profile, not — well, yeah. So I thought: how can I test this? And then I realized I can just try it. I only care about Git HTTP, so I could build a toy version of Gitaly and a toy version of Workhorse —
A: — that only know how to serve a Git HTTP fetch request, but built on top of TCP sockets and nothing else, and benchmark that. And it turns out that that is insanely faster. That's sort of the story, and I have a table here at the bottom where I set this up on a virtual machine with the same machine type.
A: If you add the pack-objects cache you get more throughput, but you get the same system load — you get slightly lower CPU utilization, but you're basically still CPU-saturated. Now, rpc-test is this thing I built that just uses a TCP socket, and there you get twice as much throughput and almost 100% CPU utilization, so it's still being saturated.
A: So I was kind of blown away when I realized that was possible. If you have no immediate questions, I can show you some flame graphs that make it a bit clearer what's going on.
B: Yeah, I was going to say — this looks like spectacular results, and of course, as a scientist, that always kind of evokes, you know, questioning the results. I would love to see some profiles — some CPU profiles in particular.
A: Yep. So this is a profile — let me just... this is slightly annoying: I need to download it and open it so we can click around.
A: This is with the cache off, so we do have the overhead of the cache plumbing, but at a zero hit rate. And it looks like — this is Gitaly... yeah.
B: It's down here? Yes, right, right. And this is just for clarity: this looks like a perf profile, and I'm guessing your sampling rate was, like, 99 hertz?
A: Yeah.
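[Note: the exact invocation isn't shown on screen; a typical capture of this kind, assuming stock perf and Brendan Gregg's FlameGraph scripts, would look like:

    perf record -F 99 -a -g -- sleep 30                      # sample all CPUs at 99 Hz for 30s
    perf script | stackcollapse-perf.pl | flamegraph.pl > cpu.svg
]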
A: So if you look at gitaly-hooks, one thing that's interesting here is that you see lots of garbage collection. This is a garbage collection function — well, without knowing in detail what the Go runtime does... but this is garbage collection, obviously; yes, this is garbage collection.
A: Well, over here you see it's actually reading from a network connection.
A: So that is getting data from Gitaly — that's this little tower. But, for instance, this is just communicating with the flow control mechanism in grpc-go.
A: Yeah, so this is grpc-go's scheduler, right — it has its own scheduling for I/O, even though this thing has only one connection to work on, because it's a single process. More GC... and then also over here you have kernel stuff.
A: You see network writes, socket writes — yes, great — but only this narrow tower is socket writes; it's dominated by other stuff. So that's what things look like with a zero cache hit rate. Next I should show you what it looks like with a 100% cache hit rate, but still using Gitaly.
A: Yeah, let's get that one. So we're still not at 100% CPU, and that's also what I saw — it was something like 77 percent. Git is much smaller here on the left, but gitaly-hooks and Gitaly are still pretty big.
A: Fine, yeah — okay. So next I can show you rpc-test without the cache. This one was at load 189, and it was 100% CPU utilization, and that is also borne out in the profile picture, because the swapper is all the way over here on the right and it's tiny.
C: Okay — this is Git processes.
B: We're driving more Git processes because we've got higher throughput, yeah.
A: Exactly, which is why Git gets more CPU. Otherwise Git is just stalling, because it cannot feed out its data quickly enough — Gitaly is not accepting, not consuming, its data fast enough.
B: If you don't mind, let me pause you for a second, just for some basic background. I've spent a decent amount of time with Gitaly and its internal code, but I haven't spent any real time with gitaly-hooks. Is that a separate process that gets forked for a short amount of time, and it effectively acts as a pipe between child processes and Gitaly, or — yeah?
A: So normally, every fetch request corresponds to a git upload-pack process, right? And git upload-pack can do several things, but the most expensive one is sending back file data, and that is farmed out to another subprocess, which is called pack-objects — git pack-objects.
B: Okay, great. And we spawn this hooks process for every call, right? Was this in place before — will every upload-pack always invoke the hook, regardless of configuration?
A: Yes — that is a design choice we had to make when working on the cache, and it appeared to me (though I didn't understand this well enough at the time) that it did not add noticeable overhead. In the case where the cache is active, it reduces overhead, because you have fewer Git processes — pack-objects processes. But yes, it effectively acts as a Gitaly client, because it makes a Gitaly call and it's copying data from —
B: Yes — yeah, I've seen that PostUploadPack and SSHUploadPack, nowadays, now that we've got the caching enabled, have a similar number of calls to another gRPC call whose name I'm forgetting.
A: It was created to do other jobs — it already existed, and I just added another subcommand to it. But that —
A: Do we use — well, as you can guess, maybe: Git uses hook executables as a plug-in mechanism. It's part of git push; during a git push there are hooks before and after the references get updated.
A: Exactly — that's what it's supposed to be; that's its job. And here's the Gitaly replacement, and again it's all CopyBuffer.
A: — out. So that's without the cache — effectively a zero percent cache hit rate. Okay. Would you zoom —
B: — in on that tower one more time, the gRPC one? Yeah, perfect — I just wanted to take another peek. So we've got maybe 30 percent in pipe reads. What's the middle tower?
A: The syscall here is epoll.
B: Okay, that makes sense — yeah.
A: That's how Go makes I/O efficient: all the goroutines that want to do I/O register with the polling subsystem, and then one thing is doing the polling and waking up goroutines when they have something ready for them.
A: Okay, thanks. Yeah, so this was where we hit — like I said, the swapper is almost nowhere, so it's 100% CPU: we're actually completely CPU-saturating the Gitaly server. And that's a reasonable bottleneck. The repo we're using here is the handbook, which is a very large repo. So, okay — a lot of the time... because I did two sets of tests.
A: I also tested it with gitlab-org/gitlab, which is about 1.5 gigs — the handbook is like five gigs — and you see the numbers shift a little bit, because we get more requests through in a 30-second window with gitlab-org/gitlab than we do with this one. But yeah, so that's the story, and then the really cool one was, of course, where we have the cache hits. So let me get that one.
A: I'm in the wrong flame graph now, sorry. This one — this is the right one: rpc-test with the cache.
A: That is unbelievably beautiful. Yeah — there's one little detail I want to show you that is really cool. The Go runtime in some cases is smart enough to use sendfile. The sendfile system call allows you to copy data from a file into a network —
A: — connection. Even — I mean, in the current situation we have a socket and we have a file, but because there are these layers of abstraction in between, the runtime cannot see that that's what's there, and it cannot use the syscall. When we peel these layers away, this optimization can kick in, which is just a nice gift. And there's another nice gift that I didn't even use here — but if you think of upload-pack, right, it has a standard-out pipe.
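[Note: a minimal sketch of the fast path A is describing — serveFile and the shape of the server are illustrative, not Gitaly code. io.Copy checks whether the destination implements io.ReaderFrom, and *net.TCPConn's ReadFrom can issue sendfile(2) on Linux when the source is an *os.File, so the file bytes never pass through user space; wrap either side in another abstraction, as gRPC does, and the fast path is lost:

    package sendfiledemo

    import (
    	"io"
    	"net"
    	"os"
    )

    // serveFile writes the contents of path to conn. Because nothing sits
    // between the *os.File and the *net.TCPConn, io.Copy delegates to
    // conn.ReadFrom(f), which can use sendfile(2) directly.
    func serveFile(conn *net.TCPConn, path string) error {
    	f, err := os.Open(path)
    	if err != nil {
    		return err
    	}
    	defer f.Close()
    	_, err = io.Copy(conn, f) // sendfile fast path
    	return err
    }
]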
A: But I cannot keep my mouth shut about it, because it's just there.
B: No, that's awesome that it's there — that's fantastic.
A: Yeah, thanks. So yeah, this is mostly CopyBuffer, a little bit of sendfile; again, the hook executable is still CopyBuffer, like it was before. Okay — and at this point we're obviously not saturating the CPUs, but the load factor is also way below one. I don't know what I said — it was 0.6 or something.
A: And yeah, I had to look it up, but apparently the network limit on the VM is 32 gigabits per second, so if you divide that by eight, that's four gigabytes per second.
A: Actually, if I do that thing where I dup the file descriptor — sorry, dup the socket — and give it to upload-pack, then you really hit the 4,000. If I go back, I have graphs where it hits that. Okay, I'm not gonna — I —
A: Yeah, it just didn't in this particular configuration. The other thing I wanted to highlight here is that in either of the Gitaly situations, if you look at a memory profile, you see that in 30 seconds it allocates 115 gigabytes of memory, and this all has to do with protobuf marshalling and stuff that happens in this async I/O queue that is part of gRPC.
A: So that's 100 gigabytes; and then if we look at something like rpc-test with the cache — sorry, wrong one, scroll down — it's five megabytes. So I was saying this should be constant in the number of bytes, not O(n) in the number of bytes — and it is. To go from 100 gigabytes to five megabytes of allocations is a little —
B: Yes, no, that makes sense. I guess I can rationalize why we'd have really high churn on the heap when we're doing multiple layers of short-lived I/O.
A: Yeah — I don't understand very deeply how gRPC works, but I think one of the main things they try to do is connection multiplexing.
A: So all the things that go across the connection are messages, and these get put in a queue, and then there is this central thing that owns the connection: it looks at which things want to go on that one connection, it feeds off their queues, and it implements a quota and things like that. That's sort of what they built. And the way it works is that you have the submitting side of the queue, and there they — well, protobuf itself is not helping either: they serialize byte slices into protobuf messages.
A: So I think it's a fundamental thing about this multiplexing async-I/O design that the library forces on you. Plus, in some cases, things like sendfile are never going to happen this way, right — if you have all that stuff in between.
B: Yeah, that makes sense.
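[Note: a sketch contrasting the two shapes being discussed — illustrative only. The sender interface below stands in for a grpc-go generated stream; nothing here is Gitaly's actual code. The raw copy reuses one buffer, so allocations are constant per request; the message-per-chunk path allocates in proportion to the bytes transferred:

    package proxysketch

    import "io"

    // sender stands in for a gRPC stream: each Send frames one message
    // and hands it to the connection's shared write queue.
    type sender interface {
    	Send(data []byte) error
    }

    // proxyRaw is the bare-TCP shape: one reusable 32 KiB buffer,
    // O(1) allocations regardless of how many bytes flow through.
    func proxyRaw(dst io.Writer, src io.Reader) error {
    	buf := make([]byte, 32*1024)
    	_, err := io.CopyBuffer(dst, src, buf)
    	return err
    }

    // proxyChunked mimics the gRPC shape: every read becomes a fresh
    // message payload, so allocations grow as O(n) in the response size.
    func proxyChunked(dst sender, src io.Reader) error {
    	buf := make([]byte, 32*1024)
    	for {
    		n, err := src.Read(buf)
    		if n > 0 {
    			msg := append([]byte(nil), buf[:n]...) // per-chunk allocation
    			if sendErr := dst.Send(msg); sendErr != nil {
    				return sendErr
    			}
    		}
    		if err == io.EOF {
    			return nil
    		}
    		if err != nil {
    			return err
    		}
    	}
    }
]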
B: Yeah — so, let me screen-share for just a second; Slack crashed while you were chatting. Yes, so —
B: I spent more time than I expected digging into this. Several self-managed customers reported that they have large — you've already got all of my context, so I'm just kind of summarizing for anyone —
B: — else watching the recording. In brief: we have multiple large customers running non-trivially sized Gitaly nodes who have observed abrupt spikes in memory usage by Gitaly that do not go away after the surge in workload. And from the bits of diagnostic data those customers were able to send us, we're pretty confident that what's going on here is — it doesn't matter what the event is, but there's an event that causes —
A: Yeah, because that one is doing allocations in a loop, and it's not constrained by having to send things back to the client. So if that runs out of control fast enough, I can imagine that the runtime responds by just making very large allocations.
B: What we're kind of analyzing here is the interaction between two discrete memory management frameworks, one of them in the Go runtime and the other in the Linux kernel. The Go runtime allocates a large chunk of virtual memory — which we're going to call an mmap — and most of that is not actually backed by physical pages most of the time, but those pages can be allocated on demand, via a page fault. And what we think is going on is: the Gitaly memory — sorry, the Go —
B: — runtime's memory management framework has two options for — I'm going to wrap this up, because I know you already know this.
A: You talked about this last week too, but I wasn't in that call. So, okay — don't wrap it up too fast.
B: Okay — well, we know more now than we did last week. I'll zoom in on this flame graph in just a second. Oh, by the way, if anyone's interested: this is issue 3567 in the Gitaly repo, and there's a short status summary at the top of the issue that's up to date as of yesterday. But the short version is: the Go —
B: — the Go runtime's memory management has two options for how it notifies the kernel: "hey, kernel, this chunk of memory — it's semantically safe for you to reclaim it if you want to." That notification mechanism is madvise, and there are two bits of advice — the two mechanisms are —
B: — madvise MADV_FREE and madvise MADV_DONTNEED. MADV_DONTNEED is what the Go runtime historically used, until a couple of years ago it switched to MADV_FREE because of some perceived efficiency improvements. In the latest release, Go 1.16, they're switching back to MADV_DONTNEED, and that's probably going to be better behavior for us. There's a caveat, though.
B: According to the documentation for the madvise syscall, the behavior of MADV_FREE is very different on systems that have swap enabled versus no swap — and all of our Gitaly hosts have no swap, and I think that's why we're not observing the behavior these customers are reporting.
B: I have not empirically tested this, but I think the kernel documentation is reasonably trustworthy with regard to how it handles these syscalls, so I'm taking it on faith right now that it's correct in saying that, when madvise is issued with the MADV_FREE advice —
B: — if the host is swapless, that memory gets immediately reaped, which is pretty similar — but not exactly the same — as what happens with MADV_DONTNEED. If, in contrast, you're on a system that does have swap enabled — which we think the customers experiencing this problem are — then MADV_FREE is very, very lazy —
B: — in reclaiming memory. The kernel has a pretty abstract mechanism for computing a numerical value for memory pressure, and that has to exceed a certain threshold for the kernel to even bother reclaiming that memory. And I suspect that — because we know Gitaly benefits enormously from the filesystem cache, and the page cache is kind of, you know, the first victim for relieving memory pressure —
B: — it's pretty likely, in my opinion, that these customers are experiencing a performance regression because Gitaly spikes its memory usage, which steals memory from the page cache —
B: — and it never gets reclaimed by the kernel, because of this mechanism. And if we tell the Go runtime to switch to this —
A: But yeah, if you have something that is just eating into the space for the page cache, then you end up doing actual block I/O, which is not what you want.
B: Exactly. Okay, so on to the other bits that I wanted to share. I wanted to try to quantify — so the proposal that we've kind of reached here is, like I said, that via GODEBUG —
B: Yes, exactly. So we haven't applied this to our hosts yet, but I think we probably should, and I think a lot of folks are in agreement that we should do that.
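[Note: the knob being referred to is real — since Go 1.12 the runtime honors GODEBUG=madvdontneed=1, which makes it use MADV_DONTNEED instead of MADV_FREE. A sketch of applying it; the binary and config paths are illustrative, not our actual deployment config:

    # set in the service's environment before starting the process
    GODEBUG=madvdontneed=1 /usr/local/bin/gitaly /etc/gitaly/config.toml
]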
B: Exactly, yeah. So in our current state I tried a few tactics for — what's the best way to summarize this... Sorry, let me take another couple of minutes for one more piece of background. Okay — so this is —
B: This is one last bit about the interaction between the Go memory management and the kernel memory management. When the Go runtime issues that madvise syscall to say "this range of pages, you can reclaim if you want to," that is effectively saying: I know you've allocated physical pages for this portion of my mmap.
B: If you want to reclaim them, you can; I will probably use them again in the future, but feel free to reap them at your discretion. And the interesting behavior here is what happens if the process — in this context, the Go runtime — touches them again. This is the semantics of the madvise call, so this is Linux behavior.
B: Yes, exactly. And so the rationale for using this kind of asynchronous reaping behavior is that the Go runtime can be fairly aggressive in marking pages that it thinks it won't need again, and if it ends up wanting to use those pages again, it doesn't have to worry about whether they're there or not — the kernel will allocate —
A: — or also shrink the mmap or something like that? But it doesn't shrink the mmap; it just says "you don't need most of it." But yes, the kernel could just barge in and claim an insane chunk of memory.
B: Yes, exactly. And so the rationale for making this a lazy reclaim is that if a page that got marked as reclaimable ends up getting used before it gets reaped, then that saves the kernel the bother of —
B: Exactly, exactly. And so I figured that — because the mechanism for provisioning a page that has been reaped is to take a page fault, specifically a user-space page fault — we could trace page fault events and try to identify when those page fault events correlate —
B: Yeah. So this note here — I'll link it in the meeting doc — kind of walks through that experiment, and I thought it was interesting enough to share. The short version is that I found a tracepoint that lets me exclude most of the syscalls — like read and write and accept and epoll and pipe reads, things like that that we don't care about. The only page faults that we want are the ones that involve private anonymous memory.
B: Thanks. So, looking at — page faults happen really, really often, so I also needed to find a trace mechanism that would support sub-sampling. I got this one to give me a one percent sample.
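[Note: a plausible shape for this capture, assuming stock perf — the exact flags from the demo aren't shown. "page-faults" is perf's software event, and -c 100 sets the sample period so one stack is kept per hundred events:

    perf record -e page-faults -c 100 -g -a -- sleep 60
    perf script > pagefaults.stacks
]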
B: This is the first of the two flame graphs I wanted to show. This is effectively saying: for one out of every hundred page faults initiated from user space, go ahead and capture a stack trace, user space and kernel. And so you can see what a lot of Git processes are doing, which —
B: Yes — I'm going to zoom in on a couple of these so we can see more detail. So this, for example, looks like git diff, and it's reading a packfile entry — and that sure makes a lot of sense. It's doing a memcpy of a page, presumably from the page cache, into a private memory allocation. And the thing is that —
A: That's right, because I was thinking: Git also has to allocate memory itself for decompressing things — it's not enough to just load the data from disk, it's also compressed, so it needs at least that much memory on top. But this could just be the disk I/O that we're seeing here.
B: — where it would put its temporary structures, yeah, exactly. So, just to scroll — for folks who don't spend a lot of time with flame graphs, which I know Jacob does: within each layer the frames are sorted alphabetically from left to right. So we're looking for the Gitaly process, and — you can see it's not obvious — I'm going to move my mouse very slowly right here, and you can see there's "gitaly-" and it goes — yeah.
B: Here's gitaly-hooks, scrolling, scrolling... and over here, that's Gitaly — that's the whole thing. 149 samples, so this is 149 samples out of about 20,000. So it's a vanishingly small amount of the time that Gitaly itself is taking page faults of this variety. And the second —
B: Where did I capture this — did I note that here? I don't think I did. This was on one of the production Gitaly nodes, right —
C: Just — is this production traffic?
B: It's production traffic, and it's 60 seconds. I try to name the files with clues about this: so this is the software type of perf event, called page-faults; it's capturing one in a hundred of those events; the duration of the capture was 60 seconds; and it was on one of the production Gitaly nodes — I think it was file-42.
A: The Go memory allocator — well, both: our workloads are probably very constant, and the Go memory allocator is doing a good job of keeping things around that it needs, or of not releasing too many things. I —
B: — think so, yes. And just for reference, the second flame graph I attached here is just those 149 samples we zoomed into, extracted so they're easier to see. But I think the first, most important message is that there were very, very few of these events, which suggests the Go runtime is doing a pretty good job.
A: This is SmartHTTP PostUploadPack — so that's Git HTTP proto marshalling — and the one on the left is PackObjectsHook. The bulk of the data that flows through PostUploadPack flows through PackObjectsHook first, then through that child process, gitaly-hooks, and then back in and out the door again. So yeah, that makes sense.
B: Yep. So yeah, this is more of the gRPC marshalling you were talking about. I guess the other interesting bit — one of the things I was looking for here is an efficient way to quantify —
B: — how often we end up taking page faults when trying to allocate memory within the Go runtime. We would expect it to be very frequent to do allocations within the Go runtime that don't cause page faults, and I figured that one of the useful tricks for quantifying the impact of either the new Go version or toggling this madvise mode would be to count the ratio of allocation events versus the page faults that derive from those allocation events.
B: This doesn't get all the way there, but it gets kind of close. Have you looked at the Go runtime pprof?
A: I haven't spent a lot of time with it, because it emits numbers — well, some numbers — so it could be that, yes, there are metrics for the thing you want.
B: Yeah, I can't speak well to that, because I haven't used it enough to be confident in my interpretations, but — we're talking about heap profiling here? — my impression is that Go's internal heap profiler —
B: Yeah, sure — it's very quick, let's do that. Okay. In the spirit of finding ways to instrument this, I wanted to highlight that about 44 percent of these were — sorry, these are page faults. There are two common call paths near the tops of these stacks that actually did result in page faults.
B: One of them is when the runtime is effectively trying to add a new span to the mcache, and to do so for whatever size class is needed, that needs a refill.
B: It will call mcentral grow to allocate a new set of spans, and it's perfectly reasonable to expect that to sometimes take a page fault when it does that, by touching a portion of the mmap that has not yet been physically allocated.
A: Yeah — it's always going to do that, yeah.
B: You know, it always does the zeroing, and that's —
B: Sure, yeah. I guess my point is that if the page had already been allocated from the kernel, we wouldn't have seen that as —
B: I think the size classes end at 32 kilobytes, so any allocation larger than 32K gets allocated directly with this largeAlloc call. And exactly like you just said, that also has to return zeroed pages, which is why it traverses this — and it's natural to expect it to also see allocation-induced page faults when it's touching a portion of the mmap that doesn't currently have a physical page allocated.
B: I haven't written this yet, but I think I could probably instrument these two functions in the Go runtime, plus either a kprobe or a tracepoint for page faults, and only track the page faults that happened —
A: — under those functions, under —
B: — that call path, exactly. So that would effectively be: set a bit here for this particular task ID, and then, if there's —
B: Yeah, I don't know if that's called from other places too, but maybe — yeah, maybe so.
B: Well, this is a minor point, but — yeah, that's fantastic. This is a minor point, but the spelling of this symbol turns out to be really obnoxious, because it's got a parenthesis and an asterisk in it, and —
B: — escaping that is hard, yeah. I worked out a way to do it, but it's obnoxious. So —
A: — in favor of memclrNoHeapPointers? Yeah, exactly, exactly. I'm going to make a note of that right now. So, what I wanted to show you — let me share back — is that... oh, come on, why are we not typing in the right box? So, we were talking about —
A: — MemStats. Yeah, this whole namespace. I think this has things like HeapSys bytes. I wonder if that is —
B: — where they come from? Yeah, I guess so. I can look at the Prometheus exporter to find where they're getting that from in the runtime; they're —
A: They're from here. So you maybe want to look at this; maybe this helps. But I don't know — the Go runtime should maybe be oblivious to page faults, so in that sense it probably doesn't help, but I don't know. I'll throw this in the document.
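[Note: the namespace A is pointing at is Go's runtime.MemStats, which the standard Prometheus Go collector exports as go_memstats_* metrics. A minimal sketch of reading the fields relevant to the madvise discussion — HeapReleased is the portion already handed back to the OS, so HeapSys minus HeapReleased approximates what the process is actually holding; as A notes, these are the runtime's own counters and say nothing about page faults:

    package main

    import (
    	"fmt"
    	"runtime"
    )

    func main() {
    	var m runtime.MemStats
    	runtime.ReadMemStats(&m) // stops the world briefly
    	fmt.Printf("HeapSys:      %d\n", m.HeapSys)      // bytes obtained from the OS for the heap
    	fmt.Printf("HeapIdle:     %d\n", m.HeapIdle)     // spans with no live objects
    	fmt.Printf("HeapReleased: %d\n", m.HeapReleased) // idle bytes already returned via madvise
    	fmt.Printf("Retained:     %d\n", m.HeapSys-m.HeapReleased)
    }
]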
B: Sure, yeah — that's great. I think that was the most interesting stuff I thought was worth sharing from my weekend.
A: Yeah, that's fun! So — do you expect... this was motivated by problems for self-managed. Do you also see this relating back to a problem we're having in production, or —
B: Something like that — I don't think so. One of the questions in my mind when I got involved with that support task was: why are we not seeing this in our production? And I think I'm satisfied with the answer that the fact that we're running swapless systems makes the MADV_FREE behavior fairly similar to the MADV_DONTNEED behavior.
B: Yeah — especially since... I'm sure these customers don't have as much RAM as we do, but we did see their Gitaly process — one of the customers was able to give us the... you probably know this already: there's a command-line utility called pmap that has like three different levels of verbosity, and if you run pmap dash capital X capital X, that's the maximum verbosity, which is —
B: Well, you'll remember from the /proc filesystem that, for a given process ID, there's a maps file and there's an smaps file, and the smaps file is much, much richer information.
B: I actually did some profiling work to show how expensive it is to read the smaps file for a process, because I think a lot of people were thinking it was really cheap — and it's only cheap under certain circumstances, so it's easy to trick yourself into thinking it's always cheap. It's not: in fact, the more memory that's allocated, the more expensive the walk is. But I digress.
B: Yeah — pmap, you'll love it, just try it sometime. pmap is reading the smaps file and presenting it in a more human-readable fashion; it's literally just window dressing on smaps. But this customer gave us some pmap output, and it showed that the virtual size of the Gitaly process was 19.5 gigabytes, and of that, about 18-point-something gigabytes was private —
B: — anonymous memory, which is, you know, reasonable. But almost 90 percent of that private anonymous memory was in the "free" state, not the "dirty" state — which I think means it had been released to the OS for potential reclaim but hadn't actually been reaped, because those pages were still allocated. And that's what led to the conclusion that the lazy reclaim is at fault.
A
Do
we
know,
do
you
know
what
the
effect
of
m
advice,
free
and
msi
don't
need
this
on
this
s-map's
output.
B
That's
a
great
question:
one
of
the
things
that
I
wanted
to
do,
but
haven't
had
time
to
do,
is,
is
to
write
a
write,
a
a
toy
c
program.
To
specifically
do
you
know,
create
a
map,
allocate
the
the
memory
and
then
and
then
directly
yeah.
A: Writing little C programs to try things out is good.
B: Yeah, totally with you. But I think that's all I had for that topic.
B: This is a total tangent. You mentioned at one point recently — I forget if it was in Slack or an issue... oh, actually, it was probably this thing you just demoed — that you started off trying to get some profiles from Gitaly and were being thwarted by the fact that when we do package installs, the on-disk binary gets replaced. That is so obnoxious.
A: So, from a slightly different perspective: I know Omnibus only from way back, but way back I knew Omnibus very deeply, and I think — actually, I'm not even sure this makes sense, but I think Omnibus has some things... well, it's just Chef code, right, and it does things to try not to restart processes unnecessarily.
A: So it could be that there's something in the Omnibus cookbooks that manage Gitaly internally to Omnibus that decides not to restart the process after the upgrade, because it thinks it's the same — but it's —
A: — not the same, and we ought to restart it. There is of course a good reason for not restarting processes all the time, because people notice this, and —
A: — live reload. So, if we just change Omnibus so that it always restarts Gitaly — I don't know what the criterion is right now, but it might be too lax — and if we make it more strict, so it does more restarts, it should not impact users, and then the problem goes away, because the running binary is the deployed binary. Hopefully that's possible; I just haven't sat down and looked into whether that's actually what's going on.
A: — you don't care that the binary doesn't match, because these Go binaries are fairly big: Go puts enough information in there that they can generate a self-contained profile.
A: So once you download a profile that way, you don't need to know what's running or anything. But it's so much nicer when you can see in perf what's going on — that was really nice. And because I did my experiments on a test VM that doesn't get deployed all the time, I knew —
B: Yeah. perf has kind of two modes of operation with regard to profiling, and most of the time I use the timer-based frequency sampling, which I think is what most folks use it for. But you can also say: I want to instrument a particular function and capture every call to that function. So that's —
B: — more expensive, and on the occasions when you want to instrument a function that gets called often enough that you're worried about the overhead, there's support for sampling one out of every N occurrences.
A
What
you
did
on
that?
Well,
that
wasn't
the
function
call,
but
it
was
a
that
painful
thing.
You
also
did.
B
Exactly
exactly
so,
I
I
was
actually
in
in
the
act
of
trying
to
do
that
that
I
I
discovered
kind
of
a
gotcha
that
I
I
was.
I
was
really
surprised
at
the
the
interface
for
specifying
that
you
want
one
at
every
end.
B
Events
just
silently
didn't
work
for
the
first
two
attempts
I
made,
for
there
are
several
proof:
events
that
you
can
use
for,
trying
to
catch
paid,
page
faults
and
the
ones
I
started
with
were
k-probe
events
and
trace
points
and
for
both
of
those
it
silently
ignores
any
specification
about
only
get
one
out
of
every
end
events.
So.
B: Yes — and it was like, you know, a few hundred megabytes' worth of perf data, and no doubt it slowed the system down for the 30 seconds I was doing the profile. Which isn't a huge deal, but it's not what I wanted. So, because the tool didn't work the way I wanted, I did some digging into it. It turns out that only certain kinds — there are a few types of perf —
A: I tried reading about this stuff in the Brendan Gregg book about eBPF, and I just get lost in all the different kinds of probes.
A: He invented them — or he sort of championed them, anyway. So I also don't want to make the mistakes you just described; that's why I just stay away and stick with the CPU sampling — the frequency sampling, yeah.
B
Yep
yeah
and
that's
that's
sufficient
for
most
cases
so,
but
this
is,
I
think
this
is
the
first
time
that
I
needed
to
did
I
that
I
felt
like
I
needed
to
actually
instrument
a
high
frequency
function,
call
and-
and
I
did
work
out
a
way
to
do
it.
I
found
out
that
the
the
software
type
of
of
perf
events
does
support
this
and
the
k,
probe
and
trace
point
events
do
not,
at
least
as
of
our
current
version
of
perf
yeah.
There
was.
B
Yeah
and
that's
and
that's
that's
going
to
increase
over
time,
yeah.
B: Yeah — and talking about BPF: I feel like we're kind of in an era of just massively improved observability for systems, and I'm so happy to get to explore this. Like what we were just talking about a few minutes ago — I think probably the right approach there —
B
I
think
a
reasonable
approach
there
is
is
to
write
a
bpa
program
that,
in
that
that
instruments
entry
into
into
one
into
into
the
the
frame
before
doing
the
page
folds,
as
well
as
the
page
fault
events
and
and
only
emit,
and
only
emit
events,
whether
it's
a
counter
or
a
stack
trace
where,
where
the,
where
the
the
flag's
been
set
by
the
by
the
earlier
frame,
and
that
should
be
a
lot
cheaper
than
saying
for
every
one
of
the
high
frequency
events.
B
The
page
faults
examine
the
stack
right.
B: And BPF is a perfect tool for doing that — it's kind of made to do exactly that. There are kind of two interfaces for working with BPF: there's the BCC suite, the BPF compiler collection —
B
And
I
I
got
about
a
year
and
a
half
ago.
I
got
that
installed
on
all
of
our
all
of
our
servers,
but
the
the
other
interface
is
is
bpf.
A: — basically recreating DTrace, but on BPF, yeah.
B
Yeah
yeah
exactly
and
it's
it's
unfortunately
difficult
to
get
that
to
get
that
working
on
the
version
of
ubuntu
that
we
have
installed
on
most
of
our
fleet.
So
once
we
get
upgraded
past
a
certain
point
it'll
become
trivial
to
install
that.
But
I
really
really
really
wanted
that
to
be
on
all
of
our
boxes
too.
But
I
just
didn't
have
the
time
to
because
we
have
to
compile
it
from
source
and
it's.
A
Funny
we're
just
waiting
for
linux
to
catch
up
with
solaris
enough
to
get
some
of
their
goodies.
B
Many
many
years
ago
I
introduced
solaris
boxes
at
the
company.
I
was
working
for
at
the
time
for,
for
two
reasons,
one
one
was
d-trace.
B: Yes, yeah. Prior to that, the best thing we had going on the Linux side was SystemTap, and... I mean, it really tried, but it had some critical flaws. One of the things that BPF programs tend to do really well is avoid locking across CPUs, and —
B: — it was really easy to write a SystemTap instrumentation that had a shared global data structure, and that means the more CPUs you've got, the more cross-CPU locking you got — and it ends up introducing an enormous bottleneck in the very system you're trying to profile. BPF —
A: Yeah. So, just wait for the next Ubuntu.
B: Yep, sounds like it. I feel like more people will be comfortable using BPF to explore a system once we get bpftrace available, and I think that's really important for, like, career development, and just learning more about the systems that we operate. But for now, what we've got is the BPF BCC suite and perf.
A
So
yeah
something
I've
been
wondering
is
if
there's
a
same
way
to
make
these
just
these
very
basic
frequency,
cpu
flame,
graphs,
more
accessible,
because
in
a
way
I
asked
for
production
access
just
to
get
those
things.
And
yes,
I'm
getting
a
lot
of
mileage
out
of
them,
but
I'm
sure
other
people
would
too
and
we
can't
give
all
of
them.
Production
access.
B: You know — uploads of perf script output to object storage for the Gitaly nodes: we can totally do that. We just need to — I mean, yeah.
A: I think there's also just more of a social obstacle: the people who need this maybe don't know that it exists. So we —
B: Yeah, I agree.
B
Why
I
try
to
share
flame
graphs,
like
you
know
the
svg
files,
as
well
as
as
well
as,
like
you
know,
static
images
with
like
circled,
you
know
and
annotated
text
on
it
just
to
kind
of
make
it
more
accessible
and
kind
of
show
people
how
to
how
to
interpret
it.
I
think
it's
it's
so
empowering
and
I
really
really
wanted
it.
It
really.
A
Has
been
for
me
because
this
this
whole
black
object,
cache
project
started
because
there
was
a
flame
graph
of
an
incident
in
in
december,
and
I
got
to
download
that
I'd
be
like
wait.
What
what's
going
on
here
and
like
finally
yeah?
I
mean
they're
weird,
because
they
don't
always
tell
you
the
whole
story
or
it's
not
always
clear
what
conclusions
you
can
draw
from
a
flame
graph
yeah.
But
you
can
see
so
much
more
than
you
can
with
a
lot
of
more
basic
tools.
B
Totally
agree
and
there's
like
I
mean
most
of
the
time,
we're
talking
about
cpu
profiling,
but
of
course
it's
you
know,
flame
graph
is
just
a
format.
You
know
a
format
for
representing
yeah.
You
know
abstract
data
so
like
there
are
other
kinds
of
events
that
are
sometimes
applicable
to,
like
other
people,.
B
Yeah
yeah,
I
feel
like
I
feel,
like
I'm
gonna,
go
with
yes
on
that
I've
seen
one
or
two
places
where
people
used
it
for
things
other
than
revealing
stack
dominance,
but
I
feel
like
that
was
really
abusing
the
format,
although
it
was
also
hilarious
and
useful.
It's
just
not
what
you
expected
to
see.
So
I'm
not
I'm
not
casting
blame.
I
just
I.
I
thought
I
thought
it
was
wow
that
is
so
weird
and
creative,
and
I
don't
want
to
do
that
again.
B
But
no
yeah
yeah.
I
totally
agree
it's
it's!
It's!
It's
really
really
empowering
to
have
access
to
that
stuff,
and
I
want
more
folks
to
have
it.
I've
had
for
folks
that
do
have
production
access.
B
I
think
a
lot
of
those
folks
are
not
really
comfortable
with
it,
because
the
indications
can
be
kind
of
arcane
and
you
don't
really
know
you
know,
what's
what's
what's
safe
and
what's
not
in
terms
of
adding
overhead
so
kind
of
to
relax
the
that
that
sense
of
anxiety,
I've
got
a
handful
of,
I
think,
three,
three
or
so
shell
scripts
that
take
like
no
arguments
or
one
arguments.
A: Right — I sort of get that from the hostname as well, but does it end up in the flame graph if you do that?
A: I know — I noticed the metadata in the header, but I'm not sure when you... So, okay: the only way I consume these things is by making a flame graph, which I do offline, because it doesn't make sense to me to copy stackcollapse-perf.pl onto the server.
B
Or
we
we
actually
got
those
installed
in
in
the
default
path,
so
you
can
just
call
stack
collapse,
dash
proof,
dot,
pl
and
flame
graph
right
on
the.
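[Note: the standard FlameGraph pipeline being referred to, whether run on the server or offline on a downloaded perf script dump:

    perf script -i perf.data > out.stacks
    stackcollapse-perf.pl out.stacks | flamegraph.pl > flame.svg
]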
A: This thing — I mean, it has this option to filter out swapper, but I usually want to see it.
A: — then that's the output. So with those two things I can capture my profiles, and because they're shell scripts, I can't do anything wrong. It's kind of —
B
Yeah
exactly
yeah
exactly
yeah
yep,
and
that's
that's
exactly
the
kind
of
thing
that
I
I
figured
would
make
it
a
little
bit
easier
for
folks
that
do
have
production
access
to
to
you
know,
be
comfortable,
get
capturing
a
profile,
and
so
I
like
make
opinionated
decisions
about
about
the
capturing
frequency
and
the
direction.
B: — and I like having the header, because it doesn't impede the generation of the flame graph, and it gives you the exact invocation plus some bits of metadata — like what exact kernel was running, and the topology of memory if you ask for that.
C: The same Chef recipe that installs perf.
A
Right
but
part
of
the
problem
is
getting
the
perf
profile
of
the
server,
because
then
you
need
to
run
scp
or
something
to
get
the
file
off,
and
what
I
like
about
my
script
is
that
it
just
uses
well,
not
literally
scp,
but
the
data
ends
up
on
my
computer,
which
is
where
I
want
it,
and
not
in
my
home
directory
on
what
server
was
on
five
minutes
ago
like
so
for
for
me,
an
ideal
solution
would
be
something
that
people
run
locally
or
it
would
be
a
web
interface
where
you
just
click
and
it
downloads
to
your
download,
folder
sure
yeah.
A: But I haven't found — I haven't thought about it too much, but if there's some sort of way to get this right with a reasonable amount of effort, it would be very nice. But yeah.
B: Yeah — just like I've got a set of tutorials that I've got half-written, for kind of, you know, teaching people how to read flame graphs and how to use perf and BPF tools. It just takes time; all this stuff takes time, and there's so many —
A: Yeah — a lot of people need to know they're there. But that takes care of installing the scripts, because it's just a git pull.
A
Well,
okay,
it
we've
been
talking
for
more
than
an
hour.
A
Should
wrap
it
up
thanks
for
the
well
demo,
thank
you
for
the
demo.