GitLab Using Strace to Understand GitLab, 17 Jul 2020

From YouTube: Using Strace to Understand GitLab - Part 1: Intro

Description

Start of a series describing how to use Strace to understand how programs work internally.

0:00 Intro
0:28 What is strace
1:50 When not to use strace
3:08 Tracing 'ls -la'
5:25 execve
7:33 open
9:00 stat
9:48 mmap
14:18 mprotect
21:05 Summary

Strace man page: https://man7.org/linux/man-pages/man1/strace.1.html

Brendan Gregg article on performance impacts of strace: http://www.brendangregg.com/blog/2014-05-11/strace-wow-much-syscall.html

Part 2: https://www.youtube.com/watch?v=tThs8QeP2qY
Part 3: https://www.youtube.com/watch?v=J-GkU7Mmqy4
Part 4: https://www.youtube.com/watch?v=dgJH4wpR5OE

A

Hi everyone. I wanted to make a series of videos about strace and how to use it to understand GitLab and related programs. This will be the first video, and we'll talk about what strace is, what it actually does, when to use it and when not to use it, and then we'll go through a really simple program using strace to understand what it's doing.

A

So what is strace? strace is a Linux utility that allows you to see the system calls made by a given application into the kernel. System calls are the API that the kernel (Linux in this case) provides to programs to access resources that the kernel controls: opening files, sending network traffic, things of that nature. I/O is the most common thing that we care about, but it does a lot more than that.

A

What that does is help you understand the internals of a program: what is it actually doing? It gives you the ground truth, instead of looking at logs and hoping the logs are verbose enough, when maybe they aren't. It just lets you peer inside. Unlike a tool like gdb or perf, it doesn't actually show you the code, so it's giving you kind of the shadow of the program.

A

You can only see the program as it reaches out to the world to open a file, make a connection, or whatever, which is often enough to understand the problem. So if it's something related to I/O, strace is a great tool to understand the problem.

A

If it's within the code itself, where it's in some kind of hot loop and just sits there forever, strace is going to be less useful. But in general, the problems we see in support are usually things that strace is helpful with.

A

So when should we not use strace? The downside of strace is that it makes programs vastly slower. A typical program might be 10 to 100 times slower, and in the worst case it might be 400 times slower. I'll link to an article by Brendan Gregg in the description that talks about the overhead strace adds.

A

I use strace mostly on my own instance, just to try to understand how a program is working and what it's doing in a given scenario. In rarer situations we'll have to use it on a customer's instance, but we try to avoid that whenever possible. Because it has such a significant performance impact, you don't want to leave it running for any more than a few seconds if you can avoid it, and you almost never want to run it on a healthy instance, because it causes so much load.

A

One other thing to keep in mind with strace is that occasionally it will get stuck after you exit it. You hit Ctrl-C, you return to your terminal, you think you're done, but it's actually still running in the background, slowing things down.

A

So if you're running it in any kind of environment you care about, always check in ps to make sure that it actually exited, because if it didn't, you'll cause an outage, and that's bad. Okay, all right. Let me share my screen and we'll go through a quick example of what it looks like to strace ls.

A

On the right here I've got the man page up, and we'll pull up the man pages for the various syscalls as we get to them. To invoke strace, we'll just type strace, and then we'll go through the flags as we enter them here. So we're going to say -f.

A

There we go, here it is. This is saying that as a given process is being traced, if it forks or clones any new child processes, trace those as well. Then we'll do -tt, which says that when we write the timestamp for a given event, log it with microsecond precision. We'll do -T (capital T), which says to log how long a given syscall takes to complete.

A

This is useful if you want to understand, say, whether NFS is really slow, and you need to see how long it's taking to access files. That's the most common scenario where we care about that. And then we'll do -y, which, let me pull it up here in the man page.

A

So this will decode the path for a file descriptor, which is very useful if you want to understand what we're writing to, reading from, and so on.

A

And then we'll also include -yy, which does the same thing, but with sockets and pipes and things like that. Then we'll do -s and say 1024 bytes, which says that for a string being recorded in strace, capture the first 1024 bytes and discard anything beyond that. And then we'll do -o and give a path, which we'll just call lstrace.txt. Okay.

A

So those are all the flags for strace, and it'll write out to this file here. Then we give our actual command, so we'll do ls -la. Run that: ls runs and completes as normal and prints out everything here. So let's look at the actual trace now.
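For reference, the invocation built up above can be written out as an argument list. This is a minimal Python sketch: the helper name `strace_argv` is made up for illustration, while the flags and the output file name are the ones mentioned in the walkthrough.

```python
import shlex

# Flags as described in the walkthrough; "lstrace.txt" is the output
# file name chosen in the video.
STRACE_FLAGS = [
    "-f",                 # also trace children created by fork/clone
    "-tt",                # timestamp each line, with sub-second precision
    "-T",                 # show the time spent inside each syscall
    "-y",                 # show the path behind each file descriptor
    "-yy",                # also decode socket descriptors and similar
    "-s", "1024",         # keep up to 1024 bytes of each string argument
    "-o", "lstrace.txt",  # write the trace to a file instead of stderr
]


def strace_argv(command):
    """Return the full argv for tracing `command` (hypothetical helper)."""
    return ["strace", *STRACE_FLAGS, *command]


argv = strace_argv(["ls", "-la"])
print(shlex.join(argv))
# strace -f -tt -T -y -yy -s 1024 -o lstrace.txt ls -la
```

To actually run the trace, the list could be passed to `subprocess.run(argv)` on a machine where strace is installed, keeping in mind the performance warning from earlier.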

A

Okay, so the first thing we see here is that we've run execve, and this call just executes a program. Here's the actual binary executed, and here are the arguments: ls, which was resolved to /bin/ls, and then the flags -la. And then over here you see we have 22 vars; that's 22 environment variables.

A

You can use the -v flag in strace to print the environment variables in full. That's usually not necessary, but it can sometimes be interesting, so it's something to keep in mind. Okay, and just to go through the rest of the first line here: this is the process ID, or PID. In later videos we'll see that this is actually the thread ID, which is what Linux itself calls the PID, and then there's the thread group ID, which is what you actually see in ps.

A

But for now, this is just the process ID. Over here we have the timestamp with microsecond precision, then the syscall and its arguments, and then here's the return code: 0 for success. It took 115 microseconds to complete, so pretty fast, even with strace slowing everything down. All right, so on the next line we do a brk, which can be used to extend your heap, but in this case we're passing in a NULL argument.
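The fields just described (PID, timestamp, syscall, arguments, return value, duration) can be pulled apart with a regular expression. The following is only a sketch: the sample line is reconstructed from the narration, the PID and the digits after 18:18:49.633 are invented, and the exact way the environment is abbreviated differs between strace versions.

```python
import re

# Layout assumed here: PID, -tt timestamp, call(args) = result <duration from -T>
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+"
    r"(?P<result>.*?)\s+"
    r"<(?P<duration>\d+\.\d+)>$"
)

# Illustrative line only: PID 1234 and the trailing timestamp digits are made up.
sample = (
    '1234 18:18:49.633000 execve("/bin/ls", ["ls", "-la"], '
    '[/* 22 vars */]) = 0 <0.000115>'
)

parsed = LINE_RE.match(sample).groupdict()
duration_us = round(float(parsed["duration"]) * 1_000_000)

print(parsed["syscall"], parsed["result"], duration_us)
```

The same pattern also copes with results that strace has decorated, such as a file descriptor followed by its path in angle brackets, because only the final `<...>` on the line is treated as the duration.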

A

So what we're saying is: tell me where my heap ends right now. I just need to understand what my address space is. This is a quick way for the program, as it boots up, to understand what its resources are. In this case the kernel comes back and says: okay, here's the end of your heap.

A

If we then wanted to add additional memory to the heap, we could just call brk with a number higher than this. This took four microseconds, so very fast. Okay, and then we do an access, which is just asking: am I allowed to access this file? Here we have the path /etc/ld.so.preload, and that comes back and says this path doesn't exist. Then we do openat, which opens a file and gets a file handle back.

A

So let's pull up that man page. You can see the man page for open actually covers open, openat, and creat; openat is kind of all the way down here at the bottom.

A

Actually, let's look at the arguments here. The first argument is this AT_FDCWD, then we have the actual path we want to open, and then here we're saying read-only and close-on-exec. So let's talk about what this first argument is. You can see here on the man page that it says int dirfd.

A

That could be a file descriptor for a directory to use as the base for a relative path, if you didn't want to use the current working directory. But in this case we're saying use the current working directory, with this dirfd of AT_FDCWD.

A

So that's just saying: if it's a relative path, use the current directory for it. open does that by default, but openat lets you be a little more precise. Okay, so we openat that, and we get a return code of 3, and strace has conveniently decoded it next to the number to say that fd 3 is /etc/ld.so.cache. This took six microseconds, so very fast again. Okay, and then we do an fstat. Let me pull up the man page for that.
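To make the dirfd idea concrete, here is a small Python sketch that opens a file relative to an explicit directory descriptor instead of the current working directory. It assumes Linux; the temporary directory and the file name `example.txt` are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a file to open (name chosen only for this example).
    with open(os.path.join(tmp, "example.txt"), "w") as f:
        f.write("hello")

    # Open the directory itself; this descriptor plays the role of dirfd.
    dir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The relative path is resolved against dir_fd, not the working directory.
        fd = os.open("example.txt", os.O_RDONLY | os.O_CLOEXEC, dir_fd=dir_fd)
        try:
            data = os.read(fd, 1024)
        finally:
            os.close(fd)
    finally:
        os.close(dir_fd)

print(data)
```

Leaving out `dir_fd` gives the behaviour seen in the trace, where the relative path is resolved against the current working directory.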

A

So, fstat. Actually, we'll just look here. If you look at the arguments, first we pass in the file descriptor and then the stat struct, but we don't really care about the struct. Let's see, fstat: okay, it says fstat is identical to stat, except that it takes a file descriptor instead of a path.

A

So basically, we've gotten this file descriptor and now we say: tell us the details. And here you can see the actual response: it has 644 permissions, the size of the file is roughly thirty thousand bytes, and we get a zero for success.
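The same two details, permissions and size, can be read from an open descriptor in Python. This is a sketch using a throwaway temporary file, with the 100-byte size and the 644 mode set by the example itself rather than taken from the trace.

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"x" * 100)        # 100 bytes, an arbitrary size for the demo
    tmp.flush()
    os.chmod(tmp.name, 0o644)    # same permission bits as in the trace

    info = os.fstat(tmp.fileno())      # takes a descriptor, not a path
    mode = stat.S_IMODE(info.st_mode)  # keep only the permission bits
    size = info.st_size

print(oct(mode), size)
```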

A

All right, now on to mmap, which is one of the more important syscalls that we'll see. It can be used in two ways. The first is to map a file into the address space of the program. So instead of having to make explicit calls to the kernel to read another thousand bytes, the kernel will page it in in the background, and you won't need to make any more explicit calls, which is more efficient.

A

The downside for us is that when we're stracing, we can't see any reads that are done on this file, because they no longer hit the kernel as syscalls. Anyway, what we see here: the first argument is an address, and we're saying NULL, so we don't care where it goes. We're saying 30,390 bytes, which is the size of this file, and then we're saying we want it to be read-only, and that this is going to be private to this process.

A

Private means that changes made through one mapping aren't shared with other processes that have the same file mapped. Then we pass in our file descriptor, which we know is /etc/ld.so.cache, and then zero for the offset, so no offset. The return value is the address where the file has been mapped into the address space of the program. Now that we've mapped it in, we no longer need the file descriptor, so we just close that out.
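A rough Python equivalent of this open, fstat, mmap, close sequence looks like the sketch below. It assumes a Unix system and uses a temporary file with made-up contents in place of the linker cache.

```python
import mmap
import os
import tempfile

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello from a mapped file")
    tmp.flush()

    fd = os.open(tmp.name, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # Read-only, private mapping of the whole file, starting at offset 0.
        mapped = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                           prot=mmap.PROT_READ, offset=0)
    finally:
        os.close(fd)  # the mapping stays usable after the descriptor is closed

    first_word = mapped[:5]  # plain memory access, no read() call involved
    mapped.close()

print(first_word)
```

Slicing the mapping is the part that would be invisible in a trace: the bytes are fetched through page faults rather than through read syscalls.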

A

So we close file descriptor 3, and that returns 0 for success. Okay, now we're getting into shared libraries. We openat once again, using the current directory for relative paths, and we open up libselinux.so.1.

A

If we run ldd on ls, we can see all the libraries that are dynamically linked into it, the shared libraries. So libselinux, libc, libpcre, and so on are all linked into ls, and we'll see that in the strace here in a second. Okay, so we open up this library. It is now the new file descriptor 3, since we closed the old one.

A

Then we read it. Unlike the previous one, we actually read the contents here a bit, and you can see this header here, where it's \177ELF. As an aside, these backslashes followed by numbers are non-printable ASCII characters. If it's a non-printable character, strace prints it in this escaped format; if it's printable, strace prints the actual character.

A

So it's kind of interspersed: backslash 3, a pound sign, backslash 20, and so on. Anyway, it's a binary, you can't read it, and it's really easy to tell that this is a binary just from looking at the header here, which says ELF.

A

Okay, so we read that in. It's 832 bytes, so we see the full read.
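The header check described here is easy to reproduce. In this sketch the helper name `looks_like_elf` is made up, and the sample byte strings are short hand-written stand-ins rather than real library contents; `\177` is the octal escape for byte value 127 (hex 7f), which Python accepts in bytes literals.

```python
# The ELF magic number: byte 0o177 (0x7f) followed by the letters "ELF".
ELF_MAGIC = b"\177ELF"


def looks_like_elf(header: bytes) -> bool:
    """Return True if the bytes start with the ELF magic (hypothetical helper)."""
    return header.startswith(ELF_MAGIC)


# Hand-written stand-ins for the start of a binary and of a shell script.
binary_header = b"\177ELF\2\1\1\0"
script_header = b"#!/bin/sh\n"

print(looks_like_elf(binary_header), looks_like_elf(script_header))
# True False
```

On a real system, the first 832 bytes of a shared library could be obtained with `open(path, "rb").read(832)` and passed to the same function.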

A

And then we fstat it.

A

We get back the full size here. And then this is the other use of mmap that I mentioned. Actually, I mentioned that there was one, but I didn't tell you what it was: you can use it to allocate memory for yourself without a file backing it, so it's just extra memory to write into. In this case we don't specify an address.

A

We want 8K, we want it to be both readable and writable, and private and anonymous. Passing in negative one as the file descriptor says there is no file descriptor. So we're saying this isn't a file, this is just memory, and it spits out a new address for us here.
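An anonymous mapping with the same parameters can be requested from Python as well. This is a sketch assuming a Unix system; the bytes written into the buffer are arbitrary.

```python
import mmap

# 8 KiB of readable and writable memory with no file behind it.
# A file descriptor of -1 means "there is no file".
buf = mmap.mmap(
    -1,
    8192,
    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
    prot=mmap.PROT_READ | mmap.PROT_WRITE,
)

buf[:5] = b"hello"        # it is just memory, so we can write into it
contents = bytes(buf[:5])
length = len(buf)
buf.close()

print(length, contents)
```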

A

Okay, and then we mmap some more space.

A

This one is readable and executable, and not writable, and we're doing this with libselinux again.

A

Okay, and then here we actually protect some of this memory and say it can't even be read. That's mprotect. Let's pull it up real quick.

A

So PROT_NONE means the memory cannot be accessed at all, and then you have PROT_READ (can be read), PROT_WRITE (can be modified), PROT_EXEC, and so on.

A

Okay, so we've brought it into our memory space and we're not even allowed to touch it. It's just there.

A

Okay, and then we close out libselinux; we're done with that. Great. Then we go ahead and do the same process with libc. You can see we open it and read the header. Again we're asking for the first 832 bytes here, and we get that.

A

We fstat it, mmap it, and mprotect it once again, so this first large chunk is not even readable. Then we map some more and close it out, and we continue with the next shared library, libpcre.so.3. This pattern continues through each of the shared libraries that we saw from ldd. We'll go through those real quick, since we've seen the basic pattern with the first one.

A

It maps it and closes it out. arch_prctl is not something that is particularly relevant to us, so we'll just skip by that. With the mprotect calls here, we're setting some sections of memory to be read-only, and then we're freeing, or unmapping, a section of memory.

A

If we look at this address here, it looks like it's actually the mapping for ld.so.cache.

A

Okay, set_tid_address and set_robust_list are not usually relevant to us, so we'll skip by those. Then rt_sigaction: we don't really care about what it does, but it's useful to understand where we see it. We'll typically see it at the start of a new process. Particularly with Git, when it's first loading, it'll run rt_sigaction for all the signals and say: when I receive signal X, this is what I want to do.

A

All right, and then we check statfs. We're checking for file systems, and they don't exist: we don't have an SELinux file system. We once again check where our current heap ends. Then, and here's the other use of brk, we pass in a new address to extend our heap.

A

We open up /proc/filesystems, fstat it, read it, read it again to make sure there's nothing else, and close it out.

A

So when you see a read call and it just returns an empty string like that, that means there was nothing returned. You can see that zero bytes were returned. That's an EOF. At that point we know that we've reached the end of the file, and we close it out.
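This read-until-empty pattern is what a typical read loop looks like in code. Below is a minimal Python sketch; `read_all` is a made-up helper, and a pipe holding 1500 arbitrary bytes stands in for the file.

```python
import os


def read_all(fd, chunk_size=1024):
    """Read from fd until read() returns zero bytes, which signals EOF."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if chunk == b"":  # the empty read seen in the trace
            break
        chunks.append(chunk)
    return b"".join(chunks)


# A pipe stands in for the file: write 1500 bytes, then close the write end.
read_end, write_end = os.pipe()
os.write(write_end, b"x" * 1500)
os.close(write_end)

data = read_all(read_end)
os.close(read_end)

print(len(data))
```

In a trace of this script, the loop would show up as reads returning data followed by one final read returning 0.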

A

Then we check the SELinux config, which doesn't exist. This VM doesn't have SELinux installed or enabled. At that point we start checking locales.

A

So we start opening up locale files, which makes sense, since we have to print out to the console and we need to know what language to do that in. The first two files don't exist, but we do find LC_IDENTIFICATION, so we open it, fstat it, mmap it, and close it: the same pattern we saw with the shared libraries. Then we proceed to do that with a whole bunch of locale files.

A

We won't go into that in any detail. All right, so we get past locales.

A

Okay, so we try to open a few more locale files, and they don't exist. And here we start doing the actual work of ls. We open the current directory, which is the period.

A

So we do that, we fstat it, and then we run getdents, so we're getting the directory entries, the files associated with this directory. We get three entries back, which makes sense: the current directory, the parent directory, and the example .git directory within there. Then we get the extended attributes, which is not something we normally see. It's kind of an ls thing, which makes sense. Then we create a socket here.

A

You can see we've got a new file handle back, file handle 4, and then we're trying to connect to this nscd socket. That's not really relevant to us, but we will see sockets when we look at more networking, so just keep that in the back of your mind. We can skip past all this nsswitch stuff; it's not really relevant.

A

Okay, we load in another shared library, libnss_files. Now that we've gotten past that, we open up /etc/passwd and read in all the users on the system.

A

We close that out, create another socket, and try to connect to nscd again for fun. Hey, why not. All right, then we open up /etc/group, fstat it, read it to find out all the groups on the system, and close that. Then we get the extended attributes for the current directory and for the .git directory.

A

We see if there are any more directory entries. There are none, so we get back a zero on that. We read in our final locale messages, and then we check standard out here, which is fd 1, which we can see is a pseudo terminal: pts 0 in this case. Then we write out the first line of the output we saw, total 12, so we write that here.

A

Okay, and then we check our time zone, which we'll need to print out the times for the files that we found. Then we write out the remainder of the lines that we see in the terminal. Once we've written all that out, we close standard out, we close standard error, and then we exit with zero here. You can see that we did indeed exit with zero in the final line of that trace.

A

So that is what ls does under the hood. That's kind of a typical C program that we see: you load in a bunch of shared libraries and then go do the work. It felt like a fair amount. How long was that?

A

But if you look at the actual amount of time spent: we started at 18:18:49.633 and ended at 18:18:49.641, so we spent about eight milliseconds for all of that to happen, and that's with strace slowing things down to 10 or 20 or 50 times slower than normal speed. So ls is fast. Computers are fast. Even though this is kind of a wall of text, it didn't actually take that much time at all to do this.

A

And if you look at the actual time spent from where we opened up the directory, it was even less, something like two milliseconds. So the actual amount of time spent doing the work in ls was minimal.
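The eight-millisecond figure is just the difference between the first and last timestamps quoted above. As a quick check in Python, using those two timestamps truncated to milliseconds as spoken:

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"

# First and last timestamps of the trace, as read out in the video.
start = datetime.strptime("18:18:49.633", FMT)
end = datetime.strptime("18:18:49.641", FMT)

# Whole milliseconds between the two timestamps.
elapsed_ms = (end - start) // timedelta(milliseconds=1)

print(elapsed_ms)
# 8
```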

A

Okay, so that, I think, is a good place to stop for the first video. That's what strace is, plus a quick example with a simple program. We won't go into this much detail on the syscalls in future videos; this is just an intro. In the next one we'll go over a couple of examples with Git, then we'll look at stracing Puma and Unicorn within GitLab, and the last video will be on Gitaly and how it interacts with all the different pieces.

A

Okay, thanks for watching.