From YouTube: 11. Python and Jupyter
Description
Learn about using Python and the ever-popular Jupyter at NERSC.
Slides for all sessions can be downloaded from here: https://www.nersc.gov/users/training/events/new-user-training-june-21-2019/
I'm Rollin Thomas, a data architect from the Data and Analytics Services group—that's Prabhat's group—and I'm going to give two talks. The first talk is about Python and Jupyter at NERSC.
So the idea here is to give everybody a set of high-level takeaways about do's and don'ts with Python and Jupyter at NERSC. But first of all, before I get started, I want to know how many people in here are Python users. Who uses it for just scripting?
Okay, how about people who use Python as a platform for machine learning? I guess nobody is really doing that. All right, okay! Well, thanks for letting me know.
So presumably you all want to do science, and you want to do science with Python at NERSC. This slide is just an example of some of the things that people do in terms of science through Python at NERSC. There's the Materials Project—in fact, I saw a few Materials Project stickers on people's laptops.
So you probably know what that's all about: workflows being managed with FireWorks here at NERSC. People also do data analysis, data flow, and workflow management, like the LHC data-processing workflow. There's also processing for sky surveys—cosmology, cosmic frontier stuff is a big deal here at NERSC; those are here on this slide. Of course, it's a platform for machine learning and deep learning.
Really, it's the way to go. And then there are actually some simulation codes, like Warp or nbodykit, which are written mostly in Python, or Python with C extensions or Fortran linked into them. Those are simulation codes, and some of those run here at NERSC as well. The most important thing to know about getting started with Python at NERSC is that we have really awesome documentation. It's awesome because I wrote it, and we have a lot of really good stuff in there. We try to keep it up to date pretty continuously.
First of all, this morning I think you learned about the software module system at NERSC—if Rebecca was giving a good talk, then she talked about modules, or somebody did. The way to get Python working at NERSC is through modules. You must always be sure to load a Python module. Do not ever, ever, ever use /usr/bin/python!
That's the system Python that came with the Cray, so it's a little bit old. But you can do the standard /usr/bin/env python thing once you have a module loaded, so that you can get the Python interpreter running in your script. If you don't know what versions of Python modules we currently have on Cori, you can do module avail python.
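As a sketch, the commands described above look something like this (exact module names and versions vary; check `module avail` on the system):

```shell
# Never use the system /usr/bin/python that came with the Cray.
# Load a NERSC-provided Python module first:
module load python

# See which Python modules are currently available on Cori:
module avail python

# Once a module is loaded, scripts can use the usual shebang line:
#   #!/usr/bin/env python
```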
Now, this is not the only way you can use Python at NERSC. You can install your own Python if you want; you can compile it from source if that's your thing. But there are other, more recommended ways of doing that, and I'll talk about that in a little bit.
NERSC Python is Anaconda Python. How many people have used Anaconda Python, say, on their laptop? Yeah, most people, right. It's kind of the distribution of Python of choice, especially because it's really good at providing tools for data analytics and scientific computing and getting you going. It has this handy package tool called conda that lets you build environments that are customized: you have your own set of libraries that you like to work with.
You can completely destroy that whole environment and build it all over again if you like, in a matter of minutes. It's very popular. Conda environments replace virtualenvs—who has ever used virtualenv? It's kind of the older tool; conda environments replace virtualenv and do a lot more. And there are a few other packaging tools out there, like pipenv—has anybody heard of pipenv? Yeah, you can use pipenv if you want. Of course, Anaconda Python has many hundreds of very useful packages.
If you are part of a community that does, I don't know, cosmic microwave background stuff, and you have your own way of compiling and your own set of packages that you like to put together, you can use channels to get those. As for the modules: the Anaconda modules are monolithic—there's a whole Anaconda distribution that comes inside that module. There are a few add-on modules, like h5py-parallel, so if you need to use parallel h5py, then you need to add that module.
These are the modules you should be using on Cori right now: the 2.7 Anaconda module and the 3.6 Anaconda module. So whether you're a 2.7 person or a 3.6 person, those are the ones that are there. There are a few other modules you can mix in; you can try different ones out. I've decided that what we'll do is keep 2.7 as the default module. So if you do just module load python, it will load a default module that will be Python 2.7, until the end of this year, when 2.7 is done.
So, how many people have switched to Python 3? All right—those of you who haven't, you're on borrowed time; you have six months, okay. And in fact there's this handy website that will tell you exactly when Python 2.7 will retire. It's a countdown; I think there's supposed to be a party at PyCon 2020, too, if you want to go to that. So if you forget exactly when it is, you can look there.
All right, okay, so: switch to Python 3 in about six months. Okay, conda environments. It seems like everybody knows how to do conda environments, right? So conda create -n, and then this python= whatever is pretty important, because conda might pick for you, and you want to make sure it's the Python that you want—probably Python 3. And then you activate the environment, and you can go on and install whatever you want. You can use pip; actually, I tend to prefer to use pip instead of going to find conda-forge stuff.
I mention that because a lot of users kind of get stuck and they don't know what's wrong, and then I go in and try to just build it from the beginning, and it looks like I didn't do anything and it all just works. Also, I've decided: don't try to do this pip install --user thing quite so much. Just go ahead and stick things straight into your conda environment. Don't put them over in the Python user base unless you really have a good reason for doing that.
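Put together, the workflow described above might look like this (environment and package names are examples):

```shell
# Create a named environment, pinning the Python version explicitly
# so conda doesn't pick one for you:
conda create -n myenv python=3.6

# Activate it (older conda installations use "source activate"):
source activate myenv

# Install whatever you want; both conda and pip now install into the
# active environment rather than into the per-user site-packages:
conda install numpy
pip install astropy
```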
If you don't really know what I'm talking about, that's fine—just stick to conda. All right: doing things yourself. If you don't like our modules for whatever reason, or you don't even like to do the module load thing, you can install your own Anaconda installation if you like. Just a couple of tips there: make sure you don't have a Python module loaded, and unset this PYTHONSTARTUP thing. But this is how you do it: you just grab the installer from Anaconda.
It seems like probably everybody knows how to do that, but you can do this just fine on Cori, and then you just set it up so you can do source activate.
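A sketch of that do-it-yourself installation (installer file name and install prefix are illustrative; check Anaconda's download page for the current installer):

```shell
# Make sure no Python module is loaded and PYTHONSTARTUP is unset:
module unload python
unset PYTHONSTARTUP

# Grab the installer from Anaconda and run it:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3

# Set it up so "source activate" works in this shell:
source $HOME/miniconda3/bin/activate
```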
All right, a couple of little things that are special about Cori that you need to know. You should not ever conda install mpi4py. If you want to use MPI from Python, you use mpi4py—but don't ever do conda install mpi4py. It won't work right. It might look like it works on one node or something like that, but then you go to two nodes and it will make no sense.
What you need to do is compile mpi4py against the Cray MPICH, and it's very easy to do. Here I just have, like, five lines. The first one is just downloading the package; unpack it, cd in there, swap a module so you're using the GNU compiler—I think you can use the Intel compiler too, it probably doesn't matter, but I usually do this. The only thing you have to do is python setup.py build, and then tell it where the mpicc compiler is—and that's just the compiler wrapper that we talked about this morning, right, cc. If you do that, then you'll have built your own mpi4py. And you only need to do this if you create a conda environment and you want to use mpi4py from it; if you're using my Python module, mpi4py is there. I have a couple of slides about parallelism with Python.
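Those five-or-so lines might look like this (the version number and programming-environment module names are illustrative, not taken from the slide):

```shell
# Download the mpi4py source tarball from its release page, then:
tar zxvf mpi4py-3.0.0.tar.gz        # version is illustrative
cd mpi4py-3.0.0

# Swap to the GNU programming environment (Intel likely works too):
module swap PrgEnv-intel PrgEnv-gnu

# Build against the Cray compiler wrapper "cc", then install into
# the active conda environment:
python setup.py build --mpicc=$(which cc)
python setup.py install
```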
Just generally, people tend to use process-level parallelism in Python a lot, just because it's easier, okay. What I mean by that is: thread-level parallelism you can really only get by using a compiled library, like in C or whatever—there are a few other things you can do—but mostly people use MKL from NumPy, and it's threaded and vectorized and all of that, and that works pretty well.
But usually, when people are writing Python, they kind of use mpi4py or multiprocessing or Dask or PySpark to get parallelism going. So you'll see these jobs that are, like, flat MPI jobs—by that I mean, like, 68 MPI ranks on a KNL node—and that's kind of how people roll. But you can do both: you can do hybrid parallelism. It's just the same as submitting any other kind of job.
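As a minimal, non-NERSC-specific illustration of process-level parallelism, here is a standard-library `multiprocessing` sketch; `square` and `parallel_squares` are made-up example names:

```python
# Process-level parallelism with the standard library's multiprocessing
# module -- one of the options mentioned above alongside mpi4py, Dask,
# and PySpark. Each worker process applies square() to items of the input.
from multiprocessing import Pool


def square(x):
    return x * x


def parallel_squares(values, nworkers=4):
    # Pool.map distributes the work across nworkers processes and
    # returns the results in input order.
    with Pool(processes=nworkers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(parallel_squares(range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

mpi4py follows the same "many independent processes" model, but with explicit ranks and message passing, and it is launched with srun rather than forked from a parent process.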
That's hybrid parallelism—I think I went over that one already. One bit of information is that the only one of these parallel libraries that really could scale to the whole machine is going to be mpi4py, okay. But if you're going to try to scale to a significant portion of the machine—whether it's a Python application or a hybrid Python/C/C++ kind of application—you're going to want to do something besides just launch it out of your home directory. And that's because of some characteristics of our file system; namely, Python's import mechanism is really metadata-intensive.
Basically, any time you do import numpy, it goes through the whole file system, through your whole Python path and all of that stuff, trying to open libraries everywhere. And so if you have a hundred thousand MPI ranks, or even 100 MPI ranks, they're all going to do that more or less at the same time—they're all doing import numpy. What happens is those requests go to a single metadata server, and it says: hey, you guys, get in line; I'll get to all of your requests in order.
Okay, so your application is going to spend a half hour doing import numpy, right. So you don't want to do that. That's bad! So this is actually a benchmark where I do import numpy and then this package called Astropy, which has lots and lots of little submodules in it, and measure how long it takes to import that at that 4800-rank scale. There are different file systems, and the one with the best performance, on the right, is with Shifter, which is a container technology—which just so happens to be the next talk.
But the second-best performance you can get, besides building a Shifter container and running from there, is to use the global common file system that was mentioned earlier. So generally, don't, like, do a big MPI numpy import from your home directory. It's so bad I don't even benchmark it. Okay: scratch is okay, project is not that great, but global common is kind of your second-best one. All right: how to profile and debug Python applications. There's, of course, good old print statements.
If that's the thing that you do—a lot of the time, that's how people get started. You just have to remember to unbuffer the output from both srun and Python.
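That unbuffering trick, as a one-line sketch (the rank count and script name are placeholders):

```shell
# -u on srun unbuffers the job-step output; -u on python unbuffers
# Python's own stdout/stderr, so prints appear promptly in order:
srun -u -n 64 python -u ./myscript.py
```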
But we have, as I mentioned, a whole page about how to profile applications, going kind of from easier-to-use tools to more difficult, professional-grade tools. Python comes with cProfile, and that works just fine on Cori.
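A minimal, self-contained sketch of using the built-in cProfile; `slow_sum` is just a stand-in workload, not anything from the talk:

```python
# Profile a function with the standard-library cProfile/pstats modules.
import cProfile
import io
import pstats


def slow_sum(n):
    # Deliberately naive loop so there is something to measure.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call():
    profiler = cProfile.Profile()
    profiler.enable()
    result = slow_sum(100000)
    profiler.disable()
    # Render the top entries of the profile into a string:
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf).sort_stats("cumulative")
    stats.print_stats(5)
    return result, buf.getvalue()


if __name__ == "__main__":
    result, report = profile_call()
    print(report)
```

For a whole script, `python -m cProfile -o out.prof myscript.py` writes a profile file you can load into a viewer such as SnakeViz, which is the kind of visualization mentioned next.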
You can use a tool like SnakeViz or gprof2dot—which is what this visualization here is—to see where your code has been spending its time, and then you can work on the bottlenecks there. There's even a way to do this with MPI processes; just follow that link. line_profiler is a tool you can use to study where your bottleneck is, line by line. We've also developed a package here called timemory, which you can instrument into your code.
You put little decorators on functions, and it tells you how much time it spends in that function, how much memory is being used—all kinds of neat stuff. It works with MPI; it works if you've got a Python and C++ application. And, of course, there's VTune for Intel Python, and TAU, which both work on Python.
Okay, so are there any questions about Python that I could handle maybe right now? It's all pretty clear? What's nice, I think, is that we've set it up so that it's kind of not a big deal, right?
Okay, how many people use Jupyter at all? Okay. How about at NERSC? Okay, cool, all right. So Jupyter is, you know, this really powerful platform for data analytics, for creating documents that have code, text, equations, visualizations, widgets—all kinds of nifty stuff in them. Our default Jupyter deployment is JupyterLab, and it has been basically since they said they weren't in beta anymore. Today is the release of JupyterLab 1.0, I think.
So we'll probably be upgrading this in the next few weeks. To use Jupyter at NERSC, we've set up a hub, which is a place where you log in and then you can launch from, and that's the URL you can go to: jupyter.nersc.gov.
So if you were a long-time Jupyter user, or a recent Jupyter user, you might have used jupyter-dev or jupyter. They're the same thing now; we've smushed them together into one thing. And what you can do there is pick where you want your notebook to start up.
You can have it start up on Cori, or you can have it start up in this container environment called Spin, but mostly people are going to want to start up their notebooks on Cori. We have not one node now, not two—we have three nodes set aside that are kind of like login nodes for all of the notebooks that people run, and at any given time there are about 150 or 200 notebooks running across those three nodes. Why would you want to run on Cori?
Well, of course, your notebooks would then be on Cori. They can see the Cori scratch file system; it's the same kind of Python environment as if you SSH in; and you can also submit jobs there. We have some handy little tools for submitting jobs from cells, called slurm magics. The Spin shared-node configuration is external to Cori, so it's not on Cori, it can't see scratch, and you can't submit jobs from it.
What that's for is: you have a paper deadline and you need to get to your data that's on project, so you can make that last plot for your paper. Okay, so let's back up. I'll say that, I think, last time I looked, there were 200 notebooks running on Cori and then, like, two in Spin. Okay, so it's kind of a backup: if Cori is down for maintenance, you can maybe use Spin. All right.
The most common Jupyter question I get is: how do I take a conda environment that I created and use it from inside a Jupyter notebook? There are a few different ways to do this, but here's the way that I recommend. You log in to Cori over SSH and you create your conda environment, and you have to add one package called ipykernel. Okay, if you do that, then the next thing you can do is run this Python command.
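A sketch of those steps ("myenv" is an example environment name):

```shell
# Inside the conda environment you created on Cori:
conda install ipykernel

# Register the environment as a Jupyter kernel; the --name you pick
# is what shows up in the kernel list on jupyter.nersc.gov:
python -m ipykernel install --user --name myenv
```

This writes a kernel spec file under your home directory, which is the file discussed next.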
So if you want to go look at it, you can. Okay, once you've done that, you point your browser to jupyter.nersc.gov. You may need to restart your notebook server, but once you do that, you should see that kernel show up, and then you should be able to click it, and then you have that conda environment from your notebook. This is what the kernel spec file looks like.
So it's just JSON, but basically all it does is take an argument, which is: run Python, and then launch my kernel, and then connection-file stuff—don't worry about that, that's Jupyter stuff, okay. Now, why am I showing you this? Because you can actually do more than just this with it. You can customize the environment: you can add this environment stanza—here in red. It's basically there to let you set the PATH, or the LD_LIBRARY_PATH, or all that stuff that people like to customize.
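A kernel spec along those lines—the paths, names, and the values in the env stanza here are illustrative, not copied from the slide:

```json
{
  "argv": [
    "/global/homes/y/you/.conda/envs/myenv/bin/python",
    "-m", "ipykernel_launcher",
    "-f", "{connection_file}"
  ],
  "display_name": "myenv",
  "language": "python",
  "env": {
    "PATH": "/custom/bin:/usr/bin:/bin",
    "LD_LIBRARY_PATH": "/custom/lib"
  }
}
```

The `argv` list is the "run Python and launch my kernel" part, `{connection_file}` is the Jupyter connection-file stuff to not worry about, and `env` is the customization stanza.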
I don't actually like this quite so much. The way that I like to do this kind of customization—like if you want to add a module or something like that—is: don't run Python; run a script that wraps Python, okay. So the way that you do that is you change that kernel spec file so that, instead of the first argument being "run Python," it's "run this shell script" at some path. Okay, and then inside that shell script you can export whatever you want.
This is, like, a real common one: people want to make matplotlib plots with LaTeX labels or whatever. This is the way you can do it: you do module load this, and then what the script actually does is just run the ipykernel piece, okay.
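Such a wrapper script might look like this (the script name, module name, and variable are example placeholders):

```shell
#!/bin/bash
# kernel-helper.sh -- pointed to by "argv" in the kernel spec file
# in place of the python executable itself.
module load texlive          # e.g. so matplotlib can render LaTeX labels
export SOME_VAR=whatever     # any other environment customization

# Finally run the ipykernel piece, passing through the Jupyter
# arguments (including the connection file):
exec python -m ipykernel_launcher "$@"
```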
So if you have other modules that you want to be able to talk to from Jupyter, you can load them this way. And then Shifter is a container technology I'm going to talk about next.
This is how you could run a kernel from inside a Shifter container. That's also documented on the website, I think; if not, it's on the slides here, and we should add it. And then, before you write me a ticket and say something's wrong with Jupyter, what you should do is look at your notebook server's log file. The place where that's found is in your home directory, at .jupyter.log. It used to just be called jupyter.log, but people told us they didn't like seeing it.
So we put a dot in front of it. Now they don't see it, but we, the staff, know where it all is. What it's got is all the stuff that your server says it's doing, and if you see an error in there, that might give you a hint, okay. We're working on ways to expand support for Jupyter. You can run things like Dask or Spark on compute nodes and talk to them from notebooks, and I can tell you all about that.
If you want to know: we are going to have a way for people to launch notebooks on compute nodes, so that you don't have to share with, you know, sixty-six other people—but you have to pay, okay. And then we're also working on interfaces inside JupyterLab that kind of expose Slurm and things like that, so maybe you don't need to ever SSH in ever again. So these are kind of the key takeaways: it's basically use conda, the stuff about mpi4py, and you should use Shifter.
Okay, so, all right—so there were questions about Dask. Who knows what Dask is? Yeah. So Dask is one of these kind of newer frameworks for starting up little clusters that you submit work to in the form of a directed acyclic graph: you have tasks, they depend on each other, and you say, just go do that. The architecture for Dask distributed is that there's a scheduler—and that's who you submit work to—and then there are workers, and those are the ones that get the stuff from the scheduler and do it, okay.
So how do you run Dask distributed at NERSC? There are a few different ways. One would be to set up a job where you start the Dask scheduler—the thing that drops the scheduler file—with an ampersand, and then the next thing is that you srun all the Dask workers. Okay, so those are the things that get the srun. The scheduler runs on the head node, but the workers run across all of the nodes that are in the job. And then you start your client script up after that.
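A sketch of that batch-job pattern (node counts, file locations, and the client script name are placeholders):

```shell
#!/bin/bash
#SBATCH -N 4
#SBATCH -t 30

# Start the scheduler on the head node, in the background; it drops a
# scheduler file that workers and clients use to find it:
dask-scheduler --scheduler-file $SCRATCH/scheduler.json &

# srun launches the workers across all nodes in the allocation:
srun dask-worker --scheduler-file $SCRATCH/scheduler.json &

# Then start the client script, which connects via the same file:
python client.py --scheduler-file $SCRATCH/scheduler.json
```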
Okay. If it's Jupyter, you have to figure out a way to wire up the connection between the notebook and the scheduler; I can tell you more about how to do that in a minute. But generally, this would be a way for you to start up a Dask cluster inside of a job and submit work to it from a client script. So that's kind of the way to go. Now, when you do that, though, what we want you to do is make sure that you turn on the SSL…
SSL.