From YouTube: Workflows Pegasus Workflow Manager
Description
Part of Data Day 2022, October 26-27, 2022
Please see https://www.nersc.gov/users/training/data-day/data-day-2022/ for the training agenda and presentation slides.
So this word "workflow" is used a lot nowadays. It can mean a lot of different things and it comes up in a lot of different places. Some people might refer to, you know, their bash script that runs a few different programs as their workflow, but a lot of things now are moving toward even more custom infrastructure.
You might have databases set up, like we just saw in things like Spin, and those might feed web interfaces, and all of that is interconnected in a workflow that this data goes through. Some of the things that NERSC helps our users with are these workflow tools and workflow engines.
A couple of examples are things like Snakemake; you might also have heard of GNU Parallel or FireWorks.
We also support many more, and the one I'm going to be talking about today is Pegasus. Pegasus is a different type of workflow manager, but the general goals of why we want workflow managers are things like automation: data comes in, and we want the workflow manager to be able to pick it up and start computing on it. Especially important are reproducibility and sharing the workflow with other people.
There are also going to be a lot of moving parts, so you want to be able to track data through those pipelines, as well as to use our resources more efficiently. Maybe you have a lot of really tiny jobs that you can pack into a larger job using one of these workflow managers.
So I'll talk a little bit about the Pegasus workflow manager. This is a workflow manager that defines its workflows in a few different YAML files: the replicas, sites, transformations, and workflow. I'll go into that a little bit and show that there are a couple of different APIs you can use to build up these YAML files. You can either write them yourself or, probably the easiest way, use one of these APIs, and I'm going to be showing off the Python API.
We also have an example of this on the Data Day GitHub page that I have working for Perlmutter. There's a little bit of information here, and then some more information on the actual GitHub page, which has install instructions and instructions on how to run it.
The Pegasus workflow manager has different APIs for planning out a DAG. That DAG, a directed acyclic graph, is a graph that represents the work to be done. So we have our nodes here.
In this case we have something where we're going to split a file and then do some word counts on those split files. Each one of those nodes is going to be some kind of executable that we're going to run, and each one of these edges shows both the data flow and the dependencies. So we can see that we need to run the split operation first in order to go on to these word counts.
All of this is run using the HTCondor scheduler. Pegasus will take your representation and turn it into a DAG that the HTCondor scheduler understands, and HTCondor will use that to actually execute it; that scheduler is the one executing and managing the whole workflow.
Before we start making our workflow, there are a couple of things we want to consider when we're building this. One of them is: what executables, what parts, are we going to want to run? Are these parts in a container as well? If so, should we say that they're in a container, and do we need to define that somewhere? A lot of it, too, is going to be: what data do we have?
What are our inputs going to be, what are our outputs going to be, and on top of that, what are the dependencies in there? What outputs need to go to the next task, and how are all those parts connected? One of the pieces of this, the transformations catalog, is a way of defining what our executables are.
Here's part of the Python API showing a function that's going to create this transformation catalog. We make our transformation catalog object and then we can add transformations to it. A transformation is how Pegasus defines its executables, the way that data is going to come into something and then be transformed into something else. We can also tell it what site we want to execute on.
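As a rough sketch of what that looks like with the Pegasus Python API (the executable names, paths, and the "local" site name here are placeholders for illustration, not necessarily the exact ones used in the demo):

```python
from Pegasus.api import Transformation, TransformationCatalog

def create_transformation_catalog():
    # The transformation catalog lists the executables the workflow can run.
    tc = TransformationCatalog()

    # "split" and "wc" stand in for the demo's two steps; the paths and the
    # "local" execution site are assumptions for this sketch.
    split = Transformation(
        "split", site="local", pfn="/usr/bin/split", is_stageable=False
    )
    wc = Transformation(
        "wc", site="local", pfn="/usr/bin/wc", is_stageable=False
    )

    tc.add_transformations(split, wc)
    tc.write()  # writes transformations.yml
    return tc
```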
Next is what we call the replicas. The replica here is going to be some test CSV that we're going to split up, and we want to say where that file is. This is a local file; it's on the file system somewhere, and we say that it's in the inputs directory at the moment. We don't have to list our output data here; when we go on to the next steps, Pegasus will register the outputs as replicas independently.
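A minimal sketch of that replica catalog step, assuming the input CSV sits in an inputs/ directory next to the workflow script (the file names are illustrative):

```python
from pathlib import Path

from Pegasus.api import ReplicaCatalog

def create_replica_catalog():
    rc = ReplicaCatalog()

    # Register the existing input: logical file name -> physical location.
    # Only inputs need to be listed here; outputs are registered by Pegasus
    # as the workflow runs.
    rc.add_replica(
        "local",                                     # site where the file lives
        "test.csv",                                  # logical file name
        Path(".").resolve() / "inputs" / "test.csv"  # physical path
    )
    rc.write()  # writes replicas.yml
    return rc
```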
Here's where we actually start building up the workflow. In this workflow we're going to have that file, so we define that we have a file that we need as input. We can define that we want to split this into four different parts and then give the arguments to our command. So we have this job, we've called it by the same name that we called our executable, we can add things like arguments, add things like inputs, and then we can add that job into the workflow. We can also go and add different commands to this too.
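Putting that together for the split step might look roughly like this; the exact split arguments are a guess at something equivalent to the demo, not the command actually used:

```python
from Pegasus.api import File, Job, Workflow

wf = Workflow("split-wordcount")

# The input file we registered in the replica catalog.
input_csv = File("test.csv")

# The four pieces the split step will produce.
parts = [File(f"part.{i}") for i in range(4)]

# The job is named after the transformation it runs ("split"), and we attach
# arguments, inputs, and outputs to it before adding it to the workflow.
split_job = (
    Job("split")
    .add_args("-n", "4", "-d", "-a", "1", input_csv, "part.")
    .add_inputs(input_csv)
    .add_outputs(*parts, stage_out=False)
)

wf.add_jobs(split_job)
```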
Now we have our top layer, which is our split, and all of those parts are going to come into these different word counts. We can see that each word count job now has to take as input one of the parts that we created in the step before, so the outputs from the split are coming in as the inputs of our word counts. Again, we can also capture things like standard out: this is actually going to catch the standard out and save it as a new replica.
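Continuing that sketch, the fan-out into word-count jobs could look like this, with each job's standard out captured as a new file (names are again illustrative):

```python
# One word-count job per split part. Pegasus infers the dependency on the
# split job because each part is an output of split and an input here.
for i, part in enumerate(parts):
    counts = File(f"counts.{i}.txt")
    wc_job = (
        Job("wc")
        .add_args("-w", part)
        .add_inputs(part)
        .set_stdout(counts)  # capture stdout and register it as a new replica
    )
    wf.add_jobs(wc_job)
```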
This is also a really good way to show that, while this is a very simple workflow, you can see how you could extend it as well, especially with it being a Python API. Maybe you have a directory with files that are constantly updating; this workflow could look in the directory, know that it needs to do the same task on those files depending on which directory they're in, and you could have it build the workflow based on files and directories or other things.
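For example, a hypothetical variation on the same sketch could glob a watched directory and add one job per file it finds:

```python
from pathlib import Path

# Hypothetical: build one word-count job per CSV currently sitting in an
# "incoming" directory, registering each file as a replica as we go.
for path in sorted(Path("incoming").glob("*.csv")):
    f = File(path.name)
    rc.add_replica("local", path.name, path.resolve())
    job = Job("wc").add_args("-w", f).add_inputs(f).set_stdout(File(path.name + ".count"))
    wf.add_jobs(job)
```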
So again, we have this Python file now that we can generate our workflow with, and that will create all of those YAML files. Again, the replicas file is going to have our storage and our data defined in it, the transformations file will have all of our executables and parameters, and then we'll have the workflow file, which actually defines the workflow. I kind of skipped over the sites file; the site catalog is very specific to the site that we're going to be running on.
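In the Python API, generating those YAML files is roughly a matter of attaching the catalogs and writing everything out (a sketch; the exact output file names may differ):

```python
# Attach the catalogs to the workflow (or write them separately with
# tc.write() / rc.write()), then write the YAML that pegasus-plan consumes.
wf.add_transformation_catalog(tc)
wf.add_replica_catalog(rc)
wf.write()  # produces workflow.yml
```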
I've talked a little bit about Condor. HTCondor is a different scheduler than Slurm, so we're actually going to have to set that up in order to use this system, and the way we're going to set it up is by using scrontab to run a long-running workflow job. This is a new feature.
It's part of Perlmutter, to enable long-running workflow jobs, and we'll actually go and set up the scheduler before we start any of our jobs. The HTCondor scheduler is built to handle a lot of high-throughput workloads. The idea is that you have lots of really small jobs that can go through pretty quickly, jobs that might bog down a system like Slurm.
A lot of the time these are going to be jobs that need less than a node's worth of compute power, so you're able to pack a lot more jobs onto a single node, and this scheduler understands a little bit better how to pack those jobs efficiently onto those nodes. So we're going to use this setup to run Pegasus, and Pegasus will use it to run the workflows.
Down at the bottom here is just an example of starting up and running a scrontab entry on Perlmutter. It says it will run every 10 minutes, but really what happens is that it will just try to start every 10 minutes; scrontab is smart enough that once one of these jobs is running, it won't launch another until that job has finished, because it's using Slurm on the back end to manage that. So this will just start up within 10 minutes, and then I have it running for 30 days in the workflow QOS.
Once the scrontab entry goes through and your scheduler starts running, you can use the condor_status command to check and make sure that everything is running. Once we see that all the pieces are up, the most important one here is going to be the scheduler, and we have our scheduler running.
We've started up that scheduler and it's ready to accept the work it's going to get, and then we can do a pegasus-plan with submit. pegasus-plan will take all of those YAML files that it sees, plan through them and compute the DAG that needs to happen, and write out some files; you can see here it's writing out files for HTCondor to use to go and execute all of this work.
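The same plan-and-submit step can also be driven from the Python API; a sketch, assuming the HTCondor pool started by the scrontab job is defined as a "condorpool" site in the site catalog:

```python
# Plan the abstract workflow into an HTCondor DAG and hand it to the running
# scheduler. The site names are assumptions; they must match your site catalog.
wf.plan(
    sites=["condorpool"],    # where the jobs execute
    output_sites=["local"],  # where outputs get staged back to
    submit=True,             # submit to HTCondor right after planning
)
```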
Once your workflow starts running, you can go and look at how things have progressed by looking at pegasus-analyzer, or pegasus-status is another command too. Both of them will show you the progress of the jobs as they go through, so you have the total number of jobs and things like the number succeeded or failed. You can see that here too.
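If you are driving things from Python rather than the command line, the same information should be reachable through the workflow object; a sketch, assuming these wrapper methods behave like their CLI counterparts:

```python
# Rough Python-API equivalents of the CLI tools (assumed wrappers):
wf.status()      # like pegasus-status: per-state job counts as the run progresses
wf.analyze()     # like pegasus-analyzer: digs into failed or held jobs
wf.statistics()  # like pegasus-statistics: summary after the run finishes
```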
This is the same workflow that I had been using, right, so we should probably have five jobs, but you actually see that we have nine jobs here. Something that Pegasus is also doing is stage-in and cleanup for a lot of your jobs. This is helpful as well: it does a lot of checking for you to make sure that a file is in the right place before it starts, before maybe executing a larger sbatch job that might use up some of your allocation.
We can also check both condor_q, to see that things are going through its queue, and the Slurm queue as well. Like I said, this one is waiting for the stage-in to happen, so it's staging in some files, making sure that things are ready before it actually starts up the jobs.
Something I didn't say as well: Pegasus can actually go and submit jobs to the Slurm scheduler for you. There are a few things in the site configuration that tell Pegasus how it should build an sbatch command for one of the jobs, and you can actually specify, for each one of your transformations, what you want to happen for that transformation.
That includes being able to add special parameters for how much memory, how much CPU, even how many nodes you might want if it's an MPI job.
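A hedged sketch of what that per-transformation specification can look like in the Python API; the profile keys, values, and the "simulate" job are assumptions for illustration, not settings from the demo:

```python
# Small serial job: ask for one core and a little memory so many of these
# can be packed onto a single node (memory assumed to be in MB).
wc_job.add_pegasus_profile(cores=1, memory=1024)

# Larger MPI-style job: request several nodes and a runtime in seconds;
# the site configuration translates this into the sbatch request.
mpi_job = Job("simulate").add_pegasus_profile(nodes=4, runtime=3600)
wf.add_jobs(mpi_job)
```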
So again, this is just an example of one of the workflow tools that we can use to help make sure that your work is getting done and that everything's going through properly.
So please take advantage of those docs to figure out what workflow tool might work for you and what might not. There are always going to be advantages and disadvantages to each of these workflow tools, so it might be good to read up on which one might fit your needs the best. I think that's it. Any questions?
So I think for Pegasus, one of the advantages is that a lot of people might have a Pegasus workflow ready already, and this is a way those people can just start using it here. But I think another advantage as well is that, because you can specify the types of jobs that you want, you can have a mix between smaller jobs that can fit onto one node, plus then maybe an MPI job that comes after that.
On the back end, what it's going to do is just write an sbatch script for you for the parts that need MPI. So really, Pegasus is just going to figure out what steps need to go to what places, and also figure out what jobs need to go where in those places, and then it will just submit those jobs via sbatch. So it should be the same performance, the same as it would be with your regular job submission.
Right, so here you kind of just give it arguments, and these are similar arguments to what you'd see in sbatch. I was just doing a really small job with really low runtimes, but you would just give it a number of nodes here.
But that applies to all jobs, doesn't it? Not like this step in the graph needs 10 nodes and this other step needs one core?