Description
August 24, 2023: NUG Monthly Meeting
00:00 - Beginning of Meeting
11:27 - DVS: Best Practices for Reading and Writing Files
33:41 - Question/Answer
B: All right, great, thanks everybody. Welcome to the August NUG monthly meeting. Today, because we have a lot of content, we're just going to do some announcements first and a little bit of trivia to get everybody warmed up for the topic of the day, which is DVS best practices for reading and writing files.
B: So, let's jump in. As always, please feel free to raise your hand and speak up, or just speak up. We have the Slack space; if you want to put some content or ask questions, that's a good place to do that. So, some announcements; remember, a lot of these are in the weekly email.
B: So if you are wanting more information, in fact, if you want any links, the best place to go is the most recent weekly email. The first announcement is ERCAP, the Energy Research Computing Allocations Process. This is how you would ask for time at NERSC. Requests will be accepted through October 2nd, and if you want more information, we're going to be hosting a session on how to prepare an ERCAP, which I will talk about in a moment as well. So if you want more information, first I would say go to the weekly email and look at the links provided there, but then you may have another opportunity, which I will speak about in a moment.
B
Other
announced
announcement,
a
new
e4s
stack
is
available,
and
so,
if
you
want
more
information,
please
take
a
look
at
again.
The
weekly
email
has
more
information
about
all
the
software
packages.
B
Another
really
important
announcement
is
that
the
nominations
for
the
2023
nurse
early
career
achievement
awards
is
open.
You
may
have
seen
in
an
email
about
that,
but
if
you
didn't,
you
could
probably
try
to
go,
find
it
or
you
can
again
see
all
the
information
in
the
weekly
email.
B
There's
two
categories:
there's
a
high
impact
scientific
achievement
category
and
an
Innovative
use
of
high
performance
computing,
the
information
about
what
that
means
and
what
qualifies
is
in
the
I
think
it's
I
mean
it's
definitely
in
the
weekly
email,
but
also
in
that
dedicated
email
that
was
sent
out
by
Charles
a
couple
of
weeks
ago.
B
Probably
the
eligibility
for
this
is
that
you
need
to
have
used
nurse
nurse
resources
significantly
in
your
work
and
any
nurse
user
who
was
a
student
at
the
time
of
their
cited
accomplishments
or
receive
their
degree
after
the
state
is
eligible.
B
So
the
nominations
are
due
by
Friday
September,
8th,
I
I.
Don't
remember
for
sure.
I
will
double
check.
I.
Think
self-nomination
is
actually
allowed
now
I
think
in
the
past.
It
wasn't,
but
self-nomination
is
a
really
good
way
to
make
sure
that
people's
contributions
are
being
noted.
B
Okay,
so
this
is
the
first
announcement
for
this
year's
nurse
annual
meeting,
so
this
is
sort
of
like
our
big
user
meeting
for
the
year.
It's
going
to
be
fully
hybrid,
but
in
person
attendance
is
welcome
and
oops.
Sorry,
it's
going
to
be
September
26
to
28th
any
questions
about
the
actual
meeting
feel
free
to
contact
me,
but
here's
some
sort
of
a
like
high
level
information
about.
What's
going
to
be
presented
during
that
meeting.
B: Those presentations are a really good idea. We're going to have technical tutorials; these are open to everybody, but they're kind of meant for people who are doing some of the more hands-on work at NERSC and are interested in learning about the API. We're also going to have a company called Xanadu; they have quantum computing simulator software, I think, and they're going to do a really hands-on tutorial with that. We're also going to have a ton of user talks.
B
There'll
be
lightning
talks,
they'll
be
contributed,
talks
so
come
find
out
what
all
what
amazing
research
everybody's
doing
and
then
we'll
also
have
sessions
on
how
to
submit
an
ercap.
So
that's
what
I
was
sort
of
alluding
to
if
you
haven't
ever
prepared
one
or
if
you
have,
and
you
just
sort
of
want
some
updated
information.
You
can
come
to
that
session,
that'll
be
by
Richard
Gerber
who's
going
to
talk
about
how
to
prepare
that
we're
going
to
have
a
session
on
how
to
make
the
most
of
promoter.
B
So
this
will
be
kind
of
an
interactive
panel
session
and
then
we're
also
going
to
have
a
session
on
the
integrated
research
infrastructure,
and
so
you
will
get
probably
lots
of
well,
not
lots
of,
but
you
will
get
an
email
that
has
the
link
to
register
and
the
website
where
all
this
information
is
available
so
keep
an
eye
out.
This
is
going
to
be
a
really
great
event.
A: [inaudible question about lunch]
B: Yes, there will be lunch provided. We have working lunches every day, which is the lightning talks; that's how the DOE allows us to provide meals, if there is a working lunch. So if you come in person, you have to make sure to register, so we know you're coming and can order lunch appropriately. Yeah, that's an important piece of information.
B
There
is
another
sort
of
like
user
annual
Gathering,
the
esnet
Gathering
called
confab.
This
is
happening
in
October,
so
please
see
the
weekly
email
for
the
that
registration
link.
We
are
also
hosting
our
new
user
training.
This
is
so
far.
It's
been
something
that
happens
about
once
a
year,
but
hopefully
start
to
do
that.
More
often,
if
you
have
new
users
joining
your
groups
or
yourself
or
a
new
user,
please
feel
free
to
attend
and
you
can
register
for
that
as
well.
B
There's
an
ideas,
ECP
webinar
on
simplifying
Scientific,
Python
package,
installment
and
usage,
that's
coming
up
on
September
13th
and
there's
an
AI
for
scientific
Computing
boot
camp,
which
is
in
October.
So
here
so
there's
some
other
events.
You
can
go
to
the
events
page
or
in
the
weekly
email
for
all
those
links
and
then,
lastly,
the
applications
are
open
for
the
better
scientific
software
fellowship
program.
B
So
this
is
a
program
that
gives
people
some
incentive
AKA
funding
to
help
them
be
able
to
dedicate
time
towards
some
kind
of
software
engineering
or
software
development
Endeavor,
and
so,
if
you're
interested
in
that,
please
feel
free
to
take
a
look
at
that.
But
that
is
now
open
and
I
guess
I
forgot
to
put
with
the
closing
date
is,
but
again
you
could
probably
Google
this
or
look
in
the
weekly
email
for
the
information
about
that.
B
This
is
supposed
to
say
not
so
trivial
trivia
I
think
it
auto,
corrects
it
it's
not
so
trivial,
trivia,
all
right
so
feel
free
to
put
your
answer
in
the
chat
and
we'll
see
what
people
are
thinking
when
we
do
some
trivia.
So
the
first
question
is:
what
does
DBS
stand
for.
B: Okay, all right, so yes, the answer is B: it stands for Data Virtualization Service. The "visualization service" option was meant to be a little tricky, but it's not visualizing data, it is virtualizing data, which I think we will probably learn a little bit about later today from Lisa. Okay, the next question is: what should you avoid doing?
B: Okay, okay, okay, wow, so everybody says D. Great! I didn't put a trick in here. I think pretty much everyone said D, right? Or, no, Heather said A, I think. Maybe, I don't know.
B: But that is also right; it's just that all of these are good things to avoid. If you said A or B or C, those are individually also correct. So great, okay, awesome. Thanks, everybody, for playing, and I will now pass it on to Lisa, who will tell us more about all of this. If you have questions about any of this, Lisa is going to be a great person to ask. All right, Lisa, go for it.
D: Okay, good. So hi everybody, my name is Lisa Gerhardt. I'm in the Data and Analytics and AI division of NERSC. I also led the user integration effort during the Perlmutter acceptance process, and I'm the user point of contact for file system issues. So today I'm here to talk to you about DVS, and basically about best practices for reading and writing files, around DVS and at NERSC in general. I'm sharing my whole screen.
D
It's
gonna
be
hard
for
me
to
see,
but
if,
if
someone
could
let
me
know,
if
there's
questions,
you
can
feel
free
to
raise
your
hand
or
hop
in
so
you
know,
I
just
want
to
remind
everyone
of
sort
of
the
general
setup.
A
nurse
we've
got
Pearl
letter.
D
We
have
a
whole
bunch
in
pearlander
has
a
whole
bunch
of
really
great
stuff,
a
bunch
of
great
GPU
notes
and
CPU
notes,
and
but
for
this
talk,
I
I
prefer
this
view
of
nurse
which
is
sort
of
the
hierarchical
view
of
the
the
file
systems
that
are
mounted
on
parameter,
and
you
know
we.
It
is
a
hierarchy
because
you
know
we.
We
have
a
trade-off
between
performance
and
capacity.
As
you
go
down
the
stack
you
know
at
the
top.
We
have
our
super
fast
scratch
system.
D
It's
very
quick
and
we
can
get
here
our
numbers
up
as
high
as
six
terabytes,
a
second
out
of
it,
so
it
it
can
really-
and
it's
got
it's
backed
by
all
flash
behind
it.
So
it's
very
it's
very
responsive,
but
we
it's
pretty
limited
capacity.
We
only
have
35
petabytes,
that's
basically
the
same
size
as
what
we
had
on
Quarry
and
we've
expanded
the
Computing
quite
a
bit
for
Pearl
meter.
D
So
the
ratio
of
available
space
to
Computing
is
going
down
is,
is
pretty
small,
but
it's
great
for
good
I
O
for
fast
I,
O
I
mean
at
the
very
top
is
memory
but
I'm,
assuming
that
if
I
appear
listening
to
this
talk,
you
have
stuff
that
you
want
to
keep
permanently.
So
you
know
you
use
memory
when
you
can,
when
you're
Computing,
when
it
comes
time
to
write
out
you
know
their
Scotch
is
your
first
choice.
Then,
after
that
we
have
our
community
file
system.
D
That's
a
big
capacity
file
system,
it's
intended
for
sharing
data
with
projects
at
nurse
and
with
the
public
at
large
over
the
web.
If
you
have
a
large
web
repository,
it's
going
to
live
on
community
and
then
moving
down
a
bit
more.
We
have
our
hpss
tape
archive,
which
is
huge.
D
We've
got
300
petabytes
on
there.
We
could
expand
it
out
and
we
have.
We
do
expand
it
out
as
needed.
So
there's
a
huge
amount
of
capacity
there,
but
of
course
it's
very
slow
because
it's
tape
and
then
we
have
sort
of
outlier
they're
more,
like
I
call
them
like
helper
file
systems,
there's
Global
homes
and
Global
Commons
homes
is
the
place
you
end
up
when
you
SSH
into
the
machine.
That's
where
you
start.
D
That's
where
things
like
your
dot
files,
SSH
config
files
live,
maybe
a
few
helper
scripts,
things
like
that
and
then
there's
Global
common,
which
is
a
place
where
it's
it's
all
Flash
ssds,
and
it's
designed
and
deployed
on
our
systems
to
support
software
Stacks.
So
it's
made
to
support
many
small
files.
Lots
of
repeated
reads
those
sorts
of
things.
D
So
that's
sort
of
the
picture
of
where
things
are
at
nurse
and
I
just
wanted
to
start
this
talk
with
some
general.
This
is
going
to
be
about
DBS.
D
This
talk
and
I'll
talk
about
more
of
what
that
is
later
for
those
of
you
who
don't
who
don't
know
this,
but
the
general
advice
for
Io
at
nurse
for
reading
and
writing
files
at
nurse
is,
if
you're
running
on
a
batch
job,
you
should
go
to
promoter
scratch
file
system,
you're,
going
to
get
the
fastest
rates,
the
slowest,
the
fastest
internet
pads
the
sorts
of
things,
that's
optimized
for
reading
and
writing
from
the
computes
for
our
batch
system.
D
So
that
means
that
includes
things
like
input
data,
if
you're,
if
you're
reading
it
a
bunch
of
data
things
like
configuration
files,
output
data
like
if
you're
reading,
a
writing
and
doing
a
lot
of
I
o.
Your
best
bet
is
to
put
this
on
scratch.
If
you
can
and
I
know
that
there
are
some
folks
who
can't
we'll
talk
about
those
a
little
later
and
then
the
software
for
your
batch
jobs
should
ideally
be
in
a
container
at
this
point,
I
think
you'll
get
the
best
results
and
the
most
repeatability.
D
If
you
can
be
in
a
container
recognize.
That's
not.
You
know
it's
not
always
easy
for
users
to
do
so.
We
have
this
the
global
common
file
system
and
you
can
access
this
at
nurse,
get
Global
common
software
and
then
your
project
name,
every
project
gets
a
directory
and
Global
common.
D
We
start
everyone
with
a
small
quota,
but
it's
it's
pretty
easy
to
get
that
expanded.
So
if
you,
if
you
find
that
too
confining
just
open
a
quote,
increase
request
and
we'll
work
with
you,
but
this
is
where
things
like,
if
you're
doing
it,
installing
a
conda
environment
that
should
go
in
Global
Commons.
D
If
you're
planning
to
run
in
the
batch
system
at
scale,
it
should
go
in
global
common,
pretty
much
anything
that
you
install
with
config
makes
cmake
and
you're
not
planning
to
like
SB
cast
to
the
nodes,
or
it
has
a
bunch
of
libraries
in
it
that
should
all
go
in
global
common.
So
if
you're
doing
these
two
things
and
these
two
things
work
for
you
you're
great
good
job,
excellent,
congratulations!
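To make that advice concrete, here is a minimal Python sketch of the recommended batch-job I/O layout. It assumes the standard $SCRATCH environment variable that NERSC sets for users; the directory and file names are hypothetical placeholders, not anything from the talk.

```python
import os

# Sketch of the layout described above: heavy input/output I/O goes to
# Perlmutter scratch; software comes from a container or from
# /global/common/software/<project>. "run42" is a made-up name.
scratch = os.environ["SCRATCH"]            # set for users at NERSC
run_dir = os.path.join(scratch, "run42")
os.makedirs(run_dir, exist_ok=True)

out_path = os.path.join(run_dir, "output.dat")
with open(out_path, "w") as f:
    f.write("results\n")                   # stand-in for the job's real output
```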
D: And if that covers you, like I said, it doesn't matter how things are mounted. But coming back to this diagram, I just want to point out the file systems that are mounted on the computes by DVS: our community file system, global common, and global homes are all mounted via DVS on our computes. And DVS is new on Perlmutter; we were doing native client mounts before this, but we switched to DVS a short while ago, a couple of months ago, and I'll talk more about that later.
D
So,
if
you
were
using
these
file
systems
before
on
promoter
using
CFS,
maybe
you're
reading
reading
data
out
of
CFS,
maybe
you
had
a
condom
install
in
your
homes
or
something
since
the
switch
to
DVS.
You
may
need
to
change
some
of
the
things
you're
doing
when
you're
running
that
scale.
D
So
what
is
DBS?
It's?
Basically
it's
an
I
o
forwarder.
It
was
developed
maybe
20
years
ago,
maybe
30
Now
by
cray.
We
caught
it
on
our
systems
for
a
quite
a
while.
We
had
it
on
Corey
and
Edison
and
Hopper,
and
all
the
systems
for
as
long
as
I've
been
at
nurse
again
I'm
sure
before
that,
but
basically
what
it
is.
D
It's
a
layer
of
Gateway
nodes
that
Mount
the
file
systems
and
then
they
directly
afford
the
I
o
to
the
computes
and
the
Gateway
nodes
are
situated
inside
the
the
HPC
Network
and
have
are
set
up
to
communicate
and
cache
the
information
very
quickly
and
efficiently.
D
So,
like
I
said,
Perma
DBS
recently
went
live
on
promoter.
It
went
live
on
during
the
maintenance
on
June,
8th
2023,
so
in
just
a
few
months
ago
we
switched
on
DBS,
and
just
this
is
only
on
compute
nodes.
D
You
know,
because
for
the
login
nodes,
where
there's
only
40
nodes,
those
are
all
mounting
these
file
systems
in
the
in
the
native
way.
With
their
native
clients,
they
don't
use
DBS,
so
logins
don't
use.
Dvs
computes
do
like
I
mentioned
before
the
one.
The
file
systems
that
do
Mount.
This
are
community
Global
common
in
homes.
D
So
why
did
we
make
the
switch
to
DVS?
Primarily
it
was
for
stability,
so
the
community
file
system
homes
in
common
are
delivered
by
a
kind
of
file
system
called
Spectrum
scale
used
to
be
called
gpfs.
D
It's
it's
a
really
great
file
system.
It's
very,
very
reliable!
It's
great
for
doing
lots
of
I
o
lots
of
reads
and
writes
and
lists,
and
it
can.
It
can
push
bulk
IO.
It
can
do
big
reads
and
big
rights
really
quickly.
It
has
a
lot
of
great
qualities,
but
it's
not
designed
to
run
in
a
situation
where
you
have
many
many
clients
with
potential
Network
bottlenecks
between
them,
and
so
we
were
finding
ourselves
in
a
situation
a
lot
of
times
where
Spectrum
scale
was
the
communication
between
this.
D
The
computer
and
Spectrum
scale
was
having
issues
enough
that
spectrums.
Whenever
that
happens,
Spectrum
sales
says
hold
on.
We
need
to
wait
until
we
get
this
figured
out.
Everybody
stop
until
I
can
figure
out.
What's
going
on,
because
it's
very
important
that
it
not
drop
any
data
right.
We
don't
want
to
lose
any
data,
so
it's
prioritizing
keeping
this
data
and
So
within
anytime.
There
was
a
Slowdown
or
if
a
node
went
into
an
unusual
State
like
a
it,
was
called
like
a
zombie
state
where
it's
not
responding
quite
properly.
D
Spectrum
scale
would
ask
everyone
to
hold,
and
so
what
this
looked
like
on
the
user
side
was,
you
know
a
pause
when
you
go
to
list
a
directory
and
the
login
node.
It
sits
there
for
10
minutes
and
doesn't
do
something.
You
go
to
close
your
editing
with
the
I
o
with
emacs.
You
go
to
close
and
it
just
sits
there
and
doesn't
do
anything
or
when
it
opens,
you
get
a
blank
screen.
It
doesn't
do
anything.
D
So
there
was
a
lot
of
really
frustrating
pauses
on
the
user
side,
while
the
well
things
were
rectified
and
then
on
the
lot
on
the
compute
side
for
batch
jobs.
This
showed
up
as
as
job
failures
or
much
longer
than
expected.
Job
run
times
things
like
that,
so
we
switched
from
running
the
native
Spectrum
scale,
clients
to
to
running
DVS
on
June,
8th
and
the
way
this
is
set
up
at
nurse.
We
have
24
Gateway
servers
that
serve
as
the
DVS
servers.
D
Each
server
is
configured
to
handle
a
thousand
I
o
threads
at
once,
and
normally
these
things
are
just
like
and
go
right
through
it.
So
this
is
generally
enough.
This
is
size
to
be
enough
to
handle
the
I
o
load
at
nurse
and
another
advantage
of
DBS
is
that
it
can
can
aggressively
cache
data.
D
So,
if
you're
doing
something
where
you're
reading
like
the
same
LD
Library
path,
every
time,
you're
reading
the
same
config
file
for
this
thing
or
you're
reading
pieces
from
the
same
chunk
of
a
file
over
and
over
again,
that
cache
is
really
going
to
have
a
dramatic
Improvement
in
your
performance,
especially
at
scale,
and
we
actually
have.
We
offer
our
file
systems
over
over
DBS
in
two
different
ways.
We
have
them
in
the
read,
write
mode
which
is
sort
of
the
traditional
mode
you
can
read
from
the
file
system.
You
can
write
to
it.
D
It
looks
just
like
you
know,
sort
of
like
a
regular
interactive
file
system,
and
then
we
have
a
read-only
mode
where
all
you
can
do
is
read.
D
You
cannot
write
to
it
and
the
reason
why
this
matters
is
the
way
that
DVS
handles
this
when
you're
doing
a
write
when
you
allow
rights
just
like
with
deep
with
Spectrum
scale
before
you
have
to
be
really
careful
to
not
lose
any
data,
and
so
the
way
that
DVS
handles
the
way
that
a
server
is
assigned
is
different
for
the
read
write
mode
versus
for
the
read-only
mode.
So
for
read,
write
when
a
file
is
created,
it
gets
its
Gateway
server
and
that's
like
it's
forever
home.
D
If
you
are,
if
you
are
accessing
that
file
via
a
read,
write
now
to
DBS
you're,
always
going
to
go
through,
for
instance,
Gateway
three
every
time,
no
matter
what
compute
node
on
you're
on
no
matter
what
time
of
day,
no
matter
where
it
is,
we
also
keep
the
caching
really
low
so
that
if
you
write
to
that
file
from
somewhere
else,
you're
going
to
pick
it
up
versus
read-only
that
file.
If
you
try
and
access
it
via
the
read-only
point,
you
could
get
any
one
of
those
24
gateways.
D
It's
going
to
give
you,
whichever
one
is
the
least
lightly
loaded
and
then
there's
a
cache
both
on
the
client
side
and
on
the
server
side.
That
will
keep
that
information
right
there
on
the
Node.
So
you
don't
have
to
go
all
the
way
back
to
the
file
system,
which
you
know
is
generally
fast.
But
if
you're
doing
that
64
000
times
those
things
really
add
up
yeah.
D
So
those
two
different
mounts
have
two
different
sets
of
behavior
and
right
now
at
nurse
by
default.
If
you
just
use
the
the
regular
like
slash
global
global
blah
blah
path,
you
will
get
the
read-only
mount
for
Global,
common
and
everything
else.
You
will
get
the
read
write
note,
but
for
these
file
systems
we
also
have
twin
read-only
amounts
that
you
can
use
for
everything
if
you
change
the
path
a
little
bit
and
I'll
talk
more
about
that.
So,
if
you
want
the
read-only
behavior,
you
can
just
use
a
slightly
different
path.
D: So, how do folks interact with DVS? You interact with it in two ways. First, you can interact with it intentionally, by reading or writing a file on CFS. Like I said, some people have really large data stores, for instance.
D
You
know
they
have
a
petabyte
of
data
that
they're
doing
analysis
on
they
don't
always
know
which
file
they
need
in
there
that's
going
to
be
somewhere
in
there
and
it's
too
costly
to
Stage
that
up
all
up
on
Scratch
and
keep
that
up
to
date.
So
you
know
the
data
comes
in
from
an
external
Source:
it
lands
on
CFS,
they
do
some
analytics
and
then
maybe
it
goes
to
hpss
and
those
things.
So
they
work
with
the
data
on
CFS
because
of
the
size
of
it.
D
It's
also
included
with
that
are
things
like
config
files
or
startup
files,
or
sort
of
things
that
you
would
keep
with
this
data
set
so
that
you
can
kind
of
know
and
understand
it
and
and
read
it
in
and
then
the
other
way
that
users
intentionally
interact
with
devious
is
by
using
Global
common,
so
anytime
you're
using
a
module,
a
nurse
consult
module
you're
using
something
on
global
common.
D
If
you've
installed
software
on
global
common,
then
you
a
new
source
that
you're
using
you're
using
DBS,
but
there's
a
lot
of
ways
that
users
unintentionally
interact
with
VBS.
D
So,
for
instance,
if
you
have
a
conda
environment
and
you
install
it
just
without
changing
anything
by
default,
it's
going
to
go
in
your
home
directory
and
then,
for
instance,
if
you
have
something
in
your
dot
file,
like
you
really
like
this
kind
of
environment,
you
want
to
have
it
loaded
whenever
you
log
in
which
is
great,
you
know
it's
there
and
it's
set
up,
but
then
you
go
and
submit
a
batch
job.
That's
maybe
doing
something
else.
D
It's
going
to
grab
that
whole
conda
environment,
lookup
thing
that
happens
in
your
home
whenever
you
start
up
and
drag
it
along
with
your
job
and
do
that
at
scale
when
things
start
up
and
and
for
Sometimes
some
folks,
they
want
that
kind
of
environment.
Some
people,
don't
so
it's
sort
of
an
unintentional
consequence
of
having
this
there
and
set
up
folks
also
tend
to
default
to
doing
a
software
installing
your
home,
you
could
just
try
and
everything
out
you
put
it
in
there.
You
know.
There's
things
like
scripts.
D
Sometimes
you
can
have
hidden
dependencies
in
your
software
that
that
end
up
calling
things
into
your
home
and
they're.
Just
not
it's
just
not
obvious.
The
same
sort
of
thing
happens
with
with
CFS.
You
know,
because
it's
a
shared
space
and
it's
a
little
older
than
Global
common.
Some
groups
tend
to
put
their
software
Stacks
there
and
so
you'll
run
into
a
problem.
When
you
try
to
use
this
at
scale
and
then
there's
also
things
just
hidden
configuration
files
and
dependencies
I
think
that's
true
for
any
file
system.
A
There's
a
quick
question
in
the
chat
Brad's
asking:
should
you
not
write
to
a
file
through
rewrite
and
then
try
to
read
it
immediately
through
read
only.
D: So you'd want to wait for that cache to flush out, and it's a little tough to get the timing right. We generally don't advise using the read-only mount for something that you're actively writing, just to be extra safe. But what I think you would end up with is: there's a chance you might get the old file, and not the new information that's in there, and I think, in general, that's undesirable for some people.
E: [inaudible]
D: It depends. If you have a fixed data store, which is what a lot of the folks that we've seen run into these problems have, they have a big set of data and that data is not changing. If they're coming at it through read-write, they're not going to have a good time. So if you have that kind of thing, we definitely want to steer you to the read-only mount, for sure.
D
So
let
me
give
you
an
example,
sort
of
what
this
looks
like,
so
we
have
two
different
kinds
of
loading
Behavior
going
on,
so
this
is
the
first.
This
is
sort
of
the
good.
This
is
what
you
should
do,
so
you
have
a
job.
It's
a
hundred
note
job
and
it's
using
it's.
D
It's
using
a
conda
install
that's
been
installed
in
global,
common
and
you're
running
it's
on
CPU,
so
you're
running,
128,
procs
per
node,
and
so
when
it
starts
up
all
12
800
processes
of
these
are
spread
evenly
across
the
24
DBS
servers.
D
The
file
that
leads
to
your
content
environment
is
fetched
once
on
the
DBS
side,
and
then
it
sits
in
the
cache
and
it's
quickly
read
and
the
job
starts
right
away,
and
if
we
contrast
this
against
the
behavior
that
we
get
if
the
conda
install
is
in
home
same
100
note
job,
the
only
difference
is
conda
installs
at
home.
D: Because home is mounted read-write, all of those processes funnel through the single gateway server that owns those files, and its queue backs up. It got to the point where we would have to reach out to the user, who's really not doing anything wrong; it's kind of natural to put conda in your home, and they're not deliberately trying to cause this problem. But it would cause what basically amounted to a center-wide outage on the computes. So this is really undesirable behavior for DVS, and we worked with our vendor to change the scheduling algorithm. As of the 16th, there's now something on there called the fairness algorithm.
D
So
what
it
does
is
when
things
start
to
get
really
high,
instead
of
serving
in
a
first
in
first
out
it'll
cycle
through
all
the
users,
so
maybe
you're
sitting
here
with
your
100
node
condo
thing
pegged
at
a
thousand
just
waiting
and
it's
gonna
say
and
then
someone
else
comes
up
and
says:
I
just
want
a
single
file.
Please
and
they're
like
here's.
Your
file
go
ahead
and
it'll
keep
everybody
else
moving.
Well,
it
slowly
turns
through
this
backlog
from
this
one.
So
it's
it's
an
improvement
in
that
one
uses.
D
Behavior
is
no
longer
adversely
affecting
all
the
other
users,
but
it's
still
something
you
know
your
job
will
be
slow,
so
it'll
be
better.
If
you
can
move
this
conda
install
or
whatever
you're
using
off
of
the
read
write
on
onto
the
suite
only.
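Here is a toy Python sketch of the round-robin idea Lisa describes; it illustrates the concept only, it is not the vendor's actual DVS scheduler, and every name in it is made up. Instead of draining one user's huge backlog first-in-first-out, the server cycles across users, so a single-file request is served promptly.

```python
from collections import deque, OrderedDict

def fair_order(requests):
    """Serve (user, request) pairs round-robin across users instead of
    strictly first-in-first-out, so one user's backlog can't starve others."""
    queues = OrderedDict()                      # per-user FIFO queues
    for user, req in requests:
        queues.setdefault(user, deque()).append(req)
    served = []
    while queues:
        for user in list(queues):               # one request per user per pass
            served.append((user, queues[user].popleft()))
            if not queues[user]:
                del queues[user]
    return served

# A user with a big backlog no longer blocks the single-file request:
reqs = [("big_user", f"read chunk {i}") for i in range(5)]
reqs.append(("small_user", "read one file"))
print(fair_order(reqs)[:3])
# [('big_user', 'read chunk 0'), ('small_user', 'read one file'), ('big_user', 'read chunk 1')]
```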
D: So, okay, what should you do? I think I've kind of covered this in parts, but I wanted to put a summary slide in here. If you just want to read from CFS, if you have data you're not changing and you just want to read, you can use the /dvs_ro path instead of /global. So if you have something that goes /global/cfs/cdirs/yourproject/mega_important.config, you just change out that first part so it becomes /dvs_ro/cfs/cdirs/yourproject/mega_important.config, and then you're getting the read-only one.
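A minimal Python sketch of that path substitution; the helper function is a hypothetical illustration, not a NERSC-provided utility, but the /global/cfs/ to /dvs_ro/cfs/ prefix swap is the one described above.

```python
def to_dvs_ro(path: str) -> str:
    """Rewrite a default CFS path to its read-only DVS twin, as described
    above: /global/cfs/... -> /dvs_ro/cfs/...  (hypothetical helper)."""
    prefix = "/global/cfs/"
    if path.startswith(prefix):
        return "/dvs_ro/cfs/" + path[len(prefix):]
    return path  # not a CFS path; leave it unchanged

print(to_dvs_ro("/global/cfs/cdirs/yourproject/mega_important.config"))
# -> /dvs_ro/cfs/cdirs/yourproject/mega_important.config
```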
D
So
if
all
if
you're
running
thoughts
of
no
job-
and
they
all
want
this
config-
it's
going
to
go
super
quick
and
if
you
want
to
run
condo
environments
at
scale,
our
first
recommendation
is
use
a
container.
That's
always
going
to
be
better,
but
if
you
can't
do
that,
use
Global,
common
and
there's
some
instructions
on
our
webpage
about
how
you
would
get
your
conda
installed
to
move
to
the
global
common
path
and
then
there's
a
few
other
things.
D
That
kind
of
like
these
are
more
like
edge
cases
that
you
want
to
avoid
and
one
of
the
things
another
place
where
folks
run
into
trouble.
A
lot
is,
if
you
have
an
ACL
for
DPS
ACLS
live
in
the
extended
attributes,
part
of
the
file
which
unfortunately,
forces
DVS
to
go
back
to
the
file
system,
look
something
up,
and
so
ACLS
tended
to
defeat
any
caching.
That
might
happen
even
on
a
read-only
system.
D: So I think that's all I had for today. I just want to point out that this work is ongoing; we are always trying to make the I/O experience better. I think DVS accesses have improved quite a bit over the last week, and hopefully you all feel the same, but I know there's still some work we need to do, where we need to investigate why sometimes things are a little slower than we want, and we're actively working in this area.
D
So
if
you
have
any
questions
about
what
I
talked
about
today,
if
you
see
any
unexpected
results
or
performance,
you
know
always
open
a
ticket.
It's
always
more
data
for
us.
It's
always
helpful
and
I
think
that's!
That's
all
I
had.
A: I thought that was really great, Lisa. Thank you. [Applause]
A
I
I
had
one
quick
question
real
fast.
The
atls
is
that,
like
making
Unix
groups
and
giving
different
permissions
to
each
one
or
is
there
like
some
other
common
examples
of
the
ACLS?
That
I
would
help
me
understand
a
little
better.
D
Usually,
when
I
see
folks
using
ACLS
for
is
like
they
have
a
group,
but
they
want
to
share
with
like
one
or
two
other
people,
and
so
they'll
add
those
in
as
an
ACL
rather
than
you
know.
Maybe
they
don't
want
those
folks
to
have
access
to
the
whole
directory
or
something
it's
fine.
It
gives
people
fine-grain
control,
directive,
access.
C
This
is
Howard
Richard
from
Llano
I.
Have
a
question?
Can
folks
hear
me
Lisa?
This
is
very
interesting
and
very
timely
for
us
with
our
problems
with
Crossroads.
What
is
the
deviate?
So
you
mentioned
this
baroness
algorithm.
Do
you
know
is
that
a
patch
to
DBS
is,
or
is
that
of
something
that
the
vendor
is
released
with
a
new
DBS
like
RPM,
updates
that
we
should
try
so.
D: [inaudible]
A
Lisa
there's
one
more
thing
that
came
up
in
chat.
There's
there's
been
some
discussion
about
already,
but
maybe
you
would
like
to
weigh
in
or
or
not,
but
they
said.
The
question
is:
are
you
supposed
to
use
a
dtn,
a
data
transfer
node
to
transfer
from
scratch
to
CFS
or
as
a
login,
node?
Okay
for
a
few
terabytes.
D
I
think
I,
my
answer
is
Globus
is
always
easier
because
it'll
retry
for
you
and
keep
going-
and
you
don't
have
to
keep
your
your
your
thing
open,
there's
nothing
for
a
few
terabytes,
probably
a
login
node
is
okay.
If
you're
talking
tens
of
terabytes
or
something
really
long
running,
you
need.
D
You
know
you're
going
to
get
you
likely
might
get
interrupted
and
also
you
just
need
to
be
sharing
the
node
well
with
others.
So
I'd
recommend
Globus
in
general
foreign
I,
see
there's
something
about
the
difference
between
Global,
common
and
home
yeah,
so
homes
is
homes
is
basically
intended
for
just
for,
like
minimal
environment
setup
scripts.
At
this
point,
if
you
have
a
software
stack
and
you
want
to
run
at
any
scale,
I
think
Global.
D
Common
is
the
place
for
you
and
the
difference
right
now,
they're,
both
backed
by
flash,
which
is
much
faster
than
Spinning
Disk,
but
the
the
size
of
the
individual,
the
block
size
on
global
common
is,
is
optimized
for
software
installed.
So
it's
a
much
smaller
block
size
and
it's
got
the
read-only
by
default
mount
on
the
computes.
So
those
are
the
that's
what
the
advantage
common
has
over
homes.
D
And
I
forgot
to
mention
this
in
my
talk,
but
when
we're
talking
about
scale,
it's
not
just
one
job
of
a
thousand
nodes.
If
you
have
a
thousand
one
node
jobs,
it's
going
to
have
the
same
effect.
So
just
think
about
this.
When
you're
submitting
lots
of
jobs,
you
might
get
a
light
queue
day
and
all
your
jobs
will
start
at
once,
and
the
same
thing
will
happen.
B
Thank
you
so
much
Lisa
were
there.
Other
questions
did
all
the
questions
from
the
chat
get
answered.
If,
if
you
ask
a
question
in
chat
and
it
got
lost
somewhere,
please
feel
free
to
unmute
and
ask
your
question
now.
B
Okay,
well,
thank
you.
Everyone
for
joining
Thank,
you
Lisa
for
your
wonderful
presentation,
I
think.
Hopefully
that
will
be
really
helpful
for
everybody.
We
will
be
posting
the
slides
and
the
recording
of
today's
meeting
on
the
website.
So
if
you
missed
something-
and
you
want
to
hear
it
again-
you
can
always
go
check
it
out
and
feel
free
to
email,
myself
or
Charles
is
usually
here
he's
not
here
today.
B
If
you
have
any
other
questions
for
us,
but
hopefully
see
you
next
month
for
the
annual
meeting
in
person
or
or
remote,
this
is
totally
fine,
but
if
you
want
to
you
can
attend
in
person.
Okay,
thank
you.
Everybody.