Description
The 27th Annual CESM Workshop will be a virtual event. The Workshop will begin with a full-day schedule on 13 June 2022, with presentations on the state of the CESM, presentations by the award recipients, and two presentations from our invited speakers in the morning, followed by 15-minute highlight and progress presentations from each of the CESM Working Groups (WGs) in the afternoon.
To learn more:
https://www.cesm.ucar.edu/events/workshops/2022/
B
All right, I'll give people another minute. A minute to join; it looks like people are trickling in. I'll give people another minute or so before we dive in.

B
All right, I'll go ahead and get started. Welcome, good morning, and welcome to the Software Engineering Working Group session. I'm one of the co-chairs of the Software Engineering Working Group, Bill Sacks from NCAR CGD. The other co-chair is Ligia Bernardet from NOAA. Do you want to say a brief hello for those who don't know you?
B
We can use this time to have a more extended discussion that emerges from any of the talks this morning, or really to discuss any other topics that people would like to discuss in this group. And then, following that, we are planning to keep the Zoom session open from 12 all the way until one o'clock Mountain Time, for a sort of informal networking and conversation over lunch. So I hope you'll stick around and join us.

B
You know, it's always a little hard having these remote meetings, so we're hoping to have some recreation of some actual socializing at that time, so we can catch up and chat. Just a few logistics: the meeting is being streamed to YouTube and recorded. Depending on whether there are interesting things in the chat, we might also record the chat, in which case I believe that also records private chat.

B
So don't arrange your next date in the private chat here. For questions, you can either use the raise-hand feature on Zoom or type your question or comment in the chat. People who are on here with the web-based interface to Zoom don't, I think, have the raise-hand feature, so you can use the chat if you have a question for the speakers.

B
We'll give speakers a warning at the 16-minute mark, though you can let us know if you want an earlier warning as well. And then one more thing: some of us are going to go this evening to the Rayback Collective in Boulder, meeting around 5:15 pm, to socialize and have some dinner, and we hope, if you're local and able to, that you can join us for that.

B
We hope to see some of you there. And then finally, just a reminder of the NCAR code of conduct, which applies during this meeting and all other meetings: please try to offer constructive feedback, share the air, acknowledge teamwork, encourage innovation, show appreciation, and consider new ideas.
C
All right, great. So I think that, given the variety of stuff I'm talking about, if there are questions in the chat that are relevant to a given topic and there's some confusion, please just stop me. I'm happy to answer, because I do have some more time, and so we can spread out some of the questions; sometimes it's hard to get them all in at the very end. So what I want to talk about is what I believe are the transformative capabilities, at least in my opinion, of the new coupling infrastructure.
C
It enables new science to be done with CESM and also enables completely new collaboration with modeling efforts, not just in the United States at NOAA, but also with two European efforts that are also using CESM, and I'll have some details of this coming up. The outline is very terse; there's a lot of information coming in each of these bullets. The first thing I want to talk about is how we communicate information between components with this new coupling infrastructure.

C
I think some of you have already seen some of these presentations, but I'll be talking about some of the new milestones we've achieved with this new coupling infrastructure, which is called CMEPS, and I'll try to get all the acronyms defined.

C
One thing I think is extremely important is enabling an infrastructure that has the capability to do what I call hierarchical modeling development, which is the ability to selectively turn off feedbacks and isolate the parts of the system that you're most interested in developing with your science, and that's, at least to some extent, enabled by the data models.

C
We have the new data models, which are the Community Data Models for Earth Prediction Systems, or CDEPS. Then I'll wrap up with what we're trying to do to address the challenges of ultra-high-resolution simulations, particularly our collaboration with EarthWorks right now, and I'll summarize, finally, with the new ESMF release (that's the Earth System Modeling Framework that we're using), which addresses some of the bottlenecks that we found when we tried to start up these ultra-high-resolution simulations.
C
The first question: how do you enable effective communication of components with each other? As you all know, the CESM model has what we call a hub-and-spoke architecture. Components exchange boundary data (currently it's still two-dimensional boundary data) with a central mediator, which is this hub, called the NUOPC Community Mediator for Earth Prediction Systems. That hub, CMEPS, is targeted to be able to regrid data from a set of source components to your target component.

C
It merges that data using the fractions that are needed to merge, say, data from ice, ocean, and land to the atmosphere, as an example, and it also carries out the atmosphere-ocean flux calculation, downscaling of data from the land ice component, and so on. The key part is that, to be able to communicate with CMEPS, each component has what's called a NUOPC cap, which is just a very lightweight translational layer.

C
The cap takes the data structures in the source component, translates them to ESMF/NUOPC, and then the mediator acts on them, with the data sent via the connectors, these lines here. Then, when you get data back to your target component, you need to translate the data structures from ESMF/NUOPC back to the data structures for your component, and that's all that the cap does. Sorry, I went ahead a slide. So this is a very busy slide, but it contains a lot of the updates that we've done.
C
What I'm trying to show in this slide is that these are the prognostic components that are being used, not just by CESM but by other modeling efforts. So if you look at CESM and at what its components are, it has CICE6, MOM6, WAVEWATCH III, CISM, MOSART for the river along with a new component called mizuRoute, CAM, and CTSM. And what you see here in the gray shaded ovals is that those components, CICE6, MOM6, and now WAVEWATCH III, are actually being shared directly with NOAA.

C
The other thing to point out is that the other bullets in color, such as the oceans MPAS-Ocean, NEMO, and BLOM, are really being used by other modeling efforts. The MPAS ocean is being used by EarthWorks, NEMO is being used by CMCC in Europe, and BLOM is being used by NorESM in Norway, and they're using this exact same coupling infrastructure.

C
Similarly, the bullets in red, the UFS atmosphere and the land-specific components Noah-MP and LM4, are being used by NOAA. So what you have here is a coupling infrastructure that truly enables interoperability between several modeling efforts, and there is interest in going past this; in particular, there's interest from the Navy in potentially using CMEPS in their modeling system.
C
The key point that is different between NUOPC/CMEPS and our older coupling infrastructures is that CMEPS is a component that is identical in functionality to, say, the atmosphere component: it is its own unit and can be shared. What's different, and what I haven't shown on the slide, is the drivers that drive this whole system. Those drivers are very lightweight, they drive the temporal evolution of the system, and they are model specific.

C
So the only difference between these various component utilizations is the drivers. And of course, what we also have is the ability (I'll talk about this a bit later) to swap out prognostic components for data components in these various scenarios, and that enables you to selectively turn off feedbacks in the system. That is very powerful for being able to do hierarchical model development at the component level, where you will drive, say, CAM with a data ocean or a slab ocean.
C
So we have written a completely new set of data models, which I described briefly last year, called the Community Data Models for Earth Prediction Systems. They are NUOPC compliant and CMEPS compliant, and they have a lot of new functionality that will be very powerful, both for sharing with other modeling efforts and for being used within a component to ingest forcing data.

C
So what are some of the benefits of CMEPS? For one, it's easier to introduce new grids. One of the things that ESMF/NUOPC does that is a really big change in the system is online regridding: it creates the route handles, that is, the mapping weights between a source component and a destination component, at runtime and in parallel. Before, when we were using our old coupling infrastructure, for a fully coupled pre-industrial run you needed to create 25 offline mapping files for the set of component combinations.
C
Now we are down to four, and the reason it's four is that these are runoff-to-ocean mapping files that need to be customized; at some point in the future, ESMF is very interested in creating those as well. But imagine all of the overhead: you needed not just to create the mapping files but to put them into the scripting system so that the system would know about them. So it's disk space, it's scripting complexity, and it made the generation of new grids extremely cumbersome.

C
Another new addition: before, when you introduced a new grid, you also needed to create offline land fraction files, because the fraction that is land on the land grid is obtained by mapping the ocean mask, which is one or zero, conservatively, using first-order interpolation, to the land grid. So, in addition to creating the mapping files, you also needed to do this: each new component grid required generating offline fraction files, again putting them in the scripts and figuring out how you were going to access them.

C
So now, what we have is that inside both CDEPS and the CTSM cap, these land fraction files are generated at runtime by conservatively remapping the ocean mask to the land fraction, and so you no longer have to do this. Again, a really great simplification.
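As a rough illustration of the land-fraction idea just described (a sketch only, not the actual CDEPS/CTSM cap code, which is Fortran and uses ESMF-generated route handles), here is a minimal Python version, assuming the conservative mapping weights are already available:

```python
# Minimal sketch: first-order conservative remapping of a 0/1 ocean mask onto
# the land grid to obtain the land fraction at runtime. The weight triplets
# (dst, src, w) are assumed to come from the online regridding step; w is the
# fraction of destination (land-grid) cell `dst` overlapped by source cell
# `src`, and the weights for each destination cell sum to 1.
import numpy as np

def land_fraction(ocean_mask, dst_idx, src_idx, weights, n_land_cells):
    """Return land fraction on the land grid: 1 minus the remapped ocean mask."""
    ocean_frac = np.zeros(n_land_cells)
    # Accumulate w * mask into each destination (land-grid) cell.
    np.add.at(ocean_frac, dst_idx, weights * ocean_mask[src_idx])
    return 1.0 - ocean_frac

# Tiny example: two land cells, three ocean cells (mask: 1 = ocean, 0 = land).
mask = np.array([1, 0, 1])
dst = np.array([0, 0, 1, 1])       # destination land cells
src = np.array([0, 1, 1, 2])       # overlapping ocean cells
w   = np.array([0.6, 0.4, 0.5, 0.5])
print(land_fraction(mask, dst, src, w, 2))   # -> [0.4 0.5]
```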
C
This is hot off the press. It doesn't relate directly to CMEPS, but it relates to... sorry, I don't know why this is happening.

C
Okay, it relates to the ability to introduce new grids, through a completely new land surface dataset generation tool.

C
Before, when you introduced a new grid for the land, you needed, as part of that process, to create 17 offline mapping files and then use those mapping files in a surface dataset generation code that only ran on one processor. As we were going to higher resolutions, this took more and more time: one metric is that it took over two days to generate a surface dataset at 7.5 kilometers for an MPAS grid.
C
Now we have a completely new land surface dataset generation tool, using ESMF/NUOPC and parallel I/O, which has totally changed this: we can create the same MPAS surface dataset in 10 minutes. You do need to use a lot more processors, but you now have scalability. And some of the new raw datasets, which those 17 offline mapping files used to handle, are getting to higher and higher resolution.

C
So one of the challenges we had was: how could you map a 30-second soil texture dataset, which has 724 million points, to create a new surface dataset? You couldn't before; you were reaching a bottleneck. Now we can do it.
C
Also, we are leveraging ESMF dynamic masking, which is this great capability that can be used not just for mapping, but also for doing things like calculating the standard deviation of surface elevation statistics. So I think those three things that are outlined are where ESMF, NUOPC, and CMEPS are real game changers for what we can do.

C
This is very extensible, user friendly, and easy to use, because all of the complexity is hidden in the ESMF library. And so not only can you run Greenland and Antarctica in one simulation now, but we have the capability in CMEPS to add an arbitrary number of ice sheets that can be coupled at runtime, and we have carefully validated this. A lot of this work has been done in close collaboration with Bill Sacks, and so I think, again, Bill Lipscomb and his group are extremely excited to be able to do this.

C
So you only need to pass in one field from the ocean to CMEPS, where the multiple levels are contained in what is called an ungridded dimension, and you have this new Antarctic ocean coupling capability.
C
Another thing we have been able to do is improve computational efficiency using the new ESMF managed threading. Basically, the idea is that if you had a component that was threaded four ways and another component that was not threaded, then before, if they had to share the same nodes, the second component, component B, could only use one quarter of the cores in the node. That led to a lot of idle cores and poor HPC resource utilization, so ESMF last year introduced a completely new capability called managed threading.

C
I want to point out that it's there, and it's a high priority for us to try to get it working out of the box as we move forward. In terms of the atmosphere-ocean flux calculation, we have a completely new capability that I think will greatly facilitate scientific development, particularly if the ocean and the atmosphere and ice grids are at two different resolutions, and that's using what's called the exchange grid. Basically, the exchange grid is simply the union of a target and destination grid.
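As a loose illustration of the exchange-grid idea (a sketch only, not the CMEPS implementation, which is Fortran built on ESMF), fluxes are computed on the overlap cells formed by laying the two grids on top of each other and then aggregated back to either parent grid with area weights:

```python
# Rough sketch of the exchange-grid idea: each exchange cell is an overlap of
# one atmosphere cell and one ocean cell, so the flux can be computed from the
# highest-resolution information on both sides and then area-averaged back to
# either parent grid. All names and the bulk formula are placeholders.
import numpy as np

# Hypothetical exchange cells: (atm_cell, ocn_cell, area).
xgrid = [(0, 0, 0.7), (0, 1, 0.3), (1, 1, 1.0)]

def flux(t_atm, t_ocn):
    # Placeholder bulk formula; the real calculation is far more involved.
    return 1.5 * (t_ocn - t_atm)

def fluxes_on_atm_grid(t_atm, t_ocn, n_atm):
    """Compute flux per exchange cell, then area-average onto atmosphere cells."""
    f_sum = np.zeros(n_atm)
    a_sum = np.zeros(n_atm)
    for ia, io, area in xgrid:
        f_sum[ia] += area * flux(t_atm[ia], t_ocn[io])
        a_sum[ia] += area
    return f_sum / a_sum

print(fluxes_on_atm_grid(np.array([287.0, 288.0]), np.array([290.0, 286.0]), 2))
```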
C
So what does this imply? Thanks to Adam Herrington for showing how this worked initially in a tropical cyclone case. The main idea in the tropical cyclone tests is that, when you look at the stresses between the atmosphere and ocean while running a very high resolution atmosphere, say a quarter-degree spectral element with a one-degree ocean, you want the stresses to align with the winds, and that's what's happening here.

C
What you have here is an excellent example of the latent heat flux mapped to the atmosphere grid when you do the calculation on the exchange grid; on the left you can see that when you do it on the ocean grid, you get a very blocky structure. So this is now in CMEPS. Extensive simulations were done, including coupled hundred-year runs, and we have been given the go-ahead to make this the default, and that is what's going to happen.
C
I just want to add that it's not just us looking at the exchange grid: at NOAA, the seasonal system has also explored using the exchange grid, because currently they have been doing their atmosphere-ocean flux calculation in the atmosphere, and this enables you to actually bring it into the mediator, parallel to what we're doing in CESM.

C
In addition, what Ufuk Turuncoglu has done is bring in the ability to use different science for calculating the atmosphere-ocean flux, by treating the CMEPS mediator as a host model for the CCPP framework.

C
Dom will talk about this a bit later, but what this enables you to do now is have various atmosphere-ocean flux calculations that can be explored, all using the exchange grid, which is the correct grid to use both for refined grids and, regardless of refinement, whenever the atmosphere-ocean flux is computed on two grids that are very different. And finally, we're looking at new capability in CICE.
C
We are sharing this code with EMC, and it's going to be the default in upcoming CESM3 development tags. This brings in not just provenance and collaboration and new capabilities, but the ability to explore new functionality that we simply couldn't do before with our old WAVEWATCH code.

C
In addition, we're looking at new WAVEWATCH III and CICE coupling. You need that because the wave field breaks the sea ice into small floes, and the sea ice concentration then feeds back to the wave field. What we're trying to do is send 25 new spectral data fields from the wave model to CICE, and with ESMF/NUOPC you don't need to send 25 separate fields: you can actually pack all of these into one field that has an ungridded dimension covering all 25 fields.
C
So again, this is just another way of greatly simplifying coupling that would have been much harder to do before. And finally, very briefly, we are supporting CESM to do DART ensemble-filter data assimilation.

C
I haven't shown that yet; that is being worked on and validated. But at the same time, we're working with the JEDI group, which has a new operational data assimilation system, to create a layer on top of the NUOPC driver that would enable coupled data assimilation, via sending three-dimensional states to JEDI, thereby permitting a path forward to do data assimilation.
C
The key point is that, with these data models (and this was true with our old data models), you're able to ingest multiple data sources, each with completely different spatial and temporal resolutions, and also customize how they are handled. The key point is that all data is read in with parallel I/O, and you can now easily ingest both 2D and 3D forcing fields.

C
What's also new is that we have an online interface for CDEPS, where the prognostic components can call the shared code in CDEPS that does the time interpolation and online regridding directly from the component. That's starting to be used increasingly throughout the system, say for getting nitrogen deposition read in in the atmosphere NUOPC cap and so forth, and it's used extensively in CTSM already.
C
The collaboration with NOAA, who are using CDEPS now in their operational system, is that we have brought in a whole new set of data forcings, particularly for the atmosphere, that test and enable CDEPS to run with many new forcing scenarios that we simply couldn't do before. So this is the strength of having community software that is used by more than one modeling system: they bring new functionality and test it in totally different scenarios, so what you have is a much more extensible and robust system.

C
So you're not running the MPAS atmosphere; you're running the MPAS dynamical core in CAM. The MPAS ocean is what's going to be used, and in terms of being able to run on the MPAS grid, we have an MPAS sea ice model that is analogous to CICE6 but has a different dynamical core; it will use CTSM and, of course, CMEPS. So now we're taking CMEPS and exercising it in ultra-high-resolution configurations, and we ran into some problems when we started doing this, which is what's next.
C
Thank you very much. So, with ESMF, we encountered problems with memory scalability in terms of ingesting the grid for this ultra-high resolution, and we ran into several other memory bottlenecks. We worked very closely with the ESMF group, who created the ESMF 8.3 release on June 8th, and basically the main feature of this release that was applicable to us is that we now have scalable mesh creation from file.

C
This was done in very close collaboration with Jim Edwards and Bob from CSEG and with the ESMF group. So we now have scalable mesh creation, and the parallel I/O library that is used inside ESMF, which was an old PIO library, has now been migrated to PIO2, and lots of other memory scalability issues have also been addressed. So suddenly we can now run 7.5-kilometer runs with no problem, and we're looking at going further now.
C
I want to really give thanks. I didn't include the detailed list of people that have helped with this, because the list was getting very long, but I do want to have a call-out to the ESMF core team, Jim Edwards, Ufuk Turuncoglu from ESMF, and Rocky Dunlap, who really stepped up, and Bill Sacks, who stepped up to get some of these features in place.
A
Thank you so much, Mariana. We have time for one or two quick questions, so if you have a question, please type it in the chat or raise your hand, and then you will be able to unmute and ask your question.

C
I think the answer is yes, but I have to check with Rocky. I'm trying to remember the limitations, which meshes are limited, but I'll get back to you on that, Frank.
A
All
right
any
additional
questions.
D
Yeah,
I
just
wanted
to
clarify
something
that
marianna
said
earlier
about
the
esmf
aware
threading
that
does
work
out
of
the
box.
It's
ready
to
go.
A
Okay, I see Steve has his hand up, but we are at time, so I'm going to ask Steve to move his question to the general Q&A session at the end, and we're going to move on. Thank you, Mariana. Our next speaker is Dom Heinzeller, and he's going to talk to us about the Common Community Physics Package, the CCPP update. So, Mariana, you can stop sharing.

A
You can go ahead and share yours.
E
It works now, thanks. All right, so good morning, everyone. On behalf of the CCPP developer team at the DTC, I want to give you a brief update and summary of the Common Community Physics Package.

E
I've talked to this audience before, but to refresh your memory, this is a bit of the history of the CCPP. The work on it started in 2017, to develop the future UFS infrastructure for atmospheric physics. The goal behind this effort was to facilitate the improvement of physical parameterizations and the transition from research to operations.

E
The idea is to make it easier to add new schemes, modify them, or transfer them between different models. CCPP consists of three different packages: an infrastructure component, the framework; a library of compliant parameterizations, the physics; and comprehensive documentation. Here, on this image on the right from Rocky, you can see the CCPP in the Unified Forecast System, and we'll discuss this a little more on the next slide.
E
Oh, what I also wanted to say is that nowadays there is an experimental mode to run chemical parameterizations in the CCPP, not just physical parameterizations. The CCPP framework itself is not part of the actual model code; it's more like a code generator, also referred to as a data broker, that relies on documented interfaces for both the host model and each of the physics schemes. We refer to these as metadata tables. At build time, the framework parses the tables and connects the variables by one of their attributes, the CCPP standard name.

E
It then generates Fortran interfaces between the host model and the physics, and, as this figure suggests, these interfaces can hook up the physics with the atmospheric driver in different ways: for example, in a traditional way, where you'd call the physics right from the atmosphere driver, but also inside the dynamical core, which is known as fast or inline physics.
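As a loose illustration of the data-broker idea described above (the dictionaries, names, and units here are hypothetical; the real CCPP framework parses Fortran metadata tables and generates Fortran caps), matching by standard name is what lets units and dimensions be checked once, at build time:

```python
# Loose illustration of the "data broker" idea; not the CCPP implementation.
host_vars = {
    "air_temperature": {"local_name": "state%t", "units": "K"},
    "air_pressure":    {"local_name": "state%p", "units": "Pa"},
}

scheme_args = [  # what a (hypothetical) scheme declares it needs
    {"standard_name": "air_temperature", "units": "K", "intent": "inout"},
    {"standard_name": "air_pressure",    "units": "Pa", "intent": "in"},
]

def build_call(scheme_name, args, host):
    """Connect scheme arguments to host variables and emit a call signature."""
    actual_args = []
    for arg in args:
        var = host.get(arg["standard_name"])
        if var is None:
            raise KeyError(f"host does not provide {arg['standard_name']}")
        if var["units"] != arg["units"]:
            raise ValueError(f"unit mismatch for {arg['standard_name']}")
        actual_args.append(var["local_name"])
    return f"call {scheme_name}_run({', '.join(actual_args)})"

print(build_call("my_scheme", scheme_args, host_vars))
# -> call my_scheme_run(state%t, state%p)
```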
E
These
metadata
tables
they
are
used
together
with
inline
oxygen
documentation
in
the
ccpp
physics
schemes
to
generate
complete
scientific
documentation,
as
it's
shown
here
in
the
screen
screenshot,
so
the
scientific
documentation
and
the
fact
that
each
of
the
variables
that
a
parasitorization
needs
to
be
executed
are
described
in
full
in
those
tables.
One
of
the
ways
to
accelerate
the
development
of
physical
parameterizations.
It's
much
less
likely
that
you
get
units
or
something
like
that
wrong
or
dimensions.
E
The suite definition file shown here, for the operational GFS version 16 suite in the UFS, contains a surprisingly large number of schemes, at least for many of you, I guess. That's because of a fundamental difference between CCPP and other physics-driver-based modeling systems: all the glue code that resides inside these classical physics drivers and connects the different parameterizations, often tens of thousands of lines of code, must be converted into what we call interstitial schemes.

E
Some of the features of the XML suite definition files are shown here: for example, the ability to assemble schemes in groups that can be called individually or all together; subcycling, for calling schemes at a higher frequency or with a smaller time step; and user-defined ordering of the schemes. One word of caution here: if users want to change the order of schemes, then they also need to make sure that some of the logic in the interstitial schemes works as intended, because some of it still depends on the order.
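To make the groups-and-subcycling idea concrete, here is a tiny, made-up suite definition in the spirit of the XML files described above, parsed with the Python standard library; the element and attribute names are assumptions for illustration, not the exact CCPP schema:

```python
# Illustrative only: a minimal suite-definition-like XML with groups,
# subcycling, and scheme order, and a small parser that lists the schemes.
import xml.etree.ElementTree as ET

SUITE_XML = """
<suite name="demo_suite">
  <group name="radiation">
    <scheme>rad_sw</scheme>
    <scheme>rad_lw</scheme>
  </group>
  <group name="physics">
    <subcycle loop="2">
      <scheme>pbl</scheme>
      <scheme>microphysics</scheme>
    </subcycle>
  </group>
</suite>
"""

suite = ET.fromstring(SUITE_XML)
print("suite:", suite.get("name"))
for group in suite.findall("group"):
    for node in group:
        if node.tag == "subcycle":
            schemes = [s.text for s in node.findall("scheme")]
            print(f"  group {group.get('name')}: {schemes} x{node.get('loop')}")
        else:
            print(f"  group {group.get('name')}: {node.text}")
```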
E
Also, as Mariana already mentioned, the ESMF team recently contributed the exchange grid capability to the UFS, in which CMEPS was modified to act as a CCPP host model and to perform the atmosphere-ocean flux computations. And there was a typo there that I just skipped over. All right.

E
Up until today we have had five releases of CCPP, each time with an updated and growing list of supported physics and one or more host models. From the very first version on, CCPP was released with the single column model that is also developed by the DTC, and it's one of the key components of UFS hierarchical system development. CCPP was also part of all the UFS medium-range and short-range weather app releases so far, and more are coming soon from EPIC.
E
For example, the upcoming CCPP version 6 is going to support four suites with the UFS short-range weather app: the operational GFS version 16, the Rapid Refresh Forecast System version 1 beta, the Warn-on-Forecast suite, and the HRRR suite, and then two additional suites with the single column model.

E
All right, changing gears a little bit. An important effort undertaken in the last year focused on the CCPP standard names. They are really one of the key aspects of the CCPP, because they are used to communicate variables between a host model and the physics.

E
Whenever possible we try to use standard names provided by the CF convention, but for many of the variables we had to come up with additional names; we had to create new names, and it was difficult because we didn't have any clear rules in the beginning, so we definitely saw a proliferation of often fully constructed names coming up. To address this issue, the DTC worked with NCAR and the community to put in place a set of rules for creating new names and to assemble a dictionary of the current standard names that are in use.
E
This is not the only standard-name effort that has been made in the past, but unfortunately, again, this set of standard names is not connected with any of the other efforts in the United States, for example the ESMF standard names, the physics constants dictionary, or the JEDI IODA data conventions. All of these standards were defined independently, and I personally see this as a bit of a missed opportunity; hopefully something can be done there to make them more interchangeable.

E
All right. Over the last few years a team was assembled to discuss code management practices for CCPP physics. This team has participants from various institutions such as the DTC, the NRL, NOAA, and NCAR. The main topics here are to discuss what the future collaboration on CCPP should look like and how to come up with standards for that. As you can see here, there are common interests, like parameterizations for some processes.
E
One interesting discussion point was centered on the idea of having a single, authoritative CCPP physics repository for all models. That's currently not the case: right now we have an authoritative CCPP physics repo for the SCM, UFS, and NEPTUNE that's managed by the DTC, and we've got an NCAR repository for CCPP-compliant physics that are shared among the WRF, MPAS, and CM1 models; there's a reference to Laura Fowler's talk that explains that in a bit more detail.

E
So, as most of you know, in 2019 NOAA and NCAR signed a memorandum of agreement to co-develop the CCPP framework as a single system to communicate between models and physics; it was actually on one of Mariana's last slides, about that ongoing, NSF-funded project as well. The idea was to jointly define the requirements for the next-generation framework and then converge on one single common framework with superior and extended functionality.
E
Here is the original timeline for converging to a common CCPP framework for both sides, and, as you can see, we decided to approach this from two angles: NCAR CGD started developing the next-generation code generator, called capgen, based on the joint requirements that we fleshed out after the MOA was signed, and then NOAA and the DTC incrementally adopt some of those new features and design specifications, so that a future transition to capgen in the UFS would be almost seamless, with very little disruption for the users and developers.

E
So it sounds a little sad, probably, but there is also some good news for CCPP on the horizon regarding the transition to operations at NOAA. This is the schedule for when CCPP will become operational in the various NOAA models: it starts in 2023 with the Hurricane Analysis and Forecast System and the Rapid Refresh Forecast System, and then later in GFS and GEFS as well.
E
And while there is some uncertainty about the future implementation and development of CCPP at NCAR as part of SIMA, we're still hopeful that one day, not too far out, we'll see CESM and MPAS ship with the one CCPP framework under the hood, and that NCAR will join NOAA and the Navy and other organizations like NASA and DOE, where individual groups have already started to experiment with CCPP.

E
I've got a bit more time left, so let me quickly go to the additional material. What I didn't mention is that, because CCPP consists of a code generator and has a very flexible metadata standard that gives you a lot of information about the variables it needs to handle, there's a ton of opportunities for development.
E
If
the
index
ordering
is
not
correct,
vertical
flipping
calculation
of
derived
variables,
potential
temperature
from
temperature
and
geopotential
a
visualization
tool
that
shows
you
how
a
variable
travels
through
a
suite,
is
it
modified
somewhere?
Is
it
just
read
in
or
is
it
bypassing
a
certain
scheme
entirely?
That's
actually
already
in
progress.
It's
basically
finished
at
that
that
project,
better
error
handling,
to
include
traceback
information,
being
able
to
split
to
specify
whether
you
want
to
have
time
split
or
process
split
and
schemes
or
groups
of
schemes
in
the
suite
definition
file.
E
Automated
saving
of
physics
scheme
states
for
restarts,
we've
got
more,
there's
also
ideas
about
creating
a
generalized
way
to
create
either
ccpp
or
neuropc
capsule
physics,
so
that
you
can
run
them
either
inline
or
as
a
separate
component
and
the
ability
to
leverage
gpus
by
automatically
offloading
this
code
onto
gpus
in
the
order
generated
caps,
and
then
there's
also
some
idea
about
how
to
decompose
and
recombine
grid
columns
into
different
surface
types.
For
selected
physics
that
have
been
floated
around,
so
that's
basically
all
I
had
to
say
I'm
going
back
to
my
summary
slide.
E
Thank
you
for
your
attention
and
I'm
looking
forward
to
your
questions
either
now
or
later
on,
offline.
A
All right, thank you so much, Dom. We definitely have time for some questions. I welcome you to raise your hand, if you have that function, or to put them in the chat.

A
Let's see. I don't see any raised hands yet, but Adam Herrington has a comment in the chat. Adam, do you want to unmute yourself and just speak your comment?

D
Yeah, I just wanted to voice some disappointment that the CCPP is stalled right now pending the SIMA review. Can you say anything more about it?
E
I can tell you that I'm going to be on the SIMA review panel starting from next week, or I think it is in two weeks' time, and then hopefully I'll be able to tell you more. I'd have to defer you to my dear colleague Steve Goldhaber, who maybe has a little more information on that. Sorry.

A
There is another question in the chat, from Richard Loft. Richard, do you want to unmute yourself and ask?
G

E
So we've had a project funded at NOAA GSL to prototype some of this work, and it's been partially implemented: we GPU-ized one of the schemes in CCPP, the Grell-Freitas convection scheme, and then sort of hard-coded the calls into the model.

E
We would bypass the framework for this, but just to demonstrate that there is a physics package in CCPP that can be called and run on GPU, and that has shown great potential in terms of speed-up. With respect to when the work on the framework would start: my personal opinion is that it's relatively high up in the priority list, after the next GFS operational implementation, and Ligia, please correct me if I'm wrong, because I know that EMC is also looking into GPUs for the future.
G
Yeah, okay. Could I just follow up real quick on that: is that written up anywhere, the technique that you used to bypass, or to GPU-ize, the Grell-Freitas?

E

A
And just to add, in terms of prioritization, for GPU offload or many of the other developments that Dom mentioned: we are hoping to conduct a CCPP visioning workshop in which we would discuss that. We don't yet have the funding for conducting that workshop, but we're still assembling funding, and if things work out, that would happen in late 2022 or early 2023. That would be a great venue for discussing, in a multi-institutional setting, what our priorities are and how to go forward to fund some of those priorities.
A
So the SIMA review is really important because, you know, Steve Goldhaber and colleagues already did a lot of development on this next-generation CCPP framework, and some decisions need to be made about whether that goes forward for NOAA or not, so that developments such as GPU support can be done on top of either the existing system or this next-generation system.

A
And with that, I'm going to thank Dom for his presentation. If you have more questions for Dom, please save them for the final Q&A or put them in the chat, and we're going to move on to the presentation by Brian Dobbins about CESM and new technologies: clouds, containers, and accelerators (GPUs). So go ahead, Brian.
F
Hi everybody, yeah, I'm Brian Dobbins; I'm a software engineer in CGD. I've got a lot of material here, so I'm going to go pretty quick, but I'm happy to also have offline conversations with anybody. To jump right in, I'm going to start by talking about CESM in the cloud. We've been doing runs in the cloud for a while now, but we finally have production science runs and production workshops, so we've really made a lot of progress lately. On the left you see these production science runs; these are large.

F
It's a one-degree WACCM case, about 15 million CPU hours, going to generate about a petabyte of output, and the whole workflow, from running to post-processing to archiving, is all done on AWS. On the right, we recently used the same infrastructure to deploy a JupyterHub to run a training for CTSM. There are about 60-plus users; it automatically creates the accounts, makes it nice and easy, multiple queues, node types. So these have been really successful projects, and we're hoping to make these tools available to the community soon.
F
So how do we use the cloud? To start, you just need a cloud account and credentials. Then you have to learn the cloud interface. Then you have to choose a node type, then a network type, a storage type, a storage size, and a cloud region; create a YAML file; configure instance settings; configure network settings; launch cloud resources; update the OS and tools; install the libraries and compilers; install and configure CESM; and then you can do your science. Now, I've actually made this a little bit simpler than it is.

F
I skipped a few things to get this all to fit here, and this is clearly not something that we want the user community to do, because it is not sustainable for them. So our approach is that we're creating an API that makes this a little bit simpler: the idea is that you have a user with cloud credentials, they access this API, and it does all of those steps for you.
F
This offers several key advantages. It's a lot easier for scientists. We can support multiple clouds with a single interface, so you don't need to learn two different cloud interfaces, and we can add features: you can just click a button and select to add a JupyterHub; it can phone home, so if you have trouble with a run and we want to support that, you can enable remote access via an encryption key from NCAR; and tools like the post-processing containers and so on, we can deploy automatically for people.

F
When I talk about an API, that's a pretty abstract idea, so, just as an example, here are two interfaces that I've sort of played around with, though we don't have anything up and running for the public yet. On the left you see a website; this is pretty easy: you select what kind of cluster type (here I did single user), and you don't see any account information.
F
You just enter your access keys, you click the button, and in about 10 minutes it'll give you an SSH address that you can connect to. On the right, I'm running the verbose version, so it's showing you what it's doing: finding credentials, checking the mode, selecting the node type, and it will return an SSH command that you can use to connect to your cloud resources.
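None of this tooling is public yet, so purely to make the idea concrete, here is a hypothetical sketch of what a thin command-line client for such an API could look like; the module, function, and argument names are invented for illustration and are not the actual CESM cloud tooling:

```python
# Hypothetical sketch only: a thin client for a cloud-launch API like the one
# described above. Names and arguments are invented for illustration.
from getpass import getpass

def launch_cesm_cluster(access_key, secret_key, cluster_type="single-user",
                        region="us-west-2", verbose=True):
    """Stand-in for the API call that provisions nodes, storage, and CESM."""
    steps = [
        "finding credentials", "checking quota", "selecting node type",
        "launching instances", "installing compilers and libraries",
        "installing and configuring CESM",
    ]
    if verbose:
        for step in steps:
            print(f"[cesm-cloud] {step} ...")
    # A real client would call the cloud provider here, wait for the cluster
    # to come up, and then report how to reach it.
    return "ssh -i ~/.ssh/cesm-cloud.pem user@cluster.example.org"

if __name__ == "__main__":
    key = input("Access key: ")
    secret = getpass("Secret key: ")
    print("Connect with:", launch_cesm_cluster(key, secret))
```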
F
So the cloud is really easy, or we can make it very easy; that's the upside. But let's talk about one of the downsides, which is the cost. Right now, with the on-demand pricing you see in the table above, prices have improved: they've gone from 11 cents an hour on the older generation of nodes to about three cents an hour.

F
But the Derecho cost equivalent, with my back-of-the-envelope calculation, is about 0.3 cents an hour, so for compute it's still more expensive to use the cloud. Another thing, and I've added this slide because there's been a lot of talk about cloud data and cloud-based analysis lately: when we do HPC, we often think about compute costs and compute hours, and we take it for granted that data transfer is free. In the cloud, that's not the case, and there are costs.
F
For example, if you have to store 20 terabytes in the cloud per month, that's not too bad. If you wanted to download that, you pay a decent chunk, and that's per download: if somebody else downloads it again, you pay that again. So, for example, we have this ARISE data that we're hosting in the cloud, and if you were to download the whole thing, it's $30,000.

F
If 10 people download it, that's $300,000. Thankfully, there are three ways of hosting data in the cloud. One is that you get generous support from AWS and the Amazon Sustainable Data Initiative, and they will host it for free; they've done this with the CESM LENS data and this ARISE data. Otherwise, either the host has to pay for it, which is a potentially unbounded cost, or the downloader pays. This is problematic for open science in the cloud.
F
Okay, so moving on really quickly to containers. We talked about the cloud offering configurable environments and the benefit of that, because we can pre-install things; containers do the same, but on your own hardware. This makes it really easy to provide ready-to-run tools to the community: it greatly simplifies porting, and you can ensure some cross-system compatibility.

F
You just use the base container, and you still get a consistent environment and analysis platform across all users.

F
Now, containers aren't limited to CESM itself; we're also containerizing a variety of our tools. We have some CAM topography tools; we recently containerized the CESM time-series generation tool, which is a big one that's gotten a lot of interest; and Allison Baker is going to talk later about some compression tools, and I've started working on a container for those. We can make these parts of workflows for people, so you don't need to install your own software; you can kind of just use the container, it's done for you.
F
On desktops and laptops this has typically been Docker, and on HPC systems we usually use Singularity, which has now been renamed Apptainer.

F
If this is of interest to you and you don't know how to use these, because these are some new skills ("I need to use Docker", "let me do Singularity"), get in touch and we'll happily help you out with that. One of the benefits of these configurable environments, the cloud and containers, is that you can unify them so that what you see on one is what you see on the other. This is really helpful because it means that if you're a new user of CESM, you can learn on your own laptop and then use the cloud, and transitioning to that is trivial, because it's the exact same environment.
F
So you don't need to worry about any differences, different paths, different directories; it's all very easy. On the topic of standardization, another thing we're working on is standardizing Jupyter environments. We call this EASE, the Earth Analysis Science Environment, and it's just a pre-installed conda environment with the Pangeo stack and various CESM tools. The idea is that we don't want students to have to go through conda installs and modifying their own environments; we want everything to sort of work out of the box.

F
This comes with our new diagnostics and various packages, and it will be updated on a rolling basis.

F
At the end of the day, we want the same EASE kernel to be available on Derecho and Cheyenne and in our cloud and container environments, so any NCAR environment would have the same kernels available, ready to use out of the box, no configuration necessary. Okay, so one of the biggest topics that I think people always wonder about is accelerators in CESM, and there are four key challenges here that I'm going to talk about; that's the goal today.
F
Let's start with performance versus efficiency. We don't have a GPU-capable version of CESM, so as a proxy I took MPAS-Atmosphere version 6 at one-degree resolution with 58 vertical levels (these are the CAM7 vertical levels), in double-precision mode, and I ran it on the Derecho hardware, so these are the Derecho GPUs and the Derecho CPUs. You can see this impressive difference: basically, the GPU is about 2.75 times faster than the CPU, so roughly 3x on a single node, and that is impressive. It's more power efficient; it's a really big win. Now, this is not the whole story, though.
F
If we scale out this one-degree run, then we get a different story. The GPU needs a lot of parallelism to perform well, so it rapidly runs out of that parallelism and doesn't scale as well, whereas the CPU scales very well. So here the GPUs are more efficient, but the CPUs are able to perform much better.

F
Now, in this case, this is using an older version of MPAS-Atmosphere that has a known issue with lacking GPU-direct communications, so this will improve a little bit. But at the end of the day, for the one-degree workhorse resolutions, there are still advantages to both. This is a great result for GPUs; there's really great value in the efficiency, and there's also great value in the speed of the CPU. So, for example, for paleo runs you're definitely going to want a CPU.
F
Another big question when we talk about this is the issue of science capability versus capacity. This is something we talk about in HPC with systems, and it boils down to: should NCAR focus on enabling a few uniquely large-scale runs, or on supporting a high volume of science at workhorse scales? There's not a right or wrong answer here.
F
These are both great approaches, and I think we need input from the community. I'm a bit of a numbers guy, so I like to visualize things and have data, so I did a bit of an analysis here. Cheyenne has 145,000 cores; that's about 1.27 billion core-hours per year with no downtime, 24/7, 365 days a year. About 55% of Cheyenne goes to CESM; that's about 700 million core-hours. And if we do a B1850 coupled run at one degree, it takes about 2,200 core-hours per simulated year. So, with nothing else being done with the system, no analysis, no higher resolution, no lower resolution, we get around 320,000 simulated years per year of Cheyenne.

F
If we were to do a 3.75-kilometer run and we apply linear scaling to this, that's 32x in the lat-lon directions and 32x in the time step, and we get about 9.7 simulated years per year. So, with 55% of the system dedicated to one user, that would be less than one model year per month. There's great science to be done at this resolution.
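As a quick sanity check of the back-of-the-envelope numbers quoted above, here is the same arithmetic written out; the roughly 2,200 core-hours per simulated year figure is the per-simulated-year cost quoted in the talk for the one-degree B1850 run:

```python
# Back-of-the-envelope check of the throughput numbers quoted above.
cores = 145_000                                  # Cheyenne cores
core_hours_per_year = cores * 24 * 365           # ~1.27 billion with no downtime
cesm_share = 0.55 * core_hours_per_year          # ~700 million core-hours

cost_1deg = 2_200        # approx. core-hours per simulated year (B1850, 1 degree)
print(f"1-degree throughput: {cesm_share / cost_1deg:,.0f} simulated years/year")

# 3.75 km with naive linear scaling: 32x in each horizontal direction and 32x
# more time steps, i.e. 32**3 times the cost per simulated year.
cost_3p75km = cost_1deg * 32**3
print(f"3.75 km throughput: {cesm_share / cost_3p75km:.1f} simulated years/year")
```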
F
But it's a question of what we are aiming for: are we aiming for large-scale runs, or are we aiming for a lot of runs of different science at workhorse resolutions? This is where we need input from the community. On code portability: this was a concern for us that is no longer a huge one, thanks to some great work that CISL has been doing. Basically, a lot of the early work on accelerators was done with OpenACC on NVIDIA GPUs.

F
I'm sorry, that's my dog barking. We had concerns about OpenACC because we are a community model; we need portability. We can't tell people "you have to run NVIDIA GPUs", and the Intel GPUs wouldn't support it. So we were leaning towards OpenMP, and CISL has been working on both OpenACC and OpenMP, and they've been using this Intel converter and exploring that. This is a great path forward for us, because it enables us to look at embracing OpenACC early and having automatic conversions.
F
So it's not a sunk cost for any development work we do. Finally, one of the biggest things, of course, is the people time. Models of similar complexity have taken around 10, or greater than 10, FTE-years to GPU-ize. So do we focus our efforts on this, or on science features, or usability features? Again, an open-ended question; there's value in both. And can we get additional funding or community help to do both of these? That would be great.

F
Finally, there's still relatively low adoption of GPUs in academic systems. If you look at the Top 500 US academic sites, I think GPUs are on around 20% of the nodes, and a little less than that in an informal XSEDE campus champions survey.
F
Finally, I'm going to move very quickly to an extra slide I added after the discussions on Monday. One thing that Everett mentioned is serving underserved communities and the global use of CESM.

F
The one challenge we're seeing now is that, as our use increases via clouds and containers, it's coming from novel IPs, there are more users, and all that input data is being funneled through NCAR. Well, that doesn't need to be the case: we can put data on remote servers and have data go through there. Jim Edwards did some great work on enabling this in CIME, so we just need to have the data hosted somewhere, and we're talking to cloud vendors about that.
F
Longer term, because of cloud egress charges, you really need some sort of mesh network, which relies on the fast networks at university sites to pull in data from elsewhere, and this is something that we're beginning to think about and want to pursue as we move forward. So, I know I moved very quickly through all of this. Our summary is: we have CESM and JupyterHub usable via AWS for science and training; it's pretty easy to use, but not as cost effective as on-prem systems.

F
Containers are great; they enable ready-to-run applications, and we're looking at some new tools. Again, the time-series one gets asked about a lot, so we'll try to make that available to people soon. And the big one: our GPU approach is a careful balance of technology, science, systems, and people, to do what's best for the CESM community.
F
The technology is very exciting, but we're trying to be a little bit conservative in how we embrace it, to see where it's really going to serve us best and to balance those issues of resources with time and systems. Finally, data access: as I said, it's a growing issue, and we're looking at some ideas there. So, sorry I went through that very fast, and I apologize for my dog barking.

A
So if you have questions, please raise your hand or put them in the chat, and then we'll ask you to unmute and ask your question.
A
So, Brian, maybe you can talk a little bit more about this aspect of choosing what parts of the modeling system... oh, I see Rich has his hand up, so, Rich, go ahead; I'll ask my question later.

G
Okay, yeah, I just want to make a comment about this, which, you know, Brian, you and I have talked about. I think the best way to convey all this about GPUs is that GPUs are all about data parallelism, and that can come from either the size of the ensemble or the number of points available in the grid of the model, for example.

G
So, you know, I think the issue is that small ensembles and low-resolution problems are not in the GPU wheelhouse.

G
That's just to sort of follow up on that and flesh it out. Back to your question.
A
I was just going to ask you to talk a little bit more about how you're working to select which parts of the system to offload to GPU versus CPU; I'm imagining you're not testing the entire system on GPUs.

F
Oh, I think that question in the chat was actually for Dom, from the last talk. Yeah, I mean, we have a talk coming up from Jian Sun about some GPU microphysics, but we're still sort of taking a large-scale look and not working to massively convert things to GPU yet; we're kind of letting the technology simmer down a bit.
A
Okay, I don't see any questions in the chat. I see a comment from Eric; I did get kicked out for a couple of minutes there, so if something happened, I don't see it. So at this time I don't see any questions or any other hands raised.

A
Okay, well, if not, let's just go ahead and move on to the next talk. We're going to continue on the topic of GPUs, and we're going to have Jian Sun talk to us about enabling the execution of PUMAS on GPUs. And Jian, we can see your screen; it is not in presentation mode yet... yep, looks great, so please go ahead.
H
Okay, thank you, Ligia, and thank you, Brian, for giving a nice background introduction to GPU computation, which I think is very helpful for my talk as well. Good morning, everyone; I'm Jian from NCAR. Today I'm going to present the work that I collaborated on with John, Sheri, Brian, Andrew, and Kate during the past two years. The title of this presentation is "Enabling the execution of PUMAS on GPUs". Here, PUMAS refers to the cloud microphysics scheme used in CAM right now, and the presentation will be outlined in the following sections.

H
First, I will describe briefly what cloud microphysics is. Then we will do an overview of the PUMAS code in CAM. After that, I will describe the methodology we used to offload the CPU PUMAS code to the GPU, along with some preliminary results. Finally, I will draw conclusions and mention some future work.
H
It also requires some water vapor calculations from the CAM code, which adds an additional 1,100 lines of code. Considering that the CAM physics code contains about 0.4 million lines of source code, PUMAS only represents about 1.2 percent of the total CAM physics code, but it can contribute about eight percent of the total computational time of CAM physics.

H
We know that CAM is already highly scalable on CPUs through MPI and OpenMP. Therefore, the strategy we use in our work to speed up PUMAS is to offload it to the GPU. Instead of working directly in the whole CAM code, we start from a KGen kernel for PUMAS, which makes it easy for us to do the code development, debugging, and testing. In this particular work, we chose the directive-based GPU offload method to do the GPU porting, and we have explored both the OpenACC and the OpenMP offload directive methods.
H
The reason we use the directive-based programming model is that it keeps a single source code for both the CPU and GPU versions of PUMAS, which improves the maintainability of the source code. By using the directive-based method, we are basically adding pragmas to convert the CPU code to GPU code; therefore, the readability of the code is only minorly affected.

H
One thing I want to highlight, and I think Brian also mentioned this in his talk, is that in this work we used Intel's auto-migration tool to convert OpenACC directives to OpenMP offload directives, and it worked very well in our case. I put the link to that tool in this talk, and everyone who is interested is more than welcome to check it out and apply it to your production code.
H
When we port CPU code to the GPU, we usually don't expect the results to be bit-for-bit, because we are running the code on different platforms and we may have to use different compilers as well. So the big question we asked ourselves during this work is: when we observe a difference, is this difference expected, or is it due to a code bug that we introduced during the implementation? In this work, we use the CAM ensemble consistency test to examine the GPU code and ensure its correctness before we look at any performance data.

H
Here, the GPU results only account for the GPU computation, excluding the data movement between the CPU and the GPU. The x-axis is the input data size and the y-axis is the performance metric, with a higher number meaning better performance; both the x-axis and the y-axis are plotted in log scale. You can see that the GPU performance is consistently worse than the CPU performance, which is against our original expectation.
H
So I will show some examples that we think are critical for GPU performance. The first example is a loop dependency in the original implementation. Here I give the example from the original PUMAS code: in this particular code, we can see that when we calculate the precipitation fraction variable, we have a vertical dependency in the k loop. Therefore, if we add GPU pragmas directly to these loops, we will end up running both loops in serial on the GPU, because we have to run the outer loop in serial first.

H
This really doesn't make sense on the GPU, because the column calculations are independent. Therefore, the solution for this case is that we reverse the loop order and put the column loop on the outside, and in this case we can at least achieve some parallelism on the GPU.
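The actual PUMAS code is Fortran with OpenACC/OpenMP directives; purely as a structural illustration of the reordering just described, here is a small Python/NumPy sketch with invented variable names, where the vertical recurrence stays serial within a column while the independent columns provide the parallel dimension:

```python
# Structural illustration only (the real code is Fortran with OpenACC/OpenMP
# directives): a vertical recurrence that must run serially in k, but is
# independent across columns, so the column loop is the one to parallelize.
import numpy as np

def precip_fraction(cloud_frac):
    """Toy top-down recurrence per column: carry the running maximum downward."""
    ncol, nlev = cloud_frac.shape
    frac = np.zeros_like(cloud_frac)
    # Independent across columns: this (vectorized) dimension is what maps to
    # parallel GPU threads, while the k recurrence stays serial inside.
    frac[:, 0] = cloud_frac[:, 0]
    for k in range(1, nlev):                 # serial: depends on level k-1
        frac[:, k] = np.maximum(frac[:, k - 1], cloud_frac[:, k])
    return frac

cf = np.random.default_rng(0).random((4, 6))   # 4 columns, 6 levels
print(precip_fraction(cf))
```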
H
Another example that we found to be critical is that in the PUMAS code there is a very expensive calculation called sedimentation, and this sedimentation is repeated for the different hydrometeors. In the CPU version of the code, you can see that the five sedimentation calculations are done in serial, but if we look at the implementation in detail, they are actually independent from each other. So for the GPU, it's really not necessary to follow the same calculation workflow.

H
There are other optimization examples that I didn't list here, but after some optimizations we get new performance data for the same KGen kernel. This is a similar plot to the previous slide, with the dashed lines referring to the original performance data and the solid lines referring to the improved performance data, that is, the performance with the optimized GPU code.
H
First, we can see that, although we have done a few GPU optimizations, the performance on the CPU is barely affected, which we think is really good. But for the GPU, compared with the original one, the new performance is significantly improved, and in this case we can see that at a certain point the GPU is able to outperform the CPU, and when we increase the problem size the GPU shows more benefits.
H
H
So in the previous slides we have shown that we can benefit from GPU porting of the PUMAS kernel. The next question we asked ourselves is whether we can maintain the same speedup in a real CAM production run. So in this particular case we did a CAM simulation with the F2000 component set; we used the FV dycore at one degree and performed a one-day simulation.
H
So here is a bar plot: on the x-axis is the data size that is uploaded to the GPU per kernel launch, and the y-axis is the time per sub-step per MPI rank, so a lower y-value means better performance. First we look at the blue and the green bars, which refer to the CPU results on Cheyenne and Casper. We can see that, in our cases with different data sizes, the Casper results are consistently better than Cheyenne's, which is expected because Casper has a relatively newer CPU architecture. When looking at the GPU results, if we use the default value of pcols equal to 16, we can see the GPU performance is much worse than the CPU one, because we spend most of the time on the data movement between the CPU and the GPU.
H
However, if we increase the data size and reduce the data-movement frequency between the CPU and GPU, the GPU performance improves quickly, and in the end we are able to achieve a two to three times speedup compared with the best performance on CPU, which is on Casper. Therefore, from this plot we can see that the GPU-enabled PUMAS can also outperform its CPU version, even when we are using it in a practical CAM simulation.
H
If we focus on the GPU result, look into the details of it, and decompose the time contributions from different perspectives, we can see that for different pcols values the data movement from CPU to GPU and from GPU back to CPU contributes much more than the computation on the GPU itself. So from this pie chart we can clearly see that the data transfer can be more time consuming than the computation for the PUMAS case, and that should be our next focus of optimization in the future.
H
So here is a short summary of what we have done. First of all, we have fully offloaded PUMAS to the GPU by exploring the OpenACC and OpenMP offload techniques. We evaluated the performance on CPU and GPU, and the results show that even though we are using only one GPU per node, we can achieve some promising speedup compared with one CPU node. One thing I want to point out is that, in my opinion, OpenMP offload is relatively new compared with OpenACC, and in our work we find that OpenACC still performs slightly better than OpenMP offload.
H
So that's what we have found so far. Talking about the next steps: since we got encouraging results from one GPU per node, it is natural to think of using multiple GPUs per node and seeking further potential speedup on the GPU, and since in our results we find that the GPU favors large problem sizes, in this case we are interested in whether the GPU speedup can be even larger.
H
So this work is funded by the NSF EarthWorks project and NCAR core co-funding. Besides the co-authors listed in the presentation title, I also want to give a huge thanks for the contributions and the help from different people from different labs and organizations, with their names listed on this slide. So with that, I would say thank you for your attention, and I'm happy to take any questions.
A
Yep, thank you so much, Jian. So we definitely have time for some questions. If you have one, please raise your hand or enter your name or question in the chat and we'll ask you to unmute.
G
Yeah, Jian, one thing that I didn't catch: you talked about using asynchronous kernel launches in order to...
H
H
G
That's this stuff that you showed here, but how much did that actually speed up the code compared to without it, like if you turned that off?
H
It's about 10%, and I would like to say this is really case by case, because by running GPUs asynchronously we need to synchronize them at some point, and synchronization is a very expensive operation in GPU computing. So it's really a case-by-case choice, I would say.
H
A
Okay, Jian, so have you been able to keep your code base unified, for, you know, use on both GPU and CPU, or do you have to keep separate codes?
H
So yeah, thanks for this question. Let me go to my slides here. Okay, yeah. So, as I mentioned, the method we use in this work is called the directive-based parallel programming model, and a nice feature of this method is that we can keep a single source code for both CPU and GPU, which means that we only have a single source code, but we can enable it on either CPU or GPU by some choice of compiler flags.
A
H
A
Is this pretty much the only physics code in CAM/CESM that is using GPUs, or is this one of a variety?
H
Yes, thanks for this question as well. So, to my knowledge, PUMAS is currently the only one brought into CAM for a real GPU simulation. I know RRTMGP, which is a radiation scheme, also has OpenACC and OpenMP offload directives, but I personally haven't tested it so far. I think some people in the audience might have experience, and hopefully someone can jump in and clarify.
A
G
Yeah, I just thought I'd add: if you look at Jim Hurrell's presentation from the Atmospheric Working Group yesterday, there's a status slide which shows the status of these things. Jian is right: there's an RRTMGP version which is being worked on by Brian Medeiros and company, getting that into the physics suite, and there's also work being done on CLUBB under a DOE contract to GPU-ize it using offload directives.
A
Well, I think that's it for questions. I don't see any other raised hands, so I think we are going to bring this first session, or the first half, to a close. So thank you to all the speakers so far. We're going to head into a break and reconvene at 10:25 for the second set of talks, but before we go there, let me ask Bill if there are any announcements or anything we need to know.
B
No, that sounds great; we'll see you in 20 minutes. I guess I'll just remind people, if they joined a little late, that we would be happy to see you this evening at 5:15 at the Rayback Collective, for those who are local and able to join us for a little social gathering this evening. Thanks for a great set of talks for this first set, and thanks, Alicia, for moderating.
D
B
As we get started on the second half: first, for those who are just joining us for the second half, we would like to invite you to join some members of the Software Engineering Working Group this evening at 5:15 at the Rayback Collective in Boulder, if you're local. So, yeah, we hope to see a number of people there. And then just a reminder that, as in other meetings, we're following the NCAR code of conduct here: offering constructive feedback, sharing the air, acknowledging teamwork, encouraging innovation, showing appreciation, and considering new ideas.
B
So with that said, yeah, I'd like to turn it over to Jesse Nusbaumer, who's going to be presenting on the atmosphere diagnostics framework. So take it away, Jesse.
I
Great, thanks, Bill, and thanks all for letting me speak. Real quick: there's an asterisk here, because everyone's been calling it the Atmosphere Diagnostics Framework, and then we found out 10 minutes ago that way back in the original documentation it was supposed to be the AMWG Diagnostics Framework. So, well, it's definitely the ADF anyway.
I
I'm just going to talk about the package, and I first want to thank my collaborators: Cecile and Brian, who have kind of been with me from the beginning and are the ones who even kind of started this whole project; Justin, who is a new associate scientist who's taking more and more of this work on (it might eventually kind of become his); and then Julie and Danny, who have also been contributing a lot to the development, and also doing a lot of the thankless organizational work that we all need but no one wants to do; and then Andrew Gettelman, I guess, for, you know, senior-scientist things. So let's get started.
I
Let me go for it; there we go. So, you know, CAM: the AMP group has the AMWG diagnostics, which have been around, actually, I looked, for over 20 years, and it basically provides push-button diagnostics capability. Basically, you just send it the paths to your CAM history files and it outputs a bunch of tables and plots, and it puts it all on a website, and it's still being used.
I
I used it in grad school, people use it in the community, it's still being used for CAM7 development, and, you know, I want to give a shout-out to it. I'll be amazed if the ADF lasts 20 years, and so, you know, any credit to software that can be around for that long and still be used regularly. In terms of the actual software itself,
I
you know, it's basically just a bunch of NCL scripts that are wrapped with a C-shell wrapper, and so that's kind of where the issue is, right? So we have to move on. One of the reasons is because, you know, NCL is being deprecated; it's just no longer really supported by CISL, and although it's still used, and still can be used, you know, eventually there will come a time where it'll just be really hard to maintain.
A
I
With the AMWG diagnostics it's also hard if you want to change, you know, contours or levels or anything like that, and it's particularly difficult to work with vertical levels different from what the original CAM had, which is a problem because a lot of the new versions of CAM have many different level sets.
I
You know, we have a 58-layer and a 93-layer, and for all of these, you know, we need the diagnostics packages to be agnostic to those differences. And then finally, which is, you know, just because this package has been around so long, it's lacking a lot of modern software practices, right? There's no open development, it's not on a repo really of any kind, and it doesn't have any sort of testing or CI systems or anything like that.
I
So, you know, given these issues, we realized we had to move on and build a new package to kind of deal with this, and so a lot of us in AMP got together and we discussed it, and we kind of looked through the community. You know, diagnostics, in my opinion, is in some ways kind of a,
I
if not saturated, a very full area, right? There are a lot of packages. We looked at things like ESMValTool, which is developed, I think, by DOE; we looked at MDTF, which is developed by NOAA; they all kind of do the same thing. And then, during this, myself, Brian, and Cecile developed, in like two days, a quick Python thing, like "this is what it might look like if we, you know, started from scratch," and that's how we landed on it.
I
We landed on that strawman. So that strawman is now the ADF, and, you know, in some ways that was great, but in other ways it did kind of immediately add a lot of technical debt. So if, you know, in those first six months, you saw it, you might have been like, "Ooh, that wasn't the best-written code," and that was part of the reason why. Then, just real quick, there's the URL for folks who want to look at the repo.
I
So what were the general design principles of the ADF? Most of these design principles I received from AMP, or, you know, the general kind of user-group community that we're focusing on. One, which is again just like the old AMWG diagnostics: we want push-button capability, right?
I
Basically, I just give it one or two inputs, like the paths to my model data and the path to where I want all of the diagnostics written, and then I just literally type "go" and it just runs, right? And so then the ADF will generate all these processed files, you know, climatologies, regridded data sets; generate tables with statistics and plots; and then finally try to put it all together in a website, and in the future
I
you might also see it put together in, like, a notebook or a book collection. One of the really strong things they wanted was for it to be error-tolerant. What this means, you know, in this case is: if I have analyses that require, say, the zonal wind field, but my model data is actually missing zonal wind, then instead of the ADF just dying, it just says, "Oh, I can't find zonal wind, I'm just going to skip this particular analysis and try to move forward."
I
So basically the ADF will try its hardest to run to completion, even if it has to skip a lot (and of course it will print warnings that it's going to skip a lot), or even if it has to kind of bail on certain things. And this is because there are certain analyses which are much more expensive than others, so we don't have to redo the expensive analyses that were fine just because another analysis downstream had a bug in it.
I
The other thing that was really hammered home to me was that they wanted it to be really easy to port, and this is actually almost the most important thing for the community; they cared about it more than, like, performance or even, like, code readability, things like that. In some ways this is not as big of a concern now, you know, as Brian Dobbins talked about with the container.
I
You know, there are now systems that kind of solve this for you, but in general, you know, at the time we wanted to make sure that you could at least easily install everything you needed with conda. The other thing, which the ADF does a little differently than other diagnostics, is that we make a really concerted effort to use a minimal number of external Python packages, which I'll show later. You know, part of this is because, even with containers, right, the fewer packages you have, I've got to imagine, the easier it is to maintain over time.
I
Even during the development of the ADF, I ran into dependency hell, and so, you know, the more packages you have, the more likely that is, and so just trying to keep it to the necessary packages was kind of our goal. Even if we have to write some code in-house, that's fine, rather than passing it off to a package that might not be around five years from now. And then, you know, they also wanted it to be relatively easy to modify.
I
Some of this is just kind of basic modularization, and, you know, we also wanted to have inputs. As I mentioned earlier, with the AMWG diagnostics you couldn't really change things like the contour range or the color table, and so in the ADF we now have a YAML file that you can modify. So in this case, PSL is sea level pressure, a variable from CAM, and basically in the YAML file there's something like this: it tells the ADF, okay, if you have sea level pressure, we're going to use the orange colormap and we're going to use this range. We also have the ability to do some basic unit conversions, right, so it'll come in in pascals and convert to hectopascals, things like that.
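A minimal sketch of what such a per-variable entry might look like (the key names here are illustrative assumptions, not the actual ADF schema):

```python
import yaml

# Hypothetical ADF-style variable defaults; key names are illustrative only.
example = yaml.safe_load("""
PSL:                                    # sea level pressure from CAM
  colormap: Oranges                     # colormap for lat/lon plots
  contour_levels_range: [980, 1052, 4]  # min, max, step
  scale_factor: 0.01                    # Pa -> hPa
  new_unit: hPa
""")
print(example["PSL"]["new_unit"])   # -> hPa
```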
Then, finally, almost all the diagnostics now are moving to Python, so the ADF is also Python. These are actually all of the packages we're using; we chose them either because they have a very large community base, right, like xarray or pandas, or because they're developed by NCAR.
I
I
Hopefully these packages will be maintained for a long time and play nice with each other. You know, this is kind of a complicated diagram, but basically the main takeaway is that the top two files are the only files that the average user should modify, and even then, the only one that they would have to modify is this config file, which basically contains, you know, the information again like paths, and maybe some settings like "Oh, I don't want to run this."
I
There are objects that inherit from each other, and that object kind of does sanity checking and calculates a lot of the derived information and metadata that we need, and then it calls, essentially, a function that calls a list of scripts. This is where, if you're developing, you can add your script, following a certain API, and then the ADF will just call it. One piece that's kind of in the ADF now, but maybe not necessary going forward,
I
is the time series generator; we just needed one at the time. But again, as Brian Dobbins pointed out, you know, there's a time series container, and there's talk about eventually CESM outputting time series, so we'll see. But the rest, you know, calculating climatologies, regridding to your observational data sets, and then doing analysis and plotting, and then putting it on a website, is basically kind of the workflow of the ADF.
A
I
That does not look super great on my monitor; it looks better in the terminal anyway. So the API is: if I have an analysis script, like myscript.py, how would I bring it into the ADF? Well, all you have to do is, at the top, add a function header, and the function has to have the same name as your script, so myscript.py has to define "myscript", and then the only thing you have to bring in is the ADF object itself.
I
The idea being that the ADF object will contain all of the data you need to run your analysis, and so we've developed basically a lot of different kinds of "get" functions, or methods, to grab, you know: are we comparing against obs or a baseline? What are the names of the model run cases? And then we even have some other functionality.
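Roughly, a user script then looks something like the sketch below (the getter names on the ADF object are illustrative assumptions for this example, not the actual ADF API):

```python
# myscript.py -- a hypothetical ADF analysis script.
# The ADF calls a function with the same name as the file and passes in
# the ADF object, which carries paths, case names, and config options.

def myscript(adf):
    # These getter names are made up for illustration; the real ADF object
    # exposes similar information through its own methods and attributes.
    case_names = adf.get_cam_case_names()   # model run case name(s)
    compare_obs = adf.compare_obs           # obs or baseline comparison?
    plot_dir = adf.get_plot_location()      # where figures should be written

    print(f"Running myscript for {case_names}; plots go to {plot_dir}")
    # ... open climatology files, compute the diagnostic, save plots ...
```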
I
If you want to add your own config option to that YAML file, for instance, there's just a quick way to read it in. Right now, one of the downsides is that each independent script still has to open its own files all the time, so we're hoping things like that will lessen as we bring in intake-esm, so individual scripts won't have to do their own querying and reading and those sorts of things. I'm going to skip this because it's going to be a live demo, and live
I
demos, of course, are notoriously "successful" all the time, so I figured I'd save it for the end, and I'll come back to that. So, real quick, in terms of the current status and issues (and I'm flying through this): you know, we actually hope to have version one, which is kind of like having the base functionality of the original AMWG diagnostics, ready by the end of the summer, maybe early fall. There are two things that are holding us back. One is we're actually just lacking observational data sets.
I
This is kind of something that we have to pass on to the science community and be like, "Hey, what data set do you want to compare, you know, the cloud variables against, or the radiation?" We're also still missing some plots: we have, as I'll show, a lot of the kind of basic plot types, but there are still certain plot types missing, you know, like time series or meridional overturning, things like that. We have a controller to try to move this forward, and we have a weekly hackathon to address these issues.
I
So, you know, scientists are involved in trying to add these plots and trying to flesh out the ADF; if you want to join, feel free, we're happy to take any help we can get. One of the downsides also is that it currently only works with monthly output. This is an issue, you know: there are a lot of diagnostics, like the MJO or looking at diurnal cycles, where you need higher-frequency output, and again we're hoping that intake-esm will help with that.
I
Another issue is that the ADF right now works fine with kind of standard model runs, you know, even down to maybe half a degree, but when you get to really long time periods, thousands of years, or really high resolution, it currently struggles, and so we need to
I
we need to improve performance, particularly via Dask. And then one thing is that, personally, I got caught in the "faster, better, cheaper" conundrum, so I've had to kind of put a lot of the good software engineering practices, like unit testing, on the back burner, which means we have relatively low code coverage. And so, you know, going forward I kind of want to fix a lot of these, right: implement intake-esm to allow for an easier, simpler API with scripts and also to manage sub-monthly data, and bring in Dask, particularly for memory. I should back up: you know, the constraint for these high-resolution data sets isn't actually the CPU, it isn't the flops, it's the memory. It's just hard to load that much data into memory using a lot of, you know, kind of off-the-shelf stuff, so we have to figure out a way to distribute it.
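One common way to do that kind of distribution is to open the output lazily in chunks with xarray and Dask, roughly along these lines (the file pattern and chunk sizes here are made up):

```python
import xarray as xr

# Hypothetical history-file pattern; chunk sizes are illustrative only.
ds = xr.open_mfdataset(
    "case.cam.h0.*.nc",
    combine="by_coords",
    chunks={"time": 12},   # lazy Dask arrays, roughly one year per chunk
)

# Computations stay lazy until .compute()/.load(), so the climatology below
# is evaluated chunk by chunk instead of loading everything into memory.
climo = ds["T"].groupby("time.month").mean("time")
climo = climo.compute()
```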
We also want to, you know, bring in additional plots, and also, as you'll see, the website could use some beautification, which can get subjective.
I
We also want to develop notebook interfaces, especially on the front end, right, because notebooks are great at kind of being a tutorial for things, so that creates a front end to help explain how to run through the ADF. But then also, at some point, it'd be nice to have it at the back end,
I
so that whenever you run through the ADF it then outputs a notebook you can run again, so you can share it if you have a really specific diagnostic. And then, finally, increase unit testing and code coverage. And the other thing, which I want to give a shout-out to the community about, is this: you know, the model development world seems to have really good, robust regression testing and integration tests, right; there are hundreds of regression tests, I think, for, I'm sure, you know, CESM. But for diagnostics that doesn't seem to be as common, and in particular there are a lot of concerns about, like, well,
I
how do you do regression? How do you determine if a plot, which ends up being like a PNG file, is different from a previous PNG file? So it'd be great to, you know, kind of reach out to the wider community and figure out what would be good integration and regression testing for diagnostics in CESM, particularly when we make small changes that shouldn't change answers. Okay, so, oops, actually I'm going to exit full screen and show the website real quick.
you're.
K
I
As
you
can
see,
it's
pretty
basic
right
now,
but
you
can
tell
all
the
different
kinds
of
information
we
have.
So
you
have
tables.
So
you
click
on
tables.
It
gives
you
for
each
run
simulation.
This
is
a
case
versus
a
case,
so
a
camera
versus
a
camera,
so
I
can
select
on
one.
It
gives
you
a
whole
lot
of
information
for
each
variable.
I
I also have comparisons, right, so I can see, like, what's the difference between the two cases, over on the right-hand side, and what the units are. Then we have different plot types. So, like, lat/lon: here's low cloud, so you go, okay, I get the plots for low cloud, and if it doesn't fit on your screen super well, like this one, you know, kind of getting cut off at the edge, you just click it, and then you get the full thing and you can download it.
I
We have zonal plots, both as kind of just two-dimensional plots and then also, if you have a three-dimensional variable, which I'm actually not sure if this one has... oh yeah, geopotential heights,
H
I
not particularly interesting, but, right, you can also then get height-latitude plots. And then, finally, we also have polar plots at the moment, to plot, you know, precipitation over Antarctica or Greenland, those things. Anyway, so, yeah, I'll stop sharing, right at like 15 minutes, actually a little over, but yeah, thanks for listening. Let me get my screen back.
B
We do have a few minutes for questions, so feel free to raise your hand or put a question in the chat. Let's see, I think I first saw a question from Jim Edwards in the chat. Jim, do you want to unmute and ask your question? Sure, I'm just...
D
I
Yeah, so right now we're waiting on GeoCAT developing that uxarray system, which is designed to manage that. So until that comes online, yeah, right now you do have to regrid to a regular lat/lon grid, but hopefully, when that comes out, we'll be able to use non-regular grids.
B
C
Sorry, I think that... so, just suggesting this: this looks great. I'm wondering where you see the future, let's say for when we release CESM3, of integrating this into the total workflow of a CESM run. Not even just for CESM3, but as we move forward, it would be great to start integrating diagnostics routinely, so that you spit them out as part of the run, and we're not doing that. So what's your vision for that?
I
Yeah, that was part of our original... I guess I didn't actually have an original-design-principles slide, but...
D
I
And so it should be; assuming the environment has the Python modules you need, yeah, it's literally just modifying a YAML file and then running. It's literally running a script, so it can be put as a job in a scheduler or just run like that. And then also, we haven't...
I
This could be a discussion with CSEG or whoever, you know; you don't actually have to use a YAML file per se. You can just bring in the object and then add the information directly in Python. So, yeah, it's designed to just be... it can certainly be run as a compute job, and there are a few things we still need to do.
B
Rich Loft, I see you have a question in the chat. Do you want to ask your question?
G
No, actually, it got answered. You know, basically it's around the same question that Jim Edwards asked, about connecting with unstructured grids and with the effort that Raijin represents to introduce, you know, unstructured xarray objects for parallel processing. So, yeah.
B
Let's take one more question from Brian Dobbins, and then, I see there are a couple of other questions, but we're going to have to move on after that, and we can either have some discussion in the chat, or there will be time at 11:30 for some further discussion where this could be addressed more. So, Brian, do you want to go ahead?
F
Yeah, I mean, I think this ties into the same workflow questions that Mariana was asking about, and this is just, you know, you had this workflow of generating the diagnostics, and it also seemed like there was some automation for generating an intake-esm catalog, and I was just curious what the thinking was there in terms of where that sits.
F
I
Yeah, I mean, you know, we haven't fully decided this on the ADF side yet, but, you know, we might have it as, like, an option: you can either have it generate an intake-esm catalog or not, if you don't want it for some reason. But it's just... I know a,
I
like, in the ocean group, and I think also the land group, though I'm not as familiar with that, you know, they have a lot of diagnostics that expect a catalog, and so by having a catalog generated...
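For context, downstream diagnostics that "expect a catalog" typically open it along these lines (the catalog path and search keys below are hypothetical; the exact column names depend on how the catalog is built):

```python
import intake

# Hypothetical catalog produced for a run; the JSON points at a CSV of assets.
cat = intake.open_esm_datastore("my_case_catalog.json")

# Query the catalog instead of globbing files by hand, then load as xarray.
subset = cat.search(component="atm", variable="PSL", frequency="month_1")
dsets = subset.to_dataset_dict()   # dict of xarray Datasets keyed by group
```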
B
Thanks. Our last two talks for the morning are both on the topic of lossy compression, so the first one will be from Allison Baker, on fine-tuning evaluation metrics for lossy compression of CESM data. Allison... okay, great, I see your screen.
G
J
Okay, so, as Bill just said, this is part one of two talks on lossy compression, and this is work that I've been doing primarily with Dorit Hammerling and Alex, who will be giving the next talk, as well as Haiying Xu, and many other people have contributed to this effort over the years.
J
So, as we all know, computers are getting faster, and thanks to the work that a lot of people (oops, just randomly changing slides), a lot of people in this group, have done, we can generate data at a crazy fast rate. This plot is data I got from Dave Hart; it's four years of data for our GLADE usage and campaign store.
J
You can see the green line is the campaign storage. Right now the capacity is 92 petabytes and we're about 60% full, and you can see that it's growing pretty much linearly. The current CISL plan is that once the capacity is increased to between 100 and 120 petabytes, that will be it. So we will run out of storage.
J
J
So, for those of you who aren't that familiar with compression, there are basically two different types. One is lossless compression: this means that when we compress the data and then we reconstruct it, we end up with the same information we started with. This is what gzip does; this is what the NetCDF zlib deflate does.
J
With lossy compression, what we started with and what we get after reconstruction are not exactly the same. And the reason we need to look at lossy compression is that, unfortunately, lossless compression is not that effective on data that comes from numerical simulations. This isn't particular to CESM; it's just a general statement that's true. This is because, on this numerical data, if you look, I output the temperature with, you know, say, 64-bit precision, and there are a lot of numbers there.
J
J
But the issue then is that we have to be careful about what information we're losing; hopefully we lose the information that was random noise and didn't matter in the first place. So the goal in our work overall has been: we want to reduce storage, but we don't want to negatively impact our science.
J
So we've been working on looking at lossy compression and CESM data for several years, and I just wanted to kind of go over some of the challenges that we face. I mean, the first one is: understandably, scientists are reluctant to lose any information that might be important, so we've really tried to have this mantra of doing no harm, and of trying to figure out how best to evaluate the information loss for climate data.
J
Lossy compression has obviously been around for a long time and is really popular in things like, you know, video and images, but in a lot of applications, you know, if the image still looks okay, if your movie still looks okay, you don't really care what was lost. But that's not true with scientific data, so we have to be careful: the variables have different characteristics, there are spatial and temporal dependencies, and overall that evaluation has been the focus of our work, and it's what I'll be talking about, and Alex as well.
J
So the first thing we did in this journey was to establish feasibility, that, you know, it wasn't a horrible idea to use lossy compression on climate data, and we did that by thinking of ensemble-based metrics. The idea was that, you know, at minimum we wouldn't want any compression-induced differences to exceed the ensemble variability, and this is sort of the standard we think of when porting to other machines and such.
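A toy version of that kind of check might look like the following, with purely illustrative thresholding on synthetic data rather than the actual CESM ensemble consistency methodology:

```python
import numpy as np

rng = np.random.default_rng(0)
ensemble = rng.normal(size=(30, 192, 288))     # fake 30-member ensemble field
original = ensemble[0]
reconstructed = original + rng.normal(scale=1e-3, size=original.shape)  # "decompressed"

# Compression-induced differences should stay well inside the member spread.
ens_std = ensemble.std(axis=0)
diff = np.abs(reconstructed - original)
print("fraction of points where |diff| exceeds ensemble std:",
      float((diff > ens_std).mean()))
```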
J
The next thing we did was give scientists access to some of this data that we had applied lossy compression to, and this was an experiment that we jokingly called the Pepsi challenge; if you're old enough, you'll know the reference up there in the upper right. The idea was letting scientists look at the data and see if they could tell the difference between what had been compressed and what hadn't.
J
This data is still out there; two of the LENS1 runs have been compressed, and it's very difficult to notice using standard analysis tools. But of course, if you want to be clever, you can figure out which are compressed and which aren't. But the question is: does that matter? And so we come to, okay,
J
we really need to be looking at compression at these fine spatial and temporal scales, and doing this kind of analysis is really important. Simple metrics, like, you know, the root mean square error, which maybe is enough if you're just looking at an image on your phone or something, aren't enough for climate data, where, for example, on the right I have pictures looking at the effect of different compressors on the contrast variance. So these are the kinds of things that we need to look at.
J
J
So we've been working on developing a suite of metrics, and I have some of them listed in this blue box here. Those are things like, you know, correlation coefficients that are good at noticing outliers; the KS test, which notices changes in distribution; relative errors; and then visual similarity, which is one thing that we've focused on. It's quite critical for post-processing analysis; I mean, we just saw Jesse's talk, where, you know, looking at these different plots is quite important and often the first interaction with the data.
J
So a visual similarity metric is basically a metric that tells whether, or to what degree, two images are alike, and we actually had over 100 participants in an evaluation study a few years ago, where we let people say whether they could see a difference or not between the data. From that we determined that the structural similarity index measure was very useful, along with a corresponding threshold. And actually, Jesse, I think this is something that would be interesting for you to use in regression tests for determining differences in your images, but anyway, I'll stay on track here. So from there we developed this data structural similarity index, which is just a variant so that we could apply it directly to the floating-point data.
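For a sense of how an SSIM-style comparison is applied to a 2D field, here is a sketch using the image SSIM from scikit-image on synthetic data; the DSSIM described here is a separate variant tailored to floating-point climate data, so this is only an analogy:

```python
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(1)
field = rng.normal(size=(192, 288))                        # fake 2D model field
recon = field + rng.normal(scale=0.05, size=field.shape)   # fake decompressed field

score = structural_similarity(
    field, recon,
    data_range=float(field.max() - field.min()),  # required for float inputs
)
print(f"SSIM between original and reconstructed field: {score:.4f}")
```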
J
We just want more of a general idea of how similar images are, so this has been really useful for us, and in fact it's kind of the primary metric we've been using. Here's a picture of just the surface temperature: these use two different compressors, they reduce the data by the same amount, and the image on the right is of much higher quality.
J
It's hard to see at this resolution, but if you zoom in, maybe you can see that there are some blocking artifacts that the compressor on the left has left behind, and we don't want those. And it's nice that the DSSIM picks this up well from the data, not even from the image, just from the data.
J
So this bottom plot: I don't expect you to be able to read the variable names at the bottom, but these are all the daily variables from the CESM1 LENS data set, and the reason I have this here is just to basically show you that, for a fixed quality parameter like the DSSIM, different variables are more compressible than others. So the height of the bar shows the amount of compression for each variable, and the colors are just different thresholds.
J
Here's a plot that I think is useful for understanding how these metrics work. So I have five different variables, each in a different color, and we've used the zfp compressor. As we look at the plot from right to left, the solid lines indicate the file size; they're going down in a linear fashion, so as we're increasing the amount of compression we're decreasing the file size.
J
Now we look at the corresponding dotted lines, and when you go from right to left here, you see that initially the compression does not affect the quality at all, because getting a 1 means that we haven't changed the quality of the plot. But notice that these DSSIM values are not linear: at some point they just drop off and the quality quickly degrades. So when you think about getting optimal compression with these compressors and using a metric, you think: I want to stop compressing right before the quality drops off.
J
There are three compressors that we're looking at actively at the moment, and our main criterion is that they needed to work with NetCDF data, obviously, because that's what our output data is in right now. This has been pretty hard until recently, because most of the off-the-shelf compressors did not support NetCDF data; we had to pull out the data, compress it, and, you know, stick it back into the NetCDF file. But now, in the last year, the two leading DOE compressors, zfp and SZ, both have registered HDF5 filters that allow us to do compression through NetCDF-4.
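As one example of what using such a registered filter can look like from Python, assuming the hdf5plugin package, which exposes the zfp HDF5 filter to h5py (the parameter choice here is arbitrary, and the exact keyword options should be checked against the hdf5plugin documentation):

```python
import numpy as np
import h5py
import hdf5plugin   # assumption: provides the registered zfp HDF5 filter

data = np.random.default_rng(2).normal(size=(10, 192, 288)).astype("float32")

# Write an HDF5/NetCDF-4-style file with the zfp filter applied per chunk.
with h5py.File("compressed_example.h5", "w") as f:
    f.create_dataset("T", data=data, chunks=(1, 192, 288),
                     **hdf5plugin.Zfp(precision=12))

# Any reader with the filter available can read it back transparently.
with h5py.File("compressed_example.h5", "r") as f:
    recon = f["T"][:]
print("max abs difference:", float(np.abs(recon - data).max()))
```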
J
That has been huge for us, as far as being able to use it and having others use it easily. Another method I'll mention is Charlie Zender's bit grooming algorithm. This approach is quite nice: it's basically a pre-filter, it's available through the NCO tools, it's easy to use, it's easy to understand, it's available now, and it's been quite successful when we've tried it out.
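Invoking that path from a workflow script might look roughly like this (file names are placeholders; check the NCO documentation for the exact flags and their current behavior):

```python
import subprocess

# Keep roughly 3 significant digits for all variables and write NetCDF-4
# with deflate, so the trimmed bits compress well losslessly afterwards.
subprocess.run(
    ["ncks", "-7", "-L", "1", "--ppc", "default=3",
     "input.nc", "groomed_output.nc"],
    check=True,
)
```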
So here's an idea of what kind of compression you could get for CAM data, for example.
J
This is all with zfp, and I've categorized the thresholds I've used as conservative, middle ground, and aggressive, and we have the compression ratios. So, for example, the variable on the left: if you look at the orange bar, it goes up to about a six. This means that, for the amount of compression that I've termed aggressive, I can reduce the file size six times more than the lossless-compressed file, so it's about a 12x reduction over the original file and about a 6x reduction over the lossless file.
J
So again, you can see that the different variables react differently to the same quality metrics, and I would say that, you know, even though I've labeled this "aggressive", it's not really that aggressive in the grand scheme of things, and I think it would be very realistic to expect these kinds of compression rates for lossy compression. Just quickly: the variables I showed on the last page are from this CAM test set that we've been looking at extensively; it's available on GLADE.
J
So this slide is basically what Alex is going to talk about, but I want to give some motivation for it, and that is: now I've shown you that there are all these different compressors,
J
there are all these different parameter choices, all the variables act differently, and it frankly seems kind of overwhelming to think of using this in practice. So clearly, what's needed for this to be practical is some kind of tool to automatically select, for a given variable and temporal output and grid size, the right amount of compression, and this is what Alex will be talking about.
J
So I think I pretty much covered all these lessons learned and I'm not going to go over them here. But, you know, working closely with scientists is important, we're always looking for feedback, and I just want to end with this thought: my hope is that applying lossy compression is going to be something that's not suspicious and scary, but just something that you do when running the model, like you would choose your grid resolution and output frequency and precision.
J
B
All right, thanks a lot, Allison. I do see one question in the chat, from Kevin Raeder, so why don't we take that as we transition, and then maybe we can take some more questions on lossy compression after Alex's related talk. Kevin asks: do these compression tools handle variables that use special values or masking values to denote an absence of data at some points?
J
Yes, they do now, now that you can use them through the filters. Before, that was something that Haiying and I had to figure out how to handle ourselves, because we had to pull the data out of NetCDF, compress it, and then basically stuff it back in. But now that these tools are available with the HDF5 filters, or Charlie Zender's just through the NCO tools, all of that is automatically taken care of. So it's really huge, and that was why we had started with CAM, because it has fewer missing values and things like that.
J
But this is really a big development, and it's going to make it easier for everyone to use these methods.
B
Great. So, yeah, in the interest of time, let's move on; again, I think we can take some more questions for Allison after Alex's talk, if people have them. But let's transition to Alex, who will be giving another talk on lossy compression: predicting optimal lossy compression settings for CESM LENS data, as Allison just introduced. So, Alex.
B
K
All right. So, yeah, today I'm going to be talking about our process for predicting optimal lossy compression settings for CESM LENS data. My name is Alex; I'm a PhD candidate in statistics at the Colorado School of Mines. Allison, who you just heard talk, is my collaborator at NCAR, and Dorit Hammerling
K
is my research advisor. So, currently, to obtain the sort of compression ratios that Allison showed in the previous presentation, we're taking a sort of brute-force approach to figuring out what the optimal compression settings are, meaning we're trying every combination of compression algorithm and parameter setting that we have available to us, and running a suite of metrics on all of that compressed data, to figure out which one can pass all these metrics and yet still give us a high compression ratio. And it's very computationally intensive.
K
So what we want is a model that we can use to predict this compression level in a way that's a lot more computationally efficient, and to do that we're trying to frame this as a classification problem. In a classification problem, as a sort of quick review, we have some data that we're using as input to the model and corresponding labels, which in this case are the compression settings, that we're trying to match with it. The figure on the right sort of illustrates what I'm talking about.
K
So the output classes that we are looking at here are, like I said, the compression algorithm we want to use, which at this stage can be zfp, bit grooming, or SZ, and for each of those there's a slightly different parameter, our compression level, I guess, that we're trying to tune. So for zfp it's the level of precision; for bit grooming it's the number of significant digits of the original data we're trying to maintain; and for SZ
K
we have an absolute error tolerance that we can specify. These parameters are the output classes of our model, and we select a series of possible levels that cover the range of likely optimal compression settings for all of our variables. And, as Allison briefly mentioned, we're using four metrics to determine if a data set has been optimally compressed, and that includes the data SSIM.
K
As input to this model, we have the first two years of the CESM Large Ensemble CAM variables in daily output, which is about 47 different variables and 730 time steps for each variable, so we have several thousand samples.
K
And so, as I mentioned, currently we're taking a sort of brute-force approach to finding what the optimal compression level is, and I was going to go into just a little more detail to show exactly what that means. So, on the right, each column of this plot represents a single variable, and then, as we go down the plot, we are increasing the data SSIM, which is the visual similarity of the data set that we're working with.
K
So as we cross each of these thresholds, say the aggressive threshold, meaning our data is considered similar under the aggressive DSSIM threshold, which is 0.95, at that point we take the very next compressed data set. So in this case, for LHFLX, we're looking at zfp with precision 12, and that's our optimal compression level according to the data SSIM. And as we go on, we also find optimal compression levels using different thresholds for the DSSIM, the middle-ground and conservative thresholds, and this is sort of generic for each of our thresholds.
K
We repeat this process also for the correlation coefficient, the spatial relative error, and the Kolmogorov-Smirnov test, and we repeat it for every variable and every time slice to come up with the ideal compression settings for a single variable, and we end up taking the most compressed setting that passes those metrics.
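Schematically, that brute-force selection is just a loop like the one below, shown here with fake "compressors" and placeholder thresholds rather than the actual settings and metric values used in this work:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(3)
original = rng.normal(size=(192, 288))

# Fake "compressors": keep a given number of decimal digits, standing in for
# zfp/bit-grooming/SZ settings ordered from most to least aggressive.
settings = [1, 2, 3, 4, 5]
compress = lambda data, digits: np.round(data, digits)

def passes_all_metrics(orig, recon):
    # Placeholder thresholds: bounded max error and no detectable
    # distribution change according to the Kolmogorov-Smirnov test.
    return (np.max(np.abs(orig - recon)) < 5e-3 and
            ks_2samp(orig.ravel(), recon.ravel()).pvalue > 0.05)

chosen = next((s for s in settings
               if passes_all_metrics(original, compress(original, s))),
              "lossless")
print("chosen setting (digits kept):", chosen)
```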
K
K
So this requires selection of features from these data sets; we're going to extract these features from the data sets and use them as input to the model, and we extract them from the uncompressed data sets, because we don't want to have to compress the data at several different levels first, which sort of defeats the whole point. It also requires selection of appropriate models and model setups for this type of classification problem.
K
So there are a lot of existing classical statistical models that we can apply to this sort of problem. We have random forest and boosting models, which are sort of shown on the right here; they are sort of a branching tree that starts at the top and works its way down,
K
applying some rule to the input features and, as we go down to the very end leaves of the tree, giving us a label for what our data set's optimal compression level or compression algorithm will be. There are also a few others:
K
there are support vector machines, linear discriminant analysis, deep learning, and aggregate models that look at the results of several other models to come to their conclusions, by taking the mode compression level or something like that. And we also have other models, such as convolutional neural networks, that are sort of implicit-feature models and discover the features on their own.
K
So the explicit-feature models require us to come up with features for our data, which are used directly by the model to infer a label. So far, for this sort of crude first pass at creating these models, we've come up with eight different features that are somewhat useful in helping us predict the optimal compression level. Those include the mean, the variance, the north-south contrast variance, the east-west contrast variance, and so on.
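A stripped-down sketch of that explicit-feature classification idea, with fake data, made-up features, and made-up labels, just to show the shape of the approach:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)

def simple_features(field):
    # A few cheap summary statistics of a 2D field (illustrative only).
    return [field.mean(),
            field.var(),
            np.abs(np.diff(field, axis=0)).mean(),   # north-south contrast
            np.abs(np.diff(field, axis=1)).mean()]   # east-west contrast

# Fake training set: low-variance fields labeled "aggressive", others "conservative".
fields = [rng.normal(scale=s, size=(64, 128)) for s in rng.uniform(0.1, 2.0, 200)]
labels = ["aggressive" if f.var() < 1.0 else "conservative" for f in fields]

X = np.array([simple_features(f) for f in fields])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)

new_field = rng.normal(scale=0.3, size=(64, 128))
print("predicted setting:", clf.predict([simple_features(new_field)])[0])
```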
K
K
We sort of designed it with compression in mind, but it's not specific to that use case, and it allows us to perform all of this analysis, to generate these features and to compare differences between the data sets, and it also allows us to do things like plot the data, to give us a better sense of how the original data compares to the compressed data. And, as I said, I developed this along with Allison, and Dorit has been really helpful in providing guidance for this package as well.
K
For each of these models, we run a parameter sweep over several possible values for the model parameters, which vary depending on what particular model you're looking at.
K
We can come up with an idea of which parameters for that specific model are performing the best, and once we know the ideal parameters for each model, we can test each of our models using a left-out set of testing data, which includes different variables that have different features from the original training data. And we don't have great results yet, mostly because the features we've chosen, as I said, are sort of crude and don't quite give us perfect separation between the classes that we're trying to predict.
K
But we expect that as we try our convolutional approach we might come up with some more features. So the challenge here is to find features that hold generically across these data sets, which may look entirely different: in this case, these two data sets are optimally compressed using the exact same compression settings, but it's hard to tell, just visually, what it is about these two data sets that makes them optimally compressed using the exact same settings. And so for that we are going to try a convolutional approach.
K
So, as I said, the explicit-feature models are simple and fast and easy to interpret, but we need to find better features that we can feed into these models, and the nice part about a convolutional network is that it discovers features on its own through the process of training the network, and we can use those new features to feed back into the explicit models to improve them.
K
And, as a sort of quick review of the convolutional neural network approach: we have some input data set; in this case we could potentially input the entire data set at a single time slice.
K
We convolve that using a series of filters, or kernels, to come up with intermediate levels, which are called feature maps, which can represent different aspects of the image. So you might have a kernel that selects for edges in the image, like horizontal, vertical, or diagonal edges; you might have kernels that smooth the image; and whatever features arise can be seen in these feature maps, which are then condensed to sort of summarize
K
the overall presence of the features in a region of the map. That process is repeated, and at the end we use the generated features in a dense neural net, like normal, to predict the optimal compression level or settings.
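A minimal sketch of that kind of network in Keras; the layer sizes, input shape, and number of output classes are placeholders rather than the architecture actually used in this work:

```python
import tensorflow as tf

n_classes = 5   # e.g. candidate compression levels (placeholder)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(192, 288, 1)),                   # one 2D field per sample
    tf.keras.layers.Conv2D(16, 3, activation="relu"),       # single hidden conv layer
    tf.keras.layers.MaxPooling2D(2),                        # condense feature maps
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),            # dense head "like normal"
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```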
K
So at this stage the model is not very sophisticated. We're subsampling these images by a factor of 36 to reduce the time to fit the model, and we're using the image data directly, and, as I showed before, if our data looks completely different, it's not surprising that the CNN can't pick up on features of the data that are maybe not visible from just looking at the raw data.
K
We're also modeling only one step of this process, so we're only modeling either selection of the optimal compression algorithm, or the optimal compression parameters conditional on a single algorithm; we're using a single hidden convolutional layer and a small collection of CESM daily variables; and we're not including any of the explicit features.
K
K
Looking at how this model comes up with features, though, as I said, may give us insight into how we can improve the explicit-feature models later on. So the end goal here for the convolutional net is to have a sort of two-stage model, where we have one model that first selects the optimal compression algorithm, SZ, zfp, or bit grooming, and then, conditional on that, we have a separate model, which may or may not be a convolutional neural net,
K
that selects the actual optimal parameters within that given compression algorithm. We'll be using some interpretability techniques, such as saliency maps, to investigate and improve the other models, or we may end up finding that the CNN approach itself is the most ideal approach; we're not really sure yet. And we're also going to try pre-processing the input data, so instead of looking at the raw data we might use something like the gradient fields of the input data
K
as our input. And a quick summary of sort of how this relates to CESM: we're hoping to make this an automatic process that happens whenever you run a simulation, so there's just one automatically generated configuration file, like a YAML or a JSON, that supplies the compression settings. It'll have some preset default settings, and it only really needs to be edited if space or accuracy of the data are of special importance; otherwise, the user doesn't really need to do anything special. The compression will just run, sort of under the hood, and nothing else needs to be done. And that's all I have, so thanks for listening.
B
All right, thanks very much for that great talk, Alex, and I do see one question already in the chat, from Brian Dobbins. Brian, do you want to ask your question?
J
Yeah, so, yeah, it really depends, Brian; it depends on the compressor. Some of them, like the zfp compressor that I showed results for, which is a transform compressor, are quite sensitive to the chunking. I usually do chunks as individual time slices. With some variables you can get better compression if you compress multiple slices at once; with some variables you get worse.
J
The same is true with 2D and 3D variables: with 3D variables, does it make sense to compress them as a 3D chunk, or to do 2D slices? Again, that depends on the variable. I mean, compression really works by finding patterns in the data, so for variables that have good correlation it makes sense to do them in big chunks, but variables like cloud fraction are extremely difficult, and for those it makes sense to do smaller chunks. So, yeah, did that answer your question?
F
Right, correct, I guess, yeah. I'd also be interested in seeing sort of what the difference rates are between the different methods for a variable, and, like, different chunk sizes, because if it's relatively minor, then picking something that performs well might be good enough. But yeah.
J
J
It's usually minor, except that there are some variables that are just hard. I mentioned cloud fraction because it's a fraction, so all the values are between zero and one, but the smallest non-zero value is, like, order ten to the minus twelve, I think, so there's a huge range in the data, and stuff like that is hard for the compressors.
G
Rich here. Yeah, I was just going to respond to Brian that I put a message in the chat: Haiying, about a year ago, did some chunking of high-resolution MPAS data and then parallelized those chunks, to look at the scaling and performance of parallel compression, and I can share her poster on that topic if you'd like to look at it.
B
Thanks, thanks for that comment. I have a question going back to, I think it was, your previous slide here, on how we would apply it to CESM. So, can you help? I'm having a little trouble understanding this. Is the idea that, sort of, based on what you find from this, we would kind of pre-specify, for different variables, the optimal compression technique for that variable, or is there something else?
K
Yes, so I guess, well, maybe Allison wants to add something here, but I guess my understanding is, yeah: if we know what the optimal compression settings are for that variable, then they would sort of automatically be applied, but otherwise we would apply our model to assess what we predict to be the optimal compression settings, and we would have some error checking to account for that.
B
J
If there is... I guess... oh, can I comment? Yeah. So I guess what I'm envisioning is: you know, you have options for all the possible output variables, so hopefully, with this tool, we build some kind of database with, you know, all these possible options, and then, depending on how someone configures their CESM run and which variables they want to output, we create this file knowing that, okay, for this variable they're using this grid resolution and they're doing daily output.
J
B
So, extending that to the scenario where someone introduces a new output variable, or a time frequency that's different from something that you've exactly trained your model on, do you have a vision for that? Like, would people find a similar variable to specify, or would it be simple enough for a user to run this tool on the new variables?
J
J
K
That is sort of, I guess, the whole point of the model: if we introduce a new variable, then we'll be able to accurately predict what the optimal compression settings for that new variable would be.
D
Right, well, so this is sort of a tangential point, but I guess I'm trying to understand where in the process this would go, because, I mean, currently we're limited to writing out single history slices of our variables, and I assume your compression algorithms operate on the time-series output, so that's still an extra stage. So I would assume you would plug this sort of thing in along with the post-processing tool that generates the time series.
D
J
I mean, I guess my thinking on this is that the very first thing I'd like to do is start with data sets that are sitting out there taking lots of space, like some of the LENS data sets, and apply post-processing to them. But certainly, going forward in the future, I would hope that we can write the data directly to time series in that compressed format, unless the user specifically says don't compress it.
D
The concern I have, I guess, with it is that I think what you're saying is that you have to figure out the optimal mechanism for each variable, and also for each resolution of that variable, and, I guess, potentially, like, if you run the same variable for paleoclimate, you might need something different than for future climate, right? Or is that right? Do you think the time representation might matter, as well as the resolution?
J
I think Brian put something like that in the chat, Brian Dobbins: like, you could just go with what you think is a good-enough option, or you could say, "I want to be pretty conservative." And of course there would always be the option to say, "Well, these three variables I want losslessly compressed; I don't want to lose any information, because I'm going to study those to death; but the other 300 I'm not going to look at that closely, so go ahead and compress those a lot."
J
From Gary Strand: with the CESM LENS1 data, everyone downloads the same 10 variables and looks at them, and the other 300 are just sitting there taking up space. So we could consider that, okay, for these 10 that everyone really looks at we'll go very conservative, but for the 300 that people don't really look at we're going to go more aggressive.
D
Yeah, that makes perfect sense; thanks for adding that in. I'm just thinking about all these hundreds of variables that are out there and trying to figure out the optimal thing for them, but I think you're right that often people don't care; they only care about a few variables and not everything. So yeah, awesome, thanks for that.
B
A question kind of following up on that and on Brian's comment in the chat asking about default options: can you give some sense for how bad... you know, let's say we wanted to get started on this sooner rather than later, given the importance of this; how bad would things be if we did try to pick something sort of default for a lot of variables that was maybe on the more conservative side?
B
K
That sort of goes back to that figure that Allison had in her presentation, doesn't it, about the amount of compression we can get. There's sort of a baseline level of compression we can get for most of the variables of about two to one, and I think if we compressed all the data at that same conservative level, it would be reasonable to expect that sort of saving.
B
Like, just arbitrarily, could you say something like, oh, we can at least get an extra two to one by using some conservative threshold for lossy compression that works for every variable, or 99% of variables, or something like that? Or is there just too much spread in the lossy results to be able to do something like that?
J
I think on average you could certainly say that. On that plot Alex was talking about, my quote-unquote conservative one had about an additional 2x improvement over the lossless one. All these compressors have different ways of controlling the error too. In theory a scientist could just say, well, if the relative error between the original and the compressed data is, you know, 0.001 on every variable, I'm good with that.
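A small illustrative Python check of the kind of pointwise relative-error tolerance just mentioned; the 0.001 threshold comes from the example above, while the array shapes and synthetic data are placeholders:

    import numpy as np

    def max_relative_error(original, compressed, eps=1e-30):
        """Largest pointwise |orig - comp| / |orig|, guarding against zeros."""
        original = np.asarray(original, dtype=np.float64)
        compressed = np.asarray(compressed, dtype=np.float64)
        denom = np.maximum(np.abs(original), eps)
        return float(np.max(np.abs(original - compressed) / denom))

    # Synthetic data standing in for an original field and its lossy version.
    field = np.random.rand(192, 288) + 1.0
    lossy = field * (1.0 + 1e-4 * np.random.randn(*field.shape))
    err = max_relative_error(field, lossy)
    print("max relative error:", err, "within 1e-3 tolerance:", err <= 1e-3)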
J
K
I was actually just trying to remember if there is sort of a lower limit to how much additional compression we can get. There might have been a couple of variables where we couldn't get any additional improvement over the lossless result, but I think for the most part you could definitely compress to at least, say, one and a half times over the lossless. I'm not sure you can get to two for every variable, though, but there's definitely some kind of lower bound there.
J
J
Probably, that's probably good, yeah. And Brian Dobbins commented in the chat, as far as making this practical: I recently have been working with Brian, because obviously for cloud data the impact of data storage is immediately obvious in terms of your bill for how much space you're taking up, so Brian and I are starting to work together on some of this compression. I hope the more people can play around with it, the more comfortable people will get, and the more feedback we'll get from the community, the better.
F
Not yet, yeah, I'm still very early in this. My goal is... so CESM has this workflow capability that Jim Edwards has added, and we could have a workflow that calls the container that does this, but I'm very, very early in the planning with this. I've been able to test the NCO stuff in the container on some existing time series data, but I want to get the time series generation into that as well.
F
So that would be a full, singular solution; there's still some work to be done, but as soon as I get time and make a bit of progress, I'll coordinate with Allison and anybody else who's interested. Maybe we can get some eyes on it from scientists and get it into workflows for testing.
B
Yeah, I just have a naive question. Well, I have two naive questions. First, with all of these compressors, do all the standard tools still work, or is there anything the end user needs to have enabled?
J
So with using the compressors through NetCDF filters like we are, it will just be as if you're reading a losslessly compressed file now. So as long as you have the right NetCDF/HDF5 installation, and obviously we would make sure the right one was on Cheyenne, then it won't be any different for you.
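For example, with the netCDF4 Python bindings (one possible reader; the file and variable names below are placeholders), reading a filter-compressed file looks just like reading any other NetCDF-4 file, and you can inspect which filters were applied:

    from netCDF4 import Dataset

    # Placeholder file/variable names; any NetCDF-4 file is read the same way.
    with Dataset("history.compressed.nc", "r") as ds:
        temp = ds.variables["TS"]
        data = temp[:]              # decompression happens transparently on read
        print(temp.filters())       # which compression filters were used
        print(temp.chunking())      # chunk sizes chosen when the file was written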
B
H
B
And then my second question is: say someone has a pile of external hard drives full of hundreds of gigabytes of NetCDF files. What's the first step in just grabbing these compressors and compressing away?
J
Well, for sure you should already be using lossless compression through NetCDF, and then the easiest one to use that I mentioned was Charlie Zender's method, because that's simply available through the NCO tools. I wrote a document on how to do this and shared it with Brian, and I'm happy to share it with you, but it's fairly straightforward.
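As one illustration of the general quantize-then-compress idea, here is a rough Python sketch using the netCDF4 library rather than the NCO command line the speaker refers to; the file names and the choice to retain about three decimal digits of precision are assumptions for the example, not settings from that document:

    import numpy as np
    from netCDF4 import Dataset

    with Dataset("field_in.nc") as src, Dataset("field_out.nc", "w") as dst:
        dst.setncatts({a: src.getncattr(a) for a in src.ncattrs()})
        for name, dim in src.dimensions.items():
            dst.createDimension(name, None if dim.isunlimited() else len(dim))
        for name, var in src.variables.items():
            lossy = np.issubdtype(var.dtype, np.floating)
            out = dst.createVariable(
                name, var.dtype, var.dimensions,
                zlib=True, complevel=4, shuffle=True,
                # Keep ~3 decimal digits (the lossy step); floats only.
                least_significant_digit=3 if lossy else None,
            )
            out.setncatts({a: var.getncattr(a) for a in var.ncattrs()
                           if a != "_FillValue"})  # _FillValue is set at creation
            out[:] = var[:]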
B
J
Then for the more, like, DOE compressors, SZ and ZFP, you do have to have a special build of NetCDF/HDF5, as Jim pointed out in the chat. And Jim also pointed out that you won't have to use NCO for Charlie Zender's tool with the next NetCDF release.
J
It's actually going to be a part of that. So I like that one: you might not get as good compression rates, but it's very simple to use, very easy to understand, and all your tools are going to work automatically. So it has some really great advantages, and I think it's a good place to start. Sounds good, thank you.
B
You know, we've been talking about compression of output data, and I see that there's an interesting comment from mattvay in the chat. Do you want to unmute and share your thought?
D
If you want to run, like, 1850 to 20-something, yeah, to the...
B
So yeah, the question being: would it be feasible for us to compress all the atmospheric forcings and other input data to CESM, so that people can get compressed input data? Yeah.
E
B
It makes sense to me. I don't know if others, like Jim or Brian, have thoughts on that, or Allison and Alex, if you've looked at all at input data rather than output.
F
Yeah, I'd be curious to hear. I don't even know, to be honest, whether we use any compression on our input data now, but I'm very interested in this because, again, with the cloud the charges for transfers are related to the size of the data. So if we can use compressed data, it's faster, it's cheaper; that's my interest in this. There are questions
F
that I think aren't answered yet. Allison and I have talked about doing a run with restarts compressed to conservative values, to see what the variance is relative to uncompressed runs. Because if you're talking input data, you know, lossless... Jim just responded that input data is not currently compressed, not even lossless. Maybe there's something we could do there, but then you maybe don't have that backwards compatibility. But lossy compression would be really interesting, to see if there are areas where that's even viable. I don't know.
J
I've kind of steered away from even suggesting using lossy compression on restart files or input files, because my feeling is that's a step beyond accepting its use on the output files, and it will have a much greater effect on the data. So my feeling is: let's get it adopted for the output data first. But yeah, in the future it would be fun to look at, though, Mariana.
C
That's a great question about input data. The question I have is: say we start compressing input data, do we get rid of the other input data? Because then you have two copies, and which one are you going to use? So it brings in a question of backwards compatibility, because...
B
We never overwrite our data, given our strong requirement for backwards compatibility. I think we'd have to do it for data sets moving forward, and could start maybe with some of these big atmosphere forcing data sets; at least for offline runs they're a big barrier.
C
I think that could actually make a huge difference in how we store it, even on Cheyenne. Forget about the web servers we have; some of these, particularly as we're going to higher and higher resolution... that could be a really big saving.
B
F
Yeah, on this note, I was just going to say, and this is where I don't know enough, but if we can read lossy compression with certain versions of NetCDF, and we can detect that when we reach out to the input data servers, we could possibly set up a secondary input data service; like I said, set up regional ones for the cloud that have the lossy compression, and default back to the lossless ones from NCAR if that's not available.
F
So I think there are ways to test this and sort of get the lossy stuff out, and maybe track stats as to how many people are still using older versions of NetCDF. You'd have to think of a way to do this, but I think there are some options there, and maybe that's the path forward. Yeah.
B
Cool. I have a kind of broader question for Allison and Alex, which is...
B
What is your sense of the level of support you get versus resistance from the community when you've talked to others about this? And, related to that, are there things that others from the software engineering working group can do to support this effort and help you see it through, so that we can get this adopted soon? Because I know it's an important thing.
J
I think certainly the fact that we are going to run out of storage soon has been very helpful in motivating people's interest. Gokhan has been extremely supportive; he's actually partially funded a lot of Alex's work, and I feel like people are definitely more interested. I'll say in honesty, I have had trouble getting CAM scientists to look at these data sets that I generated.
J
It's been about a year since I generated them. I have gotten more interest from some of the other groups, and now, with the advances in NetCDF compression, we're going to start looking at some of the other model groups' data. I definitely feel like people are interested, and yeah, it's just that we've been working on this for a long time, but it's really only recently that the technology is coming together so that it could be put into practice. And even so...
J
I think the SZ method will work with PnetCDF, but I don't think ZFP does yet, so there are still some issues to be worked out as far as doing it in parallel. But I guess I'm still just aiming for kind of the low bar of compressing some of the data sets that are sitting out there, not even worrying about doing it in real time or for new things; let's go after some of the old stuff that's taking up space. That's my thought, but certainly, I mean...
J
I think that this group can definitely make it easier to use lossy compression, maybe make it the default, maybe work with the scientists to help them get experience with the data, so that, again, it's not this scary, unknown thing. I want it to be part of, you know, when they sit down to think about which variables they're going to output. And, as Rich mentioned in the comments, that's obviously the best type of compression: to stop outputting stuff that you don't look at.
J
B
Okay, thanks. I see there are a lot of questions and comments in the chat, so let me try to summarize them, and for the people who asked them, feel free to jump in if I'm misstating these. Cheryl Craig asks about bit-for-bit comparisons, and I guess this gets to the reproducibility of the lossy compression algorithm itself.
B
So if we have a baseline where we've run lossy compression to store that baseline, and then we rerun the code and apply the same lossy compression algorithm and parameters, will the results still be bit-for-bit with the baselines?
J
As far as I know, if you're using the same versions of everything, it should be; there's no inherent randomness in these compressors that we're using, at least.
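A trivial sketch of that kind of bit-for-bit check, assuming two files produced with identical compressor versions and settings (file and variable names are placeholders):

    import numpy as np
    from netCDF4 import Dataset

    def bit_for_bit(path_a, path_b, varname):
        """True if the stored values of one variable are identical in both files."""
        with Dataset(path_a) as a, Dataset(path_b) as b:
            va, vb = a.variables[varname], b.variables[varname]
            va.set_auto_maskandscale(False)   # compare raw stored values
            vb.set_auto_maskandscale(False)
            return np.array_equal(va[:], vb[:])

    print(bit_for_bit("baseline.nc", "rerun.nc", "TS"))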
B
Okay, good. And Dave Bailey asked, what's the latest on uncompressing to read, in terms of CPU time? So I guess, Dave, you're asking: how long does it take to uncompress these files?
J
So doing it through NetCDF, I haven't noticed a difference between reading a losslessly compressed NetCDF file and one that's lossy compressed. It's being taken care of by NetCDF, and most of your data, I think, is already losslessly compressed, so I don't think you're really going to notice a difference beyond whatever difference you already experienced when you went from the original to the lossless.
D
J
G
And last year, before I retired from NCAR, we looked at the cost of two different algorithms, one of which was ZFP: what is the overhead of the compression and decompression step relative to the overall time it takes to do the I/O? And it's algorithmically dependent.
G
G
So it's actually that if you have a nice decompression algorithm, for example, your read times will be significantly lower if the file is smaller. But the important thing is, we never tested this on,
G
you know, every variable with, let's say, all the different algorithms that I know of that were listed by Allison, so some may be more expensive than others, and it's worth looking at the overhead: like, per petabyte, how many core hours do you actually have to spend decompressing? And, David, I could share that poster with you so you could kind of see what I'm talking about.
G
I think another subtle thing that crops up is, if you try to do this stuff in parallel and you have a block size, I think the reproducibility issue might crop up if you don't do it with the same block size in two different runs, and I don't know the answer to that.
J
That's true, it definitely depends on the block sizes; everything really has to be exactly the same. So usually, when I do compression, I specifically pick the chunk sizes for NetCDF so that I know what they are, and I don't leave it up to the algorithm to pick.
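For instance, with the netCDF4 Python bindings you can pin the chunk sizes explicitly when creating a compressed variable instead of accepting the library's defaults; the dimension names and sizes here are just illustrative:

    from netCDF4 import Dataset

    with Dataset("out.nc", "w") as ds:
        ds.createDimension("time", None)   # unlimited
        ds.createDimension("lat", 192)
        ds.createDimension("lon", 288)
        # Explicit chunking: one time slice per chunk, full horizontal field.
        ts = ds.createVariable(
            "TS", "f4", ("time", "lat", "lon"),
            zlib=True, complevel=4, shuffle=True,
            chunksizes=(1, 192, 288),
        )
        print(ts.chunking())   # -> [1, 192, 288]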
G
Yeah, and I think that's unfortunate. We try to stay away from that, say with MPI within the models, where you can run with different rank counts and you won't get a different answer; but it just may not be possible here. That's kind of deep in the algorithm, though; there might be a way to patch it up somehow.
J
Yeah, and Rich brings up a good point, Dave, that it really does depend on the algorithm. Charlie Zender's Bit Grooming algorithm will take absolutely the same time to read data compressed with it as with regular lossless compression, because it's basically a pre-conditioning type filter, so the file won't look any different and it will take the exact same amount of time; whereas with ZFP and SZ there's a little additional work that goes on. So all of this is very algorithm-dependent as well.
D
Jim raised the issue about HDF5 parallel performance versus PnetCDF parallel performance, and I guess my thinking on this is, you know, this is not really an offline process. This is an online thing where we're going to be initializing the model; we're going to be reading in grid information, initial data sets, potentially restart files, things like that. So this is all about the online tools that we have available, and so we need to build it into whatever NetCDF reading and writing we're doing within the model.
J
One thing I think we have to keep in mind too is, like Rich said, if the file size is smaller, it's going to take less time to read. And the other thing, beyond that, is we're really at the point where storage is so much more precious than CPU hours, so part of me wants to say, who cares if it takes five extra minutes at the beginning of the run if you're going to produce 30 fewer terabytes of data? I mean, really, who cares, you know? So...
J
D
The people who care about this are, like, the data assimilation people, you know, who are going to be writing out frequent restarts, and they're going to really care about that sort of performance. Yeah.
D
D
G
Yeah, I was just going to add that, with the experiments that we did with the parallel compression, we looked at a cost comparison using some sort of figures of merit for the cost of disk space versus the cost of CPU hours, and it comes down decidedly on the side of data storage.
G
How long would you have to store the data, and how much does reserving that amount of storage for that length of time cost compared to the CPU cost? Because data storage is more like renting an apartment, and CPU hours are ephemeral, right? So when you look at it that way, the retention time at which compressing the data breaks even from a cost perspective is very small. I forget what the number was, but it was on the order of 48 hours.
G
If you're going to keep the data for more than 48 hours, you probably should compress it. But this is again with one algorithm and a particular data file; we were using single very high resolution history files, like a 178 gigabyte file from MPAS, so your mileage might vary. But I didn't get the sense that Allison is wrong, you know, that who really cares about the CPU costs.
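A back-of-the-envelope sketch of that break-even calculation; every number below (storage price, core-hour price, compression cost, compression ratio) is a made-up placeholder rather than a figure from the poster being mentioned:

    # Hypothetical figures of merit, for illustration only.
    file_tb           = 0.178   # file size in TB (e.g. a ~178 GB history file)
    compression_ratio = 2.0     # assume 2:1 compression
    core_hours        = 2.0     # assumed core-hours to compress the file
    usd_per_core_hour = 0.05    # assumed price of a core-hour
    usd_per_tb_day    = 0.70    # assumed price of keeping 1 TB on disk for a day

    tb_saved         = file_tb * (1.0 - 1.0 / compression_ratio)
    saving_per_day   = tb_saved * usd_per_tb_day
    compression_cost = core_hours * usd_per_core_hour

    break_even_days = compression_cost / saving_per_day
    print(f"break-even retention: {break_even_days:.1f} days")
    # Past this retention time, compressing is cheaper than storing uncompressed.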
B
B
We officially have a minute or so left, but there was a lot of activity in the chat. Do people have questions or comments that they still want to bring to the whole group, either on this or on stuff we might have missed from Jesse's earlier talk, in the last few minutes?
F
Yeah, maybe building on Jim's question here, Allison: are you guys in touch with any of the people that are doing PnetCDF? Do you see a potential for compression in that? Is that the same stuff you're doing? What's the path forward there?
J
J
But more recently, I know that the SZ folks out of Argonne have started working with the PnetCDF people. I went to their GitHub page the other day and there wasn't a lot there, so I don't know what the status is, but I think it's only good for us if the DOE wants it there, because they probably do have the money to get some of this in there.
D
I can only say I don't know either, but I do know that E3SM gave the PnetCDF developers quite a bit of funding in the last couple of years, and I think this is on their task list.
J
I think the algorithm developers are pretty motivated for this to happen; I feel like both the ZFP and SZ developers would be thrilled if we said, oh, we're going to use your algorithm to compress all our data, so we do have some leverage there. But unfortunately, yeah, somebody has to do the work. I'm optimistic that all this is going to come around as more and more simulation groups are realizing they need to use lossy compression, so I'm optimistic about it, but yeah.
B
B
All right, well, thank you to all the speakers today, and thanks, all, for this very interesting and useful discussion. Clearly there's a lot of interest in the lossy compression, and it's exciting to see this continue to move forward. So this is where we are officially closing the session, but some of us will be staying on for the next hour, in which we try to recreate the experience of chatting and having lunch together, so please feel free!
B
If you can, stick around and we can chat. I think we're going to set up breakout rooms for that, if we can figure out how to do that, so that you can get together with someone you wanted to catch up with, if you'd like. So yeah, we're...
F
B
Well, we could stick here, and then, if people want, you can let me know and we can set something up. Cool.
B
I'm going to personally step out for just a few minutes to get myself a little bit of food and take a five-minute break, but I will be back shortly, and others, feel free to stay on and chat.