Description
Ralph Kube from PPPL talks about machine learning in fusion research
A: Okay, beautiful. So my name is Ralph Kube, I'm a research physicist at the Princeton Plasma Physics Laboratory, and I will talk a little bit about machine learning in fusion energy sciences.
A
So
when
I
was
asked
to
do
this
presentation,
I
really
had
to
go
back
and
think
what
I
was
going
to
present,
because
I
certainly
am
a
machine
learning
practitioner
in
fusion
energy
sciences,
but
it
is
quite
a
large
field.
A: So to illustrate the breadth of the field, I took all the abstracts from the meeting and generated a word cloud, stripping away all filler words like "and", "with", etc. How large a word appears in this plot gives you some sense of the relative frequency with which these words appear. First, let me say that fusion energy sciences can be roughly separated into magnetic confinement fusion and inertial confinement fusion, and this mini-conference covers both, so this word cloud gives you an overview of both.
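(As an aside, such a word cloud is easy to reproduce. Here is a minimal sketch, assuming the abstracts sit as plain-text files in a local directory; the path and the `wordcloud` package are my illustration, not necessarily what was used for the slide.)

```python
# Minimal sketch: build a word cloud from conference abstracts.
# Assumes abstracts live as plain-text files in ./abstracts/ (hypothetical path).
from pathlib import Path

import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS  # pip install wordcloud

# Concatenate all abstracts into one string.
text = " ".join(p.read_text() for p in Path("abstracts").glob("*.txt"))

# Strip filler words ("and", "with", ...) via the built-in stopword list;
# word size in the rendered image then reflects relative frequency.
wc = WordCloud(width=1200, height=800, stopwords=STOPWORDS,
               background_color="white").generate(text)

plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
```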
A: Certainly "neural network" appears very large; that reflects the fact that most machine learning practitioners in fusion energy sciences do deep learning in all kinds of varieties. That can be multi-layer perceptrons; you see the phrase "convolutional neural network" popping up; you see "deep neural network". There are some people who do time series modeling with recurrent neural networks. And then, coming more from the physics side, you see the big phrase "simulation", and that reflects the research thrust in the field to replace expensive numerical simulations with machine learning surrogate models.
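(To make the surrogate idea concrete, here is a minimal sketch in PyTorch: a small multi-layer perceptron fit to input/output pairs produced by an expensive simulation. The shapes, network size, and random stand-in data are all hypothetical.)

```python
import torch
import torch.nn as nn

# Hypothetical training set: simulation inputs (e.g. plasma parameters)
# mapped to simulation outputs (e.g. a transport coefficient).
x = torch.randn(10_000, 8)   # stand-in for expensive-code inputs
y = torch.randn(10_000, 1)   # stand-in for expensive-code outputs

model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)   # fit surrogate to simulation data
    loss.backward()
    opt.step()

# Once trained, model(x_new) stands in for the expensive simulation call.
```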
A
That's
certainly
an
aspect,
many
people
look
into,
and
then
you
see
many
smaller.
You
see
many
smaller
terms.
For
example,
anomalous
transport
coefficients
beam
emission
spectroscopy
disruption
prediction,
inner
shell
target
and
those
are
all
terms
that
reflect
the
different
physics
modalities
so
to
say
that
come
that
that
are
important
from
the
from
the
exp
from
the
experiments,
and
you
also
see
some
smaller
words,
for
example,
gaussian
process
regression.
A: Okay, I'm personally working on magnetic confinement fusion, so, coming from the experimental side, I thought I would motivate a little bit what kind of data we are working with, because when you say machine learning, you certainly look at the data sources.
A: So here on the top left, you see a tokamak, a machine used for magnetic confinement fusion. This is an illustration of the upcoming SPARC tokamak, one of three tokamaks that are currently under construction in the US. And then in the top right, I just took some random visualizations of data that are measured in plasma discharges. As you can see, all these five plots look nothing alike, and that is because the measurement data we sample in fusion plasmas varies by a lot.
A
There
are
different
diagnostics,
which
are
all
sensitive
to
different
physics
and,
for
example,
we
have
magnetic
fluctuations
here
shown
a
spectrogram.
We
have.
A
There
are
diagnostics
and
sort
that
are
sensitive
to
electromagnetic
radiation
in
all
kinds
of
the
spectrum,
for
example,
visible
light
infrared
which
targets
the
material
walls
cyclotron
radiation,
which
is
kind
of
micrometer
range
and
they're.
Also,
for
example,
diagnostics
that
are
based
on
particle
flux,
so
yeah,
a
broad
array
of
diagnostics,
is
used
to
sample
the
various
aspects
of
the
plasma
and
we
use
those
to
try
and
reconstruct
the
plasma
state.
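(As an aside, a spectrogram like the one shown for the magnetic fluctuations can be computed with standard tooling. A minimal SciPy sketch, with a synthetic chirping signal standing in for real probe data; the sampling rate is made up.)

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

fs = 1e6                               # hypothetical 1 MHz sampling rate
t = np.arange(0, 0.1, 1 / fs)
# Synthetic stand-in for a magnetic fluctuation signal: a chirping mode.
sig = np.sin(2 * np.pi * (50e3 + 5e5 * t) * t) + 0.1 * np.random.randn(t.size)

f, tt, Sxx = spectrogram(sig, fs=fs, nperseg=1024)
plt.pcolormesh(tt, f / 1e3, 10 * np.log10(Sxx))   # power in dB
plt.xlabel("time [s]")
plt.ylabel("frequency [kHz]")
plt.show()
```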
A: Now these diagnostics also have different outputs. Some diagnostics just give a zero-dimensional quantity, for example a pressure sensor. Some diagnostics give us one-dimensional data, which would be a profile, or two-dimensional data. And in addition, many of these diagnostics, either due to the physics they are sensitive to or due to electronic limitations, will sample on different time scales.
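(One practical consequence of these mixed sampling rates is that signals usually have to be brought onto a common time base before a model can consume them together. A minimal sketch using NumPy interpolation; the rates and signals are made up for illustration.)

```python
import numpy as np

# Two hypothetical diagnostics sampled at very different rates.
t_fast = np.arange(0.0, 1.0, 1e-6)    # e.g. fluctuation probe at 1 MHz
t_slow = np.arange(0.0, 1.0, 1e-3)    # e.g. pressure sensor at 1 kHz
fast_sig = np.random.randn(t_fast.size)
slow_sig = np.random.randn(t_slow.size)

# Common time base for a combined feature matrix (here: 10 kHz).
t_common = np.arange(0.0, 1.0, 1e-4)
features = np.column_stack([
    np.interp(t_common, t_fast, fast_sig),
    np.interp(t_common, t_slow, slow_sig),
])
print(features.shape)   # (10000, 2): one row per common time point
```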
A: So the plot on the lower right here kind of illustrates what kind of length and time scales we are dealing with in fusion plasmas. Time scales range from less than nanoseconds for wave physics to seconds for macroscopic transport, and likewise for the length scales. That's over 10 orders of magnitude on the x axis, and then on the y axis we have time scale differences between seconds and sub-nanosecond scales. So there is a lot of variability, and from this variability also comes the fact that machine learning tasks vary a lot in fusion plasmas.
A
So
let
me
just
quickly
illustrate
three
kind
of
machine
learning,
research
topics
that
are
popular
at
pppl
and
since
we
are
associated
with
princeton,
also
at
princeton
university
on
the
left,
you
see
an
illustration
that's
supposed
to
be
plasma
control
and
here-
and
that
is
exactly
what
the
name
says.
If
we
have
a
discharge,
we
have
a
real-time
feedback
system
that
tries
to
steer
the
plasma
away
from
unstable
configurations,
kind
of
kind
of
just
keep
it
confined,
and
there
are
some
there
are
some.
A
There
are
some
research
going
on
how
machine
learning
can
be
used
to
optimize
the
feedback
system
so
that
the
plasma
plasma
control
system
is
able
to
operate
more
efficiently?
Keep
the
plasma
better
on
time.
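(Schematically, such a feedback loop pairs diagnostics with actuators through a learned policy on every control cycle. The sketch below only illustrates that structure; the function names and the toy policy are placeholders, not PPPL's actual control system.)

```python
import numpy as np

def read_diagnostics() -> np.ndarray:
    """Placeholder for real-time diagnostic acquisition."""
    return np.random.randn(16)

def apply_actuators(cmd: np.ndarray) -> None:
    """Placeholder for coil-current / heating / fueling commands."""
    pass

def policy(obs: np.ndarray) -> np.ndarray:
    """Stand-in for a trained model mapping plasma state to actuator commands."""
    return -0.1 * obs[:4]   # toy proportional response

# Real-time feedback loop: steer the plasma away from unstable regions.
for cycle in range(1000):          # one iteration per control period
    obs = read_diagnostics()
    apply_actuators(policy(obs))
```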
A: And we are running some simulations on, yeah, the big computers like Cori and...
A: "Daddy has to talk, you guys." I'm sorry, I'm having a daycare situation today. Okay, right. So we have the XGC code, which is very compute intensive, and in preparation for running more physically accurate models, we would like to replace some compute-intensive parts of the code with machine learning surrogate models.
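(Conceptually, once such a surrogate is trained, the expensive kernel is swapped for a model evaluation at inference time. A hypothetical sketch: the kernel, the surrogate, and the data shapes are placeholders, not XGC internals.)

```python
import torch
import torch.nn as nn

# Hypothetical trained surrogate (in practice, loaded from a checkpoint).
surrogate = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
surrogate.eval()

def expensive_kernel(state: torch.Tensor) -> torch.Tensor:
    """Placeholder for the original compute-intensive physics kernel."""
    raise NotImplementedError

def cheap_kernel(state: torch.Tensor) -> torch.Tensor:
    """Drop-in surrogate evaluation replacing the expensive kernel."""
    with torch.no_grad():
        return surrogate(state)

# Inside the simulation's time loop, the swap is a one-line change:
state = torch.randn(128, 8)     # fake per-timestep inputs
result = cheap_kernel(state)    # instead of expensive_kernel(state)
```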
A: So when it comes to using NERSC infrastructure: NERSC provides a great infrastructure, and many of the machine learning practitioners I've talked to use the infrastructure and are very happy with it.
A
Some
things
we
are
very
happy
with
is
that
tensorflow
and
pi
charge
modules
are
readily
available,
also
in
modern
versions
and
a
typical
job
for
machine
learning.
Practitioners
are
smallish
problems;
they
don't
require
much
compute
time.
So
for
this
we
rely
heavily
on
jupiter
notebook
and
we
really
appreciate
the
gpu
support
we
have
in
there.
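(As a trivial illustration, a first cell on a GPU-enabled Jupyter kernel often just confirms the accelerator is visible before training starts; a common sanity check, not anything NERSC-specific.)

```python
import torch

# Sanity check at the top of a notebook on a GPU node.
print(torch.__version__)
print(torch.cuda.is_available())          # True on a GPU-enabled kernel
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # which GPU the kernel sees
```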
B: Five minutes to go.

A: Five minutes, okay. For logging, people use TensorBoard, and for larger jobs I know people use Ray Tune, Horovod, Cray HPO, and other tools that are available at NERSC.
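(For a flavor of what such a hyperparameter-search job looks like, here is a minimal sketch using Ray Tune's classic `tune.run` API; the objective is a dummy stand-in for a real training function.)

```python
from ray import tune

def trainable(config):
    # Dummy objective standing in for a real training loop.
    loss = (config["lr"] - 0.01) ** 2
    tune.report(loss=loss)

analysis = tune.run(
    trainable,
    config={"lr": tune.loguniform(1e-4, 1e-1)},  # search space
    num_samples=20,                              # trials to run
)
print(analysis.get_best_config(metric="loss", mode="min"))
```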
A: Some bottlenecks I personally have experienced: when we run large training jobs which require multiple terabytes of data, data loading becomes a bottleneck, and we have inspected this with performance tools such as TAU. But overall, we are actually very happy with the software that is provided.
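(A common first mitigation for this kind of data-loading bottleneck is to overlap I/O with compute using worker processes and prefetching. A generic PyTorch sketch, with a stand-in dataset in place of real multi-terabyte files.)

```python
import torch
from torch.utils.data import DataLoader, Dataset

class BigDataset(Dataset):
    """Stand-in for a dataset streaming samples from large files."""
    def __len__(self):
        return 1_000_000
    def __getitem__(self, idx):
        return torch.randn(256), torch.randn(1)   # fake sample

loader = DataLoader(
    BigDataset(),
    batch_size=512,
    num_workers=8,        # parallel worker processes hide read latency
    pin_memory=True,      # faster host-to-GPU transfers
    prefetch_factor=4,    # batches each worker keeps queued ahead
)
for x, y in loader:
    pass                  # training step would go here
```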
A: So I'm actually going to start wrapping up my talk now and talk about some new contenders, and that is basically using automatic differentiation in combination with scientific simulation. That paradigm is also known as differentiable programming, or "Software 2.0", and the idea is basically to use automatic differentiation to make arbitrary code amenable to gradient-based optimization, not only neural networks.
A
So
there
are
two
tools
that
I
gather
are
popular
among:
machine
learning,
practitioners
and
those
are
julia
and
jax.
So
julia
is
a
language
that
is
developed
from
the
scratch.
It's
very
young
automatic
differentiation
as
a
first
class
citizen,
and
the
entire
language
is
just
in
time
compiled,
so
it
runs
very
fast
and
jax
is
used
in
conjunction
with
python.
A: It can differentiate native Python and NumPy code, and it's also just-in-time compiled. But right now those languages are only used at a very experimental stage, not at a production stage, and it is still unclear how fusion energy sciences will incorporate this automatic differentiation with traditional HPC simulations.
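(To show what "making arbitrary code amenable to gradient-based optimization" means in practice, here is a tiny sketch differentiating and JIT-compiling a plain numerical function with JAX; the toy least-squares problem is illustrative only.)

```python
import jax
import jax.numpy as jnp

def loss(theta):
    # Any plain numerical Python: here a toy least-squares residual.
    pred = theta[0] * jnp.arange(5.0) + theta[1]
    target = jnp.array([0.0, 2.0, 4.0, 6.0, 8.0])
    return jnp.sum((pred - target) ** 2)

grad_loss = jax.grad(loss)       # exact gradient via autodiff
grad_loss = jax.jit(grad_loss)   # just-in-time compiled, as mentioned

theta = jnp.array([0.0, 0.0])
for _ in range(100):             # plain gradient descent
    theta = theta - 0.01 * grad_loss(theta)
print(theta)                     # approaches [2.0, 0.0]
```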
A: And then another trend I would just like to briefly bring up here is that fusion will most likely, with the next generation of fusion experiments, generate very large data sets.
A: We are talking about crossing petabytes of data per day, and we are looking into how machine learning can be used with these kinds of data sets. Some technologies I have discussed internally with colleagues are, first, custom machine learning hardware, and this goes under the name of wafer-scale engines. Cerebras recently released one of these: it's basically chips designed for machine learning tasks, specialized hardware, combined into dies that are physically large, hence the name wafer-scale engines. Tesla also talked about this when they presented their Dojo cluster, and it's basically tensor processing units.
A
What
what
they're
doing
from
which
google
released,
I
think
in
2017,
but
those
I
believe
will
be
commercially
available,
also
in
conjunction
with
the
big
data
age
of
fusion,
if
you
so
will,
is
a
move
towards
transformer
neural
networks
and
those
those
are
by
design
more
general,
but
require
more
data
to
to
to
work
on
the
tasks.
So,
there's
some
research
going
on
how
to
use
these
networks
for
fusion
energy
right
now,
yeah.
B: Great timing. In that case, we have one minute left if anybody has a question to ask.
B: I guess I can ask a question. So thanks, Ralph, for the nice shout-out for the machine learning software. But is there anything you feel is missing from the stack, or...?
A: Actually, no. From the colleagues I talked to in preparation for this talk, I really haven't heard of anything that they feel is missing or that would really increase their productivity when they do machine learning at NERSC. So I get the impression that the facilities are, yeah, where they need to be right now. Well, we'll see later, when we really do machine learning with the big data that is supposed to be generated from ITER.
B
Okay,
yeah
feel
free
to
immediately
reach
out.
If
there
is-
and
I
don't
know
if
you've
had
a
chance
to
try
permanently
yet
but.