From YouTube: Machine Learning Applications to Collider Physics
Vinicius Mikuni (NERSC)
Looking at this animation, the new particles that are generated encode the information on how these protons interact with each other, and our knowledge about how the particles interact with one another. In order to test the theories that we currently have that describe how these particles interact, we need a very large amount of experimental data from particle collisions, so that we have a good way to compare the two together.
So we do that with simulations. We take those simulations and we also pass them through a simulation of the detector material, of how the detector would measure each of the particles that we predict in the theory, and then the same data analysis methods that we use with real data are applied to the simulation, which encodes our knowledge. We compare the two to see how well what we knew beforehand matches what we actually observe. That brings me to the topics I want to talk about today.
I am going to cover three different ideas. The first one, shown here at the top, is called a surrogate model: how we can make simulations of the detector response that are faster than the full simulation chain that we currently use in our experiments. The second application is called unfolding, which is roughly the opposite direction of the convolution problem: we basically want to take data that has been measured by our detector and convert it back to the realm of theory predictions, where we have the particles as predicted by the theories, before any detector effects. The last application that I want to talk about is the data analysis part: how we can take the data that we observe from particle collisions and try to identify things there that we do not know what they are. We just know that something might be there that we don't understand. That is basically anomaly detection: how to try to find new physics even though you don't know what new physics should look like.
So you have to make a precise image of these interactions, which corresponds to thousands up to millions of detector readout channels that we need to run the full simulation for. To give you some perspective, if you take the two main experiments at the LHC, the ATLAS and CMS experiments, and you look at their previous run, with data collected in the years 2016 to 2018, you will see that about 40% of the whole computing power of the collaboration was dedicated to simulating events. In particular, the part that takes the longest is simulating the response of the detectors, and that already takes a lot of computing power.
But if you look into the future, where we plan to upgrade our accelerator facilities in order to have even more particle collisions in a shorter amount of time, that means even more data is going to be collected, and we need to simulate even more particle collisions in order to match the amount of experimental observations that we have. And if you just try to extrapolate the budget that we currently have for these simulations into the future,
and compare it with what you would actually need to accomplish, you see that roughly on this 2026 timescale, where we are going to do an upgrade of the detector facility, so with even more particles colliding at the same time, our current methods basically do not scale to the level of what we actually need in order to take advantage of this full data set.
The method I want to show is based on diffusion models, where the core idea is that if you have a way to gradually diffuse a sample into some noise, then you can also do the opposite. For instance, you can imagine that we have this dog here, and through some equation that evolves over time, you can transform this dog into a noise distribution. Then you can train a network that learns how to do the inverse of this diffusion process, a denoising process, where basically you start from a noise distribution and keep denoising until you get something back.
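The forward half of that picture can be sketched numerically. This is a minimal illustration assuming a simple linear noise schedule and toy data (both are my assumptions, not details from the talk); a trained network would then be used to invert these steps one at a time:

```python
import numpy as np

# Minimal sketch of the forward (noising) half of a diffusion model.
# A real surrogate would train a neural network to reverse these steps;
# here we only show the closed-form forward corruption q(x_t | x_0).
rng = np.random.default_rng(0)

T = 1000                               # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule (assumed)
alphas_bar = np.cumprod(1.0 - betas)   # cumulative signal retention

def q_sample(x0, t):
    """Draw x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = rng.standard_normal(10_000) * 0.1 + 2.0  # toy "image" pixel values
xT = q_sample(x0, T - 1)                      # fully diffused sample

# After many steps the sample is essentially unit Gaussian noise:
print(float(xT.mean()), float(xT.std()))  # close to 0 and 1
```

The reverse (generative) direction would start from such noise and repeatedly apply the learned denoiser until a realistic sample reappears.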
So the idea is that you can train a method where we start by taking some random noise and use machine learning to convert this noise into something that is actually useful: in our case, something that looks like the picture on the right, where I am showing an electron interacting with the detector material. Each pixel in this picture corresponds to the energies that have been deposited by this particle.
To train this machine learning method, I used the Perlmutter supercomputer, which, as you all know, has lots of GPUs and is really good for projects that take advantage of machine learning libraries like TensorFlow. In this case I have been using 16 GPUs, so nothing incredibly large, distributed using Horovod, and a few different test data sets, just to see if this concept of training a generative model is powerful enough to give you something that would actually be useful for data analysis.
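To make the data-parallel setup concrete, here is a toy NumPy sketch of the pattern that Horovod implements, not Horovod's actual API: each worker computes a gradient on its own data shard, an allreduce-style average combines them, and every worker applies the identical update. The worker count, learning rate, and linear model are illustrative assumptions.

```python
import numpy as np

# Toy illustration of data-parallel training with gradient averaging.
# Linear regression with a quadratic loss keeps the gradient easy to verify.
rng = np.random.default_rng(4)

n_workers = 16
true_w = 3.0
x = rng.standard_normal((n_workers, 1024))  # one data shard per worker
y = true_w * x                              # noiseless targets

w = 0.0
for _ in range(200):
    # Local gradient of 0.5 * mean((w*x - y)^2) on each shard.
    local_grads = np.mean((w * x - y) * x, axis=1)
    grad = local_grads.mean()   # the allreduce-average step
    w -= 0.1 * grad             # every worker applies the same update

print(round(float(w), 2))       # converges to the true weight, 3.0
```

In real Horovod usage the averaging is done by a ring allreduce across GPUs, but the arithmetic effect on the update is the same as this average.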
So in this case I have three different data sets that correspond to different detector layouts, and the main difference between these data sets is just the number of pixels that they possess, which I need to simulate. In the first data set I have about 200 to 400 dimensions that need to be simulated, going up to data set three, where I have 46,000 channels that the machine learning now needs to learn how to properly simulate.
If you compare the time that it takes to generate 100 particle interactions, and you look at the column in the middle, called CaloScore, you see that for data set one it would take about four seconds using the machine learning method, whereas the full simulation can take from 100 up to 1,000 seconds, so a couple of orders of magnitude faster. That is because in the full simulation the time that it takes depends on the energy of the particle.
One option is to run our data analysis methods after the detector effects and then apply a correction to those results, in order to convert the results after detector effects to how they would have looked before the particles interacted with the detector. That is the idea of unfolding, which can also be seen as a deconvolution problem, and currently the main methods that do this in high energy physics rely on histograms.
And if you go into the future, where we have more data, and you look back and think, well, can I reuse the same histogram, or can I change the histogram, then you have already lost the information: after you put things in a histogram, you cannot go back to how they used to look before they were put in the histogram.
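That information loss from binning can be seen in a small NumPy sketch (toy numbers, purely to illustrate the point): two genuinely different samples can yield exactly the same histogram, so the bins alone cannot take you back to the original measurements.

```python
import numpy as np

# Binning is lossy: replacing every value by the centre of its bin gives a
# different sample with an identical histogram, so no procedure can recover
# the original measurements from the bin counts alone.
rng = np.random.default_rng(1)
bins = np.linspace(0.0, 1.0, 11)  # 10 fixed-width bins

sample_a = rng.uniform(0.0, 1.0, 1000)
idx = np.clip(np.digitize(sample_a, bins) - 1, 0, 9)
sample_b = (bins[idx] + bins[idx + 1]) / 2  # every value snapped to its bin centre

counts_a, _ = np.histogram(sample_a, bins)
counts_b, _ = np.histogram(sample_b, bins)

print(np.array_equal(counts_a, counts_b))  # True: identical histograms
print(np.allclose(sample_a, sample_b))     # False: yet different data
```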
That is why you can also use machine learning to do this process of trying to identify how physics events looked before the detector effects, by calculating a reweighting function. The idea is that if you have the response of some physics observable before and after interacting with the detector, then you can train a machine learning model that learns how to relate one distribution to the other.
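As a concrete sketch of what such a reweighting function does, here is a toy version where the density ratio is estimated from histograms rather than a neural network (the Gaussian toy distributions and binning are my assumptions; in practice methods of this kind train a classifier and use p/(1-p) as the weight):

```python
import numpy as np

# Density-ratio reweighting: weight "simulation" events by p_target / p_sim
# so that weighted simulation reproduces the target distribution.
rng = np.random.default_rng(2)

sim = rng.normal(0.0, 1.0, 200_000)     # simulated observable (assumed toy)
target = rng.normal(0.5, 1.0, 200_000)  # "data"/truth observable (assumed toy)

bins = np.linspace(-4.0, 5.0, 91)
p_sim, _ = np.histogram(sim, bins, density=True)
p_tgt, _ = np.histogram(target, bins, density=True)

# Assign each simulated event the density ratio of its bin.
idx = np.clip(np.digitize(sim, bins) - 1, 0, len(p_sim) - 1)
weights = np.where(p_sim[idx] > 0,
                   p_tgt[idx] / np.maximum(p_sim[idx], 1e-12), 0.0)

# The weighted simulation now reproduces the target mean of 0.5:
print(float(np.average(sim, weights=weights)))
```

A classifier-based version learns the same ratio continuously in many dimensions, which is what makes the unbinned approach attractive.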
So you can always move between these two samples, one before and one after detector effects. The interesting thing is that this idea has also been used in a real experiment, with real particle collisions, just to show that it not only works on toy examples but really works with actual collisions. In this case, I used a data set that was collected by an experiment running during 2006 and 2007, so a few years ago.
It is a really nice data set that you can use to study particle interactions, and to train this model we really needed a lot more computing power. In this case I used 128 GPUs, just because the data set sizes that we need to deal with here are a lot bigger compared to what we had before; plus you also need to take into account other things like uncertainties and so on.
So everything you see on these slides, for instance all the physics observables that were measured, comes from just one pass of this machine learning method. The last part that I want to cover in this talk is how we can take the data that we have from particle collisions and try to identify things that we don't understand about it, because there are a lot of open questions in high energy physics that we basically don't have a good answer for.
And one thing that is very tricky when you are trying to think about how to identify things that you don't understand is this: let's say you have some data that you put in a histogram, or whatever data visualization method you like, such as the one on the left, and you want to interpret what you are actually looking at.
You can imagine that maybe there is something very exciting there, some new physics process that you don't understand, and that is what you want to figure out. But there is also the chance that everything you are looking at is compatible with the predictions from the theory that you have, which just predicts how particles interact with one another, and those processes you understand relatively well.
So the idea of this project is to ask yourself: okay, if I have some data set that tells me there might be something interesting that I don't understand, can I identify what false positives look like, so that I can take the expected response of the false positives, compare it with the data that I have, and see whether they match or not? That is exactly the idea of this project: you estimate what the false positives look like by using the data itself, but in different regions.
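A classic version of that "different regions" idea can be sketched in a few lines: estimate what background alone (the false positives) would give in a signal region by interpolating its shape from neighbouring sideband regions. The exponential background, Gaussian bump, and region boundaries below are illustrative assumptions, not numbers from the talk.

```python
import numpy as np

# Sideband interpolation: fit the background shape where no signal is
# expected, interpolate into the signal region, and compare with the data.
rng = np.random.default_rng(3)

bkg = rng.exponential(scale=50.0, size=100_000) + 100.0  # falling spectrum
sig = rng.normal(160.0, 3.0, 2000)                       # hidden resonance
data = np.concatenate([bkg, sig])

bins = np.linspace(100.0, 300.0, 101)
counts, edges = np.histogram(data, bins)
centres = (edges[:-1] + edges[1:]) / 2

signal_region = (centres > 150.0) & (centres < 170.0)
sidebands = ~signal_region

# Fit log-counts in the sidebands with a straight line (exponential shape),
# then interpolate that fit into the signal region.
coeffs = np.polyfit(centres[sidebands], np.log(counts[sidebands]), 1)
expected = np.exp(np.polyval(coeffs, centres[signal_region])).sum()
observed = counts[signal_region].sum()

print(observed > expected)  # the excess over background hints at something new
```

The machine learning version replaces the fixed functional form with a model that learns to interpolate between the regions.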
The model then learns how to interpolate between these two regions. The nice thing is that if you try to use this method on a data set that basically has nothing new, where there is no new physics and everything is compatible with your background, then what this method tells you is that there is nothing there, which is really good. So the method is really capable of telling you that there is nothing when there is nothing. But on the other hand, if there is something, then this method is also able to tell you, and to say:
Well, the number of observations that you see in this region does not match the number of observations that we should expect with a background-only hypothesis. And you can use this difference between what you observe and what you predict to give an estimate of how many new events are actually in your sample, which is what is shown here on the right.
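The counting logic in that last step is simple enough to write down. The numbers here are invented for illustration, and s/sqrt(b) is only a rough approximation to the Poisson significance:

```python
import math

# Turn the observed-vs-predicted difference into an estimate of the number
# of new events, plus a rough significance. All numbers are illustrative.
observed = 1250          # events counted in the selected region (assumed)
expected_bkg = 1000.0    # background-only prediction, e.g. from sidebands

excess = observed - expected_bkg                 # estimated new events
significance = excess / math.sqrt(expected_bkg)  # s / sqrt(b) approximation

print(excess, round(significance, 1))  # prints: 250.0 7.9
```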