Description
Chi-kwan Chan of Steward Observatory/University of Arizona presents a talk on GPU-Accelerated General Relativistic Ray Tracing for Simulating Black Hole Images. Recorded live via Zoom at GPUs for Science 2020. https://www.nersc.gov/users/training/gpus-for-science/gpus-for-science-2020/ Session Chair: Muaaz Awan
A: This is CK. I'm going to tell you about some of our black hole research. Because the background of this group seems to be quite diverse, I will start with something really simple about black holes.
Now, however, if you have some heavy, dense object, like a sun or a black hole, that curves the spacetime, then even though the particles and photons want to move along the straightest possible lines, because the spacetime itself is curved, they end up curving in their orbits. John Wheeler gave a very good quote on this: spacetime tells matter how to move, and matter tells spacetime how to curve.
However, we don't really know if the theory is still valid when we go to very dense objects. And if you just look at the theory itself, there is a very interesting prediction: if you put enough matter in a small region, you can curve spacetime so much that all the light cones will point towards a central singularity.
On the left here is actually a figure from a very old textbook. These are the light cones; the part pointing to the top is your future. The idea of a black hole is that once you pass this surface, called the event horizon, all of your future actually points inside the black hole. There is no way for you to escape. And if such an object exists, then you can consider a thought experiment.
You have this black hole and you shine a flashlight onto it. If you do that, because the spacetime is so curved, some of these light rays will actually orbit around the black hole, and some of them will come back to you. So when you look at this object, you will see a very bright ring coming back. This is the observational signature that many of us astronomers want to capture.
In order to do this, part of my research is with the Event Horizon Telescope.
This is a collaboration of more than 200 members all around the world, where we use multiple telescopes to try to capture this event horizon and the photon ring that I showed you earlier. In 2017 we used eight telescopes all around the world to form a big array, and in the coming year we will be using 12 different telescopes.
I won't go into the details of the observing technology, even though I actually spent most of my time in the last few years working on the data. This is a GPU talk, so I'll just go through the data pathway very quickly, and then I will jump back to the simulation part, where we use GPUs.
We record the data onto hard drives and ship the hard drives to our data centers. These are just a couple of pictures; this is our physical library of data. At our data centers we have a step called correlation.
What that does is remove the noise from these five petabytes of data and reduce the data volume by a factor of a thousand, so we end up with about five terabytes of actual data that can be used. Then there is another step called fringe fitting, where we remove the systematics from the data set, and that reduces the data by another 10,000 times. After that we work with a small data set, and we can finally apply our feature extraction and imaging pipelines to get the science out; we can also reconstruct the black hole image.
This is the effort of a large collaboration, and there are actually a few interesting things in this image. At the center, this dark part is the black hole shadow. This is direct evidence that event horizons actually exist and that general relativity is correct even in the regime of strong gravity. And we have a ring.
The fact that the shape of this ring is circular actually tells us something, because different theories of gravity predict different shapes for this photon ring. So just with this picture we can measure the shape, and the agreement actually confirms that Einstein's general relativity is correct. Also, the asymmetry in this ring tells us how the plasma moves around the black hole.
This picture tells us a lot of information. But in order to really connect it to theory, we need to carry out a lot of numerical simulations. So we actually have a simulation library, and we use these models to compare with the observations, and then we can extract the physical parameters that we are interested in. Okay. So, in order to simulate this black hole image, there are two major steps.
One is called general relativistic magnetohydrodynamics (GRMHD). In this step we pretty much just follow the plasma around the black hole, follow the turbulence, and study the dynamics of this plasma. I won't be talking about this; although there are actually GPU-accelerated GRMHD codes out there, that's not my expertise.
My work is mainly about the next step, which is called general relativistic ray tracing. The idea is that we want to follow photons in this curved spacetime of the accretion flow. This following graphic actually describes it pretty well: we set up a grid of photons, and at each pixel we just trace the ray back into the GRMHD simulation, the colorful volume rendering here. Then we perform the ray tracing calculation and integrate the radiative transfer equation along the way to get the final image.
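Here is a minimal sketch of that per-pixel pattern, with two big simplifications: the rays are straight lines rather than Kerr geodesics, and the plasma is a made-up analytic torus rather than GRMHD data. What it does share with a real GRRT code is the structure: one GPU thread per pixel, marching along the ray and integrating the unpolarized radiative transfer equation dI/ds = j - alpha * I.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NX 64
#define NY 64

// Hypothetical emissivity: a glowing torus of radius 6 around the origin.
__device__ float emissivity(float x, float y, float z)
{
    float rc = sqrtf(x * x + y * y) - 6.0f;
    return expf(-(rc * rc + z * z));
}

__global__ void render(float *img)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // pixel column
    int j = blockIdx.y * blockDim.y + threadIdx.y;  // pixel row
    if (i >= NX || j >= NY) return;

    // Camera plane at z = +20; integrate from the far side of the
    // volume toward the camera so absorption is applied correctly.
    float x = (i - NX / 2) * 0.5f, y = (j - NY / 2) * 0.5f;
    float I = 0.0f, ds = 0.1f, alpha = 0.02f;

    for (float z = -20.0f; z < 20.0f; z += ds)
        I += ds * (emissivity(x, y, z) - alpha * I);  // dI = (j - a*I) ds

    img[j * NX + i] = I;
}

int main()
{
    float *img;
    cudaMallocManaged(&img, NX * NY * sizeof(float));

    dim3 block(16, 16), grid((NX + 15) / 16, (NY + 15) / 16);
    render<<<grid, block>>>(img);
    cudaDeviceSynchronize();

    // Crude ASCII dump of the image: a bright ring should appear.
    for (int j = 0; j < NY; j += 4) {
        for (int i = 0; i < NX; i += 2) {
            int v = (int)(img[j * NX + i] * 4.0f);
            putchar(" .:*#"[v > 4 ? 4 : v]);
        }
        putchar('\n');
    }
    return 0;
}
```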
This is a problem that many people have solved; I'm not the first one who did it. But many of the previous studies were done on CPUs, and I'm actually the first one who solved this problem on a GPU. This is a benchmark from our first paper, where we showed that just doing things on the GPU, without too much optimization, is already a factor of 30 faster than existing CPU codes. And you can see there is this flattening here; this is pretty much the startup time of the kernel.
If you are solving for a small number of photons, then just launching the kernel takes most of the time. But if you are solving for many, many photons in your image, then eventually the GPU wins and becomes faster.
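A quick way to see where that flattening comes from is to time back-to-back launches of an empty kernel: the fixed per-launch latency, typically on the order of microseconds, dominates whenever the photon count is small. A minimal sketch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void empty() {}

int main()
{
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);

    empty<<<1, 1>>>();            // warm up: the first launch is slower
    cudaDeviceSynchronize();

    const int reps = 1000;
    cudaEventRecord(t0);
    for (int i = 0; i < reps; ++i)
        empty<<<1, 1>>>();        // no work at all, only launch overhead
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("average launch overhead: %.1f us\n", 1e3f * ms / reps);
    return 0;
}
```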
There is actually a little bit of detail going into this speedup: in order to get the 30-times speedup, we need to do everything in single-precision floating point.
Usually that's a bad idea in radiative transfer, because you have these physical constants that are very, very big, and dividing and multiplying by them will give you either infinities or zeros. So when we developed the kernel we were actually quite careful: we manually regrouped the terms so that all the variables we store as single-precision floats turn out to have a range that is within the allowed range of the floating-point format.
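As a hypothetical illustration of this kind of regrouping (the actual terms in our kernels differ), consider the prefactor 2 h nu^3 / c^2 of the Planck function in CGS units. At X-ray frequencies, the intermediate nu^3 overflows single precision (FLT_MAX is about 3.4e38) even though the final value is modest; regrouping the same expression keeps every intermediate finite.

```c
#include <stdio.h>

#define PLANCK_H 6.626e-27f   /* erg s  */
#define SPEED_C  2.998e10f    /* cm/s   */

float prefactor_naive(float nu)
{
    float nu3 = nu * nu * nu;   /* ~1e54 at nu = 1e18: overflows to inf */
    return 2.0f * PLANCK_H * nu3 / (SPEED_C * SPEED_C);
}

float prefactor_regrouped(float nu)
{
    float x = nu / SPEED_C;               /* ~3e7: safely in range      */
    return 2.0f * PLANCK_H * x * x * nu;  /* every intermediate finite  */
}

int main(void)
{
    float nu = 1e18f;  /* an X-ray frequency in Hz */
    printf("naive:     %g\n", prefactor_naive(nu));      /* prints inf  */
    printf("regrouped: %g\n", prefactor_regrouped(nu));  /* ~1.5e7      */
    return 0;
}
```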
After you do this kind of careful rearrangement, single precision actually works very well for ray tracing, and a typical image or movie will look like this. This is now a movie from this general relativistic ray tracing calculation.
This is a three-color-channel image: red is radio, green is optical, and blue is X-ray. Here I'm swinging the camera, so you can see things around the flow. This vertical part here is the funnel, or the jet, of the accretion flow, and these other parts are the infalling plasma. You can also see a very bright ring here, just at the event horizon. This turns out to be one of the leading explanations of why we observe some of the variability from black holes. And this is another animation, showing you that the black hole actually looks different at different wavelengths.
Let me try to move back to the beginning. At a very long wavelength, say radio, the plasma around the black hole is optically thick, so you cannot see the black hole.
All you see is the plasma around it. But when you move to a shorter wavelength, a higher frequency, all of a sudden the plasma becomes transparent and you can finally see the black hole in the middle. Okay. At the end of this movie we are at 1.3 millimeter wavelength, which is exactly the frequency at which the EHT observes the black hole.
So this kind of simulation is not just useful for comparing the observations with models; it is also useful for making predictions.
Now, this is another movie. In the 2017 data we captured the M87 black hole, and this is our current simulation with all the fitted parameters. We actually believe that, if we had much higher resolution, this is how M87 would look. All right.
M87 is only one of the main black holes that we are observing. We are also interested in the black hole at the center of the Milky Way, Sagittarius A*, and that black hole is much better studied. There are a lot of different observations, including X-ray and different frequencies, so we are able to use that information to constrain the models. This is a spectrum, again actually calculated by a general relativistic ray tracing calculation, so we can use the X-ray flux to constrain our model.
We also use some other wavelengths: we use the 1.3 millimeter size observed by the EHT, and we also use the optical as a constraint. Before the EHT collaboration formed, this was actually the biggest study we had. Because of the GPU code, even though each single calculation is quite short, with the acceleration we were able to run millions of images and form a really big image library. By comparing the images in this library with all the different constraints, we were able to come up with five best-fit models. The EHT is still working on the Sagittarius A* image, but these are the five possibilities for how Sagittarius A* may look. Now, I mentioned earlier that different theories of gravity will give you different predictions for the shadow size, so this is also some science we can do with general relativistic ray tracing.
Finally, in addition, once we get the parameters, we can run the ray tracing algorithm again to compute a whole time series of the simulation, and then we can start predicting the variability of these different models. So this is again a movie, of a Sagittarius A* model. The panels here are the radio wavelengths, the 1.3 millimeter, the optical, and, on the right, the blue channel,
which is the X-ray. The lower panel here shows the light curves, how these different wavelengths vary, and if you look at the images once in a while, you can see this very bright flux tube showing up. So again, these are some of the explanations of why we are seeing flares in these black holes. Okay. I know this is GPUs for Science, but GPUs are about graphics, so our group also uses GPUs to do visualization.
We actually developed software using the Oculus Rift, and now also the Microsoft HoloLens. So we have a virtual reality visualization tool to overlay the GRMHD simulation and the GRRT simulation, so that we can map the features we see in this VR visualization back to the features in the GRMHD simulation. That is very helpful in understanding what is going on in these calculations. Okay.
I guess I'm short on time, so let me just go through some recent developments. The code I showed you earlier was written in CUDA, but in our new development we are switching to OpenCL, and we are also changing coordinates. It turns out that in general relativity the coordinates are free: you can get the same physics with a different coordinate system. So we use something called Cartesian Kerr-Schild coordinates.
Usually that is considered a computationally more expensive coordinate system, but we worked out some symmetries in the equations and found ways to simplify the formulation, so that the number of operations is not that much higher than in Boyer-Lindquist coordinates. And because Kerr-Schild here is Cartesian, we can now get rid of all the coordinate singularities in the previous code.
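For reference, the textbook Cartesian Kerr-Schild form of the Kerr metric (geometric units, mass M, spin a, up to spin-orientation sign conventions; the variables in our code may be organized differently) is

```latex
g_{\mu\nu} = \eta_{\mu\nu} + f \, l_\mu l_\nu ,
\qquad
f = \frac{2 M r^3}{r^4 + a^2 z^2} ,
\qquad
l_\mu = \left( 1,\ \frac{r x + a y}{r^2 + a^2},\ \frac{r y - a x}{r^2 + a^2},\ \frac{z}{r} \right) ,
\qquad
\frac{x^2 + y^2}{r^2 + a^2} + \frac{z^2}{r^2} = 1 ,
```

where the last relation defines r(x, y, z) implicitly. Every component is smooth away from the physical ring singularity, which is why rays can cross the pole without any special-casing.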
This is a benchmark of our new code. In single precision it is super fast: 0.1 nanoseconds per photon step. But even in double precision, you can see that the Kerr-Schild coordinates, with our singularity-free formalism, can be even faster than the standard formulations that other people use. And these are some convergence tests.
Again, I want to highlight that with the change of coordinates, even though the computation is more expensive, you get rid of the singularity, and our rays can just go through the pole without any problem.
Something we are going to do now is to extend our engine to handle scattering, and we are also working on particle-based kinetic calculations. This figure here is just showing you some of the particle trajectories from these kinetic calculations. We are also working on doing radiation GRMHD calculations.
What we want is to do all the GRMHD calculations on the CPU but let the GPU handle the radiation. Also, this GRay2 code that I described is actually built on a library called lux, and lux is able to measure the performance at runtime and re-architect your algorithm. So this is something that we are currently working on, and that's it. I guess if there is time I can take some questions.
B: Thank you very much. That was a very beautiful presentation, with all the animations and everything. So we have a question that I think was partially, or maybe mostly, answered by you already, about the ray tracing. Okay, we have another one, I think.
B: So there was one in the chat; it was about ray tracing, and I think you just answered that in your second-to-last slide. There is one in the Q&A, I think you can see it: can you comment on the operating system, hardware, and software platform, and details of the VR visualization stack? Very interested in using VR for 3D data.
A: Yeah, that's very interesting. When we first did that virtual reality work, we actually used the Oculus development kit, and that was the time when Oculus still supported Macs. So we did all the development on the Mac, with the Oculus SDK and OpenGL. But in some of the latest development we are doing now, we have turned this whole ray tracing calculation into a library.
So when I say GRay2, it's actually not a standalone program, it's just a library, and we are planning to interface the library with Unity, so that you can do most of your virtual reality work in Unity but then call the functions from GRay2 to do the scientific calculation.
B: All right. I think we have a couple more questions. For the first one, you can unmute yourself and go ahead and ask the question.
C: Yeah, this was just a question about the FP32 computation that you decided to make, thanks to the interval of computation of your floating points, so that you were actually able to do the computation inside the FP32 range. Did you find it by doing interval analysis with a specific tool, or did you do it by hand? How did you make sure that it was always in that interval for any input data?
A: Yeah, we were lucky enough to have some Tesla GPUs available, and when we first did the development, everything was done in double precision. But then we typedef'd our doubles and floats to another type called real, so that by changing a single line we can change from double precision to single precision.
At first the results became incorrect: we started to get NaNs, infs, and zeros after we switched from double to single. Then we just manually went into the code and figured out which parts of the code went crazy, and we found out it was mostly the radiative transfer calculation. So we started manually recombining and tuning those terms, and then it worked out. But yeah, we didn't do anything fancy; it's a manual process.
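A sketch of the single-line precision switch described here (the actual GRay source may differ in detail): every physical quantity is declared with one alias, so flipping one typedef rebuilds the whole code in the other precision for comparison.

```c
#include <stdio.h>

/* Flip this one line to rebuild everything in the other precision. */
typedef float real;            /* or: typedef double real; */

/* One forward-Euler step of the radiative transfer law
   dI/ds = j - alpha * I, written entirely in terms of "real". */
real rt_step(real I, real j, real alpha, real ds)
{
    return I + ds * (j - alpha * I);
}

int main(void)
{
    real I = 0;
    for (int i = 0; i < 1000; ++i)
        I = rt_step(I, (real)1.0e-7, (real)0.1, (real)0.01);
    printf("I = %g  (sizeof(real) = %zu)\n", (double)I, sizeof(real));
    return 0;
}
```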
C: So how do you make sure that the results will be in the right range for any input?
A: Yeah, to be honest, this is not valid for all inputs. But when we talk about these accreting black holes, there is a range for their density and a range for their luminosity. So we know that, within the range we are interested in, the values fall into the valid range. Actually, let me go back to this slide.
Yeah, so if you look at this slide here, we actually have some comments in our code showing, within the range that we are interested in, what the typical value of each parameter is and what the ranges of these different quantities are. So again, everything is done manually, but we just find this very, very useful, because it allows us to turn a double-precision calculation into a single-precision one.
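A hypothetical example of the kind of range comment being described, with illustrative values rather than ones taken from the actual GRay source:

```c
/* Parameter ranges for the accretion flows we target:
 *
 *   ne : electron number density [cm^-3]  typical 1e4  .. 1e8
 *   Te : electron temperature    [K]      typical 1e9  .. 1e12
 *   B  : magnetic field strength [G]      typical 1    .. 1e3
 *   nu : observing frequency     [Hz]     typical 1e9  .. 1e19
 *
 * After regrouping, all derived quantities stay well inside the
 * FP32 range (~1e-38 .. 3e38) for inputs within these bounds.
 */
```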
B: So we have another question from the organizers, but before we go to that, there's a small announcement. We have a tutorial on OpenACC starting in about 15 minutes, and the tutorial will take place in the breakout room, the other Zoom link that we have. I think it will be going up in the chat shortly; or, if you're in the Slack channel, you might have received the email as well. So if you're registered for the tutorial, please start heading over there, and you can start testing out your accounts.
D: Hi there. I was just wondering if you would comment on the relative value of physically transferring data from remote locations, versus either investment in infrastructure to bring the data out, or some sort of local processing, maybe using something like GPUs.
A: Oh, so I guess you are referring to the observation part of my project, instead of the ray tracing calculation?
D: A little bit! Well, it's more that the observations were done at a variety of pretty remote sites, and you talked about having to physically transport the data, as in hard drives, presumably leaving Antarctica on a ship or a plane. I was just wondering: was there consideration given to putting a processing center or something like that close to the instrument, rather than moving the data?
A: Yeah, this is a very good comment. It turns out that because the signal is very weak, most of what we are observing is noise, and the step we use to combine the data is the correlation, and the correlation needs to be done between each station pair. So you can't really just pre-process the data at a station and reduce its size.
That's the reason we have to ship all the data to our data centers. Now, having said that, the EHT does use a very high bandwidth, and we do have some very sensitive stations. So in the future it is actually possible to pre-process, in the sense that we reduce the bandwidth.
We could reduce the recording bandwidth, and once we do that, we can reduce the data by a factor of 10 or so. But even with that, it is just faster to ship the hard drives, our standard data flow, by airplane. Definitely, putting computation near the observation actually makes a lot of sense for many applications, but for VLBI it is a little bit more complicated.