Description
Chi-kwan Chan of Steward Observatory/University of Arizona presents a talk on GPU-Accelerated General Relativistic Ray Tracing for Simulating Black Hole Images. Recorded live via Zoom at GPUs for Science 2020. https://www.nersc.gov/users/training/gpus-for-science/gpus-for-science-2020/ Session Chair: Muaaz Awan
A: This is CK. I'm going to tell you about some of our black hole research. Because the background of this group seems to be quite diverse, I will start with something really simple about black holes.
Now, however, if you have some heavy, dense object, like a sun or a black hole, that curves the spacetime, then even though the particles and photons want to move along the straightest possible lines, because the spacetime itself is curved, they end up curving in their orbits. John Wheeler gave a very good quote on this: spacetime tells matter how to move, and matter tells spacetime how to curve.
However, we don't really know if the theory is still valid when we go to very dense objects. And if you just look at the theory itself, there is a very interesting prediction: if you put enough matter in a small region, you can curve spacetime so much that all the light cones will point towards a central singularity.
On the left here is actually a figure from a very old textbook. These are the light cones; the part pointing to the top is your future. The idea of a black hole is that once you pass this surface, called the event horizon, all of your future actually points inside the black hole. There is no way for you to escape. And if such an object exists, then you can consider a thought experiment.
You have this black hole and you shine a flashlight onto it. If you do that, because the spacetime is so curved, some of these light rays will actually orbit around the black hole, and some of them will come back to you. So when you look at this object, you will see a very bright ring coming back. This is the observational signature that many of us astronomers want to capture.
In order to do this, part of my research is with the Event Horizon Telescope.
This is a collaboration of more than 200 members all around the world, where we use multiple telescopes to try to capture this event horizon and the photon ring that I showed you earlier. In 2017 we used eight telescopes all around the world to form a big array, and in the coming year we will be using 12 different telescopes.
I won't go into the details of the observing technology, even though I actually spent most of my time in the last few years working on the data. This is a GPU talk, so I'll just go through the data pathway very quickly, and then I will jump back to the simulation part, where we use GPUs.
We record the data onto hard drives and ship the hard drives to our data centers. These are just a couple of pictures; this is our physical library of data. At our data centers we have a step called correlation.
What that does is remove the noise from these five petabytes of data and reduce the data volume by a factor of a thousand, so we end up with about five terabytes of actual data that can be used. Then there is another step called fringe fitting, where we remove the systematics from the data set, and that reduces the data by another 10,000 times. After that we work with a small data set, and we can finally apply our feature extraction and imaging pipelines to get the science out; we can also reconstruct the black hole image.
This is the effort of a large collaboration, and there are actually a few interesting things in this image. At the center, this dark part is the black hole shadow. This is direct evidence that event horizons actually exist and that general relativity is correct even in the regime of strong gravity. And we have a ring.
The fact that the shape of this ring is circular actually tells us something, because different theories of gravity predict different shapes for this photon ring. So just with this picture we can measure the shape, and the agreement actually confirms that Einstein's general relativity is correct. Also, the asymmetry in this ring tells us how the plasma moves around the black hole.
This picture tells us a lot of information. But in order to really connect it to theory, we need to carry out a lot of numerical simulations. So we actually have a simulation library, and we use these models to compare with the observations, and then we can extract the physical parameters that we are interested in. Okay. So, in order to simulate this black hole image, there are two major steps.
One is called general relativistic magnetohydrodynamics (GRMHD). In this step we pretty much just follow the plasma around the black hole, follow the turbulence, and study the dynamics of this plasma. I won't be talking about this; although there are actually GPU-accelerated GRMHD codes out there, that's not my expertise.
My work is mainly about the next step, which is called general relativistic ray tracing. The idea is that we want to follow photons in this curved spacetime of the accretion flow. This following graphic actually describes it pretty well: we set up a grid of photons, and at each pixel we just trace the ray back into the GRMHD simulation, the colorful volume rendering here. Then we perform the ray tracing calculation and integrate the radiative transfer equation along the way to get the final image.
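Here is a minimal sketch of that per-pixel pattern, with two big simplifications: the rays are straight lines rather than Kerr geodesics, and the plasma is a made-up analytic torus rather than GRMHD data. What it does share with a real GRRT code is the structure: one GPU thread per pixel, marching along the ray and integrating the unpolarized radiative transfer equation dI/ds = j - alpha * I.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NX 64
#define NY 64

// Hypothetical emissivity: a glowing torus of radius 6 around the origin.
__device__ float emissivity(float x, float y, float z)
{
    float rc = sqrtf(x * x + y * y) - 6.0f;
    return expf(-(rc * rc + z * z));
}

__global__ void render(float *img)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // pixel column
    int j = blockIdx.y * blockDim.y + threadIdx.y;  // pixel row
    if (i >= NX || j >= NY) return;

    // Camera plane at z = +20; integrate from the far side of the
    // volume toward the camera so absorption is applied correctly.
    float x = (i - NX / 2) * 0.5f, y = (j - NY / 2) * 0.5f;
    float I = 0.0f, ds = 0.1f, alpha = 0.02f;

    for (float z = -20.0f; z < 20.0f; z += ds)
        I += ds * (emissivity(x, y, z) - alpha * I);  // dI = (j - a*I) ds

    img[j * NX + i] = I;
}

int main()
{
    float *img;
    cudaMallocManaged(&img, NX * NY * sizeof(float));

    dim3 block(16, 16), grid((NX + 15) / 16, (NY + 15) / 16);
    render<<<grid, block>>>(img);
    cudaDeviceSynchronize();

    // Crude ASCII dump of the image: a bright ring should appear.
    for (int j = 0; j < NY; j += 4) {
        for (int i = 0; i < NX; i += 2) {
            int v = (int)(img[j * NX + i] * 4.0f);
            putchar(" .:*#"[v > 4 ? 4 : v]);
        }
        putchar('\n');
    }
    return 0;
}
```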
This is a problem that many people have solved; I'm not the first one who did it. But many of the previous studies were done on CPUs, and I'm actually the first one who solved this problem on a GPU. This is a benchmark from our first paper, where we showed that just doing things on the GPU, without too much optimization, is already a factor of 30 faster than existing CPU codes. And you can see there is this flattening here; this is pretty much the startup time of the kernel.
If you are solving for a small number of photons, then just launching the kernel takes most of the time. But if you are solving for many, many photons in your image, then eventually the GPU wins and becomes faster.
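A quick way to see where that flattening comes from is to time back-to-back launches of an empty kernel: the fixed per-launch latency, typically on the order of microseconds, dominates whenever the photon count is small. A minimal sketch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void empty() {}

int main()
{
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);

    empty<<<1, 1>>>();            // warm up: the first launch is slower
    cudaDeviceSynchronize();

    const int reps = 1000;
    cudaEventRecord(t0);
    for (int i = 0; i < reps; ++i)
        empty<<<1, 1>>>();        // no work at all, only launch overhead
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("average launch overhead: %.1f us\n", 1e3f * ms / reps);
    return 0;
}
```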
There is actually a little bit of detail going into this speedup: in order to get the 30-times speedup, we need to do everything in single-precision floating point.
Usually that's a bad idea in radiative transfer, because you have these physical constants that are very, very big, and dividing and multiplying by them will give you either infinities or zeros. So when we developed the kernel we were actually quite careful: we manually regrouped the terms so that all the variables we store as single-precision floats turn out to have a range that is within the allowed range of the floating-point format.
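As a hypothetical illustration of this kind of regrouping (the actual terms in our kernels differ), consider the prefactor 2 h nu^3 / c^2 of the Planck function in CGS units. At X-ray frequencies, the intermediate nu^3 overflows single precision (FLT_MAX is about 3.4e38) even though the final value is modest; regrouping the same expression keeps every intermediate finite.

```c
#include <stdio.h>

#define PLANCK_H 6.626e-27f   /* erg s  */
#define SPEED_C  2.998e10f    /* cm/s   */

float prefactor_naive(float nu)
{
    float nu3 = nu * nu * nu;   /* ~1e54 at nu = 1e18: overflows to inf */
    return 2.0f * PLANCK_H * nu3 / (SPEED_C * SPEED_C);
}

float prefactor_regrouped(float nu)
{
    float x = nu / SPEED_C;               /* ~3e7: safely in range      */
    return 2.0f * PLANCK_H * x * x * nu;  /* every intermediate finite  */
}

int main(void)
{
    float nu = 1e18f;  /* an X-ray frequency in Hz */
    printf("naive:     %g\n", prefactor_naive(nu));      /* prints inf  */
    printf("regrouped: %g\n", prefactor_regrouped(nu));  /* ~1.5e7      */
    return 0;
}
```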
After you do this kind of careful rearrangement, single precision actually works very well for ray tracing, and a typical image or movie will look like this. This is now a movie from this general relativistic ray tracing calculation.
This is a three-color-channel image: red is radio, green is optical, and blue is X-ray. Here I'm swinging the camera, so you can see things around the flow. This vertical part here is the funnel, or the jet, of the accretion flow, and these other parts are the infalling plasma. You can also see a very bright ring here, just at the event horizon. This turns out to be one of the leading explanations of why we observe some of the variability from black holes. And this is another animation, showing you that the black hole actually looks different at different wavelengths.
Let me try to move back to the beginning. At a very long wavelength, say radio, the plasma around the black hole is optically thick, so you cannot see the black hole.
All you see is the plasma around it. But when you move to a shorter wavelength, a higher frequency, all of a sudden the plasma becomes transparent and you can finally see the black hole in the middle. Okay. At the end of this movie we are at 1.3 millimeter wavelength, which is exactly the frequency at which the EHT observes the black hole.
So this kind of simulation is not just useful for comparing the observations with models; it is also useful for making predictions.
Now, this is another movie. In the 2017 data we captured the M87 black hole, and this is our current simulation with all the fitted parameters. We actually believe that, if we had much higher resolution, this is how M87 would look. All right.
M87 is only one of the main black holes that we are observing. We are also interested in the black hole at the center of the Milky Way, Sagittarius A*, and that black hole is much better studied. There are a lot of different observations, including X-ray and different frequencies, so we are able to use that information to constrain the models. This is a spectrum, again actually calculated by a general relativistic ray tracing calculation, so we can use the X-ray flux to constrain our model.
We also use some other wavelengths: we use the 1.3 millimeter size observed by the EHT, and we also use the optical as a constraint. Before the EHT collaboration formed, this was actually the biggest study we had. Because of the GPU code, even though each single calculation is quite short, with the acceleration we were able to run millions of images and form a really big image library. By comparing the images in this library with all the different constraints, we were able to come up with five best-fit models. The EHT is still working on the Sagittarius A* image, but these are the five possibilities for how Sagittarius A* may look. Now, I mentioned earlier that different theories of gravity will give you different predictions for the shadow size, so this is also some science we can do with general relativistic ray tracing.
Finally, in addition, once we get the parameters, we can run the ray tracing algorithm again to compute a whole time series of the simulation, and then we can start predicting the variability of these different models. So this is again a movie, of a Sagittarius A* model. The panels here are the radio wavelengths, the 1.3 millimeter, the optical, and, on the right, the blue channel,
which is the X-ray. The lower panel here shows the light curves, how these different wavelengths vary, and if you look at the images once in a while, you can see this very bright flux tube showing up. So again, these are some of the explanations of why we are seeing flares in these black holes. Okay. I know this is GPUs for Science, but GPUs are about graphics, so our group also uses GPUs to do visualization.
We actually developed software using the Oculus Rift, and now also the Microsoft HoloLens. So we have a virtual reality visualization tool to overlay the GRMHD simulation and the GRRT simulation, so that we can map the features we see in this VR visualization back to the features in the GRMHD simulation. That is very helpful in understanding what is going on in these calculations. Okay.
I guess I'm short on time, so let me just go through some recent developments. The code I showed you earlier was written in CUDA, but in our new development we are switching to OpenCL, and we are also changing coordinates. It turns out that in general relativity the coordinates are free: you can get the same physics with a different coordinate system. So we use something called Cartesian Kerr-Schild coordinates.
Usually that is considered a computationally more expensive coordinate system, but we worked out some symmetries in the equations and found ways to simplify the formulation, so that the number of operations is not that much higher than in Boyer-Lindquist coordinates. And because Kerr-Schild here is Cartesian, we can now get rid of all the coordinate singularities in the previous code.
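For reference, the textbook Cartesian Kerr-Schild form of the Kerr metric (geometric units, mass M, spin a, up to spin-orientation sign conventions; the variables in our code may be organized differently) is

```latex
g_{\mu\nu} = \eta_{\mu\nu} + f \, l_\mu l_\nu ,
\qquad
f = \frac{2 M r^3}{r^4 + a^2 z^2} ,
\qquad
l_\mu = \left( 1,\ \frac{r x + a y}{r^2 + a^2},\ \frac{r y - a x}{r^2 + a^2},\ \frac{z}{r} \right) ,
\qquad
\frac{x^2 + y^2}{r^2 + a^2} + \frac{z^2}{r^2} = 1 ,
```

where the last relation defines r(x, y, z) implicitly. Every component is smooth away from the physical ring singularity, which is why rays can cross the pole without any special-casing.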
This is a benchmark of our new code. In single precision it is super fast: 0.1 nanoseconds per photon step. But even in double precision, you can see that the Kerr-Schild coordinates, with our singularity-free formalism, can be even faster than the standard formulations that other people use. And these are some convergence tests.
Again, I want to highlight that with the change of coordinates, even though the computation is more expensive, you get rid of the singularity, and our rays can just go through the pole without any problem.
Something we are going to do now is to extend our engine to handle scattering, and we are also working on particle-based kinetic calculations. This figure here is just showing you some of the particle trajectories from these kinetic calculations. We are also working on doing radiation GRMHD calculations.
What we want is to do all the GRMHD calculations on the CPU but let the GPU handle the radiation. Also, this GRay2 code that I described is actually built on a library called lux, and lux is able to measure the performance at runtime and re-architect your algorithm. So this is something that we are currently working on, and that's it. I guess if there is time I can take some questions.
B: Thank you very much. That was a very beautiful presentation, with all the animations and everything. So we have a question that I think was partially, or maybe mostly, answered by you already, about the ray tracing. Okay, we have another one, I think.
B: So there was one in the chat; it was about ray tracing, and I think you just answered that in your second-to-last slide. There is one in the Q&A, I think you can see it: can you comment on the operating system, hardware, and software platform, and details of the VR visualization stack? Very interested in using VR for 3D data.
A: Yeah, that's very interesting. When we first did that virtual reality work, we actually used the Oculus development kit, and that was the time when Oculus still supported Macs. So we did all the development on the Mac, with the Oculus SDK and OpenGL. But in some of the latest development we are doing now, we have turned this whole ray tracing calculation into a library.
So when I say GRay2, it's actually not a standalone program, it's just a library, and we are planning to interface the library with Unity, so that you can do most of your virtual reality work in Unity but then call the functions from GRay2 to do the scientific calculation.
B: All right. I think we have a couple more questions. For the first one, you can unmute yourself and go ahead and ask the question.
C: Yeah, this was just a question about the FP32 computation that you decided to make, thanks to the interval of computation of your floating points, so that you were actually able to do the computation inside the FP32 range. Did you find it by doing interval analysis with a specific tool, or did you do it by hand? How did you make sure that it was always in that interval for any input data?
A: Yeah, we were lucky enough to have some Tesla GPUs available, and when we first did the development, everything was done in double precision. But then we typedef'd our doubles and floats to another type called real, so that by changing a single line we can change from double precision to single precision.
At first the results became incorrect: we started to get NaNs, infs, and zeros after we switched from double to single. Then we just manually went into the code and figured out which parts of the code went crazy, and we found out it was mostly the radiative transfer calculation. So we started manually recombining and tuning those terms, and then it worked out. But yeah, we didn't do anything fancy; it's a manual process.
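A sketch of the single-line precision switch described here (the actual GRay source may differ in detail): every physical quantity is declared with one alias, so flipping one typedef rebuilds the whole code in the other precision for comparison.

```c
#include <stdio.h>

/* Flip this one line to rebuild everything in the other precision. */
typedef float real;            /* or: typedef double real; */

/* One forward-Euler step of the radiative transfer law
   dI/ds = j - alpha * I, written entirely in terms of "real". */
real rt_step(real I, real j, real alpha, real ds)
{
    return I + ds * (j - alpha * I);
}

int main(void)
{
    real I = 0;
    for (int i = 0; i < 1000; ++i)
        I = rt_step(I, (real)1.0e-7, (real)0.1, (real)0.01);
    printf("I = %g  (sizeof(real) = %zu)\n", (double)I, sizeof(real));
    return 0;
}
```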
C: So how do you make sure that the results will be in the right range for any input?
A: Yeah, to be honest, this is not valid for all inputs. But when we talk about these accreting black holes, there is a range for their density and a range for their luminosity. So we know that, within the range we are interested in, the values fall into the valid range. Actually, let me go back to this slide.
Yeah, so if you look at this slide here, we actually have some comments in our code showing, within the range that we are interested in, what the typical value of each parameter is and what the ranges of these different quantities are. So again, everything is done manually, but we just find this very, very useful, because it allows us to turn a double-precision calculation into a single-precision one.
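A hypothetical example of the kind of range comment being described, with illustrative values rather than ones taken from the actual GRay source:

```c
/* Parameter ranges for the accretion flows we target:
 *
 *   ne : electron number density [cm^-3]  typical 1e4  .. 1e8
 *   Te : electron temperature    [K]      typical 1e9  .. 1e12
 *   B  : magnetic field strength [G]      typical 1    .. 1e3
 *   nu : observing frequency     [Hz]     typical 1e9  .. 1e19
 *
 * After regrouping, all derived quantities stay well inside the
 * FP32 range (~1e-38 .. 3e38) for inputs within these bounds.
 */
```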
B: So we have another question from the organizers, but before we go to that, there's a small announcement. We have a tutorial on OpenACC starting in about 15 minutes, and the tutorial will take place in the breakout room, the other Zoom link that we have. I think it will be going up in the chat shortly; or, if you're in the Slack channel, you might have received the email as well. So if you're registered for the tutorial, please start heading over there, and you can start testing out your accounts.
D: Hi there. I was just wondering if you would comment on the relative value of physically transferring data from remote locations, versus either investment in infrastructure to bring the data out, or some sort of local processing, maybe using something like GPUs.
A: Oh, so I guess you are referring to the observation part of my project, instead of the ray tracing calculation?
D: A little bit! Well, it's more that the observations were done at a variety of pretty remote sites, and you talked about having to physically transport the data, as in hard drives, presumably leaving Antarctica on a ship or a plane. I was just wondering: was there consideration given to putting a processing center or something like that close to the instrument, rather than moving the data?
A: Yeah, this is a very good comment. It turns out that because the signal is very weak, most of what we are observing is noise, and the step we use to combine the data is the correlation, and the correlation needs to be done between each station pair. So you can't really just pre-process the data at a station and reduce its size.
That's the reason we have to ship all the data to our data centers. Now, having said that, the EHT does use a very high bandwidth, and we do have some very sensitive stations. So in the future it is actually possible to pre-process, in the sense that we reduce the bandwidth.
We could reduce the recording bandwidth, and once we do that, we can reduce the data by a factor of 10 or so. But even with that, it is just faster to ship the hard drives, our standard data flow, by airplane. Definitely, putting computation near the observation actually makes a lot of sense for many applications, but for VLBI it is a little bit more complicated.