Description
Part of the Nvidia HPC SDK Training, Jan 12-13, 2022. Slides and more details are available at https://www.nersc.gov/users/training/events/nvidia-hpcsdk-training-jan2022/
A
Okay, welcome everyone. Happy New Year 2022.
A
Today we're having the NVIDIA HPC SDK training. It's a joint event for NERSC, Oak Ridge, and Argonne users, and some other people who are interested in this training as well. Our organizers are myself and Chris Daley from NERSC; Suzanne Parete-Koon, Tom Papatheodore, and Sherry Ray from Oak Ridge; and Yasaman Ghadar and Ray Loy from Argonne. Suzanne and I are going to present this welcome slide.
A
First of all, the NVIDIA HPC SDK is a comprehensive suite of Fortran, C, and C++ development tools and libraries. It is the default recommended compiler for Perlmutter GPUs. This hands-on training is provided by NVIDIA.
A
Thank you very much, Jeff Larkin, Brent Leback, Max Katz, Matt Stack, and Robert Searles. The presenters also helped with all the preparation of the hands-on exercises for us. The topics of this training include GPU architecture, HPC software developer considerations, standard language acceleration, libraries, OpenACC, OpenMP offload, CUDA, and profiling tools. First, some of the logistics.
A
Everyone is muted upon joining, and I would like you to please change your name in the Zoom session to your first name and last name, so it will help us know who you are. You can click Participants at the bottom, then move next to your name, and you can rename it.
A
As I mentioned in the chat, the captions and the full transcript view are enabled. You can save the transcript if you like, and you can turn the captions on and off. If you haven't joined Slack yet, please join; I'll post the link again in Zoom, in case you joined the Zoom session late. We prefer to use Slack instead of Zoom for questions. It's threaded and also recorded.
A
So we'll be posting the slides in the presentations channel in Slack, and we'll post-process the videos and publish them as well later. We have hands-on exercises; this is the GitHub repo. We will also post it again in chat or in Slack for you. NERSC users will use Perlmutter, Oak Ridge users will use Summit, and other users, including ALCF users, will use a NERSC training account. You should have received another email about applying for one. We also prepared a survey; please help us by answering the survey questions after the training. Quick agenda.
A
This is day one, today. I don't need to repeat the talks and everything; I just want to mention that we have a break at 10:30 for 15 minutes, and then at the end of the day there's a big demo and lab breakout session.
A
You would go to the GitHub repo and start to work on your own for a little bit, and at 12 o'clock we'll have a short demo, and then you can have the remainder of the time to continue working on the hands-on. It's similar for day two, again with a break and the lab and demo at the end. Day two has OpenACC, OpenMP, and CUDA; just to repeat, day one is the stdpar part and profiling.
A
So, some quick Perlmutter usage info. Existing users have already been added to the training project. This is purely for the purpose of accessing the compute node reservation for today and tomorrow; you don't have to use it outside of the reservation hours. For those of you using a training account, the training account on Perlmutter expires on January 18th.
A
For stdpar, you would add -stdpar, and for some libraries, -l plus the library. For OpenMP, -mp=gpu -gpu=cc80, which is the A100 GPU on Perlmutter. For OpenACC, -acc (the =gpu is actually optional because it's the default) and then -gpu=cc80 again. For CUDA, just -cuda. Those languages can be mixed in the same application code as well. We do recommend you use the -Minfo flag when compiling; it will show you lots of detailed compilation information.
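The flags just described can be collected as follows. This is a minimal sketch, not the workshop's actual script: the source file name `app.cpp` and the output name `app` are hypothetical, and the compile step only runs when `nvc++` and the source file are present; otherwise the command is simply printed.

```shell
#!/bin/sh
# Flag sets as stated in the talk. cc80 is the compute capability of the A100.
STDPAR_FLAGS="-stdpar"
OPENMP_FLAGS="-mp=gpu -gpu=cc80"
OPENACC_FLAGS="-acc -gpu=cc80"   # "=gpu" after -acc is optional (default)
CUDA_FLAGS="-cuda"
INFO_FLAGS="-Minfo"              # prints detailed compiler feedback

# app.cpp is a hypothetical OpenMP offload source file.
CMD="nvc++ $OPENMP_FLAGS $INFO_FLAGS -o app app.cpp"

if command -v nvc++ >/dev/null 2>&1 && [ -f app.cpp ]; then
    $CMD
else
    # Compiler or source not available here: just show the command.
    echo "$CMD"
fi
```

Swapping `$OPENMP_FLAGS` for one of the other variables gives the corresponding stdpar, OpenACC, or CUDA build line.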
A
And then there's also the NVCOMPILER_ACC_NOTIFY environment variable. It's pretty useful. You can set it to 1, 2, or 3, and it shows you all the kernel launches, data transfers, etc. It works for CUDA, OpenACC, and OpenMP offload programs.
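As a small illustration of that setting, the sketch below enables both kinds of messages before running a program. The executable name `./app` is hypothetical, and the meaning attached to each value (1 for kernel launches, 2 for data transfers, 3 for both) is my reading of the NVIDIA compiler documentation rather than something spelled out in the talk.

```shell
#!/bin/sh
# 1 = report kernel launches, 2 = report data transfers, 3 = both.
export NVCOMPILER_ACC_NOTIFY=3

# ./app is a hypothetical GPU executable built with the NVIDIA compilers.
if [ -x ./app ]; then
    ./app
fi
```

Unsetting the variable turns the messages off again.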
A
Here's a sample compile and run script. I'm not giving everything, just a stdpar and an OpenMP offload example. Like I said, you could use nvc++ natively or the wrapper, CC, for C++ code. You could use nvfortran or ftn for Fortran code, with the respective flags here as well. And then here is the sample batch script. Today we have reserved nodes. We would like you to use sbatch instead of interactive salloc, so everybody can have a turn to use these nodes.
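A batch script of the kind mentioned here might look roughly like the sketch below. It is not the script shown on the slide: the account, reservation, resource counts, and executable name are all placeholders, and the submit step only runs where `sbatch` exists.

```shell
#!/bin/sh
# Write a small Slurm batch script. The account, reservation, resource
# counts, and executable below are placeholders, not the workshop's values.
cat > job.sh <<'EOF'
#!/bin/bash
# Options, in order: account, reservation, GPU node constraint,
# node count, GPUs per node, CPUs per task, wall-clock limit.
#SBATCH -A my_account
#SBATCH --reservation=my_reservation
#SBATCH -C gpu
#SBATCH -N 1
#SBATCH --gpus-per-node=1
#SBATCH -c 32
#SBATCH -t 00:10:00

srun -n 1 ./app
EOF

# Submit as a batch job rather than holding nodes interactively.
if command -v sbatch >/dev/null 2>&1; then
    sbatch job.sh
else
    echo "sbatch job.sh"
fi
```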
A
There are a few flags here. For normal nodes, you have -C gpu for the feature, the number of nodes, the number of GPUs, the number of CPUs, etc. And here, actually, this is updated, so let's look here: some of the links for compilation on Perlmutter are here, and for running jobs. This was updated as of last night. So, if you've been looking through the Perlmutter information, you go to the Perlmutter web page and you would be able to find the running jobs section as well.
A
For Nsight Systems, we had another link here, and then you would run with nsys profile --stats=true in front of your executable.
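The Nsight Systems invocation just mentioned can be sketched like this. The executable `./app` and the report name `report` are hypothetical; the command is printed rather than run when `nsys` or the executable is not available.

```shell
#!/bin/sh
# --stats=true prints summary statistics after the run finishes.
# ./app and the output name "report" are hypothetical.
CMD="nsys profile --stats=true -o report ./app"

if command -v nsys >/dev/null 2>&1 && [ -x ./app ]; then
    $CMD    # inside a Slurm job, this would typically follow srun
else
    echo "$CMD"
fi
```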
We do recommend installing NX; NX stands for NoMachine. It greatly improves X forwarding when using GUI tools. We have instructions here. This is very useful for the profiling tools today. If you don't have time to install it yet, you can do it afterwards.
A
There's another added advantage to this: it's sort of similar to screen, so it remembers where you are. Even if your internet connection drops, or you log out and log in again, the session is still there for you, right where you left off.
B
Thank you, Helen. Okay, so for the Summit information: you're going to run with your user account ID, but we have a reservation that goes from the beginning to the end of this training. So today you would add bsub -U with the "nvidia sdk 1" reservation, and then tomorrow, same deal, the reservation lasts the duration of the training, but it's "nvidia sdk 2". To load the nvhpc module: it is not the default, so you'll have to module load nvhpc/21.9.
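Put together, the Summit steps described here might look like the sketch below. The reservation string is a placeholder, since the exact spelling of the reservation names is not given in the talk, and `my_job.lsf` is a hypothetical batch script name.

```shell
#!/bin/sh
# Placeholder: substitute the reservation name announced for the day.
RESERVATION="your_reservation"
# my_job.lsf is a hypothetical LSF batch script.
SUBMIT_CMD="bsub -U $RESERVATION my_job.lsf"

# nvhpc is not loaded by default, so load it explicitly where
# environment modules are available.
if command -v module >/dev/null 2>&1; then
    module load nvhpc/21.9
fi

if command -v bsub >/dev/null 2>&1 && [ -f my_job.lsf ]; then
    $SUBMIT_CMD
else
    echo "$SUBMIT_CMD"
fi
```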
B
We have an example batch script. You probably won't have to use this cheat sheet, because all of these things are in the Git repo that NVIDIA has prepared for you. And then you, of course, would submit your batch scripts with bsub your_batch_script.lsf, but those are all provided for you in the hands-on, and I suppose most of you, since you're users, already know these things. Next slide.
B
The GUI is not supported on Summit, so you might have gotten an email from me just before the training saying that you need to download the Nsight UI if you want to look at your profiles visually. There are two ways to do that. You can download it as an NVIDIA developer, which has versions for Windows, Mac, and x86 Linux.
B
You'll need to register as an NVIDIA developer before you do this, though, and if you click that link, it will first send you to the registration. You can use any email, so yourself@hotmail.com; it doesn't matter, you don't have to use your work email. The other way, which you can do without having to register as an NVIDIA developer, is to download the whole CUDA Toolkit, which has a wealth of useful software.
B
But you will need to install that entire system on your computer. Another point on Summit: when you do your profiles during the training (I think this is noted in the repo), make sure you're writing your Nsight profile reports to GPFS. The compute nodes cannot write to your home area, so either download the Git repo in GPFS or redirect your output in your batch script to go to GPFS. I think Max and Robbie and company have already taken care of that in the profiling.
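One way to redirect the report output, sketched under assumptions: `REPORT_DIR` is a hypothetical variable that should point at a directory on GPFS, the `/tmp` fallback exists only so the sketch runs anywhere, and `./app` and `my_report` are placeholder names.

```shell
#!/bin/sh
# REPORT_DIR should be a directory on GPFS. The /tmp fallback is a
# placeholder so this sketch can run outside Summit.
REPORT_DIR="${REPORT_DIR:-/tmp/nsys_reports}"
mkdir -p "$REPORT_DIR"

# -o tells nsys where to write the report; ./app is hypothetical.
CMD="nsys profile -o $REPORT_DIR/my_report ./app"

if command -v nsys >/dev/null 2>&1 && [ -x ./app ]; then
    $CMD    # in an LSF batch script this would follow the job launcher
else
    echo "$CMD"
fi
```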