Description
Part 4A from the Parallelware Trainer Tool workshop at NERSC on June 6, 2019. Slides are available at https://www.nersc.gov/users/training/events/parallelware-tool-workshop-june-6-2019/.
A
You have this as well in the participant material, but I will leave it open here anyway. So this is what you have to do. What you are asked to do as the first exercise is to use Parallelware Trainer to open the PI project and get familiar with all of the code generation options, understanding the patterns and the different features of the graphical user interface.
A
These are the examples, which contain the PI, LULESHmk, and HEAT practicals that we propose to you, so copy them from this global directory into your home directory. Then you need to load some modules to use Parallelware Trainer. Here you have the whole sequence of modules that you need to load on Cori to have the appropriate OpenACC and OpenMP offload support for GPUs in these compilers. Just copying and pasting these commands should work in your terminals.
A
Okay, so once you allocate this interactive session, you essentially have a terminal where you can run commands. Just invoke pwtrainer with an ampersand and the graphical user interface will appear on your screens, and at that point you can start. You will need to activate the software the first time you run it on your accounts. You will be presented with a dialog in which you need to open this file to activate the software and have access to its capabilities. The file is shared in this path, which should be accessible to all of you.
A
Yes, you all have copies of the worksheet. At the beginning you have a section with all of this. You also have these dialogs, in case you don't remember them, and here you have the commands that you need to use, with GCC or PGI and with OpenMP or OpenACC, for instance, to compile the PI example.
A
Okay, so you can just copy and paste them into the appropriate dialog of the UI to build the binary code, the executable code, and to run the sequential code and the code either on the CPU or on the device using the GPU. You have here the commands you need to use. And finally, we have been talking about the decomposition into patterns. So what is PI? PI is very simple. It's just one single loop with one single variable.
A
The recommendation in the worksheet, which we were discussing yesterday among the trainers, is that you first get familiar with all of these patterns, strategies, implementations, and the UI through the PI example. It's one loop, very simple to understand, and you already have the notion of a scalar reduction. Do that before going on to the complexity of LULESHmk.
A
The second exercise I will propose uses LULESHmk. Essentially, this is a hydrodynamics code. You really don't need to know anything about the science. As I remarked before, you just need to focus on the code and understand the properties of the code. This is what you are really parallelizing, and indirectly, of course, you are parallelizing the science. So again, in that practical you have the Trainer UI, all the options to run the Trainer, and all the commands to build and run LULESHmk. And finally, you are provided with three additional slides.
A
As we said in the general workflow, this is probably the first time you see this code. So where do you begin? You are presented with twenty routines and 12 loops. Where could you begin? You can go one after another sequentially, or you can do something more intelligent. To do something more intelligent, following the general workflow, begin by profiling the application. Just run these commands, which enable -pg.
A
You will be able to run the code, and with gprof you will get this kind of output. It will probably be your first profiling of an application, and you can later reuse these commands to try to profile your own, more general applications. What you have here is a ranking of the routines of the code and the time spent executing each of them, so you can clearly identify which routines consume most of the execution time and begin working on those routines.
A
Okay, so this is a first experience with profiling for some of you. When you run the sequential version and you run the parallel version, how do you know that the code has been parallelized correctly? You need to verify the output somehow. PI is very simple: just look at the three point something, and at a glance you can say, "Okay."
A
This is correct. But LULESH is much more complicated, so we have extended the output of the LULESHmk example a bit, and here we highlight in boldface what you need to focus on to verify that the code is correct. So when you generate a parallel version, run it. Keep the sequential execution as a reference, see whether the elapsed time has been reduced, and also check that these values are more or less of the same order, because if there is a very significant difference, then something has not been parallelized correctly.
A
So it's important that when you generate parallel code, before iterating and parallelizing another loop, you always run it and verify that it is numerically correct. It doesn't matter if it accelerates and runs faster: if it gives you incorrect results, it has not been correctly parallelized. So don't focus only on the time, but also on the numerical verification. And finally, just to give you a glimpse of the complexity, this is the table you need to fill in.
A
So this is for LULESHmk. In PI, you have only one row with only one example. Here you have eight or nine routines with 12 loops, and here you have the solutions to some of the cells. So in the loop at 973, you will find a loop that contains a parallel forall pattern that computes and saves the result in a variable of this name.
A
So you need to go to this loop and see whether this is the only variable computed in the loop or whether there are more. If there are more variables, you need to write the name of each variable in the cell corresponding to its pattern. So we are giving you an example of a loop with a parallel forall, a loop with a parallel scalar reduction, and a loop with a parallel sparse reduction. In boldface is the main hot spot of the code.
A
What you have here are sparse reductions. And forget about this convergence loop; this one should have been deleted from this table. Okay, so you need to fill it in with the names of the variables in the appropriate cells of these columns, and we have the results. So at the end of the practical, before closing the session, we will spend some time discussing the results for PI on Cori and for LULESHmk on Cori, so that we can all share experiences and results and learn from each other. Is that okay?
A
Okay, so let's stop here for a working lunch and begin to work on this, and just let us know how we can help you during these practicals. For those of you who have your own codes, we would first suggest doing some of the practicals, and once you get used to the concepts of patterns and things like this, we can try to apply them to your own codes. But it's up to you if you want to start with your own code.
A
We can work with you to see how it is organized, but I think it's better to invest at least one hour trying to understand all the outcomes of these lectures. But please understand that it's up to you to decide how you want to do it. This is a free consultation, so you decide what you want to invest your time in. Okay.