From YouTube: 08 Putting it all together in real codes using LULESHmk
A
Let's use OpenACC as a reference for this more realistic code example: LULESHmk, a hydrodynamics application from physics. In the end, it's all about the same. Let's locate, through profiling, which is the hotspot function. Here you can see that the code consists of more than 11 functions, not one single one, and you have at least four functions that consume more than 10 percent of the runtime. So eventually, or potentially, you will want to optimize the performance of these four functions.
A
Typically, measuring correctness is not that simple. It requires knowledge about the scientific domain, and it requires the scientists and the computational scientists to produce some figures of merit or some metrics that we can use to verify that the code is correct. Okay, so verifying correctness is not easy, but here in LULESHmk we just need to check that it is producing the same output that we can see here.
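As a rough illustration of checking a figure of merit against a reference run, the sketch below compares a checksum using a relative tolerance. The names `checksum` and `matches_reference` are hypothetical and not part of LULESHmk. The tolerance is an assumption, chosen because a parallel run may add floating-point values in a different order than the sequential run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical figure of merit: the plain sum of all values. */
double checksum(const double *values, int n)
{
    assert(n >= 0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += values[i];
    }
    return sum;
}

/* Returns true when `computed` agrees with `reference` within a relative
   tolerance. The tolerance is an assumption; choose it per application. */
bool matches_reference(double computed, double reference, double rel_tol)
{
    double diff  = computed > reference ? computed - reference
                                        : reference - computed;
    double abs_c = computed  < 0.0 ? -computed  : computed;
    double abs_r = reference < 0.0 ? -reference : reference;
    double scale = abs_c > abs_r ? abs_c : abs_r;
    return diff <= rel_tol * scale;
}
```

A caller would compute the checksum of the parallel output and pass it, together with the checksum of the known-good sequential output, to `matches_reference`.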
A
We have seen this. So essentially, what we propose to you comes from a real application: one file, 15 functions, almost 20 loops, 500 lines of code. Something a bit bigger than the four lines of code in one single loop in one function in the pi example.
A
Here we have something that we call sparse reductions. In the end, it's a type of computation that appears if you are using molecular dynamics, or evaluating finite elements, finite differences, or particle methods: you will find this kind of computation in the code. That is, having indirections on an array that is a reduction array, and you need to protect those indirections from introducing race conditions at runtime.
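A minimal sketch of such a sparse reduction, assuming a hypothetical function `sparse_reduce` that is not taken from LULESHmk: several iterations may update the same entry `out[idx[i]]`, so the update is protected with an OpenACC atomic. A compiler without OpenACC support ignores the pragmas and runs the loop sequentially.

```c
#include <assert.h>

/* Sparse reduction: iteration i accumulates contrib[i] into out[idx[i]].
   Different iterations may share the same idx[i], so the update must be
   atomic when the loop runs in parallel. All names here are hypothetical. */
void sparse_reduce(int n, const int *idx, const double *contrib,
                   int m, double *out)
{
    assert(n >= 0 && m >= 0);

    /* Array ranges use the OpenACC form [start:length]. */
    #pragma acc parallel loop copyin(idx[0:n], contrib[0:n]) copy(out[0:m])
    for (int i = 0; i < n; ++i) {
        /* Protect the indirect update against race conditions. */
        #pragma acc atomic update
        out[idx[i]] += contrib[i];
    }
}
```

Every value in `idx` is expected to lie in the range 0 to m-1; the sketch does not check this.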
A
This is what we call sparse reductions. You have more extensive documentation about these compute patterns, which are at the heart of Codee, in previous materials published on the NERSC website. We can point you to those materials if you are interested, but it should be enough to use the current capabilities of Codee. It should not be necessary for you to dig into the theory or the rationale behind these computation patterns. And starting from this, let's follow the same path.
A
Report and evaluation; enable GPU and multithreading; focus on the loops, particularly the low-hanging fruit, the low-difficulty loops; then the actions, the level-two actions you can see here, with these sparse computation patterns that are highlighted here. And finally, just following the instructions, the OpenACC directives will produce the code. You will see that the sparse reductions have been optimized for parallel execution, protecting the indirections with atomic instructions to avoid race conditions. And you will also see that Codee produces incomplete OpenACC code.
A
This is pointed out here. Here you have the template that is produced: these brackets, colon, brackets. If you try to compile this OpenACC application, the OpenACC compiler could report an error. This is intentional, so that you know exactly where the error is, and you can fix it just by introducing the array shape. LULESHmk uses only 1D arrays, so it should be fairly easy: just the maximum size of the vectors. These are 1D vectors.
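To illustrate filling in the array shape, here is a small sketch on a hypothetical function `scale_vector`, which is not LULESHmk code. It assumes the generated template leaves the range empty, roughly as `v[:]`, and that the fix for a 1D vector of `n` elements is the range `v[0:n]`.

```c
#include <assert.h>

/* Assumed shape of the incomplete template described in the talk:
 *     #pragma acc data copy(v[:])
 * For a pointer, the compiler cannot know the length, so the range is
 * completed by hand as v[0:n] (start 0, length n). */
void scale_vector(int n, double factor, double *v)
{
    assert(n >= 0);

    #pragma acc data copy(v[0:n])
    {
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i) {
            v[i] *= factor;
        }
    }
}
```

Built without OpenACC support, the pragmas are ignored and the function still scales the vector sequentially, which makes it easy to compare both builds.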
A
Finally, these are the numbers, and this is what you get. Here we are seeing that the code runs correctly on the GPU. We have the first starting version for the GPU and, as usual for a first version running on the GPU, it doesn't run faster. It runs with more or less the same performance, or a bit slower, because, remember, for real applications it is not enough to do this. You will need to address additional GPU challenges that we will see in the more advanced courses.
A
Okay, so we expect this LULESHmk example to be further optimized in the upcoming courses in September and October. So this is the typical situation you will see when starting on the GPU with real applications. Okay, and finally, the typical remarks. Remember to load the module. Help is your friend for all the command-line tools. Run the scripts in the appropriate reservations. And to understand the issues, please review, read, and go through the knowledge base; a lot of information illustrated by example is there. And that's it.
A
That's what I wanted, or we wanted, to cover right after the break, in just 10 or 15 minutes. Any questions or any remarks, Helen, about this example code? What comes next: free hands-on time for people to play with LULESHmk, finish the previous labs, or, why not, try to start with their own application.
B
Yeah, free hands-on time, mostly to work on these LULESH codes, and not just blindly apply those directives, but look into the code to see what the tool has done to it. And then, if you want to work on your own code, that's also a good opportunity, because the developers are here to answer your questions. Also check out the Q&A. I like one of the questions from yesterday that was answered today: whether Codee is unique.
B
I think I like Codee; that's just my opinion. First of all, it helps novice programmers by inserting all these directives for you. This is a big help. One of the other tools I know about is the Cray Reveal tool. For that one, you have to use the Cray compiler only, and then, of course, after the code is generated, you can use any other compiler. But the Codee tool you can just run on a login node with, I think, any backend.
B
The other advantage of this tool is that it has a large knowledge base and recommendations with performance optimization suggestions, so take good advantage of these. I think that's also a big plus. I hope our attendees learned something and will apply this tool. And thank you very much to the Appentra team for the training for us. The tool is now called Codee. The company is Appentra; it's a rebranding.
B
So if you have used this before, it was the Parallelware Analyzer tool, I think. And I appreciate all the users attending the training; maybe spread the knowledge to your colleagues and friends. And thank you to the NERSC and Oak Ridge users and to the organizers, Suzanne, and all who have been here today.