A
If you remember, we are going to use the matrix-matrix multiplication example, the same implementation we use today, and we are going to use this example to identify defects in the data transfers of our OpenMP or OpenACC codes. What this means is that, in the implementation code, we have a part of the main section that is allocating a double-pointer data structure.
A
It seems that the data is contiguous in memory, but it actually is not. So if we modify the matrix multiplication example just by adding this pragma, `#pragma omp target teams distribute parallel for`, to do the data transfer and the parallelization of the computation on the GPU, we need to specify what to map from the CPU memory to the GPU memory.
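To make the allocation pattern concrete, here is a minimal sketch of a double-pointer matrix, assuming the rows are allocated one by one. The lab's actual allocation code is not shown in this session, and `alloc_matrix` and `free_matrix` are hypothetical names. Each row comes from its own allocation call, so nothing guarantees that the rows are adjacent in memory, and the top-level array holds only host addresses.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an n x m matrix as an array of row pointers.
   Every row is a separate heap block, so rows need not be adjacent in memory. */
double **alloc_matrix(size_t n, size_t m) {
    double **mat = malloc(n * sizeof *mat);
    if (!mat) return NULL;
    for (size_t i = 0; i < n; i++) {
        mat[i] = calloc(m, sizeof **mat);   /* zero-initialized row */
        if (!mat[i]) {
            /* Roll back the rows allocated so far. */
            for (size_t k = 0; k < i; k++) free(mat[k]);
            free(mat);
            return NULL;
        }
    }
    return mat;
}

/* Hypothetical helper: releases a matrix created by alloc_matrix. */
void free_matrix(double **mat, size_t n) {
    if (!mat) return;
    for (size_t i = 0; i < n; i++) free(mat[i]);
    free(mat);
}
```

Mapping only `mat` to the GPU would transfer the row pointers (host addresses) at most, not the rows they point to.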
A
So here, in an intuitive manner, we can think that mapping pointers A, B and C to the memory of the GPU, and back from the memory of the GPU the result in pointer C, should be the specification, and the implementation of OpenMP should be smart enough to navigate through all the data, transfer it to the GPU memory and transfer it back. But again, there is a hidden problem of deep copy here.
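One way to sidestep the deep copy problem, assuming it is acceptable to change the data layout, is to store each matrix in a single contiguous block and map explicit array sections. This is only a sketch under that assumption, not the fix prescribed by the lab or the knowledge base; `matmul_flat` is a hypothetical name.

```c
#include <stddef.h>

/* C = A * B for n x n matrices stored contiguously in row-major order.
   Each map clause names a full array section, so the whole buffer is
   transferred, not just a pointer. When the compiler is run without
   OpenMP support, the pragma is ignored and the loop runs on the host. */
void matmul_flat(size_t n, const double *A, const double *B, double *C) {
    #pragma omp target teams distribute parallel for \
        map(to: A[0:n*n], B[0:n*n]) map(tofrom: C[0:n*n])
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}
```

Keeping the double-pointer layout is also possible, but then every row has to be mapped explicitly in addition to the array of row pointers.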
A
So what you will be seeing in this lab is that, by following exactly the same process, you will get the performance optimization report, you will get the entry-level report, and you will get, in the list of actions, a defect, in this case defect number six, which we encourage you to look up in the knowledge base to see how it is described and explained.
A
And then you compile and execute this code with the defect, and you see that the code will break: it will fail to execute because it is accessing illegal addresses in memory. It's not accessing the correct data in memory. Okay. So this is all we wanted to say. You have all the data in the same repository as today and yesterday, the same set of files, and you already have the source code and the launch script with this implementation.
B
And for today's exercises we actually have updated example codes. There are more labs, lab 6, lab 8 and Fortran, with much more detail, in the April 2022 example codes, so grab a new copy. I'll put the link as a copy command in chat for the new directory, so you get the updated exercises, and rename the old one to day one.
B
Meanwhile, I think it's a good idea, maybe [name unclear] can do a demo for the defect one.
C
So it's the regular matmul, but it has this pragma, this OpenMP pragma, that has an error in the map clause. So let's analyze it with Codee and see what it reports.
C
And by default, we have multithreading and offloading disabled, but we can enable them. Sorry. Okay, for projects that use libraries and include paths that are not the current path, you can use `--` to pass compiler arguments. In this case, we need to pass `-I` and `include`, that is, the folder where the include path is. So now it's successfully analyzed, and we have to enable the multithreading and offloading with `--include-tags all`.
C
And now the tool will tell us to focus the analysis on a function or a loop. In this case, we're going to analyze the main function.
A
I think it's not really needed, because that was covered in lab 2 yesterday. So here we want to point out that Codee can detect OpenMP and OpenACC programs that seem to be correct but that, because of some hidden problem like deep copy, will make our codes break. So I think it's enough. Helen, do you think it's also enough?
B
Yeah, anyway, lots of questions. I would suggest doing an sbatch to run this. Otherwise, this one is running on the login node, but you'll see the same error on the GPU node as well.
C
And here we have the three opportunities: opportunities for multithreading, SIMD vectorization and offloading. The SIMD vectorization opportunity is associated with this remark 11, which gives us feedback from the vectorization cost model that Codee has. But now we will focus on the offloading opportunity. We will increase the level of detail of the analysis using the flag `--level 2`, as suggested in the first section, and here we can see the suggestions made for opportunity three.
C
Yes, and now, as Fortran support is experimental in Codee for now, we detected a bug in the generated code for OpenMP offloading, but we can easily fix it manually, in this case with vim.
C
Basically, we need to change the `for`, the `for` keyword, which is the one for C, and we have to change it to `do`, and we have to delete the `schedule(static)`, which is failing with the NVIDIA Fortran compiler.
C
Okay, here it is. So the sequential version executed in three seconds, the OpenMP offloading version in 0.17 seconds and the OpenACC version in 0.14 seconds, and the results are okay. So this is how we analyze a Fortran code, in this case PI, using Codee.
A
No, no, I think it's pretty clear that the same workflow we have proposed since yesterday for all the codes also works for Fortran, which is what, as a user, we might expect. Helen, would you like to point out something or repeat something?
B
Something: let's say that Fortran support is experimental. As you can see, when you run `pwdirectives`, there's a first line that says "Warning: Fortran support is experimental". That's why, once this is finished, you have to make some changes to the generated code. One of them is that `parallel for` has to change to `parallel do`. Another one is related to a possible compiler bug with `schedule(auto)`, which was not supported for Fortran.
B
Any `schedule(auto)` has to be changed to `schedule(static)` for the C code, but the same schedule doesn't work for the Fortran compiler, so that's a potential compiler bug. Otherwise, I hope the Fortran support will become much more mature in future releases, because we have lots of Fortran users at NERSC and Oak Ridge.
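For reference, this is roughly what an offloaded loop with `schedule(static)` looks like on the C side. It is a sketch only: the PI source is not shown in this session, so the code assumes the usual midpoint-rule integration of 4/(1+x²) over [0, 1], and `compute_pi` is a hypothetical name. In the Fortran version, the same directive would use `do` in place of `for`.

```c
/* Hypothetical PI kernel: midpoint-rule integration of 4/(1+x^2) on [0,1].
   schedule(static) is the clause discussed above; the reduction combines the
   partial sums and map(tofrom: sum) brings the result back to the host.
   Without OpenMP support the pragma is ignored and the loop runs serially. */
double compute_pi(long n) {
    double sum = 0.0;
    const double h = 1.0 / (double)n;   /* width of each subinterval */
    #pragma omp target teams distribute parallel for schedule(static) \
        reduction(+: sum) map(tofrom: sum)
    for (long i = 0; i < n; i++) {
        double x = ((double)i + 0.5) * h;   /* midpoint of subinterval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * h;
}
```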
B
But this year this is very encouraging, because for your development work there were lots of Fortran dependencies for your tool development work, the LLVM and Flang stuff like that. Yeah. So there is also a matmul Fortran code that people can play with as well.
C
So let's see the matmul Fortran code.
C
And now we have eight actions, but we want to see the offloading actions again, so we have to enable them with `--include-tags all`.
C
And well, here we have three opportunities, sorry, two opportunities, one for multithreading and one for offloading, and a remark from the vectorization cost model, again stating that this is not a SIMD opportunity due to strided memory accesses in the loop body. But again we will focus on the opportunities for offloading.
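The strided access mentioned in the remark can be seen in a plain C sketch of the kernel, assuming the classic i-j-k loop nest over double-pointer matrices. The session analyzes the Fortran version, whose source is not shown here, and `matmul_ijk` is a hypothetical name.

```c
#include <stddef.h>

/* C = A * B with the classic i-j-k loop order on row-pointer matrices.
   In the innermost loop A[i][k] walks along one row (unit stride), while
   B[k][j] jumps to a different row on every iteration (strided access). */
void matmul_ijk(size_t n, double **A, double **B, double **C) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}
```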