From YouTube: 006 ONNX 20211021 Kuah ONNX and OneAPI for xPU
Description
Event: LF AI & Data Day - ONNX Community Meeting, October 21, 2021
Talk Title: Intel® oneAPI software stack: ONNX Support for xPU hardware
Speaker: Kiefer Kuah (Intel)
So good morning, everybody. In this talk, we'll introduce what oneAPI is and what it has to do with ONNX.
Okay, so to answer the question of what oneAPI is, maybe I'll start with a description of the problem that we want to solve with oneAPI. Intel has built several xPUs, where the x can represent C (CPU), G (GPU), or V (VPU), and this is to meet the requirements of different types of apps and workloads.
The different architectures of these xPUs present a challenge to developers. It's great to have all these different devices and their strengths for different apps, but with this many heterogeneous devices, developing for them is challenging, and developing your code so that the apps run optimally is even more challenging. And it's not just the xPUs themselves: with every new generation of these xPUs there will be new instructions and new technology.
That means that if you want to keep updating your apps or your workflows to be able to use these new technologies, you have to constantly be updating your code, so development cost, time, and effort will grow very quickly. oneAPI was conceived to alleviate some of this cost and effort. It won't completely remove 100% of it, but it will lower the cost, effort, and time needed to develop code for each of these xPUs.
So it is a unified API where, ideally, you write your code once and are able to deploy your apps to multiple devices, including the new technology that comes up in each new generation of devices that Intel will release.
oneAPI has several components. One of them is oneAPI DPC++ (Data Parallel C++); it's the programming model, or programming extension, for doing data-parallel programming in C++. The other one is oneDPL, the corresponding library for parallel code; it's sort of like the STL, but for parallel programming. The other one that I think is relevant to ML apps and ML workloads is oneDNN, a library of primitives that support the different ops found in deep learning topologies, or deep learning graphs, such as convolution and matrix multiplication.
These are highly optimized kernels. The last one I'll highlight is oneCCL. It provides primitives for the communication patterns that occur in deep learning applications, so it can be used to support scale-up for platforms with multiple oneAPI devices, or scale-out for clusters with multiple compute nodes.
So I'll drill down into more detail about oneDNN, because I think that's what is very relevant to ONNX and ONNX Runtime. The oneDNN library is a collection of optimized primitives, or ops, used in executing deep learning graphs, and we think that this library can improve developer productivity and enhance the performance of deep learning frameworks.
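To make concrete what kind of primitive such a library provides, here is a minimal sketch (plain NumPy, not the oneDNN API) of the direct 2-D convolution that a library like oneDNN replaces with highly tuned, vectorized kernels:

```python
import numpy as np

def naive_conv2d(x, w):
    """Direct 2-D convolution (no padding, stride 1).

    x: input feature map, shape (H, W)
    w: filter, shape (kH, kW)
    Returns an output of shape (H - kH + 1, W - kW + 1).

    This loop nest only shows what the primitive computes;
    an optimized library implements the same math with
    cache-blocked, vectorized kernels.
    """
    H, W = x.shape
    kH, kW = w.shape
    out = np.zeros((H - kH + 1, W - kW + 1), dtype=x.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kH, j:j + kW] * w)
    return out

x = np.arange(16, dtype=np.float32).reshape(4, 4)
w = np.ones((2, 2), dtype=np.float32)
y = naive_conv2d(x, w)  # shape (3, 3); y[0, 0] = 0 + 1 + 4 + 5 = 10
```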
This library supports key data type formats that are used in deep learning, such as fp16, fp32, bfloat16, and int8, and it implements a variety of operations that are computationally intensive and prevalent in DL graphs, such as convolution and matrix multiplication. Intel has added deep learning instructions, such as DL Boost, to the Cascade Lake CPUs.
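Of the data types mentioned, bfloat16 is the least familiar: it keeps fp32's 8 exponent bits (so the same dynamic range) but only 7 mantissa bits. A minimal NumPy sketch of the format, using simple truncation of the low 16 fp32 bits (real hardware and oneDNN round to nearest; truncation is used here only to keep the illustration short):

```python
import numpy as np

def fp32_to_bfloat16(x):
    """Reduce an fp32 array to bfloat16 precision (round toward zero).

    bfloat16 is the top 16 bits of an IEEE-754 fp32 value:
    1 sign bit, the same 8 exponent bits, and 7 mantissa bits.
    """
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

vals = np.array([1.0, 3.14159], dtype=np.float32)
bf = fp32_to_bfloat16(vals)
# 1.0 is exactly representable; 3.14159 truncates to 3.140625
```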
oneAPI abstracts away the complexity of programming to not just one xPU, but potentially several xPUs. Intel has built DL acceleration technology into our CPUs and into our GPUs, and we'll continue to do so in the future. Running ONNX models using these accelerators requires writing code, in the runtime, for these accelerators.
oneDNN did part of that work for us; what we have to do is integrate the library into a runtime, and in our case, we're integrating it into ONNX Runtime.
This is sort of ongoing work, and we have done some of it. Some of the features have been added, and some more features are being added, to the oneDNN execution provider in ONNX Runtime. As of last year, there was support for the 32-bit (fp32) floating-point data type. It supported inference, it supported convolutional networks, and it supported CPU; it did not have GPU support at that time.
We added GPU support, and currently we're also adding support for NLP ops in the execution provider, basically for NLP transformer models. We're also adding support for training, since ONNX Runtime is beginning to support some training ops as well, and we're also beginning to support the int8 data type, and potentially other data types as well, in the future.