From YouTube: ONNX20210324 V13 QSforONNXusingINC
You can see the right top diagram showing the LPOT architecture. The top is the FP32 model as the input; the middle is LPOT itself, including the auto-tuner as the user-facing API and the other key components like quantizer, pruner, benchmark, and tuning strategy; and the framework adaptation layer includes the ONNX Runtime adapter and the other framework adapters.
Now, let's look at the simple usage flow and the code example at the left bottom side. Given the FP32 model as the input, you just need to prepare the configure as shown below and launch LPOT with minimal lines of code change. The configure is template-based, so the user just needs to take the template and update some minimal items, for example the quantization approach. Here we use post-training static quantization and the quantization dataset; the launch code is also pretty simple.
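The template-based flow described above can be sketched roughly as follows. This is an illustrative sketch, not the actual LPOT API: the config keys and the `prepare_config` helper are assumptions made for the example.

```python
# Hypothetical sketch of the template-based configure flow described above.
# Config schema and helper names are illustrative assumptions, not LPOT's API.

# A template configure: the user only touches a few items in it.
template = {
    "model": {"framework": "onnxrt"},
    "quantization": {
        "approach": "post_training_dynamic_quant",  # item the user updates
        "calibration": {"dataloader": None},
    },
    "tuning": {
        "accuracy_criterion": {"relative": 0.01},   # keep accuracy loss <= 1%
        "exit_policy": {"timeout": 0},              # 0 = tune until criterion met
    },
}

def prepare_config(template, approach, calib_dataloader):
    """Update only the minimal items in the template (hypothetical helper)."""
    cfg = dict(template)
    cfg["quantization"]["approach"] = approach
    cfg["quantization"]["calibration"]["dataloader"] = calib_dataloader
    return cfg

# Post-training static quantization with a quantization dataset, as in the talk.
cfg = prepare_config(template, "post_training_static_quant",
                     calib_dataloader=["batch0", "batch1"])
print(cfg["quantization"]["approach"])
```

The point of the template is that the user edits two or three fields and leaves the rest of the configure untouched.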
Then you can see the right bottom diagram showing the auto-tuning flow. Given the configure with the accuracy criteria and the timeout, LPOT will generate a quantized model with a quantization configure, driven by the tuning strategy. Once the accuracy meets the criteria, or the other objectives meet their criteria, or the tuning timeout is reached, the tuning flow will stop with the best quantized model, with the trade-off of the accuracy and the other objectives.
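The stop conditions of that flow can be sketched as a simple loop; the strategy, evaluation function, and accuracy numbers below are simplified stand-ins for illustration, not LPOT internals.

```python
import time

def auto_tune(fp32_acc, candidate_configs, quantize_and_eval,
              relative_loss=0.01, timeout_s=3600.0):
    """Try quantization configs in the order chosen by the tuning strategy
    until the accuracy criterion is met or the timeout is reached; return
    the best (accuracy, config) trade-off seen so far."""
    deadline = time.monotonic() + timeout_s
    best = None
    for cfg in candidate_configs:       # order supplied by the tuning strategy
        acc = quantize_and_eval(cfg)    # quantize model per cfg, then evaluate
        if best is None or acc > best[0]:
            best = (acc, cfg)
        if acc >= fp32_acc * (1 - relative_loss):
            break                       # accuracy criterion met: stop tuning
        if time.monotonic() > deadline:
            break                       # timeout reached: stop with best so far
    return best

# Toy run: three candidate configs with known accuracies, FP32 baseline 0.76.
accs = {"cfg_a": 0.70, "cfg_b": 0.757, "cfg_c": 0.76}
best = auto_tune(0.76, ["cfg_a", "cfg_b", "cfg_c"], lambda c: accs[c])
print(best)
```

Here `cfg_b` already satisfies the 1% relative-loss criterion, so the loop stops without trying `cfg_c`, which mirrors the "stop once accuracy meets the criteria" behavior in the diagram.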
Now we want to show the quantization productivity with LPOT, reducing the time from days to minutes, showing the significant productivity improvement compared with a human expert. You see, LPOT is helping in all three key aspects of quantization, including effective model calibration, advanced quantization recipes, and the systematic auto-tuning flow.
Effective model calibration saves the effort to collect the tensor statistics from scratch, advanced quantization recipes help reach higher accuracy and shorten the tuning space, and the systematic auto-tuning flow, as described on the previous page, greatly relieves the manual effort to tune the accuracy per quantization recipe, especially for a big model. For example, for ResNet-50 there are 53 convolutions, so people would need to tune layer by layer across those quantization recipes, which is a very huge space.
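To put a number on that tuning space: even if each of the 53 convolutions only gets a binary choice (quantize it or keep it FP32), the space is already 2^53 configurations, and with more recipe options per layer it grows much faster still. A quick back-of-envelope check:

```python
# Per-layer binary choice (quantize vs. keep FP32) for ResNet-50's 53 convolutions.
layers = 53
space = 2 ** layers
print(space)  # on the order of 9e15 possible configurations
# With, say, 4 recipe options per layer the space becomes 4**53 -- far beyond
# what a human expert can explore by hand, hence the auto-tuning strategy.
```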
A
So,
overall,
we
expect
the
output
will
improve
up
to
90
quantization,
productive
improvement,
and
we
already
received
very
positive
feedback
from
our
customer
and
in
the
real
use
case
below.
Is
the
table
to
show
the
the
model
using
our
port
compared
with
the
ap
cylinder
baseline
and
the
models
cover
the
typical
workloads
like
razer,
50,
vg,
computer
vision
and
the
various
birth
and
the
birth
variant
models?
You can see the accuracy is kept within one percent loss, and the tuning time is less than 40 minutes; for those BERT models it is actually just about one minute. We expect to improve the productivity further with more advanced tuning strategies or quantization recipes, as well as more optimal quantized kernels for ONNX Runtime.
Meanwhile, LPOT supports two quantization approaches: static quantization and dynamic quantization. Static quantization is mainly for the computer vision workloads, for example VGG, and dynamic quantization is mainly for the NLP models, for BERT and the other transformer models.
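The difference between the two approaches can be shown with a toy affine quantizer: static quantization fixes the scale and zero-point offline from calibration statistics, while dynamic quantization computes them from each activation tensor at runtime. This is a generic illustration of the concept, not ONNX Runtime's actual quantized kernels.

```python
def affine_params(lo, hi, bits=8):
    """Scale and zero-point mapping the range [lo, hi] onto unsigned ints."""
    qmax = (1 << bits) - 1
    scale = (hi - lo) / qmax or 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point

def quantize(xs, scale, zp, bits=8):
    """Affine-quantize a list of floats, clamping to the int range."""
    qmax = (1 << bits) - 1
    return [min(max(round(x / scale) + zp, 0), qmax) for x in xs]

# Static: params come from calibration data collected offline.
calib = [-1.0, 0.5, 2.0, -0.3]
s_scale, s_zp = affine_params(min(calib), max(calib))

# Dynamic: params come from the activation tensor observed at runtime.
runtime_tensor = [0.1, -0.4, 1.5]
d_scale, d_zp = affine_params(min(runtime_tensor), max(runtime_tensor))

print(quantize(runtime_tensor, s_scale, s_zp))
print(quantize(runtime_tensor, d_scale, d_zp))
```

Dynamic quantization always uses the tensor's own range, which is why it suits NLP models whose activation ranges vary a lot per input; static quantization avoids the runtime range computation, which suits throughput-oriented vision workloads.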
Now, let's talk about the release collaborations and the plans. LPOT v1.2 supported ONNX Runtime 1.6 with the opset and operator-wise quantization tuning, and the later release v1.3 supports ONNX Runtime 1.7 with the new quantized operators. LPOT v1.4 will integrate the Python optimizer tool introduced by ONNX Runtime 1.7 and support more flexible graph transformation for ONNX models, based on community collaborations.
So in the future, we plan to continue improving LPOT, contribute the quantized models to the ONNX Model Zoo, and enrich our product release distribution channels; for example, we will release a Docker binary and also the nightly-build binary to the community.