From YouTube: ONNX Training working group meeting on July 23, 2019
Description
Recording of ONNX training working group meeting on Webex from July 23, 2019.
For auto-differentiation, you define an auto-differentiation function for a set of operators, and the output of that auto-differentiation should itself be a composition of operators from the same set, so that you can apply auto-differentiation recursively. But for some operators it will be hard to derive. I mean, it's doable, because we can write the C++ code to implement it.
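The closure property described here, where each operator's gradient rule emits expressions built from the same operator set so differentiation can recurse (for example, for gradients of gradients), can be sketched in plain Python. The tiny expression type and op names ("Add", "Mul", "Const", "Var") are illustrative only, not part of any ONNX proposal.

```python
# Minimal sketch: each op's gradient rule emits expressions built from the
# SAME op set, so autodiff can be applied recursively.

def grad(expr, wrt):
    """Symbolic derivative of expr with respect to variable name `wrt`."""
    op = expr[0]
    if op == "Var":
        return ("Const", 1.0) if expr[1] == wrt else ("Const", 0.0)
    if op == "Const":
        return ("Const", 0.0)
    if op == "Add":                      # d(a+b) = da + db
        return ("Add", grad(expr[1], wrt), grad(expr[2], wrt))
    if op == "Mul":                      # d(a*b) = da*b + a*db  (product rule)
        a, b = expr[1], expr[2]
        return ("Add", ("Mul", grad(a, wrt), b), ("Mul", a, grad(b, wrt)))
    raise ValueError(f"no gradient rule for {op}")

def evaluate(expr, env):
    op = expr[0]
    if op == "Var":
        return env[expr[1]]
    if op == "Const":
        return expr[1]
    if op == "Add":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if op == "Mul":
        return evaluate(expr[1], env) * evaluate(expr[2], env)

# f(x) = x * x; f'(x) = 2x; f''(x) = 2. The second grad() call works only
# because grad() returns expressions in the same op set.
x = ("Var", "x")
f = ("Mul", x, x)
df = grad(f, "x")
ddf = grad(df, "x")
print(evaluate(df, {"x": 3.0}))   # 6.0
print(evaluate(ddf, {"x": 3.0}))  # 2.0
```

This is exactly why hand-written C++ gradients for the hard operators break the recursion: their derivatives are no longer expressed in the operator set that `grad` understands.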
But do we... I mean, I'm okay with having control-flow operators, but I feel it will be easier if we just put everything into a single Gradient operator.
Remember, I still have that simple linear model using the Adagrad optimizer. I have that model. Actually, if you want, I can show you that I can use it in TensorFlow to continue training, or for transfer learning, right, so I think we are good on that. Having this constraint, and the use of gradients of gradients, that's related to GANs. That's why I'm asking: do you have a real GAN model? We want to try this out in order to prove our design is good enough. That is, for some sort of GAN.
Okay, so that's how... I think PyTorch itself... oh yeah, I told you, okay, so PyTorch does have the gradient operator. This is its gradient operator, and you can see the loss penalties. First it computes this guy; okay, here you get your gradient, and you compute the penalty, the penalties, here.
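For context on "first get your gradient, then compute the penalty": a gradient penalty of the WGAN-GP kind takes an already-computed gradient and turns its norm into a loss term. This is a generic sketch of only that second step; the function name and the default lambda value are made up for illustration.

```python
import math

def gradient_penalty(grad_vec, lam=10.0):
    """Penalize the gradient's L2 norm for deviating from 1 (WGAN-GP style).
    `grad_vec` is a gradient that some autodiff step has already produced."""
    norm = math.sqrt(sum(g * g for g in grad_vec))
    return lam * (norm - 1.0) ** 2

# A gradient of norm exactly 1 incurs no penalty; larger norms do.
print(gradient_penalty([1.0, 0.0]))   # 0.0   (norm == 1)
print(gradient_penalty([3.0, 4.0]))   # 160.0 (norm == 5 -> 10*(5-1)^2)
```

This is why GANs are the stress test for the proposal: the penalty is itself differentiated during training, which requires gradients of gradients.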
So my position, to me, the lower bound would be this: under the current proposal, even with the constraint given, we should be able to support everything we can do using the previous proposal. That is, if the current proposal also satisfies that need, I will be happy to go with my current proposal, with that one extra constraint.
Those are always the things I like to see. That's why I took this proposal and created some kind of, you know, tooling to make it run in TensorFlow, but we don't have the other direction yet. That means: from TensorFlow, how can we generate this sort of gradients and the other things we need? For that, I have reached out to the tensorflow-onnx community, which is led by [inaudible]. He said his team will take a look. So, to me, that would make this a complete, like, version.
I would say we already have two pull requests. Like, I did... I just, you know, put it on my computer: I could generate this training info and the function and so on, and then I can make use of it. Similarly, you know, for the other team, they should be able to take this down and try to create that training info, and gradient operators and so on, from their codebases. That's the thing I'm hoping, you know, would happen sooner rather than later.
I save that, say, in the ONNX format, including that training info. So this is a very simple linear model, okay? I run that in PyTorch first, right, with Adagrad. Again, I print the results here. Later on I run the same model in TensorFlow, okay, I print the results here, okay, around 50 times, and then I save that in the ONNX format. Okay.
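To see what the demo has to round-trip for the PyTorch and TensorFlow runs to print the same results, here is the Adagrad update rule in plain Python: both the learning rate and the per-parameter accumulator state must survive the PyTorch → ONNX → TensorFlow hand-off. The function name and values are illustrative, not from the demo's code.

```python
# Plain-Python sketch of the Adagrad update used by the demo's linear model.

def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One Adagrad update: accumulate squared gradients, scale the step."""
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g                        # running sum of squared gradients
        p = p - lr * g / ((a ** 0.5) + eps)  # per-parameter adaptive step
        new_params.append(p)
        new_accum.append(a)
    return new_params, new_accum

params, accum = [1.0], [0.0]
for _ in range(3):                           # pretend gradient of 2.0 each step
    params, accum = adagrad_step(params, [2.0], accum)
print(params, accum)
```

Because the accumulator grows monotonically, a converter that drops it and restarts from zero would take larger steps than the original framework and diverge from the printed results.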
Okay, once I have this, I go back to my TensorFlow side, and that's the TF... I have a program here.
So I had two different models in here, because with the introduction of the function node to capture the graph, my current converter, you know, broke, right, because I don't have a handler for that. So I can read the information in, but I cannot produce it. So, to explain that, I have a view here. So that's the current, maybe maximized, view, right: I have the inference function, and it doesn't have any real operators, right, versus the way I have here, where you can see I have the real, you know, transpose and multiplication operators, okay. In order to have the training going, I need this graph. Of course, with this graph, I need to handle this function and go down to the next level, right, but that's why, for now, I just have the two models. But you can imagine, right, that the inference model on the right-hand side is exactly what we have inside of this function.
That's a placeholder, okay. So later on I just use the Adagrad optimizer in TensorFlow and use that learning rate, you know, from the model, okay. Later on I print it out before the additional training, to make sure it's the same as before, and then another one after, right. So if I run this, all right, you will see the initial result without additional training... oh, I'll have it, because...
All right, I actually saved the model in a place where I can again use TensorBoard to look at the graph, okay. So that also proves, you know, we can continue to use this model in TensorFlow, right. So that's sort of my very simple prototype: to take this linear model with Adagrad from PyTorch, to ONNX, to TensorFlow.
Okay, that's why I was sort of asking: if you want to do GANs, do you have a model in PyTorch, right? Maybe we can convert that to ONNX, and I can see if, you know, we can do similar things in TensorFlow, right. So, based on my code, I have this sort of summary, right. The first thing is: right now we have a node with, you know, a type like this, and certainly that's something we need to handle. Not we, but all the converters need to handle that from now on.
Yeah, there might be some optional inputs for the iteration thing, okay. So for this particular node, of course, we need to know the loss function, we need to know the optimizer, the gradient... right now I only use the name, because, actually, I can show you in the code: in TensorFlow I just need to set...
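The pieces this node has to carry, per the discussion (loss function, optimizer, learning rate, trained tensors), can be mocked up as a plain dict. The field names below are made up for illustration; the real schema is whatever the training-info proposal's PR defines, and the point is that the TF prototype described here only needs the optimizer's name, not its internals.

```python
# Illustrative stand-in for the proposal's training pseudo-node metadata.
# Field names are hypothetical, not the proposal's actual schema.

training_info = {
    "loss": "MSE",                   # which loss function to reconstruct
    "optimizer": "Adagrad",          # the TF prototype only needs this NAME
    "learning_rate": 0.1,            # read from the model, fed to the optimizer
    "trained_tensors": ["W", "B"],   # parameters the gradients flow into
}

def optimizer_name(info):
    """A converter like the TF prototype can start from just the name."""
    return info["optimizer"]

print(optimizer_name(training_info))  # Adagrad
```

A name-only contract keeps the backward pass inside the target framework, which matches the remark below that the prototype never controls the backward pass directly.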
Once you start talking about the backward pass, that's something I'm not sure I have control of in TensorFlow, right. Currently I'm just using the standard APIs for optimizers; for instance, right, I pass in the list of variables for training purposes, but I do not have any control over the backward pass.
I still don't know, actually, how to make use of this binding, but anyway, because of the new weight, the new, you know, state... anyway, I don't find APIs for that, so I have this marked "investigation needed". I already covered most of it. So that's the real work once we introduce this training info proposal: if we, you know, merge that PR, then from the converter side, right, we need to handle this pseudo-node, and then we determine whether to create constants in the inference graph versus variables in the training graph, right.
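The converter-side decision just described, where the same initializers become constants in an inference graph but trainable variables in a training graph, can be sketched as a single branch. The tiny "graph" of `(kind, name)` tuples is a stand-in for real TF nodes; the function name is hypothetical.

```python
# Sketch of the converter's constant-vs-variable decision when lowering a
# model that carries the training pseudo-node. (kind, name) tuples stand in
# for real framework nodes.

def lower_initializers(initializers, for_training):
    """Emit each initializer as a trainable variable (training graph)
    or a frozen constant (inference graph)."""
    graph = []
    for name in initializers:
        kind = "variable" if for_training else "constant"
        graph.append((kind, name))
    return graph

print(lower_initializers(["W", "B"], for_training=False))
print(lower_initializers(["W", "B"], for_training=True))
```

The same ONNX file thus yields two different lowerings, which is why this shows up as a converter work item rather than a spec change.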
This is additional handling for training purposes, right. Okay, that's fine. I'm just saying there are some work items. I will present this to our converters SIG as well, because those are all the work items introduced if we're going to support training. Okay, where to apply the weight update, by the way, that's the same question I had earlier. I'm not sure; maybe in certain frameworks you can apply it in certain ways, right. Okay, so that's sort of my current status.
This is a reasonable design. We have some work to be done in order to really support it, right, and hopefully the other direction, like I said earlier here, right, this can be evaluated, right. Then we'd feel much more comfortable going forward with this proposal. Okay, any comments on this, or additional work you want us to conduct?
Maybe we can come back to this: next month, at the workshop, what do we want to present? Maybe our goal is to present this proposal with some sort of evidence that it works, you know, with other components in the ecosystem, such as converters. Is that our goal? Would that be a reasonable one?