From YouTube: Apache TVM Community Meeting, May 20, 2021
Description
* Agenda
* Introductions
* Announcements
* New TVM Plugin for GDB, written by Eric Lunderberg
* Trial run of Discord Server
* Regular µTVM Community Meeting
* AOT Roadmap (slides)
* Open Discussion on AOT
A: Okay, everyone, welcome to the May edition of the Apache TVM community meeting. We have a really big agenda ahead of us today, so we're going to try to move through it as efficiently as possible. The majority of the meeting is going to be taken up with discussion of microTVM, so we're going to move quickly through the earlier agenda items, because I'm sure we'll spend most of the 45 minutes to an hour on that topic. We like to start off every meeting with introductions, so if this is your first time at the meeting, or you're new to the Apache TVM community, please feel free to introduce yourself. Is there anyone who wants to say hello right now?
E: Hello, I'm Jeffrey Spitz. I'm with SiMa.ai, and I work on our simulations and applications.
A: Great. It's great to see a lot of the SiMa and Linaro people coming. Welcome, Jeffrey.
F: Hi everyone, I'm Carl Evans. I'm the founder of Tercero Technologies, based in Pittsburgh; we do edge AI. I'm here with my colleague, David.
A: Have you been working with TVM much before, or are you fairly new to it?
F: I've been working with TVM since November or December-ish, so fairly new, but I have been digging a lot into the code, particularly the FPGA-related code.
A: Okay, fantastic! Well, feel free to reach out. Hopefully you've been reaching out to some of the engineers who have been working on this, but I'm the developer advocate for the community, so feel free to reach out to me if you have any questions or need pointers in any direction for the things you've been working on.
A: Okay, anyone else? All right, with that we're going to move on to some announcements. First off, I wanted to have Eric talk a little bit about a new plugin that he's written for TVM. It plugs into GDB and helps with debugging TVM-specific actions, including debugging models. So, Eric, if you want to take a few minutes to describe this plugin and how it works, that would be fantastic.
D: Oh, certainly, yeah. If you'd like, I can also share my screen to give a bit of a demo, if that helps.
D: Okay, share screen, and this window. Let me know when you can see my screen. Yep, we see it. Okay, so I have two windows side by side, and in both I'm going to open up a quick stack trace.
D: The difference is that on the right I'm going to source the extension file, and on the left I'm not; everything else will be identical between the two. So I'm going to put a quick breakpoint, just so I can simulate "something went wrong, and where does it happen?" So: import tvm, and then let's go somewhere that triggers that breakpoint.
D: So, just doing some query, in this case on the Vulkan API, and okay, I've hit a breakpoint. First question: where am I? You'll notice this looks very similar between the left, without the plugin, and the right, with it; the only difference is that a lot of these stack frames are indented.
D: So if I give the "-hide" option as well, then I can still see the full stack trace on the left, without the extension, but on the right I can see: here's where I am in the Vulkan API, where I hit my breakpoint, here's the C runtime API, and then let's skip a dozen frames until we get to PackedFunc, in packed_func.py. So, in addition to hiding stack frames that aren't necessarily that insightful for a given problem, instead of printing out the location in the C++ code, it looks up the location in the Python script that was running and shows that instead. This way, even as things go back and forth between things happening on the Python side and things happening on the C++ side, it shows the relevant stack frames that you yourself have put in, by calling Python functions or by calling the packed C++ functions, rather than showing the intermediates that handle those levels of interaction.
H: Uh-huh. So does that mean I can use this whole framework now from VS Code directly?
D: Right now I don't know how much interaction VS Code has with GDB. This interaction is done using GDB's frame filter API, so if VS Code is making those same calls, then it will be enabled through GDB's extensions.
A: Yeah, if anyone does try it out in VS Code, maybe posting to the Discuss forum whether or not it works would be a nice way to capture that information.
A: Yeah, thanks, Eric. I think this is really helpful for debugging, because what we're seeing on the right there is definitely a lot easier to understand than what's happening on the left.
A: Okay, so next on the agenda: next month's meeting. We talked a little bit about this at the last meeting, but for next month's meeting we're going to be selecting an Asia-Pacific-friendly time.
A: We've had some requests from users who are in Japan and China, and in similar regions, to have alternate meeting times so that it's easier for them to come to the meeting, because typically these meetings fall in the middle of the night for them. So look for an announcement on the exact time that the meeting is going to happen.
A: If anybody has specific suggestions, we can work some of this out on the Discuss channel too, about a time that works, and then we'll send that out and work towards adding it to the agenda. Another thing that I didn't quite add here, but which is also important to note: right now we are running the community calendar from a calendar that's being hosted by OctoML, and we want to open this community calendar up to more of the community so that you can schedule your own meetings and your own subgroup meetings. We'll get into this a little bit more during the regular microTVM community meeting item, so look for some calendar changes as we stand up that Apache TVM organization, start reproducing the calendars inside of it, and broadcast out to the community more. Another big announcement:
A: After a poll on the Discuss forum, we're going to do a trial run of a new Discord server for the Apache TVM community. We talked about it a little bit within Discuss. Essentially, this platform is a synchronous way for community members to talk to one another about TVM, largely about "hey, how do I work on something?" If you're collaborating with someone, or if you need a little bit of help doing things, we're hoping this will be a really nice place for you to drop in, ask a few questions, and interact with other people who are working on TVM a little bit more directly and one-on-one.
A: But we also want to remind everyone that all official decisions and all official discussions need to happen on the mailing list and within the Discuss forums, so that everybody who is involved in the community has an opportunity to participate. We're really thinking of the Discord server as: catch up with someone in the hallway, have a chat with them, maybe sort out how to work through some things or how to get some help, and then bring those decisions back to the community in the discussion forum. So please try it out; there's an invitation link there, and I have a link to the initial forum post about this. This link should be good for everyone, forever, and if you do run into any problems, feel free to reach out to me, either inside the Discord server or through my email, which I will put down here.
A: So, does anyone have any questions about this, or any observations about it that they'd like to share?
J: Chris, there is a channel on the tlcpack Slack called micro. How does it fit in with that new Discord server? What are the contexts; which should we use?
A: Yeah. One of the reasons we started up this Discord server: for those of you who are unaware, there actually is a tlcpack Slack server that some members of the community have been using, but we've run into some pretty significant limitations, particularly in the number of people who can actually be invited and in how we can manage that.
A: And so my recommendation is that discussions that would be happening on that Slack server move over to the Discord server, and we really try to seed that community there, so favor Discord over the Slack. And I think that's a really good point: there's a microTVM channel in there that people are talking in, so we need to create a microTVM channel within the new server.
J: Awesome, got it, Chris. That's a good move, in my opinion.

A: Yeah, I will create a channel. And to answer your question more: a lot of the frustration with the Slack is that we tried to make it more open, but without paying an incredibly large amount of money to Slack, it's really hard to get all the features out of it. I think in the old days a lot of us were meeting in person and people weren't using Slack that much, but it seems like, as we're growing across companies, it's really important to communicate for development synchronization.
A: We welcome criticisms of things that we could do better and requests for things that you would like to see; we really want to make it a community resource that is going to be vibrant and useful for everybody.
K: Yeah. So, one thing is: we currently have the Slack, and we're calling this a trial period of the Discord server. So let's all move to a micro channel on the Discord server and try stuff out. Again, for anything that's kind of a material decision that we're going to make within the TVM community, we can't make it on either one; we want to make those on the Discuss forum. So hopefully it shouldn't be too big of a deal if we move over to the Discord server and then decide, for some reason, a month from now that we didn't like something there and have to switch to some other platform. I'll monitor both going forward as we're in this trial period.
A: We can do a pinned message in the general channel too; I think that's a good idea.
A: ...so that people who are there and people who aren't at the meeting understand that we're doing that migration. I'm also going to be posting the link to the Discuss forum later on, so that it's more widely available and anyone who sees it there can join the group. Now, as with any of these other community platforms, when you post a public link like that, we also want to make sure that everyone in the community feels like they're safe.
A: Okay, so we're about 25 minutes in; let's move on to the next topic. For the rest of the meeting we're going to be mostly covering microTVM, and one of the first things we wanted to bring up: there have been some requests generally to have more online meetings, like the community meeting, that are a little bit more focused towards microTVM. To that end, we're going to be kicking off bi-weekly meetings over Zoom. So I'll turn this over: did you want to talk about this, Tom, or is this something that you wanted to talk about, Andrew?
M: Sure; we'll just keep it short and sweet, which I think is probably the most important thing. As velocity has picked up in microTVM development, and to sort of facilitate what's going on in the community, it makes sense that we should get together maybe on a more regular basis, sort of a little side huddle about microTVM, but of course with the provision that, again, it's the Discuss forum that counts when it comes to business.
M: So really, this is just meant to be a way to help what's going on already with microTVM, as speed has picked up on it. We've got all the information in here, and we'll try things out as an experiment. We are looking at a bi-weekly cadence; we'll dial that up or dial that back depending on how well things work out. Maybe monthly is the correct cadence; again, we'll make adjustments along the way.
M: As far as the meeting time, I just picked a time that would try to appeal to a good set of time zones. I realize it is not best for people on the west coast of the United States, and it's terrible if you're in China. So, as you alluded to earlier in the meeting, Chris, maybe we need to do something that flips back and forth.
H: But Tom, one question: would this be just microTVM? Because I think last time we also talked about an AOT meeting; would those topics be covered as well?
M: Yes. MicroTVM intends to make use of the AOT work, so I think it makes sense to mix those two things. And I wouldn't be surprised if, from time to time, other topics drift in, because microTVM in some way interacts with some feature in the larger framework; we'll just let those things go organically as they need to.
A: Okay. And also reminding everyone that I'm going to be setting up a new Apache TVM organization that will allow us to share some of these resources, particularly the calendars. So watch for the announcement on the Discuss forum about the new calendar; you should be able to subscribe to it, and my intention is to make sure that we put all of the public TVM events onto it. That's going to include the monthly community meeting and this microTVM bi-weekly meeting, and, since we're going to be doing TVMCon later this year, it's also going to include information about when TVMCon is happening and when the deadlines for it are going to be.
A: A big part of the reason we want to make this a community-owned resource is so that folks in the community who are planning events, say local meetups or other things that are going on, have a place to add them and broadcast them to the wider community in a repeatable and discoverable way.
K: I figured what we could do for this: there have been a lot of Discuss posts and forum posts, and I myself have at times become confused about just the number of things in flight. So, to start the discussion here, I threw together a couple of slides to give an overview of what we've been working on, some of the work that's been done, and some of the work that's outstanding, and then we can talk about whatever people feel is a concern from there. I know a number of the contributors to AOT, maybe even all of them, are on the call here, so if you have questions for them: I've certainly been serving in more of a reviewer role on this, so you can ask me questions, you can ask them questions, and we can go from there. Let me give a quick overview, and then we'll open it up to discussion. I thought I'd talk about what the AOT effort is, in case people here have not engaged with it, what the subprojects are, and then what RFCs we have outstanding right now. To start with, some background for AOT: what does this mean?
K: We're calling it ahead-of-time compilation, and what we mean is this: when you run the TVM compiler, you supply basically a Relay program at left. Even if you give it something like a TensorFlow model, TVM is first going to convert that TensorFlow model into Relay, which is TVM's internal model description language. After that, TVM will compile pieces of this program separately, and it currently relies on a sort of executor to handle running them; I'll talk about this.
K: AOT is an effort within TVM to generate basically the glue code that runs all the pieces, without the need for a runtime library to do it. Let me explain a little bit more. Like I said, we're going to compile a model and run it. Isn't compiling the model and calling the pieces what we're already doing? Not quite. Sorry, I think I skipped a slide; let me jump to this one. We break the program into scheduled pieces, with roughly one function for each piece, which I've illustrated with this graph at the right, and we export basically three things. We export the generated operator implementation for each piece; in this graph you've got conv2d + bias_add, which came from our Relay model before, as well as max_pool2d.
K: We then have a graph that explains how to link the pieces together, as well as how the data dependencies flow between the inputs. So, for example, if you want to run this conv2d + bias_add, the graph explains that you need to supply an input called conv2d_input and a p1, which is a parameter, and then the output is going to be placed in this intermediate buffer here. These three outputs are what's currently generated.
K: When you're using the graph runtime, which is the default thing to use in TVM, at least for micro applications, the problem with this approach is mainly that we encode the operator graph as JSON. The runtime basically has to load in this JSON, reconstruct the operator graph, and then call these operators in order, and doing this is pretty expensive, especially on a microcontroller. So what is AOT, then?
K: Why do we need to rely on a library for this? The answer is that we don't; we could do it at compile time. I made this slide to roughly show the TVM compiler architecture, and you can see I've got the outputs written on the right: we get simplified parameters, we get an operator graph, and we get the generated operator implementations. The thing that we wanted to do in AOT is take this operator graph and somehow output a function that implements it. So in the AOT world, instead of outputting the graph, we feed it forward into a new pass that creates a top-level function that lives alongside all of the operator pieces, and then this whole thing becomes the code-generated output from TVM.
K: Let me make sure... oh, I see. Okay, this is the example slide I wanted to show you; sorry, this is not quite as polished as I wanted it to be, but it gives you an example of what this top-level function looks like. People who have been working with TVM for a while will be familiar with this function signature at the top.
K: This is our packed C function signature, and all of our generated functions are invoked using this signature. But you can see that this function here is not actually an operator function; it's the top-level glue function that's meant to call all the operators in the graph, in the order that reconstitutes the original model. To do this, there are a couple of challenges.
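For readers following the transcript without the slides, here is roughly what that packed C call signature looks like. This sketch follows TVM's public C backend header (c_backend_api.h), with the TVMValue union simplified, so treat it as illustrative rather than authoritative:

```c
/* Sketch of TVM's packed C calling convention. Every generated function,
 * operator kernels and the AOT top-level function alike, is invoked
 * through a signature of roughly this shape. TVMValue is simplified. */
#include <stdint.h>

typedef union {
  int64_t v_int64;
  double v_float64;
  void* v_handle;  /* tensors and other objects are passed as handles */
} TVMValue;

typedef int32_t (*TVMBackendPackedCFunc)(
    TVMValue* args,          /* packed argument values */
    int* type_codes,         /* type code describing each entry of args */
    int num_args,            /* number of packed arguments */
    TVMValue* out_ret_value, /* slot for the return value */
    int* out_ret_tcode,      /* type code of the return value */
    void* resource_handle);  /* opaque context pointer */
```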
K: One: if you remember that intermediate buffer, we've got to allocate one of those in this top-level function. So the AOT project generates code that performs these allocations, and it performs them minimally, according to our graph-level memory planning algorithm.
K: It's then got to assemble a call stack that it can use to call sub-functions with that same signature, and then it's got to call the operator functions in order. I've just shown one example of a function call here; in the true implementation, of course, you can expect these last two blocks to repeat several times as it works its way through the model. And you can see that in calling this operator function, it's effectively using the same signature, so it keeps this packed C function call signature.
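To make that concrete, here is a hand-written sketch, in the spirit of the slide, of what such a generated top-level function could look like. It reuses the TVMValue sketch above, and the symbol names (tvmgen_run, tvmgen_fused_conv2d_add, sid_1) are hypothetical stand-ins, not actual TVM codegen output:

```c
/* Hypothetical AOT-style top-level "glue" function: it owns the
 * intermediate buffer, packs a call stack, and invokes one operator
 * kernel through the packed signature. In real generated code the
 * pack-and-call pattern repeats for every node in the operator graph. */
#define SID_1_BYTES 4096       /* size chosen by graph memory planning */
#define K_TVM_OPAQUE_HANDLE 3  /* type code for handle arguments */

extern int32_t tvmgen_fused_conv2d_add(  /* generated operator kernel */
    TVMValue* args, int* type_codes, int num_args,
    TVMValue* out_ret_value, int* out_ret_tcode, void* resource_handle);

int32_t tvmgen_run(TVMValue* args, int* type_codes, int num_args,
                   TVMValue* out_ret_value, int* out_ret_tcode,
                   void* resource_handle) {
  static uint8_t sid_1[SID_1_BYTES];  /* planned intermediate buffer */

  /* Assemble the call stack for conv2d + bias_add. */
  TVMValue call_args[3];
  int call_tcodes[3] = {K_TVM_OPAQUE_HANDLE, K_TVM_OPAQUE_HANDLE,
                        K_TVM_OPAQUE_HANDLE};
  call_args[0].v_handle = args[0].v_handle;  /* conv2d_input */
  call_args[1].v_handle = args[1].v_handle;  /* parameter p1 */
  call_args[2].v_handle = sid_1;             /* intermediate output */

  /* Call the operator kernel with the same packed signature. */
  if (tvmgen_fused_conv2d_add(call_args, call_tcodes, 3, out_ret_value,
                              out_ret_tcode, resource_handle) != 0) {
    return -1;
  }
  /* ...then max_pool2d, reading sid_1 and writing the model output... */
  return 0;
}
```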
K: Doing this basically allows us to get rid of the need for a large runtime library, because that runtime library, the graph executor, was what parsed the JSON, reconstituted the call graph, and then made all these calls; what we've done is effectively codify that in the output of TVM itself. Okay. I hope that was somewhat comprehensible to everyone here, but that's what AOT is.
K: I want to say one thing about this too: a lot of the AOT effort was started by the microTVM folks, and I think microTVM in particular has quite a bit of interest in the AOT project, because resources are pretty constrained on microcontrollers. But nothing we've presented so far is particularly specific to microcontrollers.
K: In developing the AOT project, there are certainly a number of folks, especially a number of folks here, who are focusing on microcontroller applications of AOT, but this isn't something that we're intending to make specific to microcontrollers. We intend to make this core part of the AOT effort more broadly reusable across the TVM code base. So particularly if there are people on the call who are interested in non-embedded applications, or applications outside the TVM C runtime, we'd especially love to hear any thoughts or concerns from your side as well.
K: Okay. This is a fairly large effort, so we had an initial RFC about, well, a number of things, but what we wound up doing was merging a core piece, and we expect several pieces to follow it. This is by no means a comprehensive list; I'm sure there will be additional pieces as we continue developing the effort, but just to give everyone a brain dump from me of where I think we are (and if I've forgotten something, please let me know): these all roughly correspond to RFCs. The first RFC, or project, that we've landed is basically making a TIR top-level function to mimic this graph executor run.
K: That's the top-level function that I just showed before. That RFC and PR have landed, as well as a couple of follow-on PRs, and at this point there's a bunch of follow-on RFCs that are all working their way through the Discuss forum. A lot of these are open and active, so if people have comments and questions, ultimately it would be great if we could materialize them on the Discuss forum. But just to give you an idea, from my mind, of where we are: there is stuff needed by the broader non-firmware or non-embedded effort, and that's basically implementing something that replaces the runtime API that you see with the graph runtime today. We call that the module-based model runtime API; if you search for that on the forum, you'll find a pretty lengthy RFC.
K: That RFC goes through the current consensus, which of course we can always change going forward, on what it should look like to load and run a model with TVM: what function do we call to allocate memory, how do we set a parameter, and how do we drive inference and get the output?
K: So that's kind of a PackedFunc-based thing, and this project hasn't even been started yet; there hasn't been an RFC written yet, but it's a to-do. Then the next few things are, I think, more specific to, or I would say needed by, microTVM, though they could be used elsewhere too.
K: There are a bunch of ideas, basically from the implementers of AOT, around reducing stack usage. You'll notice that in this example function we're allocating a bunch of things on the stack, and I've actually pruned this code example down quite a bit; there are a bunch of other things also allocated on the stack here as well.
K: All of this conspires to really blow up the stack usage requirements on embedded systems, which is kind of a no-no. One of the things we do is allocate DLTensor instances on the stack, and in the embedded world we'd like to avoid that, since for the most part we just need the data member. So that's one thing in flight; there's an RFC out about it, and we'll talk about it in a second.
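For reference, DLTensor is the DLPack tensor descriptor used at TVM call boundaries. A lightly simplified view of its layout (following dlpack.h, with enum types flattened to integers here) shows why stack-allocating one per tensor argument adds up on a microcontroller:

```c
/* Simplified sketch of DLPack's DLTensor descriptor (see dlpack.h). */
#include <stdint.h>

typedef struct { int32_t device_type; int32_t device_id; } DLDevice;
typedef struct { uint8_t code; uint8_t bits; uint16_t lanes; } DLDataType;

typedef struct {
  void* data;            /* often the only field an embedded kernel needs */
  DLDevice device;
  int32_t ndim;
  DLDataType dtype;
  int64_t* shape;
  int64_t* strides;
  uint64_t byte_offset;
} DLTensor;

/* Each stack-built descriptor costs sizeof(DLTensor) plus its shape
 * array; passing bare data pointers instead is one of the stack-usage
 * reductions discussed above. */
```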
K: Another effort: if you notice, this function signature here is not the most friendly thing for, say, a C firmware programmer. Of course it's a C function interface, but specifically: what are the arguments? What order do you present them in? There's none of the documentation that you would normally expect of a TVM function, and there's also a bunch of extra arguments, in a sense; the out-return value and return type code arguments are typically actually contained in the args here, I believe (I think I said that right). So basically there are just a lot of pieces here that could be minimized, and there's some interest in creating a smaller API that doesn't rely on this packed-function type signature and can be more oriented towards embedded development. Doing things like including model metadata is part of this.
K: Lastly: if you notice, the intermediate tensors here were allocated basically as scratch-pad tensors, and at some point we'd like to get to where we can do more comprehensive memory planning over those scratch-pad tensors. That's a work in progress.
K: So that's the initial RFC that implemented this core TIR bit, and then we have these outstanding RFCs, as well as a few more to implement some following pieces, such as the module-based model runtime interface.
K: Okay, that's what I wanted to cover, just to give a brief overview; I hope that was somewhat informative and not super confusing. We have some of the folks that are implementing the RFC here; I think Manupa and Giuseppe are on the call. I don't know if you had anything you wanted to bring up or talk about, or feedback you were interested in; I just wanted to open it up for discussion.
K: And one aspect that I think is one of the bigger pieces in development right now is this embedded C runtime interface. The idea here, again, is to provide a firmware-facing interface, and there's actually been a proposal, in a sense: the folks from STMicroelectronics have pushed a PR that shows basically a demonstration of how to integrate TVM with their embedded API, and I think there are a lot of really good things there that were built from a firmware developer's perspective.
K: There's metadata; if you need to write reusable functions to pre-process model input, a lot of the mechanisms for getting input and output are pretty well thought through, in terms of a world where the application developer wants to be able to manage all the memory on the system. So one of the things that I wanted to ask people here is whether they have specific requests for what a firmware interface to microTVM should look like going forward. Right now we've kind of confined you to this very PackedFunc-centric graph runtime interface.
H: Yeah, I'm not sure if this fits your question a hundred percent, but I think you covered the right points. Number one for me would be getting rid of the DLTensors and the TVMValues in the generated code; I think this is not a large issue. The second point, which would be interesting: do you think we should limit, from a design perspective, generating code only for the main function, or should we also consider the kernels, so, getting rid of the allocs and free operations?
K: Yeah, that's a good question. We've talked about this a little bit; I've been discussing it with Manupa and others on the Arm side, who have been implementing a lot of these operations themselves, and I know that memory planning is one of the big concerns here. There is a question there: what we want is to get away from a world where we have this sort of dynamic memory interface,
K: a malloc-like interface, if we have one at all. At best, we'd like to pre-plan every malloc call so that we can target it to a predefined memory location. What the AOT implementation today does is use an alternate, stack-based memory allocator.
K: You can pick that allocator because, as long as your model is feed-forward and straight-line, all of the memory allocations will be first-in, last-out, just like a stack, so you can use this stack memory allocator with very little waste.
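A minimal sketch of that kind of LIFO workspace allocator; this illustrates the idea rather than the actual allocator in TVM's C runtime, and the names and pool size are invented:

```c
/* Toy LIFO (stack) workspace allocator. A feed-forward model releases
 * workspaces in reverse order of allocation, so a bump pointer with
 * last-in-first-out frees wastes almost nothing. */
#include <stddef.h>
#include <stdint.h>

static uint8_t g_workspace[16 * 1024];  /* hypothetical pool size */
static size_t g_top = 0;

void* StackWorkspaceAlloc(size_t nbytes) {
  nbytes = (nbytes + 7u) & ~(size_t)7u;  /* keep 8-byte alignment */
  if (g_top + nbytes > sizeof(g_workspace)) return NULL;
  void* ptr = &g_workspace[g_top];
  g_top += nbytes;
  return ptr;
}

int StackWorkspaceFree(void* ptr, size_t nbytes) {
  nbytes = (nbytes + 7u) & ~(size_t)7u;
  /* LIFO invariant: only the most recent allocation may be freed. */
  if ((uint8_t*)ptr + nbytes != &g_workspace[g_top]) return -1;
  g_top -= nbytes;
  return 0;
}
```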
K: So how does this all relate to the original question, which is basically: should we allow these operator kernels to invoke malloc for scratch-pad memory? One of the proposals that has been pushed so far is: can we implement a TIR pass that goes into the generated kernels and hoists all of the allocations up into the main function?
K: This would look like each of these scratch-pad buffers being passed in as another argument, sort of a magically added argument, to each operator kernel. And then, within the main function, that gets you to the level of user interface and graph-level memory planning, and there are options: you can either make a set of memory pools accessible and use the memory planner to target those areas, or, if you want to continue calling the malloc function, you can do that. So at least that moves the workspace scratch-pad allocations closer to the top-level graph interface.
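A sketch of the shape that hoisting could take; the pass itself, the extra-argument convention, and all names here are hypothetical:

```c
/* After a hoisting pass, each operator kernel would take its scratch
 * pad as an extra argument instead of calling a malloc-like allocator
 * internally, and the top-level function (or the application) decides
 * where that memory lives. */
#include <stdint.h>

int32_t fused_conv2d(void* input, void* weights, void* output,
                     void* scratch);  /* hoisted workspace argument */

int32_t tvmgen_main(void* input, void* weights, void* output,
                    uint8_t* workspace_pool) {
  /* Graph-level memory planning assigns each kernel a fixed offset in
   * the pool, so no allocation happens at run time. */
  return fused_conv2d(input, weights, output, workspace_pool + 0);
}
```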
K: I don't know if any of the Arm folks want to say anything more about that, or if there are further thoughts there.
O: Yeah, thanks, Andrew; can you hear me? Yeah, so that kind of aligns with the vision. As for the original request, how we get rid of these alloc entries: that was a concern in the engineering design of AOT.
O: We are in the process of designing something that we hope to put in an RFC very soon, in which we try to make the main function accept some buffers, probably to be used as pools, so that all the intermediates in the main and operator kernels can be accessed using a predefined offset into those pools, and no alloc actually happens at runtime.
O: So yeah, as Andrew mentioned, we are currently using a lightweight stack allocator, because that's what TIR code could get lowered to, and it has this LIFO pattern. But we have planned to replace that with a user-provided workspace buffer, so the idea is that in the application itself, users would be able to pin these buffers into particular memories.
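Application-side, that design could look like the following hypothetical usage, reusing the tvmgen_main sketch above; the section name and sizes are invented for illustration:

```c
/* Hypothetical firmware usage: the application owns the workspace pool
 * and can pin it into a particular memory via the linker script. */
#include <stdint.h>

__attribute__((section(".tcm")))  /* e.g. a fast on-chip SRAM section */
static uint8_t model_workspace[24 * 1024];

extern int32_t tvmgen_main(void* input, void* weights, void* output,
                           uint8_t* workspace_pool);

int run_inference(void* input, void* weights, void* output) {
  /* Every intermediate lives at a planner-chosen offset in the pool,
   * so inference performs no dynamic allocation. */
  return tvmgen_main(input, weights, output, model_workspace);
}
```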
K: On implementation and interface: currently there's an implicitly user-facing aspect of the TVM graph memory planner, which is basically how we identify the memory region where a tensor is going to live. The TVM graph memory planner currently works by looking at all the memory allocations that are visible at the graph level, and currently that doesn't include scratch-pad buffers.
K: But with this idea of hoisting scratch-pad buffers out of operator functions, that would change. So it looks at all the graph-level allocations, and then it tries to map each graph-level tensor to a backing storage area by doing liveness analysis on those graph-level tensors. And then, of course, the implicit question is: how do we identify a storage area? There's a user-facing, but not exactly user-facing, identifier there called a storage ID, which you can see in the graph JSON
K: if you look at the JSON object that we generate. Moving forward, as we move into a world where we want to think about pinning tensors into memory, and especially as we start to consider things like heterogeneous memory systems, or systems with accelerators that have small areas of accelerator-visible memory versus larger areas of CPU-visible memory,
K: I think the first step in that direction is basically to make the storage identification process a first-class, user-level concept, and that's one thing I want to do moving forward. I see Giuseppe raising his hand, and I don't want to ramble on about this, so go ahead.
N: Yeah, I would just point out that there is another PR, which I pushed about 40 minutes ago, so you didn't list it here for obvious reasons, that is trying to decouple AOT from the memory planner just described. The Relay graph memory planner in TVM has the problem that it treats the output buffer as a temporary as well. This is bad because the output buffer is provided by the user.
N: You can share it, but you cannot change the size of the buffer; basically, you cannot expand the buffer. That's the problem. So there was an issue raised, and in order to solve it, we decided, as a first move in the direction of doing memory planning in AOT, to try to decouple the AOT main function from the TVM graph memory planner.
N: So we run a very sequential allocation, and then we use a TIR pass called storage_rewrite that basically does what the TVM memory planner does, but on the TIR function. It still doesn't take into consideration all these scratch buffers, so it's still a bit sub-optimal, let's say, but it's a step in the right direction.
K: Great, yeah, I'm glad to see that, and I'll take a look; of course I'm interested. Everyone, definitely please take a look and review it. What was I going to say...
K: Yeah, I was just going to speak a little bit more about the issue you raised, which is something that came up the other day as well. One thing that the module-based model runtime interface implies is that there's a TVM library that's managing all the memory.
K: So if you want to change the way that the memory is allocated, that logic is buried in shared TVM library code, and you have to change that code. As we're moving into this microcontroller world, we're going to have to change that paradigm, at least so that, potentially, there are memory blocks that TVM wants to generate for you; for example, it knows it needs, say, 32 kilobytes of CPU-facing RAM.
K: It should provide you some construct that you can use to easily allocate that memory, whether it's a defined constant that says how big the CPU-facing RAM needs to be, or a struct or something like that which contains it. Why this is important, in particular, is the thing that Giuseppe just raised with the AOT API.
K: One thing that we changed was that the tensor buffers that are inputs and outputs of your model are supplied by the user, by contrast with the previous module-based model runtime interface, where those were managed by TVM. As we moved to a world where we're considering that the user is passing memory in, what we did was just reuse the graph memory planner.
K: The input buffers were assumed to be "don't touch these", but for all following buffers in the graph, the graph memory planner will try to reuse them. So if there's an intermediate tensor that's just needed as a go-between between two subsequent calls, and then later there's another intermediate buffer, it will try to use the same memory for those two intermediate buffers, and that's great.
K: That's fine and well as long as those two buffers aren't an output of the graph. But the minute that one of them becomes an output, the user suddenly has to care, because if the planner does reuse these buffers, it will size them up.
K: This is where it bit us, basically, in the interface: if we allow the output tensors of the graph to be sized up by the graph memory planner, then the user has to know that somehow. So that was our bug there.
G: Oh, I also hit this issue. I don't know if you noticed, but in the CopyTo/CopyFrom API there's a DLTensor overload with which you can extract the shape of the output tensor, and so then you can grab out the slice of the output storage that you need for that particular output tensor. You shouldn't have to copy out the full storage pool; you only copy that slice. So I think that should solve the issue; it did for us.
K: That's great, yeah. Well, we'll have to see if we can mirror all those analogues across the interfaces there. So, looking forward, I guess there's just five minutes left. Am I missing anyone who's raising their hand, or did anyone want to say anything that I'm missing?
P: Yeah, hi, this is Michael Vogelseven. I'm very interested in this activity, but not necessarily for the full microcontroller case. What I'm interested in particularly is trimming down the current runtime that we have with TVM, especially when models are run within a streaming environment; let's say, for example, when you want to put a model generated by TVM as a filter inside GStreamer.
P: Currently, if you do that, the runtime that TVM generates serializes everything, and it creates quite a problem with GStreamer, and probably with some other libraries. So I see this work that you are doing as actually sitting a bit higher than the MCU, on a bit bigger devices: reducing the runtime to be really self-contained, so that it can be put inside a streaming environment like GStreamer and others.
P: That would be absolutely awesome. And here you've got the same problem you mentioned, with memory that is allocated by the external application, like GStreamer, that is going to be passed into this function generated by TVM, with the output going to other filters. So having it well self-contained, with the memory management and the multi-threading management handled outside of the TVM-generated function itself, is very, very paramount.
P: I have done this in the past, and unfortunately I had to completely change the runtime that TVM generates, because otherwise, as I said, it's just impossible: everything is serialized, you get plenty of clashes with memory, and you have to duplicate memory all the time. So that would be really helpful; I see the work that you guys are doing with AOT as a really great direction.
K: Yeah, that's awesome. In this case, just one question for you: when you're talking about serializing the runtime, are you trying to run multiple inputs through the graph simultaneously, or... okay, I see, yeah, cool.
P: What I mean by serializing the runtime: sorry, maybe I put it wrong. What happens with the TVM runtime currently is that, instead of running in parallel with GStreamer and the filter that wraps the code generated by TVM, the TVM runtime that you need to start within this filter to run the TVM code actually serializes everything; that's the issue.
P: So you end up with a big slowdown in performance when you are using GStreamer, if you don't change it, and that is really purely because of the runtime the way it is today. So slimming it down, making it pluggable, let's say, into other frameworks, would actually be a really, really good thing. Great.
K: Yeah, that makes sense; that's definitely aligned with the embedded stuff. Okay, we just have a minute left, so I guess the last thing I wanted to say was: AOT is, again, something we're working on for embedded development, which I guess was the impetus for it, but it's certainly something we want to target for other applications as well. Going forward, like we talked about, we're going to start having these microTVM bi-weekly meetups, and I expect a lot of the things that we talked about here to be microTVM-focused, things like this embedded C runtime interface. But one thing that I've found with microTVM is that it tends to touch a lot of different parts of TVM itself.
K: So I expect we'll also talk about different concerns, like quantization, AOT, and the memory planner, a bit at these meetings as well. I think what we'll do is post up agenda items, and I think the focus of the meetup should definitely be firmware-oriented, but people are welcome to attend just to learn what's going on in the firmware world, or to raise concerns.
H: Yep, okay. Can I follow up on your last comment, Andrew? Just one thing...
K: Yeah, so we're still working that out, but I think we definitely do want to create an interface that is clear about what you get. The idea is that what you would provide the memory planner is basically a TIR program, an IRModule containing TIR code, and then there's what the interface is expected to produce, and that's basically the first-class storage identifier:
K: basically a data structure that contains, I guess, buffer pools. Sorry, this is kind of hard to go through in a minute, but the idea is that we should be very clear about what we want the graph memory planner to consume and what we want it to produce. And then, I think you could always re-implement the interface as you wanted to.
K: Whether or not that's needed, I think we would probably still try to provide a more flexible implementation that can be retargeted for small changes in memory planning, and then, if you're trying to do something really crazy, you can re-implement it yourself as you need to. That's my view, and we'll definitely talk about it more in the microTVM meetups going forward, because I think it's related to things like accelerator offloading as well.
A: Okay, thank you, everybody. I'm super excited about this, and excited to see everyone at the upcoming microTVM meeting. Keep your eyes open for the new community calendar and resources coming out, and I hope everybody has a great day. Thanks a bunch.