From YouTube: Apache TVM - µTVM Community Meeting, July 21 2021
A: All right, welcome everyone to this July 21st edition of the microTVM community meeting. Welcome, and thanks for being able to make it. We have three items on the agenda today, and thanks to all of you who took part on the Discuss forum to pull the agenda together. Roughly, the three things are: an Arm Ethos-U RFC discussion that Manupa would like to have, a microTVM CI discussion that Gustavo would like to have, and a unified static memory planning discussion that Michael would like to have. Michael had indicated that he has a conflict and is going to be a little late to the meeting, so hopefully he's able to join later; we'll do our best to hold off on that item until later in the meeting.
A: Going once, going twice. All right, I don't hear anything, so with that we'll open with the usual things that we do from week to week, the first of which is introductions. If there's anybody new to the meeting who would like to introduce themselves, now's the time to do so. Anybody who would like to introduce themselves, go ahead.
B: Okay, hi everyone. My name is Jose. I'm new to this meeting; I've posted some posts on the Discuss forums about an embedded device which also hosts an accelerator that people at my university are developing.
B: I will start a PhD there on October 1st. I'm working at KU Leuven in Belgium, in the MICAS department, which does everything with microchips: we make everything from analog very-high-frequency amplifiers to AI accelerators.
B: There is some very exciting research going on, but the problem is that we don't really have any compiler for these devices right now; there's only a kind of ad hoc Python script that gets rewritten every time a new chip is developed. So my job would be to find some more general approach, so that we can reuse more of the same thing across those chips and don't have to start from scratch every time we make a new chip.
B: So that's kind of what I'm doing here. I will be following along a little more passively, because I'm actually on vacation, so I don't have that much input, but I was a bit curious to see what this meeting is all about. So that's something about me.

A: Okay, thank you. Welcome.
A: Okay, well, let's keep moving. I don't think we've got any announcements or news, but I'll pause just in case anybody has anything they would like to point out. Going once.
C: Going twice... actually, I guess I'll quickly put in a final plug, if anyone...
E: Yeah, okay, I can see it now. So we have finally come around to upstreaming the Ethos-U support that we have been working on for several months. We discussed this at the last TVM Conference: enabling Ethos-U codegen support in TVM.
E: So I put up an RFC here. This has been a group effort, building on the work of getting microTVM going and all the other development going on around it, to get the Ethos-U compiling and tested through TVM.
E: Just to say briefly: the Ethos-U55 is an NPU that is designed to work with Cortex-M in an embedded environment, not necessarily bare metal, but it works in bare-metal environments as well. That's what the NPU is all about. We want to enable compilation through the TVM flow, in an ahead-of-time fashion, to use the NPU to run machine learning models.
E: That's basically the scope of the RFC, so I thought I'd just go through the guide-level explanation and maybe touch on an overview of the compilation flow. There's this video as well; the flow is a bit different from what we described in the video, and it's more organized now than what I presented at the last TVM Conference, but this is how it looks today. Just to mention the tvmc interface: we are hoping users will be able to use it to compile a machine learning model for the Ethos-U and Cortex-M together. That should be the interface, and the way we have integrated the codegen is via the BYOC route. Interestingly, though, this one has a full pipeline to get it all the way down to the C runtime modules we generate at the end, so it goes through different stages. First, we partition the operators that the compilation pipeline supports out of the Relay graph, so that the other operators can go through the default pipeline.
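As a rough illustration of the partitioning step described here, below is a minimal sketch using the standard Relay BYOC passes. The pattern table and the "ethos-u" target string are placeholders; the actual Ethos-U integration in the RFC may differ.

```python
import tvm
from tvm import relay

def partition_for_accelerator(mod, pattern_table):
    """Split supported operators into external functions for the codegen."""
    seq = tvm.transform.Sequential([
        # Fuse sequences of operators the hardware supports into composite functions.
        relay.transform.MergeComposite(pattern_table),
        # Mark the composites for the external codegen ("ethos-u" is illustrative).
        relay.transform.AnnotateTarget("ethos-u"),
        relay.transform.MergeCompilerRegions(),
        # Extract annotated regions into external functions; everything else
        # continues through the default TVM pipeline.
        relay.transform.PartitionGraph(),
    ])
    return seq(mod)
```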
E: TVM is told which operators we say we support, which matches the constraints of what we can offload to our compilation pipeline, and the result contains external functions holding subgraphs of the supported operators. The first step we do with them is to legalize them to a subset of hardware-primitive Relay operators.
E: This abstraction is important, because this set of operators describes exactly what the hardware can support. That doesn't mean it cannot support most of the other operators out there, because we can legalize them: for example, dense, or fully-connected, operators can be legalized to a convolution-2D operator. That's one simple example, but those sorts of legalizations happen here.
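To make the dense-to-conv2d example concrete, here is a hedged sketch of that kind of legalization written as a Relay rewrite. The class name is hypothetical and this is not the actual pass in the Ethos-U codegen; it only illustrates the idea.

```python
import tvm
from tvm import relay

class LegalizeDenseToConv2D(relay.ExprMutator):
    """Illustrative rewrite: nn.dense -> 1x1 nn.conv2d (NHWC/HWIO)."""
    def visit_call(self, call):
        call = super().visit_call(call)
        if isinstance(call.op, tvm.ir.Op) and call.op.name == "nn.dense":
            data, weight = call.args
            # Assumes InferType has run, so weight has a checked type.
            units, in_feats = [int(d) for d in weight.checked_type.shape]
            # Each input row becomes a 1x1 "image" with in_feats channels.
            data4d = relay.reshape(data, (-1, 1, 1, in_feats))
            # (units, in_feats) weight -> (1, 1, in_feats, units) kernel.
            kernel = relay.reshape(relay.transpose(weight), (1, 1, in_feats, units))
            conv = relay.nn.conv2d(data4d, kernel, kernel_size=(1, 1), channels=units,
                                   data_layout="NHWC", kernel_layout="HWIO")
            return relay.reshape(conv, (-1, units))
        return call
```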
E: We also get help from the Vela compiler, which is focused on TFLite but has APIs that we intend to use to encode the constant artifacts: the bias and scale need a special encoding for the hardware to interpret them, and the weights need a different encoding for the hardware to interpret them. After that we use those artifacts. The bias and scale can be converted in Vela directly, but each of these hardware-primitive operations will then have a TE associated with it.
E: That TE faithfully represents what the operator does, and we run a certain set of TE and TIR passes to optimize for memory and performance; it's more of a trade-off there. In that process some operators might get tiled, different scheduling decisions get taken here, and that tiling affects how the weights get encoded, so the weight encoding happens in this phase. It should end up producing a TIR PrimFunc corresponding to what...
E: ...could then be lowered to the command stream, the binary artifact that we use with the driver to invoke the inference; that is produced at the end of the compilation pipeline. So that's how the compilation works, in a very brief nutshell. This ties in with the other work we have been doing on the AoT front, the C APIs Chris has been working on, and the interface APIs...
E: ...we have been designing. This is delivered as a package, which Andrew and others helped create, called the Model Library Format; it's the package we want to distribute to work in any embedded environment. That's what's written here.
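For reference, this is roughly how a build would be packaged into the Model Library Format mentioned here: a minimal sketch assuming a generic host micro target rather than the Ethos-U-specific target options.

```python
import tvm
from tvm import relay
import tvm.micro

def build_mlf(mod, params, out_path="module.tar"):
    # "host" is a stand-in; a real deployment would use a Cortex-M target.
    target = tvm.target.target.micro("host")
    with tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True}):
        factory = relay.build(mod, target=target, params=params)
    # Package generated code, graph, params, and metadata for use in any
    # embedded project.
    tvm.micro.export_model_library_format(factory, out_path)
    return out_path
```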
E: One of the questions, and I thought an interesting observation, is why, unlike the other BYOC flows, we lower down to TIR. One reason is what we call cascading-style performance optimizations: the optimizations we do to optimize memory and performance together in the TIR domain. The other is that we want the TIR expressed out for the unified static memory planner, so it can plan across the CPU and NPU tensors together.
E: So that is the reason we have this abstraction layer in the middle. That's basically, at an overview level, how the compilation looks, and I don't intend to go into too much detail; I've just put in a code snippet to show how the transformation looks. But I think I'll stop here for questions.
C: I have a question: can you explain a little bit more about the Ethos-U TE? Is that just standard TVM TE statements that you're using, or did you extend the TE statement vocabulary?
E: No, we didn't extend the IR; it's just our own operator implementations that we are defining with it.
C: So you are making new TE operators that are hardware-specific.
C: Yeah. When I was rereading this, I think we discussed this before, but I was curious whether you had a little bit more to share about what the motivation was for moving things out of the standard TE operators and creating basically a parallel set of operators for Ethos-U.
C: I don't know if you have anything more to share on what the main motivating reason was for deciding to create something separate.
E: TE and TIR go really low-level, like what you see in an IR, so we just needed an abstraction that could work with the passes to compile down to the command stream, which has a subset of operations that can run primitively. For example, convolution 2D and depthwise convolution 2D are primitively supported in the command stream that's generated. That's the basic reason: we want something to represent that, and the particular TE and TIR passes we run mostly work at that level.
C: The parameters that you're adding... I could be completely off base here, but I think you have added the bias and scale parameters to ordinary operators that don't necessarily have them, like a conv2d or other operators that aren't typically associated with that kind of input. But I could have misunderstood this.
C: I think you also added parameters to some of the operators, if I remember correctly. I could have misunderstood this, but for some reason that was my recollection, although it's kind of fuzzy right now; I'm not sure if that was true or if I'm just off.
E: Yes. For example, the convolution-2D primitive operator supports the biases as an input to it, so it's primitive in that way. That's an example I can give right now. The primitiveness of the operators is basically defined closely by what the hardware supports.
E: That can involve groupings of what TVM supports, and probably partial breakdowns of what it supports; it's not a simple one-to-one relationship. That is handled in the graph partitioning: that's where we do some slightly complicated pattern matching to identify what can be lowered to such Relay operators.
C: Yeah, and I know from the OctoML side, having chatted with Jared a little bit, one thing that's happened as these efforts have marched along in parallel is that there's been a significant effort spun up here to take a good look at the compilation pipeline, particularly from the scheduling part onward. So one thing would be: it would be great to make sure we can get some input from those guys doing...
C: ...basically the TE compiler work; that's what we're calling it. I suspect this will all slot in together quite nicely, but I think one thing they're focusing on is how to come to a unified lowering pass from TE into TIR, and I think unified mostly means across the graph executor, the AoT executor, and the VM executor.
C: So I think there's room for BYOC flows to remain and do some lowering, but I think it would be good to make sure all those plans align. That's one thing I've been keeping track of over the last week or two.
C: Yeah, I've directed them to comment a bit on your work, but I think everyone's a little bit busy, so I'll try to keep at them a bit.
G: The operators which will be emitted, and which will be in the generated libN.c, for instance: will they execute the commands for the Ethos-U55, or will a driver be necessary on the Zephyr side to dispatch the commands to the Ethos-U55 unit? How does it work?
E: We are currently working on defining an abstraction for it, a launch call that could be specialized to different targets. But I'm not sure; Chris might want to add to that.
G: So it will execute calls to functions which will use the drivers to dispatch the commands to the NPU?
F: Just to say, yeah, we're currently thinking about how exactly to wire that up. I've been talking a little bit with Andrew about how to pass devices down from the interface level into the operator flow, so it ends up in the backend. So if you have any thoughts, it would be worth sharing them.
C: Yeah. In particular, the question is that when you invoke the Ethos-U driver, there may be some accelerator-specific context that you want to pass along.
C: That's typically thought of as a void*, but since we want to allow the application to handle device initialization, and at least delegate to the application the responsibility of calling the library code to populate that driver context (that's what I would think of as initialization), there's some question as to how we define an interface that allows applications to pass that context in to the top-level...
C: ...AoT executor function. Basically, if you call the runtime's run function, that should take a context for each different accelerator used in the computation, and that should somehow wind its way all the way down to here. So we're still thinking about how to do that correctly. But yes, we should probably post something up at some point... or, I'm a little bit behind on reading things, so I'm not sure if you've posted anything more on this, Chris.
F: Yeah, nothing new posted as yet.
E: Yeah, just have a read through it and feel free to comment on it, and we can discuss it in the pull request.
E: If you don't have any other questions, we can move on. Tom, maybe?
A: Okay, well, thanks Manupa for leading us through this, and we appreciate you taking the time to present to us today. All right, next on the agenda is, I believe, Gustavo.
G: Right. So this is more of a sync-up on what the OctoML folks are doing; Andrew may have some news about the microTVM CI. The intent is to try to avoid duplicated work: we've been chasing a microTVM CI here focused on pull-request tests, and Theodore is also working on that.
G: I understand that the OctoML folks are doing kind of the same thing, but for a nightly build, and using a VM, but I've kind of lost track of it.
C: Yeah, I can definitely give some more background on this. Just to add some context, I don't want to make it sound like we're putting a ton of work into this; we're doing what you could view as the minimum possible thing...
C: ...we could do to get automated tests running nightly right now. It's just me and Mehrdad working on it right now, so there's not really a large effort or anything spun up. What we're doing right now is: we have a server with some attached hardware, and we're basically building a small...
C: ...I think it's a Python script, to track hardware reservations, and then we attach that to a Jenkins instance that basically reserves some hardware, launches the microTVM reference VM, and then uses the VM to drive some performance and functional regressions on a set of models. Nothing super complicated, but hopefully just complicated enough to be useful.
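A minimal sketch of what such a hardware-reservation script could look like, assuming simple per-board lock files; the paths and board name here are illustrative, not the actual OctoML tooling.

```python
import contextlib
import fcntl
import os

LOCK_DIR = "/var/lock/microtvm-boards"  # illustrative path

@contextlib.contextmanager
def reserve_board(board_id: str):
    """Block until the named board is free, then hold it for this job."""
    os.makedirs(LOCK_DIR, exist_ok=True)
    path = os.path.join(LOCK_DIR, f"{board_id}.lock")
    with open(path, "w") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # serialize access across Jenkins jobs
        try:
            yield board_id
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

# Example: a nightly job reserving a board before launching the reference VM.
# with reserve_board("nucleo-f746zg-0"):
#     run_nightly_regression()
```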
C: This is not a fast process, though, since we have to build, or rather instantiate, the reference VM every single time, and we still have yet to pipe-clean the whole thing.
C: One thing that we sometimes see is that when you launch a VM and attach USB devices, for some reason it takes a few rounds of enumerating a device before it really becomes stable across the VM. That's something that's new to me in the last month or two, so we're not really sure what we have to add to...
C: ...ensure, in an automated way, that the USB traffic is flowing properly. What you'll see is...
C: ...failures programming a part that's attached to the VM, and if you retry two or three times, then suddenly it works fine more or less indefinitely. So there are just a bunch of small things like that, where we're hoping to do the minimum amount of automation, just so we don't have too many of those things to work through.
C: I think one thing that would be interesting to discuss here is longer-term plans; I don't know if that's something you wanted to cover, Gustavo. One question is with the open-source TVM CI.
C: One thing we didn't want to do is make it impossible for people who don't have this hardware to iterate on a PR, especially one that's completely unrelated to microTVM. If you're working on a GPU-centric PR and a microTVM test fails, it's like: well, to reproduce this test you need to buy some, hopefully cheap, but still, you have to buy and set up all this equipment and these development boards.
C: Basically, we don't want to do that; at least that's been my take on it so far. The limitation, though, is that if we take that approach, then we aren't really testing anything on real hardware; we're emulating everything. So people have been talking a little about some middle ground, and one thing we could do is run nightly tests on real hardware.
C: That's one thing we're thinking about doing right now with the infrastructure we're building here, but I think there are more interesting things to be done beyond that: for example, tracking performance on maybe a...
C: ...website that the community could inspect. Another suggestion I've heard so far is basically allowing people to supply additional Jenkins jobs that run on their own infrastructure, and that they can launch; maybe a committer could trigger one of these jobs on a pull request. The idea is that if you open a PR, some hardware-in-the-loop regression...
C: ...could contribute an advisory vote and say: hey, we tested this on microTVM hardware and it did fail. That wouldn't necessarily block submission; the committers would have to use discretion to know whether this was a badly breaking change that would really slow down the community, or a change where maybe the regression itself ran into a hiccup. Anyway, those are some thoughts there.
G: Oh yeah, so I just wanted to get a sense of what you and Mehrdad are doing at OctoML; I'm doing the same here with Theodore. So we are partially working on the microTVM CI.
G: We're not fully allocated to it, but we've been trying to get Jenkins running and to experiment with a full architecture, like having a worker attached to a single board and that kind of stuff, to see how it dispatches the pull requests and how fast it goes in comparison to a Docker container, for instance, and also to have the chance to work on those serial issues...
G: ...we've discussed in Discord, like how to identify the boards and how to attach new hardware to the system. That's what we are trying to work on at our side, and in that sense I'd just like to ask whether you think it's still important to do that work on the community side, to see how it goes, or whether it will just overlap with what you're doing at OctoML.
C: I think one thing is that it's probably helpful for both of us to have some form of automated job launching, and the more infrastructure people can share, the better. So I certainly don't think it's wasted or anything like that.
C: I think one limitation of our approach is that if we take this next step of trying to put advisory votes on PRs, our approach is basically to build this VM, and that takes quite a long time.
C: I know that the CI for TVM takes quite a while, but it shouldn't necessarily be our approach to always create a two-hour CI or whatever, so I think there's significant opportunity for speed-ups there, and I don't think we've planned to do any of that work. From the last time you guys were chatting, you were hoping to use the Docker image, which...
C: ...seems to me to have a better chance of working faster, and also potentially in a way that uses the CPUs on the executor nodes much more effectively. So it might be good for us to chat; I'm not sure if it would make sense to have a detailed chat at this meeting...
C: ...although we totally could. Now that we've gone down this path a little bit, maybe we can share what we've learned so far and see if there is overlap, or if there's stuff we can share. For instance, I still think that developing a status page, a page that gives nightly performance stats or nightly functional statuses, might be useful to the community as we're pushing more PRs and all that. So I think there's quite a bit to do above and beyond just having a simple automated runner.
G: Right, yeah, I agree, Andrew; that would be interesting, and we are indeed looking at that as well. Voting on a pull request would be, I think, a next step, but initially getting some performance statistics out would be really awesome.
I: Yeah, Tom, you might know: are we limited in our ability to show performance numbers on different boards for tests like this, generally?
A: If it's a publicly available board, it shouldn't be an issue, but what we like to do is still talk to the manufacturers of the board and just make sure they're generally okay with it.

I: Okay, sure, yeah.
C: It's amazing how easy it is to misconfigure a board slightly and then have the performance suffer. We're relying on Zephyr to do most of the SoC-specific configuration, so we kind of assume that it will put the board in a reasonably good state. As we start moving towards building tuning logs on these devices, which will help us really optimize our performance, one easy way to shoot ourselves in the foot is to misconfigure the devices, so it definitely would be helpful to get some feedback signals about that.
J: I think my only comment is that, as much as we obviously want CI running on boards, and that's valuable, we are also interested in maintaining and expanding the CI we run on models, because that's cheaper and requires less configuration.
J: There are still some questions on what is required from a board so that we can have it consistently in the CI, but for benchmarks and things it would be awesome if we could get some numbers routinely.
G: Cool, yeah, I see. One thing we are really trying to stick to, Andrew, working with Leandro, is using Terraform and Ansible to configure everything. We don't want anything ad hoc; we want something you can easily reproduce anywhere, on any host. That's what we're looking for.
C: Yeah, I think that would be a good way to share things, and to make sure everyone's on the same page as to how things are configured.
C: One of the things I did when I took the VM approach was to try to lock down as much of the software stack as possible between different runtime environments; having that reference VM was my early stab at standardizing the software stack. I think it's really important, whatever we do, if we are going to share infrastructure or have a "here's how to reproduce the TVM performance stats" story...
C: ...so maybe, for next steps, it would make sense for us to chat a little bit more. We can post up and share what we have discussed so far on the Discuss forum, if people are interested in following along; it makes sense for us to form a bit better of a plan for medium-term work on the CI and all that, and perhaps we could share that at a following community meeting.
G: So yeah, regarding the microTVM CI, Tom, that's all I've got to discuss here.
A: Okay, thanks Gustavo, and thanks everyone for the discussion. All right, Michael is here, so I think we're ready for the last item on the agenda, which is the unified static memory planning discussion. Michael, I'll turn things over to you.
K: Thanks. So we had a couple of questions on the RFC that was proposed, I think, by Manupa; we posted some of them online.
K: Okay, let me... yep, there it is, perfect, cool. I'm talking about the one in the new tvm-rfcs repository. So, Manupa, the question we had is this: what is proposed here is an interface where, somehow, the inputs are the buffers plus pool sizes and so on. So the first question we would have is: what is the assumption...
E: Let me... I have a figure for this. Let me see.
E: Yeah, this is actually one of the questions Matthew Bentham, also from Arm, raised in one of the meetings. Just to give context to all of this: this is a snippet from Inception, and these numbers correspond to the execution order which would result in the lowest memory pressure.
E: So, is this what you are after, just to clarify? Currently, the control flow, either the TIR AoT mod main module or the JSON generator, will just use a visitor to create the sequence of calls to the operators. But I think we need to be a bit more careful and generate a different sequence that might result in different memory pressure.
E: In this example, we found that this particular sequence seems better than another approach, simply because we can get rid of intermediate feature maps being live for so long. Is that the feature you are after?
E: Yeah. So what we are saying is that we can use Relay's let bindings. There is already a pass to convert to A-normal form, which creates a sequencing of the Relay operators. Relay, being a fully functional language, doesn't have a concept of sequences, unlike TIR: TIR is both imperative and functional, with both statements and expressions, while Relay has only expressions. The let bindings, though, allow you to create a sequence.
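For readers following along, this is the existing Relay pass referred to here; a minimal sketch showing how A-normal form makes the execution order explicit through let bindings.

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(1, 8))
y = relay.nn.relu(x)
z = relay.add(y, y)
mod = tvm.IRModule.from_expr(relay.Function([x], z))

# After this pass, the body is a chain of let bindings (one per op), so the
# traversal order is explicit rather than implied by the visitor; a
# memory-aware scheduler could then rebind them in a different order.
mod = relay.transform.ToANormalForm()(mod)
print(mod)
```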
E: Right now the sequence is created by the way the visitor traverses, but it doesn't necessarily need to be so. What we are saying is that, if you're just interested in creating the sequence, we can add a scheduler, call it a memory-aware scheduler, which creates the let bindings in a different way that results in different memory pressure. I think in this particular case the memory used within an operator doesn't matter, because it isn't...
E: ...live when you finish the operator; what matters is the boundary tensors of the operators. Therefore we feel this is a pass that could happen in Relay on its own, and at that point we could commit to that schedule. Then we would end up with the TIR PrimFuncs in this particular order.
E: The reason is that we don't want to do scheduling and allocation together, because of the time complexity. That is not to say schedulers cannot tap into the allocator's algorithm if they really want to run the allocator in a loop to get more realistic memory numbers, not just the memory pressure.
E: However, the idea is that when you come to this memory planner, the schedule is committed, so the order is committed. That is the design we are going with, but I'm happy to hear any concerns around it.
E: No, okay. So there can be a case where you need a very comprehensive scheduler which can afford to look into all the buffers and perform the allocation before committing to a schedule. But we think that is part of the schedule-planning activity, which has to take care of it, and it can call into the allocator's API.
E: The ordering is just one problem; there are many. If you go into advanced scheduling, we might want to break these operators up and do a lot of things that fall to the scheduler; it's always going to be a trade-off between performance and memory. So we feel all of that should be taken care of by the scheduler, but that doesn't restrict the scheduler from using any of the APIs of the memory planner if it needs that information to commit to a schedule.
C: We've been thinking about that a bit here as well, and I think the big insight I've had so far is that it's not necessary to couple these two components together, but it's certainly possible for the scheduler to do something like ask the memory planner to compute the total live memory at a particular point, or even use the linearization pass from storage_rewrite, for example, to produce a sequence of memory alloc and free operations, which...
C: ...is kind of how that pass specifies things. You can use that to learn possible orderings at the Relay level, and then propose an ordering and run memory planning based on it, to actually lock things down at the TIR level. That's kind of where I am on my side.
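As a toy illustration of planning within a committed order: given liveness intervals for the boundary tensors under a fixed schedule, a planner can assign non-conflicting offsets in one pool. This greedy sketch only illustrates the idea; it is not the algorithm in the USMP RFC.

```python
def plan_pool(tensors):
    """tensors: list of (name, size, first_use, last_use) under a fixed
    schedule. Returns per-tensor offsets and the resulting pool size."""
    placed = []   # (offset, size, first_use, last_use)
    offsets = {}
    # Place larger tensors first, a common greedy heuristic.
    for name, size, start, end in sorted(tensors, key=lambda t: -t[1]):
        offset = 0
        for o, s, ps, pe in sorted(placed):
            conflict = not (end < ps or pe < start)  # live at the same time?
            if conflict and offset + size > o:
                offset = max(offset, o + s)  # skip past the conflicting buffer
        placed.append((offset, size, start, end))
        offsets[name] = offset
    pool_size = max(o + s for o, s, _, _ in placed)
    return offsets, pool_size

# Example: "a" and "b" never overlap in time, so they share offset 0, while
# "c" overlaps both and gets placed above them.
offs, total = plan_pool([("a", 1024, 0, 2), ("b", 1024, 3, 5), ("c", 512, 1, 4)])
```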
C: Now, the one thing that's maybe a little bit different is, I think, this question of how we handle operator workspace buffers. One thing we've talked about a bit before was lifting, or hoisting, the workspace allocations out to the top-level function, so that even the workspace allocations are considered parameters to each TIR PrimFunc. I'm not sure if you were still planning on doing that, Manupa, or if that was...
E: No, no, I think... I mean, for better readability of the IR and the compilation down the line, if you feel it should... Having the allocates associated with the PrimFunc kind of helps us when you're targeting multiple backends, as the PrimFuncs are getting lowered. But still, this kind of overlaying of buffers for liveness conflicts could be done without really needing to mutate the IR.
E: The scheduler is a bit tricky. What we are saying is that one could have a scheduler, a set of passes that works on the TIR, which may or may not need to access the allocator to determine that. We are keeping the window open so that one could design schedulers for particular hardware, or ones that incorporate whatever needs that allocation information.
E: A universal scheduler is not on our roadmap, but we are just keeping the possibility open for that.
K: Right, absolutely. So I'm trying to approach this from a user's perspective. What I want to do is find the best possible deployment for my new fancy neural network that our AI guys came up with: where would I plug in? Currently, from my point of view, that was the universal memory planner. So now you're saying... I mean, from the outside, I would actually see a need for an interface around it.
E: I would still keep that decoupled. But yeah, to correct something: this is the unified memory planner; it's not meant to be a universal memory planner. By unified, I mean that all the memory, the workspaces and the buffers that are globally scoped, can be planned together. That's the goal of it. So you're kind of thinking of it the other way around.
E: As for what to do about scheduling: if the scheduling needs to be memory-aware, which is the case with us, we are doing the scheduling memory-aware, but that also comes with the fact that we need to take performance into account as well. Once you go after performance, the scheduling and memory allocation become intertwined, and so the design we are proposing is to let the scheduler handle that complexity...
E: ...if it is affordable, and to decide there. So I'm more or less saying there needs to be a component, a scheduler, that does the ordering in the way the backend wants it, but it's free to access the memory planner's interfaces to do that. I think both of them are representable in Relay, and from there the outcome goes down to TIR.
C: Yeah, and I think there's additionally some overlap between this and the auto-TIR, or meta-scheduler, plans, which would allow auto-tuning to basically tune at the TIR level, if that makes sense; there's the question of where that folds in and how it relates to any such scheduler. So I totally agree with you, Michael.
C: I think that is kind of a gap in the generic or general story right now. One of the reasons we haven't released a lot on that, at least one of the reasons I haven't commented as much on this lately, is that there are still some open questions in my mind about how all the pieces fit together, and so I think there's some opportunity for us to release, or propose, some more designs here. So I'm definitely very interested if you guys have thoughts.
C: I agree with what you're saying: there isn't a scheduler piece right now, and the schedulers have to be kind of hardware-aware, so there's probably some room for some sort of interface as well. But I don't know that anything in the current RFCs explicitly addresses this at a general level right now.
E: Yeah, I mean, that kind of sounds to me like a scheduler. It depends on the degree of freedom that scheduler can take, but I would still view it as a scheduler, one which can run before allocation of the memories. I think it could be an extension of what we are doing right now, after we have something working.
E: The current work, our current focus, assumes the schedule is committed; if someone needs to change the committed schedule, I would leave that as incremental work that can happen on top of this.
C: Yeah. So, for instance, with the auto-TIR approach you kind of have to imagine there'll be some sort of iterative process here, right? Perhaps we'd run scheduling, and scheduling may come to one arrangement, and then that would be passed along to memory planning, and memory planning would then plan within those bounds. And then, at the end of the day, you look and see the peak memory, or whatever statistic you want to use, whatever cost model...
C: ...comes with that, and then you may come back and iterate again and optimize further. That's not to say you couldn't do optimization within each of those iterations, but I guess the main question is whether or not the memory planner should also be responsible for reordering things, or whether it should just work within the given bounds.
C: Currently, I think the prevailing thought is to keep things operating within the bounds of what's already been scheduled, but I would also say we're very early on this, and so I think everything's very open to proposals. So if you have thoughts, or you want to propose something more concretely, I think that would be...
C: ...welcomed on the Discuss forum, for sure. I think it'd be great to see a more comprehensive layout of the scheduling problem in general, and then to think a bit more about how to put the pieces together, if that makes sense. I agree with what you're saying, Sebastian.
C: Yeah, if you had a sketch of the buffer sizes in each function, in principle that is enough information for some component to do both memory planning and scheduling, if we wanted to combine those two.
A: Well, I think this is probably a good time to call it a meeting, unless there's something else we urgently want to discuss. If not, as always, the Discuss forum is the definitive place for all discussions anyway, so that's not a bad place to continue.
C: Yeah, thanks for the discussions, everyone; that was really great. Please keep following along, and we'll make sure everyone keeps posting to the Discuss forums as things develop. Also, don't hesitate to use the Discord as well for faster chats if people need to.
L: Yeah, and before we go, I also wanted to remind everyone that we have a general TVM community meeting tomorrow, where Leandro is going to be presenting on some of the work they've been doing with the Docker containers and CI.
A: Thanks for the reminder. Okay, everyone, let's go ahead and call it a meeting. I'll get the meeting recording to you, Chris, in just a moment. Thanks for attending, thanks for participating, and we'll see you in about two weeks.