From YouTube: OMR Compiler Architecture Overview 20180308
Description
This is a quick tour of the compiler technology in Eclipse OMR in action. It provides an overview of the overall architecture, IL generation, high-level optimizer, and code generator all in the context of OpenJ9.
What I thought I could do is start by level-setting most people on the basic concepts of how a backend in OMR actually works and, more broadly, how the OMR compiler technology actually works. So what I thought I would do today is really start from the beginning and take you through a very high-level overview of the Testarossa compiler architecture that's available in OMR. Some of you will probably find this very familiar already, so I apologize for that, but I think that as we get into this more, and as we start to have a few more discussions around this technology, it's going to get a lot deeper and a lot more relevant to you. So let's get started here.
Okay, so let's start off with a very big picture of the compiler technology as it sits in a much bigger context. The bigger context in this particular diagram is OpenJ9, and as you can see, the compiler technology itself is just one part of this much, much bigger picture. The key thing here is that the compiler technology doesn't sit alone: it actually needs to communicate with the virtual machine that's running, regardless of whether or not that virtual machine is OpenJ9.
A
It
does
need
to
communicate
with
the
higher-level
virtual
machine
and
it
does
need
to
talk
to
the
runtime
in
some
way
as
well.
So
this
is
this
is
where
you
can
kind
of
get
into
native
code
when
you're
talking
about
Java
through
J&I
or
some
tooling
interfaces
with
JVM
ti.
One
thing
missing
on
here
as
well
is
you
know
several
conversations
of
the
compiler
might
need
to
have
with
the
operating
system
itself
and
the
means
that
we
have
for
going
through
there
and
there's
actually
an
arrow
in
this
diagram.
The way things work within Testarossa is that the core technology really sits behind an API that talks to a particular runtime system. So in this particular case, on the left-hand side here in blue, you have what we've traditionally called a front-end interface to the runtime technology, and it's that front-end interface that the compiler really queries.
A
It
really
needs
to
talk
to
these
to
these
front
ends,
and
there
is
sort
of
this
generic
front-end
interface
that
different
consumers
of
this
compiler
technology
can
go
and
and
specialize
for
its
own
for
its
own
use
and
I
mean
I've
got
a
few
examples
here
around
you
know:
we've
got
the
ahead
of
time
compiled
for
Java,
there's
like
an
embedded
Java
there
as
well.
We've
got
other
technologies
that
are
built
on
this
that
have
nothing
to
do
with
Java.
If you've been following the OMR project as it's been evolving since the compiler technology was open sourced in September of 2016, one of the advances that we've provided, and something we're very heavily invested in and putting a lot of work into improving, is something called JitBuilder. What JitBuilder really is is a much more simplified interface from your language runtime to the compiler technology.
It really simplifies the process of producing an IL generator; it takes that out of the equation and lets you map the things in your language runtime onto the compiler technology much more easily. That in itself is an entire presentation that we can talk about at some point in the future.
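To make the contrast concrete, here is a tiny Python sketch of the idea behind a builder-style interface: the runtime describes a method through builder calls and the library records IL on its behalf, so no hand-written IL generator is needed. This is purely illustrative; it is not JitBuilder's real (C++) API, and every name in it is made up.

```python
# Illustrative only: mimics the *shape* of a builder-style interface,
# not JitBuilder's actual API. Builder calls record IL as a side effect.
class MethodSketch:
    def __init__(self, name, params):
        self.name, self.params, self.il = name, list(params), []

    def load(self, var):
        node = ("load", var)
        self.il.append(node)          # record a load of a named variable
        return node

    def add(self, lhs, rhs):
        node = ("add", lhs, rhs)
        self.il.append(node)          # record an add of two recorded nodes
        return node

    def ret(self, value):
        self.il.append(("return", value))

# A language front end "describes" the computation a + b:
m = MethodSketch("sum", ["a", "b"])
m.ret(m.add(m.load("a"), m.load("b")))
```

The point is that the front end never manipulates trees directly; it just makes calls, and the recorded IL is what flows into the rest of the compiler.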
Okay, so once you get past the IL generator, we have a variety of high-level optimizations, based on the trees, that can be performed. The compiler technology actually has a very rich suite of optimizations, and they're fairly powerful optimizations as well. I can't remember exactly how many have been contributed to OMR, but I think between OMR and OpenJ9 there must be 120, if not more, optimizations that can be applied.
The way the optimizations work is that they consume trees and they also produce trees. In the most theoretical sense you can reorder these optimizations as you see fit; in practice, however (and this isn't necessarily the right answer), there are dependencies between some of the optimizations such that a certain ordering is required, and that's actually something we're working to get rid of. So the optimizer applies a whole suite of optimizations.
These are classical compiler optimizations: loop optimizations, dataflow optimizations, commoning, simplification; very bread-and-butter kinds of compiler optimizations. There are also more runtime-specific optimizations available, things like removing checks (array checks, for example) or, where there are ways to do it, removing synchronization.
The code generators that we have in OMR right now are for x86, PowerPC, System z, and 32-bit Arm. This particular chart shows even more, SH4 and MIPS; we used to have a deeper set of backends available, but at the moment in OMR it's just the four.
In terms of what we're talking about here, we're talking about supplementing the Arm code generator with 64-bit capabilities, and based on our earlier discussions it's looking like that might be a new code generator on its own. The code generator is basically responsible for taking trees and producing instructions from them, instructions that can be executed. So that's really the path that it follows.
Now, that's the core flow through the compiler. There are other components as well that may be consulted for information as the compilation proceeds. For example, some optimizations rely on profiling information, or will perform much better in the context of profiling information, so there is interaction with the profile manager to get data.
That data might be about a particular opcode or a particular bytecode, and whether there's something we can specialize better from it. The compiler also interacts with the runtime when the code is being produced: needing to call certain methods in the runtime from compiled code, that sort of thing. When we're looking at just OMR, though, you don't really deal with the profiler right now, and we're not really going to be dealing with the runtime right now either.
Okay, so what I thought I would do is start going through a fairly simple example of how a method actually gets compiled for Java, using OpenJ9 as an example; I think it's really going to touch on a lot of the concepts here. Given some virtual machine (I'm going to assume it's Java), what you start off with is Java bytecodes, and normally your virtual machine is simply going to interpret those bytecodes.
So what we're really talking about with this compiler technology is how we take those bytecodes and compile them, at runtime, into native instructions: binary code that can actually be executed. As I talked about before in terms of the architecture, the way this works is that we have a means by which the compiler can take Java bytecodes and run them through the IL generator.
The IL generator produces the Testarossa trees for consumption within Testarossa; that's what the ILGen does. Then those trees flow into the optimizer. Depending on the strategy with which you want to compile those methods, you can throw different packages of these optimizations at those trees to get them to do different things. It's a very iterative kind of process as well.
Code generation is really based on an infinite virtual register set, and then we go through a very platform-specific register assignment phase, where we take those virtual (pseudo) registers and map them into real machine registers. Then you end up going through a binary encoding phase, which does a lot of last-minute things: you take all of that and map it into a buffer that's executable, and you pass that compiled method back to the virtual machine and say, please go execute it.
So that's really the process at a very high level. I think what I'd like to do now is drill down deeper into each one of these different phases and talk a bit more about how each does what it actually does. The first step is the IL generator, colloquially called ILGen.
In general, Java bytecodes are not really great for analysis; they're not something that's easily consumed or worked on within the compiler technology. So what we end up doing is turning those bytecodes into an internal representation that we know is easy to work with, whose properties are reasonably well understood, and which can be used by multiple language front ends as well.
So what we end up doing is producing these trees, and the ILGen phase itself is actually just an abstract interpreter: it walks through the bytecodes just like the Java interpreter would, basically interpreting what it comes across. As an example, when we hit the first iload bytecode here, we end up creating what's called a node in our tree, and a node is really just a representation of a particular expression.
Then we come across a third bytecode that is going to consume the previous two bytecodes as its operands. The way we handle that is that we create a third node to represent the add expression, and then the way we associate the operands with this particular expression is by making the first two nodes children of the add node. That's what those little arrows are intended to represent there.
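That abstract-interpretation walk can be sketched in a few lines. This is a hypothetical mini-ILGen over a stack-based bytecode, not OMR's actual implementation: the operand stack holds expression nodes instead of values, and a side-effecting bytecode like a store anchors a finished statement.

```python
# Hypothetical mini IL generator: walk stack-based bytecodes the way an
# interpreter would, but build expression nodes instead of computing values.
class Node:
    def __init__(self, op, *children):
        self.op, self.children = op, children

def ilgen(bytecodes):
    stack, treetops = [], []      # operand stack holds Nodes, not values
    for bc in bytecodes:
        op, *arg = bc.split()
        if op == "iload":         # push a load node for the named variable
            stack.append(Node(f"iload_{arg[0]}"))
        elif op == "iadd":        # pop the two operands, push an add node
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(Node("iadd", lhs, rhs))
        elif op == "istore":      # a store is a side effect: emit a statement
            treetops.append(Node(f"istore_{arg[0]}", stack.pop()))
    return treetops

trees = ilgen(["iload a", "iload b", "iadd", "istore a"])
```

Running it over `iload a; iload b; iadd; istore a` yields exactly one statement: a store whose child is the add, whose children are the two loads, matching the picture described above.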
What we're starting to see here is a complete expression, a complete statement, being formed. In this particular case, this is really the representation of adding b to a and then storing the result of that back into a. Each of these statements, internally within Testarossa, is anchored in a particular way.
What we refer to those anchors as is treetops: we actually create a special kind of node, called a TreeTop, that points to the root of one of these statements. Informally, what a treetop is really supposed to represent is an expression that's got a single side effect associated with it. So in this case, the major side effect is the store into a, so that alone should get its own treetop.
You would never associate a tree with a treetop that's got more than one side effect: for example, one that has two stores in it, or two calls, or a call and a store, that sort of thing. It always needs to have a single side effect associated with it, and that's going to make these trees a lot easier for the optimizations and other parts of the compiler to reason about.
So we carry on, and when we get to the next statement we end up producing yet another treetop for the next set of bytecodes there. What we do as we're producing all these treetops is link them all together in program order, so at the end of consuming all the bytecodes for a particular method, we end up with a very long linked list of these treetops, as we see here in the next step.
What we then tend to do with these treetops is start to group them into what we call basic blocks. These are single-entry, single-exit blocks, and there are special nodes that get created to delimit them: one that marks the basic block start and one that indicates the basic block end. We can go through and produce these basic blocks from those treetops.
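A sketch of how a flat list of statements might be partitioned into single-entry, single-exit blocks. This follows the standard scheme (a branch target begins a block, a control-flow statement ends one); it is not OMR's exact code, and the opcode names are illustrative.

```python
# Sketch: split a list of statements into basic blocks. A label starts a
# new block; a branch or return ends the current one.
def make_blocks(treetops):
    blocks, current = [], []
    for tt in treetops:
        if tt.startswith("label") and current:
            blocks.append(current)    # a branch target begins a new block
            current = []
        current.append(tt)
        if tt.startswith(("goto", "ificmp", "ireturn", "return")):
            blocks.append(current)    # control flow ends the block
            current = []
    if current:
        blocks.append(current)        # trailing fall-through block
    return blocks

blocks = make_blocks(
    ["istore a", "ificmplt L1", "istore b", "label L1", "ireturn"])
```

The five statements split into three blocks: the conditional branch ends the first, the fall-through store forms the second, and the branch target starts the third.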
What I just described was a Java example, but any language runtime that wants to use this architecture is going to have to end up producing trees at some point, and the same rules around treetops, the same rules around how edges are formed between blocks, and the semantics of basic blocks all have to apply across those different languages.
So once we have produced the initial set of intermediate language for a particular set of bytecodes, we can pass that into the high-level optimizer. Like I said before, there are probably at least 120 optimizations available; it would be absolutely impossible to go through each and every one of those here, but it's certainly possible to drill down into them, and if anybody has any specific interests we can do that at another time.
We can bring in the right people who understand each of those optimizations to talk about them. Suffice it to say, the optimizer does consume a fairly large part of the compiler technology. I'm just going to give a very quick overview of what it can do, to give you a taste of the sort of optimizations it's capable of and how it actually goes about manipulating the trees.
Basically, the optimizer consists of a variety of different optimizations that are each controlled by their own optimization manager, and an optimization behaves in more or less two phases. First, it does an analysis of what work is possible, what work needs to be done; so it's looking for opportunities, and it checks that any transformation it might want to do can be done correctly and safely as well.
Once it's done that analysis, it then goes through and does the actual transformation that manipulates the trees. The analysis phase doesn't actually change the trees or anything like that, because it can bail out at any moment; but once it's proven that there is an opportunity, that it's worthwhile doing this change, and that it can be done correctly, then it goes and transforms the trees, and hopefully we're better off for it, or we're uncovering further opportunities.
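That analyze-then-transform discipline can be sketched as two separate passes: the analysis only collects opportunities and may abandon the whole attempt without touching the IL, and mutation happens afterwards, once safety is established. The trivial "optimization" here (constant folding an add) is just a stand-in.

```python
# Sketch of the two-phase discipline: analysis never mutates the trees.
def analyze(trees):
    # Record the indices of trees we could constant-fold; touch nothing.
    return [i for i, (op, a, b) in enumerate(trees)
            if op == "iadd" and isinstance(a, int) and isinstance(b, int)]

def transform(trees, opportunities):
    # Only now, with the opportunities proven, do we mutate the trees.
    for i in opportunities:
        _, a, b = trees[i]
        trees[i] = ("iconst", a + b)
    return trees

trees = [("iadd", 2, 3), ("iadd", "a", 1)]
trees = transform(trees, analyze(trees))
```

If `analyze` bailed out and returned nothing, `transform` would be a no-op and the IL would be exactly as the ILGen left it, which is the safety property the split is buying.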
Okay, so just a quick example of some of the things we can do. If you look at a couple of treetops in order, one of the things that can get done is a strength reduction optimization: basically, by analyzing this code, we can figure out that we can replace the multiply with a shift instruction. So let's go ahead and do that. Then a very, very frequent optimization that occurs in this code is something called commoning, which is really a common subexpression elimination optimization.
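The strength reduction just mentioned is easy to show concretely: an integer multiply by a power of two can be rewritten as a cheaper left shift. A minimal sketch, with made-up tuple-shaped nodes rather than real Testarossa nodes:

```python
# Sketch of strength reduction: imul by a power of two becomes ishl.
def reduce_strength(node):
    op, child, k = node
    if op == "imul" and k > 0 and (k & (k - 1)) == 0:  # k is a power of two
        return ("ishl", child, k.bit_length() - 1)     # shift by log2(k)
    return node                                        # otherwise unchanged
```

So `x * 8` becomes `x << 3`, while `x * 6` is left alone because 6 is not a power of two.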
What we can do is assume that the operation is going to occur at the first tree, and then we simply say that we can "common up" the operation under the subtract node there. So, basically, that operation doesn't need to be repeated.
We already know that we've done it. There are other rules around how this can actually happen, but in this very simple example we're just performing it straight up. The resulting trees in this case are very likely going to be a DAG; we don't allow cycles in any of the trees that we form, which I guess I should have mentioned earlier.
Another optimization that we can do is something called copy propagation. Looking at the next set of optimizations that we can apply here: we are producing the result of some expression and storing it into a variable a, and then the next treetop comes along and actually loads that a. What we can basically see is that, instead of loading a from memory, we can just replace the load with the result of the iadd. It's a fairly straightforward copy propagation. Then we can use a simplification optimization as well, which looks to see if there are unnecessary trees or operations that we can fold into as small a unit as possible. In this particular case, as the text on the right says, we're doing a store that's immediately overwritten.
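The copy propagation step can be sketched like this: when a treetop stores a value into `a` and a later tree reloads `a`, the load can be replaced by the stored node itself, which is one way the trees grow into a DAG. This toy version assumes there is no intervening store to the same variable and ignores aliasing; it is a model of the idea, not OMR's pass.

```python
# Sketch of copy propagation over tuple-shaped trees: replace a load of a
# variable with the node last stored to it (no intervening store assumed).
def copy_propagate(treetops):
    last_store = {}                         # variable -> node stored to it
    for i, (op, *rest) in enumerate(treetops):
        if op == "istore":
            var, value = rest
            last_store[var] = value
        elif op == "iload" and rest[0] in last_store:
            treetops[i] = last_store[rest[0]]   # reuse the stored value node
    return treetops

add = ("iadd", ("iload", "b"), ("iload", "c"))
trees = copy_propagate([("istore", "a", add), ("iload", "a")])
```

After the pass, the second treetop is literally the same node object as the stored add, so the memory load has disappeared.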
So I don't really need to do the store at all in the first case, because the second one is going to overwrite it, so I simply replace it with a treetop. Then we have another phase, called dead trees elimination, that can go ahead and remove those dead treetops.
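The dead-store cleanup just described can be sketched with a backward walk, the usual direction for liveness-style reasoning: a store to `a` that is overwritten before any load of `a` is dead and gets dropped. This toy ignores loads buried inside stored values and is only a model of the mechanism.

```python
# Sketch of dead store elimination: walking backwards, a store to a variable
# that is not live (no load seen above the next store) is dropped.
def eliminate_dead_stores(treetops):
    live, kept = set(), []
    for tt in reversed(treetops):
        op, var = tt[0], tt[1]
        if op == "istore":
            if var not in live:
                continue              # overwritten before any use: drop it
            live.discard(var)         # this store satisfies the use above
        elif op == "iload":
            live.add(var)
        kept.append(tt)
    kept.reverse()
    return kept

trees = eliminate_dead_stores(
    [("istore", "a", "t1"), ("istore", "a", "t2"), ("iload", "a")])
```

The first store of `t1` into `a` is dead because the `t2` store overwrites it before the load, so only the second store and the load survive.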
All in all, we're going to simplify this down to as small a unit as possible, and we can keep going and going, applying as many optimizations as possible. Oftentimes applying one optimization actually uncovers opportunities for other optimizations, and then we iterate on that. The optimizer is a fairly big topic, and it's not really the focus of what we wanted to talk about with this working group, but if there's ever any interest in anything in particular there, we can drill into it.
So, moving on to the code generator: ultimately, its role is really to produce binary code from IL. A lot of steps happen between the IL being introduced and the binary code actually being produced, and what we've done is break the code generation up into a variety of different phases that can be customized by a particular language runtime or by the architecture that you're running on. On the far left here I happen to list all the phases that are available in the current OMR compiler; not every runtime is actually going to use all of these.
In the middle column I've got an example of how a particular architecture can provide its own phases that make sense for that architecture, and on the far right is an example of a package of phases that a particular runtime could use. In fact, this package here is the default one that an OMR code generator would use unless it was overridden by a project downstream. This is the default ordering of the phases, and one of these phases is tree evaluation.
A word that gets used a lot around here is "evaluation". The act of inspecting an IL node and turning it into instructions is called an evaluation, and, like the quote there says, the point is "to generate instructions which, when executed, will cause the tree's value to appear in some virtual register". So the result of evaluating a node is a sequence of virtual instructions whose result will be a virtual register that represents the value of that particular tree.
We'll see that in a sec. The virtual registers that are used come from an infinite set, so you can use as many virtual registers as you need in order to accomplish what you want; you don't have to worry about the limitations of the particular architecture that you're targeting. In this particular phase, you use an infinite set of registers.
Every node also carries a reference count. What that really means is how many uses of a node are seen in a particular program, and we can use that information to make decisions: is this the first use, is this the last use of a node, that kind of thing. From that we can make decisions about when registers die and when registers should go live.
So yeah, that's what it says there: we can make reuse decisions based on the reference count. If the reference count goes to one, you know this is the last reference you'll see of this node. An evaluator does not usually decrement its own reference count; the parent is responsible for decrementing the reference count on a particular node. Okay, so an example: here's a fairly simple tree representing the statement there.
We need to have some kind of pointer to the instruction stream that we're working on; this is the stream of virtual instructions, I guess you could call it. The way this works is that we basically do a post-order traversal: we evaluate the children before the parent. So we get to the bottom here, there are no more children to traverse, and now we actually have to do something.
What we're going to do here is what we call evaluating this node, and again we're assuming an infinite supply of registers. What this really turns into (it's basically a load; I'm just going to use a very generic kind of pseudo-assembly here, this isn't any particular architecture) is loading the contents of a into some virtual register that we happen to call r203.
So the evaluation of a load results in that particular virtual instruction, and since its result is stored into virtual register r203, we can actually set register r203 on that node. Any time in the future you look at that node, you can ask: has a virtual register been assigned to you yet? If the answer is yes, you know you've actually evaluated that node in the past, and you can just use the value that's stored in that virtual register.
In this case, it's loading b into register r204, and again it sets register r204 on that node. Since we're out of children for this particular node, we now go back up to the parent, the add. What the add is now able to do, since both of its children have been evaluated, is the operation it's supposed to do: it emits an add instruction that stores its result into a virtual register.
That's, in this case, r205, computed from its two children that have been evaluated; we just did r203 and r204. Then we keep going: we now get to the imul, which hasn't evaluated its isub child yet, so that's what we do here, the subtract itself. You can see by the arrows here that the children of the subtract are already r203 and r204. We don't have to re-evaluate those nodes, because they've already been evaluated into a virtual register, so the subtract node simply takes its children from r203 and r204, and it stores its result into a new register, r206. We carry on up the tree, and it's the same thing for the multiply: we've got r205 and we've got r206.
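The walk just described can be sketched end to end. Children are evaluated before parents, each node caches the virtual register holding its value, and an already-evaluated (commoned) node is simply reused instead of being re-evaluated. The register numbers are chosen to line up with the walkthrough above; this is a model of the mechanism, not OMR's evaluators.

```python
import itertools

# Sketch of post-order tree evaluation into virtual instructions.
class Node:
    def __init__(self, op, *children):
        self.op, self.children, self.reg = op, children, None

def evaluate(node, instructions, counter):
    if node.reg is not None:
        return node.reg               # commoned node: already in a register
    regs = [evaluate(c, instructions, counter) for c in node.children]
    node.reg = f"r{next(counter)}"    # draw from the infinite virtual set
    instructions.append((node.op, node.reg, *regs))
    return node.reg

a, b = Node("iload a"), Node("iload b")
add, sub = Node("iadd", a, b), Node("isub", a, b)   # a and b are shared
mul = Node("imul", add, sub)

instrs = []
evaluate(mul, instrs, itertools.count(203))
```

The shared `a` and `b` nodes are evaluated exactly once (into r203 and r204); the subtract reuses those registers rather than emitting fresh loads, so five instructions come out instead of seven.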
What this is really showing you is one particular treetop and the nodes underneath it: what it looks like before tree evaluation, what it looks like after we've gone through the process we just did, and what the actual underlying instructions look like. The way you can read this is that in the column on the left, every node has got a unique ID associated with it (n238n, for example).
So in this particular case you see that the ificmplt node has a reference count of 0; that's good, because it's a treetop, and treetops should not actually be commoned. Then underneath that you see it's got an iload with a reference count of 4, which means that besides this use of the iload, there are three other places that actually reference this particular iload.
What you can see is that after we've done tree evaluation, the reference counts on these nodes actually get decremented. So after we've gone through this, that iload has its reference count decremented from four to three, another load goes down to 0, and another goes down to 1; that's the natural result of this process. The result of these evaluations is a virtual register, and after we've done the evaluation we can report it.
We can now report which register has been set on each of these nodes. In the case of the iload, we know it was evaluated into GPR64, so we can now report that when we print these trees in the logs. A little bit later, you can actually look at the virtual instructions that were produced as a result of evaluating those trees, and indeed they are a very flat instruction representation here. The column on the left here actually refers to the instruction address.
Okay, so up until this point we've been dealing with virtual registers, and the problem is that when you actually do want to generate this for a real machine, like x86 or Arm or whatever, that real hardware doesn't have an unlimited set of registers. It's very handy during evaluation to assume an unlimited number of virtual registers, but at some point we have to make everything fit within the constraints of the target architecture.
So we go through a register assignment phase, often abbreviated as RA, and what we basically do there is map all these virtual registers to real machine registers. The way this particular process works is that it does a backward pass: it goes through all the instructions in reverse and does that mapping, and it deals with one block at a time.
We call this a local register assigner for that reason, because we only deal with a single basic block at a time. The other important thing about this kind of register assignment technique is that it doesn't really know anything about control flow, which is why it can't cross block boundaries: if it comes across some kind of control flow, it doesn't track register state in the different branches of that control flow.
The register assigner has to obey the register dependencies as well, and in the case of Java it also has another duty in building the register maps, because it understands whether a register holds what we call a collected reference or not. I'm not going to talk much about that right now, but it does have other tasks that it must perform during register assignment. So, just a quick example of how this actually works.
Let's assume for the moment that we have three real registers in this particular architecture, called x, y, and z, and that at this particular point in the register assignment (we're walking backwards, as I said this technique works) we've already gone through and assigned registers to r203 and r204. In this particular case, let's assume they've already been assigned to real register x for r203 and real register y for r204.
The way this will work is that we get to this particular instruction and we see that there's one register that needs to be assigned here, r207, and the question is: is there a register available for it? The answer is yes, because register z is still available, so we can give it register z, and we're happy. So we go on to the next instruction.
Here's an example of an instruction that has a destination (a target, the register definition) and also registers that it's using. The way this works is that we actually process the definitions first. So the first register that we're looking to assign is r207, and it turns out we've already got a register assigned to it, z. That's great, but the thing to notice here is that register r207 is no longer used if you keep looking up in the program: this is the last use of register r207. So after this point it doesn't need to hang on to the assignment of register z. After it's done this assignment, it basically says, I'm done with register z, somebody else can use it, and it frees it up for someone else to use.
It's a definition, so this is going to be its def point here. The register assignment then proceeds on to r205 and r206; neither of those has a register assigned to it yet. Let's start with r206: z is available now, so let's use register z. We now move on to r205, but the problem we have here is that x, y, and z, our entire register set, are all actually in use, so we don't have a free real register to assign to r205.
So what we end up having to do is use the spilling mechanism to make this work. The way that works is we say, okay, let's pick a register; most backends have got some heuristics to figure out which register is best to spill, but in this particular case we chose x, that's the register that we want. But because it was already being used, there's more to do.
What we have to end up doing here is insert a load from some spill slot into register x, because all the code beneath us had already assumed that that value is in real register x. So what we're actually inserting here is a load from a spill location that we haven't stored into yet. We now say that r203, which had x, is in a spill state; r205 is now given the real register x, and we're happy.
r203 here is a bit of a challenge because it's in a spill state; it actually isn't sitting in a register right now, so we're going to have to find a register for it. The other thing I wanted to point out (just going back here): this is actually the first use of r206, so it dies at this point, and after we've assigned z to it in this case, z is available for assignment again.
So in this particular case, r203, which we're looking for a register for, is going to get register z. But since r203 was in a spill state, we actually have to insert a store of z into that spill slot, because a little bit later on we're going to reload it back into x. This is where that symmetric spilling comes into play: we have the store to the spill slot matching the load from it, and we can carry on with our register assignment in this particular case.
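The backward local pass can be sketched as a toy model: walking the instructions in reverse, a virtual register gets a real register the first time the walk sees it (its last use), the real register is freed at the virtual's definition, and when the pool is empty a victim is spilled. This deliberately omits inserting the actual spill load/store instructions and uses a naive victim heuristic; it is a model of the mechanism, not OMR's register assigner.

```python
# Toy backward local register assigner. Instructions are (def, uses) pairs
# of virtual registers, in program order; we walk them in reverse.
def assign(instructions, reals=("x", "y", "z")):
    free, mapping, spills = list(reals), {}, []

    def real_for(virt):
        if virt in mapping:
            return mapping[virt]
        if not free:                       # pool exhausted: spill a victim
            victim = next(iter(mapping))   # naive choice; backends use heuristics
            spills.append(victim)
            free.append(mapping.pop(victim))
        mapping[virt] = free.pop()
        return mapping[virt]

    out = []
    for dest, uses in reversed(instructions):
        d = real_for(dest)                 # process the definition first...
        free.append(mapping.pop(dest))     # ...then free it: above its def,
                                           # the virtual does not exist
        out.append((d, [real_for(u) for u in uses]))
    out.reverse()
    return out, spills

instrs = [("r205", ["r203", "r204"]),
          ("r206", ["r203", "r204"]),
          ("r207", ["r205", "r206"]),
          ("r208", ["r207", "r203", "r204"])]
assigned, spills = assign(instrs)
```

With only three real registers, four virtuals are live at once between the second and third instructions, so one of them (here r203) ends up spilled, which mirrors the situation walked through above.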
So this particular example really demonstrated a lot of the real-world things that a register assigner has to deal with, and I guess that's it for register assignment. Alright, I can pause there if there happen to be any questions on that, because I did go through it fairly quickly.
[Attendee question, not transcribed.]

Oh yeah, I think you're right. I would say at least 90% of the DNA in the register assigners is the same. I think we have had some discussions in the past about commoning up a lot of the functionality between the different architectures, because they are basically behaving the same way. There are some architectural differences that do come into play, though.
Things like the way special implied registers happen to be used within certain instructions on certain architectures, or special ways of choosing a register, like being able to use the upper half of a register, that kind of thing. But even having said all that, I think a lot of that could be abstracted away, and we could actually do a much more general job of doing this now.
Having said that, though, it's going to be a big task, because each of these backends has invested quite a bit in its own implementation of these, even though they are basically doing the same thing. So reimplementing this in a generic way is going to be a tedious task, and perhaps an error-prone task as well, so we'd have to be super careful about doing that. But yeah, you're right that there's not much fundamental difference between them.
Okay, so there's one last little phase that we have to go through in producing code, and it's really where we take those assigned instructions and actually write them out into a buffer; in terms of the compiler technology here, we call that a code cache. So we go through this phase called binary encoding, and this is very architecture-specific: for each of those virtual instructions, depending on what opcode is in use, it produces the actual binary encoding of that instruction.
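A toy sketch of the mechanism: each instruction appends its encoding to the code buffer, and a forward branch records a fixup that gets patched once the target label's offset is known. The byte values are loosely x86-flavored (0xEB is a short jmp, 0x90 a nop), but the point is the displacement fixup, not any particular ISA; OMR's real encoders are far more involved.

```python
# Toy binary encoder with forward-branch displacement fixups.
def encode(instructions):
    buf, labels, fixups = bytearray(), {}, []
    for ins in instructions:
        op, *args = ins
        if op == "label":
            labels[args[0]] = len(buf)     # remember the label's offset
        elif op == "nop":
            buf.append(0x90)
        elif op == "jmp":
            buf.append(0xEB)
            fixups.append((len(buf), args[0]))
            buf.append(0x00)               # displacement patched later
    for offset, target in fixups:
        # rel8 displacement is relative to the end of the jmp instruction
        buf[offset] = (labels[target] - (offset + 1)) & 0xFF
    return bytes(buf)

code = encode([("jmp", "L1"), ("nop",), ("nop",), ("label", "L1"), ("nop",)])
```

The jump over the two nops encodes with a displacement of 2, filled in only after the encoder reaches the label and learns its offset.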
If you've got branches within a method, it needs to be able to figure out the branch displacements and that kind of thing. For the case of Java, it has to handle the ahead-of-time compilation relocations, that kind of thing, and it also handles things called snippets, which are really small out-of-line sections of code that may be needed in some exceptional cases and for some opcodes.
So that's really the whirlwind tour of the compiler technology. I really just assume this is going to be the start of a lot of discussions that we're going to have on this; it's just to give everybody a flavor of what's possible, with a bit more focus on the code generator, which is what people here are really going to be interested in.
Looking forward, the kinds of things I'd like to talk about next (I don't want to do that today; I'd like to maybe schedule something next week, when more people are available) are the conceptual integrity around some of these different components. So, things like: what does the Machine class do in the code generator? What is the CodeGenerator class supposed to do? What is a register? What is an instruction? How do register dependencies work?
No questions? Okay, well, if there aren't any questions, I think the next step here is that I'd like to convene another call, perhaps on Monday evening, Eastern Time, to start talking through some of the other components of the code generator in a little bit more detail; like I said before, things like the Machine class and the Linkage class and the CodeGenerator class, that sort of thing. I'd also like to talk a little bit about how the code itself is structured in the code generator.
It's
a
way
of
architecting
the
class
is
to
make
them
extensible
and
I'd
like
to,
if
you've
never
actually
seen
that
kind
of
code
before
it
may
strike
you
as
a
bit
odd.
There
are
good
reasons
for
doing
what
we
did
and
I'd
like
to
explain
those
hopefully
not
try
to
overly
justify
them,
but
but
try
to
explain
what
they're,
what
it's
actually
trying
to
accomplish
and
how
you
architect,
the
class
that
uses
extensible
class
in
in
that
in
that
format.
So
I
think
I'd
like
to
talk
about
that.