From YouTube: OMR Compiler Architecture 20190221
Description
A description of a new benefit-driven inliner developed as a research project at the University of Alberta, and a discussion around contributing it to Eclipse OMR.
A: Okay, so welcome everyone. Today we have a sort of special topic: Andrew Craik and Erick are going to talk about some inlining technology that was under development as a research project with the University of Alberta, and the intention is to contribute that to OMR if possible. The purpose of today's meeting is for Andrew and Erick to take us through the technology and for us to understand the best way of integrating it within OMR. So I'll turn it over to Andrew. We've got some slides, yeah.
B: And they should be on the share for anybody who's looking at the WebEx. Thank you for the introduction, Daryl. I'm going to do pretty much most of the talking today. This work is a joint research project between IBM and the University of Alberta, and the heavy lifting on the implementation, as well as a lot of the innovations in how we do some of the inlining, is Erick's work.
If you look at what OMR has at the moment in the way of inlining, there is one inliner, and it is called the trivial inliner. It's a very basic inliner that works by inlining a limited number of small methods. It has a sort of budget heuristic, and it takes only small things to inline. It has no real concept of what the best thing to inline is; it just picks small things and inlines them. It's the minimum viable concept of an inliner.
The inliner in OpenJ9, by contrast, is very Java-centric, and its operation is full of heuristics. It does a form of guided eager inlining with some backtracking. It can miss opportunities, because it tends to go in a depth-first manner and has a budget: once the budget is used up, if it hasn't searched a particular path, that path won't even be considered.
B
It
has
a
single
metric
that
it
uses
to
judge
the
worthiness
of
in
lining
something,
and
that
means
that
it
can
conflate
a
small
method
with
a
low
benefit
and
a
large
method
with
a
large
benefit,
because
the
division
will
kind
of
result
in
the
same
answer
and
you
don't
have
a
good
way
of
choosing
between
them
and
the
code
is
relatively
convoluted
just
because
of
all
of
the
development.
That's
happened
over
time
and
it's
a
bit
hard
to
reason
about
and
control
the
inlining
in
some
circumstances
in
there.
So obviously this is not ideal for the OMR project. When we sat down to start this project, our question to ourselves was: how can we do better? If the one in OpenJ9 isn't one that we could generalize into OMR nicely, and the one in OMR isn't that smart yet, how can we build a better one? If you start from first principles, inlining provides a number of benefits to the optimizer. It reduces function call overheads:
B
When
you're
executing
the
program
we
call
less
stuff
and
it
provides
improved
opportunities
for
optimization
by
amalgamating
code
units
together,
you
can
discover
facts
that
you
cannot
discover
looking
at
them
in
isolation
and
therefore
you
can
do
better
things
to
the
code
to
make
it
go
faster.
Now,
inlining
can
also
have
negative
effects
right
if
the
method
gets
too
large,
it
can
be
hard
for
it
to
be
easily
compiled.
B
Now,
if
you,
if
you
look
at
the
current
state
of
the
art
in
in
liners,
not
just
in
Colmar
and
open
j9
but
sort
of
across
academia
and
across
industry,
most
of
these
pretty
much
all
of
these
in
liners
are
guided
using
sort
of
a
single
metric.
So
you
have
a
budget
of
some
kind
and
you
choose
candidates
to
inline
until
you
fill
your
budget
right,
so
a
standard
knapsack
packing
problem.
In setting out to look at building a new inliner, which was the goal of the research collaboration that we had, we wanted to separate the notions of cost and benefit. The cost is the amount of space, or the number of instructions, or whatever, that you're willing to grow the method by (your budget), and the benefit is how much better we think the program will be by having inlined that method.
So benefit is a measure not only of saving the function call overheads, but also of the opportunities for optimization that inlining can unlock. Benefit also necessarily needs to include a notion of relative execution frequency. If cost is just the size of things, having execution frequency factored into the benefit makes sense, because if you have two things with equal optimization opportunities, the one that's more frequently executed is the one that you want to inline. And we wanted to make the inliner guidance in the new implementation much more scientific.
If you go and look at the multi-target inliner, some of its decisions can at times appear rather magical; it requires some careful analysis of the code to understand why it chose to do what it did, and changing that decision can be tricky. So, the basis for this research project, background work we had done at IBM before we started this collaboration, was that we developed an algorithm to solve the knapsack packing with dependencies problem.
B
You
have
a
backpack
of
a
given
size,
you
have
objects
of
various
sizes
or
weights,
and
you
wish
to
fill
the
backpack
as
full
as
you
can.
That
is
a
solvable
problem.
It
has
well
known
algorithms
in
the
literature
now,
if
you
add
dependencies
between
those,
so
you
can
only
include
a
if
you've
included
B,
for
example,
there
are,
there
are
very
few
algorithms
that
can
can
solve
that
problem.
What we developed was an algorithm to solve that problem. Now, this algorithm hasn't been formally proven to be optimal in all cases yet, but in practice it does produce optimal solutions: we set it quite a lot of different problems to solve, including ones modeled off of inlining, and it was able to solve those problems and produce the optimal result. The algorithm is based on dynamic programming, and it uses two layers of backtracking to allow the optimization, during the search, to find the best inlining solution.
What I'm going to do is run very briefly through how this algorithm works, just so you get a flavor of how this knapsack packing problem is solved, because it's intimately tied to the representations we've chosen for the new inliner and to how the new inliner we built operates. The fundamental currency of this algorithm, the fundamental data structure behind it, is something called the inlining dependency tree, or IDT.
The IDT is derived from a call graph, but the thing you need to note is that each node is call-site-specific. In this example here, A calls B and C, so you have an A connected to a B and an A connected to a C; B and C call D and E, so you connect those too. If A were to call B twice, there would be two B nodes as children of A, where B1 would represent one call site and B2 the other.
We take this inlining dependency tree and we annotate the nodes with costs and benefits. The notation shown on the tree in the slide has the cost on the left-hand side of the slash and the benefit on the right, so node E has a cost of 1 and a benefit of 7. Therefore, if you have enough budget to inline E, you're probably going to want to inline it, because it's the most beneficial thing. There was a question in the room.
In the description of the algorithm as we wrote it, when you're building the IDT you give the algorithm the budget that you're going to work with, so for recursive calls the node is repeated as a child of itself out to the maximum depth that your budget can accommodate. It effectively does loop unrolling by representation.
So the costs have no transitive notion: the costs and benefits are isolated to the node itself. And for a given budget, we wish to pick the most optimal subset that we can to inline.
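To make that concrete, here is a minimal sketch of an IDT node, written in Python in the spirit of the prototype mentioned later in the talk; the class and field names are illustrative, not the actual implementation.

```python
# Illustrative IDT node: costs and benefits are per-node, not transitive,
# and inlining a node requires that its parent be inlined first.
class IDTNode:
    def __init__(self, name, cost, benefit, parent=None):
        self.name = name          # method at one specific call site
        self.cost = cost          # size of this method body alone
        self.benefit = benefit    # estimated benefit of inlining it alone
        self.parent = parent      # dependency: parent must be inlined too
        self.children = []        # one child per call site in this method

    def call(self, name, cost, benefit):
        child = IDTNode(name, cost, benefit, parent=self)
        self.children.append(child)
        return child

# The tree from the slide: A calls B and C, B calls D, C calls E.
# Node E's cost 1 / benefit 7 comes from the slide; the rest are made up.
A = IDTNode("A", cost=1, benefit=1)
B = A.call("B", cost=1, benefit=2)
D = B.call("D", cost=1, benefit=4)
C = A.call("C", cost=1, benefit=3)
E = C.call("E", cost=1, benefit=7)
```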
The search is a dynamic programming algorithm. It works with a table, and it considers the nodes in the IDT in a post-order traversal, taking siblings at the same level from lowest benefit to highest benefit.
We're going to run this algorithm for a budget of five, so we've created a table with five columns: column one is the solution for a budget of one, column two is the solution for a budget of two, and so on. We then begin considering nodes in the order that I described. First we consider node A. Node A has a cost of one, and for all columns it is the only node that we've considered so far, and we have enough budget.
B
Therefore,
we
will
align
it:
okay,
considering
node
B,
okay,
so
node
B.
When
we
look
at
column
one
for
node
B,
we
can't
inline
node
B
on
its
own,
because
B
depends
upon
a
so
when
we
go
to
try
and
put
B
in,
we
have
to
subtract
B's
cost
from
the
available
budget
and
then
look
at
the
previous
solution.
If
we
subtract
these
cost
from
the
budget,
the
remaining
budget
is
zero.
There's
no
way
that
we
could
fit
it
in
there
for
the
solution
in
column.
One
is
a
in
column.
In column two we have B: we subtract the cost of B, which is one, and we look at the solution in the previous row at column one, which is our remaining budget, and which was A. Can we combine A and B? Is the solution valid, that is, are all of the dependencies of B included in that set so that we can graft B onto it? The answer is yes; therefore the best solution so far at budget two is AB, and that holds for the remainder of the columns.
So it's looking at the previous row, which is what these arrows indicate. For node D, a similar operation happens. In column two you still have AB, because there is no way to fit D in: if you consider D and look back at column one, taking one off, you have A; well, you can't have AD, because D's dependency B is missing, so then I have to consider D together with its predecessor B. So now I want to inline BD.
The cost of that is two, and that uses up all of the budget, so there's no way to possibly inline BD: the best solution at two is still AB. At three, when we look back at column two in the row before, AB is there, and we can graft D onto that without violating the dependencies, so its solution is ABD. Likewise for C. Now, the more interesting case is when we consider node E: look at the last row, row E, and look at column three.
We begin by considering E, which has a cost of 1, and we look back at the row before, at AB. But we can't graft E on there: the dependency C is not included in that set, so it's an invalid solution. So we include E's parent: now we're wanting to consider inlining CE. We consider CE and compare that with (sorry, that arrow should be pointing at row C, not row D)
row A at column 1, and you end up with ACE as the solution. And this shows that we've actually backtracked. The interesting thing is, if you look down column 3: when we had only considered up to C, we had decided we wanted ABD, which is down one side of the tree. The backtracking has undone both of those decisions and now selected ACE as the path to inline.
The other case has to do with whether there's an overlap. If you had, say, AC as a solution and you wanted to graft CE onto it, you'd find out there's an overlap; that overlap indicates sub-optimality, and then you continue backtracking up that column. So there are these other ways of grafting the solutions.
D: But because there's only one edge ever coming in...
B: Yes, it's kind of implicit where it comes from: you can derive it just by having the set of nodes. And because it's a tree, you know that to inline anything you have to inline the root, and then the rest is paths from the root to the various nodes, which may pass through things in the subset; but none of the paths in the solution will pass through a node that's not in the solution.
That's because of the dependency part of the algorithm. Okay, so that's the basics of how this algorithm works. This was an input to the research project: it was formulated sort of in isolation, with the idea that it would be useful for inlining, but we hadn't actually built an inliner that used it. We had an implementation in Python that we used to prototype it, and in that prototype I had cost numbers and benefit numbers.
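To give a flavor of what such a prototype looks like, here is a simplified Python sketch of the dependency-constrained packing over the IDT sketched earlier. It is a reconstruction from the description above (a table indexed by budget, nodes considered in order, and a candidate that grows upward through its unsatisfied dependencies), not the actual algorithm, which handles overlaps and backtracking more carefully.

```python
def pack_idt(root, budget):
    """Sketch: best set of IDT nodes to inline within `budget`.

    table[b] = (benefit, frozenset of nodes) is the best valid solution
    found so far at budget b; valid means every node's parent is present.
    """
    def walk(n):
        yield n
        for c in n.children:
            yield from walk(c)

    table = {b: (0, frozenset()) for b in range(budget + 1)}
    for node in walk(root):
        for b in range(budget, 0, -1):
            chain, n = [], node
            while n is not None:
                chain.append(n)
                rest = b - sum(x.cost for x in chain)
                if rest < 0:
                    break
                prev_benefit, prev_set = table[rest]
                if n.parent is None or n.parent in prev_set:
                    # Dependencies satisfied: graft the chain on.
                    gain = sum(x.benefit for x in chain
                               if x not in prev_set)
                    if prev_benefit + gain > table[b][0]:
                        table[b] = (prev_benefit + gain,
                                    prev_set | frozenset(chain))
                    break
                n = n.parent  # missing dependency: backtrack up the tree

    return table[budget]

# With the example tree above and a budget of 3, the solution flips from
# ABD to ACE once E's large benefit is considered, as in the walkthrough:
benefit, chosen = pack_idt(A, 3)
print(benefit, sorted(x.name for x in chosen))   # 11 ['A', 'C', 'E']
```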
Cost numbers are a relatively straightforward thing to have some intuition about how to derive: they'll be a measure of the number of instructions in some way. It could be the number of instructions, say Java bytecodes, in the thing you're planning to inline, if you're doing inlining for Java; or it could be the number of nodes in the tree representation of the method you're planning to inline. It's just the cost of inlining,
how much this thing is worth in terms of code size. The thing that's a lot more magical, and was left as sort of an unknown in the design of this packing algorithm, is how you derive the benefit. What is the benefit number? How do you cook that up? That's really where Erick's research picked up: trying to figure out how to build a benefit number, how to compute the benefit.
B
Well,
there
are
really
two
components
that
we
want
to
consider
in
the
benefit
as
I
sort
of
stated
earlier.
We
want
to
consider
the
frequency
that
is,
if
two
things
have
the
same
cost
two
in
line.
We
wish
to
inline
the
one
that
is
run
more
frequently,
because
we
will
save
more
in
terms
of
the
call
overhead
and
the
other
one
is
optimization
opportunity
if
two
things
have
the
same
hotness
in
it
same
execution,
frequency
and
same
size.
B
We
would
like
to
inline
the
one
that
is
going
to
provide
more
opportunity
for
the
optimizer
to
optimize
it,
because
you
will
end
up
with
a
program
that
runs
faster
right
so
in
the
formulation
that
eric
has
the
frequency
of
a
call
within
a
method.
So
we
have
to
derive
a
frequency
ratio
right,
so
the
ratio
is
derived
as
the
ratio
of
the
inche
method,
entry
frequency
to
the
frequency
of
the
call
site,
and
in
java
we're
doing
this
just
with
the
standard
profiling
infrastructure
that
we
would
normally
use
that
block
frequency.
B
So
the
API
for
doing
that.
Our
API
is
that
already
exists
in
Omar.
That
ratio
can
be
calculated.
It's
calculated
for
a
single
call
right,
so
it's
a
single
node,
and
so
you
can
get
a
multiplicative
factor
if
you
follow
a
path
right.
So
you
know
if
going
from
A
to
B
is
ten
times
hotter,
but
then
from
there
you
go
to
something:
that's
cooler,
you!
You
know
you
get
a
fractional
multiplication
and
the
whole
thing
kind
of
works
out
as
you
do
it
transitively
across
the
tree.
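As a toy illustration of that transitive scaling (a sketch only; OMR's real block-frequency APIs are not shown here), assume each IDT node carries the ratio of its call site's frequency to its containing method's entry frequency:

```python
def relative_frequency(node):
    # Multiply the per-edge ratios along the path back to the root.
    # node.local_ratio (an assumed field) is call-site frequency divided
    # by the containing method's entry frequency, from block profiling.
    factor = 1.0
    while node.parent is not None:
        factor *= node.local_ratio
        node = node.parent
    return factor

# e.g. a call 10x hotter than its caller's entry, reached through a call
# running at 0.5x, contributes a combined scaling factor of 5.0.
```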
Essentially, we want to model which optimizations may be unlocked by the inlining of a method: where there's information from the caller that will allow the optimizer to do something good in the callee that we would not have been able to do in isolation. So the idea that Erick adopted and ran with was to run an abstract interpreter over the program representation, computing symbolic values, or constraints on values, so that we can model what we know about values in the program. In his current implementation, this is an abstract interpreter
B
Over
the
java
bytecodes,
we
chose
the
java
bytecodes
because,
in
the
context
of
open
j9
use
of
OMR,
the
generation
of
trees
is
expensive.
It
consumes
a
considerable
amount
of
memory
and
compile
time
and
so
writing
an
abstract
interpreter
to
traverse
the
trees.
Well,
we
would
necessarily
have
to
generate
the
trees
in
the
first
place,
which
was
prohibitively
expensive.
Now
you
could
do
it
over
trees,
but
for
reasons
of
efficiency
and
comparability
to
the
existing
inliner,
we
did
it
over
the
byte
codes.
This abstract interpreter is run starting at the root method, and you start constructing constraints on the values. When you get to a call, you begin interpreting the callee: you take the callee in isolation and interpret it, and what you're trying to do in that interpretation is find the opportunities for optimization. These are done sort of as pattern matches in terms of the operations that are being seen; I've got some examples of the ones that are currently looked for.
When we see one of those opportunities, we want to record the dependency of that opportunity on the parameters of the function. Say you're going to do branch folding: one side of the branch is a constant, and the other side is an expression that is dependent upon an input parameter.
We can derive a constraint that says: if the parameter is between value A and value B, or is equal to value X, or whatever, then this branch can be folded one way; and if it's constrained in another way, we can fold the branch the other way. What we do is record those in a table, and we store that as a summary. So what you end up with is a summary of potential optimization transformations, and constraints, in terms of the parameters, that will allow those opportunities to potentially be realized.
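A minimal sketch of what recording such an entry could look like, continuing the Python toy model (the names are hypothetical, and constraints are shrunk to integer ranges for brevity; the real implementation reuses value-propagation constraints, discussed below):

```python
# One potential-transformation entry in a method summary. A parameter
# absent from param_ranges is unconstrained ("free").
class OpportunityEntry:
    def __init__(self, kind, benefit, bytecode_index, param_ranges):
        self.kind = kind                      # e.g. "branch-folding"
        self.benefit = benefit                # unitless benefit estimate
        self.bytecode_index = bytecode_index  # where the opportunity is
        self.param_ranges = param_ranges      # param position -> (lo, hi)

# "If parameter 0 lies in [1, 10], the branch at bytecode 42 folds":
entry = OpportunityEntry("branch-folding", benefit=3, bytecode_index=42,
                         param_ranges={0: (1, 10)})
summary = [entry]   # a method summary is a list of such entries
```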
D: So interpreting and creating constraints: that does incur some overhead as well.
B: It does.
D: But I'm saying it has the same feel as walking over bytecodes: you have to walk over the bytecodes either way, and you generate nodes in one case and constraints in the other. So is that really that much less expensive? I understand you have data, so I'll wait for it.
B: If the abstract interpreter were written in terms of the trees, we would have to generate them, so we chose to skip that step and do it directly, for reasons of efficiency. But you could build one to work on OMR's tree representation directly; we just didn't do that because of the constraints in OpenJ9, where we were working.
D: Yes, that makes sense. So the other question I have is around the API.
B: So the API for calling into the abstract interpreter is defined now in terms of how you would pattern-recognize the opportunity that you are looking for. Currently those patterns are formulated in terms of the bytecodes, because the pattern matching is implemented in the abstract interpreter. So if you were to write a different interpreter, you would have to recognize the patterns that would allow you to perform an optimization; but the optimizations being recognized are optimizations that exist in OMR, so they are not Java-specific optimizations.
And so once we produce this summary table, we go back to looking at the call site. The call site has constraints on the actual arguments that are going to be at that call site for that particular call. You can intersect that set of constraints with the constraints for each potential optimization, and where the answer is that it unifies and matches, that is, the constraints in the table are satisfied, you can take those optimizations: those are the ones that have the potential for occurring. We store a benefit metric with each of those potential optimizations, which I'll cover; you can add those together, multiply them by the scaling factor, and you get the benefit of inlining that particular callee.
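Continuing the toy model, the call-site filtering and benefit computation might look like this (again a sketch under the same assumptions, including the earlier frequency factor):

```python
def satisfies(required, actual):
    # Both are (lo, hi) ranges: the argument satisfies the requirement
    # when every value it can take falls inside the required range.
    return required[0] <= actual[0] and actual[1] <= required[1]

def inlining_benefit(summary, arg_ranges, frequency_factor):
    # Keep only the opportunities whose parameter constraints are met by
    # what we know about the arguments here, then scale by frequency.
    total = 0
    for e in summary:
        if all(p in arg_ranges and satisfies(req, arg_ranges[p])
               for p, req in e.param_ranges.items()):
            total += e.benefit
    return total * frequency_factor

# The branch-folding entry above fires when argument 0 is known to be 7:
print(inlining_benefit(summary, {0: (7, 7)}, frequency_factor=2.0))  # 6.0
```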
If the answer is yes, it satisfies the constraint, and it does that for all of the parameter constraints for that optimization (because we have one for each of the parameters, we intersect each parameter); and if the answer is yes for all of them, then that optimization can be unlocked by the information in the caller if you inline that callee.
The interpreter handles four kinds of values. Erick has taken most of the constraints in value propagation and employed them, so the ones being used, which is what I have on this next slide, are: integers, in terms of integer ranges or specific values; strings, in terms of constant strings (there's a representation for that in value propagation, and we use that); objects, in terms of constraints on the class type and on null-ness; and arrays, with constraints on the array size as well as on the type of the elements.
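In the same toy spirit, those four constraint families might be shaped like this (hypothetical Python stand-ins for OMR's value-propagation constraint objects):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IntConstraint:          # integer range, or a specific value (lo == hi)
    lo: int
    hi: int

@dataclass
class StringConstraint:       # a known constant string
    constant: str

@dataclass
class ObjectConstraint:       # class type and null-ness
    klass: Optional[str]      # None: type unknown
    is_null: Optional[bool]   # None: null-ness unknown

@dataclass
class ArrayConstraint:        # array size and element type
    length: Optional[IntConstraint]
    element: Optional[ObjectConstraint]
```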
B: That method summary is stored, and it is independent of the context of the call: you've interpreted the method in isolation and produced symbolic constraints in terms of symbolic parameters, and that tells you there's a set of potential transformations such that, if the constraint for a given one is satisfied, the optimization can potentially happen. It's not a guarantee, but it means the information should be there.
When you encounter the call, you build the summary the first time if you don't have it. Then, when you come back to the call site, you have a set of constraints for the arguments at the call site, and you ask: does the set of constraints I have at the call site satisfy the constraints I needed for a particular optimization? That allows you to filter the set of potential optimizations down to a set that can actually happen,
B
Based
on
the
information
that
you
have
available
right
and
if
I
need
to
point
out
that
it's
not
you
can
have
false
positives
and
false
negatives
with
this,
and
that
will
just
mean
that
you
will
make
inlining
decisions
that
are
not
optimal
because
you
approximated
in
some
way
or
had
a
lack
of
information.
It's
not
a
correctness,
concern
right.
The precision of this abstract interpretation, whether, say, in a loop you compute a fixed point, or you do a single iteration, or multiple iterations, is a function of how much time you're willing to invest in the abstract interpretation, and you can get a more precise or a less precise answer. The less precise you are, the more likely you are to make a suboptimal choice; but if it's good enough most of the time, you will get a good answer.
The method summary table is computed once for a method. At the moment it's computed in each compilation, but in theory it could be shared across compilations if you wanted to: unless the method is redefined, the summary will remain the same, because it's produced by running an abstract interpreter, a state machine over the bytecodes, and the answer is independent of any context in which that method is being called.
C: So if you have an opportunity where the specialization spreads across more than one caller, deeper than one call: this parameter is passed as the parameter to a called method, but it's used in a callee of that callee?
B: Yes, you will catch that, as long as at some point in the algorithm you can find a way to inline the first one; and then, ideally, when you're evaluating the second one, the benefit will be high, because there will be this synergy.
So on the screen I have one of these sample method summaries, with just some examples of branch folding. It records a benefit metric for the branch folding (at the moment, in the Java bytecode abstract interpreter, that's the number of bytecodes that will be eliminated if that branch fold happens), and it records the location of the opportunity and the constraints in terms of the parameters. A blank means that argument is free: we don't need it to satisfy any particular value.
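A hypothetical rendering of such a summary, with made-up numbers, just to fix the shape in mind:

```
opportunity     benefit  location      param 0          param 1
branch-folding  4        bytecode 12   int in [0, 0]    (free)
branch-folding  9        bytecode 57   (free)           non-null String
```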
So we implemented this with the Java bytecodes in OpenJ9. At the moment the optimizations modeled are branch folding, null check elimination, checkcast elimination, folding of constant-length strings, and some opportunities for partial evaluation, where you would be able to do a compile-time evaluation of part of an expression because you have some constants that you can fold away. We've been able to run a large amount of stuff with this; it's managed to run DayTrader and things like that.
The metrics: the total CPU time consumed by the compilation threads, as reported by the verbose log (the vlog); compile memory, the total memory consumed during compilation, also from the vlog; generated code size, as in the number of bytes of instructions generated, also from the vlog (just a subtraction of the addresses specified as the range for the method); and the runtime, which is the time to execute the final iteration after the warmup period, i.e. a representation of the steady-state throughput.
There were three configurations in this evaluation. There was a baseline, which is the current OpenJ9 heuristic inliner, the multi-target inliner. There's what's labeled frequency, which is this new inliner algorithm with all the benefits set to one: basically just inlining based on frequency. And then there's analysis, which is the new inliner using both the frequency scaling and the abstract interpretation, so we're adding the cost of the abstract interpretation in the hope that it will do something good.
Okay, so run time. All of these bars are normalized to the baseline, so all the baseline bars (the dark blue, leftmost bars) are 1. The middle, orange bars are the frequency inliner, and the gray bars are the analysis inliner, where we added the abstract interpretation. Lower is better:
B
So,
if
you're,
if
you're
running
for
longer
you'll
have
a
higher
score
right,
so
fought
is
not
doing
as
well,
because
it's
running
slower
right,
zahlen,
right
we're
so
generally
it's
roughly
on
par,
there's
Lu
index
and
fought
where
it's
somewhere,
but
between
10
and
20%.
Worse.
In
terms
of
run
time,
sorry.
D
B
B
Okay, so that's the runtime: in general it's on par with the current heuristic inliner. There are two outliers, fop and luindex, where it's not managed to do as well. Note that we were modeling a very limited set of optimizations, so optimizations that may have been important for those benchmarks may not have been modeled: luindex, for example, is a very loop-intensive benchmark, and beyond the frequency there was nothing modeled that specifically targets the things luindex would need. Now, compilation time again:
this is normalized to one, and lower is better. For example, on lusearch the new inliner consumed about 70% of the compile time of the current inliner-based solution; on sunflow it was getting up towards 2x the compile time, and I'll comment on that afterwards.
D: Does this run at warm?
B: You give it its budget, and it's going to stick to that budget; it's not going to increase it or decrease it depending on what it sees. So it can be a bit of a moving target that you're trying to compare to. Erick spent quite a lot of time trying to get the sizes as close as possible in terms of budget, but it's certainly possible that there are variations happening that weren't nailed down.
We studied the top-most methods, but there's still variance that could occur. I just wanted to comment on a few things from that analysis. In general, the new inliner is more expensive than the current OpenJ9 inliner in terms of compile time and memory. There are a couple of things I want to call out. First, the current inliner does not do a full exploration of the state space: it is an eager inliner.
B
The
compile
time
is
generally
comparable.
We
had
one
case
where
it
was
sort
of
2x,
but
a
lot
of
those
bars
were
very,
very
close
and
in
some
cases,
lower
right
so
in
general,
is
doing
fairly
well
on
that
the
memory
was
in
within
20
percent
of
the
baseline
and
again
considering
that
it's
doing
a
full
state
space
exploration.
Some
growth
is
to
almost
be
expected.
B
The
abstract
interpretation
is
relatively
cheap
right.
The
difference
between
the
orange
bars
and
the
gray
bars
in
terms
of
the
memory
and
the
compile
time
right,
there's
not
a
significant
difference
right,
you
don't
it
doesn't
cost
you
that
much
to
run
the
interpreter.
Now,
obviously,
as
you
add
more
patterns
or
more
complex
patterns,
the
cost
will
go
up,
but
the
majority
of
the
costs
at
the
moment
is
doing
the
state
space
exploration
with
the
gang
algorithm.
B
Now
the
inline
new
in
liner
can
produce
the
same
performance
with
less
code,
so
we
saw
that
with
like
lieu
index.
It
produced
something
like
about
25%
less
code,
but
it
ran
at
the
same
throughput
and
the
runtime
performance
is
generally
pretty
good.
Considering
the
number
of
optimizations
modeled
and
the
lack
of
any
Java
specific
heuristics
right,
the
the
baseline
has
several
decades
worth
of
knowledge
of
Java
and
what
things
are
good
to
do.
15
liner
has
none
of
that.
B
Now,
there's,
certainly
a
lot
of
room
for
this
to
continue
being
expanded
upon
the
research
collaboration
between
IBM
and
the
University
of
Alberta
is
continuing
and
Kareem
Ali,
who
is
Eric's
supervisor
in
the
principal
research
investigator
at
the
University
of
Alberta
on
this
project
and
I
have
discussed
that
there
will
probably
be
another
master's
student
starting
later
this
year
to
continue
work
on
on
this
inliner.
A
lot
of
the
information
propagation
that
I
was
describing
is
downward
information
propagation,
so
I
know
something
in
the
collar
I
wish
to
use
it
in
the
Kali.
B
It's
not
something
that
we
currently
have
and
I
don't
expect
that
it
would
be
something
that
the
University
of
Alberta
would
produce
at
this
time,
because
they're
interested
in
the
questions
around
how
to
model
optimization
cheaply
how
to
guide
the
inliner
in
a
better
fashion.
The
bad
abstract
interpreter
is
more
engineering
less
interests,
so
we
would
like
to
contribute
this
work
to
omar,
so
the
proposal
for
how
to
contribute
this
would
be
that
the
core
of
the
inliner
would
be
contributed
to
omr,
and
that
would
be
the
knapsack
packing
algorithm
implementation.
B
That
would
be
the
all
of
the
code
for
basically
doing
the
unifications
of
the
method.
Summaries.
The
interfaces
for
the
method,
summaries
all
that
stuff
and
to
contribute
an
abstract
but
unimplemented
api.
For
the
abstract
interpreter,
so
being
it
for
so
that
the
inliner
has
something
to
call
to
be
able
to
do
the
abstract
interpretation.
But
there
would
be
no
implementation
there
in
omar.
B
B
Erick, I believe, has written it with an abstract API in mind: there's a well-defined set of connections between the abstract interpreter and the inlining machinery that's actually driving this, so that API exists. The API for doing the recognition of the patterns, I don't believe, currently exists; that's kind of too tied up in the abstract interpreter, so that would be something to pull out later on. But the actual "here's the set of hook points that you need to call into the abstract interpreter to do this thing" exists.
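As a rough sketch of what an abstract-but-unimplemented interpreter API could look like (hypothetical names, not the actual hook points in Erick's code):

```python
from abc import ABC, abstractmethod

class AbstractInterpreterAPI(ABC):
    """Hypothetical shape of the hook the core inliner would call.

    OMR would ship only this interface; a downstream project (OpenJ9 over
    Java bytecodes, or an interpreter over OMR trees) implements it."""

    @abstractmethod
    def summarize(self, method):
        """Abstractly interpret `method` in isolation and return its
        method summary (potential optimizations + parameter constraints)."""

    @abstractmethod
    def argument_constraints(self, call_site):
        """Return what is known about the actual arguments at `call_site`,
        for intersection against a callee's summary."""
```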
B: Let's put it this way: the current trivial inliner does the absolute minimum as far as inlining is concerned. There is no large-scale inliner in OMR; all it has is the trivial inliner, so what you can get from the trivial inliner is quite limited. This would be a significant advance on that. For OpenJ9 the equation is a little bit different, because they have the multi-target inliner that's heavily tuned for their language. Yeah.
B: So if you look at the optimization strategies in OMR at the moment, they have one inlining entry. If you look at the optimization strategies in OpenJ9, there are two versions of inlining (actually three, but we'll leave the third one out): there's one that uses the multi-target inliner, and that's the one that you find in warm, hot, and scorching, and there's one that appears in the cold strategy, I think, in OMR.
D: The list of optimizations that are currently modeled is this list here; these five transformations are what it looks for, and in terms of partial evaluation, that's a subset of all the things you could do, I guess. Is there a plan to make that more exhaustive?
B: Yeah. Well, as I was saying, for the follow-on master's project at the University of Alberta we want to try modeling something much more sophisticated.
B
The
current
hope
is
to
model
escape
analysis
within
the
current
context,
this
kind
of
Java
specific,
but
would
provide
the
model
for
how
to
deal
with
sort
of
heap
based
things,
which
is
basically
the
upward
propagation
and
the
heap
are
the
two
things
that
I
sort
of
view
is
not
being
represented
in
this,
and
the
extension
of
the
work
is
to
figure
out
from
an
academic
perspective
how
to
make
that
work.
A: So if a particular project using OMR wants to... they have their own set of language- or environment-specific optimizations, so can they actually add their own? These work on the generic optimizations, but what is the mechanism by which the language- or environment-specific optimizations can participate in this?
B
Well,
all
you
need
is
an
entry
in
the
method
summary
table
right,
so
it's
just
an
entry
in
the
method
summary
table,
so
there's
just
a
kind
and
a
benefit
number,
and
the
only
thing
that
the
that
the
inlining
algorithm
actually
really
cares
about
is
the
benefit
number,
which
is
what
uses
to
drive
its
choices
and
the
constraints.
So
if
you
have
language,
specific
optimizations
and
a
language
specific,
abstract
interpreter,
that's
looking
for
those
opportunities
right,
you
would
be
able
to
just
inject
the
opportunity
into
the
method
summary
table
and
away.
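In the toy model from earlier, that injection is just appending an entry; the kind is opaque to the core inliner (names hypothetical):

```python
# A language-specific interpreter that spots an opportunity only its
# language understands participates by appending a summary entry; the
# packing algorithm reads only the benefit number and the constraints.
summary.append(OpportunityEntry(
    kind="my-language-devirtualization",   # opaque to the core inliner
    benefit=12,
    bytecode_index=7,
    param_ranges={0: (0, 0)},              # e.g. a known type tag of 0
))
```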
And the frequency inliner that's being used is basically doing that: it runs the abstract interpreter, so that the compile times are equivalent for evaluation, but it then forgets the answer. It does all the math, computes the answer, then says "it was one" and proceeds on that basis.
B: It is. We had various points where there were bugs in the implementation, where the frequencies weren't being calculated correctly, and you ended up in a world of hurt: you know, 50 or 60 percent down in terms of performance. There were implementations of this that were in that neighborhood of awfulness, when there were mistakes in how the math was being done and the answers weren't great.
B: Flow-insensitive, yes, I believe so. And one of the ideas being discussed is that we could, in theory, run the flow-insensitive version to prune off the least interesting parts of the state space, and if we wanted to try to do better on things that we thought were important, we could then redo it flow-sensitively.
You could skip simulating it, or you could ignore the entries from the opt table, depending on how much compile time it takes to model it and how much compile time you have, or whether you just don't want to factor the benefit in because it's not something you're going to run. You're always free to ignore an entry in the summary table, to say "I'm not going to do that" or "I don't believe you."
D: At the other end of the spectrum, the only other thing that I think comes close to this is the register pressure simulator, which runs very, very late. It does try to simulate a very complex process, but, to be honest, in some sense this is harder.
C: Having a facility so that you can verify, when you make a change to the inliner, what the impact of that change has been on the things that we run normally would be nice, even if it's something that a human has to go and review. Something that runs only the benefit analysis could get run as part of a pull request: when you recognize someone's changed the inliner, you could request that that test be run.
B: Oh yeah.
G: To build on Mark's question: one of the other challenges we have, in terms of serviceability of the JVM in this context, is reproducing non-deterministic bugs, and it seems that frequency and the inlining decisions are a very large factor in determining whether you're going to reproduce something or not.
C: It generally goes back to that enabling question, or the labeling point you were making: not just in the sense of what gets presented to the compiler, but in the sense of what the inliner decides to do. Forcing it to do a consistent thing would, again, reduce those conditions quite a bit.
H: Sorry, modeling branch folding is actually quite easy. In Java code, the arguments are placed in the local variable array at the beginning of the method, so we keep track of the variable array and the argument values, and whether the argument values have been overwritten in the variable array. That's just keeping track of the arguments, and that has to be done basically always. What happens, then, when we are abstractly interpreting a branch, or any statement:
well, then we can ask: is the value being tested an argument? If the answer is yes, then we can say: okay, perfect, we know that, depending on the argument, the branch will fold one way or another. And then, from the caller, we can inspect the argument and say: the value of the actual argument we are passing is going to be this one; does it pass the test in the method summary?
B: Branch folding is recognized at the point in the interpreter where you're interpreting the branch: all you're looking at is the arguments of the things being compared, and whether we can express that as a test in terms of the argument. If the answer is yes, we record the possible choices; if not, we move on. All of these optimizations are hyperlocal in that sense; they're very constrained in the space in which they happen, so picking them up is quite simple.
A: I just wanted to get a little more of a concrete sense of that. What I was going to ask was: can you talk a bit more about the costs, how you derive those, and how we differentiate them across different architectures?
B: The cost is Java bytecodes,
B
Is
Java
byte
codes
in
the
minimum
number
of
java
bytecodes?
It's
the
same
currency
that
the
multi-target
in
liner
in
open
j9
uses.
So
we,
its
budget,
is
in
terms
of
nice
design.
Its
notion
of
size
is
based
on
byte
codes
and
we've
set
the
budget
in
terms
of
byte
codes
as
the
number
of
nodes
that
you're
going
to
allow
in
the
method
for
purposes
and
compilation
and.
C: Kind of related to that: it's weighing cost versus benefit, so is it assumed that the benefit is expressed in something comparable to a number of bytecodes? Does it assume the abstract interpreter is translating the benefit into a consistent thing that trades off against the cost? Who decides how much of a bytecode of cost is worth how much benefit; is that there?
B: There is no notion of that, because what you're saying in running the packing algorithm is: you have a budget in terms of bytecodes, and you have a benefit number that is an abstract number.
C: Does the solver actually compare costs and benefits, or does it look at them independently?
B: It looks at them independently. It's trying to get the largest total benefit for the given budget of bytecodes. The really nice property of this is that there isn't a relationship between the benefit and the cost.
B
At
the
moment,
some
of
the
benefit
numbers
are
derived
in
terms
of
number
of
byte
codes
eliminated,
but
that
that's
not
something
that
any
other
part
of
the
algorithm
knows
anything
about
have
to
know
it.
The
goal
is
just
a
benefit
number.
That
number
is
unitless,
it's
just
I
figure
is
better.
Okay,.
B: So, at the moment, if you think a transformation is going to save a large amount of compute time, you should be giving it a bigger benefit. The benefit estimation for branch folding is currently based solely on the number of bytecodes, but that's just what's implemented in that abstract interpreter: you could add more for calls, or more for whatever, say object allocation,
and it would bias the inlining to pick those bodies where you're going to eliminate branches that eliminate object allocations and costs. And while we don't have the data right now, the final part of Erick's thesis is actually doing a statistical analysis to show that the things we've incentivized are being incentivized: when we say we want to do branch folding and we want to fold more branches, that's what the inliner is actually doing.
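In the toy model, that biasing would just be a per-kind weight applied to the raw estimates (illustrative numbers only):

```python
# Hypothetical per-kind weights: eliminating a call or an object
# allocation is deemed worth more than shaving bytecodes off a branch.
KIND_WEIGHT = {
    "branch-folding": 1,
    "call-elimination": 10,
    "allocation-elimination": 25,
}

def weighted_benefit(entry):
    # Scale the raw estimate (currently: bytecodes eliminated) by how
    # much run-time compute the transformation is expected to save.
    return entry.benefit * KIND_WEIGHT.get(entry.kind, 1)
```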
B: All right. So there are two talks' worth of PowerPoint slides, as well as Erick's master's thesis, which is being written, describes all of the algorithm in excruciating detail, and will have all the evaluation; and there are plans for an academic paper. When it comes down to the code that's been chosen to be contributed, I'll defer to Erick on the quality of that; we can set the contribution bar on that as appropriate.
H: So I was thinking about starting as soon as possible, in the sense of just deciding where the structure, the new inlining classes, should sit, and after contributing that, the contribution of the abstract interpreter, as you mentioned. It still needs to be polished in terms of code, but it's just a matter of polishing it a little bit and pushing it, and that can be done. I don't know; maybe we submit a pull request and continually iterate over it? So I think, in terms of...
B: I'm not going to claim that it solves the world's problems. However, my personal opinion is that it is a significant improvement over the current state of affairs, and it has significant scope for being enhanced in ways that are understandable, maintainable, and extensible, unlike the current inliners that are candidates for inclusion in OMR beyond the trivial inliner,
which are limited, and quite expensive because they do tree-gen. That's the reason we didn't do tree generation for this abstract interpreter: that cost. Now, for OMR it may make a lot of sense to build the interpreter over the nodes, because being able to run an interpreter over the nodes, even with the cost of generating them, means that all languages would get one by default; and if you want a better one, then you can create your language-specific one.