From YouTube: OMR Compiler Architecture Meeting 20190523
Description
Agenda:
* Concurrent scavenge read barrier patching (#3847) [ @yanluo7 ]
* Next steps for RISCV OMR compiler [ @shingarov ]
* Formalization of IL semantics [ @shingarov ]
B: To justify our design: for CS (concurrent scavenge) we already have a working implementation where the scavenge phase of the GC happens concurrently with the application threads. The idea is that instead of having one big pause in which we do all the GC tracing and copying, we amortize that effort and spread it out across the application threads while they are running. That eliminates the pause, and at the same time we have some background threads helping the application threads out, picking up any slack they leave.
B: We enforce a rule where, whenever a Java application thread tries to load a reference field from an object on the heap, we emit a sequence of instructions called a read barrier, where we will try to garbage collect the referenced object — the object that the field points to. So we perform a range check in the compiled code, and if the range check fires, it performs a call to the GC, where the referenced object gets collected or copied into the appropriate place on the heap.
B: Currently, the instructions we generate in the mainline of the JIT'ed code consist of a variant of a load of the referenced field, and then a range check — a series of compare instructions where we compare against the heap base and the heap top. If the address falls outside that range, we fall through to the next mainline code; if it falls within the range — the region being evacuated — we call out of line to the GC code to do the collection. We want to make that faster.

B: One way to make it faster is, we're thinking of basically patching out this piece of code, which is all over the generated code cache, by replacing it with no-ops.
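The mainline sequence just described can be sketched as plain C logic. This is a simulation with illustrative names (`heapBase`, `heapTop`, `gc_read_barrier`), not the actual generated instructions:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative globals standing in for the region bounds that the
 * generated compare instructions check against. */
static uintptr_t heapBase = 0x1000;
static uintptr_t heapTop  = 0x2000;

static int barrierCalls = 0;

/* Stand-in for the out-of-line call into the GC that collects/copies
 * the referenced object to the appropriate place on the heap. */
static uintptr_t gc_read_barrier(uintptr_t ref) {
    barrierCalls++;
    return ref; /* a real barrier may return a forwarded address */
}

/* The inlined mainline sequence: load the reference field, range-check
 * it against the region bounds, and only call out of line when the
 * reference falls inside the region being evacuated. */
static uintptr_t read_reference(const uintptr_t *slot) {
    uintptr_t ref = *slot;                 /* load */
    if (ref >= heapBase && ref < heapTop)  /* range check */
        ref = gc_read_barrier(ref);        /* slow path: call the GC */
    return ref;                            /* fast path: fall through */
}
```

When CS is inactive the check never fires, which is exactly the 90%-of-the-time dead path the patching scheme wants to eliminate.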
B: We have some data that sort of supports this. One piece of data is from the runs that we did.

B: To be clear on that: the duration where the CS range check instructions are actually useful, versus the duration where they are not — the measurements we did on X and on Z point to a duty cycle where the range check was useful only about 10% of the overall time.
B: So 10% of the time the range check instructions are useful, and 90% of the time we're not in any CS mode — i.e. they're not useful, but we still execute the range check; it never fires and we always fall through. That's one piece of data that supports this. We are trying to optimize for that 90%: if we can somehow get rid of the cost of the extra path length and the extra code-cache waste for 90% of the time, that should give us a handsome win.

B: The other piece of data is that even within the 10% of the time when the range check instructions are useful, the check only fires — and we call out to the GC — about 10% of the time. So only 10% of that 10% of the time do we actually need to call out to the GC. That's what these two pieces of data really tell us.
A: There was another data point that's worth bringing up, which was: if you take a garbage collector that doesn't require these range checks and you forcibly insert them, so that they always fall through and you never have to call out — the throughput penalty of just spraying those checks into the generated code on x86, I believe you said, was on the order of about 5 or 6%.

B: Yeah. The throughput overhead of our current CS implementation in gencon is around 10% on the three platforms, and we did experiments where we took regular gencon and tried to emulate the effect of CS: we bring those instructions in, but they never do anything — they just cause the extra path length and the extra code cache. Just by doing that we were able to reproduce most of the 10% regression.

B: Something like a 6, 7, 8% regression, reproduced just by adding those checks into regular gencon. So that's another piece of data that points to: just by doing this useless check, we slow things down quite a bit — hence, 90% of the time we are not really doing much of anything.
A: The knobs on x86 were, I believe, within the noise margin of the measurement — it was sub-1% when Victor measured it previously. He had done an implementation where there was a single knob in the code where the check would be, and with that knob off it was in the noise margin.

A: It's the path length. I think there are two problems there. One problem is that the heap base has to be loaded: if the heap size is not fixed, you have to load the heap base from somewhere, and there's an overhead for that. If it is fixed, there's still a conditional branch. The processor can continue speculative execution, but there are limits on what it can do while in that speculative-execution mode because of that conditional branch, and that speculation is a big part of the overhead.
B: All good stuff supporting our case — those are the pieces of data we try to rely on in making our design decisions. So we came up with the idea of basically having amortization strategies again. Because we know there are so many of these loads all over the code cache, we do not want to patch them all at once within one huge pause — that would just kill our response time. So we want to do it lazily, and we want to amortize the cost.
B: At GC start pause time, the GC will walk every Java thread's stack frames and record all the collectible references on the stack, marking them as the root set to commence the tracing of the object reference tree. Since there is already a mechanism to do this stack walk, we want to piggyback on it: while the GC threads are doing the stack walk, for every JIT'ed frame they visit, we want to patch that frame — patch that method body — right there at the CS start pause.

B: That's one piece. When the pause ends, the Java threads pick up where they were and resume execution, and those method bodies will have been patched back to the correct sequence, where the range check is actually present, so those methods can continue to run correctly. What about all the other methods? That's where the lazy patching comes in: we want to patch the methods that we didn't execute during the GC stop-the-world pause but that are going to execute after it.
B: So what is the state of a method? As far as concurrent scavenge is concerned, a method can be in one of two states: patched or unpatched. Unpatched means the range check instructions are in there for every reference load; patched means those range check instructions have been replaced with no-ops. At any given time a method body can only be in one of these two states. Having that notion, we can then look at what the global state means.

B: The global state is the GC signalling the JVM: "I'm going to be in the CS-active period" or "I'm in the CS-inactive period" — the latter being the 90% of the time. Basically, at the GC start pause the GC sets some global variable to CS-active, then it starts running concurrently, until the GC end pause, where the global state is flipped to CS-inactive. So now you have a method state and a global state.
B: The patched method state matches the CS-inactive global state: if CS is inactive, I don't care — I can just run no-op instructions in lieu of the read barrier, so they match. And the unpatched state matches the CS-active state. So at any given time a method can be in one of two states, and, on the other hand, at any given time the JVM can only be in one of the CS-active or CS-inactive states.

A: You have to interrogate some part of the runtime system, but the statement being made here is that there is a state the JIT believes the methods need to be in: either they need to have the read barriers running or not. If you use a bit in a control word stored globally that says whether we need them or not, then checking the state of the method against it is how you check the method bodies.
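The state-matching rule just described can be sketched as follows. This is a simulation with illustrative names, not the actual JIT data structures:

```c
#include <assert.h>
#include <stdbool.h>

/* The two per-method states and the global CS state described above.
 * UNPATCHED = range checks present in the body;
 * PATCHED   = range checks replaced with no-ops. */
typedef enum { UNPATCHED, PATCHED } MethodState;
typedef enum { CS_ACTIVE, CS_INACTIVE } GlobalState;

/* A method body is consistent with the JVM when:
 *   UNPATCHED (checks live)  matches CS_ACTIVE, and
 *   PATCHED   (checks NOP'd) matches CS_INACTIVE. */
static bool states_match(MethodState m, GlobalState g) {
    return (m == UNPATCHED && g == CS_ACTIVE) ||
           (m == PATCHED   && g == CS_INACTIVE);
}
```

A mismatch between the two is exactly the condition that triggers (lazy) patching of the method body.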
A: The use case here is CS, yes, but the hope is to use it for other things. I have designs on trying to use lazy patching to let us toggle patching on for profiling — to patch profiling code, software profiling, into method bodies. There again you would want a state saying "yes, we want profiling on; we will profile any method that runs, or some subset of methods that run," and you want that state to be toggled on or off in a similar way to what we are proposing for CS.

A: Yes — so having a general mechanism where you just load a control word and compare it against the method's state: you pay an extra load, but you get a lot of flexibility in having multiple different features enabled or disabled, and in calling out if the body does not match the state you expect.
E: And for this extra reason: if CS is not active — 90% of the time — you're not doing any work you weren't already doing; you're just doing the same check you were doing before, it's not going to fire, and you're not going to call out. There is one more cost when it does fail, in that 10% of the time, which is why I wanted to put it out there as something to think about: whether it somehow works out to be a net positive.
C: I think what I'm hearing is two things: one is the mechanism for invoking the feature, and one is the feature itself. It seems to me that, from the OMR perspective at least, the feature is being able to patch instructions and revert them back — something we currently do not support — whereas the mechanism for invoking that logic is OpenJ9-specific. So I think we should design this as the feature itself, which is patching and unpatching instructions, with the mechanism to invoke it kept totally separate.
A: OMR may wish to give some consideration to it, because if OMR is going to have a notion of software profiling — which I certainly hope will be the case in the not-distant future — there's some persistent method information infrastructure that needs to be promoted into OMR before we can do that. But we have software profiling techniques with very low overhead, developed in the context of OpenJ9, that are not Java-specific, and being able to toggle that profiling on and off, at least with some default, would be valuable.
A: To me it depends on what design we settle on. For the control mechanism, if it ends up conflated with the stack overflow check and whatever happens in there in OpenJ9, then that may be the way OpenJ9 wants to do it. You may also wish to have a more generic implementation in OMR, so that it's supported out of the box — maybe not as efficiently as when you carefully conflate it with your language's stack overflow check or whatever other things you do on method entry.
C: I mean, the infrastructure is: you call a method in OMR which patches an entire JIT'ed body — yep — while the mechanism you folks use may be language-specific, and underneath there will be different kinds of things that need to be done. So, you mentioned you're going to patch multiple instructions — do we have concerns about atomicity? That's under discussion.
B: We have options; we don't want to patch too many instructions. Currently on x86 we have a load, then a compare, and then a jump — basically compare-and-jump to the out-of-line sequence where we do the rest of the read barrier. So that's two instructions minimum we need to patch on x86. On Z, all the read barrier sequences are inline in the mainline, so there are something like seven or eight instructions — we don't want to patch that many, so at least some of them would need to be outlined. But going back to x86: we could patch those, provided that whichever instruction we patch, the patch is atomic.

B: The reason I think it would otherwise be functionally incorrect: we're basically patching the compare and the jump. If we patch only the jump and leave the compare in place, I don't think it matters — the compare on its own is harmless.
A: You're going to run patching code that patches the instructions to a fixed sequence — whatever that site is going to be — and that thread will then mark the control word as having been updated: update the state of the method, basically attach it to this configuration, and then begin execution. On x86, where you have ordering guarantees, all threads check the control word, and the last thing you write is the method's control-word update.

A: Multiple threads could enter there, and as long as the patching is idempotent — it will always do the same thing — then if multiple threads come in and go patch, they're all going to patch it to the same thing, and execution can continue. So whether you're patching one instruction or multiple instructions, no thread is going to enter the method until all the patches have been written, right?
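The idempotence argument can be illustrated with a toy patcher. The instruction bytes here are fake and the racing is not simulated; the point is only that every patcher writes identical bytes, so any number of them produce the same final contents:

```c
#include <assert.h>
#include <string.h>

/* A fake patch site: pretend these four bytes are a conditional jump. */
static unsigned char site[4] = { 0x0F, 0x84, 0x10, 0x00 };

/* The fixed replacement every patching thread writes: four no-ops. */
static const unsigned char NOPS[4] = { 0x90, 0x90, 0x90, 0x90 };

/* Idempotent patch: always writes the exact same value, so it does
 * not matter how many threads run it or in what order. */
static void patch_site(void) {
    memcpy(site, NOPS, sizeof site);
}
```

In the real system the write would additionally have to be atomic (or guarded so no thread executes the site mid-write); the idempotence property is what makes it safe for several threads to race to the patching logic at all.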
A: So there are various choices. If you're going to do the check in the mainline, versus doing the check on entry and patching on entry — if you check and patch on entry, then you can rely on multiple threads going out to the patching logic, so you don't have to worry about aligning absolutely everything.

A: Now, if you're going to do it while the program is running, then you'd better make sure you align the patch site according to the requirements of that platform. But it's also possible to refactor the current read barrier code so it doesn't require patching multiple instructions.
A: For example, on x86, when Victor was working on this, we did an experiment where we moved all the read barrier checks out of line — both of the range checks — and did a call and a return. Because of call/return prediction, the extra overhead of doing that was essentially unmeasurable, and it meant we only had to patch one instruction, so we didn't even have this problem.

A: It's just a call to an address — an out-of-line code sequence — so it wasn't a problem. Now, on some platforms a call and return is no good; a jump might be better. But again, we're coming down to the specifics of what it is you're actually going to patch, and I don't think the mechanism necessarily needs to be that specific.
A: They all have to go through there. But in the case where you're dealing with the CS cycle at the end — we were talking about the cycle that is ending — you're not going to walk the stacks back at that point; somebody can enter the method and patch the checks out at that point.
A: However, one question is, say on Power: if we have the inline sequence — a load, then a compare, and a branch conditional — let's say we leave the load and compare on Power as they are, and all we're going to patch out is the branch conditional. Whether that's performant or not, I'm not going to argue about right now.

A: If all you are going to do is patch out the branch conditional, then you could run the nop or you could run the branch — it doesn't matter, because the branch will fall through once the condition no longer holds. As long as the write is a single instruction, you either see the new version or you don't, so you don't need the isync; and even if you see the wrong version at GC end, it's functionally correct — the load and the compare will still generate a valid result.
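The safety argument for the single-instruction patch can be simulated as follows. The names are hypothetical and this models control flow only, not real instruction encodings: after the CS cycle ends, the branch condition (reference in the evacuate region) is always false, so executing either the stale branch-conditional or the freshly patched-in nop reaches the same place:

```c
#include <assert.h>
#include <stdbool.h>

/* The two things a thread might observe at the patch site. */
typedef enum { INSN_BRANCH_COND, INSN_NOP } Insn;

/* Returns true iff execution falls through to the mainline code. */
static bool falls_through(Insn site, bool condition_true) {
    if (site == INSN_NOP)
        return true;            /* nop: always falls through */
    return !condition_true;     /* branch taken only if the condition holds */
}
```

With `condition_true == false` (the situation after GC end) the outcome is identical for both observed versions, which is why no instruction-stream synchronization is needed for correctness in that direction.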
A: Well, I think it's going to require some careful thought based on the memory models of the various platforms and the allowable patching — what is the sequence you actually want to patch in? On x86 we have a bit more flexibility in the patching than on Power, and we may choose to patch a couple of instructions, but at the very worst there's a one-instruction patch that gets us most of the performance we want, which is that call/return style.

A: Now, that may not work on other platforms; we'll have to study the sequence in the context of each platform, and the architecture may not make it possible to regain as much of the performance as we can on x86. But we'd like to put the tool into OMR to let us try to do the x86 one, and if it helps the other platforms, then fantastic.
A: But again, that's orthogonal to the infrastructure OMR would need to support doing the patching. It's a choice of where you employ the patching: one design point would be, if the patching takes too long for the use case you're looking at, you cut down the number of patch sites by sacrificing performance in places you don't expect to run very often — you just leave that code in the patched state or whatever. But that's a choice about correctness and the particular thing you're doing.
A: Well, I think it would probably end up looking like a runtime-assumption kind of thing, which has not yet been fully factored up into OMR. There is a runtime assumption table — the abstract sort of runtime assumption exists, while the concrete runtime assumptions are currently used down in OpenJ9 — but the notion that you have a location and need to keep track of these things against a certain kind of operation is there.
H: Depending on what you're patching: if you make sure that what you're patching is "replace this jump with a fixed offset to a nop target," then you could get away with storing just the offset, which could be smaller. But then you can't patch compares that way — right, yep.

A: You can get a similar effect with a call/pop pair — you can generate a RIP-relative address on 32-bit with a small instruction sequence, but it doesn't have to be RIP-relative. If you're going to store the control word in the prologue, that's one way of accessing it: on 64-bit it's highly efficient; on 32-bit you might want to do something else.
E: What I mentioned before was that the call/return scheme that you tried didn't have much of an overhead at all, right? Okay — so say you did the call/return everywhere, and then as a second step you patched all the methods. Every method would only have one call/return to the sequence that does whatever needs to be done and then returns, and you could just do all methods, because there are only a few thousand methods and every single one has only one call and one return.
A: So if we outline — at the moment there's an out-of-line sequence which consists of a range check, then a potential call to the garbage collector, then a return and a jump back; and in the mainline there's a load, a compare, and a jump conditional. There are actually two range checks that have to be done, against the collected-region base and the collected-region top, so we moved that first load/compare/jump-conditional — that first range check — out of line.

A: Deduplicating would complicate the out-of-line sequence, because it relies on the registers that are live at that read barrier: where you're going to get the reference from could be r8 or RAX or whatever, so the sequence we generate is specific to the site. Now, we could do a form of deduplication, but that wasn't done — we were already generating one of these stubs for every read barrier, so we just kept doing that and moved the mainline check out.
A: Right — the deduplication only saves footprint for the actual instructions that run for the read barrier; the number of sites that need to be patched remains unchanged, unless you patch each read barrier site to call a shared sequence. The number of sites I need to patch is the number of read barrier opcodes.

A: I would still worry that with a large number of compiled methods you could still have a problem where the amount of time it takes to patch them all is non-trivial. And given that there are other potential uses for this, such as software profiling, I don't know that the small extra engineering cost — a week or whatever — of making sure we can support the more general case is misspent. It provides a utility in OMR that could be very useful for a number of things.
A: Right, but I still worry that writing that many instructions across all the methods in the code cache adds up — that's my concern, because we're talking about a very short pause. People are actively moving path length out of that pause, so putting path length into it is a hard sell.

B: Yeah, we should do that — I agree. With CS we don't want to spend all this time doing all the CS work to bring us to where we are, and then shoot ourselves in the foot by increasing the pause time. If we take the just-in-time aspect out of it, we might have a problem.
C: It would be enough to satisfy me to have the numbers. An easy way would be to put a debug counter right before you call out to the GC read barrier and count how many methods you're actually seeing during a CS cycle. If, say, you have to touch 1,000 methods while you're only executing 100, the just-in-time approach would only patch 100 methods, whereas the eager approach would patch 10x or 100x more.
F: For figuring out when you have to patch a method — is it at method entry? We talked about possibly overloading the stack overflow check. Did you consider just changing the vtable entries for methods — changing them to an entry point that would do the patching?

A: So each VM thread has a control word that says "I want read barriers to run" or not. Each method has a state: "my read barriers are on" or "off." When you arrive at the method, you check: do I match the thread state? If yes, do nothing; if not, I need to go and call some patching code. So you only ever update the control word, either globally or one per thread — per-VM-thread is convenient for Power.
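A minimal sketch of the patch-on-entry check just described, assuming a per-method state word and a global control word. All names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-method record: whether the body currently has live
 * range checks, plus a counter so the sketch can show how often the
 * body actually gets rewritten. */
typedef struct {
    bool barriersOn;  /* method body currently has live range checks */
    int  patchCount;  /* how many times the body has been (re)patched */
} Method;

/* The control word: the state the runtime wants method bodies to be in. */
static bool globalBarriersWanted = false;

static void patch_body(Method *m, bool on) {
    m->barriersOn = on;  /* stands in for rewriting the instructions */
    m->patchCount++;
}

/* Run on method entry: patch only on a mismatch, otherwise do nothing. */
static void on_method_entry(Method *m) {
    if (m->barriersOn != globalBarriersWanted)
        patch_body(m, globalBarriersWanted);
}
```

A method that never runs while the control word is toggled is never touched: by the time it runs again, the control word may have flipped back and the states match, which is the lazy property discussed above.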
H: I suppose it's possible that patching every read barrier site is prohibitively expensive at the time at which we're thinking of doing this, while patching every prologue is at the very least less expensive, depending on what kind of scaling factor you're looking at. But I don't know that there's a consequence of that we want to consider right now — with all that, let's leave it there.

A: So their control word does not get updated. We update the VM thread to say — say we toggle it on, the bit is set. You start running some methods; they check their bits, see they don't match, and patch themselves. The cycle ends; we turn the control bit off. Then you call a method that never got patched: its control word matches the global control word, and it does nothing.
A: So the lazy aspect of this is that you literally only patch the methods that are going to run, and you only patch things when you're going to use them — because if the method doesn't run again until the next CS cycle, it doesn't need to be touched. In fact, the unpatching can be even more lazy, because in that case it's functionally correct to keep running the range check, so we can defer unpatching even further.
A: Following the agenda, I think we should maybe cap this in another couple of minutes and then move on to the other topics, because I want to give them a little time as well in the half hour we have left. I think one of the conclusions I see from this is that we'd like to separate the engineering of the mechanism aspect from the use aspect, so when you're looking at getting it done in OMR, we'll think of it that way.

D: Next: the RISC-V port — what's been done and what you plan on doing with it. I think that's Jan.
J: Okay, hello everyone. First of all, I would like to say that even though I did the last bit, Boris did the first bits, and those are always the most difficult ones. As I said on the general channel, we made public the first — let's say — version, which we would like to take as a start and then continue evolving. This version is based on the AArch64 port and so forth, and it's far from being complete.

J: But calls work — we know recursion works, Fibonacci works, Mandelbrot works, as I said the other day — so it's in quite OK shape, and what we would like now is to start the process of getting this into the OMR project. You know there is interest, and it's possible, and we are starting to talk about this.
A
So,
first
of
all,
I
think
that
some,
you
guys
have
made
some
great
progress,
pretty
much
more
or
less
on
your
own
I
think.
That's
that's
really
great
to
see
and
any
we
really
would
like
to
get
this
get.
This
contributed
so
I
think
my
just
given
the
experience
that
they
had
with
AR
64.
The
the
approach
that
we
took
with
that
was
to
essentially
not
drop.
The
entire
thing
in,
as
one
large
commits
one
logical
request.
A
A
Mean
you
don't
necessarily
have
to
carve
up
files,
I
think
that
I
think
you've
different
different
files.
Even
right,
like
you
could
do
the
like.
The
like
those
of
the
machine
class,
for
example
the
instruction
hierarchy,
the
the
opcode
tables.
Things
like
that
you
can
sort
of
deliver
those
in
little
batches
that
that
people
can
pour
over
and
and
make
sure
that
they're
that
they're
sound.
J: As far as the commits go, there's a bunch of them — I don't know, maybe 30 — and they're structured so that they show the progress of incrementally rewriting and removing the AArch64-specific stuff and making it pure RISC-V code. Now, I'm not really sure how we want to break that up.

C: Part of the problem we had with AArch64 was that we didn't have a configured build at that point. — Actually, we've had builds from the beginning: we do cross-compile builds. We don't run natively, but the cross-compile build works. We don't have the tril tests running, though. — The tril tests are not running? — No, they have to be run manually. — Right. So I think that's where the difficulty is going to be.
A
The
things
we're
gonna
have
to
discuss
as
well
is
the
infrastructure.
That's
going
to
be
able
to
tap,
build
and
perhaps
even
test
as
I
got
NCI
test
for
this
right
so
and
that
might
get
into
like
first
well.
Do
we
do
cross
compilation
to
begin
with,
or
is
there
do
we
run
this
on?
An
emulator
right
are.
A
J
J
It
does
work
to
some
extent,
but
then
there
is.
There
are
some
places
when,
when
you
build
the
innate
like
a
tools,
let's
generate
something
that
is
in
turns
and
compile
and
I
the
because
I
do
you
have
very
low
hardware.
I
did
in
order
to
fix
this.
He
makes
and
all
that's
due
to
build
this.
You
know
raise
general
courses
like
this
to
build
them
with
native
compiler
and
fill
the
rest
is
in
the
coastal
file,
but
it
it
says
what,
if
you
do
it
so.
A: So that's sort of the same situation we have with AArch64, where we can build it — cross-compile it — publicly, but right now, because of the lack of availability of AArch64 hardware in the open (which we're working on), we don't actually have a CI test that executes the tril tests on AArch64. I think we're going to be in a similar position with RISC-V: we can possibly build it, but we can't actually run the tril tests until we connect hardware.

J: You can do all of this in QEMU. I just do it on real hardware because, first of all, I have it, and second, it's faster than emulation — I don't have that powerful a machine. If you have powerful x86 hardware you might be fine, and for CI, whether it builds in an hour or in an hour twenty minutes doesn't matter much.
J: It works — I actually spent quite a lot of time preparing a set of scripts that build the whole environment in which Boris and I develop this stuff. They essentially generate a QEMU image, which you can then transfer onto real hardware. That is, I would say, pretty straightforward: there's a script that builds the whole QEMU machine, you just boot QEMU, and apart from the real execution machinery behind it there is no difference. So one could set up the CI inside QEMU.

A: Okay — I'm not sure what the answer there is yet. It wouldn't be a system that IBM would necessarily own; it would have to be something that's out in the public space, so that the public builds can actually run on it.
A
But
maybe
just
circling
back
to
the
original
question
about
how
this
is
going
to
land
and
whether
or
not
it
could
be
broken
up.
I
mean
until
I
mean
one
of
the
first
things
that
we
probably
should
get
set
up
is
some
sort
of
a
at
least
a
built
environment
where,
at
the
very
least
you
just
do
a
cross-compiled
bill
of
the
of
the
of
the
code.
Just
so
that
we
can
see
that
it
actually
works.
A
A
Like
and
thoroughly
review
each
in
the
different
sections,
then
it
would
be
if
we
could
land
individual
parts
of
it.
So
if
that's
acceptable,
then
then
that's
the
way
that
we
can
go
I.
Also
don't
want
you
to
spend
two
three
months
breaking
it
up.
If
it's,
if
it's
difficult
to
do,
because
that
would
just
be
the
amount
of
time
that
we
would
potentially
be
spending
reviewing
it
anyways.
I
A
Okay,
well,
why
don't
you
carry
on
the
way
you
were
thinking
and
we
will
we'll
make
do
with
with
with
what
we
get.
A
The
the
other
thing
that
I
mentioned
yesterday
was
I
think
that
there
is
an
eclipse
legal
process
that
we're
going
to
have
to
follow,
for
a
contribution
of
this
size
as
well
need
to
I
believe
create
something
called
a
commit.
Your
questionnaire,
it's
not
really
that
it's
not
a
terribly
onerous
process
it,
but
about
what
they.
A
F
A
F
A
J
A
Yeah,
if
we're
not
gonna,
be
able
to
go
back
so
if
there
isn't
much
detail
there
and
if
we're
not
going
to
be
able
to
go
back
and
try
out
different
parts
like
if
I
wanted
to
go
back
to
a
certain
version
of
the
machine
class
to
save
this,
certain
problem
occurred
whatever
I'm,
not
sure.
If
there's
value
in
that,
so
I
guess
I
would
argue
for
squashing.
I: Now, there are several branches this was developed in, and it's pretty clear what merged into what: there were some smaller ones, they got merged into the riscv devel branch, and finally that got merged into the main RISC-V branch this morning. I think there are about 35-40 commits on the way from where we started to where we are now, and for the actual master branch of OMR that is too much detail.

I: If somebody wants to see how a particular piece of code developed — through "fixing this bug" and "oh no, we tried this but then we scratched it" — well, for example, we already did half of this process, because the original implementation that we first started with was based on the PowerPC port, and there are another forty or so commits there. But right now, on the branch that we have, we don't have any of those experiments; we keep them separately.
J: In my clone — in my fork, right — we will certainly keep them, at least for ourselves. We can squash things, and if you for whatever reason want more detail or want to see the evolution, there will still be something we can point you to and say: look, this is how it developed.

A: So the suggestion we're going to make is that when you open your initial pull request, you can have all the individual commits there, but when we actually merge it, we're going to ask you to squash them into one. That way we see the history of your development on the initial pull request, but before we merge it, we'll squash.

I: The only thing here is: can we do the review on the full branch, then — you know, before the squash?
A: Okay — Boris, I don't know how much time you wanted for the IL specification; I'm assuming four minutes isn't going to do it justice. Do you mind if we move that to the next call, or do you have something quick?

A: All right, okay. So we look forward to that pull request and to getting that in, and we'll also start looking into what we need to do on the infrastructure side to get some sort of testing going for this — I think there are some parallels with AArch64 that we can draw on.

A: Okay — any questions for Boris or Jan? Or do you guys have anything else you want to talk about?