From YouTube: CHIPS Alliance - Learning To Play the Game of Macro Placement with Deep Reinforcement Learning
A
Hello folks, it's Rob Mains here, General Manager of CHIPS Alliance. It's a pleasure to have everyone here today, and I'm really looking forward to today's talk by Young-Joon Lee. Young-Joon is a physical design engineer at Google Cloud. Before joining Google, he received his PhD from Georgia Tech and worked at Intel for six years. At Google, Young-Joon has been working on machine-learning chip projects and machine-learning-based physical design projects for two years. Young-Joon has experience in CAD/EDA algorithms, physical design, and machine learning, and aspires to use machine learning to help accelerate chip design, which I think is a real benefit, as that is a definitive challenge area where we're all interested in seeing novel improvements. Young-Joon is very open to receiving interactive questions during the talk, so if you have a question, feel free to go ahead and pose it, and we will try to answer it as we progress along. So with that, let me introduce Young-Joon. Thank you.
B
Thanks, Rob, for the introduction. Can you hear me?
B
Let's get started. Yeah, I'm Young-Joon, and I'm happy to share with you our exciting project, Learning To Play the Game of Macro Placement with Deep Reinforcement Learning. This work has been a great collaboration within Google, including these people.
B
For the game of Go, the number of states is 10 to the power of 360, which is really large, so AI algorithms couldn't beat top human experts until several years ago.
B
A lot of problems in systems and chip design are combinatorial optimization problems on graphs. We have three examples here: compiler optimization, chip placement, and data center resource allocation.
B
The logic design is synthesized into a netlist, which is a graph of chip components: macros, which can be SRAMs or other IP blocks, and standard cells, which are logic gates such as NANDs and NORs, connected by wires. So this is a graph, and the objective is to place the components of this graph onto the chip floorplan canvas so that we minimize various costs, such as latency of computation, power consumption, or area, while meeting constraints such as timing, congestion, density, and so on.
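The wirelength part of this cost is commonly approximated by half-perimeter wirelength (HPWL). As a minimal sketch of that standard metric, not the speakers' actual code (the net and placement structures here are hypothetical):

```python
# Half-perimeter wirelength (HPWL): for each net, the half-perimeter
# of the bounding box of its pins approximates routed wirelength.
def hpwl(nets, positions):
    """nets: list of lists of component names; positions: name -> (x, y)."""
    total = 0.0
    for net in nets:
        xs = [positions[c][0] for c in net]
        ys = [positions[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Two nets over three placed components.
pos = {"m0": (0.0, 0.0), "m1": (3.0, 4.0), "c0": (1.0, 1.0)}
print(hpwl([["m0", "m1"], ["m1", "c0"]], pos))  # 7.0 + 5.0 = 12.0
```

A placer minimizes a weighted sum of this kind of term plus congestion and density penalties.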
A
Yes, yeah, I'm just curious. You mentioned, and again I want to encourage the audience to feel free to ask questions, but you mentioned the rapid prototyping environment that you have at Google. I'm just curious what exactly that looks like.
B
That's a good question. So when we started this project we didn't have the prototyping framework, so we thought about creating Python code around the existing CAD tools.
B
But the problem was the CAD tools are too heavy and slow, and the key point of deep learning is to gather a lot of sample data, so it was too slow to gather data and provide feedback, or, you know, reward. So that's why we started creating our own lightweight place-and-route engine.
B
So in it we capture the floorplan canvas and the placement, and we also do some very simplistic routing. It's really fast: it's written in C++ and optimized, so we can iterate through a lot of samples really quickly.
A
Does your environment also have cost calculation engines? In other words, say, a static timer or some type of power estimation application, or do you rely upon commercial solutions for that?
B
Yeah, so currently we don't have the timer. Actually, we tried a timer, but it was not working so well. Internally we have wirelength, congestion, and density, and the timer is what we are working on: a simplistic, modeled timer. We are going to expand it to power consideration as well.
B
Thank you. Let me continue. Yeah, so a creative idea was that we took a hybrid approach in this work. What I mean is, we trained our RL agent to place macros one by one, and when all the macros are placed, we fix the macro locations and use a traditional
force-directed method, or some other state-of-the-art standard cell placer, to place the standard cells. So we are not placing standard cells with RL; we are only placing macros with RL. The force-directed method that we tried first uses an analogy to a spring-and-mass system; it's a well-known approach to place standard cells.
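The spring-and-mass analogy treats each net as a spring pulling connected components together, so a damped iterative force update settles movable cells toward a low-wirelength configuration around the fixed macros. A minimal sketch under that analogy, not the production placer (the data structures are hypothetical):

```python
# Force-directed placement: each net acts as a spring (Hooke's law),
# so every movable cell is repeatedly nudged toward its neighbors.
def force_directed(edges, pos, fixed, steps=100, k=0.1):
    """edges: (a, b) name pairs; pos: name -> [x, y]; fixed: pinned names."""
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in pos}
        for a, b in edges:
            for i in range(2):  # attractive spring force along x and y
                d = pos[b][i] - pos[a][i]
                force[a][i] += k * d
                force[b][i] -= k * d
        for n in pos:
            if n not in fixed:  # macros stay fixed; standard cells move
                pos[n][0] += force[n][0]
                pos[n][1] += force[n][1]
    return pos

# One movable cell between two fixed macros settles at their midpoint.
p = force_directed([("m0", "c"), ("m1", "c")],
                   {"m0": [0.0, 0.0], "m1": [10.0, 0.0], "c": [9.0, 5.0]},
                   fixed={"m0", "m1"})
print([round(v, 2) for v in p["c"]])  # → [5.0, 0.0]
```

Real force-directed placers add repulsion or density spreading so cells do not collapse onto one point, but the attraction step above is the core of the spring analogy.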
It is known to produce reasonably good standard cell placements fast. In the beginning we first tried placing the standard cells and macros together, but it was really slow, even after we did some clustering of standard cells to reduce the number of objects to place. And because standard cell placement is a well-known problem that is solved pretty well by existing methods, we decided to take this hybrid approach, and that saved a lot of time for us; we could go much faster and gather more data samples.
A
There are two questions from Stoner Yadis. The first one is: is the ordering of nodes arbitrary?
B
Yes, so the ordering of macros was what we explored in the beginning. We tried random ordering, then larger-first, smaller-later kinds of ordering, and we tried other things, like grouping macros that are related to each other and placing them first, and so on.
B
So we tried a few heuristics, but then we found that, in general, the well-known method of placing larger macros first and smaller ones later works better overall. I mean, not always, but we stuck to that approach. There could be some optimization chances that we may not have explored.
B
We also thought about applying RL for choosing which macro to place next, but it didn't work out so well. But I admit that there is a chance of optimizing the ordering.
B
Oh yeah, we tried that too. We give a negative partial reward when we have overlaps. The problem is that the convergence speed was not satisfactory with that approach; the RL agent was not learning enough to place macros without overlapping.
B
So we had to take this approach of not allowing macros to overlap each other, to force the RL to stay away from overlapping macros. That was the decision we made a while ago. Maybe we need to revisit this idea; actually, I'll be covering it in a later slide, but we are now struggling with high-density designs, where we have a lot of macros packed together and there isn't much wiggle room to place macros.
B
In that case, we are struggling to place all the macros and still achieve high quality.
B
So we are now exploring how to allow partial overlap and then discourage it as we train more. That's our work in progress.
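One way to read this idea is as reward shaping: tolerate overlap early in training with a mild penalty, then anneal the penalty weight upward so the final policy avoids overlap. A hypothetical sketch of such a schedule, not the team's implementation; the cost terms and weights are assumptions:

```python
# Annealed overlap penalty: early in training, overlapping placements
# are merely discouraged; later, the overlap term dominates the reward.
def reward(wirelength, overlap_area, step, total_steps,
           w_wl=1.0, w_ov_max=10.0):
    # Penalty weight ramps linearly from 0 to w_ov_max over training.
    w_ov = w_ov_max * min(1.0, step / total_steps)
    return -(w_wl * wirelength + w_ov * overlap_area)

print(reward(100.0, 5.0, step=0, total_steps=1000))     # -100.0
print(reward(100.0, 5.0, step=1000, total_steps=1000))  # -150.0
```

The schedule shape (linear here) is a design choice; the point is only that overlap starts legal and becomes increasingly costly.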
B
So here's our early result. It's been about two years; this was the first successful result coming out of this project for a real design case, a TPU block.
B
On the left I have the human macro placement. The white areas are the macros, the green is the standard cells, and the dark blue or black is the empty area, placed by a human expert.
It took about 24 hours to generate a superhuman macro placement with about three percent shorter wirelength. And, as I'll show in the later part of my presentation, it now takes about six hours or less to generate macro placements, because we made a lot of improvements to both the computational efficiency and the learning algorithm. A physical designer commented that the half-circular macro placement surrounding the standard cell cloud in the middle minimizes the wirelength between the standard cells and the macros.
B
So this was obviously better, and there wasn't much of a delta in terms of routability. So this was a clear improvement.
B
If it weren't for the RL development, with pure manual macro placement: this design was pretty big. This block was about two million instances, and if we were to go through, say, five or ten macro placement trials, each has to include the placement and then QoR evaluation. So I would say it takes at least about two weeks to evaluate and then update the macro placement. It can be partially parallelized, or you can come up with a strategy to fully parallelize all the possible macro placements and maybe reduce the time to a week, but it's still longer than 24 hours.
B
Let me move on. So we saw a sign that this RL method may work, but at that point it wasn't doing any learning transfer; it was trained only for a given problem. For another instance of the problem, you have to start from scratch and train again, and it takes 24 hours again, right? So it's not efficient.
B
So the next step we took was to think about how we can train policies that generalize across problems. On the left, the previous case: we were optimizing a specific placement of a netlist onto a floorplan canvas, and training a policy to do this was one instance of the problem. But after seeing the initial proof of concept, we extended that to the picture on the right.
B
You are given a new netlist, and you need only a few hundred iterations, which is pretty quick; or, ideally, you don't need training at all. You just do what we call zero-shot to come up with a macro placement in a second.
B
So if this works, then this is going to be great. We haven't achieved it yet, but we are working towards it.
B
Yeah, so we tried several ideas to make generalization work.
B
The first attempt was: we took our previous RL policy architecture, trained it on a bunch of netlists, and then tried it on an unseen netlist. It just didn't work.
B
The value network trained on placements generated by one policy was unable to accurately predict the quality of placements generated by another policy, and that was causing our policy to be unable to generalize to placing new netlists.
B
So, in order to train a supervised model to perform this task, we compiled a large dataset of 10,000 placements generated by vanilla RL policies at different stages of maturity in the training process. This is valuable because it provides a variety of placement qualities. In the graph, each color represents the data for a different netlist; we have five different netlist cases, and we generated a lot of data.
B
Now let's take a look at the graph convolutional architecture. What we found was that other graph neural network approaches are more focused on features of nodes, whereas our problem is more a function of edges. That is, if you want to predict wirelength, it's not really about the node features themselves.
B
Yeah, this is our novel approach: creating the edge embeddings from the node embeddings. We tried a node-based graph neural network first, but it wasn't working; it wasn't capturing the essence of the netlist, so it wasn't generalizing in the supervised learning. Then we realized that if you think about wirelength, it's not about the nodes; it's more about the edges, right?
B
So that's why we started thinking about how we can transfer the node embeddings into edge embeddings, and this is what we came up with.
A
Yeah, because I was thinking: my background is in static timing analysis in terms of technical expertise, and that's basically a node-edge graph type of representation as well. It's been a little while, so I'm not totally familiar with the latest technology characteristics of, say, three- or five-nanometer process technology.
A
But, as you well know, interconnect delay is always a challenge, and so I'm just wondering whether a rethink of the graph representation for static timing analysis, and subsequently for interconnect optimization, might have potential value there too. Or maybe you're already looking at this, I don't know, but I'm just thinking out loud here.
B
Yeah, so what we are hoping is that these edge embeddings will capture those characteristics of the netlist as we train on a problem case. So if we were to add the timing aspect into it, the timing would be embedded into these edge embeddings, and then we'd be able to see how it predicts the delay through the edges.
B
Okay, thank you. That's great. All right, let me move on. Then we distribute the edge embeddings back to the node embeddings and repeat until we converge. At the end, we get a representation of the entire graph by taking the mean of the edge embeddings; that's the orange embedding in the middle. Then we combine other features and go through fully connected layers to get the wirelength and congestion predictions, and that's how we did the supervised learning.
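The flow described here, building edge embeddings from endpoint node embeddings, distributing them back to nodes, iterating, then mean-pooling the edges into a graph embedding, can be sketched roughly as follows. This is a minimal NumPy illustration under assumed dimensions, not the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_gnn(node_feat, edges, iters=3):
    """node_feat: (n, d) array; edges: (m, 2) int array of endpoint indices."""
    n, d = node_feat.shape
    w = rng.normal(scale=0.1, size=(2 * d, d))  # shared edge projection
    h = node_feat
    for _ in range(iters):
        # Edge embedding from the concatenated endpoint node embeddings.
        e = np.tanh(np.concatenate([h[edges[:, 0]], h[edges[:, 1]]], 1) @ w)
        # Distribute edge embeddings back: each node averages its edges.
        h = np.zeros((n, d))
        cnt = np.zeros((n, 1))
        np.add.at(h, edges[:, 0], e); np.add.at(h, edges[:, 1], e)
        np.add.at(cnt, edges[:, 0], 1); np.add.at(cnt, edges[:, 1], 1)
        h = h / np.maximum(cnt, 1)
    # Graph embedding: mean of the final edge embeddings.
    return e.mean(axis=0)

g = edge_gnn(rng.normal(size=(4, 6)), np.array([[0, 1], [1, 2], [2, 3]]))
print(g.shape)  # (6,)
```

In the model described in the talk, this graph embedding (plus other features) feeds fully connected layers that predict wirelength and congestion.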
B
So this is the prediction-versus-actual graph. You can see on the left that for wirelength we have a good correlation. Congestion is a bit harder to predict, because it's kind of a noisy metric, but you can still see a positive correlation there.
B
This was done more than a year ago, so maybe now we have better correlation, but anyhow.
B
So this is the entire picture of our policy and value model architecture. On the left you can see the graph embedding and the fully connected layers. After that we have the policy network on top and the value network, which is simpler, in the middle, and we have a masking layer at the bottom, which masks invalid moves. For example, where we have pre-placed or previously placed macros, we cannot place another macro, so we mask those positions off.
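Masking invalid grid locations is typically done by setting their logits to a large negative value before the softmax, so the policy assigns them zero probability. A small sketch of that mechanism over a flattened grid of candidate locations; this is an illustration of the general technique, not the actual network:

```python
import numpy as np

def masked_policy(logits, valid):
    """logits: (cells,) scores; valid: (cells,) bool mask of free cells."""
    # Invalid locations get -inf, so softmax assigns them probability 0.
    masked = np.where(valid, logits, -np.inf)
    z = np.exp(masked - masked[valid].max())  # stable softmax
    return z / z.sum()

logits = np.array([2.0, 1.0, 3.0, 0.5])
valid = np.array([True, False, True, True])  # cell 1 is occupied
p = masked_policy(logits, valid)
print(p, p[1])  # occupied cell has probability 0.0
```

Because the mask is applied to logits rather than sampled actions, gradients still flow normally through the valid entries during training.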
B
Here's our experimental setup. For pre-training, we used one worker per block in the training dataset, and the pre-training was done for 48 hours. For fine-tuning, we used 16 workers for up to six hours with early stopping, and for zero-shot we could generate a placement in less than a second using a single GPU.
B
Each of these colored squares is a macro, and you can see that the policy on the left starts out quite random, and it's going to take a while for it to reach a reasonable placement, whereas the policy on the right starts from the beginning very close to the optimal placement. It leaves the middle region empty for the standard cells, so it can minimize the wirelength while maintaining acceptable congestion and density.
B
You can see that if you have a pre-trained policy that you're fine-tuning, almost from the very beginning it's able to achieve quality comparable to what the policy trained from scratch gets after about 24 hours. So the pre-trained model helps us generate high-quality placements much faster.
A
I just had a question on the overall design topology of the TPU, and I apologize for my ignorance here on the actual topology. But is it primarily a standard-cell-based design, or is it broken down into some areas of what I'll call structured custom? And also, how much analog would be present on a given TPU chip?
B
On the TPU chip we have some analog components, but those are specialized components, and they're outside our interest; we do those manually, I guess. Yeah, I cannot talk in too much detail on that side.
B
Yeah, and the TPU in general is a mixture of various kinds of design. Some parts are more arithmetic-intensive, and some parts are more wire-dominated, you know, data movement. Mostly we stay with automatic place and route; we try not to do too much structured or semi-custom work, in the interest of schedule.
B
All right. Not only do we get results faster, but we actually show that a pre-trained policy that's fine-tuned has better quality than what a policy trained from scratch converges to after more than 24 hours.
B
The light blue bars are zero-shot, which generates reasonable-quality macro placements in sub-seconds, and as we get to darker blue we do two to 12 hours of fine-tuning of the pre-trained policy, whereas the yellow is the policy trained from scratch. So fine-tuning the pre-trained model produces better placements in less time than the policy trained from scratch.
B
What was interesting to us was the effect of the size of the training set we pre-trained our policy on. We actually didn't have that much data, so what could we do if we were able to generate or augment our training set?
B
So we did that. On the left, the green bars are the small training set of only two blocks, the blue is five blocks, and the yellow is a large dataset of 20 blocks, and the x-axis is how many hours of fine-tuning we perform on top of the pre-trained policy.
B
So we are very excited about various approaches to increase the size of our training set. On the right you can see the convergence curves for policies that were pre-trained with different amounts of data, and you can see that the smaller dataset causes us to overfit more quickly to the netlists the policy observed.
A
Sorry, yes. So you're using a neural network for the implementation of this; is that correct?
B
So I think we have some numbers here, yeah; you can see how many layers we have and how large our convolutions are. What was also interesting was that, during the deployment to our product, we found that the users were inspired by our ML placer.
B
Our ML placer places macros quite differently, as you can see in the middle, but it was reducing wirelength and improving timing.
B
So the user took the macro placement from the ML placer and then rearranged it a bit to further improve worst negative slack here. This was done more than a year ago, so this is our previous version of the RL; it had some problems with timing. But anyhow, the user got a hint from the ML placer and then came up with a quite different manual macro placement that's inspired by our ML placer.
B
So here's a comparison of our method against the state-of-the-art academic method RePlAce, as well as a human expert's manual macro placement, for five TPU v4 blocks. We compared major quality metrics such as WNS, TNS, area, power, wirelength, and congestion.
B
The results are from the EDA tool after the place-opt step. Note that our method optimizes wirelength under congestion and density constraints, and we screened out the placements that were not usable: with our user, we reviewed the results, and some did not look good, so we discarded those. You can see that in many cases RePlAce fails to produce acceptable macro placements. What was also exciting to us was that our placer outperformed human placements in most cases, and human macro placement was a very strong baseline.
B
The top table shows the comparison with EDA tools A and B and manual macro placements, and our ML placer is superior in the majority of cases.
B
Note that the users, the physical designers, reviewed the data, comparing major metrics such as timing, congestion, and area; if the quality metrics are similar enough, within noise level, we consider them equal. The bottom table shows that our ML placer had the most best cases among the competitors.
A
Sorry, we did have a question; you can finish your thought here, or however you want. Yeah, please go ahead. Okay, this is from Professor Matt Guthaus. Ah, never mind: he was just asking about the runtime, but you're showing that, so go ahead. I apologize.
B
Okay, yeah, so on EDA tool runtime: there is some variation, and especially if the tool struggles in terms of timing, then the EDA tool runtime may increase, right? You can see on TNS that our ML placer is among the best, similar to the manual case.
B
Okay, so we are almost done, so now let me summarize this talk. We presented our deep-reinforcement-learning-based macro placer. It learns to generate superhuman macro placements in several hours, and we are trying to reduce the runtime further by improving our RL methods. Our method outperformed an academic state-of-the-art placer as well as commercial automatic placers, and we have used our ML placer in our next-generation TPU designs, two generations now.
B
Feedback was positive in general. Of course, this is not perfect, but overall the feedback was positive. The designers learned from what the ML placer does, and in some cases we used the ML placer result as-is in production. We were able to accelerate the chip design process as a result.
A
If not, then I appreciate the excellent talk, and also answering questions on the fly. Oh, we do have a question here. Sorry; a question from Akshay Kulkarni: how many years have you guys been working on this?
A
That's fine, that's fine! I apologize for the difficult question. All right, anyway, I think it's exciting work and very innovative, and I look forward to you, or Google, being able to share further details on this and other developments as time progresses. So thank you.
B
Yeah, one thing I want to mention is that we are looking into open-sourcing this, so that more people, maybe from academia or industry, can participate in advancing this technology. We are actively looking into this; maybe you will hear more from us in the future.
A
Yeah, that would be great. I think there's a very interested community out there, and if there's anything I can do, in terms of my role at CHIPS Alliance, to help the dialogue on that, I would be happy to do that. I think that would be very interesting for the community. All right.
B
Yeah, so we are aware of the SkyWater PDK effort, and we looked into open-sourcing our RL with some sample designs that are open source with the SkyWater PDK library. We have that program going on, so we might be looking into that as well.
B
So currently we are thinking about open-sourcing with sample designs from existing open-source benchmark circuits. But to be more realistic, and to stay with the EDA tools, we may want to use a real library to get the timing and more accurate metrics. Yeah, thanks.
A
Yeah, so in general, I'll just comment from the CHIPS Alliance side: one of the things that we definitely are socializing, or pushing, in the industry is the notion of open-source tooling and PDKs, and we work closely with Efabless and SkyWater, which are part of CHIPS, as is Google. So I am excited about the different possibilities here, and also about working with Matt of UC Santa Cruz on OpenRAM, and with Professor Andrew Kahng and Tom Spyrou of the OpenROAD UCSD efforts as well.
A
Well, with that, I want to thank you again, Young-Joon. Thank you for the excellent presentation, and we look forward to further dialogue on this topic. So thank you again.