From YouTube: CHIPS Alliance - Learning To Play the Game of Macro Placement with Deep Reinforcement Learning
A
Hello folks, it's Rob Mains here, General Manager of CHIPS Alliance. It's a pleasure to have everyone here today, and I'm really looking forward to today's talk by Young-Joon Lee. Young-Joon is a physical design engineer at Google Cloud. Before joining Google, he received his PhD from Georgia Tech and worked at Intel for six years. At Google, Young-Joon has been working on machine-learning chip projects and machine-learning-based physical design projects for two years. Young-Joon has experience in CAD/EDA algorithms, physical design, and machine learning, and aspires to use machine learning to help accelerate chip design, which I think is a real benefit, as that is a definitive challenge area where we're all interested in seeing novel improvements. Young-Joon is very open to receiving interactive questions during the talk, so if you have a question, feel free to go ahead and pose it, and we will try to answer it as we progress along. So with that, let me introduce Young-Joon. Thank you.
B
Thanks, Rob, for the introduction. Can you hear me?
B
Let's get started. Yeah, I'm Young-Joon, and I'm happy to share with you our exciting project, Learning To Play the Game of Macro Placement with Deep Reinforcement Learning. This work has been a great collaboration within Google, including these people.
B
For the game of Go, the number of states is 10 to the power of 360, which is really large, so AI algorithms couldn't beat top human experts until several years ago.
B
A lot of problems in systems and chip design are combinatorial optimization problems on graphs. We have three examples here: compiler optimization, chip placement, and data center resource allocation.
B
The logic design is synthesized into a netlist, which is a graph of chip components: macros, which can be SRAMs or other IP blocks, and standard cells, which are logic gates such as NANDs and NORs, connected by wires. So this is a graph, and the objective is to place the components of this graph onto the chip floorplan canvas so that we minimize various costs, such as latency of computation, power consumption, or area, while meeting constraints such as timing, congestion, density, and so on.
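The wirelength part of this cost is commonly approximated by half-perimeter wirelength (HPWL). As a minimal sketch of that standard metric, not the speakers' actual code (the net and placement structures here are hypothetical):

```python
# Half-perimeter wirelength (HPWL): for each net, the half-perimeter
# of the bounding box of its pins approximates routed wirelength.
def hpwl(nets, positions):
    """nets: list of lists of component names; positions: name -> (x, y)."""
    total = 0.0
    for net in nets:
        xs = [positions[c][0] for c in net]
        ys = [positions[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Two nets over three placed components.
pos = {"m0": (0.0, 0.0), "m1": (3.0, 4.0), "c0": (1.0, 1.0)}
print(hpwl([["m0", "m1"], ["m1", "c0"]], pos))  # 7.0 + 5.0 = 12.0
```

A placer minimizes a weighted sum of this kind of term plus congestion and density penalties.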
A
Yes, yeah, I'm just curious. You mentioned, and again I want to encourage the audience to feel free to ask questions, but you mentioned the rapid prototyping environment that you have at Google. I'm just curious what exactly that looks like.
B
That's a good question. So when we started this project we didn't have the prototyping framework, so we thought about creating Python code around the existing CAD tools.
B
But the problem was the CAD tools are too heavy and slow, and the key point of deep learning is to gather a lot of sample data, so it was too slow to gather data and provide feedback, or, you know, reward. So that's why we started creating our own lightweight place-and-route engine.
B
So in it we capture the floorplan canvas and the placement, and we also do some very simplistic routing. It's really fast: it's written in C++ and optimized, so we can iterate through a lot of samples really quickly.
A
Does your environment also have cost calculation engines? In other words, say, a static timer or some type of power estimation application, or do you rely upon commercial solutions for that?
B
Yeah, so currently we don't have the timer. Actually, we tried a timer, but it was not working so well. Internally we have wirelength, congestion, and density, and the timer is what we are working on: a simplistic, modeled timer. We are going to expand it to power consideration as well.
B
Thank you. Let me continue. Yeah, so a creative idea was that we took a hybrid approach in this work. What I mean is, we trained our RL agent to place macros one by one, and when all the macros are placed, we fix the macro locations and use a traditional
force-directed method, or some other state-of-the-art standard cell placer, to place the standard cells. So we are not placing standard cells with RL; we are only placing macros with RL. The force-directed method that we tried first uses an analogy to a spring-and-mass system; it's a well-known approach to place standard cells.
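The spring-and-mass analogy treats each net as a spring pulling connected components together, so a damped iterative force update settles movable cells toward a low-wirelength configuration around the fixed macros. A minimal sketch under that analogy, not the production placer (the data structures are hypothetical):

```python
# Force-directed placement: each net acts as a spring (Hooke's law),
# so every movable cell is repeatedly nudged toward its neighbors.
def force_directed(edges, pos, fixed, steps=100, k=0.1):
    """edges: (a, b) name pairs; pos: name -> [x, y]; fixed: pinned names."""
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in pos}
        for a, b in edges:
            for i in range(2):  # attractive spring force along x and y
                d = pos[b][i] - pos[a][i]
                force[a][i] += k * d
                force[b][i] -= k * d
        for n in pos:
            if n not in fixed:  # macros stay fixed; standard cells move
                pos[n][0] += force[n][0]
                pos[n][1] += force[n][1]
    return pos

# One movable cell between two fixed macros settles at their midpoint.
p = force_directed([("m0", "c"), ("m1", "c")],
                   {"m0": [0.0, 0.0], "m1": [10.0, 0.0], "c": [9.0, 5.0]},
                   fixed={"m0", "m1"})
print([round(v, 2) for v in p["c"]])  # → [5.0, 0.0]
```

Real force-directed placers add repulsion or density spreading so cells do not collapse onto one point, but the attraction step above is the core of the spring analogy.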
It is known to produce reasonably good standard cell placements fast. In the beginning we first tried placing the standard cells and macros together, but it was really slow, even after we did some clustering of standard cells to reduce the number of objects to place. And because standard cell placement is a well-known problem that is solved pretty well by existing methods, we decided to take this hybrid approach, and that saved a lot of time for us; we could go much faster and gather more data samples.
A
There are two questions from Stoner Yadis. The first one is: is the ordering of nodes arbitrary?
B
Yes, so the ordering of macros was what we explored in the beginning. We tried random ordering, then larger-first, smaller-later kinds of ordering, and we tried other things, like grouping macros that are related to each other and placing them first, and so on.
B
So we tried a few heuristics, but then we found that, in general, the well-known method of placing larger macros first and smaller ones later works better overall. I mean, not always, but we stuck to that approach. There could be some optimization chances that we may not have explored.
B
We also thought about applying RL for choosing which macro to place next, but it didn't work out so well. But I admit that there is a chance of optimizing the ordering.
B
Oh yeah, we tried that too. We give a negative partial reward when we have overlaps. The problem is that the convergence speed was not satisfactory with that approach; the RL agent was not learning enough to place macros without overlapping.
B
So we had to take this approach of not allowing macros to overlap each other, to force the RL to stay away from overlapping macros. That was the decision we made a while ago. Maybe we need to revisit this idea; actually, I'll be covering it in a later slide, but we are now struggling with high-density designs, where we have a lot of macros packed together and there isn't much wiggle room to place macros.
B
In that case, we are struggling to place all the macros and still achieve high quality.
B
So we are now exploring how to allow partial overlap and then discourage it as we train more. That's our work in progress.
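One way to read this idea is as reward shaping: tolerate overlap early in training with a mild penalty, then anneal the penalty weight upward so the final policy avoids overlap. A hypothetical sketch of such a schedule, not the team's implementation; the cost terms and weights are assumptions:

```python
# Annealed overlap penalty: early in training, overlapping placements
# are merely discouraged; later, the overlap term dominates the reward.
def reward(wirelength, overlap_area, step, total_steps,
           w_wl=1.0, w_ov_max=10.0):
    # Penalty weight ramps linearly from 0 to w_ov_max over training.
    w_ov = w_ov_max * min(1.0, step / total_steps)
    return -(w_wl * wirelength + w_ov * overlap_area)

print(reward(100.0, 5.0, step=0, total_steps=1000))     # -100.0
print(reward(100.0, 5.0, step=1000, total_steps=1000))  # -150.0
```

The schedule shape (linear here) is a design choice; the point is only that overlap starts legal and becomes increasingly costly.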
B
So here's our early result. It's been about two years; this was the first successful result coming out of this project for a real design case, a TPU block.
B
On the left I have the human macro placement. The white areas are the macros, the green is the standard cells, and the dark blue or black is the empty area, placed by a human expert.
It took about 24 hours to generate a superhuman macro placement with about three percent shorter wirelength. And, as I'll show in the later part of my presentation, it now takes about six hours or less to generate macro placements, because we made a lot of improvements to both the computational efficiency and the learning algorithm. A physical designer commented that the half-circular macro placement surrounding the standard cell cloud in the middle minimizes the wirelength between the standard cells and the macros.
B
So this was obviously better, and there wasn't much of a delta in terms of routability. So this was a clear improvement.
B
If it weren't for the RL development, with pure manual macro placement: this design was pretty big. This block was about two million instances, and if we were to go through, say, five or ten macro placement trials, each has to include the placement and then QoR evaluation. So I would say it takes at least about two weeks to evaluate and then update the macro placement. It can be partially parallelized, or you can come up with a strategy to fully parallelize all the possible macro placements and maybe reduce the time to a week, but it's still longer than 24 hours.
B
Let me move on. So we saw a sign that this RL method may work, but at that point it wasn't doing any learning transfer; it was trained only for a given problem. For another instance of the problem, you have to start from scratch and train again, and it takes 24 hours again, right? So it's not efficient.
B
So the next step we took was to think about how we can train policies that generalize across problems. On the left, the previous case: we were optimizing a specific placement of a netlist onto a floorplan canvas, and training a policy to do this was one instance of the problem. But after seeing the initial proof of concept, we extended that to the picture on the right.
B
You are given a new netlist, and you need only a few hundred iterations, which is pretty quick; or, ideally, you don't need training at all. You just do what we call zero-shot to come up with a macro placement in a second.
B
So if this works, then this is going to be great. We haven't achieved it yet, but we are working towards it.
B
Yeah, so we tried several ideas to make generalization work.
B
The first attempt was: we took our previous RL policy architecture, trained it on a bunch of netlists, and then tried it on an unseen netlist. It just didn't work.
B
The value network trained on placements generated by one policy was unable to accurately predict the quality of placements generated by another policy, and that was causing our policy to be unable to generalize to placing new netlists.
B
So, in order to train a supervised model to perform this task, we compiled a large dataset of 10,000 placements generated by vanilla RL policies at different stages of maturity in the training process. This is valuable because it provides a variety of placement qualities. In the graph, each color represents the data for a different netlist; we have five different netlist cases, and we generated a lot of data.
B
Now let's take a look at the graph convolutional architecture. What we found was that other graph neural network approaches are more focused on features of nodes, whereas our problem is more a function of edges. That is, if you want to predict wirelength, it's not really about the node features themselves.
B
Yeah, this is our novel approach: creating the edge embeddings from the node embeddings. We tried a node-based graph neural network first, but it wasn't working; it wasn't capturing the essence of the netlist, so it wasn't generalizing in the supervised learning. Then we realized that if you think about wirelength, it's not about the nodes; it's more about the edges, right?
B
So that's why we started thinking about how we can transfer the node embeddings into edge embeddings, and this is what we came up with.
A
Yeah, because I was thinking: my background is in static timing analysis in terms of technical expertise, and that's basically a node-edge graph type of representation as well. It's been a little while, so I'm not totally familiar with the latest technology characteristics of, say, three- or five-nanometer process technology.
A
But, as you well know, interconnect delay is always a challenge, and so I'm just wondering whether a rethink of the graph representation for static timing analysis, and subsequently for interconnect optimization, might have potential value there too. Or maybe you're already looking at this, I don't know, but I'm just thinking out loud here.
B
Yeah, so what we are hoping is that these edge embeddings will capture those characteristics of the netlist as we train on a problem case. So if we were to add the timing aspect into it, the timing would be embedded into these edge embeddings, and then we'd be able to see how it predicts the delay through the edges.
B
Okay, thank you. That's great. All right, let me move on. Then we distribute the edge embeddings back to the node embeddings and repeat until we converge. At the end, we get a representation of the entire graph by taking the mean of the edge embeddings; that's the orange embedding in the middle. Then we combine other features and go through fully connected layers to get the wirelength and congestion predictions, and that's how we did the supervised learning.
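The flow described here, building edge embeddings from endpoint node embeddings, distributing them back to nodes, iterating, then mean-pooling the edges into a graph embedding, can be sketched roughly as follows. This is a minimal NumPy illustration under assumed dimensions, not the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_gnn(node_feat, edges, iters=3):
    """node_feat: (n, d) array; edges: (m, 2) int array of endpoint indices."""
    n, d = node_feat.shape
    w = rng.normal(scale=0.1, size=(2 * d, d))  # shared edge projection
    h = node_feat
    for _ in range(iters):
        # Edge embedding from the concatenated endpoint node embeddings.
        e = np.tanh(np.concatenate([h[edges[:, 0]], h[edges[:, 1]]], 1) @ w)
        # Distribute edge embeddings back: each node averages its edges.
        h = np.zeros((n, d))
        cnt = np.zeros((n, 1))
        np.add.at(h, edges[:, 0], e); np.add.at(h, edges[:, 1], e)
        np.add.at(cnt, edges[:, 0], 1); np.add.at(cnt, edges[:, 1], 1)
        h = h / np.maximum(cnt, 1)
    # Graph embedding: mean of the final edge embeddings.
    return e.mean(axis=0)

g = edge_gnn(rng.normal(size=(4, 6)), np.array([[0, 1], [1, 2], [2, 3]]))
print(g.shape)  # (6,)
```

In the model described in the talk, this graph embedding (plus other features) feeds fully connected layers that predict wirelength and congestion.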
B
So this is the prediction-versus-actual graph. You can see on the left that for wirelength we have a good correlation. Congestion is a bit harder to predict, because it's kind of a noisy metric, but you can still see a positive correlation there.
B
This was done more than a year ago, so maybe now we have better correlation, but anyhow.
B
So this is the entire picture of our policy and value model architecture. On the left you can see the graph embedding and the fully connected layers. After that we have the policy network on top and the value network, which is simpler, in the middle, and we have a masking layer at the bottom, which masks invalid moves. For example, where we have pre-placed or previously placed macros, we cannot place another macro, so we mask those positions off.
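Masking invalid grid locations is typically done by setting their logits to a large negative value before the softmax, so the policy assigns them zero probability. A small sketch of that mechanism over a flattened grid of candidate locations; this is an illustration of the general technique, not the actual network:

```python
import numpy as np

def masked_policy(logits, valid):
    """logits: (cells,) scores; valid: (cells,) bool mask of free cells."""
    # Invalid locations get -inf, so softmax assigns them probability 0.
    masked = np.where(valid, logits, -np.inf)
    z = np.exp(masked - masked[valid].max())  # stable softmax
    return z / z.sum()

logits = np.array([2.0, 1.0, 3.0, 0.5])
valid = np.array([True, False, True, True])  # cell 1 is occupied
p = masked_policy(logits, valid)
print(p, p[1])  # occupied cell has probability 0.0
```

Because the mask is applied to logits rather than sampled actions, gradients still flow normally through the valid entries during training.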
B
Here's our experimental setup. For pre-training, we used one worker per block in the training dataset, and the pre-training was done for 48 hours. For fine-tuning, we used 16 workers for up to six hours with early stopping, and for zero-shot we could generate a placement in less than a second using a single GPU.
B
Each of these colored squares is a macro, and you can see that the policy on the left starts out quite random, and it's going to take a while for it to reach a reasonable placement, whereas the policy on the right starts from the beginning very close to the optimal placement. It leaves the middle region empty for the standard cells, so it can minimize the wirelength while maintaining acceptable congestion and density.
B
You can see that if you have a pre-trained policy that you're fine-tuning, almost from the very beginning it's able to achieve quality comparable to what the policy trained from scratch gets after about 24 hours. So the pre-trained model helps us generate high-quality placements much faster.
A
I just had a question on the overall design topology of the TPU, and I apologize for my ignorance here on the actual topology. But is it primarily a standard-cell-based design, or is it broken down into some areas of what I'll call structured custom? And also, how much analog would be present on a given TPU chip?
B
On the TPU chip we have some analog components, but those are specialized components, and they're outside our interest; we do those manually, I guess. Yeah, I cannot talk in too much detail on that side.
B
Yeah, and the TPU in general is a mixture of various kinds of design. Some parts are more arithmetic-intensive, and some parts are more wire-dominated, you know, data movement. Mostly we stay with automatic place and route; we try not to do too much structured or semi-custom work, in the interest of schedule.
B
All right. Not only do we get results faster, but we actually show that a pre-trained policy that's fine-tuned has better quality than what a policy trained from scratch converges to after more than 24 hours.
B
The light blue bars are zero-shot, which generates reasonable-quality macro placements in sub-seconds, and as we get to darker blue we do two to 12 hours of fine-tuning of the pre-trained policy, whereas the yellow is the policy trained from scratch. So fine-tuning the pre-trained model produces better placements in less time than the policy trained from scratch.
B
What was interesting to us was the effect of the size of the training set we pre-trained our policy on. We actually didn't have that much data, so what could we do if we were able to generate or augment our training set?
B
So we did that. On the left, the green bars are the small training set of only two blocks, the blue is five blocks, and the yellow is a large dataset of 20 blocks, and the x-axis is how many hours of fine-tuning we perform on top of the pre-trained policy.
B
So we are very excited about various approaches to increase the size of our training set. On the right you can see the convergence curves for policies that were pre-trained with different amounts of data, and you can see that the smaller dataset causes us to overfit more quickly to the netlists the policy observed.
A
Sorry, yes. So you're using a neural network for the implementation of this; is that correct?
B
So I think we have some numbers here, yeah; you can see how many layers we have and how large our convolutions are. What was also interesting was that, during the deployment to our product, we found that the users were inspired by our ML placer.
B
Our ML placer places macros quite differently, as you can see in the middle, but it was reducing wirelength and improving timing.
B
So the user took the macro placement from the ML placer and then rearranged it a bit to further improve worst negative slack here. This was done more than a year ago, so this is our previous version of the RL; it had some problems with timing. But anyhow, the user got a hint from the ML placer and then came up with a quite different manual macro placement that's inspired by our ML placer.
B
So here's a comparison of our method against the state-of-the-art academic method RePlAce, as well as a human expert's manual macro placement, for five TPU v4 blocks. We compared major quality metrics such as WNS, TNS, area, power, wirelength, and congestion.
B
The results are from the EDA tool after the place-opt step. Note that our method optimizes wirelength under congestion and density constraints, and we screened out the placements that were not usable: with our user, we reviewed the results, and some did not look good, so we discarded those. You can see that in many cases RePlAce fails to produce acceptable macro placements. What was also exciting to us was that our placer outperformed human placements in most cases, and human macro placement was a very strong baseline.
B
The top table shows the comparison with EDA tools A and B and manual macro placements, and our ML placer is superior in the majority of cases.
B
Note that the users, the physical designers, reviewed the data, comparing major metrics such as timing, congestion, and area; if the quality metrics are similar enough, within noise level, we consider them equal. The bottom table shows that our ML placer had the most best cases among the competitors.
A
Sorry, we did have a question; you can finish your thought here, or however you want. Yeah, please go ahead. Okay, this is from Professor Matt Guthaus. Ah, never mind: he was just asking about the runtime, but you're showing that, so go ahead. I apologize.
B
Okay, yeah, so on EDA tool runtime: there is some variation, and especially if the tool struggles in terms of timing, then the EDA tool runtime may increase, right? You can see on TNS that our ML placer is among the best, similar to the manual case.
B
Okay, so we are almost done, so now let me summarize this talk. We presented our deep-reinforcement-learning-based macro placer. It learns to generate superhuman macro placements in several hours, and we are trying to reduce the runtime further by improving our RL methods. Our method outperformed an academic state-of-the-art placer as well as commercial automatic placers, and we have used our ML placer in our next-generation TPU designs, two generations now.
B
Feedback was positive in general. Of course, this is not perfect, but overall the feedback was positive. The designers learned from what the ML placer does, and in some cases we used the ML placer result as-is in production. We were able to accelerate the chip design process as a result.
A
If not, then I appreciate the excellent talk, and also answering questions on the fly. Oh, we do have a question here. Sorry; a question from Akshay Kulkarni: how many years have you guys been working on this?
A
That's fine, that's fine! I apologize for the difficult question. All right, anyway, I think it's exciting work and very innovative, and I look forward to you, or Google, being able to share further details on this and other developments as time progresses. So thank you.
B
Yeah, one thing I want to mention is that we are looking into open-sourcing this, so that more people, maybe from academia or industry, can participate in advancing this technology. We are actively looking into this; maybe you will hear more from us in the future.
A
Yeah, that would be great. I think there's a very interested community out there, and if there's anything I can do, in terms of my role at CHIPS Alliance, to help the dialogue on that, I would be happy to do that. I think that would be very interesting for the community. All right.
B
Yeah, so we are aware of the SkyWater PDK effort, and we looked into open-sourcing our RL with some sample designs that are open source with the SkyWater PDK library. We have that program going on, so we might be looking into that as well.
B
So currently we are thinking about open-sourcing with sample designs from existing open-source benchmark circuits. But to be more realistic, and to stay with the EDA tools, we may want to use a real library to get the timing and more accurate metrics. Yeah, thanks.
A
Yeah, so in general, I'll just comment from the CHIPS Alliance side: one of the things that we definitely are socializing, or pushing, in the industry is the notion of open-source tooling and PDKs, and we work closely with Efabless and SkyWater, which are part of CHIPS, as is Google. So I am excited about the different possibilities here, and also about working with Matt of UC Santa Cruz on OpenRAM, and with Professor Andrew Kahng and Tom Spyrou of the OpenROAD UCSD efforts as well.
A
Well, with that, I want to thank you again, Young-Joon. Thank you for the excellent presentation, and we look forward to further dialogue on this topic. So thank you again.