Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US2024394552A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024394552-A1 |
| Application number | US-202318202029-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 25, 2023 |
| Priority date | May 25, 2023 |
| Publication date | Nov 28, 2024 |
| Grant date | — |
Official abstract text for this publication.
The technology provides techniques for optimizing transistor-level placement using a hybrid approach involving reinforcement learning (“RL”) in conjunction with an optimization technique. This can include implementing an iterative RL training process for an integrated circuit to train an RL agent, including the RL agent learning an ordering of transistors for the integrated circuit by placement of one transistor on an encoded grid per iteration. The RL agent iterates until all transistors for the integrated circuit are placed on the encoded grid. Upon placing all the transistors on the encoded grid, one or more processors implement a solver module using the ordering of the transistors as an input. The solver module is configured to perform an optimization to minimize spacing between the transistors. The trained reinforcement learning agent can then be saved in memory.
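The two-stage flow in the abstract (sequential placement that yields an ordering, followed by a spacing-minimizing step that consumes that ordering) can be sketched as below. This is only an illustration under stated assumptions, not the method disclosed in the publication: a seeded random choice stands in for the learned RL policy, a greedy left-to-right packing stands in for the solver module, and the names `place_all`, `compact`, and `widths` are hypothetical.

```python
import random


def place_all(transistors, grid_w, grid_h, seed=0):
    """Place one transistor per iteration until every transistor is on the grid.

    Assumption: a seeded random choice stands in for the trained RL policy.
    Returns the placement ordering and a {transistor: (x, y)} map.
    """
    if len(transistors) > grid_w * grid_h:
        raise ValueError("grid has fewer cells than transistors")
    rng = random.Random(seed)
    free = [(x, y) for y in range(grid_h) for x in range(grid_w)]
    unplaced = list(transistors)
    ordering, coords = [], {}
    while unplaced:  # one placement per iteration
        t = unplaced.pop(rng.randrange(len(unplaced)))
        cell = free.pop(rng.randrange(len(free)))
        ordering.append(t)
        coords[t] = cell
    return ordering, coords


def compact(ordering, widths):
    """Hypothetical stand-in for the solver module.

    Packs transistors left to right in the given order with zero gaps,
    returning each x offset and the total row width.
    """
    x, offsets = 0, {}
    for t in ordering:
        offsets[t] = x
        x += widths[t]
    return offsets, x


if __name__ == "__main__":
    order, grid_coords = place_all(["M1", "M2", "M3", "M4"], grid_w=3, grid_h=2)
    offsets, total = compact(order, {"M1": 2, "M2": 1, "M3": 3, "M4": 1})
    print(order, grid_coords, offsets, total)
```

In a real flow the compaction step would more plausibly be posed as a constraint problem for a SAT, SMT, or MILP solver, as the claims mention; the greedy packing here only shows how the ordering feeds the second stage.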
Opening claim text (preview).
1. A computer-implemented processing system, comprising: a memory configured to store a reinforcement learning agent and a solver module; and one or more processors operatively coupled to the memory, the one or more processors being configured to: implement an iterative reinforcement learning training process for an integrated circuit to train the reinforcement learning agent, in which the reinforcement learning agent learns an ordering of transistors for the integrated circuit by placement of one transistor on an encoded grid per iteration, in which the reinforcement learning agent is configured to iterate until all transistors for the integrated circuit are placed on the encoded grid; upon placement of all the transistors on the encoded grid, implement the solver module using the ordering of the transistors as an input, the solver module being configured to perform an optimization to minimize spacing between the transistors; and save the trained reinforcement learning agent in the memory.
2. The processing system of claim 1, wherein the one or more processors are further configured to generate an integrated circuit design according to the optimization.
3. The processing system of claim 1, wherein the reinforcement learning agent learns the ordering of transistors for the integrated circuit by placing either the one transistor on the encoded grid per iteration or by placing a pair of complementary transistors on the encoded grid per iteration.
4. The processing system of claim 1, wherein the reinforcement learning agent employs a policy proximal optimization according to an RL action space.
5. The processing system of claim 4, wherein the policy proximal optimization implements a probability distribution for every transistor for where that could be placed on the encoded grid.
6. The processing system of claim 1, wherein the one or more processors are further configured to implement a router module after one or more intermediate iterations of the iterative reinforcement learning training process.
7. The processing system of claim 1, wherein the solver module implements at least one of a Boolean Satisfiability solver, a Satisfiability Modulo a Theory (SMT) solver, or a Mixed-Integer Linear Programming (MILP) solver.
8. The processing system of claim 1, wherein the reinforcement learning agent learns the ordering of transistors according to actions, states and rewards for each iteration, in which each state includes connectivity and coordinates of previously placed transistors on the encoded grid.
9. The processing system of claim 8, wherein the reward at an end of each iteration is calculated as a linear combination of cell area, wirelength, and any routability or timing penalty.
10. The processing system of claim 9, wherein cell area is defined by a minimum bounding box that includes all currently placed transistors at a given iteration.
11. The processing system of claim 10, wherein the minimum bounding box represents a half-perimeter wire length.
12. The processing system of claim 8, wherein the reward at a conclusion of a final iteration is back-propagated through the reinforcement learning agent.
13. The processing system of claim 1, wherein routability at each iteration is approximated using at least one of congestion, pin density, area or wire length.
14. A computer-implemented method, comprising: implementing, by one or more processors of a processing system, an iterative reinforcement learning training process for an integrated circuit to train a reinforcement learning agent, including the reinforcement learning agent learning an ordering of transistors for the integrated circuit by placement of one transistor on an encoded grid per iteration, in which the reinforcement learning agent iterates until all transistors for the integrated circuit are placed on the encoded grid; upon placing all the transistors on the encoded grid, the one or more processors implementing a solver module using the ordering of the transistors as an input, the solver module being configured to perform an optimization to minimize spacing between the transistors; and saving the trained reinforcement learning agent in memory.
15. The method of claim 14, further comprising generating an integrated circuit design according to the optimization.
16. The method of claim 14, wherein learning the ordering of transistors for the integrated circuit by the reinforcement learning agent includes placing either the one transistor on the encoded grid per iteration or by placing a pair of complementary transistors on the encoded grid per iteration.
17. The method of claim 14, wherein the reinforcement learning agent employs a policy proximal optimization according to an RL action space.
18. The method of claim 14, further comprising the one or more processors implementing a router module after one or more intermediate iterations of the iterative reinforcement learning training process.
19. The method of claim 14, wherein the solver module implements at least one of a Boolean Satisfiability solver, a Satisfiability Modulo a Theory (SMT) solver, or a Mixed-Integer Linear Programming (MILP) solver.
20. The method of claim 14, wherein the reinforcement learning agent learns the ordering of transistors according to actions, states and rewards for each iteration, in which each state includes connectivity and coordinates of previously placed transistors on the encoded grid.
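Claims 9 and 10 describe the per-iteration reward as a linear combination of cell area (the minimum bounding box around the transistors placed so far), wirelength, and any routability or timing penalty. A minimal sketch of one way such a reward might be computed follows. The weights, the negative sign (so that a smaller layout earns a higher reward), the per-net half-perimeter wirelength estimate, and every function and parameter name are assumptions made for illustration; the publication does not specify them here.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) for a collection of grid points."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)


def cell_area(coords):
    """Area of the minimum bounding box around all placed transistors.

    Each placed transistor is assumed to occupy one grid cell.
    """
    x0, y0, x1, y1 = bounding_box(coords.values())
    return (x1 - x0 + 1) * (y1 - y0 + 1)


def hpwl(coords, nets):
    """Sum of half-perimeter wirelengths over nets, using placed pins only."""
    total = 0
    for net in nets:
        pts = [coords[t] for t in net if t in coords]
        if len(pts) < 2:
            continue  # a net with fewer than two placed pins adds no length yet
        x0, y0, x1, y1 = bounding_box(pts)
        total += (x1 - x0) + (y1 - y0)
    return total


def reward(coords, nets, penalty=0.0, w_area=1.0, w_wire=1.0, w_pen=1.0):
    """Negated weighted sum of area, wirelength, and penalty (weights hypothetical)."""
    cost = w_area * cell_area(coords) + w_wire * hpwl(coords, nets) + w_pen * penalty
    return -cost


if __name__ == "__main__":
    placed = {"M1": (0, 0), "M2": (2, 0), "M3": (1, 1)}
    nets = [("M1", "M2"), ("M2", "M3")]
    print(reward(placed, nets))  # -10.0: area 6 plus wirelength 4, no penalty
```

Because `hpwl` skips transistors that are not yet in `coords`, the same function can score a partial placement at an intermediate iteration as well as the final one.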