Schedule instructions of a program of data flows for execution in tiles of a coarse grained reconfigurable array

US12039335B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-12039335-B2
Application numberUS-202217705112-A
CountryUS
Kind codeB2
Filing dateMar 25, 2022
Priority dateMar 25, 2022
Publication dateJul 16, 2024
Grant dateJul 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Schedule instructions of a program for execution on a coarse grained reconfigurable array having a plurality of tiles operable in parallel. The program identifies data flows through memory locations represented by memory variables and identifies instructions configured to transform data in the data flows. Based on a hardware profile identifying features of the coarse grained reconfigurable array, a scheduler is configured to generate a memory map. The memory map identifies, for each respective memory variable in the program, one of the tiles that contains a memory location represented by the respective memory variable. Based on the memory map reducing possible choices for a brute force search, the scheduler assigns the instructions to the tiles for execution, and determines timing of execution of the instructions in the tiles.

First claim

Opening claim text (preview).

What is claimed is: 1. A method, comprising: receiving an assembly language program identifying data flows through memory locations represented by memory variables and identifying instructions configured to transform data in the data flows, wherein the assembly language program is configured to identify the memory variables as the memory locations through which the data is configured to flow through, and the assembly language program is further configured to identify the instructions configured to transform the data as the data flows through the memory locations represented by the memory variables; receiving a hardware profile identifying features of a coarse grained reconfigurable array having a plurality of tiles operable in parallel; generating a memory map identifying, for each respective memory variable in the assembly language program, one of the tiles that contains a memory location represented by the respective memory variable; assigning, based on the memory map, the instructions to the tiles for execution; and providing an instruction execution schedule via identifying timing of execution of the instructions in the tiles. 2. The method of claim 1 , wherein the generating of the memory map includes: partitioning the memory variables and the instructions into a plurality of groups, each of the groups configured to be implemented on one of the tiles. 3. The method of claim 2 , wherein the partitioning is performed to balance a number of instructions implemented per tile. 4. The method of claim 2 , wherein the partitioning is performed to balance a number of memory variables implemented per tile. 5. The method of claim 2 , wherein the partitioning is performed to balance an amount of memory usage implemented per tile. 6. The method of claim 2 , wherein each respective instruction among the instructions is assigned to a tile containing memory variables having data to be operated upon by the respective instruction. 7. The method of claim 6 , wherein the timing of execution of the instructions is determined by: selecting a current instruction for scheduling; identifying a slot in a first tile containing memory variables used by the current instruction; and searching for a clock cycle for execution of the current instruction in the slot. 8. The method of claim 7 , wherein the timing of execution of the instructions is determined further by: selecting a next instruction for scheduling, in response to a determination that a valid spoke RAM slot is found for execution of the current instruction. 9. The method of claim 7 , wherein the timing of execution of the instructions is determined further by: searching, in response to a determination that no valid spoke RAM slot is found for execution of the current instruction in the slot, for an available slot in the first tile; and searching, in response to the available slot being found, for a clock cycle for execution of the current instruction in the available slot. 10. The method of claim 9 , further comprising, in response to no available slot being found: determining a prior instruction scheduled before the current instruction; identifying a first schedule previously determined for the prior instruction as invalid; and starting determination of a second valid schedule for the prior instruction. 11. A computing device, comprising: a memory; and a microprocessor coupled with the memory and configured to: receive an assembly language program identifying data flows through memory locations represented by memory variables and identifying instructions configured to transform data in the data flows, wherein the assembly language program is configured to describe a dispatch interface, at least one memory interface, and the memory locations to be mapped to tile memories, and the assembly language program is further configured to identify processing of the data in the data flows through the at least one memory interface and the memory locations via execution of the instructions; receive a hardware profile identifying features of a coarse grained reconfigurable array having a plurality of tiles operable in parallel; generate a memory map identifying, for each respective memory variable in the assembly language program, one of the tiles that contains a memory location represented by the respective memory variable; assign, based on the memory map, the instructions to the tiles for execution; and provide an instruction execution schedule identifying timing of execution of the instructions in the tiles. 12. The computing device of claim 11 , wherein to generate the memory map, the microprocessor is further configured to: partition the memory variables and the instructions into a plurality of groups, each of the groups configured to be implemented on one of the tiles; wherein each respective instruction among the instructions is assigned to a tile containing memory variables having data to be operated upon by the respective instruction. 13. The computing device of claim 12 , wherein the groups are generated via: balancing a number of instructions implemented per tile; balancing a number of memory variables implemented per tile; or balancing an amount of memory usage implemented per tile; or any combination thereof. 14. The computing device of claim 12 , wherein the timing of execution of the instructions is determined by: selecting a current instruction for scheduling; identifying a slot in a first tile containing memory variables used by the current instruction; and searching for a clock cycle for execution of the current instruction in the slot. 15. The computing device of claim 14 , wherein the timing of execution of the instructions is determined further by: selecting a next instruction for scheduling, in response to a determination that a valid spoke RAM slot is found for execution of the current instruction. 16. The computing device of claim 14 , wherein the timing of execution of the instructions is determined further by: searching, in response to a determination that no valid spoke RAM slot is found for execution of the current instruction in the slot, for an available slot in the first tile; and searching, in response to the available slot being found, for a clock cycle for execution of the current instruction in the available slot. 17. The computing device of claim 16 , further comprising, in response to no available slot being found: determining a prior instruction scheduled before the current instruction; identifying a first schedule previously determined for the prior instruction as invalid; and starting determination of a second valid schedule for the prior instruction. 18. A non-transitory computer storage medium storing instructions which, when executed by a computing device, cause the computing device to perform a method, comprising: receiving an assembly language program identifying data flows through memory locations represented by memory variables to be mapped to tile memories and identifying opcodes configured to transform data in the data flows; receiving a hardware profile identifying features of a coarse grained reconfigurable array having a plurality of tiles operable in parallel; generating a memory map identifying, for each respective memory variable in the assembly language program, one of the tiles that contains a memory location represented by the respective memory variable; assigning, based on the memory map, the opcodes to the tiles for execution; and providing an opcode execution schedule identifying timing of execution of the o

Assignees

Inventors

Classifications

  • LOAD or STORE instructions; Clear instruction · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Instruction operation extension or modification · CPC title

  • G06F9/3836Primary

    Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12039335B2 cover?
Schedule instructions of a program for execution on a coarse grained reconfigurable array having a plurality of tiles operable in parallel. The program identifies data flows through memory locations represented by memory variables and identifies instructions configured to transform data in the data flows. Based on a hardware profile identifying features of the coarse grained reconfigurable arra…
Who is the assignee on this patent?
Micron Technology Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/30181. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jul 16 2024 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).