Multiple contexts for a compute unit in a reconfigurable data processor

US12314754B2 · US · B2

Patent metadata
Publication number: US-12314754-B2
Application number: US-202318236531-A
Country: US
Kind code: B2
Filing date: Aug 22, 2023
Priority date: Aug 23, 2022
Publication date: May 27, 2025
Grant date: May 27, 2025

Abstract

Official abstract text for this publication.

A data processing system includes a coarse-grained reconfigurable (CGR) processor and a compiler configured to generate one or more configuration files for an application for execution on the CGR processor. The CGR processor includes an array of pattern compute units (PCUs) and pattern memory units (PMUs). A PCU comprises a plurality of single-instruction multiple data (SIMD) units configurable to form a datapath. The CGR processor is coupled to configure a datapath including a SIMD, using a set of configurations bits corresponding to an operation related to the task. The CGR processor is coupled to switch among the plurality of tasks and their corresponding PCU contexts during execution of the dataflow graph. The CGR processor is coupled to switch among tasks via static switching or dynamic switching, in response to the triggering of a task complete event generated by a preset counter, indicating completion of a current task.
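
The switching behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware or any SambaNova API: `TaskCounter`, `PcuContext`, `execute`, `run_static`, and `run_dynamic` are hypothetical names, a context is reduced to a toy integer operation, and the counter is assumed to start at 0 and fire once it reaches the task's operation count.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class TaskCounter:
    """Hypothetical progress counter, preset to 0, that fires at `limit`."""
    limit: int
    value: int = 0

    def tick(self) -> bool:
        """Count one finished operation; True means 'task complete'."""
        self.value += 1
        return self.value >= self.limit


@dataclass
class PcuContext:
    """Hypothetical stand-in for one PCU context (one configured datapath)."""
    task: str
    config_bits: int            # placeholder for the datapath configuration bits
    num_ops: int                # operations per task, assumed to be >= 1
    op: Callable[[int], int]    # toy operation the configured datapath performs


def execute(contexts: Dict[str, PcuContext], start: Optional[str],
            next_task: Callable[[int], Optional[str]],
            value: int) -> Tuple[List[str], int]:
    """Run tasks in turn, switching context on each task-complete event."""
    trace: List[str] = []
    name = start
    while name is not None:
        ctx = contexts[name]                      # select the context for this task
        counter = TaskCounter(limit=ctx.num_ops)
        done = False
        while not done:
            value = ctx.op(value)                 # one operation of the current task
            done = counter.tick()                 # fires after the last operation
        trace.append(name)
        name = next_task(value)                   # switch in response to the event
    return trace, value


def run_static(contexts: Dict[str, PcuContext], sequence: Sequence[str],
               value: int) -> Tuple[List[str], int]:
    """Static switching: the task order is fixed before execution starts."""
    order = iter(sequence)
    return execute(contexts, next(order, None),
                   lambda _result: next(order, None), value)


def run_dynamic(contexts: Dict[str, PcuContext], start: str,
                choose: Callable[[int], Optional[str]],
                value: int) -> Tuple[List[str], int]:
    """Dynamic switching: `choose` picks the next task from the current result."""
    return execute(contexts, start, choose, value)
```

In this model both modes share the same execution loop and differ only in where the next task name comes from: a pre-programmed sequence for static switching, or a function of the current result for dynamic switching.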

First claim

Opening claim text (preview).

We claim as follows:

1. A data processing system comprising: a coarse-grained reconfigurable (CGR) processor including an array of CGR unit reconfigurable units including a plurality of pattern compute units (PCUs) and a plurality of pattern memory units (PMUs) configured to execute a dataflow graph, a PCU further comprising a plurality of single-instruction multiple data (SIMD) units configurable to form a datapath, wherein a PMU is coupled to the PCU via a datapath pipeline, wherein the CGR processor is coupled to receive a configuration file via a compiler, the configuration file including a plurality of tasks to be performed by the CGR processor and their respective PCU configuration data, wherein the CGR processor is coupled perform a task by configuring a datapath including a SIMD to generate a configured datapath, using a set of configurations bits corresponding to one or more operations corresponding to the task, wherein the configured datapath for the operation is identified as a PCU context, wherein the CGR processor is coupled to switch among the plurality of tasks and a plurality of PCU contexts corresponding to the plurality of tasks during execution of the dataflow graph, wherein progress of the task is tracked using a counter coupled to trigger a task complete event upon completion of a plurality of operations corresponding to the task, and wherein the CGR processor is coupled to switch from a current task to a next task, via static switching or dynamic switching, in response to the triggering of the task complete event indicating completion of the current task.

2. The system of claim 1, wherein the SIMD further includes a plurality of functional units coupled serially between an input of the datapath and an output of the datapath, wherein each functional unit represents a stage in the datapath.

3. The system of claim 2, wherein the configured datapath is coupled to receive a plurality of scalar and vector data packets as inputs and coupled to provide a plurality of scalar data packets and vector data packets as outputs.

4. The system of claim 3, wherein the plurality of functional units are coupled to perform the operation using the scalar data packets and the vector data packets and provide a result of the operation as outputs.

5. The system of claim 4, wherein the PMU is coupled to provide the inputs and store the outputs.

6. The system of claim 5, wherein each functional unit performs a part of the operation based on an input received from a previous functional unit and provides a result of the part of the operation to a next functional unit.

7. The system of claim 1, wherein in the static switching, the CGR processor receives the next task from a sequence of tasks that is pre-programmed in the configuration file.

8. The system of claim 3, wherein in the dynamic switching, the CGR processor the next task is determined during execution of the dataflow graph, based on a result of the current task.

9. The system of claim 1, wherein the counter is preset to a minimum value and coupled to trigger the task complete event upon reaching a maximum value.

10. The system of claim 1, wherein the task can be generating a mean of a plurality of data points in the dataflow graph.

11. The system of claim 1, wherein the task can be generating a variance of a plurality of data points in the dataflow graph.

12. A method for a coarse-grained reconfigurable (CGR) processor including an array of CGR unit reconfigurable units including a plurality of pattern compute units (PCUs) and a plurality of pattern memory units (PMUs) configured to execute a dataflow graph, and a PCU further comprising a plurality of functional units, a PCU further comprising a plurality of single-instruction multiple data (SIMD) units configurable to form a datapath and a PMU is coupled to the PCU via a datapath pipeline, the method comprising: receiving a configuration file via a compiler, the configuration file including a plurality of tasks to be performed by the CGR processor and their respective PCU configuration data, configurating a datapath including a SIMD to generate a configured datapath, using a set of configurations bits corresponding to one or more operations corresponding to the task, wherein the configured datapath for the operation is identified as a PCU context, switching among the plurality of tasks and a plurality of PCU contexts corresponding to the plurality of tasks during execution of the dataflow graph, triggering a task complete event by a counter upon completion of the task, tracking progress of the task by monitoring the task complete event, and switching from a current task to a next task and from a current PCU context to a next PCU context via static switching or dynamic switching, in response the triggering of the task complete event indicating completion of the current task.

13. The method of claim 12, wherein the SIMD further includes a plurality of functional units coupled serially between an input of the datapath and an output of the datapath and wherein each functional unit represents a stage in the datapath.

14. The method of claim 12, further comprising: receiving by the configured datapath, a plurality of scalar and vector data packets as inputs and providing by the configured datapath, a plurality of scalar data packets and vector data packets as outputs.

15. The method of claim 14 further comprising: perform the operation by the plurality of functional units using the scalar data packets and the vector data packets and providing a result of the operation as outputs.

16. The method of claim 15 further comprising: providing the inputs to the PMU and storing the outputs by the PMU.

17. The method of claim 15 further comprising, performing by each functional unit, a part of the operation based on an input received from a previous functional unit and providing a result of the part of the operation to a next functional unit.

18. The method of claim 12 further comprising: receiving the next task from a sequence of tasks that is pre-programmed in the configuration file in the static switching.

19. The method of claim 12 further comprising: determining the next task during execution of the dataflow graph, based on a result of the current task in the dynamic switching.

20. The method of claim 12 further comprising: presetting the counter to a minimum value and triggering the task complete event upon reaching a maximum value.
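
Claims 2, 6, 10, and 11 describe functional units chained serially as datapath stages, with mean and variance named as possible tasks. A minimal sketch of that idea in software follows. Everything in it is an assumption made for illustration: the names `configure_datapath`, `mean_context`, `variance_context`, and `mean_then_variance` are hypothetical, a stage is modeled as a plain function on a list of floats, the input is assumed to be non-empty, and the variance shown is the population variance, which the claim text does not specify.

```python
from typing import Callable, List, Sequence

Packet = List[float]                  # toy stand-in for a vector data packet
Stage = Callable[[Packet], Packet]    # one functional unit = one datapath stage


def configure_datapath(stages: Sequence[Stage]) -> Callable[[Packet], Packet]:
    """Chain stages serially: each stage consumes the previous stage's output."""
    def datapath(packet: Packet) -> Packet:
        for stage in stages:
            packet = stage(packet)
        return packet
    return datapath


def reduce_sum(values: Packet) -> Packet:
    """Stage: reduce a vector to [sum, count]."""
    return [sum(values), float(len(values))]


def divide(pair: Packet) -> Packet:
    """Stage: turn [sum, count] into [sum / count]."""
    return [pair[0] / pair[1]]


def mean_context() -> List[Stage]:
    """Hypothetical context for the 'mean' task: two stages."""
    return [reduce_sum, divide]


def variance_context(mean: float) -> List[Stage]:
    """Hypothetical context for the 'variance' task; reuses the mean result."""
    return [
        lambda v: [x - mean for x in v],   # subtract the mean
        lambda v: [x * x for x in v],      # square the deviations
        reduce_sum,
        divide,
    ]


def mean_then_variance(data: Packet) -> tuple:
    """Run the mean task, then reconfigure the same datapath model for variance."""
    mean = configure_datapath(mean_context())(data)[0]
    variance = configure_datapath(variance_context(mean))(data)[0]
    return mean, variance
```

Calling `mean_then_variance` on a list of data points runs two contexts back to back, with the second configured from the result of the first. That ordering is a choice made for this sketch, not something the claims require.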

Assignees

Sambanova Systems Inc

Inventors

Classifications

  • Performance improvement · CPC title

  • using tables or multilevel address translation means (G06F12/023 takes precedence; address translation in virtual memory systems G06F12/10) · CPC title

  • Single storage device · CPC title

  • by changing the path, e.g. traffic rerouting, path reconfiguration · CPC title

  • Improving I/O performance · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12314754B2 cover?
A data processing system includes a coarse-grained reconfigurable (CGR) processor and a compiler configured to generate one or more configuration files for an application for execution on the CGR processor. The CGR processor includes an array of pattern compute units (PCUs) and pattern memory units (PMUs). A PCU comprises a plurality of single-instruction multiple data (SIMD) units configurable…
Who is the assignee on this patent?
Sambanova Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/485. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 27, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).