Deterministic memory for tensor streaming processors
US-2023024670-A1 · Jan 26, 2023 · US
US12561279B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12561279-B2 |
| Application number | US-202418731952-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 3, 2024 |
| Priority date | Jul 7, 2021 |
| Publication date | Feb 24, 2026 |
| Grant date | Feb 24, 2026 |
Official abstract text for this publication.
Embodiments are directed to a deterministic streaming system with one or more deterministic streaming processors each having an array of processing elements and a first deterministic memory coupled to the processing elements. The deterministic streaming system further includes a second deterministic memory with multiple data banks having a global memory address space, and a controller. The controller initiates retrieval of first data from the data banks of the second deterministic memory as a first plurality of streams, each stream of the first plurality of streams streaming toward a respective group of processing elements of the array of processing elements. The controller further initiates writing of second data to the data banks of the second deterministic memory as a second plurality of streams, each stream of the second plurality of streams streaming from the respective group of processing elements toward a respective data bank of the second deterministic memory.
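The data flow described in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the patented implementation: the class names `BankedMemory` and `Controller`, the low-order address interleaving across banks, and the choice of one processing-element group per data bank are all hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class BankedMemory:
    """Toy stand-in for the second memory: banks sharing one global address space."""
    num_banks: int
    words_per_bank: int
    banks: list = field(init=False)

    def __post_init__(self):
        self.banks = [[0] * self.words_per_bank for _ in range(self.num_banks)]

    def locate(self, addr):
        # ASSUMPTION: global addresses are interleaved across banks (low-order bits).
        return addr % self.num_banks, addr // self.num_banks

    def read(self, addr):
        bank, offset = self.locate(addr)
        return self.banks[bank][offset]

    def write(self, addr, value):
        bank, offset = self.locate(addr)
        self.banks[bank][offset] = value


class Controller:
    """Hypothetical controller: one stream per bank, one PE group per stream."""

    def __init__(self, memory):
        self.memory = memory

    def read_streams(self, base, length):
        # Stream g carries the words held in bank g (base must be bank-aligned).
        b = self.memory.num_banks
        assert base % b == 0, "base must be a multiple of the bank count"
        return [[self.memory.read(base + g + k * b) for k in range(length)]
                for g in range(b)]

    def write_streams(self, base, streams):
        # Stream g flows from PE group g back toward bank g.
        b = self.memory.num_banks
        assert base % b == 0, "base must be a multiple of the bank count"
        for g, stream in enumerate(streams):
            for k, value in enumerate(stream):
                self.memory.write(base + g + k * b, value)


mem = BankedMemory(num_banks=4, words_per_bank=8)
ctrl = Controller(mem)
for addr in range(16):
    mem.write(addr, addr)

# First plurality of streams: memory banks toward PE groups.
inbound = ctrl.read_streams(base=0, length=4)

# Each PE group doubles its operands (a placeholder for real computation).
outbound = [[2 * x for x in stream] for stream in inbound]

# Second plurality of streams: PE groups back toward their data banks.
ctrl.write_streams(base=16, streams=outbound)
result = [mem.read(a) for a in range(16, 32)]
```

In this model every access pattern is fixed by `base` and `length`, so which bank serves which stream is known before execution; that is only a loose analogy to the deterministic behaviour the abstract describes.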
Opening claim text (preview).
What is claimed is:
1. A system comprising: a deterministic processor comprising: a plurality of processing elements, and a first memory communicatively coupled to the plurality of processing elements; a second memory communicatively coupled with the plurality of processing elements, the second memory comprising a plurality of data banks having a global memory address space for the deterministic processor; and a controller communicatively coupled with the second memory, the controller configured to: facilitate retrieval of first data from the plurality of data banks as a first plurality of streams, wherein respective first streams of the first plurality of streams are configured to stream toward a respective group of processing elements of the plurality of processing elements, and facilitate writing of second data to the plurality of data banks as a second plurality of streams, wherein respective second streams of the second plurality of streams are configured to stream from the respective group of processing elements toward respective data banks of the plurality of data banks; wherein the controller is further configured to facilitate global visibility of a write request at a memory address.
2. The system of claim 1, wherein the second memory further comprises global memory address spaces, including the global memory address space, for one or more deterministic processors, including the deterministic processor, of the system.
3. The system of claim 1, further comprising: a compiler configured to manage initiation of retrieval of the first data from the second memory at a defined first time.
4. The system of claim 3, wherein the defined first time is selected based on a determination that the first data from the second memory are timely placed on the first plurality of streams.
5. The system of claim 3, wherein the compiler is further configured to manage initiation of writing of the second data to the second memory at a defined second time.
6. The system of claim 1, wherein the deterministic processor comprises a plurality of functional units, wherein functional units of the plurality of functional units operate independently and comprise respective instruction control units that execute instructions, in order, from instruction buffers.
7. The system of claim 1, wherein, based on the global visibility and a subsequent request for retrieval of data from the memory address, the controller is further configured to retrieve a latest value written in the memory address.
8. The system of claim 1, further comprising: a plurality of deterministic processors, including the deterministic processor, organized as a node of the system.
9. The system of claim 8, wherein the controller is further configured to: assign respective deterministic processors of the plurality of deterministic processors to a respective pair of pseudo channels of a plurality of pseudo channels of the second memory; and initiate streaming of data between the respective deterministic processors and a respective data bank of the plurality of data banks of the second memory associated with the respective pair of pseudo channels.
10. The system of claim 8, wherein a plurality of nodes, including the node, are organized as a rack of the system.
11. The system of claim 10, wherein the controller is further configured to: assign respective nodes of the plurality of nodes to a respective pair of pseudo channels of a plurality of pseudo channels of the second memory; and initiate streaming of data between the respective nodes and a respective data bank of the plurality of data banks of the second memory associated with the respective pair of pseudo channels.
12. The system of claim 1, wherein the system further comprises a plurality of racks, wherein respective racks of the plurality of racks comprise a plurality of nodes, and wherein respective nodes of the plurality of nodes comprise a plurality of deterministic processors, including the deterministic processor; and the controller is further configured to: assign each of the racks to a respective pair of pseudo channels of a plurality of pseudo channels of the second memory, and initiate streaming of data between each of the racks and a respective data bank of the plurality of data banks of the second memory associated with the respective pair of pseudo channels.
13. The system of claim 1, wherein the first memory comprises a static memory and the second memory comprises a dynamic memory.
14. The system of claim 13, wherein the dynamic memory comprises one or more three-dimensional stacks of a high bandwidth memory device.
15. A method, comprising: facilitating, by a controller of a deterministic processor, retrieval of first data from a plurality of data banks of a first memory as a first plurality of streams, wherein respective first streams of the first plurality of streams are configured to stream toward a respective group of processing elements of an array of processing elements of a second memory; facilitating, by the controller, writing of second data to the plurality of data banks as a second plurality of streams, wherein respective second streams of the second plurality of streams are configured to stream from the respective group of processing elements toward respective data banks of the plurality of data banks; and facilitating, by the controller, global visibility of a write request at a memory address.
16. The method of claim 15, further comprising: initiating, by a compiler associated with the deterministic processor, first data from the second memory at a defined first time, wherein the defined first time is selected based on a determination that the first data from the second memory are timely placed on the first plurality of streams; and managing, by the compiler, initiation of writing of the second data to the second memory at a defined second time.
17. The method of claim 15, further comprising: receiving, by the controller, a subsequent request for the memory address; and based on the global visibility and the subsequent request, retrieving, by the controller, a latest value written in the memory address.
18. The method of claim 15, wherein the first memory comprises a dynamic memory and the second memory comprises a static memory, wherein the dynamic memory comprises one or more three-dimensional stacks of a high bandwidth memory device.
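Three ideas from the claims lend themselves to a short illustration: the pairing of clients with pseudo channels (claims 9, 11, and 12), accesses issued at times fixed in advance (claims 3 to 5 and 16), and a read returning the latest globally visible write (claims 7 and 17). The sketch below is an assumption-laden toy, not the claimed design: the function names, the "client c gets channels 2c and 2c+1" rule, the tuple-based schedule format, and the figure of 16 pseudo channels are all hypothetical.

```python
def assign_pseudo_channel_pairs(num_clients, num_pseudo_channels):
    """Give each client (processor, node, or rack) its own pair of pseudo channels.

    ASSUMPTION: pairs are handed out contiguously; the claims do not say how.
    """
    if 2 * num_clients > num_pseudo_channels:
        raise ValueError("not enough pseudo channels for one pair per client")
    return {c: (2 * c, 2 * c + 1) for c in range(num_clients)}


class GlobalMemory:
    """Toy global address space: a write is visible to every later reader."""

    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.cells.get(addr)


def run_schedule(schedule, memory):
    """Replay (cycle, client, op, addr, value) entries in cycle order.

    The schedule stands in for compiler-chosen issue times; nothing is
    decided at run time, so the outcome is the same on every replay.
    """
    log = []
    for cycle, client, op, addr, value in sorted(schedule, key=lambda e: e[0]):
        if op == "write":
            memory.write(addr, value)
        else:
            log.append((cycle, client, memory.read(addr)))
    return log


# Four clients sharing a memory with an assumed 16 pseudo channels.
pairs = assign_pseudo_channel_pairs(num_clients=4, num_pseudo_channels=16)

# Client 0 writes address 0x40 twice; client 1 reads it afterwards.
schedule = [
    (5, 1, "read", 0x40, None),
    (2, 0, "write", 0x40, 7),
    (4, 0, "write", 0x40, 9),
]
log = run_schedule(schedule, GlobalMemory())
```

Here `log` records that the read at cycle 5 by client 1 observes the value 9, the most recent write to that address, even though a different client performed the write.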
with multidimensional access, e.g. row/column, matrix · CPC title
with a network or matrix configuration · CPC title
Machine learning · CPC title
Vector or matrix data · CPC title
Free address space management · CPC title