Communication between dataflow processing units and memories
US-10564929-B2 · Feb 18, 2020 · US
US10838868B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10838868-B2 |
| Application number | US-201916295408-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 7, 2019 |
| Priority date | Mar 7, 2019 |
| Publication date | Nov 17, 2020 |
| Grant date | Nov 17, 2020 |
Official abstract text for this publication.
Embodiments for implementing a communicating memory between a plurality of computing components are provided. In one embodiment, an apparatus comprises a plurality of memory components residing on a processing chip, the plurality of memory components interconnected between a plurality of processing elements of at least one processing core of the processing chip and at least one external memory component external to the processing chip. The apparatus further comprises a plurality of load agents and a plurality of store agents on the processing chip, each interfacing with the plurality of memory components. Each of the plurality of load agents and the plurality of store agents execute an independent program specifying a destination of data transacted between the plurality of memory components, the at least one external memory component, and the plurality of processing elements.
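The arrangement in the abstract can be pictured with a small software model. The following is a minimal sketch, not the patented hardware: the `Agent` class, the tuple-based instruction format, the memory names, and the round-robin scheduler that stands in for concurrent agents are all assumptions made for illustration. It also assumes that a store agent fills the on-chip scratchpad from external memory and a load agent forwards scratchpad data to a processing element, which the abstract does not spell out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical load or store agent that runs its own small program."""
    name: str
    program: list   # each step: (source memory, source address, destination)
    pc: int = 0     # program counter private to this agent

    def done(self):
        return self.pc >= len(self.program)

    def step(self, memories, fifos):
        """Execute one instruction: move one word to the destination it names."""
        src, addr, dest = self.program[self.pc]
        value = memories[src][addr]
        if dest[0] == "fifo":
            # Destination is the input FIFO of a processing element.
            fifos[dest[1]].append(value)
        else:
            # Destination is ("mem", memory name, address).
            memories[dest[1]][dest[2]] = value
        self.pc += 1


def run(agents, memories, fifos):
    """Round-robin interleaving stands in for agents that run concurrently."""
    while not all(a.done() for a in agents):
        for a in agents:
            if not a.done():
                a.step(memories, fifos)


# External memory holds the data; the scratchpad is the on-chip memory component.
memories = {"external": {0: 10, 1: 20}, "scratchpad": {}}
fifos = {"pe0": deque()}

# Each agent has an independent program that names where its data goes.
store_agent = Agent("store0", [
    ("external", 0, ("mem", "scratchpad", 0)),
    ("external", 1, ("mem", "scratchpad", 1)),
])
load_agent = Agent("load0", [
    ("scratchpad", 0, ("fifo", "pe0")),
    ("scratchpad", 1, ("fifo", "pe0")),
])

run([store_agent, load_agent], memories, fifos)
```

In this sketch the load agent only sees valid data because the scheduler happens to run the store agent first in each round. Real concurrent agents would need explicit synchronization between them, which the dependent claims address.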
Opening claim text (preview).
The invention claimed is:
1. A method for implementing a communicating memory between a plurality of computing components, by a processor, comprising: providing a plurality of memory components residing on a processing chip, the plurality of memory components interconnected between a plurality of processing elements of at least one processing core of the processing chip and at least one external memory component external to the processing chip; providing a plurality of load agents and a plurality of store agents on the processing chip, each interfacing with the plurality of memory components; and executing, by each of the plurality of load agents and the plurality of store agents, an independent program specifying a destination of data transacted between the plurality of memory components, at least one external memory component, and the plurality of processing elements.
2. The method of claim 1, further including arranging the plurality of memory components in a hierarchy into a plurality of levels; wherein the lowest level of the plurality of levels is divided into multiple banks each accepting the data at a plurality of ports from a higher level of the plurality of levels.
3. The method of claim 1, wherein each of the plurality of load agents and the plurality of store agents concurrently communicate the data asynchronously to the destination within at least one of the plurality of processing elements, the plurality of memory components, and at least one external memory component via the interface.
4. The method of claim 3, wherein the interface comprises a First-In-First-Out (FIFO) interface or an off-chip interconnection network.
5. The method of claim 3, wherein executing the independent program further includes executing explicit synchronization instructions using a handshake operation between each of the plurality of load agents and the plurality of store agents to avoid data collisions while the data is transacted.
6. The method of claim 5, further including, pursuant to communicating the data concurrently, using an arbitration logic to handle concurrent data requests to a same destination as identified by a target identifier.
7. The method of claim 1, wherein each of the plurality of memory components comprises a scratchpad memory.
8. An apparatus for implementing a communicating memory between a plurality of computing components, comprising: a plurality of memory components residing on a processing chip, the plurality of memory components interconnected between a plurality of processing elements of at least one processing core of the processing chip and at least one external memory component external to the processing chip; and a plurality of load agents and a plurality of store agents on the processing chip, each interfacing with the plurality of memory components; wherein: each of the plurality of load agents and the plurality of store agents execute an independent program specifying a destination of data transacted between the plurality of memory components, at least one external memory component, and the plurality of processing elements.
9. The system of claim 8, wherein the plurality of memory components are arranged in a hierarchy into a plurality of levels; and wherein the lowest level of the plurality of levels is divided into multiple banks each accepting the data at a plurality of ports from a higher level of the plurality of levels.
10. The system of claim 8, wherein each of the plurality of load agents and the plurality of store agents concurrently communicate the data asynchronously to the destination within at least one of the plurality of processing elements, the plurality of memory components, and at least one external memory component via the interface.
11. The system of claim 10, wherein the interface comprises a First-In-First-Out (FIFO) interface or an off-chip interconnection network.
12. The system of claim 10, wherein executing the independent program further includes executing explicit synchronization instructions using a handshake operation between each of the plurality of load agents and the plurality of store agents to avoid data collisions while the data is transacted.
13. The system of claim 12, wherein, pursuant to communicating the data concurrently, an arbitration logic is used to handle concurrent data requests to a same destination as identified by a target identifier.
14. The system of claim 8, wherein each of the plurality of memory components comprises a scratchpad memory.
15. A computer program product for implementing a communicating memory between a plurality of computing components, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that provides communication of a plurality of memory components residing on a processing chip, the plurality of memory components interconnected between a plurality of processing elements of at least one processing core of the processing chip and at least one external memory component external to the processing chip; an executable portion that provides communication of a plurality of load agents and a plurality of store agents on the processing chip, each interfacing with the plurality of memory components; and an executable portion that causes each of the plurality of load agents and the plurality of store agents to execute an independent program specifying a destination of data transacted between the plurality of memory components, the at least one external memory component, and the plurality of processing elements.
16. The computer program product of claim 15, further including an executable portion that arranges the plurality of memory components in a hierarchy into a plurality of levels; wherein the lowest level of the plurality of levels is divided into multiple banks each accepting the data at a plurality of ports from a higher level of the plurality of levels.
17. The computer program product of claim 15, wherein each of the plurality of load agents and the plurality of store agents concurrently communicate the data asynchronously to the destination within at least one of the plurality of processing elements, the plurality of memory components, and the at least one external memory component via the interface.
18. The computer program product of claim 17, wherein the interface comprises a First-In-First-Out (FIFO) interface through at least one of an on-chip and an off-chip interconnection network.
19. The computer program product of claim 17, wherein executing the independent program further includes executing explicit synchronization instructions using a handshake operation between each of the plurality of load agents and the plurality of store agents to avoid data collisions while the data is transacted.
20. The computer program product of claim 19, further including an executable portion that, pursuant to communicating the data concurrently, uses an arbitration logic to handle concurrent data requests to a same destination as identified by a target identifier.
21. The computer program product of claim 15, wherein each of the plurality of memory components comprises a scratchpad memory.
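Two mechanisms named in the dependent claims, the handshake between load and store agents and the arbitration of concurrent requests to the same target identifier, can be illustrated in software. This is a hedged sketch only: the function names, the one-slot buffer, the use of Python threads and events as the handshake, and the fixed-priority rule (lowest agent identifier wins) are assumptions for illustration and are not taken from the patent.

```python
import queue
import threading


def arbitrate(requests):
    """Grant at most one request per target identifier per cycle.

    requests: list of (agent_id, target_id, payload) tuples.
    Fixed priority (lowest agent_id wins) is an assumption; the claims only
    say that arbitration logic handles requests to the same destination.
    """
    granted, deferred = {}, []
    for req in sorted(requests, key=lambda r: r[0]):
        target = req[1]
        if target in granted:
            deferred.append(req)   # the losing request retries in a later cycle
        else:
            granted[target] = req
    return list(granted.values()), deferred


def handshake_transfer(words):
    """A store agent fills a one-word slot; a load agent drains it after a handshake."""
    slot = [None]
    full = threading.Event()    # "data is ready" signal from store agent to load agent
    empty = threading.Event()   # "slot is free" signal from load agent to store agent
    empty.set()
    fifo = queue.Queue()        # FIFO interface toward a processing element

    def store_agent():
        for w in words:
            empty.wait()        # wait until the slot may be overwritten
            empty.clear()
            slot[0] = w
            full.set()

    def load_agent():
        for _ in words:
            full.wait()         # wait until the slot holds fresh data
            full.clear()
            fifo.put(slot[0])
            empty.set()

    threads = [threading.Thread(target=store_agent),
               threading.Thread(target=load_agent)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [fifo.get() for _ in words]


received = handshake_transfer([1, 2, 3])
granted, deferred = arbitrate([(1, "bank0", "b"), (0, "bank0", "a"), (2, "bank1", "c")])
```

The two events mean neither agent can overwrite or re-read the slot before the other has finished with it, which is the kind of data collision the handshake claims aim to avoid. The arbiter lets requests to different targets proceed together and serializes only those that name the same target.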
On-chip cache; Off-chip memory · CPC title
Intercommunication techniques · CPC title
Buffers; Shared memory; Pipes · CPC title
for multiprocessing or multitasking · CPC title
Details of cache specific to multiprocessor cache arrangements · CPC title