Method and apparatus for data access in a heterogeneous processing system with multiple processors using memory extension operation

US12169459B2 · US · B2

Patent metadata
Publication number: US-12169459-B2
Application number: US-202318099021-A
Country: US
Kind code: B2
Filing date: Jan 19, 2023
Priority date: Jan 19, 2023
Publication date: Dec 17, 2024
Grant date: Dec 17, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A heterogeneous processing system and method including a host processor, a first processor coupled to a first memory, a second processor coupled to a second memory, and switch and bus circuitry that communicatively couples the host processor, the first processor, and the second processor. The host processor is programmed to map virtual addresses of the second memory to physical addresses of the switch and bus circuitry and to configure the first processor to directly access the second memory using the mapped physical addresses according to memory extension operation. The first processor may be a reconfigurable processor, a reconfigurable dataflow unit, or a compute engine. The first processor may directly read data from or directly write data to the second memory while executing an application. The method may include configuring the first processor to directly access the second memory while executing an application for reading or writing data.
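
The mapping step in the abstract can be illustrated with a small model. The following is a minimal sketch, not the patented implementation: the class names `AddressGenerationUnit` and `SwitchAndBus`, the 4 KiB page size, and all addresses are hypothetical and chosen only for illustration. It shows a host-style setup that maps a virtual range onto a bus window backed by the second memory, after which the first processor's accesses are translated and routed directly to that memory.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # hypothetical page granularity, not taken from the patent


@dataclass
class AddressGenerationUnit:
    """Toy stand-in for the first processor's address generation unit."""
    table: dict = field(default_factory=dict)  # virtual page number -> physical page base

    def map_range(self, virt_base: int, phys_base: int, length: int) -> None:
        """Map a virtual range onto a contiguous physical range, page by page."""
        if virt_base % PAGE_SIZE or phys_base % PAGE_SIZE:
            raise ValueError("base addresses must be page aligned")
        pages = (length + PAGE_SIZE - 1) // PAGE_SIZE
        for i in range(pages):
            self.table[virt_base // PAGE_SIZE + i] = phys_base + i * PAGE_SIZE

    def translate(self, virt_addr: int) -> int:
        """Translate one virtual address to a physical bus address."""
        page, offset = divmod(virt_addr, PAGE_SIZE)
        if page not in self.table:
            raise KeyError(f"virtual address {virt_addr:#x} is not mapped")
        return self.table[page] + offset


class SwitchAndBus:
    """Toy bus: routes a physical address to whichever memory owns that window."""

    def __init__(self) -> None:
        self.windows = []  # list of (physical base, backing bytearray)

    def attach(self, base: int, memory: bytearray) -> None:
        self.windows.append((base, memory))

    def _route(self, phys: int, size: int):
        for base, mem in self.windows:
            if base <= phys and phys + size <= base + len(mem):
                return mem, phys - base
        raise ValueError(f"no memory window covers {phys:#x}..{phys + size:#x}")

    def read(self, phys: int, size: int) -> bytes:
        mem, off = self._route(phys, size)
        return bytes(mem[off:off + size])

    def write(self, phys: int, data: bytes) -> None:
        mem, off = self._route(phys, len(data))
        mem[off:off + len(data)] = data


# Host-side setup (all addresses are made up for this sketch).
second_memory = bytearray(2 * PAGE_SIZE)
bus = SwitchAndBus()
bus.attach(0x8000_0000, second_memory)

agu = AddressGenerationUnit()
agu.map_range(virt_base=0x10_0000, phys_base=0x8000_0000, length=len(second_memory))

# First-processor access: translate, then go straight through the bus.
bus.write(agu.translate(0x10_0000 + 16), b"hello")
print(bus.read(agu.translate(0x10_0000 + 16), 5))
```

In this model the host touches only the mapping table. Once the range is mapped, reads and writes issued by the first processor reach the second memory without a staging copy, which is the behavior the abstract attributes to memory extension operation.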

First claim

Opening claim text (preview).

What is claimed is:

1. A heterogeneous processing system, comprising: a host processor; a first processor coupled to a first memory, wherein the first processor comprises a reconfigurable processor that includes: an array of coarse-grained reconfigurable units comprising, an address generation unit, a plurality of memory units, and a plurality of compute units interconnected by an array-level network; a top-level network coupled to the address generation unit of the array of coarse-grained reconfigurable units; and an interface coupled between the top-level network and an external port of the first processor; a second processor coupled to a second memory; and switch and bus circuitry that communicatively couples the host processor, the external port of the first processor, and the second processor; wherein the host processor is programmed to configure the address generation unit of the array of coarse-grained reconfigurable unit in the first processor to map virtual addresses of the second memory to physical addresses of the switch and bus circuitry so that the first processor can directly access the second memory using the mapped physical addresses according to memory extension operation.

2. The heterogeneous processing system of claim 1, wherein the second processor is programmed to execute a first part of an application to generate and store first data into the second memory, and wherein the first processor is configured to directly access the first data from the second memory using the mapped physical addresses while executing a second part of the application using the first data.

3. The heterogeneous processing system of claim 2, wherein the first processor is further configured to store second data output from executing the second part of the application into the first memory.

4. The heterogeneous processing system of claim 3, further comprising: a host memory coupled to the host processor; and a data transfer resource coupled to the first processor and communicatively coupled by the switch and bus circuitry; wherein the host processor is configured to program the data transfer resource to transfer data between the second data and the host memory; and wherein the host processor is configured to prompt the data transfer resource to transfer the second data from the first memory to the host memory.

5. The heterogeneous processing system of claim 2, further comprising: a host memory coupled to the host processor; and wherein the first processor is further configured to directly access the host memory and to store second data output from executing the second part of the application directly into the host memory.

6. The heterogeneous processing system of claim 1, wherein the first processor is configured to execute a first part of an application to generate first data and to directly write the first data into the second memory using the mapped physical addresses.

7. The heterogeneous processing system of claim 6, wherein the second processor is programmed to execute a second part of the application using the first data to generate second data and to store the second data into the second memory.

8. The heterogeneous processing system of claim 1, further comprising: a host memory coupled to the host processor; and wherein the first processor is further configured to directly read first data from the host memory while executing an application using the first data to generate second data and to directly write the second data into the second memory while executing the application.

9. The heterogeneous processing system of claim 1, wherein: the first processor is programmed to execute at least a portion of a first node a dataflow graph implementing a machine learning algorithm; and the second processor is programmed to execute at least a portion of a second node the dataflow graph.

10. The heterogeneous processing system of claim 9, wherein: the second processor is further programmed to generate and store first data into the second memory; and the first processor is further programmed to directly access the first data from the second memory using mapped physical addresses.

11. The heterogeneous processing system of claim 9, further comprising a host memory coupled to the host processor, wherein the first processor is further programmed to: directly read first data from the host memory; use using the first data to generate second data; and directly write the second data into the second memory using mapped physical addresses.

12. A method of accessing data in a heterogeneous system to implement a machine learning system using a dataflow graph having a plurality of nodes connected by edges, wherein the heterogeneous system includes a host processor, a first processor coupled to a first memory, a second processor coupled to a second memory, and switch and bus circuitry that communicatively couples the host processor, the first processor, and the second processor, the method comprising: executing at least a portion of a first node of the plurality of nodes of the dataflow graph using the first processor; executing at least a portion of a second node of the plurality of nodes of the dataflow graph using the second processor; mapping, by the host processor, virtual addresses of the second memory to physical addresses of the switch and bus circuitry; configuring, by the host processor, the first processor to directly access the second memory using the mapped physical addresses according to memory extension operation; and directly accessing, by the first processor, the second memory through the switch and bus circuitry.

13. The method of claim 12, wherein the configuring the first processor comprises configuring a reconfigurable dataflow unit.

14. The method of claim 12, wherein the configuring the first processor comprises configuring a compute engine.

15. The method of claim 12, wherein the first processor comprises a reconfigurable processor that includes: an array of coarse-grained reconfigurable units comprising, an address generation unit, a plurality of memory units, and a plurality of compute units interconnected by an array-level network; a top-level network coupled to the address generation unit of the array of coarse-grained reconfigurable units; and an interface coupled between the top-level network and the switch and bus circuitry; and the configuring the first processor comprises configuring the address generation unit of the array of coarse-grained reconfigurable unit in the reconfigurable processor to map virtual addresses of the second memory to physical addresses of the switch and bus circuitry.

16. The method of claim 12, further comprising: generating and storing, by the second processor, while executing the portion of the second node, first data into the second memory; and directly accessing, by the first processor, the first data from the second memory using mapped physical addresses while executing the portion of the first node.

17. The method of claim 16, further comprising configuring the first processor to write second data generated by the portion of the first node.

18. The method of claim 16, the heterogeneous system including a host memory coupled to the host processor, further comprising configuring the first processor to directly access the host memory and to write second data output from executing the portion of the second node directly into the host memory.

19. The method of claim 12, further comprising generating, by the first processor while executing
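
The producer and consumer arrangement in claims 2 and 3 can be sketched as a toy data flow. This is an illustrative model only: the function names, the buffer sizes, and the "double each value" computation are hypothetical placeholders, and plain `bytearray` objects stand in for the two device memories. The point it shows is that the first processor reads the first data straight out of the second memory rather than from a copy staged by the host.

```python
import struct

# Plain buffers standing in for the two device memories (sizes are arbitrary).
second_memory = bytearray(64)
first_memory = bytearray(64)


def second_processor_part(values):
    """First part of the application: generate first data, store it in the second memory."""
    struct.pack_into(f"<{len(values)}i", second_memory, 0, *values)
    return len(values)


def first_processor_part(count):
    """Second part of the application, run on the first processor.

    Reads the first data directly from the second memory (no host staging copy),
    computes second data, and stores it in the first memory as in claim 3.
    """
    first_data = struct.unpack_from(f"<{count}i", second_memory, 0)
    second_data = [2 * x for x in first_data]  # placeholder computation
    struct.pack_into(f"<{count}i", first_memory, 0, *second_data)
    return second_data


count = second_processor_part([1, 2, 3])
print(first_processor_part(count))
```

Claims 4 and 5 vary only the destination of the second data: either a data transfer resource moves it from the first memory to the host memory, or the first processor writes it into the host memory directly.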

Classifications

  • Address translation · CPC title

  • Performance improvement · CPC title

  • Distributed shared memory [DSM], e.g. remote direct memory access [RDMA] · CPC title

  • Address space sharing · CPC title

  • Virtual address space management · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12169459B2 cover?
A heterogeneous processing system and method including a host processor, a first processor coupled to a first memory, a second processor coupled to a second memory, and switch and bus circuitry that communicatively couples the host processor, the first processor, and the second processor. The host processor is programmed to map virtual addresses of the second memory to physical addresses of the…
Who is the assignee on this patent?
Sambanova Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/1009. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 17, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).