Modular GPU architecture for clients and servers
US-2023109990-A1 · Apr 13, 2023 · US
US12298932B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12298932-B2 |
| Application number | US-202318200311-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 22, 2023 |
| Priority date | May 25, 2022 |
| Publication date | May 13, 2025 |
| Grant date | May 13, 2025 |
Official abstract text for this publication.
A data processing system is presented in a client-server configuration for executing first and second applications that a client in the client-server configuration can offload for execution onto the data processing system. The data processing system includes a server and a pool of reconfigurable data flow resources that is configured to execute the first application in a first runtime context and the second application in a second runtime context. The server is configured to establish a session with the client, receive first and second execution requests for executing the first application and the second application from the client, start respective first and second execution of the first and second applications in the respective first and second runtime contexts in response to receiving the first and second execution requests, and balance a first load from the first execution with a second load from the second execution.
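The flow in the abstract (establish a session, receive execution requests, start one runtime context per application) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Server`, `RuntimeContext`, `establish_session`, and `execution_request` are hypothetical, the reconfigurable data flow resources are not modeled, and load is assumed to be the number of pending requests per context.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class RuntimeContext:
    """Hypothetical stand-in for one runtime context on the resource pool."""
    app_id: str
    queue: deque = field(default_factory=deque)  # pending execution requests


class Server:
    """Illustrative server: one session per client, one context per application."""

    def __init__(self):
        self._session_ids = count(1)
        self.sessions = {}   # session id -> client name
        self.contexts = {}   # (session id, app id) -> RuntimeContext

    def establish_session(self, client):
        sid = next(self._session_ids)
        self.sessions[sid] = client
        return sid

    def execution_request(self, sid, app_id, payload):
        if sid not in self.sessions:
            raise KeyError(f"unknown session {sid}")
        # Start the runtime context on first use, then reuse it.
        ctx = self.contexts.setdefault((sid, app_id), RuntimeContext(app_id))
        ctx.queue.append(payload)
        return ctx

    def loads(self):
        # Assumption: load is measured as the pending request count.
        return {key: len(ctx.queue) for key, ctx in self.contexts.items()}


server = Server()
sid = server.establish_session("client-0")
server.execution_request(sid, "app-1", "input-a")
server.execution_request(sid, "app-2", "input-b")
server.execution_request(sid, "app-2", "input-c")
print(server.loads())  # {(1, 'app-1'): 1, (1, 'app-2'): 2}
```

A balancing step would read the values returned by `loads()` and decide which context to serve next; the abstract does not specify that policy.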
Opening claim text (preview).
What is claimed is:
1. A data processing system in a client-server configuration for executing first and second applications that a client in the client-server configuration can offload for execution onto the data processing system, comprising: a pool of reconfigurable data flow resources that comprises arrays of coarse-grained reconfigurable (CGR) units and that is configured to execute the first application in a first runtime context and the second application in a second runtime context, wherein the pool of reconfigurable data flow resources is partitionable into a predetermined number of partitions, wherein each partition of the predetermined number of partitions comprises at least one array of coarse-grained reconfigurable units; and a server in the client-server configuration that comprises a storage device and a host processor that is coupled to the storage device and to the pool of reconfigurable data flow resources, wherein the server is coupled to the client and configured to: establish a session with the client, receive a first execution request for executing the first application from the client, receive a second execution request for executing the second application from the client, in response to receiving the first execution request, start a first execution of the first application in the first runtime context, in response to receiving the second execution request, start a second execution of the second application in the second runtime context, and balance a first load from the first execution with a second load from the second execution.
2. The data processing system of claim 1, wherein each one of the first and second runtime contexts is associated with a respective request queue length, and wherein the server is further configured to balance the first load from the first execution with the second load from the second execution based on the respective request queue length.
3. The data processing system of claim 1, wherein the storage device stores first and second configuration files that are associated with the first and second applications, wherein the first and second configuration files are used for configuring the pool of reconfigurable data flow resources so that the pool of reconfigurable data flow resources is configured to execute the first and second applications.
4. The data processing system of claim 3, wherein the host processor is configured to: receive identifiers of the first and second applications, retrieve the first and second configuration files from the storage device using the identifiers of the first and second applications, and start the first and second runtime contexts using the first and second configuration files.
5. The data processing system of claim 1, wherein the server is coupled to at least one of a supercomputer, a mainframe computer, a workstation, a personal computer, or a quantum computer.
6. The data processing system of claim 1, wherein the server receives a first remote direct memory access (RDMA) connection request for a first data exchange associated with the first execution request from the client and a second RDMA connection request for a second data exchange associated with the second execution request from the client.
7. The data processing system of claim 1, wherein the pool of reconfigurable data flow resources is further configured to execute the first application in a third runtime context, wherein the server is further coupled to an additional client in the client-server configuration, and wherein the server is further configured to: establish an additional session with the additional client; receive a third execution request for executing the first application from the additional client; in response to receiving the third execution request, start a third execution of the first application in the third runtime context; and balance the first and second loads with a third load from the third execution.
8. The data processing system of claim 1, wherein the server is further configured to: spawn a first thread for handling the first execution request; and spawn a second thread that is different than the first thread for handling the second execution request.
9. The data processing system of claim 1, wherein the server is further configured to: spawn a single thread for handling the first and second execution requests.
10. The data processing system of claim 9, wherein each one of the first and second runtime contexts is associated with a respective request queue, and wherein the server is further configured to access the respective request queues in a round-robin manner.
11. A method of operating a data processing system in a client-server configuration for executing first and second applications that a client in the client-server configuration can offload for execution onto the data processing system, the data processing system comprising a pool of reconfigurable data flow resources that comprises arrays of coarse-grained reconfigurable (CGR) units, that is partitionable into a predetermined number of partitions, wherein each partition of the predetermined number of partitions comprises at least one array of coarse-grained reconfigurable units, and that is configured to execute the first application in a first runtime context and the second application in a second runtime context, and a server in the client-server configuration that comprises a storage device and a host processor that is coupled to the storage device and to the pool of reconfigurable data flow resources, wherein the server is coupled to the client, the method comprising: with the server, establishing a session with the client; receiving, with the server, a first execution request for executing the first application from the client; receiving, with the server, a second execution request for executing the second application from the client; in response to receiving the first execution request, starting a first execution of the first application in the first runtime context; in response to receiving the second execution request, starting a second execution of the second application in the second runtime context; and balancing, with the server, a first load from the first execution with a second load from the second execution.
12. The method of claim 11, wherein each one of the first and second runtime contexts is associated with a respective request queue length, and wherein balancing, with the server, the first load from the first execution with the second load from the second execution is based on the respective request queue length.
13. The method of claim 11, further comprising: receiving, with the server, a first remote direct memory access (RDMA) connection request for a first data exchange associated with the first execution request from the client and a second RDMA connection request for a second data exchange associated with the second execution request from the client.
14. The method of claim 11, wherein the pool of reconfigurable data flow resources is further configured to execute the first application in a third runtime context, and wherein the server is further coupled to an additional client in the client-server configuration, the method further comprising: with the server, establishing an additional session with the additional client; receiving, with the server, a third execution request for executing the first application from the additional client; in response to receiving the third execution request, starting a third execution of the first application in the third runtime context; and balancing the first and second loads with a third load fro
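Claims 2 and 12 tie load balancing to the request queue length of each runtime context, and claim 10 has a single thread visit the request queues in a round-robin manner. The claims do not state a concrete policy, so the sketch below is only one possible reading: `pick_longest_queue` assumes that the context with the most pending requests is served next, and `drain_round_robin` assumes that one request is taken from each non-empty queue per pass. Both function names and the context labels are hypothetical.

```python
from collections import deque


def pick_longest_queue(queues):
    """Return the context name with the longest request queue, or None if all are empty."""
    name = max(queues, key=lambda n: len(queues[n]), default=None)
    if name is None or not queues[name]:
        return None
    return name


def drain_round_robin(queues):
    """Single-threaded round-robin: take one request from each non-empty queue per pass."""
    order = []
    while any(queues.values()):  # an empty deque is falsy
        for name, q in queues.items():
            if q:
                order.append((name, q.popleft()))
    return order


queues = {
    "ctx-1": deque(["a1", "a2", "a3"]),
    "ctx-2": deque(["b1"]),
}
print(pick_longest_queue(queues))  # ctx-1
print(drain_round_robin(queues))
# [('ctx-1', 'a1'), ('ctx-2', 'b1'), ('ctx-1', 'a2'), ('ctx-1', 'a3')]
```

Round-robin keeps a long queue from starving a short one, while a queue-length policy directs service toward the most backlogged context; a real scheduler could combine the two.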
Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel · CPC title
with reconfigurable architecture · CPC title
comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title
wherein the interconnection is dynamically configurable, e.g. having loosely coupled nearest neighbor architecture (reconfigurable processors arrays G06F15/7867) · CPC title
using a plurality of independent parallel functional units · CPC title