Methods and apparatus to facilitate field-programmable gate array support during runtime execution of computer readable instructions
US-2019095229-A1 · Mar 28, 2019 · US
US11599498B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11599498-B1 |
| Application number | US-202017068697-A |
| Country | US |
| Kind code | B1 |
| Filing date | Oct 12, 2020 |
| Priority date | Apr 3, 2018 |
| Publication date | Mar 7, 2023 |
| Grant date | Mar 7, 2023 |
Official abstract text for this publication.
A device may include a processor system and an array of data processing engines (DPEs) communicatively coupled to the processor system. Each of the DPEs includes a core and a DPE interconnect. The processor system is configured to transmit configuration data to the array of DPEs, and each of the DPEs is independently configurable based on the configuration data received at the respective DPE via the DPE interconnect of the respective DPE. The array of DPEs enables, without modifying operation of a first kernel of a first subset of the DPEs of the array of DPEs, reconfiguration of a second subset of the DPEs of the array of DPEs.
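The independent configurability described in the abstract can be illustrated with a minimal Python sketch. It is a toy model, not the patented hardware: the `DPE` and `DPEArray` classes, the `mm_write` and `load_kernel` helpers, and the 16-bit offset layout are all hypothetical names and assumptions chosen for illustration. It borrows the address-based delivery of configuration data from claim 1 and shows that writing configuration to one subset of DPEs leaves another subset untouched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption for illustration only: the high bits of an address select the DPE,
# and the low 16 bits select an offset in that DPE's configuration memory space.
OFFSET_BITS = 16
OFFSET_MASK = (1 << OFFSET_BITS) - 1


@dataclass
class DPE:
    """Toy stand-in for one data processing engine."""
    config: Dict[int, int] = field(default_factory=dict)  # offset -> config word
    kernel: Optional[str] = None


class DPEArray:
    """Toy array in which every DPE is configured independently by address."""

    def __init__(self, count: int) -> None:
        self.dpes: List[DPE] = [DPE() for _ in range(count)]

    def mm_write(self, address: int, value: int) -> None:
        # Deliver one memory mapped write to the DPE selected by the address.
        index = address >> OFFSET_BITS
        offset = address & OFFSET_MASK
        self.dpes[index].config[offset] = value

    def load_kernel(self, name: str, subset: List[int], words: List[int]) -> None:
        # Configure only the DPEs in `subset`; all other DPEs are not written.
        for index in subset:
            for offset, value in enumerate(words):
                self.mm_write((index << OFFSET_BITS) | offset, value)
            self.dpes[index].kernel = name


array = DPEArray(4)
array.load_kernel("first", [0, 1], [0xA, 0xB])
before = [dict(array.dpes[i].config) for i in (0, 1)]

# Reconfigure the second subset while the first subset keeps its configuration.
array.load_kernel("second", [2, 3], [0xC, 0xD, 0xE])
after = [dict(array.dpes[i].config) for i in (0, 1)]
```

In this model, `before == after` holds because every write carries the address of exactly one DPE, so loading the second kernel never touches the configuration of the first subset.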
Opening claim text (preview).
What is claimed is:
1. A device comprising: an array of data processing engines (DPEs), wherein: each DPE of the array of DPEs comprises a core and a configuration memory space; the array of DPEs comprises a stream network and a memory mapped network; the stream network is configurable to route application data between DPEs of the array of DPEs; the memory mapped network is configured to route memory mapped transactions, based on addresses contained in the respective memory mapped transactions, among the array of DPEs and to write configuration data of the respective memory mapped transactions to respective configuration memory spaces of DPEs of the array of DPEs based on the addresses contained in the respective memory mapped transactions; each DPE of the array of DPEs is independently configurable based on configuration data received at the respective DPE via the memory mapped network; and the array of DPEs enables, without modifying operation of a first kernel loaded on a first subset of the array of DPEs, reconfiguration of a second subset of the array of DPEs by memory mapped transactions through the memory mapped network.
2. The device of claim 1, wherein each DPE of the array of DPEs includes: a stream switch of the stream network comprising a core stream interface connected to the core of the respective DPE and comprising one or more neighboring stream interfaces to respective one or more neighboring DPEs of the array of DPEs; and a memory mapped switch of the memory mapped network comprising one or more memory mapped interfaces connected to the configuration memory space of the respective DPE and comprising one or more neighboring memory mapped interfaces to respective one or more neighboring DPEs of the array of DPEs.
3. The device of claim 2, wherein, for each DPE of the array of DPEs, the configuration memory space comprises: program memory configured to store executable program code that is executable by the core of the respective DPE; and configuration registers configured to store interconnect data that configures the stream switch of the respective DPE for routing application data via the stream network.
4. The device of claim 2, wherein the stream switch is configured to be partially reconfigurable while continuing to route application data for a kernel that is not being reconfigured.
5. The device of claim 2, wherein the array of DPEs enables, after reconfiguring the second subset of the array of DPEs: a data flow to or from one of the first subset of the array of DPEs or the second subset of the array of DPEs is through the stream switch of respective one or more DPEs of the other one of the first subset of the array of DPEs or the second subset of the array of DPEs.
6. The device of claim 1, wherein the array of DPEs further comprises: a broadcast network; and event logic communicatively coupled to the broadcast network and configured to: detect a triggering event for partial reconfiguration and responsively transmit a stall signal through the broadcast network; and halt execution of a respective DPE when the triggering event is detected and when a stall signal is received from the broadcast network.
7. The device of claim 1 further comprising: a processor system; a network-on-chip coupled to the processor system; and a system interface circuit coupled to the network-on-chip and to the array of DPEs, the system interface circuit comprising tiles, each tile being coupled to a column of DPEs of the array of DPEs, the processor system being configured to transmit configuration data to the array of DPEs via the network-on-chip and the system interface circuit.
8. The device of claim 1, wherein the array of DPEs enables continued operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs do not have a shared hardware resource and (ii) no data and/or control dependency exists between the first kernel and the second kernel.
9. The device of claim 1, wherein the array of DPEs enables continued operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs have a shared hardware resource and (ii) no data and/or control dependency exists between the first kernel and the second kernel.
10. The device of claim 1, wherein the array of DPEs enables stalling operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs do not have a shared hardware resource and (ii) a data and/or control dependency exists between the first kernel and the second kernel.
11. The device of claim 1, wherein the array of DPEs enables stalling operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs have a shared hardware resource and (ii) a data and/or control dependency exists between the first kernel and the second kernel.
12. The device of claim 1, wherein the array of DPEs enables, after reconfiguring the second subset of the array of DPEs: application and/or control data generated by one of the first subset of the array of DPEs or the second subset of the array of DPEs is received and processed by the other one of the first subset of the array of DPEs or the second subset of the array of DPEs.
13. A method for operating a device, the method comprising: operating a first kernel loaded on a first subset of an array of data processing engines (DPEs), each DPE of the array of DPEs comprising a core and a configuration memory space, the array of DPEs comprising a stream network and a memory mapped network, the stream network being configurable to route application data between DPEs of the array of DPEs; without modifying operation of the first kernel on the first subset of the array of DPEs, configuring a second subset of the array of DPEs to implement a second kernel, configuring the second subset of the array of DPEs comprising: routing memory mapped transactions via the memory mapped network, based on addresses contained in the respective memory mapped transactions, to the second subset of the array of DPEs; and writing configuration data of the respective memory mapped transactions to respective configuration memory spaces of DPEs of the second subset of the array of DPEs based on the addresses contained in the respective memory mapped transactions; and operating both the first kernel loaded on the first subset of the array of DPEs and the second kernel loaded on the second subset of the array of DPEs after configuring the second subset of the array of DPEs to implement the second kernel.
14. The method of claim 13, wherein: each DPE of the DPEs includes: a stream switch of the stream network comprising a core stream interface connected to the core of the respective DPE and comprising one or more neighboring stream interfaces to respective one or more neighboring DPEs of the array of DPEs; and a memory mapped switch of the memory mapped network comprising one or more memory mapped interfaces connected to the configuration memory space of the respective DPE and comprising one or more neighboring memory mapped interfaces to respective one or more neighboring DPEs of the array of DPEs; and routing the memory mapped transactions includes routing the memory mapped transactions by one or more memor
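Claims 8 through 11 pair two conditions (shared hardware resource, data and/or control dependency) with two behaviors of the first kernel (continued operation, stalling). The Python sketch below tabulates those four combinations as one possible reading of the quoted claim text. The function name `first_kernel_action` and the `CLAIM_CASES` table are hypothetical, and because the claims say what the array "enables" rather than what it must do, this should be read as a summary aid and not as the device's control logic.

```python
def first_kernel_action(shared_resource: bool, dependency: bool) -> str:
    """Return the behavior that claims 8-11 associate with each condition pair.

    `shared_resource` is kept as a parameter to mirror the claim wording, but in
    the four quoted claims the behavior differs only with `dependency`.
    """
    return "stall" if dependency else "continue"


# claim number -> (shared hardware resource, data/control dependency, behavior)
CLAIM_CASES = {
    8: (False, False, "continue"),
    9: (True, False, "continue"),
    10: (False, True, "stall"),
    11: (True, True, "stall"),
}

# Check that the one-line rule reproduces every row of the table.
results = {
    claim: first_kernel_action(shared, dep)
    for claim, (shared, dep, _expected) in CLAIM_CASES.items()
}
```

Under this reading, the presence of a data and/or control dependency between the two kernels is what separates the stalling claims (10, 11) from the continued-operation claims (8, 9); sharing a hardware resource does not change the outcome in either pair.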
Initialisation or configuration control {(processor initialisation G06F9/4405)} · CPC title
Configuring for operating with peripheral devices; Loading of device drivers · CPC title
Bootstrapping (security arrangements therefor G06F21/57) · CPC title
Globally asynchronous, locally synchronous, e.g. network on chip · CPC title
comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title