Device with data processing engine array that enables partial reconfiguration

US11599498B1 · US · B1

Patent metadata
Field: Value
Publication number: US-11599498-B1
Application number: US-202017068697-A
Country: US
Kind code: B1
Filing date: Oct 12, 2020
Priority date: Apr 3, 2018
Publication date: Mar 7, 2023
Grant date: Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A device may include a processor system and an array of data processing engines (DPEs) communicatively coupled to the processor system. Each of the DPEs includes a core and a DPE interconnect. The processor system is configured to transmit configuration data to the array of DPEs, and each of the DPEs is independently configurable based on the configuration data received at the respective DPE via the DPE interconnect of the respective DPE. The array of DPEs enable, without modifying operation of a first kernel of a first subset of the DPEs of the array of DPEs, reconfiguration of a second subset of the DPEs of the array of DPEs.
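The idea of independently configurable DPEs can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented hardware or any vendor API: the `DPE` and `DPEArray` classes, the `configure` method, the grid coordinates, and the kernel names are all hypothetical. It shows one property from the abstract: configuring a second subset of DPEs leaves the state of the first subset untouched.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple

Coord = Tuple[int, int]  # hypothetical (column, row) position in the array


@dataclass
class DPE:
    """Toy stand-in for one data processing engine (hypothetical model)."""
    kernel: Optional[str] = None
    config_memory: Dict[int, int] = field(default_factory=dict)
    times_configured: int = 0


class DPEArray:
    """Toy array in which each DPE is configured independently of the others."""

    def __init__(self, cols: int, rows: int) -> None:
        self.dpes: Dict[Coord, DPE] = {
            (c, r): DPE() for c in range(cols) for r in range(rows)
        }

    def configure(self, targets: Iterable[Coord], kernel: str,
                  config: Dict[int, int]) -> None:
        # Only the targeted DPEs are written; every other DPE is left alone.
        for coord in targets:
            dpe = self.dpes[coord]
            dpe.config_memory = dict(config)
            dpe.kernel = kernel
            dpe.times_configured += 1


def state(array: DPEArray, coords: Iterable[Coord]):
    """Capture kernel, configuration, and write count for a subset of DPEs."""
    return {
        c: (array.dpes[c].kernel,
            dict(array.dpes[c].config_memory),
            array.dpes[c].times_configured)
        for c in coords
    }


array = DPEArray(cols=4, rows=2)
first_subset = [(0, 0), (1, 0)]
second_subset = [(2, 0), (3, 0)]

array.configure(first_subset, "kernel_a", {0x0: 1})
before = state(array, first_subset)

# Partial reconfiguration: only the second subset is written.
array.configure(second_subset, "kernel_b", {0x0: 2})
after = state(array, first_subset)

first_kernel_untouched = before == after
print("first kernel untouched:", first_kernel_untouched)
```

The model ignores timing, routing, and shared resources; it only captures the bookkeeping view that a write aimed at one subset does not alter another.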

First claim

Opening claim text (preview).

What is claimed is:

1. A device comprising: an array of data processing engines (DPEs), wherein: each DPE of the array of DPEs comprises a core and a configuration memory space; the array of DPEs comprises a stream network and a memory mapped network; the stream network is configurable to route application data between DPEs of the array of DPEs; the memory mapped network is configured to route memory mapped transactions, based on addresses contained in the respective memory mapped transactions, among the array of DPEs and to write configuration data of the respective memory mapped transactions to respective configuration memory spaces of DPEs of the array of DPEs based on the addresses contained in the respective memory mapped transactions; each DPE of the array of DPEs is independently configurable based on configuration data received at the respective DPE via the memory mapped network; and the array of DPEs enables, without modifying operation of a first kernel loaded on a first subset of the array of DPEs, reconfiguration of a second subset of the array of DPEs by memory mapped transactions through the memory mapped network.

2. The device of claim 1, wherein each DPE of the array of DPEs includes: a stream switch of the stream network comprising a core stream interface connected to the core of the respective DPE and comprising one or more neighboring stream interfaces to respective one or more neighboring DPEs of the array of DPEs; and a memory mapped switch of the memory mapped network comprising one or more memory mapped interfaces connected to the configuration memory space of the respective DPE and comprising one or more neighboring memory mapped interfaces to respective one or more neighboring DPEs of the array of DPEs.

3. The device of claim 2, wherein, for each DPE of the array of DPEs, the configuration memory space comprises: program memory configured to store executable program code that is executable by the core of the respective DPE; and configuration registers configured to store interconnect data that configures the stream switch of the respective DPE for routing application data via the stream network.

4. The device of claim 2, wherein the stream switch is configured to be partially reconfigurable while continuing to route application data for a kernel that is not being reconfigured.

5. The device of claim 2, wherein the array of DPEs enables, after reconfiguring the second subset of the array of DPEs: a data flow to or from one of the first subset of the array of DPEs or the second subset of the array of DPEs is through the stream switch of respective one or more DPEs of the other one of the first subset of the array of DPEs or the second subset of the array of DPEs.

6. The device of claim 1, wherein the array of DPEs further comprises: a broadcast network; and event logic communicatively coupled to the broadcast network and configured to: detect a triggering event for partial reconfiguration and responsively transmit a stall signal through the broadcast network; and halt execution of a respective DPE when the triggering event is detected and when a stall signal is received from the broadcast network.

7. The device of claim 1 further comprising: a processor system; a network-on-chip coupled to the processor system; and a system interface circuit coupled to the network-on-chip and to the array of DPEs, the system interface circuit comprising tiles, each tile being coupled to a column of DPEs of the array of DPEs, the processor system being configured to transmit configuration data to the array of DPEs via the network-on-chip and the system interface circuit.

8. The device of claim 1, wherein the array of DPEs enables continued operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs do not have a shared hardware resource and (ii) no data and/or control dependency exists between the first kernel and the second kernel.

9. The device of claim 1, wherein the array of DPEs enables continued operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs have a shared hardware resource and (ii) no data and/or control dependency exists between the first kernel and the second kernel.

10. The device of claim 1, wherein the array of DPEs enables stalling operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs do not have a shared hardware resource and (ii) a data and/or control dependency exists between the first kernel and the second kernel.

11. The device of claim 1, wherein the array of DPEs enables stalling operation of the first kernel during reconfiguration of the second subset of the array of DPEs when (i) the first kernel and a second kernel loaded on the second subset of the array of DPEs have a shared hardware resource and (ii) a data and/or control dependency exists between the first kernel and the second kernel.

12. The device of claim 1, wherein the array of DPEs enables, after reconfiguring the second subset of the array of DPEs: application and/or control data generated by one of the first subset of the array of DPEs or the second subset of the array of DPEs is received and processed by the other one of the first subset of the array of DPEs or the second subset of the array of DPEs.

13. A method for operating a device, the method comprising: operating a first kernel loaded on a first subset of an array of data processing engines (DPEs), each DPE of the array of DPEs comprising a core and a configuration memory space, the array of DPEs comprising a stream network and a memory mapped network, the stream network being configurable to route application data between DPEs of the array of DPEs; without modifying operation of the first kernel on the first subset of the array of DPEs, configuring a second subset of the array of DPEs to implement a second kernel, configuring the second subset of the array of DPEs comprising: routing memory mapped transactions via the memory mapped network, based on addresses contained in the respective memory mapped transactions, to the second subset of the array of DPEs; and writing configuration data of the respective memory mapped transactions to respective configuration memory spaces of DPEs of the second subset of the array of DPEs based on the addresses contained in the respective memory mapped transactions; and operating both the first kernel loaded on the first subset of the array of DPEs and the second kernel loaded on the second subset of the array of DPEs after configuring the second subset of the array of DPEs to implement the second kernel.

14. The method of claim 13, wherein: each DPE of the DPEs includes: a stream switch of the stream network comprising a core stream interface connected to the core of the respective DPE and comprising one or more neighboring stream interfaces to respective one or more neighboring DPEs of the array of DPEs; and a memory mapped switch of the memory mapped network comprising one or more memory mapped interfaces connected to the configuration memory space of the respective DPE and comprising one or more neighboring memory mapped interfaces to respective one or more neighboring DPEs of the array of DPEs; and routing the memory mapped transactions includes routing the memory mapped transactions by one or more memor
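Claims 8 through 11 cover the four combinations of "shared hardware resource" and "data and/or control dependency" between the first and second kernels. Read together, they suggest that whether the first kernel continues or stalls tracks the dependency condition alone. The sketch below tabulates that reading; it is an interpretation for study purposes, not legal advice and not device logic. The function name `first_kernel_action` and the `CLAIM_FOR_CASE` mapping are hypothetical, and the claims say only that the array "enables" each behavior.

```python
from itertools import product


def first_kernel_action(shares_hardware: bool, has_dependency: bool) -> str:
    """Return the behavior the claims describe for the first kernel.

    Under this reading, shares_hardware does not change the outcome:
    claims 8 and 9 both describe continued operation (no dependency),
    and claims 10 and 11 both describe stalling (dependency exists).
    """
    return "stall" if has_dependency else "continue"


# (shares_hardware, has_dependency) -> dependent claim number in the preview text
CLAIM_FOR_CASE = {
    (False, False): 8,
    (True, False): 9,
    (False, True): 10,
    (True, True): 11,
}

table = {
    case: first_kernel_action(*case)
    for case in product([False, True], repeat=2)
}

for case in sorted(table, key=CLAIM_FOR_CASE.get):
    shares, dep = case
    print(f"claim {CLAIM_FOR_CASE[case]}: shared resource={shares}, "
          f"dependency={dep} -> {table[case]}")
```

Keeping the unused `shares_hardware` parameter in the signature is deliberate: it mirrors the two conditions each claim recites, which makes it easy to see that only one of them drives the result in this reading.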

Assignees

Xilinx Inc

Inventors

Classifications

  • G06F15/177 (Primary)

    Initialisation or configuration control {(processor initialisation G06F9/4405)}

  • Configuring for operating with peripheral devices; Loading of device drivers

  • Bootstrapping (security arrangements therefor G06F21/57)

  • Globally asynchronous, locally synchronous, e.g. network on chip

  • comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15})

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11599498B1 cover?
A device may include a processor system and an array of data processing engines (DPEs) communicatively coupled to the processor system. Each of the DPEs includes a core and a DPE interconnect. The processor system is configured to transmit configuration data to the array of DPEs, and each of the DPEs is independently configurable based on the configuration data received at the respective DPE vi…
Who is the assignee on this patent?
Xilinx Inc
What technology area does this patent fall under?
Primary CPC classification G06F15/177. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 7, 2023 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).