Systems and methods for configuring programmable logic devices for deep learning networks
US-2020151088-A1 · May 14, 2020 · US
US11907828B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11907828-B2 |
| Application number | US-201916558446-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 3, 2019 |
| Priority date | Sep 3, 2019 |
| Publication date | Feb 20, 2024 |
| Grant date | Feb 20, 2024 |
Official abstract text for this publication.
A field programmable gate array (FPGA) may be used for inference of a trained deep neural network (DNN). The trained DNN may comprise a set of parameters and the FPGA may have a first precision configuration defining first number representations of the set of parameters. The FPGA may determine different precision configurations of the trained DNN. A precision configuration of the precision configurations may define second number representations of a subset of the set of parameters. For each precision configuration of the determined precision configurations a bitstream file may be provided. The bitstream files may be stored so that the FPGA may be programmed using one of the stored bitstream files for inference of the trained DNN.
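The bitstream pool described in the abstract can be pictured as a lookup table from precision configuration to precompiled bitstream. The following is a minimal sketch under stated assumptions, not the patented implementation: the format names (`int8`, `fp16`), the layer names, and the helpers `PrecisionConfig`, `enumerate_configs`, `compile_bitstream`, and `build_pool` are all hypothetical, and `compile_bitstream` only returns placeholder bytes where a real flow would invoke an FPGA vendor toolchain.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical number representations; the abstract does not name specific formats.
FORMATS = ("int8", "fp16")


@dataclass(frozen=True)
class PrecisionConfig:
    """One precision configuration: a number representation per layer."""
    layer_formats: tuple  # tuple of (layer_name, format) pairs


def enumerate_configs(layers, formats=FORMATS):
    """Yield one PrecisionConfig per combination of per-layer formats."""
    for combo in product(formats, repeat=len(layers)):
        yield PrecisionConfig(tuple(zip(layers, combo)))


def compile_bitstream(config):
    """Placeholder for a toolchain run; returns a label encoded as bytes."""
    label = ",".join(f"{name}:{fmt}" for name, fmt in config.layer_formats)
    return label.encode("ascii")


def build_pool(layers):
    """Precompile and store one bitstream per precision configuration."""
    return {cfg: compile_bitstream(cfg) for cfg in enumerate_configs(layers)}


# Assumed two-layer network, used only for illustration.
pool = build_pool(["conv1", "fc1"])
wanted = PrecisionConfig((("conv1", "int8"), ("fc1", "fp16")))
selected = pool[wanted]  # the bitstream that would be used to program the FPGA
```

Because every configuration is compiled ahead of time, choosing a precision at inference time reduces to a dictionary lookup. The pool grows exponentially with the number of independently configurable layers, which is one reason a real system might restrict the configurations it enumerates.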
Opening claim text (preview).
What is claimed is:

1. A method for inference of a trained deep neural network (DNN) using a field programmable gate array (FPGA), the trained DNN comprising a set of parameters, wherein the FPGA has a first precision configuration defining first number representations of the set of parameters, the method comprising: mapping the trained DNN to the FPGA via a configuration map, the mapping comprising: partitioning the FPGA into a static region and a reconfigurable dynamic region; partitioning the reconfigurable dynamic region into sub-regions; and mapping the trained DNN to the sub-regions of the reconfigurable dynamic region; determining different precision configurations of the trained DNN, wherein a precision configuration of the different precision configurations defines second number representations of a subset of the set of parameters; compiling a combination of layers of the trained DNN for each of the different precision configurations; providing, for each of the different precision configurations, a bitstream file for enabling programming of the FPGA, in accordance with the precision configuration; precompiling a bitstream pool with the bitstream files; storing the bitstream files; and programming the FPGA using one of the stored bitstream files for inference of the DNN and the configuration map, wherein the programming of the FPGA is performed automatically.
2. The method of claim 1, wherein the programming of the FPGA is performed using a partial reconfiguration of the FPGA.
3. The method of claim 1, wherein the programming of the FPGA is performed using a global reconfiguration of the FPGA.
4. The method of claim 1, wherein the inference of the trained DNN comprises: providing an output by the trained DNN in response to receiving an input at the trained DNN, and wherein the programming of the FPGA is automatically performed after processing a predefined number of inputs of a set of inputs.
5. The method of claim 4, further comprising: repeating the programming of the FPGA until all inputs of the set of inputs are processed.
6. The method of claim 1, wherein the programming of the FPGA is performed during the inference in response to receiving a request to change the first precision configuration of the DNN into a precision configuration of the one of the stored bitstream file.
7. The method of claim 1, wherein the programming of the FPGA is automatically performed for the inference and each other inference of the DNN.
8. The method of claim 1, wherein the programming of the FPGA is performed before the inference of the DNN.
9. The method of claim 1, wherein the programming of the FPGA is automatically performed before the inference of the DNN.
10. The method of claim 1, wherein the DNN is a convolutional neural network (CNN), wherein the CNN comprises multiple layers, wherein the inference involves a set of layer operations at each layer of the CNN, and wherein the subset of parameters are parameters used in the set of layer operations of a predefined layer of the CNN.
11. The method of claim 10, wherein each set of the set of layer operations is performed by a respective dynamically reconfigurable region of the FPGA, and wherein the programming of the FPGA comprises programming one region of the regions.
12. The method of claim 1, further comprising: partitioning the FPGA into a static and dynamic region; partitioning the dynamic region into sub-regions, wherein each sub-region is configured to perform computations using a respective subset of parameters of the set of parameters; and wherein the programming of the FPGA includes programming a sub-region that corresponds to the subset of parameters.
13. The method of claim 12, wherein the partitioning is performed by a graph partitioning algorithm.
14. A field programmable gate array (FPGA) configured for inference of a trained deep neural network (DNN), the trained DNN comprising a set of parameters, wherein the trained DNN is mapped to the FPGA via a configuration map by partitioning the FPGA into a static region and a reconfigurable region, partitioning the reconfigurable region into sub-regions, and mapping the trained DNN to the sub-regions of the reconfigurable dynamic regions, wherein the FPGA has a first precision configuration defining first number representations of the set of parameters, and wherein the FPGA has access to a precompiled bitstream pool with different stored bitstream files, wherein there is a compilation of a combination of layers of the trained DNN for each of the different precision configurations, the FPGA being configured for: accessing the different stored bitstream files of the different precision configurations; and switching from the first precision configuration to a second precision configuration using one of the stored bitstream files and the configuration map, wherein the switching of the FPGA from the first precision configuration to the second precision configuration is performed automatically.
15. The FPGA of claim 14, wherein the second precision configuration defines a second number representations of the set of parameters.
16. The FPGA of claim 15, wherein the FPGA is further configured for: selecting the one of the stored bitstream files based on the second number of representations of the set of parameters.
17. The FPGA of claim 14, wherein each of the stored bitstream files programs the FPGA with a respective, different precision configuration.
18. The FPGA of claim 17, wherein the FPGA is programmed by performing a partial reconfiguration of the FPGA.
19. The FPGA of claim 17, wherein the FPGA is programmed by performing a global reconfiguration of the FPGA.
20. The FPGA of claim 17, wherein the FPGA is programmed during the inference in response to receiving a request to change the first precision configuration into the second precision configuration of the one of the stored bitstream file.
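Several claimed steps combine naturally: the configuration map from layers to sub-regions (claim 1), partial reconfiguration of a single sub-region (claims 2 and 12), and automatic reprogramming after a predefined number of inputs until all inputs are processed (claims 4 and 5). The sketch below simulates that flow in software. It is an illustration under assumptions, not the claimed apparatus: `SimulatedFPGA`, `make_configuration_map`, `run_inference`, the layer names, the `.bit` file names, and the switching `schedule` are all hypothetical, and no real device or vendor API is involved.

```python
class SimulatedFPGA:
    """Toy model: one static region plus reconfigurable sub-regions."""

    def __init__(self, num_subregions):
        self.static_region = "io_and_control"  # never reprogrammed in this sketch
        self.subregions = [None] * num_subregions
        self.reconfig_count = 0

    def partial_reconfigure(self, index, bitstream):
        """Program a single sub-region and leave all others untouched."""
        self.subregions[index] = bitstream
        self.reconfig_count += 1


def make_configuration_map(layers):
    """Assign each layer to its own sub-region (hypothetical 1:1 mapping)."""
    return {layer: i for i, layer in enumerate(layers)}


def run_inference(fpga, config_map, pool, schedule, inputs, every_n):
    """Process inputs, reprogramming after every `every_n` inputs.

    Each schedule entry maps layer names to the precision to load next.
    Returns (input, loaded bitstreams) pairs in place of real outputs.
    """
    log = []
    switches = 0
    for i, x in enumerate(inputs, start=1):
        log.append((x, tuple(fpga.subregions)))
        if i % every_n == 0 and i < len(inputs):
            step = schedule[switches % len(schedule)]
            for layer, precision in step.items():
                fpga.partial_reconfigure(config_map[layer], pool[(layer, precision)])
            switches += 1
    return log


layers = ["conv1", "fc1"]
config_map = make_configuration_map(layers)
# Stored bitstream pool, keyed by (layer, precision); names are placeholders.
pool = {(l, p): f"{l}_{p}.bit" for l in layers for p in ("int8", "fp16")}

fpga = SimulatedFPGA(len(layers))
for layer in layers:  # load the first precision configuration
    fpga.partial_reconfigure(config_map[layer], pool[(layer, "fp16")])

schedule = [{"conv1": "int8"}, {"conv1": "fp16"}]
log = run_inference(fpga, config_map, pool, schedule, list(range(5)), every_n=2)
```

In this run, inputs 0 and 1 see both layers at `fp16`, inputs 2 and 3 see `conv1` switched to `int8`, and input 4 sees `conv1` back at `fp16`. Only the `conv1` sub-region is reprogrammed at each switch, which mirrors the partial-reconfiguration variant; a global reconfiguration would instead rewrite every sub-region.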
Quantised networks; Sparse networks; Compressed networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using electronic means · CPC title
Architecture, e.g. interconnection topology · CPC title
Learning methods · CPC title