Deep neural network on field-programmable gate array

US11907828B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11907828-B2
Application number	US-201916558446-A
Country	US
Kind code	B2
Filing date	Sep 3, 2019
Priority date	Sep 3, 2019
Publication date	Feb 20, 2024
Grant date	Feb 20, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A field programmable gate array (FPGA) may be used for inference of a trained deep neural network (DNN). The trained DNN may comprise a set of parameters and the FPGA may have a first precision configuration defining first number representations of the set of parameters. The FPGA may determine different precision configurations of the trained DNN. A precision configuration of the precision configurations may define second number representations of a subset of the set of parameters. For each precision configuration of the determined precision configurations a bitstream file may be provided. The bitstream files may be stored so that the FPGA may be programmed using one of the stored bitstream files for inference of the trained DNN.
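The idea of a "precision configuration" in the abstract can be illustrated with a minimal sketch. It assumes that a number representation is a signed fixed-point format and that the "subset of parameters" is chosen per layer. The `FixedPoint` class, the `apply_precision` helper, and the layer names are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedPoint:
    """Hypothetical signed fixed-point format (not taken from the patent)."""
    total_bits: int
    frac_bits: int

    def quantize(self, x: float) -> float:
        # Scale to an integer grid, round, saturate to the representable range.
        scale = 1 << self.frac_bits
        lo = -(1 << (self.total_bits - 1))
        hi = (1 << (self.total_bits - 1)) - 1
        q = max(lo, min(hi, round(x * scale)))
        return q / scale


def apply_precision(params, config, default):
    """Quantize each layer with the format `config` assigns to it.

    Layers missing from `config` keep the `default` (first) representation,
    so `config` only has to name the subset of parameters that changes.
    """
    return {
        layer: [config.get(layer, default).quantize(w) for w in weights]
        for layer, weights in params.items()
    }


# Toy parameters for two layers (illustrative values only).
params = {"conv1": [0.7312, -1.2049], "fc": [0.0513, 2.9]}

first = FixedPoint(total_bits=16, frac_bits=8)           # first precision configuration
second = {"fc": FixedPoint(total_bits=8, frac_bits=4)}   # changes only the "fc" subset

base = apply_precision(params, {}, first)
mixed = apply_precision(params, second, first)

print(base["fc"])    # [0.05078125, 2.8984375]
print(mixed["fc"])   # [0.0625, 2.875]
```

In this sketch only the `fc` layer loses precision under the second configuration; `conv1` keeps its 16-bit values. On a real device, each such configuration would correspond to a separately compiled bitstream rather than a software function.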

First claim

Opening claim text (preview).

What is claimed is:
1. A method for inference of a trained deep neural network (DNN) using a field programmable gate array (FPGA), the trained DNN comprising a set of parameters, wherein the FPGA has a first precision configuration defining first number representations of the set of parameters, the method comprising: mapping the trained DNN to the FPGA via a configuration map, the mapping comprising: partitioning the FPGA into a static region and a reconfigurable dynamic region; partitioning the reconfigurable dynamic region into sub-regions; and mapping the trained DNN to the sub-regions of the reconfigurable dynamic region; determining different precision configurations of the trained DNN, wherein a precision configuration of the different precision configurations defines second number representations of a subset of the set of parameters; compiling a combination of layers of the trained DNN for each of the different precision configurations; providing, for each of the different precision configurations, a bitstream file for enabling programming of the FPGA, in accordance with the precision configuration; precompiling a bitstream pool with the bitstream files; storing the bitstream files; and programming the FPGA using one of the stored bitstream files for inference of the DNN and the configuration map, wherein the programming of the FPGA is performed automatically.
2. The method of claim 1, wherein the programming of the FPGA is performed using a partial reconfiguration of the FPGA.
3. The method of claim 1, wherein the programming of the FPGA is performed using a global reconfiguration of the FPGA.
4. The method of claim 1, wherein the inference of the trained DNN comprises: providing an output by the trained DNN in response to receiving an input at the trained DNN, and wherein the programming of the FPGA is automatically performed after processing a predefined number of inputs of a set of inputs.
5. The method of claim 4, further comprising: repeating the programming of the FPGA until all inputs of the set of inputs are processed.
6. The method of claim 1, wherein the programming of the FPGA is performed during the inference in response to receiving a request to change the first precision configuration of the DNN into a precision configuration of the one of the stored bitstream file.
7. The method of claim 1, wherein the programming of the FPGA is automatically performed for the inference and each other inference of the DNN.
8. The method of claim 1, wherein the programming of the FPGA is performed before the inference of the DNN.
9. The method of claim 1, wherein the programming of the FPGA is automatically performed before the inference of the DNN.
10. The method of claim 1, wherein the DNN is a convolutional neural network (CNN), wherein the CNN comprises multiple layers, wherein the inference involves a set of layer operations at each layer of the CNN, and wherein the subset of parameters are parameters used in the set of layer operations of a predefined layer of the CNN.
11. The method of claim 10, wherein each set of the set of layer operations is performed by a respective dynamically reconfigurable region of the FPGA, and wherein the programming of the FPGA comprises programming one region of the regions.
12. The method of claim 1, further comprising: partitioning the FPGA into a static and dynamic region; partitioning the dynamic region into sub-regions, wherein each sub-region is configured to perform computations using a respective subset of parameters of the set of parameters; and wherein the programming of the FPGA includes programming a sub-region that corresponds to the subset of parameters.
13. The method of claim 12, wherein the partitioning is performed by a graph partitioning algorithm.
14. A field programmable gate array (FPGA) configured for inference of a trained deep neural network (DNN), the trained DNN comprising a set of parameters, wherein the trained DNN is mapped to the FPGA via a configuration map by partitioning the FPGA into a static region and a reconfigurable region, partitioning the reconfigurable region into sub-regions, and mapping the trained DNN to the sub-regions of the reconfigurable dynamic regions, wherein the FPGA has a first precision configuration defining first number representations of the set of parameters, and wherein the FPGA has access to a precompiled bitstream pool with different stored bitstream files, wherein there is a compilation of a combination of layers of the trained DNN for each of the different precision configurations, the FPGA being configured for: accessing the different stored bitstream files of the different precision configurations; and switching from the first precision configuration to a second precision configuration using one of the stored bitstream files and the configuration map, wherein the switching of the FPGA from the first precision configuration to the second precision configuration is performed automatically.
15. The FPGA of claim 14, wherein the second precision configuration defines a second number representations of the set of parameters.
16. The FPGA of claim 15, wherein the FPGA is further configured for: selecting the one of the stored bitstream files based on the second number of representations of the set of parameters.
17. The FPGA of claim 14, wherein each of the stored bitstream files programs the FPGA with a respective, different precision configuration.
18. The FPGA of claim 17, wherein the FPGA is programmed by performing a partial reconfiguration of the FPGA.
19. The FPGA of claim 17, wherein the FPGA is programmed by performing a global reconfiguration of the FPGA.
20. The FPGA of claim 17, wherein the FPGA is programmed during the inference in response to receiving a request to change the first precision configuration into the second precision configuration of the one of the stored bitstream file.
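The flow recited in claims 1, 4, and 5 (precompile a pool of bitstreams, one per precision configuration, then reprogram automatically after a predefined number of inputs) can be sketched in software. This is a simulation under stated assumptions, not the patented implementation: the `SimulatedFpga` class, the file-naming scheme, the sub-region names, and the round-robin schedule are all hypothetical, and no real vendor toolchain or device API is called.

```python
from itertools import product


def build_bitstream_pool(sub_regions, precisions):
    """Map every per-sub-region precision combination to a bitstream file name.

    The names are placeholders; a real flow would compile each combination
    with an FPGA toolchain ahead of time.
    """
    pool = {}
    for combo in product(precisions, repeat=len(sub_regions)):
        config = tuple(zip(sub_regions, combo))
        pool[config] = "_".join(f"{r}-{p}" for r, p in config) + ".bit"
    return pool


class SimulatedFpga:
    """Stand-in for a device: it only records which bitstream is loaded."""

    def __init__(self, pool, initial_config):
        self.pool = pool
        self.loaded = None
        self.history = []
        self.program(initial_config)

    def program(self, config):
        key = tuple(config)
        bitstream = self.pool[key]  # KeyError if the config was never precompiled
        self.loaded = key
        self.history.append(bitstream)
        return bitstream


def run_inference(fpga, inputs, schedule, every):
    """Process inputs, switching to the next scheduled config after `every` inputs."""
    processed = []
    step = 0
    for count, x in enumerate(inputs, start=1):
        processed.append((x, fpga.loaded))
        if count % every == 0 and count < len(inputs):
            step += 1
            fpga.program(schedule[step % len(schedule)])
    return processed


pool = build_bitstream_pool(["conv", "fc"], ["int8", "fp16"])
schedule = [
    (("conv", "fp16"), ("fc", "fp16")),
    (("conv", "fp16"), ("fc", "int8")),
]
fpga = SimulatedFpga(pool, schedule[0])
results = run_inference(fpga, list(range(5)), schedule, every=2)

print(len(pool))      # 4
print(fpga.history)
```

With two sub-regions and two precisions the pool holds four precompiled entries, which shows why the pool grows quickly as sub-regions or precisions are added. Looking a configuration up in a dictionary stands in for selecting one of the stored bitstream files; partial versus global reconfiguration (claims 2 and 3) is not modeled here.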

Assignees

Inventors

Classifications

G06N3/0495
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/063 · Primary
using electronic means · CPC title
G06N3/04
Architecture, e.g. interconnection topology · CPC title
G06N3/08
Learning methods · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 74679846

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11907828B2 cover?: A field programmable gate array (FPGA) may be used for inference of a trained deep neural network (DNN). The trained DNN may comprise a set of parameters and the FPGA may have a first precision configuration defining first number representations of the set of parameters. The FPGA may determine different precision configurations of the trained DNN. A precision configuration of the precision conf…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?: Publication date Feb 20, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).