Adjusting precision and topology parameters for neural network training based on a performance metric
US-2020210840-A1 · Jul 2, 2020 · US
US12020145B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12020145-B2 |
| Application number | US-201816181147-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 5, 2018 |
| Priority date | Nov 3, 2017 |
| Publication date | Jun 25, 2024 |
| Grant date | Jun 25, 2024 |
Official abstract text for this publication.
Methods for selecting fixed point number formats for representing values input to and/or output from layers of a DNN which take into account the impact of the fixed point number formats for a particular layer in the context of the DNN. The methods comprise selecting the fixed point number format(s) used to represent sets of values input to and/or output from a layer one layer at a time in a predetermined sequence wherein any layer is preceded in the sequence by the layer(s) from which it depends. The fixed point number format(s) for each layer is/are selected based on the error in the output of the DNN associated with the fixed point number formats. Once the fixed point number format(s) for a layer has/have been selected any calculation of the error in the output of the DNN for a subsequent layer in the sequence is based on that layer being configured to use the selected fixed point number formats.
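The layer-by-layer selection described in the abstract can be sketched on a toy network. This is a minimal illustration under stated assumptions, not the patented implementation: the names `quantize`, `run`, and `select_formats`, the three toy layers, the signed mantissa with clamping, the candidate exponent range, and the use of a sum of absolute differences against a floating point baseline as the output error are all hypothetical choices made for the example.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def quantize(x, exponent, bits):
    """Round x to mantissa * 2**exponent with a signed `bits`-wide mantissa (assumed layout)."""
    step = 2.0 ** exponent
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    mantissa = max(lo, min(hi, round(x / step)))
    return mantissa * step


def run(layers, deps, order, output, formats, x):
    """Evaluate the toy network; layers listed in `formats` get quantized outputs."""
    values = {}
    for name in order:
        args = [values[d] for d in deps[name]] or [x]  # source layers read the input
        out = layers[name](*args)
        if name in formats:
            exponent, bits = formats[name]
            out = [quantize(v, exponent, bits) for v in out]
        values[name] = out
    return values[output]


def select_formats(layers, deps, output, test_inputs, bits=8, exponents=range(-8, 4)):
    """Pick one exponent per layer, in dependency order, by minimizing network output error."""
    # Every layer appears after the layers it depends on.
    order = list(TopologicalSorter(deps).static_order())
    # Floating point baseline: no layer is quantized.
    baseline = [run(layers, deps, order, output, {}, x) for x in test_inputs]
    formats = {}
    for name in order:
        best = None
        for exponent in exponents:
            # Earlier layers keep their already-selected formats during the trial.
            trial = dict(formats)
            trial[name] = (exponent, bits)
            error = sum(
                abs(a - b)
                for x, ref in zip(test_inputs, baseline)
                for a, b in zip(run(layers, deps, order, output, trial, x), ref)
            )
            if best is None or error < best[0]:
                best = (error, exponent)
        formats[name] = (best[1], bits)  # fix the choice before moving on
    return order, formats


# Hypothetical toy network: "add" depends on both "scale" and "shift".
layers = {
    "scale": lambda x: [4.0 * v for v in x],
    "shift": lambda x: [v + 0.3 for v in x],
    "add": lambda a, b: [p + q for p, q in zip(a, b)],
}
deps = {"scale": [], "shift": ["scale"], "add": ["scale", "shift"]}
order, formats = select_formats(
    layers, deps, "add", [[0.1, -0.7, 1.5], [2.0, 0.25, -1.2]]
)
print(order, formats)
```

Because each trial keeps the formats already chosen for earlier layers, the error measured for a later layer reflects the quantization applied upstream, which is the property the abstract emphasizes.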
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of identifying fixed point number formats for representing values input to, and/or output from, a plurality of layers of a Deep Neural Network (DNN) for use in configuring a hardware implementation of the DNN, the method comprising: determining a sequence of the plurality of layers in which each of the plurality of layers is preceded by any layer in the plurality of layers on which it depends; receiving an instantiation of the DNN configured to represent values input to, and/or, output from the plurality of layers of the DNN using a floating point number format; identifying a first layer in the determined sequence as a current layer, and for the current layer: (a) selecting a fixed point number format for representing each of one or more sets of values input to, or output from, the current layer, wherein the fixed point number format for representing a set of values input to, or output from, the current layer is selected so as to minimize an output error of the instantiation of the DNN; and (b) reconfiguring the instantiation of the DNN to represent each of the one or more sets of values input to, or output from, the current layer using the selected fixed point number format for that set of values; subsequent to the reconfiguring, determining whether there is at least one more layer in the determined sequence following the current layer; and in response to determining that there is at least one more layer in the determined sequence following the current layer, identifying a next layer in the determined sequence as the current layer and repeating (a) and (b).
2. The method of claim 1, wherein selecting a fixed point number format for representing a set of values input to, or output from, a layer comprises: for each fixed point number format of a plurality of potential fixed point number formats: temporarily configuring the instantiation of the DNN to represent the set of values for the layer using the fixed point number format; determining an output of the temporarily configured instantiation of the DNN in response to test input data; and determining an output error of the temporarily configured instantiation of the DNN; and selecting a fixed point number format to represent the set of values input to, or output from, the layer based on the output errors associated with each of the plurality of potential fixed point number formats.
3. The method of claim 2, wherein the fixed point number format associated with the lowest output error is selected as the fixed point number format for representing the set of values.
4. The method of claim 2, wherein each potential fixed point number format comprises an exponent and a mantissa bit length and each of the plurality of potential fixed point number formats comprises a same mantissa bit length and a different exponent.
5. The method of claim 1, wherein selecting a fixed point number format for representing each of one or more sets of values input to, or output from, the layer comprises: selecting a fixed point number format for representing at least a portion of input data values for the layer that minimizes an output error in the instantiation of the DNN; reconfiguring the instantiation of the DNN to represent the at least a portion of the input data values for the layer in the selected fixed point number format; and subsequent to the reconfiguring, selecting a fixed point number format for representing at least a portion of output data values for the layer that minimizes an output error in the instantiation of the DNN.
6. The method of claim 5, wherein selecting a fixed point number format for representing each of one or more sets of values input to, or output from, the layer further comprises: selecting a fixed point number format for representing at least a portion of weights for the layer that minimizes an output error in the instantiation of the DNN; and reconfiguring the instantiation of the DNN to represent the at least a portion of the weights for the layer in the selected fixed point number format prior to selecting a fixed point number format for representing the at least a portion of output data values for the layer.
7. The method of claim 1, wherein the DNN is a classification network and the output error is a Top-1 classification accuracy of an output of the instantiation of the DNN in response to test input data.
8. The method of claim 1, wherein the DNN is a classification network and the output error is a top-5 classification accuracy of an output of the instantiation of the DNN in response to test input data.
9. The method of claim 1, wherein the DNN is a classification network and the output error is a sum of absolute differences between logits of an output of the instantiation of the DNN in response to test input data and logits of a baseline output.
10. The method of claim 1, wherein the DNN is a classification network and the output error is a sum of absolute differences between SoftMax normalised logits of an output of the instantiation of the DNN and SoftMax normalised logits of a baseline output.
11. The method of claim 10, further comprising generating the baseline output by applying the input test data to an instantiation of the DNN configured to represent values input to and/or output from the plurality of layers using a floating point number format.
12. The method of claim 1, wherein each fixed point number format comprises an exponent and a mantissa bit length.
13. The method of claim 1, further comprising outputting the selected fixed point number formats for the plurality of layers for use in configuring the hardware implementation of the DNN.
14. The method of claim 1, further comprising configuring a hardware implementation of the DNN to represent a set of values input to, or output from, at least one of the plurality of layers using the selected fixed point number format for that set of values.
15. A non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the method as set forth in claim 1.
16. A computing-based device for identifying fixed point number formats for representing values input to, and/or output from, a plurality of layers of a Deep Neural Network (DNN) for use in configuring a hardware implementation of the DNN, the computing-based device comprising: at least one processor; and memory coupled to the at least one processor, the memory comprising: an instantiation of the DNN configured to represent values input to, and/or, output from the plurality of layers of the DNN using a floating point number format; and computer readable code that when executed by the at least one processor causes the at least one processor to: determine a sequence of the plurality of layers in which each of the plurality of layers is preceded by any layer in the plurality of layers on which it depends; identify a first layer in the determined sequence as a current layer, and for the current layer: (a) select a fixed point number format from a plurality of potential fixed point number formats for representing each of one or more sets of values input to, or output from, the current layer, wherein the fixed point number format for representing a set of values input to, or output from, the layer is selected as the potential fixed point number format of the plurality of fixed point number formats that minimizes an output error of the instantiation of the DNN; and (b) reconfigure the instantiation of the DNN to represent each of the
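The output error measures named in claims 7 to 10 can be illustrated with a short sketch. The function names, the tiny three-class example data, and the idea of turning an accuracy into a quantity to minimize via `1 - accuracy` are assumptions made for this illustration; the claims themselves do not specify them.

```python
import math


def softmax(logits):
    """SoftMax normalisation, shifted by the maximum for numerical stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_accuracy(batch_logits, labels, k=1):
    """Fraction of samples whose true label is among the k largest logits."""
    hits = 0
    for logits, label in zip(batch_logits, labels):
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


def sad_logits(batch_logits, baseline_logits):
    """Sum of absolute differences between output logits and baseline logits."""
    return sum(
        abs(a - b)
        for out, ref in zip(batch_logits, baseline_logits)
        for a, b in zip(out, ref)
    )


def sad_softmax(batch_logits, baseline_logits):
    """Sum of absolute differences between SoftMax normalised logits."""
    return sum(
        abs(a - b)
        for out, ref in zip(batch_logits, baseline_logits)
        for a, b in zip(softmax(out), softmax(ref))
    )


# Hypothetical data: floating point baseline vs. a fixed point configuration.
baseline = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0]]
quantized = [[2.0, 1.0, 0.0], [0.0, 1.0, 1.5]]
labels = [0, 1]

top1_error = 1.0 - top_k_accuracy(quantized, labels, k=1)  # assumed conversion
print(top1_error, sad_logits(quantized, baseline), sad_softmax(quantized, baseline))
```

The logit-based measures compare against a baseline output rather than ground-truth labels, so they can register small numerical drift that leaves the predicted class unchanged.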
Convolutional networks [CNN, ConvNet] · CPC title
Learning methods · CPC title
Supervised learning · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title