End-to-end data format selection for hardware implementation of deep neural networks

US12020145B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12020145-B2
Application number: US-201816181147-A
Country: US
Kind code: B2
Filing date: Nov 5, 2018
Priority date: Nov 3, 2017
Publication date: Jun 25, 2024
Grant date: Jun 25, 2024

Abstract

Official abstract text for this publication.

Methods for selecting fixed point number formats for representing values input to and/or output from layers of a DNN which take into account the impact of the fixed point number formats for a particular layer in the context of the DNN. The methods comprise selecting the fixed point number format(s) used to represent sets of values input to and/or output from a layer one layer at a time in a predetermined sequence wherein any layer is preceded in the sequence by the layer(s) from which it depends. The fixed point number format(s) for each layer is/are selected based on the error in the output of the DNN associated with the fixed point number formats. Once the fixed point number format(s) for a layer has/have been selected any calculation of the error in the output of the DNN for a subsequent layer in the sequence is based on that layer being configured to use the selected fixed point number formats.
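The layer-by-layer procedure in the abstract can be sketched in Python as a greedy loop over a dependency-respecting layer order. This is a minimal illustration, not the patented implementation: the toy three-layer network, the names `DEPS`, `WEIGHTS`, `TEST_INPUTS`, `select_formats` and `output_error`, the 8-bit mantissa, and the choice to quantize only layer outputs are all assumptions made for the example. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

MANTISSA_BITS = 8  # assumed: every candidate format shares this mantissa width


def layer_sequence(deps):
    """Order layers so each one follows every layer it depends on."""
    return list(TopologicalSorter(deps).static_order())


def quantize(x, exponent, bits=MANTISSA_BITS):
    """Round x to a signed `bits`-bit mantissa scaled by 2**exponent (saturating)."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mantissa = max(lo, min(hi, round(x / 2.0 ** exponent)))
    return mantissa * 2.0 ** exponent


def select_formats(deps, candidates, output_error):
    """Pick one format per layer, one layer at a time, in dependency order.

    `output_error(config)` must return the network-level error when the layers
    listed in `config` use their chosen formats and all others stay in float.
    """
    config = {}
    for layer in layer_sequence(deps):
        # Try each candidate with all earlier choices already locked in.
        config[layer] = min(
            candidates, key=lambda fmt: output_error({**config, layer: fmt})
        )
    return config


# --- Hypothetical toy network: a chain of three scalar "layers" ---
DEPS = {"conv1": set(), "conv2": {"conv1"}, "fc": {"conv2"}}
WEIGHTS = {"conv1": 0.5, "conv2": 3.0, "fc": 0.25}
TEST_INPUTS = [0.3, -1.7, 2.9, 10.0]
ORDER = layer_sequence(DEPS)


def forward(x, config):
    """Run the toy network; layers present in `config` quantize their output."""
    for layer in ORDER:
        x = x * WEIGHTS[layer]
        if layer in config:
            x = quantize(x, config[layer])
    return x


def output_error(config):
    """Sum of absolute differences against the all-float baseline output."""
    return sum(abs(forward(x, config) - forward(x, {})) for x in TEST_INPUTS)


formats = select_formats(DEPS, range(-8, 3), output_error)
print(formats, output_error(formats))
```

Because each layer's format is fixed before the next layer is examined, every later error measurement already includes the quantization effects of the earlier layers, which is the property the abstract emphasizes.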

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of identifying fixed point number formats for representing values input to, and/or output from, a plurality of layers of a Deep Neural Network (DNN) for use in configuring a hardware implementation of the DNN, the method comprising: determining a sequence of the plurality of layers in which each of the plurality of layers is preceded by any layer in the plurality of layers on which it depends; receiving an instantiation of the DNN configured to represent values input to, and/or, output from the plurality of layers of the DNN using a floating point number format; identifying a first layer in the determined sequence as a current layer, and for the current layer: (a) selecting a fixed point number format for representing each of one or more sets of values input to, or output from, the current layer, wherein the fixed point number format for representing a set of values input to, or output from, the current layer is selected so as to minimize an output error of the instantiation of the DNN; and (b) reconfiguring the instantiation of the DNN to represent each of the one or more sets of values input to, or output from, the current layer using the selected fixed point number format for that set of values; subsequent to the reconfiguring, determining whether there is at least one more layer in the determined sequence following the current layer; and in response to determining that there is at least one more layer in the determined sequence following the current layer, identifying a next layer in the determined sequence as the current layer and repeating (a) and (b).

2. The method of claim 1, wherein selecting a fixed point number format for representing a set of values input to, or output from, a layer comprises: for each fixed point number format of a plurality of potential fixed point number formats: temporarily configuring the instantiation of the DNN to represent the set of values for the layer using the fixed point number format; determining an output of the temporarily configured instantiation of the DNN in response to test input data; and determining an output error of the temporarily configured instantiation of the DNN; and selecting a fixed point number format to represent the set of values input to, or output from, the layer based on the output errors associated with each of the plurality of potential fixed point number formats.

3. The method of claim 2, wherein the fixed point number format associated with the lowest output error is selected as the fixed point number format for representing the set of values.

4. The method of claim 2, wherein each potential fixed point number format comprises an exponent and a mantissa bit length and each of the plurality of potential fixed point number formats comprises a same mantissa bit length and a different exponent.

5. The method of claim 1, wherein selecting a fixed point number format for representing each of one or more sets of values input to, or output from, the layer comprises: selecting a fixed point number format for representing at least a portion of input data values for the layer that minimizes an output error in the instantiation of the DNN; reconfiguring the instantiation of the DNN to represent the at least a portion of the input data values for the layer in the selected fixed point number format; and subsequent to the reconfiguring, selecting a fixed point number format for representing at least a portion of output data values for the layer that minimizes an output error in the instantiation of the DNN.

6. The method of claim 5, wherein selecting a fixed point number format for representing each of one or more sets of values input to, or output from, the layer further comprises: selecting a fixed point number format for representing at least a portion of weights for the layer that minimizes an output error in the instantiation of the DNN; and reconfiguring the instantiation of the DNN to represent the at least a portion of the weights for the layer in the selected fixed point number format prior to selecting a fixed point number format for representing the at least a portion of output data values for the layer.

7. The method of claim 1, wherein the DNN is a classification network and the output error is a Top-1 classification accuracy of an output of the instantiation of the DNN in response to test input data.

8. The method of claim 1, wherein the DNN is a classification network and the output error is a top-5 classification accuracy of an output of the instantiation of the DNN in response to test input data.

9. The method of claim 1, wherein the DNN is a classification network and the output error is a sum of absolute differences between logits of an output of the instantiation of the DNN in response to test input data and logits of a baseline output.

10. The method of claim 1, wherein the DNN is a classification network and the output error is a sum of absolute differences between SoftMax normalised logits of an output of the instantiation of the DNN and SoftMax normalised logits of a baseline output.

11. The method of claim 10, further comprising generating the baseline output by applying the input test data to an instantiation of the DNN configured to represent values input to and/or output from the plurality of layers using a floating point number format.

12. The method of claim 1, wherein each fixed point number format comprises an exponent and a mantissa bit length.

13. The method of claim 1, further comprising outputting the selected fixed point number formats for the plurality of layers for use in configuring the hardware implementation of the DNN.

14. The method of claim 1, further comprising configuring a hardware implementation of the DNN to represent a set of values input to, or output from, at least one of the plurality of layers using the selected fixed point number format for that set of values.

15. A non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the method as set forth in claim 1.

16. A computing-based device for identifying fixed point number formats for representing values input to, and/or output from, a plurality of layers of a Deep Neural Network (DNN) for use in configuring a hardware implementation of the DNN, the computing-based device comprising: at least one processor; and memory coupled to the at least one processor, the memory comprising: an instantiation of the DNN configured to represent values input to, and/or, output from the plurality of layers of the DNN using a floating point number format; and computer readable code that when executed by the at least one processor causes the at least one processor to: determine a sequence of the plurality of layers in which each of the plurality of layers is preceded by any layer in the plurality of layers on which it depends; identify a first layer in the determined sequence as a current layer, and for the current layer: (a) select a fixed point number format from a plurality of potential fixed point number formats for representing each of one or more sets of values input to, or output from, the current layer, wherein the fixed point number format for representing a set of values input to, or output from, the layer is selected as the potential fixed point number format of the plurality of fixed point number formats that minimizes an output error of the instantiation of the DNN; and (b) reconfigure the instantiation of the DNN to represent each of the
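The format structure and the output-error measures named in the claims above can be illustrated with a short Python sketch. It is hedged throughout: the class `FixedPointFormat`, the helper names, and the two's-complement interpretation of the mantissa are assumptions for illustration, not definitions taken from the patent. Since the claims describe Top-1 and top-5 accuracy as the output error, the sketch assumes one plausible reading, namely that the quantity to minimize is one minus the accuracy.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedPointFormat:
    """A format defined by an exponent and a mantissa bit length.

    Assumption: values are mantissa * 2**exponent with a signed
    two's-complement mantissa of `mantissa_bits` bits.
    """
    exponent: int
    mantissa_bits: int

    @property
    def step(self):
        return 2.0 ** self.exponent

    @property
    def max_value(self):
        return ((1 << (self.mantissa_bits - 1)) - 1) * self.step

    @property
    def min_value(self):
        return -(1 << (self.mantissa_bits - 1)) * self.step


def candidate_formats(mantissa_bits, exponents):
    """Candidates sharing one mantissa bit length but differing in exponent."""
    return [FixedPointFormat(e, mantissa_bits) for e in exponents]


def softmax(logits):
    """Numerically stable SoftMax normalisation of one logit vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sum_abs_diff(output, baseline):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(output, baseline))


def softmax_sad(logits, baseline_logits):
    """Sum of absolute differences between SoftMax-normalised logits."""
    return sum_abs_diff(softmax(logits), softmax(baseline_logits))


def top_k_error(batch_logits, labels, k):
    """1 - top-k accuracy (assumed reading of accuracy as an error to minimize)."""
    hits = 0
    for logits, label in zip(batch_logits, labels):
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        hits += label in ranked[:k]
    return 1.0 - hits / len(labels)


fmt = FixedPointFormat(exponent=-4, mantissa_bits=8)
print(fmt.min_value, fmt.max_value, fmt.step)
```

Under these assumptions, lowering the exponent by one halves both the step size and the representable range, so sweeping the exponent at a fixed mantissa width trades saturation error against rounding error.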

Assignees

Imagination Tech Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Supervised learning · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12020145B2 cover?
Methods for selecting fixed point number formats for representing values input to and/or output from layers of a DNN which take into account the impact of the fixed point number formats for a particular layer in the context of the DNN. The methods comprise selecting the fixed point number format(s) used to represent sets of values input to and/or output from a layer one layer at a time in a pre…
Who is the assignee on this patent?
Imagination Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 25, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).