Hierarchical mantissa bit length selection for hardware implementation of deep neural network

US12175349B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12175349-B2
Application number	US-201816180250-A
Country	US
Kind code	B2
Filing date	Nov 5, 2018
Priority date	Nov 3, 2017
Publication date	Dec 24, 2024
Grant date	Dec 24, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Hierarchical methods for selecting fixed point number formats with reduced mantissa bit lengths for representing values input to, and/or output, from, the layers of a DNN. The methods begin with one or more initial fixed point number formats for each layer. The layers are divided into subsets of layers and the mantissa bit lengths of the fixed point number formats are iteratively reduced from the initial fixed point number formats on a per subset basis. If a reduction causes the output error of the DNN to exceed an error threshold, then the reduction is discarded, and no more reductions are made to the layers of the subset. Otherwise a further reduction is made to the fixed point number formats for the layers in that subset. Once no further reductions can be made to any of the subsets the method is repeated for continually increasing numbers of subsets until a predetermined number of layers per subset is achieved.
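The search described in the abstract can be illustrated with a small sketch. This is not the patented implementation, only a minimal illustration under stated assumptions: the function names (`split_into_subsets`, `select_mantissa_bits`, `toy_output_error`) and the per-layer `SENSITIVITY` values are hypothetical, the "next lowest" mantissa bit length is assumed to be one bit fewer, the search is assumed to start with two subsets and double the count each round, and the error function is a toy stand-in for running the DNN on test input data.

```python
from typing import Callable, List, Sequence


def split_into_subsets(layers: Sequence[int], num_subsets: int) -> List[List[int]]:
    """Split an ordered layer sequence into contiguous, disjoint subsets."""
    n = len(layers)
    num_subsets = max(1, min(num_subsets, n))
    base, extra = divmod(n, num_subsets)
    subsets, start = [], 0
    for i in range(num_subsets):
        size = base + (1 if i < extra else 0)
        subsets.append(list(layers[start:start + size]))
        start += size
    return subsets


def select_mantissa_bits(
    initial_bits: Sequence[int],
    output_error: Callable[[Sequence[int]], float],
    error_threshold: float,
    min_layers_per_subset: int = 1,
    min_bits: int = 1,
) -> List[int]:
    """Hierarchically reduce per-layer mantissa bit lengths (illustrative only)."""
    bits = list(initial_bits)
    layers = list(range(len(bits)))
    min_layers_per_subset = max(1, min_layers_per_subset)
    num_subsets = 2  # assumption: start with two subsets
    while True:
        subsets = split_into_subsets(layers, num_subsets)
        for subset in subsets:
            # Keep reducing this subset until the error threshold is exceeded.
            while all(bits[i] > min_bits for i in subset):
                trial = list(bits)
                for i in subset:
                    trial[i] -= 1  # assumption: next lowest length is one bit fewer
                if output_error(trial) > error_threshold:
                    break  # discard the reduction; stop reducing this subset
                bits = trial
        if max(len(s) for s in subsets) <= min_layers_per_subset:
            return bits
        num_subsets *= 2  # assumption: twice as many subsets in the next round


# Hypothetical stand-in for evaluating the DNN: each layer contributes an
# error that doubles with every mantissa bit removed.
SENSITIVITY = [1.0, 8.0, 1.0, 1.0]


def toy_output_error(bits: Sequence[int]) -> float:
    return sum(s * 2.0 ** -b for s, b in zip(SENSITIVITY, bits))


if __name__ == "__main__":
    chosen = select_mantissa_bits([8, 8, 8, 8], toy_output_error, error_threshold=0.45)
    print(chosen, toy_output_error(chosen))
```

In this toy setup the coarse rounds remove bits from several layers at once, and the later single-layer round trims only the layers that can tolerate it; the more sensitive second layer keeps a longer mantissa than its neighbours.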

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of selecting a fixed point number format for representing values input to, and/or output from, a plurality of layers of a Deep Neural Network (DNN) for use in configuring a hardware implementation of the DNN, the method comprising: receiving an instantiation of the DNN configured to represent the values of each of the plurality of layers using one or more initial fixed point number formats for that layer, each initial fixed point number format comprising an exponent and a mantissa bit length; forming a plurality of disjoint subsets from the plurality of layers; for each subset of the plurality of subsets, iteratively adjusting the fixed point number formats for the layers in the subset to fixed point number formats with a next lowest mantissa bit length until the output error of the instantiation of the DNN exceeds an error threshold; in response to determining that the subsets comprise greater than a lower threshold number of layers, forming a higher number of disjoint subsets than the plurality of disjoint subsets from the plurality of layers and repeating the iterative adjusting; and in response to determining that the subsets comprise less than or equal to the lower threshold number of layers, outputting the fixed point number formats for the plurality of layers.

2. The method of claim 1, wherein iteratively adjusting the fixed point number formats for the layers in the subset to fixed point number formats with the next lowest mantissa bit length comprises: determining a fixed point number format with the next lowest mantissa bit length for the fixed point number formats for each layer of the subset; adjusting the fixed point number formats used by the instantiation of the DNN for each layer in the subset to the determined fixed point number formats with the next lowest mantissa bit length; determining an output of the adjusted instantiation of the DNN in response to test input data; determining an output error of the adjusted instantiation of the DNN; in response to determining that the output error exceeds the error threshold, reversing the adjustment of the instantiation of the DNN; and in response to determining that the output error does not exceed the error threshold, repeating the determining the fixed point number formats, adjusting the fixed point number formats, determining the output, and determining the output error.

3. The method of claim 1, further comprising identifying a sequence of the plurality of layers wherein each layer is preceded in the sequence by any layer of the plurality of layers on which it depends, and wherein each of the subsets comprises a contiguous set of layers in the sequence.

4. The method of claim 1, wherein the plurality of layers from which the disjoint subsets are formed do not include a first layer of the DNN and/or a last layer of the DNN.

5. The method of claim 1, wherein a first adjustment of the fixed point number formats is made for all of the subsets before a second adjustment of the fixed point number formats is made for any of the subsets.

6. The method of claim 1, wherein all iterative adjustments of the fixed point number formats for the layers in a first subset are completed before a first adjustment of the fixed point number formats for the layers in a second subset.

7. The method of claim 1, wherein there is an initial fixed point number format for input data values of at least one layer of the plurality of layers and there is an initial fixed point number format for weights of at least one layer of the plurality of layers, and iteratively adjusting the fixed point number formats for the layers in the subset to fixed point number formats with the next lowest mantissa bit length until the output error of the instantiation of the DNN exceeds the error threshold comprises: iteratively adjusting the fixed point number formats for the input data values for the layers in the subset to fixed point number formats with the next lowest mantissa bit length until the output error of the instantiation of the DNN exceeds the error threshold; and subsequent to iteratively adjusting the fixed point number formats for the input data values, iteratively adjusting the fixed point number formats for the weights for the layers in the subset to fixed point number formats with the next lowest mantissa bit length until the output error of the instantiation of the DNN exceeds the error threshold.

8. The method of claim 7, wherein there is an initial fixed point number format for output data values of at least one layer of the plurality of layers, and iteratively adjusting the fixed point number formats for the layers in the subset to a fixed point number format with the next lowest mantissa bit length until the output error of the instantiation of the DNN exceeds the error threshold further comprises: subsequent to iteratively adjusting the fixed point number formats for the input data values, iteratively adjusting the fixed point number formats for the output data values for the layers in the subset to fixed point number formats with the next lowest mantissa bit length until the output error of the instantiation of the DNN exceeds the error threshold.

9. The method of claim 1, wherein the DNN is a classification network and the output error is a top-1 classification accuracy or a top-5 classification accuracy of an output of the instantiation of the DNN in response to test input data.

10. The method of claim 1, wherein the DNN is a classification network and the output error is a sum of absolute differences between logits of an output of the instantiation of the DNN in response to test input data and logits of a baseline output or is a sum of absolute differences between SoftMax normalised logits of an output of the instantiation of the DNN in response to test input data and SoftMax normalised logits of a baseline output.

11. The method of claim 10, further comprising generating the baseline output by applying the test input data to an instantiation of the DNN configured to represent values input to and output from each layer of the DNN using a floating point number format.

12. The method of claim 1, wherein the lower threshold number of layers is one.

13. The method of claim 1, wherein the lower threshold number of layers is greater than one.

14. The method of claim 1, wherein forming a higher number of disjoint subsets from the plurality of layers comprises: dividing the layers in each subset into a plurality of disjoint subsets and/or forming twice as many disjoint subsets from the plurality of layers.

15. The method of claim 1, wherein the values input to and/or output from the plurality of layers comprise one or more of input data values, output data values, weights and biases.

16. The method of claim 1, wherein the DNN is a convolutional neural network.

17. The method of claim 1, further comprising configuring a hardware implementation of the DNN to represent values of at least one of the plurality of layers using a fixed point number format output for the at least one layer.

18. A non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the method as set forth in claim 1.

19. A hardware implementation of a Deep Neural Network (DNN) comprising: hardware logic configured to: receive input data values, a set of weights or a set of biases for a layer of the DNN; receive information indicating a fixed po
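The two output error measures named in claim 10 (sum of absolute differences between logits, and between SoftMax normalised logits, relative to a baseline output) can be written out as a short sketch. The function names below are hypothetical and chosen for illustration, and the inputs are assumed to be plain lists of logits for a single test input; how the patent aggregates the error over many test inputs is not shown in the preview text above.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable SoftMax: subtract the max logit before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sum_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute element-wise differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def logit_error(output_logits: Sequence[float], baseline_logits: Sequence[float]) -> float:
    """Error measured directly on the raw logits."""
    return sum_abs_diff(output_logits, baseline_logits)


def softmax_error(output_logits: Sequence[float], baseline_logits: Sequence[float]) -> float:
    """Error measured on SoftMax normalised logits."""
    return sum_abs_diff(softmax(output_logits), softmax(baseline_logits))


if __name__ == "__main__":
    baseline = [1.0, 2.5, 2.0]   # e.g. from a floating point instantiation (claim 11)
    quantised = [1.0, 2.0, 3.0]  # hypothetical output of a fixed point instantiation
    print(logit_error(quantised, baseline))
    print(softmax_error(quantised, baseline))
```

Because SoftMax outputs sum to one, the SoftMax-based measure is bounded between 0 and 2 for any pair of outputs, while the raw-logit measure is unbounded and scales with the magnitude of the logits.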

Assignees

Imagination Tech Ltd

Inventors

Classifications

G06N3/04 · Primary
Architecture, e.g. interconnection topology · CPC title
G06N3/0495
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06F7/49942
Significance control · CPC title
G06N3/048
Activation functions · CPC title

Patent family

Related publications grouped by family.

View patent family 60664825

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12175349B2 cover?: Hierarchical methods for selecting fixed point number formats with reduced mantissa bit lengths for representing values input to, and/or output, from, the layers of a DNN. The methods begin with one or more initial fixed point number formats for each layer. The layers are divided into subsets of layers and the mantissa bit lengths of the fixed point number formats are iteratively reduced from t…
Who is the assignee on this patent?: Imagination Tech Ltd
What technology area does this patent fall under?: Primary CPC classification G06N3/04 (Architecture, e.g. interconnection topology). Mapped technology areas include Physics (CPC section G).
When was this patent published?: Publication date Dec 24, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Hardware node having a matrix vector unit with block-floating point processing

Performing average pooling in hardware

Efficient calculations of negative curvature in a hessian free deep learning framework

Quantized neural network training and inference

Improved fixed point integer implementations for neural networks

Updating an artificial neural network using flexible fixed point representation

Bit width selection for fixed point neural networks
