Cascaded computing for convolutional neural networks

US11556779B2 · US · B2

Patent metadata
Publication number: US-11556779-B2
Application number: US-201716335775-A
Country: US
Kind code: B2
Filing date: Sep 21, 2017
Priority date: Sep 26, 2016
Publication date: Jan 17, 2023
Grant date: Jan 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are described for efficiently reducing the amount of total computation in convolutional neural networks (CNNs) without affecting the output result or classification accuracy. Computation redundancy in CNNs is reduced by exploiting the computing nature of the convolution and subsequent pooling (e.g., sub-sampling) operations. In some implementations, the input features may be divided into a group of precision values and the operation(s) may be cascaded. A maximum may be identified (e.g., by 90% probability) using a small number of bits in the input features, and the full-precision convolution may then be performed on the maximum input. Accordingly, the total number of bits used to perform the convolution is reduced without affecting the output features or the final classification accuracy.
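The idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the 8-bit unsigned inputs, the 4-bit cheap pass, the patch values, the all-ones filter, the helper name `keep_msbs`, and the bit-count metric are all assumptions made for the example.

```python
import numpy as np

TOTAL_BITS = 8   # assumed full precision of the input features
MSBS = 4         # assumed number of leading bits used in the cheap pass

def keep_msbs(x, total_bits=TOTAL_BITS, msbs=MSBS):
    """Zero out everything except the top `msbs` bits of unsigned integers."""
    shift = total_bits - msbs
    return (x >> shift) << shift

# Four 3x3 input patches, one per position of a 2x2 pooling window (made-up values).
patches = np.stack([np.full((3, 3), v, dtype=np.int64) for v in (200, 35, 120, 90)])
kernel = np.ones((3, 3), dtype=np.int64)  # made-up 3x3 filter

# Cheap pass: convolve using only the MSBs of each input.
approx = np.array([np.sum(keep_msbs(p) * kernel) for p in patches])
winner = int(np.argmax(approx))

# Full-precision convolution only for the patch that won the cheap pass.
pooled = int(np.sum(patches[winner] * kernel))

# Reference: full-precision convolution everywhere, then max pooling.
reference = int(max(np.sum(p * kernel) for p in patches))

# Rough count of input bits read (an illustrative metric, not the patent's).
bits_cascaded = MSBS * patches.size + TOTAL_BITS * kernel.size
bits_baseline = TOTAL_BITS * patches.size

print(winner, pooled, reference, bits_cascaded, bits_baseline)
# prints: 0 1800 1800 216 288
```

In this toy case the cheap pass picks the same window position as the full-precision reference, so the pooled output is unchanged while fewer input bits are read. With signed weights or closely spaced inputs the cheap pass can tie or pick differently, which is the situation the cascaded iterations in the claims address.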

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method performed by at least one processor, the method comprising: in one or more layers of a convolutional neural network (CNN) executed by the at least one processor, performing a first iteration that includes computing a value based on a first set of most significant bits (MSBs) for each of a plurality of data sets; examining, by the at least one processor, a first set of values computed for the plurality of data sets in the first iteration to determine whether a maximum value is present among the first set of values; responsive to identifying the maximum value, performing, by the at least one processor, a full precision computation of the value for a data set, of the plurality of data sets, that exhibited the maximum value; and propagating, by the at least one processor, the full precision computation of the value to a subsequent layer of the CNN.

2. The method of claim 1, further comprising: responsive to determining that the first set of values are the same, performing, by the at least one processor, a second iteration that includes computing the value based on a second set of MSBs for each of the plurality of data sets, the second set of MSBs being larger than the first set of MSBs.

3. The method of claim 2, further comprising: examining, by the at least one processor, a second set of values computed for the plurality of data sets in the second iteration to determine whether the maximum value is present among the second set of values; and responsive to identifying the maximum value among the second set of values, performing, by the at least one processor, the full precision computation of the value for a data set, of the plurality of data sets, that exhibited the maximum value in the second iteration.

4. The method of claim 2, wherein the computing in each of the first iteration and the second iteration employs a convolution and a pooling.

5. The method of claim 4, wherein the convolution is a N×N convolution, where N is any integer.

6. The method of claim 4, wherein the pooling is a N×N pooling, where N is any integer.

7. The method of claim 4, wherein the convolution is a 3×3 convolution, and the pooling is a 2×2 pooling.

8. The method of claim 2, wherein at least one of the first iteration and the second iteration is performed with a precision less than that of the full precision computation.

9. The method of claim 8, wherein the precision is 8-bit precision.

10. The method of claim 1, wherein the CNN is employed to analyze an image.

11. The method of claim 1, wherein: the first iteration computes a value that approximates the full precision computation of the value; and the full precision computation is performed on the data set the includes less data than the plurality of data sets.

12. A system comprising: at least one processor; and memory communicatively coupled to the at least one processor, the memory storing instructions which, when executed by the at least one processor, cause the at least one processor to perform operations comprising: in one or more layers of a convolutional neural network (CNN), performing a first iteration that includes computing a value based on a first set of most significant bits (MSBs) for each of a plurality of data sets; examining a first set of values computed for the plurality of data sets in the first iteration to determine whether a maximum value is present among the first set of values; responsive to identifying the maximum value, performing a full precision computation of the value for a data set, of the plurality of data sets, that exhibited the maximum value; and propagating the full precision computation of the value to a subsequent layer of the CNN.

13. The system of claim 12, the operations further comprising: responsive to determining that the first set of values are the same, performing, by the at least one processor, a second iteration that includes computing the value based on a second set of MSBs for each of the plurality of data sets, the second set of MSBs being larger than the first set of MSBs.

14. The system of claim 13, the operations further comprising: examining, by the at least one processor, a second set of values computed for the plurality of data sets in the second iteration to determine whether the maximum value is present among the second set of values; and responsive to identifying the maximum value among the second set of values, performing, by the at least one processor, the full precision computation of the value for a data set, of the plurality of data sets, that exhibited the maximum value in the second iteration.

15. The system of claim 13, wherein the computing in each of the first iteration and the second iteration employs a convolution and a pooling.

16. The system of claim 15, wherein the convolution is a N×N convolution, where N is any integer.

17. The system of claim 15, wherein the pooling is a N×N pooling, where N is any integer.

18. The system of claim 15, wherein at least one of the first iteration and the second iteration is performed with a precision less than that of the full precision computation.

19. The system of claim 12, wherein: the first iteration computes a value that approximates the full precision computation of the value; and the full precision computation is performed on the data set the includes less data than the plurality of data sets.
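The iteration structure recited in claims 1 to 3 might be sketched as follows. This is a hedged illustration under stated assumptions, not the claimed implementation: the function name `cascaded_max_pool`, the MSB schedule `(2, 4)`, the 8-bit unsigned inputs, and the sample data are hypothetical. Two choices are also assumptions of the sketch: any tie at the top is treated as "no maximum found yet" (the claims speak of the values being the same), and a full-precision fallback is used when every reduced-precision pass ties.

```python
import numpy as np

def truncate_to_msbs(x, total_bits, msbs):
    """Keep only the top `msbs` bits of unsigned `total_bits`-bit integers."""
    shift = total_bits - msbs
    return (x >> shift) << shift

def cascaded_max_pool(patches, kernel, total_bits=8, schedule=(2, 4)):
    """Return (full_precision_value, winner_index, msbs_used).

    `patches` holds one input patch per pooling position; `schedule`
    lists the growing MSB counts tried in successive iterations.
    """
    for msbs in schedule:
        # Low-precision convolution for every pooling position.
        approx = np.array(
            [np.sum(truncate_to_msbs(p, total_bits, msbs) * kernel) for p in patches]
        )
        top = approx.max()
        if np.count_nonzero(approx == top) == 1:  # a unique maximum is present
            winner = int(np.argmax(approx))
            # Full-precision convolution only for the winning position.
            return int(np.sum(patches[winner] * kernel)), winner, msbs
    # Assumed fallback: still tied, so compute everything at full precision.
    full = np.array([np.sum(p * kernel) for p in patches])
    winner = int(np.argmax(full))
    return int(full[winner]), winner, total_bits

kernel = np.ones((3, 3), dtype=np.int64)  # made-up 3x3 filter

def make_patches(values):
    """Build constant 3x3 patches from made-up 8-bit values."""
    return np.stack([np.full((3, 3), v, dtype=np.int64) for v in values])

easy = cascaded_max_pool(make_patches((200, 35, 120, 90)), kernel)
tied = cascaded_max_pool(make_patches((200, 210, 120, 90)), kernel)
print(easy)  # (1800, 0, 2)
print(tied)  # (1890, 1, 4)
```

In the first call the 2-bit pass already separates the largest patch, so only one full-precision convolution follows. In the second call the two largest patches look identical at 2 bits (both truncate to 192), so the sketch moves on to a 4-bit pass before committing to a winner.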

Assignees

Inventors

Classifications

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • using non-contact-making devices, e.g. tube, solid state device; using unspecified devices · CPC title

  • Physics · mapped topic

  • G06N3/08 · Primary

    Learning methods · CPC title

  • G06N3/0464 · Primary

    Convolutional networks [CNN, ConvNet] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11556779B2 cover?
Techniques are described for efficiently reducing the amount of total computation in convolutional neural networks (CNNs) without affecting the output result or classification accuracy. Computation redundancy in CNNs is reduced by exploiting the computing nature of the convolution and subsequent pooling (e.g., sub-sampling) operations. In some implementations, the input features may be divided …
Who is the assignee on this patent?
Univ Arizona State, Seo Jae Sun, Kim Minkyu
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 17, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).