Neural network suppression

US2016358069A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016358069-A1
Application number  US-201615099109-A
Country             US
Kind code           A1
Filing date         Apr 14, 2016
Priority date       Jun 3, 2015
Publication date    Dec 8, 2016
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementing a neural network includes determining whether to process a combination of a first region of an input feature map and a first region of a convolution kernel and, responsive to determining to process the combination, performing a convolution operation on the first region of the input feature map using the first region of the convolution kernel to generate at least a portion of an output feature map.
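The decision step in the abstract can be illustrated with a small software sketch. This is not the patented hardware design. It assumes, for illustration only, that a "region" of the input feature map is one kernel-sized sliding window and that the "region" of the kernel is the whole kernel. The function names `region_is_zero` and `conv2d_with_skipping` are hypothetical.

```python
def region_is_zero(values):
    """Return True when every value in the flat region is zero."""
    return all(v == 0 for v in values)


def conv2d_with_skipping(feature_map, kernel):
    """Valid 2-D convolution (cross-correlation form) that skips zero combinations.

    Assumption: a feature-map region is one kernel-sized window, and the
    kernel region is the whole kernel. Returns (output, processed, skipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    fh, fw = len(feature_map), len(feature_map[0])
    out_h, out_w = fh - kh + 1, fw - kw + 1
    output = [[0] * out_w for _ in range(out_h)]

    flat_kernel = [w for row in kernel for w in row]
    kernel_zero = region_is_zero(flat_kernel)

    processed = 0
    skipped = 0
    for r in range(out_h):
        for c in range(out_w):
            window = [feature_map[r + i][c + j]
                      for i in range(kh) for j in range(kw)]
            # Decide whether to process this combination at all.
            if kernel_zero or region_is_zero(window):
                skipped += 1  # the output stays 0, no multiplies are issued
                continue
            output[r][c] = sum(x * w for x, w in zip(window, flat_kernel))
            processed += 1
    return output, processed, skipped


feature_map = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
]
kernel = [[1, 0], [0, 1]]
output, processed, skipped = conv2d_with_skipping(feature_map, kernel)
print(output, processed, skipped)
```

Skipping leaves the result unchanged because an all-zero window or an all-zero kernel contributes exactly zero to the output; only the amount of work differs.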

First claim

Opening claim text (preview).

What is claimed is:

1. A method of implementing a neural network, the method comprising: determining whether to process a combination of a first region of an input feature map and a first region of a convolution kernel; and responsive to determining to process the combination, performing a convolution operation on the first region of the input feature map using the first region of the convolution kernel to generate at least a portion of an output feature map.

2. The method of claim 1, wherein the determining whether to process the combination of the first region of the input feature map and the first region of the convolution kernel comprises: identifying a non-zero value in the first region of the input feature map and a non-zero value in the first region of the convolution kernel.

3. The method of claim 1, wherein the determining whether to process the combination of the first region of the input feature map and the first region of the convolution kernel comprises: generating a mask indicating zero and non-zero portions of at least one of the first region of the input feature map or the first region of the convolution kernel.

4. The method of claim 3, wherein the mask is generated responsive to reading the first region of the input feature map or the first region of the convolution kernel from a memory.

5. The method of claim 3, wherein a first mask is generated indicating zero and non-zero portions of the first region of the input feature map and a second mask is generated indicating zero and non-zero portions of the first region of the convolution kernel, the method further comprising: comparing the first mask and the second mask.

6. The method of claim 1, further comprising: skipping a convolution operation on a further combination of a second region of the input feature map and a second region of the convolution kernel responsive to determining that the second region of the input feature map includes all zero values or responsive to determining that the second region of the convolution kernel includes all zero values.

7. The method of claim 1, wherein the determining whether to process the combination of the first region of the input feature map and the performing the convolution operation on the first region of the input feature map are implemented by each of a plurality of data paths operating independently and concurrently, and wherein each data path operates on a different input feature map using a different convolution kernel each applied over a variable number of cycles according to sparsity of convolution kernel data and input data.

8. An apparatus for implementing a neural network, the apparatus comprising: a fetch circuit configured to retrieve regions of input feature maps from a memory under control of a control circuit; a mask generation and weight application control circuit configured to determine whether to process a combination of a first region of an input feature map and a first region of a convolution kernel; and a convolution circuit configured to perform a convolution operation on the combination responsive to the determination by the mask generation and weight application control circuit to process the combination.

9. The apparatus of claim 8, further comprising: an accumulation circuit configured to sum outputs from the convolution circuit; an activation circuit configured to apply an activation function to the summed outputs of the accumulation circuit; and a pooling and sub-sampling circuit coupled to the activation circuit.

10. The apparatus of claim 8, wherein the mask generation and weight application control circuit is configured to determine that the combination is to be processed responsive to determining that the first region of the input feature map includes a value other than zero and that the first region of the convolution kernel includes a value other than zero.

11. The apparatus of claim 10, wherein the mask generation and weight application control circuit is further configured to generate a mask indicating zero and non-zero portions of at least one of the first region of the input feature map or the first region of the convolution kernel.

12. The apparatus of claim 11, wherein the mask is generated responsive to reading the first region of the input feature map or the first region of the convolution kernel from a memory.

13. The apparatus of claim 11, wherein a first mask is generated indicating zero and non-zero portions of the first region of the input feature map and a second mask is generated indicating zero and non-zero portions of the first region of the convolution kernel, wherein the mask generation and weight application control circuit is further configured to compare the first mask and the second mask.

14. The apparatus of claim 8, wherein the mask generation and weight application control circuit is further configured to determine to skip convolution processing of a further combination of a second region of the input feature map and a second region of the convolution kernel responsive to determining that the second region of the input feature map includes all zero values or that the second region of the convolution kernel includes all zero values.

15. An apparatus for implementing a neural network, the apparatus comprising: a weight processing circuit configured to determine whether weights to be applied to regions of input data are zero; a data staging circuit configured to determine whether the regions of the input data are zero; and a multiply-accumulate circuit configured to apply only non-zero weights to the regions of the input data that are non-zero.

16. The apparatus of claim 15, wherein: responsive to determining that a selected weight is zero, the weight processing circuit is configured not to output the selected weight to the multiply-accumulate circuit and to instruct the data staging circuit not to output the region of the input data corresponding to the selected weight to the multiply-accumulate circuit.

17. The apparatus of claim 15, wherein: responsive to determining that a selected region comprises only zero values, the data staging circuit is configured not to output the selected region to the multiply-accumulate circuit and instructs the weight processing circuit not to output the weight corresponding to the selected region to the multiply-accumulate circuit.

18. The apparatus of claim 15, wherein the weight processing circuit comprises a weight decompressor configured to decompress a plurality of weights retrieved from a memory and generate a mask indicating weights of the plurality of weights that are zero.

19. The apparatus of claim 15, wherein the data staging circuit comprises: a component mask generator configured to generate a component mask indicating which regions of the input data consist of only zeros; an alignment mask generator configured to generate an alignment mask indicating whether all values in a contiguous region at each alignment are equal to zero; and weight application controller configured to instruct the weight processing circuit not to output a non-zero weight to the multiply-accumulate circuit responsive receiving an alignment mask from the alignment mask generator for a region corresponding to the non-zero weight indicating that all values in the region at each alignment are equal to zero.

20. The apparatus of claim 15, wherein the multiply-accumulate circuit comprises: a multiply-accumulate array having a plurality
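The mask-based claims (generating a zero/non-zero mask for the input data and for the weights, comparing the two, and applying only non-zero weights to non-zero data) can be sketched in software as follows. This is an illustrative model only, not the claimed circuits: it assumes one mask bit per element, whereas the claims speak of regions, and the names `nonzero_mask` and `masked_mac` are hypothetical.

```python
def nonzero_mask(values):
    """Build an integer bit mask: bit i is 1 when values[i] is non-zero."""
    mask = 0
    for i, v in enumerate(values):
        if v != 0:
            mask |= 1 << i
    return mask


def masked_mac(data, weights):
    """Multiply-accumulate only where both data and weight are non-zero.

    Assumption: masks are per element. Returns (sum, multiplies_performed).
    """
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")

    # Compare the two masks: a position is worth processing only when the
    # corresponding bit is set in both.
    both = nonzero_mask(data) & nonzero_mask(weights)

    total = 0
    multiplies = 0
    i = 0
    while both:
        if both & 1:
            total += data[i] * weights[i]
            multiplies += 1
        both >>= 1
        i += 1
    return total, multiplies


data = [0, 2, 0, 4]
weights = [1, 0, 0, 3]
print(masked_mac(data, weights))
```

In this model the number of multiplies depends on how sparse the data and weights are, which loosely mirrors the variable cycle count mentioned in claim 7.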

Assignees

  • Samsung Electronics Co Ltd

Inventors

Classifications

  • G06N3/045 (Primary)

    Combinations of networks · CPC title

  • G06F7/764 (Primary)

    Masking · CPC title

  • G06N3/0464 (Primary)

    Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • Architecture, e.g. interconnection topology · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016358069A1 cover?
Implementing a neural network includes determining whether to process a combination of a first region of an input feature map and a first region of a convolution kernel and, responsive to determining to process the combination, performing a convolution operation on the first region of the input feature map using the first region of the convolution kernel to generate at least a portion of an out…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Published Dec 8, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).