Convolutional neural network
US-2017200078-A1 · Jul 13, 2017 · US
US10043095B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10043095-B2 |
| Application number | US-201615289733-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 10, 2016 |
| Priority date | Oct 10, 2016 |
| Publication date | Aug 7, 2018 |
| Grant date | Aug 7, 2018 |
Official abstract text for this publication.
Data arrangement schemes of imagery data and filter coefficients stored in a CNN based digital IC for extracting features out of an input image are disclosed. The CNN based digital IC contains NE number of CNN processing engines connected in a loop via a clock-skew circuit for cyclic data access. Imagery data and filter coefficients are arranged in a specific scheme to fit the data access pattern that the CNN based digital IC requires to operate. The specific scheme is determined based on the number of imagery data, the number of filters and the characteristics of the CNN based digital IC. The characteristics include, but are not limited to, the number of CNN processing engines, the connection direction of clock-skew circuit and the number of the I/O data bus.
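The grouping and cyclic ordering described in the abstract can be illustrated with a short Python sketch. It is not the patented arrangement itself. The function names `group_imagery` and `cyclic_filter_order` are hypothetical, short groups are assumed to be zero-filled (as claims 7 and 12 state), and data is assumed to move from engine e-1 to engine e on every step; the real direction depends on the clock-skew circuit.

```python
import math
from typing import List, Optional


def group_imagery(num_images: int, num_engines: int) -> List[List[Optional[int]]]:
    """Split image-set indices 0..num_images-1 into groups of num_engines.

    None marks an unoccupied slot that would be filled with zeros.
    """
    num_groups = math.ceil(num_images / num_engines)
    groups = []
    for g in range(num_groups):
        group = []
        for e in range(num_engines):
            idx = g * num_engines + e
            group.append(idx if idx < num_images else None)
        groups.append(group)
    return groups


def cyclic_filter_order(engine: int, num_engines: int) -> List[int]:
    """Slot index seen by `engine` at each step of one trip around the loop.

    Assumption: at every step each engine receives the data previously held
    by its upstream neighbor, taken here to be engine - 1 (mod num_engines).
    """
    return [(engine - step) % num_engines for step in range(num_engines)]


# Hypothetical sizes: NIM = 6 sets of imagery data, NE = 4 engines.
print(group_imagery(6, 4))        # [[0, 1, 2, 3], [4, 5, None, None]]
print(cyclic_filter_order(1, 4))  # [1, 0, 3, 2]
```

Under these assumptions, engine 1 would store its filter coefficients in the order 1, 0, 3, 2, so that the coefficient needed at each step matches the imagery data arriving from its upstream neighbor.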
Opening claim text (preview).
What is claimed is:
1. A method of arranging and storing imagery data and filter coefficients in a Cellular Neural Networks (CNN) based digital integrated circuit (IC) for extracting features out of an input image, the CNN based digital IC containing NE CNN processing engines connected in a loop via a clock-skew circuit, NE is a positive integer, the method comprising:
(a) determining how many imagery data groups are required for storing NIM sets of imagery data in the NE CNN processing engines, each imagery data group containing NE sets of the NIM sets of imagery data, where NIM is a positive integer;
(b) circularly storing the NE sets of the imagery data of said each imagery data group in the respective NE CNN processing engines;
(c) repeating (b) for the remaining imagery data groups;
(d) determining how many filter groups are required for storing all filter coefficients for NF filters in the NE CNN processing engines, each filter group containing NE sets of filter coefficients and said each filter group being further divided into one or more subgroups with each subgroup containing a portion of the NE sets of filter coefficients that correlates to a corresponding group of the imagery data groups, where NF is a positive integer;
(e) storing the portion of the NE sets of filter coefficients in a corresponding one of the NE CNN processing engines, the portion of filter coefficients being arranged in a cyclic order for accommodating convolution operations with imagery data received from an upstream neighbor CNN processing engine; and
(f) repeating (e) for the remaining subgroups; and
(g) repeating (e) and (f) for the remaining filter groups,
wherein each of the NE CNN processing engines comprises:
a CNN processing block configured for simultaneously obtaining M×M convolution operations results by performing 3×3 convolutions at M×M pixel locations using the stored imagery data and the stored filter coefficients, the stored imagery data representing a (M+2)-pixel by (M+2)-pixel region with the M×M pixel locations being a M×M central portion of the (M+2)-pixel by (M+2)-pixel region, where M is a positive integer;
a first set of memory buffers operatively coupled to the CNN processing block for storing one of the NE sets of imagery data; and
a second set of memory buffers operatively coupled to the CNN processing block for storing the portion of the NE sets of filter coefficients corresponding to said one of the NE sets of imagery data.
2. The method of claim 1, where said all filter coefficients contains NF multiplied by NIM sets of filter coefficients.
3. The method of claim 1, wherein the CNN based digital IC further comprises more than one input/output (I/O) data bus connected to the NE CNN processing engines with a connection scheme.
4. The method of claim 3, further comprises partitioning the NIM sets of imagery data and said all filter coefficients in the respective I/O data bus in accordance with the connection scheme.
5. The method of claim 3, wherein said convolution operations produce NF convolution operations results.
6. The method of claim 3, wherein NE is equal to 16.
7. The method of claim 3, when said each imagery data group contains less than NE sets, unoccupied sets are filled with zeros.
8. A non-transitory computer readable medium storing imagery data and filter coefficients using a data arrangement scheme enabling a cellular neural networks (CNN) based digital integrated circuit (IC) for extracting features out of an input image, the data arrangement scheme comprising:
NIM sets of imagery data organized in at least one imagery data group, each imagery data group including NE sets of imagery data circularly stored in respective NE CNN processing engines of the CNN based digital IC; and
all filter coefficients of NF filters organized in at least one filter group, each filter having NIM sets of filter coefficients and each filter group containing NE sets of filter coefficients and being further divided into one or more subgroups with each subgroup containing a portion of the NE sets of filter coefficients that correlates to a corresponding group of the imagery data groups, the portion of the NE sets of filter coefficients are stored in a corresponding one of the NE CNN processing engine, the portion of filter coefficients being arranged in a cyclic order for accommodating convolution operations with imagery data received from an upstream neighbor CNN processing engine, where NE, NIM and NF are positive integers,
and wherein each of the NE CNN processing engines comprises:
a CNN processing block configured for simultaneously obtaining M×M convolution operations results by performing 3×3 convolutions at M×M pixel locations using the stored imagery data and the stored filter coefficients, the stored imagery data representing a (M+2)-pixel by (M+2)-pixel region with the M×M pixel locations being a M×M central portion of the (M+2)-pixel by (M+2)-pixel region, where M is a positive integer;
a first set of memory buffers operatively coupled to the CNN processing block for storing one of the NE sets of imagery data; and
a second set of memory buffers operatively coupled to the CNN processing block for storing the portion of the NE sets of filter coefficients corresponding to said one of the NE sets of imagery data.
9. The non-transitory computer readable medium of claim 8, wherein said all filter coefficients contains NF multiplied by NIM sets of filter coefficients.
10. The non-transitory computer readable medium of claim 8, wherein NE is equal to 16.
11. The non-transitory computer readable medium of claim 8, wherein said convolution operations produce NF convolution operations results.
12. The non-transitory computer readable medium of claim 8, when said each imagery data group contains less than NE sets, unoccupied sets are filled with zeros.
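To make the geometry recited in claims 1 and 8 concrete, the following minimal Python sketch computes 3×3 results at the M×M central locations of an (M+2)×(M+2) region. Several things here are assumptions for illustration: the name `conv3x3_block` is hypothetical, the kernel is applied without flipping (as is common in CNN software), and the loops run sequentially, whereas the claimed processing block obtains the M×M results simultaneously.

```python
from typing import List

Matrix = List[List[float]]


def conv3x3_block(region: Matrix, kernel: Matrix) -> Matrix:
    """Return M x M results for an (M+2) x (M+2) input region.

    Output location (r, c) corresponds to region pixel (r+1, c+1), i.e. the
    central M x M portion; the one-pixel border only supplies neighbors.
    """
    m = len(region) - 2
    out = [[0.0] * m for _ in range(m)]
    for r in range(m):
        for c in range(m):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    acc += region[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


# Hypothetical example with M = 2: a 4 x 4 region holding the values 0..15.
region = [[float(4 * r + c) for c in range(4)] for r in range(4)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(conv3x3_block(region, identity))  # [[5.0, 6.0], [9.0, 10.0]]
```

With the identity kernel, the output reproduces the central 2×2 portion of the region, which shows why a border of one pixel on each side is needed for 3×3 filtering at every one of the M×M locations.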
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
using electronic means · CPC title
Combinations of networks · CPC title
Neural networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title