Data structure for CNN based digital integrated circuit for extracting features out of an input image

US10043095B2 · US · B2

Patent metadata

Publication number: US-10043095-B2
Application number: US-201615289733-A
Country: US
Kind code: B2
Filing date: Oct 10, 2016
Priority date: Oct 10, 2016
Publication date: Aug 7, 2018
Grant date: Aug 7, 2018

Abstract

Official abstract text for this publication.

Data arrangement schemes of imagery data and filter coefficients stored in a CNN based digital IC for extracting features out of an input image are disclosed. The CNN based digital IC contains NE number of CNN processing engines connected in a loop via a clock-skew circuit for cyclic data access. Imagery data and filter coefficients are arranged in a specific scheme to fit the data access pattern that the CNN based digital IC requires to operate. The specific scheme is determined based on the number of imagery data, the number of filters and the characteristics of the CNN based digital IC. The characteristics include, but are not limited to, the number of CNN processing engines, the connection direction of clock-skew circuit and the number of the I/O data bus.
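The grouping and cyclic access described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names `group_imagery_sets` and `cyclic_schedule`, the string placeholders for imagery sets, and the `direction` parameter (standing in for the connection direction of the clock-skew circuit) are all assumptions made for this example. It assumes the number of imagery data groups is the ceiling of NIM / NE and that short groups are zero-filled, as claims 7 and 12 state.

```python
import math


def group_imagery_sets(imagery_sets, ne, zero_set="0"):
    """Split NIM imagery sets into groups of NE sets each.

    Assumption: the group count is ceil(NIM / NE); unoccupied slots in the
    last group are filled with a zero placeholder (claims 7 and 12).
    """
    n_groups = math.ceil(len(imagery_sets) / ne)
    groups = []
    for g in range(n_groups):
        group = list(imagery_sets[g * ne:(g + 1) * ne])
        group += [zero_set] * (ne - len(group))  # zero-fill unoccupied sets
        groups.append(group)
    return groups


def cyclic_schedule(group, direction=1):
    """Return schedule[step][engine]: the imagery set seen by each engine.

    Assumption: at every step each engine receives the set held by its
    upstream neighbor, so after NE steps every engine has seen all NE sets.
    direction=1 or -1 models the two possible loop directions.
    """
    ne = len(group)
    return [
        [group[(engine - direction * step) % ne] for engine in range(ne)]
        for step in range(ne)
    ]


# Hypothetical example: NIM = 5 imagery sets, NE = 4 engines.
groups = group_imagery_sets(["Im1", "Im2", "Im3", "Im4", "Im5"], ne=4)
schedule = cyclic_schedule(groups[0])
for step, row in enumerate(schedule):
    print(step, row)
```

Under these assumptions, the order in which an engine sees the imagery sets fixes the order in which its filter coefficients would need to be stored, which is why the arrangement depends on the loop direction.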

First claim

Opening claim text (preview).

What is claimed is:

1. A method of arranging and storing imagery data and filter coefficients in a Cellular Neural Networks (CNN) based digital integrated circuit (IC) for extracting features out of an input image, the CNN based digital IC containing NE CNN processing engines connected in a loop via a clock-skew circuit, NE is a positive integer, the method comprising: (a) determining how many imagery data groups are required for storing NIM sets of imagery data in the NE CNN processing engines, each imagery data group containing NE sets of the NIM sets of imagery data, where NIM is a positive integer; (b) circularly storing the NE sets of the imagery data of said each imagery data group in the respective NE CNN processing engines; (c) repeating (b) for the remaining imagery data groups; (d) determining how many filter groups are required for storing all filter coefficients for NF filters in the NE CNN processing engines, each filter group containing NE sets of filter coefficients and said each filter group being further divided into one or more subgroups with each subgroup containing a portion of the NE sets of filter coefficients that correlates to a corresponding group of the imagery data groups, where NF is a positive integer; (e) storing the portion of the NE sets of filter coefficients in a corresponding one of the NE CNN processing engines, the portion of filter coefficients being arranged in a cyclic order for accommodating convolution operations with imagery data received from an upstream neighbor CNN processing engine; and (f) repeating (e) for the remaining subgroups; and (g) repeating (e) and (f) for the remaining filter groups, wherein each of the NE CNN processing engines comprises: a CNN processing block configured for simultaneously obtaining M×M convolution operations results by performing 3×3 convolutions at M×M pixel locations using the stored imagery data and the stored filter coefficients, the stored imagery data representing a (M+2)-pixel by (M+2)-pixel region with the M×M pixel locations being a M×M central portion of the (M+2)-pixel by (M+2)-pixel region, where M is a positive integer; a first set of memory buffers operatively coupled to the CNN processing block for storing one of the NE sets of imagery data; and a second set of memory buffers operatively coupled to the CNN processing block for storing the portion of the NE sets of filter coefficients corresponding to said one of the NE sets of imagery data.

2. The method of claim 1, where said all filter coefficients contains NF multiplied by NIM sets of filer coefficients.

3. The method of claim 1, wherein the CNN based digital IC further comprises more than one input/output (I/O) data bus connected to the NE CNN processing engines with a connection scheme.

4. The method of claim 3, further comprises partitioning the NIM sets of imagery data and said all filter coefficients in the respective I/O data bus in accordance with the connection scheme.

5. The method of claim 3, wherein said convolution operations produce NF convolution operations results.

6. The method of claim 3, wherein NE is equal to 16.

7. The method of claim 3, when said each imagery data group contains less than NE sets, unoccupied sets are filled with zeros.

8. A non-transitory computer readable medium storing imagery data and filter coefficients using a data arrangement scheme enabling a cellular neural networks (CNN) based digital integrated circuit (IC) for extracting features out of an input image, the data arrangement scheme comprising: NIM sets of imagery data organized in at least one imagery data group, each imagery data group including NE sets of imagery data circularly stored in respective NE CNN processing engines of the CNN based digital IC; and all filter coefficients of NF filters organized in at least one filter group, each filter having NIM sets of filter coefficients and each filter group containing NE sets of filter coefficients and being further divided into one or more subgroups with each subgroup containing a portion of the NE sets of filter coefficients that correlates to a corresponding group of the imagery data groups, the portion of the NE sets of filter coefficients are stored in a corresponding one of the NE CNN processing engine, the portion of filter coefficients being arranged in a cyclic order for accommodating convolution operations with imagery data received from an upstream neighbor CNN processing engine, where NE, NIM and NF are positive integers, and wherein each of the NE CNN processing engines comprises: a CNN processing block configured for simultaneously obtaining M×M convolution operations results by performing 3×3 convolutions at M×M pixel locations using the stored imagery data and the stored filter coefficients, the stored imagery data representing a (M+2)-pixel by (M+2)-pixel region with the M×M pixel locations being a M×M central portion of the (M+2)-pixel by (M+2)-pixel region, where M is a positive integer; a first set of memory buffers operatively coupled to the CNN processing block for storing one of the NE sets of imagery data; and a second set of memory buffers operatively coupled to the CNN processing block for storing the portion of the NE sets of filter coefficients corresponding to said one of the NE sets of imagery data.

9. The non-transitory computer readable medium of claim 8, wherein said all filter coefficients contains NF multiplied by NIM sets of filer coefficients.

10. The non-transitory computer readable medium of claim 8, wherein NE is equal to 16.

11. The non-transitory computer readable medium of claim 8, wherein said convolution operations produce NF convolution operations results.

12. The non-transitory computer readable medium of claim 8, when said each imagery data group contains less than NE sets, unoccupied sets are filled with zeros.
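The CNN processing block recited in claims 1 and 8 produces M×M results from an (M+2)-pixel by (M+2)-pixel region. A minimal software sketch of that arithmetic is shown below. The function name `conv3x3_block` is hypothetical, the computation is sequential rather than simultaneous as in the hardware, and the kernel is applied without flipping (correlation style), which is an assumption of this example rather than something the claim text specifies.

```python
def conv3x3_block(region, kernel):
    """Apply a 3x3 kernel at the M x M central locations of a region.

    region: (M+2) x (M+2) list of lists of numbers.
    kernel: 3 x 3 list of lists of filter coefficients.
    Returns an M x M list of lists. Assumption: no kernel flip, no bias.
    """
    m = len(region) - 2
    out = [[0] * m for _ in range(m)]
    for r in range(m):
        for c in range(m):
            # The 3x3 window centred on central pixel (r + 1, c + 1).
            out[r][c] = sum(
                region[r + i][c + j] * kernel[i][j]
                for i in range(3)
                for j in range(3)
            )
    return out


# Hypothetical example with M = 2: a 4 x 4 region holding the values 0..15.
region = [[row * 4 + col for col in range(4)] for row in range(4)]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(conv3x3_block(region, identity))  # prints [[5, 6], [9, 10]]
```

With the identity kernel the output is exactly the M×M central portion of the region, which makes the role of the one-pixel border easy to see: it supplies the neighbors needed by the outermost central pixels.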

Assignees

Gyrfalcon Tech Inc

Inventors

Classifications

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • G06N3/063 (Primary): using electronic means · CPC title

  • Combinations of networks · CPC title

  • Neural networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10043095B2 cover?
Data arrangement schemes of imagery data and filter coefficients stored in a CNN based digital IC for extracting features out of an input image are disclosed. The CNN based digital IC contains NE number of CNN processing engines connected in a loop via a clock-skew circuit for cyclic data access. Imagery data and filter coefficients are arranged in a specific scheme to fit the data access patte…
Who is the assignee on this patent?
Gyrfalcon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 7, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).