Image assessment using deep convolutional neural networks

US9536293B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9536293-B2
Application number: US-201414447290-A
Country: US
Kind code: B2
Filing date: Jul 30, 2014
Priority date: Jul 30, 2014
Publication date: Jan 3, 2017
Grant date: Jan 3, 2017

Abstract

Official abstract text for this publication.

Deep convolutional neural networks receive local and global representations of images as inputs and learn the best representation for a particular feature through multiple convolutional and fully connected layers. A double-column neural network structure receives each of the local and global representations as two heterogeneous parallel inputs to the two columns. After some layers of transformations, the two columns are merged to form the final classifier. Additionally, features may be learned in one of the fully connected layers. The features of the images may be leveraged to boost classification accuracy of other features by learning a regularized double-column neural network.
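The double-column structure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: each column is reduced to a single 3×3 convolution with a ReLU, the inputs are tiny single-channel grids, and all names (`conv2d`, `column`, `double_column_forward`, the `params` keys) and the random weights are assumptions made for the example. What it shows is the data flow: two independent columns, a merge by concatenation, and a fully connected classifier on the merged features.

```python
import math
import random


def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + a][j + b] * kernel[a][b]
             for a in range(kh) for b in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]


def column(image, kernel):
    """One column (toy stand-in): convolution, ReLU, then flatten."""
    return [max(0.0, v) for row in conv2d(image, kernel) for v in row]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def double_column_forward(global_x, local_x, params):
    # The two columns share no weights: each has its own kernel.
    g = column(global_x, params["global_kernel"])
    l = column(local_x, params["local_kernel"])
    merged = g + l  # merge the columns by concatenating their features
    logits = [
        sum(w * x for w, x in zip(row, merged)) + b
        for row, b in zip(params["fc_weights"], params["fc_bias"])
    ]
    return softmax(logits)  # one probability per class


rng = random.Random(0)


def rand_grid(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


params = {
    "global_kernel": rand_grid(3, 3),
    "local_kernel": rand_grid(3, 3),
    "fc_weights": rand_grid(2, 8),  # 2 classes, 4 + 4 merged features
    "fc_bias": [0.0, 0.0],
}
global_x = rand_grid(4, 4)  # stands in for the resized whole image
local_x = rand_grid(4, 4)   # stands in for a full-resolution crop
probs = double_column_forward(global_x, local_x, params)
print(probs)
```

A real network of this kind would stack several convolutional and fully connected layers per column (claim 10 recites at least four and at least two, respectively) and learn the weights rather than sampling them.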

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer storage medium comprising computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform operations comprising: implementing a deep convolutional neural network that is trained to learn and classify image features for a set of images; receiving an image from the set of images; extracting a global image representation of the image as one or more global inputs to a first column of the deep convolutional neural network; extracting a local image representation of the image as one or more fine-grained inputs to a second column of the deep convolutional neural network, each convolutional layer of the first column being independent from each convolutional layer of the second column, the first column and the second column in convolutional layers being in different spatial scales; merging at least one layer of the first column with at least one layer of the second column into a fully connected layer; using the fully connected layer to calculate a probability of each input being assigned to a class for a particular feature; averaging results associated with each input associated with the image; classifying at least one feature for the image using the class with the highest probability; and providing the classified at least one image feature for use in an image processing task.

2. The non-transitory computer storage medium of claim 1, further comprising resizing the image to create the global image representation.

3. The non-transitory computer storage medium of claim 1, further comprising resizing the image by warping the image into a normalized input with a fixed size.

4. The non-transitory computer storage medium of claim 1, further comprising resizing the image by normalizing its shorter side to a normalized input with a fixed length S and center-cropping the normalized input to generate a s×s×3 input.

5. The non-transitory computer storage medium of claim 1, further comprising resizing the image by normalizing a longer side of the image to a fixed length s and generating a normalized input of a fixed size s×s×3 by padding border pixels with zero.

6. The non-transitory computer storage medium of claim 1, further comprising randomly cropping the image into a normalized input with a fixed size to create the local image representation, the local image representation preserving details of the image in the original high-resolution format.

7. The non-transitory computer storage medium of claim 1, wherein an architecture associated with each column in the deep convolutional neural network is the same for each column.

8. The non-transitory computer storage medium of claim 1, wherein an architecture associated with each column in the deep convolutional neural network is different for each column.

9. The non-transitory computer storage medium of claim 1, further comprising adding one or more additional columns with additional normalized inputs to form a multi-column convolutional neural network.

10. The non-transitory computer storage medium of claim 1, wherein an architecture associated with each column in the deep convolutional neural network comprises at least four convolutional layers and at least two fully-connected layers.

11. The non-transitory computer storage medium of claim 10, further comprising extracting one or more features from the image at one of the fully-connected layers.

12. The non-transitory computer storage medium of claim 10, further comprising replacing a last layer of the deep convolutional neural network with a regression.

13. The non-transitory computer storage medium of claim 1, wherein the particular image feature is one of aesthetics, style, or scene.

14. A computer-implemented method comprising: implementing a double-column deep convolutional neural network (DCNN) that is trained to learn and classify features for a set of images; extracting a global image representation of an image as a global input to a first column of the DCNN; extracting a local image representation of the image as a fine-grained input to a second column of the DCNN, each convolutional layer of the first column being independent from each convolutional layer of the second column, the first column and the second column in convolutional layers being in different spatial scales; merging at least one layer of the first column with at least one layer of the second column into a fully connected layer; jointly training weights associated with the fully connected layer; classifying at least one feature for the image using the fully connected layer; and providing the classified at least one image feature for use in an image processing task.

15. The method of claim 14, further comprising automatically discovering global and local features of an image from the fully connected layer and a layer immediately preceding the fully connected layer.

16. The method of claim 14, further comprising back propagating error in each column with stochastic gradient descent.

17. The method of claim 14, further comprising adding one or more additional columns with additional normalized inputs to form a multi-column convolutional neural network.

18. The method of claim 14, wherein the image processing task comprises one of searching for an image or editing an image.

19. A computerized system comprising: one or more processors; and one or more computer storage media storing computer-useable instructions that, when used by the one or more processors, cause the one or more processors to: implement a double-column deep convolutional neural network (DCNN) to train the DCNN to learn and classify features for a set of images; extract a global image representation of an image as a global input to a first column of the DCNN; extract a local image representation of the image as a fine-grained input to a second column of the DCNN, each convolutional layer of the first column being independent from each convolutional layer of the second column, the first column and the second column in convolutional layers being in different spatial scales; merge at least one layer of the first column with at least one layer of the second column into a fully connected layer; learn or classify at least one feature for the image using the fully connected layer; and provide the classified at least one image feature for use in an image processing task.

20. The computerized system of claim 19, wherein the image processing task comprises one of searching for an image or editing an image.
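Claims 3 through 6 recite several ways to turn an image into fixed-size network inputs: warping, center-cropping, and zero-padding for the global representation, and random cropping for the local one. The sketch below illustrates them on an image stored as a nested list of (r, g, b) tuples. It is a rough illustration under stated assumptions: the function names are hypothetical, nearest-neighbor sampling is an arbitrary choice of interpolation, and centering the image before zero-padding is an assumption, since the claims specify neither.

```python
import random

ZERO = (0, 0, 0)  # pixel value used for zero-padding


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize (assumed interpolation; the claims name none)."""
    h, w = len(img), len(img[0])
    return [[img[i * h // new_h][j * w // new_w] for j in range(new_w)]
            for i in range(new_h)]


def global_warp(img, s):
    """Claim 3: warp the whole image to s x s, ignoring aspect ratio."""
    return resize_nearest(img, s, s)


def global_center_crop(img, s):
    """Claim 4: scale the shorter side to s, then center-crop to s x s."""
    h, w = len(img), len(img[0])
    scale = s / min(h, w)
    new_h, new_w = max(s, round(h * scale)), max(s, round(w * scale))
    resized = resize_nearest(img, new_h, new_w)
    top, left = (new_h - s) // 2, (new_w - s) // 2
    return [row[left:left + s] for row in resized[top:top + s]]


def global_pad(img, s):
    """Claim 5: scale the longer side to s, then zero-pad to s x s."""
    h, w = len(img), len(img[0])
    scale = s / max(h, w)
    new_h = max(1, min(s, round(h * scale)))
    new_w = max(1, min(s, round(w * scale)))
    resized = resize_nearest(img, new_h, new_w)
    top, left = (s - new_h) // 2, (s - new_w) // 2  # centering is an assumption
    out = [[ZERO] * s for _ in range(s)]
    for i, row in enumerate(resized):
        out[top + i][left:left + new_w] = row
    return out


def local_random_crop(img, s, rng):
    """Claim 6: take an s x s patch at original resolution (no resizing)."""
    h, w = len(img), len(img[0])
    top, left = rng.randint(0, h - s), rng.randint(0, w - s)
    return [row[left:left + s] for row in img[top:top + s]]


# Toy 4 x 8 image whose pixels encode their own (row, column) position.
img = [[(i, j, 0) for j in range(8)] for i in range(4)]
warped = global_warp(img, 4)
cropped = global_center_crop(img, 4)
padded = global_pad(img, 4)
patch = local_random_crop(img, 4, random.Random(1))
```

The three global variants trade off differently: warping distorts the aspect ratio, center-cropping discards the image borders, and padding keeps everything at the cost of empty regions. The random crop is the only one that keeps the original pixel detail, which is why claim 6 uses it for the local input.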
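The last steps of claim 1 (a probability per input, averaging over the inputs of one image, and picking the class with the highest probability) amount to a small piece of arithmetic. The following sketch assumes the network has already produced one logit vector per input, for example one per random crop; the logit values, the class names, and the function names are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_image(per_input_logits, class_names):
    """Average per-input class probabilities and return the top class."""
    per_input_probs = [softmax(logits) for logits in per_input_logits]
    n = len(per_input_probs)
    averaged = [sum(p[c] for p in per_input_probs) / n
                for c in range(len(class_names))]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return class_names[best], averaged


# Hypothetical logits for three inputs drawn from the same image.
logits = [[2.0, 0.5], [0.2, 0.4], [1.5, 0.1]]
label, averaged = classify_image(logits, ["high aesthetics", "low aesthetics"])
print(label, averaged)
```

Averaging probabilities rather than taking a vote means a single confident input can outweigh several marginal ones; here the second input leans slightly toward the second class but the averaged result still favors the first.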
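Claim 14 recites jointly training the weights of the merged fully connected layer, and claim 16 adds stochastic gradient descent. As a rough sketch of what that can look like, the code below trains only the merged softmax layer on concatenated column features with plain SGD on the cross-entropy loss. The feature vectors, labels, learning rate, and epoch count are toy assumptions, and the column outputs are held fixed here, whereas claim 16 also back-propagates the error into each column.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(weights, bias, features, label, lr):
    """One SGD update of the merged layer; returns the loss before the update."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    for c in range(len(weights)):
        # Gradient of cross-entropy w.r.t. logit c is (prob - one_hot).
        err = probs[c] - (1.0 if c == label else 0.0)
        for k, x in enumerate(features):
            weights[c][k] -= lr * err * x
        bias[c] -= lr * err
    return -math.log(probs[label])


# Toy (global features, local features, label) triples; values are made up.
samples = [
    ([1.0, 0.0], [0.0, 1.0], 0),
    ([0.0, 1.0], [1.0, 0.0], 1),
]
weights = [[0.0] * 4 for _ in range(2)]  # 2 classes, 2 + 2 merged features
bias = [0.0, 0.0]

epoch_losses = []
for _ in range(100):
    total = 0.0
    for global_f, local_f, label in samples:
        merged = global_f + local_f  # concatenate the two columns' outputs
        total += sgd_step(weights, bias, merged, label, lr=0.5)
    epoch_losses.append(total / len(samples))

print(epoch_losses[0], epoch_losses[-1])
```

Because each weight row spans both halves of the merged vector, every update adjusts the connections to the global and local features together, which is the sense in which the layer is trained jointly.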

Assignees

Adobe Systems Inc

Classifications

  • G06V10/772 (Primary)

    Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries · CPC title

  • G06T7/0002 (Primary)

    Inspection of images, e.g. flaw detection · CPC title

  • Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries · CPC title

  • Combinations of networks · CPC title

  • based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9536293B2 cover?
Deep convolutional neural networks receive local and global representations of images as inputs and learn the best representation for a particular feature through multiple convolutional and fully connected layers. A double-column neural network structure receives each of the local and global representations as two heterogeneous parallel inputs to the two columns. After some layers of transforma…
Who is the assignee on this patent?
Adobe Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/772. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 3, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).