Image processing apparatus, image processing method, and non-transitory computer-readable storage medium

US12020474B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12020474-B2
Application number: US-202217574068-A
Country: US
Kind code: B2
Filing date: Jan 12, 2022
Priority date: Aug 9, 2017
Publication date: Jun 25, 2024
Grant date: Jun 25, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A connected layer feature is generated by connecting outputs of a plurality of layers of a hierarchical neural network obtained by processing an input image using the hierarchical neural network. An attribute score map representing an attribute of each region of the input image is generated for each attribute using the connected layer feature. A recognition result for a recognition target is generated and output by integrating the generated attribute score maps for respective attributes.
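The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the patented implementation: the nearest-neighbour up-sampling, the per-pixel logistic discriminators, the argmax-style integration, the attribute names `sky` and `ground`, and all weights are assumptions chosen only to make the data flow visible.

```python
import math

def upsample_nearest(fmap, factor):
    """Nearest-neighbour up-sampling of a [C][H][W] feature map by an integer factor."""
    return [[[row[x // factor] for x in range(len(row) * factor)]
             for row in channel for _ in range(factor)]
            for channel in fmap]

def connect_layers(layer_outputs):
    """Concatenate per-layer outputs along the channel axis.

    Assumes every map's height divides the largest height evenly."""
    target_h = max(len(f[0]) for f in layer_outputs)
    connected = []
    for fmap in layer_outputs:
        connected.extend(upsample_nearest(fmap, target_h // len(fmap[0])))
    return connected

def attribute_score_map(connected, weights, bias):
    """Hypothetical discriminator: a logistic unit applied at every coordinate."""
    h, w = len(connected[0]), len(connected[0][0])
    return [[1.0 / (1.0 + math.exp(-(bias + sum(
                 wc * ch[y][x] for wc, ch in zip(weights, connected)))))
             for x in range(w)] for y in range(h)]

def integrate(score_maps, threshold=0.5):
    """Toy integration: label each coordinate with its best-scoring attribute."""
    names = list(score_maps)
    first = score_maps[names[0]]
    labels = []
    for y in range(len(first)):
        row = []
        for x in range(len(first[0])):
            best = max(names, key=lambda n: score_maps[n][y][x])
            row.append(best if score_maps[best][y][x] >= threshold else None)
        labels.append(row)
    return labels

# Two made-up layer outputs: a shallow 4x4 map and a deeper 2x2 map.
shallow = [[[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]]
deep = [[[0, 1], [0, 0]]]
connected = connect_layers([shallow, deep])

# One discriminator per attribute; the weights are illustrative, not learned.
discriminators = {"sky": ([2.0, 2.0], -2.0), "ground": ([-2.0, -2.0], 2.0)}
score_maps = {name: attribute_score_map(connected, w, b)
              for name, (w, b) in discriminators.items()}
labels = integrate(score_maps)
```

In this toy input the top-right 2x2 block is labelled `sky` and every other coordinate `ground`. A real system would learn the discriminators from training data and would likely use a richer integration step than a per-pixel argmax.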

First claim

Opening claim text (preview).

What is claimed is:

1. An image processing apparatus comprising: one or more processors which execute instructions stored in one or more memories, wherein by execution of the instructions the one or more processors function as: a feature generation unit configured to generate a connected layer feature by connecting outputs of a plurality of layers of a hierarchical neural network obtained by processing an input image using the hierarchical neural network; a map generation unit configured to generate, based on inputting of the connected layer feature to discriminators corresponding to attributes, for each of the attributes, an attribute score map representing a likelihood of an attribute of a region represented by a coordinate in the input image; and an output unit configured to generate and output a recognition result for a recognition target in the input image, based on integration information obtained by integrating the attribute score maps for respective attributes generated by the map generation unit, wherein each of the discriminators has been learned so as to output likelihood of a corresponding attribute, and wherein at least one of the attribute score maps of the attributes indicates likelihood that a reference position of an object exists at a coordinate in the input image.

2. The apparatus according to claim 1, wherein the output unit generates and outputs a result concerning presence/absence of the object in the input image, based on the integration information.

3. The apparatus according to claim 2, wherein the output unit further generates and outputs a result concerning a label of a category of each region in the input image.

4. The apparatus according to claim 1, further comprising an estimation unit configured to estimate a size of the object by regress an integration result of the attribute score maps for respective attributes generated by the map generation unit.

5. The apparatus according to claim 1, wherein the discriminator outputs a likelihood indicating whether a reference position of the object exists at a coordinate in the region.

6. The apparatus according to claim 1, wherein the map generation unit calculates a score map of likelihoods for each subcategory.

7. The apparatus according to claim 6, wherein each subcategory is a subcategory classified by at least one of a depth rotation of the object, an in-plane rotation of the object, an orientation of the object, a shape of the object, a material of the object, a shape of a region of interest of the object, a size of the region of interest of the object, and an aspect ratio of the region of interest of the object.

8. The apparatus according to claim 1, wherein the output unit outputs information relating to at least one of a depth rotation of the object, an in-plane rotation of the object, an orientation of the object, a shape of the object, a material of the object, a shape of a region of interest of the object, a size of the region of interest of the object, and an aspect ratio of the region of interest of the object.

9. The apparatus according to claim 1, wherein the output unit generates the recognition result of a resolution higher than a resolution of the attribute score map.

10. The apparatus according to claim 1, wherein the output unit estimates a size of the object.

11. The apparatus according to claim 1, wherein the output unit generates a result of classification of the input image.

12. The apparatus according to claim 11, wherein the output unit calculates a score based on the result of the classification and a prior distribution of a subject for each classification.

13. The apparatus according to claim 11, wherein the map generation unit selects, based on the result of the classification, a category to be determined.

14. The apparatus according to claim 1, further comprising a unit configured to input camera information, wherein the map generation unit uses the camera information in addition to the connected layer feature.

15. The apparatus according to claim 1, further comprising a unit configured to select, as a final output, one of a plurality of results included in the recognition result.

16. The apparatus according to claim 1, further comprising a unit configured to integrally learn parameters of processing with respect to at least two of the hierarchical neural network, the feature generation unit, the map generation unit, and the output unit.

17. The apparatus according to claim 1, wherein the feature generation unit performs up-sampling processing when connecting the outputs of the plurality of layers of the hierarchical neural network.

18. The apparatus according to claim 1, wherein the feature generation unit performs deconvolution processing when connecting the outputs of the plurality of layers of the hierarchical neural network.

19. An image processing method comprising: generating a connected layer feature by connecting outputs of a plurality of layers of a hierarchical neural network obtained by processing an input image using the hierarchical neural network; generating, based on inputting of the connected layer feature to discriminators corresponding to attributes, for each of the attributes, an attribute score map representing a likelihood of an attribute of a region represented by a coordinate in the input image; and generating and outputting a recognition result for a recognition target in the input image, based on integration information obtained by integrating the generated attribute score maps for respective attributes, wherein each of the discriminators has been learned so as to output likelihood of a corresponding attribute, and wherein at least one of the attribute score maps of the attributes indicates likelihood that a reference position of an object exists at a coordinate in the input image.

20. A non-transitory computer-readable storage medium storing a computer program for causing a computer to function as: a feature generation unit configured to generate a connected layer feature by connecting outputs of a plurality of layers of a hierarchical neural network obtained by processing an input image using the hierarchical neural network; a map generation unit configured to generate, based on inputting of the connected layer feature to discriminators corresponding to attributes, for each of the attributes, an attribute score map representing a likelihood of an attribute of a region represented by a coordinate in the input image; and an output unit configured to generate and output a recognition result for a recognition target in the input image, based on integration information obtained by integrating the attribute score maps for respective attributes generated by the map generation unit, wherein each of the discriminators has been learned so as to output likelihood of a corresponding attribute, and wherein at least one of the attribute score maps of the attributes indicates likelihood that a reference position of an object exists at a coordinate in the input image.
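
The independent claims require that at least one attribute score map give the likelihood that an object's reference position lies at each coordinate, and claim 2 adds a presence/absence output. One way such a map could be read out is sketched below. The local-maximum rule, the 0.5 threshold, and the function names are assumptions made for illustration; the claims do not prescribe them.

```python
def find_reference_positions(center_map, threshold=0.5):
    """Return (y, x, score) for cells that reach the threshold and are
    strictly greater than every cell in their 8-neighbourhood."""
    h, w = len(center_map), len(center_map[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            score = center_map[y][x]
            if score < threshold:
                continue
            neighbours = [center_map[ny][nx]
                          for ny in range(max(0, y - 1), min(h, y + 2))
                          for nx in range(max(0, x - 1), min(w, x + 2))
                          if (ny, nx) != (y, x)]
            if all(score > n for n in neighbours):
                peaks.append((y, x, score))
    return peaks

def object_present(center_map, threshold=0.5):
    """Presence/absence decision: True if any reference position is found."""
    return bool(find_reference_positions(center_map, threshold))

# Made-up likelihood map with two candidate reference positions.
center_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.2, 0.1],
]
peaks = find_reference_positions(center_map)
```

For this map the sketch reports peaks at coordinates (1, 1) and (2, 3), so `object_present(center_map)` is true. A size estimate or subcategory score (claims 4, 6, and 10) could then be looked up at each peak coordinate in the corresponding score maps.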

Assignees

Canon Kk

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Classification techniques · CPC title

  • Categorising the entire scene, e.g. birthday party or wedding scene · CPC title

  • Detection; Localisation; Normalisation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12020474B2 cover?
A connected layer feature is generated by connecting outputs of a plurality of layers of a hierarchical neural network obtained by processing an input image using the hierarchical neural network. An attribute score map representing an attribute of each region of the input image is generated for each attribute using the connected layer feature. A recognition result for a recognition target is ge…
Who is the assignee on this patent?
Canon Kk
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 25, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).