What technology area does this patent fall under?

Primary CPC classification G06V10/82. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 7, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for fast object detection

US11113507B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11113507-B2
Application number	US-201815986689-A
Country	US
Kind code	B2
Filing date	May 22, 2018
Priority date	May 22, 2018
Publication date	Sep 7, 2021
Grant date	Sep 7, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides a method comprising identifying a salient part of an object in an input image based on processing of a region of interest (RoI) in the input image at an electronic device. The method further comprises determining an estimated full appearance of the object in the input image based on the salient part and a relationship between the salient part and the object. The electronic device is operated based on the estimated full appearance of the object.
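The step from a salient part to an estimated full appearance can be illustrated with a minimal sketch. It assumes a template stores where a part's box sits inside the full object's box, as fractions of the object's width and height. The names `Box`, `Template`, `TEMPLATES`, and `estimate_full_box`, and all numeric values, are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float  # left edge in image pixels
    y: float  # top edge in image pixels
    w: float  # width in pixels
    h: float  # height in pixels


@dataclass(frozen=True)
class Template:
    # Assumed representation: position and size of the part's box
    # relative to the full object's box (fractions of object width/height).
    rel_x: float
    rel_y: float
    rel_w: float
    rel_h: float


# Illustrative values only; a real system would derive these from data.
TEMPLATES = {
    "eye": Template(rel_x=0.125, rel_y=0.25, rel_w=0.25, rel_h=0.125),
    "nose": Template(rel_x=0.375, rel_y=0.5, rel_w=0.25, rel_h=0.25),
}


def estimate_full_box(part_label: str, part_box: Box) -> Box:
    """Invert the template: recover the object's box from one part's box."""
    t = TEMPLATES[part_label]
    obj_w = part_box.w / t.rel_w
    obj_h = part_box.h / t.rel_h
    obj_x = part_box.x - t.rel_x * obj_w
    obj_y = part_box.y - t.rel_y * obj_h
    return Box(obj_x, obj_y, obj_w, obj_h)
```

Under these assumed templates, `estimate_full_box("nose", Box(60.0, 70.0, 25.0, 25.0))` returns a 100 by 100 box whose top-left corner is at (22.5, 20.0), so a detection of the nose alone is enough to propose a location for the whole face.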

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: identifying one or more salient parts of an object in an input image by classifying, at an electronic device, a set of input patches of the input image utilizing a multi-label classification network, wherein the multi-label classification network is trained to capture one or more global characteristics of the object and one or more local characteristics of the one or more salient parts of the object based on input patches cropped from a set of training images, the multi-label classification network classifies at least one input patch of the set of input patches as a first object classification representing a background, and the multi-label classification network classifies one or more other input patches of the set of input patches as one or more additional object classifications representing the one or more salient parts of the object; determining an estimated full appearance of the object in the input image based on the one or more salient parts and a relationship between the one or more salient parts and the object as defined by one or more bounding box templates for the one or more salient parts; and invoking an action on the electronic device based on the estimated full appearance of the object.
2. The method of claim 1, wherein the identifying the one or more salient parts of the object in the input image further comprises: generating a set of feature maps based on a sparse image pyramid; and determining a region of interest (RoI) in the input image based on the set of feature maps.
3. The method of claim 2, the method further comprising: resizing the input image by generating the sparse image pyramid comprising one or more pyramid levels, wherein each pyramid level corresponds to different scales of the input image.
4. The method of claim 3, wherein the generating the set of feature maps comprises: for each input patch of each pyramid level of the sparse image pyramid, classifying the input patch as one of a plurality of object classifications utilizing the multi-label classification network.
5. The method of claim 1, wherein the object is a face, and the one or more salient parts is one or more facial parts of the face.
6. The method of claim 4, wherein the plurality of object classifications comprise the first object classification and the one or more additional object classifications, and at least one of the plurality of object classifications represents at least one of: the background, whole face, eye, nose, whole mouth, left corner of mouth, right corner of mouth, or ear.
7. The method of claim 4, further comprising: in response to the multi-label classification network capturing a global characteristic of the object in an input patch, determining a location of the object in the input image based on a location of the input patch.
8. The method of claim 4, further comprising: in response to the multi-label classification network capturing a local characteristic of the one or more salient parts of the object in an input patch, inferring a location of the object in the input image based on a location of the input patch and the relationship between the one or more salient parts and the object as defined by the one or more bounding box templates for the one or more salient parts.
9. The method of claim 1, wherein the action comprises: in response to a request to capture a picture via a camera coupled to the electronic device: controlling the capture of the picture based on detecting presence of one or more expected features in the input image utilizing the multi-label classification network, wherein the input image is a camera view of the camera.
10. The method of claim 1, wherein the action comprises: in response to a request to capture a picture via a camera coupled to the electronic device: determining a current composition of the picture by detecting a location and a size of each object in the input image, wherein the input image is a camera view of the camera; and providing one or more suggestions to alter the current composition of the picture based on the current composition.
11. A system comprising: at least one processor; and a non-transitory processor-readable memory device storing instructions that when executed by the at least one processor causes the at least one processor to perform operations including: identifying one or more salient parts of an object in an input image by classifying, at an electronic device, a set of input patches of the input image utilizing a multi-label classification network, wherein the multi-label classification network is trained to capture one or more global characteristics of the object and one or more local characteristics of the one or more salient parts of the object based on input patches cropped from a set of training images, the multi-label classification network classifies at least one input patch of the set of input patches as a first object classification representing a background, and the multi-label classification network classifies one or more other input patches of the set of input patches as one or more additional object classifications representing the one or more salient parts of the object; determining an estimated full appearance of the object in the input image based on the one or more salient parts and a relationship between the one or more salient parts and the object as defined by one or more bounding box templates for the one or more salient parts; and invoking an action on the electronic device based on the estimated full appearance of the object.
12. The system of claim 11, wherein the identifying the one or more salient parts of the object in the input image further comprises: generating a set of feature maps based on a sparse image pyramid; and determining a region of interest (RoI) in the input image based on the set of feature maps.
13. The system of claim 12, wherein the generating the set of feature maps comprises: for each input patch of each pyramid level of the sparse image pyramid, classifying the input patch as one of a plurality of object classifications utilizing the multi-label classification network.
14. The system of claim 11, wherein the object is a face, and the one or more salient parts is one or more facial parts of the face.
15. The system of claim 13, wherein the plurality of object classifications comprise the first object classification and the one or more additional object classifications, and at least one of the plurality of object classifications represents at least one of: the background, whole face, eye, nose, whole mouth, left corner of mouth, right corner of mouth, or ear.
16. The system of claim 13, wherein the operations further comprise: in response to the multi-label classification network capturing a global characteristic of the object in an input patch, determining a location of the object in the input image based on a location of the input patch; and in response to the multi-label classification network capturing a local characteristic of the one or more salient parts of the object in the input patch, inferring the location of the object in the input image based on the location of the input patch and the relationship between the one or more salient parts and the object as defined by the one or more bounding box templates for the one or more salient parts.
17. A non-transitory computer readable storage medium including instructions to perform a method comprising: identifying one or more salient parts of an object in an input image by classifying, at an electronic device,
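The claims on the sparse image pyramid and per-patch classification can be pictured with a rough sketch. It assumes "sparse" means only a few widely spaced scales, uses a fixed-size sliding window, and replaces the trained multi-label classification network with a toy brightness rule. The function names, the window size, and the scale factors are all hypothetical and not taken from the patent.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def downscale(image: Image, factor: int) -> Image:
    """Nearest-neighbour downscale: keep every `factor`-th row and column."""
    return [row[::factor] for row in image[::factor]]


def sparse_pyramid(image: Image, factors: Sequence[int]) -> Dict[int, Image]:
    """One pyramid level per scale factor (assumed: only a few levels)."""
    return {f: downscale(image, f) for f in factors}


def patches(image: Image, size: int, stride: int) -> Iterator[Tuple[int, int, Image]]:
    """Yield (x, y, patch) for each size-by-size window, row by row."""
    height = len(image)
    width = len(image[0]) if height else 0
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            yield x, y, [row[x:x + size] for row in image[y:y + size]]


def toy_classify(patch: Image) -> str:
    """Stand-in for the trained network: bright patches count as a face."""
    values = [v for row in patch for v in row]
    return "whole face" if sum(values) / len(values) > 127 else "background"


def label_maps(
    image: Image,
    classify: Callable[[Image], str],
    size: int = 4,
    stride: int = 4,
    factors: Sequence[int] = (1, 2, 4),
) -> Dict[int, List[Tuple[int, int, int, str]]]:
    """Classify every patch on every level.

    Each entry is (x, y, side, label) in original-image coordinates, so a
    patch found on a coarse level covers a larger region of the input.
    """
    maps = {}
    for factor, level in sparse_pyramid(image, factors).items():
        maps[factor] = [
            (x * factor, y * factor, size * factor, classify(patch))
            for x, y, patch in patches(level, size, stride)
        ]
    return maps
```

Scaling the patch coordinates back by the level's factor is what lets one fixed-size classifier respond to objects of different sizes. A region of interest could then be taken as the union of the non-background entries, although the patent's own RoI procedure is not described in the text shown here.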

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06V10/513
Sparse representations · CPC title
G06V10/82 (Primary)
using neural networks · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G06V40/165
using facial parts and geometric relationships · CPC title
H04N23/611
where the recognised objects include parts of the human body · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 68615388

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11113507B2 cover?: One embodiment provides a method comprising identifying a salient part of an object in an input image based on processing of a region of interest (RoI) in the input image at an electronic device. The method further comprises determining an estimated full appearance of the object in the input image based on the salient part and a relationship between the salient part and the object. The electron…
Who is the assignee on this patent?: Samsung Electronics Co Ltd

Related patents

Image capturing apparatus and photo composition method thereof

Method and apparatus for recognizing object, and method and apparatus for training recognizer

Photo composition and position guidance in a camera or augmented reality system

System and Method for Dynamic Image Composition Guidance in Digital Camera
