Feature point position detecting apparatus, feature point position detecting method and feature point position detecting program
US-2015356346-A1 · Dec 10, 2015 · US
US9268994B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9268994-B2 |
| Application number | US-201313967521-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 15, 2013 |
| Priority date | Mar 15, 2013 |
| Publication date | Feb 23, 2016 |
| Grant date | Feb 23, 2016 |
Official abstract text for this publication.
A unified framework detects and classifies people interactions in unconstrained user generated images. Previous approaches directly map people/face locations in two-dimensional image space into features for classification. Among other things, the disclosed framework estimates a camera viewpoint and people positions in 3D space and then extracts spatial configuration features from explicit three-dimensional people positions.
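The step from 2D face locations to 3D people positions can be illustrated with a minimal sketch. It assumes a simple pinhole camera, a known focal length in pixels, and a fixed real-world face height; none of these values or names (`Face`, `face_to_3d`, `pairwise_distances`, the two constants) come from the patent text, and the disclosed framework estimates the camera viewpoint rather than assuming it. The sketch only shows why explicit 3D positions give different spatial features than raw image coordinates.

```python
import math
from dataclasses import dataclass

# Hypothetical constants chosen for illustration only (not from the patent).
ASSUMED_FACE_HEIGHT_M = 0.24   # rough real-world height of a face box, metres
ASSUMED_FOCAL_PX = 1000.0      # camera focal length expressed in pixels


@dataclass
class Face:
    cx: float  # face-box centre x in pixels, measured from the image centre
    cy: float  # face-box centre y in pixels, measured from the image centre
    h: float   # face-box height in pixels


def face_to_3d(face, focal_px=ASSUMED_FOCAL_PX, face_height_m=ASSUMED_FACE_HEIGHT_M):
    """Back-project a face box to a 3D point under a pinhole model.

    Depth follows from similar triangles: z = f * H / h, so a face that
    appears half as tall is placed twice as far from the camera.
    """
    z = focal_px * face_height_m / face.h
    x = face.cx * z / focal_px
    y = face.cy * z / focal_px
    return (x, y, z)


def pairwise_distances(points):
    """Euclidean distance between every unordered pair of 3D points."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


if __name__ == "__main__":
    faces = [Face(0, 0, 240), Face(100, 0, 120)]
    points = [face_to_3d(f) for f in faces]
    print(points)
    print(pairwise_distances(points))
```

In this example the two faces are only 100 pixels apart in the image, yet the estimated 3D separation is dominated by the difference in depth, which a purely 2D feature would miss.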
Opening claim text (preview).
The invention claimed is:

1. A method for recognizing a human interaction depicted in a two-dimensional image, the method comprising, algorithmically: detecting a plurality of human face locations of people depicted in the image; determining a three-dimensional spatial arrangement of the people depicted in the image based on the detected human face locations; performing a proxemics-based analysis of the three-dimensional spatial arrangement of the people depicted in the image, wherein the proxemics-based analysis identifies cues in the three-dimensional spatial arrangement that are relevant to human interactions; and classifying the image as depicting a type of human interaction using visual proxemes, wherein the visual proxemes comprise a set of prototypical patterns that represent commonly occurring people interactions; wherein the image is created by a camera positioned at a camera viewpoint relative to a reference plane, and the method comprises estimating the camera viewpoint and using the estimated camera viewpoint to classify the image.

2. The method of claim 1, comprising detecting, in the image, a person standing in front of another person by applying a proxemics-based visibility constraint.

3. The method of claim 1, comprising detecting, in the image, a child and an adult by applying a proxemics-based localized pose constraint.

4. The method of claim 1, comprising classifying the image as depicting a group interaction, a family photo, a group photo, a couple with an audience, a crowd scene, or a speaker and an audience.

5. The method of claim 1, comprising detecting a plurality of feature cues in the image, wherein each of the feature cues relates to a proxemics-based attribute.

6. The method of claim 5, wherein the plurality of feature cues comprises a shape cue that indicates a shape of the spatial arrangement of the detected face locations, a shot composition cue that indicates a visual distribution of the people depicted in the image, a distance cue that measures distances between the detected face locations in the image, a camera pose cue that estimates the height of the camera used to capture the image in relation to the people depicted in the image relative to a ground plane, and a shape layer cue that indicates whether the people depicted in the image are arranged in a single group or in separate subgroups.

7. The method of claim 1, comprising creating a collection of classified images by repeating the detecting, determining, performing, and classifying for a plurality of two-dimensional images and arranging the classified images in a collection according to human interaction type.

8. The method of claim 7, comprising searching the collection using search criteria including a human interaction type.

9. The method of claim 7, comprising retrieving an image from the collection based on a human interaction type.

10. A method for recognizing a human interaction depicted in a two-dimensional image, the method comprising, algorithmically: detecting a plurality of human face locations of people depicted in the image; determining a three-dimensional spatial arrangement of the people depicted in the image based on the detected human face locations; performing a proxemics-based analysis of the three-dimensional spatial arrangement of the people depicted in the image, wherein the proxemics-based analysis identifies cues in the three-dimensional spatial arrangement that are relevant to human interactions; classifying the image as depicting a type of human interaction using visual proxemes, wherein the visual proxemes comprise a set of prototypical patterns that represent commonly occurring people interactions; and classifying a camera viewpoint as a high-angle viewpoint, an eye-level viewpoint, or a low-angle viewpoint.

11. A method for recognizing a human interaction depicted in a two-dimensional image, the method comprising, algorithmically: detecting a plurality of human face locations of people depicted in the image; determining a three-dimensional spatial arrangement of the people depicted in the image based on the detected human face locations; performing a proxemics-based analysis of the three-dimensional spatial arrangement of the people depicted in the image, wherein the proxemics-based analysis identifies cues in the three-dimensional spatial arrangement that are relevant to human interactions; classifying the image as depicting a type of human interaction using visual proxemes, wherein the visual proxemes comprise a set of prototypical patterns that represent commonly occurring people interactions; and analyzing the plurality of detected human face locations using a linear camera model, identifying a face location that does not fit the linear camera model as an outlier, identifying a face location that fits the linear camera model as an inlier, determining the position of the outlier in relation to the inlier, and classifying the image as depicting a type of human interaction based on the position of the outlier in relation to the inlier.

12. The method of claim 11, comprising analyzing the position of the outlier in relation to the inlier using one or more visual proxemics-based constraints.

13. A method for recognizing a human interaction depicted in a two-dimensional image, the method comprising, algorithmically: detecting a plurality of human face locations of people depicted in the image; determining a three-dimensional spatial arrangement of the people depicted in the image based on the detected human face locations; performing a proxemics-based analysis of the three-dimensional spatial arrangement of the people depicted in the image, wherein the proxemics-based analysis identifies cues in the three-dimensional spatial arrangement that are relevant to human interactions; classifying the image as depicting a type of human interaction using visual proxemes, wherein the visual proxemes comprise a set of prototypical patterns that represent commonly occurring people interactions; and alternating between estimating a camera parameter of the camera used to create the image and applying proxemics-based constraints to the three-dimensional spatial arrangement of the human face locations detected in the image to identify the type of human interaction depicted by the image.
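The inlier and outlier step recited in claim 11 can be sketched as a robust line fit. The claims do not define the linear camera model, so this sketch makes an assumption of its own: if people of similar height stand on a common ground plane, the pixel height of a face varies roughly linearly with its vertical image position, `h = a * y + b`. The function names, the exhaustive pair search, and the tolerance `tol` are all hypothetical choices for illustration, not the patented procedure.

```python
from itertools import combinations


def fit_line(p, q):
    """Return (a, b) for the line h = a * y + b through two (y, h) samples.

    Returns None when both samples share the same y, since the slope is
    undefined in that case.
    """
    (y1, h1), (y2, h2) = p, q
    if y1 == y2:
        return None
    a = (h2 - h1) / (y2 - y1)
    return a, h1 - a * y1


def split_inliers_outliers(faces, tol=5.0):
    """Split faces into those that fit a common linear model and those that do not.

    faces: list of (y, h) tuples, where y is the vertical image position of a
           face and h is its height in pixels.
    tol:   assumed maximum residual, in pixels, for a face to count as an inlier.

    Every pair of faces proposes a candidate line; the line supported by the
    most faces wins (a small, deterministic stand-in for RANSAC).
    Returns (inlier_indices, outlier_indices, model), with model None if no
    line could be fitted.
    """
    best_inliers, best_model = [], None
    for p, q in combinations(faces, 2):
        model = fit_line(p, q)
        if model is None:
            continue
        a, b = model
        inliers = [i for i, (y, h) in enumerate(faces) if abs(h - (a * y + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, model
    outliers = [i for i in range(len(faces)) if i not in best_inliers]
    return best_inliers, outliers, best_model


if __name__ == "__main__":
    # Four faces follow h = 0.2 * y; the fifth is much smaller than predicted.
    faces = [(100, 20), (200, 40), (300, 60), (400, 80), (250, 20)]
    print(split_inliers_outliers(faces))
```

A face flagged as an outlier here is simply one whose size is inconsistent with the others at its image position. Deciding what that means, for example a child next to adults or a person standing in front of another, is the role the claims assign to the proxemics-based constraints and is not attempted in this sketch.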
Physics · mapped topic
using facial parts and geometric relationships · CPC title
in albums, collections or shared content, e.g. social network photos or video · CPC title