Dynamic hand gesture recognition using depth data

US9990050B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9990050-B2
Application number: US-201615334269-A
Country: US
Kind code: B2
Filing date: Oct 25, 2016
Priority date: Jun 18, 2012
Publication date: Jun 5, 2018
Grant date: Jun 5, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The subject disclosure is directed towards a technology by which dynamic hand gestures are recognized by processing depth data, including in real-time. In an offline stage, a classifier is trained from feature values extracted from frames of depth data that are associated with intended hand gestures. In an online stage, a feature extractor extracts feature values from sensed depth data that corresponds to an unknown hand gesture. These feature values are input to the classifier as a feature vector to receive a recognition result of the unknown hand gesture. The technology may be used in real time, and may be robust to variations in lighting, hand orientation, and the user's gesturing speed and style.
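
The two stages described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract names neither the features nor the classifier, so `extract_features` uses a placeholder feature (average centroid displacement between frames), `train` and `recognize` use a nearest-mean classifier as a stand-in, and `dot_frame` is a hypothetical helper that builds synthetic depth frames.

```python
from math import dist
from statistics import fmean


def centroid(frame):
    """Mean (x, y) position of the pixels that have a depth reading (> 0)."""
    points = [(x, y) for y, row in enumerate(frame)
              for x, depth in enumerate(row) if depth > 0]
    return fmean(p[0] for p in points), fmean(p[1] for p in points)


def extract_features(frames):
    """Placeholder feature vector: average per-frame centroid displacement.

    Needs at least two frames, each with at least one nonzero pixel.
    """
    centers = [centroid(frame) for frame in frames]
    steps = list(zip(centers, centers[1:]))
    return (fmean(b[0] - a[0] for a, b in steps),
            fmean(b[1] - a[1] for a, b in steps))


def train(labeled_sequences):
    """Offline stage: average the feature vectors of each gesture label."""
    grouped = {}
    for frames, label in labeled_sequences:
        grouped.setdefault(label, []).append(extract_features(frames))
    return {label: tuple(fmean(column) for column in zip(*vectors))
            for label, vectors in grouped.items()}


def recognize(model, frames):
    """Online stage: return the label whose mean vector is nearest."""
    vector = extract_features(frames)
    return min(model, key=lambda label: dist(model[label], vector))


def dot_frame(x, y, size=8):
    """Hypothetical helper: depth frame with one nonzero pixel at (x, y)."""
    frame = [[0.0] * size for _ in range(size)]
    frame[y][x] = 1.0
    return frame


# Offline: train on two labeled synthetic gestures.
model = train([
    ([dot_frame(x, 4) for x in range(1, 6)], "swipe_right"),
    ([dot_frame(4, y) for y in range(1, 6)], "swipe_down"),
])

# Online: classify an unseen sequence that moves left to right.
unknown = [dot_frame(x, 3) for x in range(2, 7)]
print(recognize(model, unknown))  # prints "swipe_right"
```

The point of the sketch is the split itself: the same feature extractor runs in both stages, so the classifier sees vectors of the same shape during training and recognition.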

First claim

Opening claim text (preview).

What is claimed is:

1. In a computing environment, a method performed at least in part on at least one processor, the method comprising: detecting a hand by: identifying a wrist area as a thinnest part of an arm portion; and separating the arm portion at the identified wrist area; segmenting depth data to isolate the hand represented in a plurality of frames that include hand movement; rotating the hand such that a palm of the hand has a normalized and oriented position relative to an image plane; extracting feature values corresponding to the rotated hand; and recognizing the hand movement as a hand gesture based upon the feature values.

2. The method of claim 1, wherein extracting the feature values corresponding to the hand comprises extracting feature values based upon hand velocity data.

3. The method of claim 1, further comprising processing the depth data by: dividing an original depth map for a frame into a plurality of blobs by connecting adjacent pixels if a difference between depth values of the pixels is less than a pre-defined threshold; determining a largest blob of the plurality of blobs; identifying blobs within a predefined distance of the largest blob; and classifying the largest blob and the blobs within the predefined distance of the largest blob as a human body.

4. The method of claim 1, further comprising detecting the hand by: segmenting the depth data into a human shape, and wherein detecting the hand is further based upon depth data of the hand relative to depth data of the human shape.

5. The method of claim 1 wherein detecting the hand further comprises refining an object that includes an arm portion and a hand portion.

6. The method of claim 1, wherein extracting the feature values corresponding to the hand comprises extracting feature values based on one or more of the following: one or more hand rotation parameters, and at least one shape descriptor.

7. The method of claim 1, wherein extracting the feature values corresponding to the hand comprises extracting shape descriptor feature values based upon one or more occupancy features.

8. The method of claim 1, wherein extracting the feature values corresponding to the hand comprises extracting shape descriptor feature values based upon one or more silhouette features.

9. A system comprising: a memory; a computing device; and a processor programmed to: detect a hand by: identifying a wrist area as a thinnest part of an arm portion; and separating the arm portion at the identified wrist area; segment depth data to isolate the hand represented in a plurality of frames that include hand movement; rotate the hand such that a palm of the hand has a normalized and oriented position relative to an image plane; extract feature values corresponding to the rotated hand; and recognize the hand movement as a hand gesture based upon the feature values.

10. The system of claim 9, wherein extracting the feature values corresponding to the hand comprises extracting feature values based upon hand velocity data.

11. The system of claim 9, wherein the processor is further programmed to: process the depth data by: dividing an original depth map for a frame into a plurality of blobs by connecting adjacent pixels if a difference between depth values of the pixels is less than a pre-defined threshold; determining a largest blob of the plurality of blobs; identifying blobs within a predefined distance of the largest blob; and classifying the largest blob and the blobs within the predefined distance of the largest blob as a human body.

12. The system of claim 9, wherein the processor is further programmed to detect the hand by: segmenting the depth data into a human shape.

13. The system of claim 9, wherein the processor is further programmed to detect the hand by: refining an object that includes an arm portion and a hand portion.

14. The system of claim 9, wherein the processor is further programmed to: identify a hand region; and determine that the identified hand region includes a portion of an arm and a portion of the hand.

15. The system of claim 9, wherein the processor is further programmed to detect the hand by: determining a plurality of hypothesized hand regions; and determining a hand region from among the hypothesized hand regions based upon processing one or more previous frames of depth data.

16. The system of claim 9, wherein extracting the feature values corresponding to the hand comprises extracting shape descriptor feature values based upon one or more silhouette features.

17. One or more computer-readable storage devices having computer-executable instructions, which when executed perform operations comprising: detecting a hand by: identifying a wrist area as a thinnest part of an arm portion; and separating the arm portion at the identified wrist area; segmenting depth data to isolate the hand represented in a plurality of frames that include hand movement; rotating the hand such that a palm of the hand has a normalized and oriented position relative to an image plane; extracting feature values corresponding to the rotated hand; and recognizing the hand movement as a hand gesture based upon the feature values.

18. The one or more computer-readable storage devices of claim 17, wherein extracting the feature values corresponding to the hand comprises extracting a hand velocity feature value set, a hand rotation feature value set, and a hand shape descriptor feature set.

19. The one or more computer-readable storage devices of claim 17, further comprising further computer-executable instructions, which when executed perform operations comprising: processing the depth data by: dividing an original depth map for a frame into a plurality of blobs by connecting adjacent pixels if a difference between depth values of the pixels is less than a pre-defined threshold; determining a largest blob of the plurality of blobs; identifying blobs within a predefined distance of the largest blob; and classifying the largest blob and the blobs within the predefined distance of the largest blob as a human body.

20. The one or more computer-readable storage devices of claim 17, further comprising further computer-executable instructions, which when executed perform operations comprising detecting the hand by: refining an object that includes an arm portion and a hand portion.
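
Two steps recited in the claims lend themselves to a short illustration: the blob-based body segmentation of claim 3 and the wrist-based separation of claim 1. The sketch below is one possible reading, not the patented implementation. Its assumptions: a depth value of 0 means "no reading", adjacency is 4-connected, the distance between blobs is measured between centroids in pixel units (the claim does not say how distance is measured), and the arm-plus-hand mask is oriented with the hand at the top, with a caller-supplied `arm_top_row` marking where the arm portion begins. All function and parameter names are hypothetical.

```python
from collections import deque
from math import dist


def label_blobs(depth, threshold):
    """Group 4-adjacent pixels whose depth values differ by less than
    `threshold`. Pixels with depth 0 are treated as missing (assumption)."""
    height, width = len(depth), len(depth[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for start_y in range(height):
        for start_x in range(width):
            if seen[start_y][start_x] or depth[start_y][start_x] == 0:
                continue
            seen[start_y][start_x] = True
            pixels, queue = [], deque([(start_y, start_x)])
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and depth[ny][nx] != 0
                            and abs(depth[ny][nx] - depth[y][x]) < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(pixels)
    return blobs


def centroid(pixels):
    """Mean (row, column) of a list of pixel coordinates."""
    return (sum(p[0] for p in pixels) / len(pixels),
            sum(p[1] for p in pixels) / len(pixels))


def human_body(blobs, max_distance):
    """Largest blob plus every blob whose centroid lies within
    `max_distance` pixels of the largest blob's centroid (assumed metric)."""
    largest = max(blobs, key=len)
    anchor = centroid(largest)
    body = []
    for blob in blobs:
        if blob is largest or dist(centroid(blob), anchor) <= max_distance:
            body.extend(blob)
    return body


def separate_hand(mask, arm_top_row):
    """Find the wrist as the thinnest non-empty row of the arm portion
    (rows from `arm_top_row` downward) and keep only the rows above it.

    `mask` is a list of 0/1 rows for an arm-plus-hand object, assumed to be
    oriented with the hand at the top. Returns (wrist_row, hand_mask).
    """
    widths = {y: sum(mask[y])
              for y in range(arm_top_row, len(mask)) if any(mask[y])}
    wrist_row = min(widths, key=widths.get)
    hand = [row[:] if y < wrist_row else [0] * len(row)
            for y, row in enumerate(mask)]
    return wrist_row, hand
```

Restricting the wrist search to the arm portion matters: without a bound such as `arm_top_row`, a single extended finger could be narrower than the wrist and be mistaken for it.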

Assignees

Inventors

Classifications

  • Markov-related models; Markov random fields · CPC title

  • using classification, e.g. of video objects · CPC title

  • G06V40/28 · Primary

    Recognition of hand or arm movements, e.g. recognition of deaf sign language (static hand signs G06V40/113) · CPC title

  • G06F3/017 · Primary

    Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title

  • Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9990050B2 cover?
The subject disclosure is directed towards a technology by which dynamic hand gestures are recognized by processing depth data, including in real-time. In an offline stage, a classifier is trained from feature values extracted from frames of depth data that are associated with intended hand gestures. In an online stage, a feature extractor extracts feature values from sensed depth data that cor…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06V40/28. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 5, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).