Methods and systems for hand gesture-based control of a device

US12093465B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12093465-B2
Application number: US-202217950246-A
Country: US
Kind code: B2
Filing date: Sep 22, 2022
Priority date: Mar 23, 2020
Publication date: Sep 17, 2024
Grant date: Sep 17, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for gesture-based control of a device are described. An input frame is processed to determine a location of a distinguishing anatomical feature in the input frame. A virtual gesture-space is defined based on the location of the distinguishing anatomical feature, the virtual gesture-space being a defined space for detecting a gesture input. The input frame is processed in only the virtual gesture-space, to detect and track a hand. Using information generated from detecting and tracking the at least one hand, a gesture class is determined for the at least one hand. The device may be a smart television, a smart phone, a tablet, etc.
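
The abstract's central step, deriving a virtual gesture-space from the location of a detected face and then restricting hand detection to that region, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the `Box` type, the function names, the rectangular shape, and the `w_scale`/`h_scale` multipliers are hypothetical and are not taken from the patent, which only states that the space is defined based on the feature's location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates: [x0, x1) by [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0


def define_gesture_space(face: Box, frame_w: int, frame_h: int,
                         w_scale: float = 5.0, h_scale: float = 3.0) -> Box:
    """Return a rectangle centred on the face box and clipped to the frame.

    w_scale and h_scale are illustrative assumptions, not patent values.
    """
    cx = (face.x0 + face.x1) / 2
    cy = (face.y0 + face.y1) / 2
    half_w = face.width * w_scale / 2
    half_h = face.height * h_scale / 2
    return Box(
        max(0, int(cx - half_w)),
        max(0, int(cy - half_h)),
        min(frame_w, int(cx + half_w)),
        min(frame_h, int(cy + half_h)),
    )


def crop(frame, space: Box):
    """Keep only the pixels inside the gesture-space.

    `frame` is assumed to be a list of pixel rows; a hand detector would
    then be run on the returned region instead of the full frame.
    """
    return [row[space.x0:space.x1] for row in frame[space.y0:space.y1]]
```

Cropping before hand detection is one plausible reading of "processed in only the virtual gesture-space": the detector sees fewer pixels, and hands outside the region around the selected face are ignored.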

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: processing an input frame of a sequence of frames captured by a camera of a device to determine a location of at least one detected instance of a distinguishing anatomical feature in the input frame, the at least one detected instance of the distinguishing anatomical feature detected in the input frame being a non-hand anatomical feature; defining, for at least a selected one of the at least one detected instance of the distinguishing anatomical feature, a virtual gesture-space based on the location of the selected one instance of the distinguishing anatomical feature, the virtual gesture-space being a shape defined within the input frame for detecting a gesture input; processing only the virtual gesture-space that is the shape defined within each frame in the sequence of frames to detect and track at least one hand; predicting, using information generated from detecting and tracking the at least one hand, a gesture class associated with the at least one hand; and outputting the predicted gesture class associated with the at least one hand.

2. The method of claim 1, wherein the distinguishing anatomical feature is a human face.

3. The method of claim 1, wherein there is a plurality of detected instances of the distinguishing anatomical feature, one virtual gesture-space is defined for each respective detected instance, and each virtual gesture-space is processed to perform hand detection and tracking.

4. The method of claim 1, further comprising: after the virtual gesture-space has been defined, processing at least one subsequent input frame by performing hand detection and tracking in only the defined virtual gesture-space without further performing detection of the distinguishing anatomical feature in the at least one subsequent input frame.

5. The method of claim 1, further comprising: using information generated from detecting and tracking the at least one hand, redefining the virtual gesture-space based on a detected location of the at least one hand.

6. The method of claim 5, further comprising: after the virtual gesture-space has been redefined based on the detected location of the at least one hand, processing at least one subsequent input frame by performing hand detection and tracking only in the redefined virtual gesture-space without further performing detection of the distinguishing anatomical feature in the at least one subsequent input frame.

7. The method of claim 1, wherein the information generated from detecting and tracking the at least one hand includes a bounding box defining the at least one hand in the input frame, and wherein gesture classification is performed using the bounding box.

8. The method of claim 1, further comprising: defining one or more subspaces in the virtual gesture-space; wherein information generated from detecting and tracking the at least one hand includes information indicating the at least one hand is detected in one of the one or more subspaces; and wherein each subspace is associated with a respective mouse input.

9. The method of claim 1, wherein the virtual gesture-space is a 3D shape.

10. An apparatus comprising: a processing device coupled to a memory storing machine-executable instructions thereon, wherein the instructions, when executed by the processing device, cause the apparatus to: process an input frame of a sequence of frames to determine a location of at least one detected instance of a distinguishing anatomical feature in the input frame, the at least one detected instance of the distinguishing anatomical feature detected in the input frame being a non-hand anatomical feature; define, for at least a selected one of the at least one detected instance of the distinguishing anatomical feature, a virtual gesture-space based on the location of the selected one instance of the distinguishing anatomical feature, the virtual gesture-space being a shape defined within the input frame for detecting a gesture input; process only the virtual gesture-space that is the shape defined within each frame in the sequence of frames to detect and track at least one hand; predict, using information generated from detecting and tracking the at least one hand, a gesture class associated with the at least one hand; and output the predicted gesture class associated with the at least one hand.

11. The apparatus of claim 10, wherein the distinguishing anatomical feature is a human face.

12. The apparatus of claim 10, wherein there is a plurality of detected instances of the distinguishing anatomical feature, one virtual gesture-space is defined for each respective detected instance, and each virtual gesture-space is processed to perform hand detection and tracking.

13. The apparatus of claim 10, wherein the instructions further cause the apparatus to: after the virtual gesture-space has been defined, process at least one subsequent input frame by performing hand detection and tracking only in the defined virtual gesture-space without further performing detection of the distinguishing anatomical feature in the at least one subsequent input frame.

14. The apparatus of claim 10, wherein the instructions further cause the apparatus to: using information generated from detecting and tracking the at least one hand, redefine the virtual gesture-space based on a detected location of the at least one hand.

15. The apparatus of claim 14, wherein the instructions further cause the apparatus to: after the virtual gesture-space has been redefined based on the detected location of the at least one hand, process at least one subsequent input frame by performing hand detection and tracking only in the redefined virtual gesture-space without further performing detection of the distinguishing anatomical feature in the at least one subsequent input frame.

16. The apparatus of claim 10, wherein the information generated from detecting and tracking the at least one hand includes a bounding box defining the at least one hand in the input frame, and wherein gesture classification is performed using the bounding box.

17. The apparatus of claim 10, wherein the instructions further cause the apparatus to: define one or more subspaces in the virtual gesture-space; wherein information generated from detecting and tracking the at least one hand includes information indicating the at least one hand is detected in one of the one or more subspaces; and wherein each subspace is associated with a respective mouse input.

18. The apparatus of claim 10, wherein the apparatus is a gesture-controlled device, and wherein the determined gesture class is used to determine a command input to the gesture-controlled device.

19. The apparatus of claim 18, further comprising a camera for capturing the sequence of frames including the input frame, and the gesture-controlled device is one of: a television, a smartphone, a tablet, a vehicle-coupled device, an internet of things device, an artificial reality device, or a virtual reality device.

20. A non-transitory computer-readable medium having machine-executable instructions stored thereon, the instructions, when executed by a processing device of an apparatus, cause the apparatus to: process an input frame of a sequence of frames to determine a location of at least one detected instance of a distinguishing anatomical feature in the input frame, the at least one detected instance of the distinguishing anatomical feature detected in the input frame being a non-hand anatomical feature;
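
Claims 8 and 17 describe dividing the virtual gesture-space into subspaces, each associated with a respective mouse input. The sketch below shows one way this could work, under stated assumptions: the split into equal vertical strips, the `MOUSE_INPUTS` names, and the use of the hand bounding-box centre to pick a subspace are all hypothetical choices made for illustration. The claims do not specify the number, shape, or mapping of the subspaces.

```python
from typing import List, Optional, Tuple

# (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Rect = Tuple[int, int, int, int]

# Hypothetical mapping: one mouse input per subspace, left to right.
MOUSE_INPUTS = ("left_click", "scroll", "right_click")


def split_into_subspaces(space: Rect, n: int) -> List[Rect]:
    """Divide the gesture-space into n vertical strips of near-equal width."""
    x0, y0, x1, y1 = space
    edges = [x0 + (x1 - x0) * i // n for i in range(n + 1)]
    return [(edges[i], y0, edges[i + 1], y1) for i in range(n)]


def mouse_input_for(hand_box: Rect, space: Rect) -> Optional[str]:
    """Return the mouse input of the subspace containing the hand's centre.

    Returns None when the hand centre lies outside the gesture-space.
    """
    hx = (hand_box[0] + hand_box[2]) // 2
    hy = (hand_box[1] + hand_box[3]) // 2
    strips = split_into_subspaces(space, len(MOUSE_INPUTS))
    for (x0, y0, x1, y1), action in zip(strips, MOUSE_INPUTS):
        if x0 <= hx < x1 and y0 <= hy < y1:
            return action
    return None


if __name__ == "__main__":
    gesture_space = (220, 60, 420, 180)
    print(mouse_input_for((230, 100, 270, 140), gesture_space))
```

Because the strips are computed from the gesture-space rather than from the frame, the mapping follows the user if the gesture-space is later redefined, as in claims 5 and 14.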

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title

  • Combinations of networks · CPC title

  • Activation functions · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12093465B2 cover?
Methods and systems for gesture-based control of a device are described. An input frame is processed to determine a location of a distinguishing anatomical feature in the input frame. A virtual gesture-space is defined based on the location of the distinguishing anatomical feature, the virtual gesture-space being a defined space for detecting a gesture input. The input frame is processed in onl…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F3/017. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 17, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).