Hand pose estimation from stereo cameras

US11880509B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11880509-B2
Application number: US-202318151857-A
Country: US
Kind code: B2
Filing date: Jan 9, 2023
Priority date: Sep 9, 2019
Publication date: Jan 23, 2024
Grant date: Jan 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods herein describe using a neural network to identify a first set of joint location coordinates and a second set of joint location coordinates and identifying a three-dimensional hand pose based on both the first and second sets of joint location coordinates.
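The abstract does not say how pixel-space joint locations relate to locations in physical space. As background only, the sketch below shows one standard way a rectified stereo pair can turn matched joint pixels into 3D coordinates in millimeters, using disparity-based triangulation. This is an illustrative assumption, not the method disclosed in the patent: the function name `triangulate_joints_mm` and the parameters `focal_px`, `baseline_mm`, `cx`, and `cy` are hypothetical, and the patent describes a neural network rather than this closed-form geometry.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def triangulate_joints_mm(
    left_px: Sequence[Point2D],
    right_px: Sequence[Point2D],
    focal_px: float,
    baseline_mm: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project matched joint pixels from a rectified stereo pair.

    Assumes (hypothetically) identical pinhole cameras with focal length
    `focal_px` in pixels, principal point (`cx`, `cy`), and a horizontal
    baseline of `baseline_mm` millimeters between the two cameras.
    Returns (x, y, z) in millimeters in the left camera's frame.
    """
    if len(left_px) != len(right_px):
        raise ValueError("left and right joint lists must have equal length")

    joints_mm: List[Point3D] = []
    for (u_left, v_left), (u_right, _v_right) in zip(left_px, right_px):
        # In a rectified pair, a point shifts only horizontally between views.
        disparity = u_left - u_right
        if disparity <= 0:
            raise ValueError("disparity must be positive for a point in front of the cameras")
        z = focal_px * baseline_mm / disparity   # depth from similar triangles
        x = (u_left - cx) * z / focal_px         # pinhole back-projection
        y = (v_left - cy) * z / focal_px
        joints_mm.append((x, y, z))
    return joints_mm


# One joint seen at (340, 260) in the left view and (320, 260) in the right view.
print(triangulate_joints_mm([(340.0, 260.0)], [(320.0, 260.0)], 500.0, 60.0, 320.0, 240.0))
```

A learned model can outperform this geometry when joints are occluded in one view, which may be one reason to combine per-view pixel estimates with a separate 3D estimate as the abstract describes.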

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, from a camera, a plurality of images of a hand; for each given image in the plurality of images: cropping, using one or more processors, a portion of the given image comprising the hand; identifying a first set of joint location coordinates in the cropped portion of the given image, the first set of joint location coordinates representing pixel locations of the hand relative to the cropped portion of the given image; generating a second set of joint location coordinates, the second set of joint location coordinates representing joint locations of the hand relative to a three-dimensional physical space; and identifying a three-dimensional hand pose of the hand based on the first set of joint location coordinates and the second set of joint location.
2. The method of claim 1, wherein the plurality of images comprises a plurality of views of the hand.
3. The method of claim 1, further comprising: prompting a user of a client device to initialize a hand position; receiving the initialized hand position; and tracking the hand based on the initialized hand position.
4. The method of claim 1, wherein the camera is a stereo camera.
5. The method of claim 1, further comprising identifying an intermediate set of joint location coordinates, wherein the intermediate set of joint location coordinates represent pixel locations of the hand relative to the given image.
6. The method of claim 5, wherein the intermediate set of joint location coordinates is measured relative to an uncropped version of the given image.
7. The method of claim 1, further comprising: generating a synthetic training dataset comprising stereo image pairs of virtual hands and corresponding ground truth labels, wherein the corresponding ground truth labels comprise joint locations.
8. A system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, configure the system to perform operations comprising: receiving, from a camera, a plurality of images of a hand; for each given image in the plurality of images: cropping, using one or more processors, a portion of the given image comprising the hand; identifying a first set of joint location coordinates in the cropped portion of the given image, the first set of joint location coordinates representing pixel locations of the hand relative to the cropped portion of the given image; generating a second set of joint location coordinates, the second set of joint location coordinates representing joint locations of the hand relative to a three-dimensional physical space; and identifying a three-dimensional hand pose of the hand based on the first set of joint location coordinates and the second set of joint location.
9. The system of claim 8, wherein the plurality of images comprises a plurality of views of the hand.
10. The system of claim 8, further comprising: prompting a user of a client device to initialize a hand position; receiving the initialized hand position; and tracking the hand based on the initialized hand position.
11. The system of claim 8, wherein the camera is a stereo camera.
12. The system of claim 8, further comprising identifying an intermediate set of joint location coordinates, wherein the intermediate set of joint location coordinates represent pixel locations of the hand relative to the given image.
13. The system of claim 12, wherein the intermediate set of joint location coordinates is measured relative to an uncropped version of the given image.
14. The system of claim 8, further comprising: generating a synthetic training dataset comprising stereo image pairs of virtual hands and corresponding ground truth labels, wherein the corresponding ground truth labels comprise joint locations.
15. A non-transitory computer-readable storage medium, the computer-readable storage medium including instructions that when executed by a computer, cause the computer to perform operations comprising: receiving, from a camera, a plurality of images of a hand; for each given image in the plurality of images: cropping, using one or more processors, a portion of the given image comprising the hand; identifying a first set of joint location coordinates in the cropped portion of the given image, the first set of joint location coordinates representing pixel locations of the hand relative to the cropped portion of the given image; generating a second set of joint location coordinates, the second set of joint location coordinates representing joint locations of the hand relative to a three-dimensional physical space; and identifying a three-dimensional hand pose of the hand based on the first set of joint location coordinates and the second set of joint location.
16. The computer-readable storage medium of claim 15, wherein the plurality of images comprises a plurality of views of the hand.
17. The computer-readable storage medium of claim 15, further comprising: prompting a user of a client device to initialize a hand position; receiving the initialized hand position; and tracking the hand based on the initialized hand position.
18. The computer-readable storage medium of claim 15, wherein the second set of joint location coordinates is measured use millimeters.
19. The computer-readable storage medium of claim 15, further comprising: generating a synthetic training dataset comprising stereo image pairs of virtual hands and corresponding ground truth labels, wherein the corresponding ground truth labels comprise joint locations.
20. The computer-readable storage medium of claim 15, further comprising identifying an intermediate set of joint location coordinates, wherein the intermediate set of joint location coordinates represent pixel locations of the hand relative to the given image.
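The claims distinguish joint pixel locations measured relative to the cropped portion (claim 1) from an intermediate set measured relative to the uncropped image (claims 5 and 6). A minimal sketch of how the two coordinate frames can relate is given below, assuming an axis-aligned crop box and an optional uniform resize of the crop. The names `CropBox`, `crop_image`, and `crop_to_image_coords`, and the `scale` parameter, are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


@dataclass(frozen=True)
class CropBox:
    """Axis-aligned crop region in uncropped-image pixels (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int


def crop_image(image: Sequence[Sequence[int]], box: CropBox) -> List[List[int]]:
    """Return the sub-image covered by `box`; `image` is a list of pixel rows."""
    rows = image[box.top:box.top + box.height]
    return [list(row[box.left:box.left + box.width]) for row in rows]


def crop_to_image_coords(
    joints_in_crop: Sequence[Point2D],
    box: CropBox,
    scale: float = 1.0,
) -> List[Point2D]:
    """Map joint pixels from the crop frame to the uncropped image frame.

    `scale` is the assumed resize factor applied to the crop before joint
    detection (detector input size divided by crop size); 1.0 means no resize.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Undo the resize first, then shift by the crop's top-left corner.
    return [(box.left + u / scale, box.top + v / scale) for u, v in joints_in_crop]


box = CropBox(left=100, top=50, width=64, height=64)
print(crop_to_image_coords([(10.0, 20.0)], box))             # crop not resized
print(crop_to_image_coords([(10.0, 20.0)], box, scale=2.0))  # crop upsampled 2x
```

Keeping the crop box alongside each detection makes the mapping reversible, so the same joints can be expressed in either frame as needed.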

Assignees

Snap Inc

Inventors

Classifications

  • G06F3/017 (Primary)

    Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title

  • Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00) · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title

  • G06T7/73 (Primary)

    using feature-based methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11880509B2 cover?
Systems and methods herein describe using a neural network to identify a first set of joint location coordinates and a second set of joint location coordinates and identifying a three-dimensional hand pose based on both the first and second sets of joint location coordinates.
Who is the assignee on this patent?
Snap Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/017. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 23, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).