Systems and methods for person classification and gesture recognition

US12430949B2 · US · B2

Patent metadata
Field | Value
Publication number | US-12430949-B2
Application number | US-202318092384-A
Country | US
Kind code | B2
Filing date | Jan 2, 2023
Priority date | Jan 2, 2023
Publication date | Sep 30, 2025
Grant date | Sep 30, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes receiving image data that includes at least two images of an environment associated with a vehicle, identifying at least one person of interest in the image data, and generating, using a pose estimation model and the image data, a representation of the person of interest. The method also includes determining at least one characteristic associated with the at least two images of the image data and providing, to a machine learning model, at least the representation of the person of interest and the at least one characteristic associated with the at least two images of the image data. The method also includes receiving, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest, and causing the vehicle to take at least one action based on the gesture prediction.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for identifying gestures, the method comprising: receiving image data that includes at least two images of an environment associated with a vehicle; identifying at least one person of interest in the image data; generating, using a pose estimation model and the image data, a representation of the person of interest; determining at least one characteristic associated with the at least two images of the image data; providing, to a machine learning model, at least the representation of the person of interest and the at least one characteristic associated with the at least two images of the image data; receiving, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest; providing, to the machine learning model, at least one of temporal difference input, bone joint distance input, and bone angle input; receiving, from the machine learning model, a classification associated with the gesture prediction, wherein the machine learning model generates the classification associated with the gesture prediction based on the at least one of the temporal difference input, the bone joint distance input, and the bone angle input; and causing the vehicle to take at least one action based on the gesture prediction and the classification associated with the gesture prediction.

2. The method of claim 1, wherein the representation of the person of interest includes a two-dimensional representation of the person of interest.

3. The method of claim 1, wherein the representation of the person of interest includes a skeletal representation of the person of interest.

4. The method of claim 1, wherein the at least one characteristic associated with the at least two images includes a temporal difference between at least one aspect of the person of interest in the at least two images.

5. The method of claim 1, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, a bone joint distance for at least one aspect of the person of interest.

6. The method of claim 1, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, at least one bone angle for at least one bone of the person of interest.

7. The method of claim 6, wherein the at least one bone angle includes a cosine angle.

8. The method of claim 6, wherein the at least one bone angle includes a sine angle.

9. The method of claim 1, wherein the machine learning model is trained using gesture data.

10. The method of claim 1, wherein the machine learning model classifies individuals in the image data as one of authorized person of interest or other person of interest.

11. The method of claim 10, wherein identifying the at least one person of interest in the image data includes receiving from the machine learning model at least one person of interest classified as an authorized person.

12. A system for identifying gestures, the system comprising: a processor; and a memory including instructions that, when executed by the processor, cause the processor to: receive image data that includes at least two images of an environment associated with a vehicle; identify at least one person of interest in the image data; generate, using a pose estimation model and the image data, a representation of the person of interest; determine at least one characteristic associated with the at least two images of the image data; provide, to a machine learning model, at least the representation of the person of interest and the at least one characteristic associated with the at least two images of the image data; receive, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest; provide, to the machine learning model, at least one of temporal difference input, bone joint distance input, and bone angle input; receive, from the machine learning model, a classification associated with the gesture prediction, wherein the machine learning model generates the classification associated with the gesture prediction based on the at least one of the temporal difference input, the bone joint distance input, and the bone angle input; and cause the vehicle to take at least one action based on the gesture prediction and the classification associated with the gesture prediction.

13. The system of claim 12, wherein the representation of the person of interest includes a two-dimensional representation of the person of interest.

14. The system of claim 12, wherein the representation of the person of interest includes a skeletal representation of the person of interest.

15. The system of claim 12, wherein the at least one characteristic associated with the at least two images includes a temporal difference between at least one aspect of the person of interest in the at least two images.

16. The system of claim 12, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, a bone joint distance for at least one aspect of the person of interest.

17. The system of claim 12, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, at least one bone angle for at least one bone of the person of interest.

18. The system of claim 17, wherein the at least one bone angle includes a cosine angle.

19. The system of claim 17, wherein the at least one bone angle includes a sine angle.

20. An apparatus for identifying gestures, the apparatus comprising: a processor; and a memory including instructions that, when executed by the processor, cause the processor to: receive image data that includes at least two images of an environment associated with a vehicle; identify at least one person of interest in the image data; generate, using a pose estimation model and the image data, a two-dimensional skeletal representation of the person of interest; determine at least one characteristic associated with the at least two images of the image data, wherein the at least one characteristic includes at least one of a temporal difference, a bone joint distance, and a bone angle; provide, to a machine learning model, at least the two-dimensional skeletal representation of the person of interest and the at least one characteristic associated with the at least two images of the image data; receive, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest; provide, to the machine learning model, at least one of temporal difference input, bone joint distance input, and bone angle input; receive, from the machine learning model, a classification associated with the gesture prediction, wherein the machine learning model generates the classification associated with the gesture prediction based on the at least one of the temporal difference input, the bone joint distance input, and the bone angle input; and cause the vehicle to take at least one action based on the gesture prediction and the classification associated with the gesture prediction.
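The three input types named in the claims (temporal difference, bone joint distance, and bone angle expressed as cosine and sine) can be illustrated on a two-dimensional skeleton. The sketch below is one plausible reading, not the patent's definition: the joint names, the function names, and the choice to define a bone angle as the angle between two adjacent bones are all assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]
Skeleton = Dict[str, Point]  # hypothetical: joint name -> (x, y) image coordinates
Bone = Tuple[str, str]       # hypothetical: (start joint, end joint)


def joint_distance(s: Skeleton, a: str, b: str) -> float:
    """Euclidean distance between two joints in one image."""
    return math.hypot(s[b][0] - s[a][0], s[b][1] - s[a][1])


def bone_angle(s: Skeleton, first: Bone, second: Bone) -> Tuple[float, float]:
    """Cosine and sine of the angle from one bone vector to another.

    Returns (0.0, 0.0) when either bone has zero length, since the
    angle is undefined in that case.
    """
    ax, ay = s[first[1]][0] - s[first[0]][0], s[first[1]][1] - s[first[0]][1]
    bx, by = s[second[1]][0] - s[second[0]][0], s[second[1]][1] - s[second[0]][1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        return (0.0, 0.0)
    cos_theta = (ax * bx + ay * by) / norm  # normalized dot product
    sin_theta = (ax * by - ay * bx) / norm  # normalized 2-D cross product
    return (cos_theta, sin_theta)


def temporal_difference(prev: Skeleton, curr: Skeleton) -> Dict[str, Point]:
    """Per-joint displacement between two images, for joints present in both."""
    return {
        j: (curr[j][0] - prev[j][0], curr[j][1] - prev[j][1])
        for j in curr
        if j in prev
    }


# Two toy frames: the wrist moves one unit along y between them.
frame_a: Skeleton = {"shoulder": (0.0, 0.0), "elbow": (1.0, 0.0), "wrist": (1.0, 1.0)}
frame_b: Skeleton = {"shoulder": (0.0, 0.0), "elbow": (1.0, 0.0), "wrist": (1.0, 2.0)}

dist = joint_distance(frame_a, "shoulder", "wrist")
cos_t, sin_t = bone_angle(frame_a, ("shoulder", "elbow"), ("elbow", "wrist"))
delta = temporal_difference(frame_a, frame_b)
```

Carrying both cosine and sine preserves the direction of the bend, which a cosine alone would lose; whether the patent intends this pairing is not stated in the claims.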

Assignees

Valeo North America Inc, Valeo Schalter & Sensoren Gmbh

Inventors

Classifications

  • using classification, e.g. of video objects · CPC title

  • Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title

  • G06V20/58 · Primary

    Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title

  • using neural networks · CPC title

  • Static body considered as a whole, e.g. static pedestrian or occupant recognition · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12430949B2 cover?
A method includes receiving image data that includes at least two images of an environment associated with a vehicle, identifying at least one person of interest in the image data, and generating, using a pose estimation model and the image data, a representation of the person of interest. The method also includes determining at least one characteristic associated with the at least two images o…
Who is the assignee on this patent?
Valeo North America Inc, Valeo Schalter & Sensoren Gmbh
What technology area does this patent fall under?
Primary CPC classification G06V20/58. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 30, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).