Flagman traffic gesture recognition
US-2022318560-A1 · Oct 6, 2022 · US
US12430949B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12430949-B2 |
| Application number | US-202318092384-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 2, 2023 |
| Priority date | Jan 2, 2023 |
| Publication date | Sep 30, 2025 |
| Grant date | Sep 30, 2025 |
Official abstract text for this publication.
A method includes receiving image data that includes at least two images of an environment associated with a vehicle, identifying at least one person of interest in the image data, and generating, using a pose estimation model and the image data, a representation of the person of interest. The method also includes determining at least one characteristic associated with the at least two images of the image data and providing, to a machine learning model, at least the representation of the person of interest and the at least one characteristic associated with the at least two images of the image data. The method also includes receiving, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest, and causing the vehicle to take at least one action based on the gesture prediction.
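The data flow in the abstract can be sketched as a single function that chains the stages. This is only an illustration under stated assumptions: the function name, the injected callables (`find_person`, `estimate_pose`, `characteristics`, `gesture_model`), the action table, and the `"no_action"` fallback are all hypothetical and do not come from the patent.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


def run_gesture_pipeline(
    frames: Sequence[Any],
    find_person: Callable[[Any], Optional[Any]],
    estimate_pose: Callable[[Any, Any], Any],
    characteristics: Callable[[List[Any]], List[float]],
    gesture_model: Callable[[List[Any], List[float]], str],
    actions: Dict[str, str],
) -> str:
    """Chain the stages named in the abstract (all callables are hypothetical)."""
    # The abstract requires image data with at least two images.
    if len(frames) < 2:
        raise ValueError("at least two images are required")

    poses: List[Any] = []
    for frame in frames:
        # Identify the person of interest in this image.
        region = find_person(frame)
        if region is None:
            # Assumption: with no person of interest, the vehicle does nothing new.
            return "no_action"
        # Generate a representation of the person with a pose estimation model.
        poses.append(estimate_pose(frame, region))

    # Determine characteristics associated with the images.
    feats = characteristics(poses)
    # Provide the representation and characteristics to the model,
    # and receive a gesture prediction back.
    gesture = gesture_model(poses, feats)
    # Map the predicted gesture to a vehicle action (assumed lookup table).
    return actions.get(gesture, "no_action")
```

Passing the stages in as callables keeps the sketch independent of any particular detector, pose estimator, or classifier; the abstract does not name specific ones.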
Opening claim text (preview).
What is claimed is:

1. A method for identifying gestures, the method comprising: receiving image data that includes at least two images of an environment associated with a vehicle; identifying at least one person of interest in the image data; generating, using a pose estimation model and the image data, a representation of the person of interest; determining at least one characteristic associated with the at least two images of the image data; providing, to a machine learning model, at least the representation of the person of interest and the at least one characteristic associated with the at least two images of the image data; receiving, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest; providing, to the machine learning model, at least one of temporal difference input, bone joint distance input, and bone angle input; receiving, from the machine learning model, a classification associated with the gesture prediction, wherein the machine learning model generates the classification associated with the gesture prediction based on the at least one of the temporal difference input, the bone joint distance input, and the bone angle input; and causing the vehicle to take at least one action based on the gesture prediction and the classification associated with the gesture prediction.
2. The method of claim 1, wherein the representation of the person of interest includes a two-dimensional representation of the person of interest.
3. The method of claim 1, wherein the representation of the person of interest includes a skeletal representation of the person of interest.
4. The method of claim 1, wherein the at least one characteristic associated with the at least two images includes a temporal difference between at least one aspect of the person of interest in the at least two images.
5. The method of claim 1, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, a bone joint distance for at least one aspect of the person of interest.
6. The method of claim 1, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, at least one bone angle for at least one bone of the person of interest.
7. The method of claim 6, wherein the at least one bone angle includes a cosine angle.
8. The method of claim 6, wherein the at least one bone angle includes a sine angle.
9. The method of claim 1, wherein the machine learning model is trained using gesture data.
10. The method of claim 1, wherein the machine learning model classifies individuals in the image data as one of authorized person of interest or other person of interest.
11. The method of claim 10, wherein identifying the at least one person of interest in the image data includes receiving from the machine learning model at least one person of interest classified as an authorized person.
12. A system for identifying gestures, the system comprising: a processor; and a memory including instructions that, when executed by the processor, cause the processor to: receive image data that includes at least two images of an environment associated with a vehicle; identify at least one person of interest in the image data; generate, using a pose estimation model and the image data, a representation of the person of interest; determine at least one characteristic associated with the at least two images of the image data; provide, to a machine learning model, at least the representation of the person of interest and the at least one characteristic associated with the at least two images of the image data; receive, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest; provide, to the machine learning model, at least one of temporal difference input, bone joint distance input, and bone angle input; receive, from the machine learning model, a classification associated with the gesture prediction, wherein the machine learning model generates the classification associated with the gesture prediction based on the at least one of the temporal difference input, the bone joint distance input, and the bone angle input; and cause the vehicle to take at least one action based on the gesture prediction and the classification associated with the gesture prediction.
13. The system of claim 12, wherein the representation of the person of interest includes a two-dimensional representation of the person of interest.
14. The system of claim 12, wherein the representation of the person of interest includes a skeletal representation of the person of interest.
15. The system of claim 12, wherein the at least one characteristic associated with the at least two images includes a temporal difference between at least one aspect of the person of interest in the at least two images.
16. The system of claim 12, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, a bone joint distance for at least one aspect of the person of interest.
17. The system of claim 12, wherein the at least one characteristic associated with the at least two images includes, for each image of the at least two images, at least one bone angle for at least one bone of the person of interest.
18. The system of claim 17, wherein the at least one bone angle includes a cosine angle.
19. The system of claim 17, wherein the at least one bone angle includes a sine angle.
20. An apparatus for identifying gestures, the apparatus comprising: a processor; and a memory including instructions that, when executed by the processor, cause the processor to: receive image data that includes at least two images of an environment associated with a vehicle; identify at least one person of interest in the image data; generate, using a pose estimation model and the image data, a two-dimensional skeletal representation of the person of interest; determine at least one characteristic associated with the at least two images of the image data, wherein the at least one characteristic includes at least one of a temporal difference, a bone joint distance, and a bone angle; provide, to a machine learning model, at least the two-dimensional skeletal representation of the person of interest and the at least one characteristic associated with the at least two images of the image data; receive, from the machine learning model, a gesture prediction indicating a predicted gesture being made by the person of interest; provide, to the machine learning model, at least one of temporal difference input, bone joint distance input, and bone angle input; receive, from the machine learning model, a classification associated with the gesture prediction, wherein the machine learning model generates the classification associated with the gesture prediction based on the at least one of the temporal difference input, the bone joint distance input, and the bone angle input; and cause the vehicle to take at least one action based on the gesture prediction and the classification associated with the gesture prediction.
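The claims name three kinds of input derived from the skeletal representation: a temporal difference, a bone joint distance, and a bone angle expressed as a cosine or sine. One plausible way to compute them from two-dimensional joint coordinates is sketched below. The joint names, the `BONES` list, the choice to measure each bone's angle against the horizontal image axis, and the feature ordering are all assumptions made for illustration; the claims do not fix any of them.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Skeleton = Dict[str, Point]  # joint name -> (x, y) image coordinates

# Hypothetical bones, each a pair of joint names (not taken from the patent).
BONES: List[Tuple[str, str]] = [("shoulder", "elbow"), ("elbow", "wrist")]


def temporal_difference(prev: Skeleton, curr: Skeleton) -> Dict[str, Point]:
    """Per-joint displacement between two images of the same person."""
    return {
        joint: (curr[joint][0] - prev[joint][0], curr[joint][1] - prev[joint][1])
        for joint in curr
        if joint in prev
    }


def bone_joint_distance(skel: Skeleton, a: str, b: str) -> float:
    """Euclidean distance between the two joints that bound a bone."""
    (ax, ay), (bx, by) = skel[a], skel[b]
    return math.hypot(bx - ax, by - ay)


def bone_angle(skel: Skeleton, a: str, b: str) -> Tuple[float, float]:
    """(cosine, sine) of the bone direction relative to the image x-axis.

    Assumption: the angle is measured against the horizontal axis.
    """
    (ax, ay), (bx, by) = skel[a], skel[b]
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return (0.0, 0.0)  # degenerate bone: both joints coincide
    return ((bx - ax) / length, (by - ay) / length)


def frame_features(prev: Skeleton, curr: Skeleton) -> List[float]:
    """Flatten all three characteristics into one feature vector."""
    feats: List[float] = []
    diff = temporal_difference(prev, curr)
    for joint in sorted(diff):  # sorted for a stable feature order
        feats.extend(diff[joint])
    for a, b in BONES:
        feats.append(bone_joint_distance(curr, a, b))
        feats.extend(bone_angle(curr, a, b))
    return feats
```

Encoding the angle as a cosine and sine pair rather than a raw angle avoids the discontinuity where an angle wraps around, which may be one reason claims 7, 8, 18, and 19 mention both forms.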
using classification, e.g. of video objects · CPC title
Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title
Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title
using neural networks · CPC title
Static body considered as a whole, e.g. static pedestrian or occupant recognition · CPC title