Perception and understanding of vehicles

US12448009B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12448009-B2
Application number: US-202318326834-A
Country: US
Kind code: B2
Filing date: May 31, 2023
Priority date: May 31, 2023
Publication date: Oct 21, 2025
Grant date: Oct 21, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Autonomous vehicles utilize perception and understanding of road users to predict behaviors of the road users, and to plan a trajectory for the vehicle. Understanding subtypes and attributes of vehicles may help autonomous vehicles better predict behaviors of and react to vehicles in a variety of road situations. To offer additional understanding capabilities, an additional understanding model is added to the perception and understanding pipeline to improve classification of vehicles and extraction of attributes of the vehicles. The exemplary architectures of the understanding model balance recall and precision performance metrics and computational complexity.

First claim

Opening claim text (preview).

What is claimed is:

1. A vehicle comprising: sensors; one or more processors; and one or more storage media encoding instructions executable by the one or more processors to implement an understanding part, wherein the understanding part includes: a main understanding model to classify a tracked object into at least one of: one or more road user classifications and a vehicle classification; and a sub-model to output inferences for a plurality of task groups, the sub-model including: a shared backbone to receive and process sensor data generated from the sensors corresponding to tracked objects having the vehicle classification; one or more temporal networks dedicated to one or more task groups; and heads to output inferences for the respective task groups, wherein the inferences include one or more vehicle subtype classifications and one or more vehicle attributes, wherein the vehicle further includes a planning part configured to plan a trajectory of the vehicle based on the inferences generated by the heads, and a controls part configured to execute the planned trajectory by controlling one or more mechanical systems of the vehicle.

2. The vehicle of claim 1, wherein the task groups comprise: a first task group to extract an emergency vehicle classification, extract emergency vehicle subtype classifications, and extract one or more emergency vehicle flashing light attributes.

3. The vehicle of claim 1, wherein the task groups comprise: a second task group to extract vehicle signal attributes.

4. The vehicle of claim 1, wherein the task groups comprise: a third task group to extract school bus classification, extract one or more school bus flashing light attributes, and extract one or more school bus activeness attributes.

5. The vehicle of claim 1, wherein the task groups comprise: a fourth task group to extract vehicle subtype classifications and extract one or more vehicle attributes.

6. The vehicle of claim 1, wherein the task groups comprise: a fifth task group to extract vehicle subtype classifications.

7. The vehicle of claim 1, wherein the task groups comprise: a sixth task group to extract one or more vehicle open door attributes.

8. The vehicle of claim 1, wherein the shared backbone comprises a deep neural network.

9. The vehicle of claim 1, wherein the one or more temporal networks comprise one or more long short-term memory neural networks dedicated to one or more respective task groups.

10. The vehicle of claim 1, wherein the one or more temporal networks comprise one or more multi-head attention neural networks dedicated to one or more respective task groups.

11. The vehicle of claim 1, wherein the shared backbone comprises a part detector to output global features per frame of the sensor data, and one or more part features per frame of the sensor data.

12. The vehicle of claim 11, wherein the part detector further outputs one or more bounding boxes corresponding to the one or more part features.

13. The vehicle of claim 11, wherein the shared backbone further comprises one or more masking filters to mask one or more part features.

14. The vehicle of claim 11, wherein the sub-model further comprises one or more part attention neural networks dedicated to one or more respective task groups.

15. The vehicle of claim 14, wherein the one or more temporal networks receive part-attended feature vectors for a plurality of timestamps.

16. The vehicle of claim 1, wherein the heads comprise fully connected neural network layers for the respective task groups.

17. The vehicle of claim 1, wherein the understanding part further includes a vehicle understanding fusion part to receive at least one of the inferences generated by the heads and one or more inferences generated by one or more other sub-models.

18. A computer-implemented method for understanding vehicles and controlling a vehicle based on the understanding, the method comprising: determining, by a main understanding model, that a tracked object has a vehicle classification; providing sensor data corresponding to the tracked object having the vehicle classification to a sub-model; determining, by the sub-model, a plurality of inferences based on the sensor data from a first sensor, wherein: determining the plurality of inferences comprises: processing the sensor data using a shared backbone; processing outputs of the shared backbone by one or more temporal networks dedicated to one or more task groups; and generating inferences based on respective outputs of the temporal networks by heads that are dedicated to the respective tasks groups; and the inferences include one or more vehicle subtype classifications and one or more vehicle attributes; and planning a trajectory of the vehicle based on the inferences.

19. The computer-implemented method of claim 18, further comprising: determining, by one or more further sub-models, a plurality of other inferences based on other sensor data captured by a second sensor; and fusing the inferences from the sub-model and the other inferences from the one or more further sub-models to form final vehicle understanding inferences.

20. One or more non-transient storage media encoding instructions executable by one or more processors to implement an understanding part for a vehicle, wherein the understanding part includes: a shared backbone to receive and process sensor data generated from the sensors corresponding to tracked objects having a vehicle classification, wherein the shared backbone includes a part detector to extract global features and a set of one or more part features per frame of the sensor data; one or more part attention neural networks, downstream of the part detector, dedicated to one or more task groups; one or more temporal networks dedicated to one or more task groups; and heads to output inferences for the respective task groups, wherein the inferences include one or more vehicle subtype classifications and one or more vehicle attributes, wherein the instructions further implement a planning part configured to plan a trajectory of the vehicle based on the inferences generated by the heads, and a controls part configured to execute the planned trajectory by controlling one or more mechanical systems of the vehicle.
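
The data flow recited in claims 1 and 18 (a main model gates on the vehicle classification, a shared backbone runs once per frame, and each task group has its own temporal network and head) can be illustrated with a toy sketch. This is not the patented implementation: every name here (`shared_backbone`, `TaskGroup`, `understand`, the thresholds, and the sample frames) is hypothetical, and simple arithmetic stands in for the neural networks the claims describe.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]   # hypothetical: a flattened sensor crop of one tracked object
Features = List[float]


def shared_backbone(frame: Frame) -> Features:
    """Stand-in for the shared backbone: one feature vector per frame."""
    return [fmean(frame), max(frame) - min(frame)]


def temporal_mean(seq: List[Features]) -> Features:
    """Stand-in temporal network: average each feature over timestamps."""
    return [fmean(col) for col in zip(*seq)]


def temporal_range(seq: List[Features]) -> Features:
    """Stand-in temporal network: how much each feature varies over time."""
    return [max(col) - min(col) for col in zip(*seq)]


@dataclass
class TaskGroup:
    name: str
    temporal: Callable[[List[Features]], Features]  # dedicated temporal network
    head: Callable[[Features], Dict[str, object]]   # dedicated output head


def run_sub_model(frames: List[Frame], groups: List[TaskGroup]) -> Dict[str, dict]:
    # The backbone runs once per frame; its output is shared by all task groups.
    per_frame = [shared_backbone(f) for f in frames]
    return {g.name: g.head(g.temporal(per_frame)) for g in groups}


def understand(track_class: str, frames: List[Frame],
               groups: List[TaskGroup]) -> Dict[str, dict]:
    # Gate on the main model's output: only vehicle tracks reach the sub-model.
    if track_class != "vehicle":
        return {}
    return run_sub_model(frames, groups)


# Assumed toy task groups; the thresholds are arbitrary illustration values.
groups = [
    TaskGroup("emergency_vehicle", temporal_range,
              lambda f: {"flashing_lights": f[0] > 0.2}),
    TaskGroup("vehicle_subtype", temporal_mean,
              lambda f: {"subtype": "truck" if f[0] > 0.7 else "car"}),
]

# Three timestamps of made-up brightness values for one tracked object.
frames = [[0.1, 0.9], [0.8, 0.9], [0.1, 0.9]]
result = understand("vehicle", frames, groups)
print(result)
```

Running the backbone once and branching afterwards is the point of the shared design: adding a task group in this sketch costs one extra temporal function and head, not another pass over the sensor data.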

Assignees

GM Cruise Holdings LLC

Inventors

Classifications

  • G06V10/764 · Primary

    using classification, e.g. of video objects · CPC title

  • Type · CPC title

  • Behavior, e.g. aggressive or erratic · CPC title

  • Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title

  • using trajectory prediction for other traffic participants · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12448009B2 cover?
Autonomous vehicles utilize perception and understanding of road users to predict behaviors of the road users, and to plan a trajectory for the vehicle. Understanding subtypes and attributes of vehicles may help autonomous vehicles better predict behaviors of and react to vehicles in a variety of road situations. To offer additional understanding capabilities, an additional understanding model …
Who is the assignee on this patent?
GM Cruise Holdings LLC
What technology area does this patent fall under?
Primary CPC classification G06V10/764. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 21, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).