Optimizations for real-time sensor fusion in vehicle understanding models

US12528501B2 · US · B2

Patent metadata
Field · Value
Publication number · US-12528501-B2
Application number · US-202318326922-A
Country · US
Kind code · B2
Filing date · May 31, 2023
Priority date · May 31, 2023
Publication date · Jan 20, 2026
Grant date · Jan 20, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Autonomous vehicles utilize perception and understanding of vehicles to predict behaviors of the vehicles, and to plan a trajectory. Understanding of attributes of vehicles may be improved through sensor fusion. Sensor fusion can be computationally expensive and may be difficult to implement in a real-time vehicle understanding system. To limit computational complexity while benefiting from machine learning across modalities, sensor fusion may be selectively implemented for a subset of task groups of a multi-task machine learning model. In some cases, part-based understanding may be implemented before fusion to limit the features being fused together to part features that are most salient for the task group. In addition, sensor data and features that may be fused together can be limited to sensor data and features within a desired field of view. A model that implements sensor fusion may be disabled for objects that are beyond a threshold distance.
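The two gating ideas at the end of the abstract (skip fusion beyond a threshold distance, and restrict fusion to a desired field of view) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `TrackedObject` fields, the 60 m and 45 degree defaults, and the simplification of applying the field-of-view limit per tracked object rather than per sensor feature.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    # Hypothetical fields; the patent does not define this structure.
    track_id: int
    distance_m: float    # range from the ego vehicle
    bearing_deg: float   # angle from the ego heading, 0 = straight ahead


def should_run_fusion(obj, max_distance_m=60.0, half_fov_deg=45.0):
    """Return True only for objects within range and inside the field of view.

    The default thresholds are illustrative assumptions, not patent values.
    """
    within_range = obj.distance_m <= max_distance_m
    within_fov = abs(obj.bearing_deg) <= half_fov_deg
    return within_range and within_fov


def split_by_fusion_eligibility(objects, **thresholds):
    """Partition objects into (fusion path, single-modality path)."""
    fused, single = [], []
    for obj in objects:
        target = fused if should_run_fusion(obj, **thresholds) else single
        target.append(obj)
    return fused, single


objects = [
    TrackedObject(1, distance_m=20.0, bearing_deg=10.0),
    TrackedObject(2, distance_m=80.0, bearing_deg=0.0),   # too far away
    TrackedObject(3, distance_m=30.0, bearing_deg=70.0),  # outside the field of view
]
fused, single = split_by_fusion_eligibility(objects)
print([o.track_id for o in fused], [o.track_id for o in single])
```

In this sketch only the first object would reach the more expensive fusion model; the others would rely on single-modality inferences alone.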

First claim

Opening claim text (preview).

What is claimed is:

1. A vehicle comprising: sensors to generate first sensor data in a first modality and second sensor data in a second modality; one or more processors; and one or more storage media encoding instructions executable by the one or more processors to implement an understanding part, wherein the understanding part includes: a first node to output first inferences for a plurality of first task groups, the first node including: a first shared backbone to receive and process first sensor data corresponding to tracked objects having a vehicle classification; and task group specific heads to output first inferences for the first task groups; and a second node to output second inferences for a second task group, the second node including: a second backbone to receive and process second sensor data corresponding to tracked objects having the vehicle classification; a cross attention neural network to receive first machine learning features from the first shared backbone and second machine learning features from the second backbone; and heads downstream of the cross attention neural network to output inferences for the second task group; and the one or more storage media further encoding instructions for causing the vehicle to: extract a first set of machine learning features from first sensor data using the first backbone, and determine a set of first inferences based on the first set of machine learning features using the first backbone; extract machine learning features from a second set of sensor data using the second backbone, fuse the first set of machine learning features and the second set of machine learning features using the cross attention neural network and determining a second set of inferences from the fusion of first machine learning features and second machine learning features; planning a trajectory of the vehicle using the first inferences and the second inferences; and automatically implementing the planned trajectory by engaging at least one of a vehicle propulsion system, a braking system, and a steering system.

2. The vehicle of claim 1, wherein the first node further includes: a plurality of first temporal networks dedicated to respective first task groups.

3. The vehicle of claim 1, wherein the second node further includes: a second temporal network downstream of the cross attention neural network.

4. The vehicle of claim 1, wherein the second inferences comprise two or more vehicle open door attributes.

5. The vehicle of claim 1, wherein the first sensor data comprises image data generated by a camera, and second sensor data comprises point clouds generated by a light detection and ranging sensor.

6. The vehicle of claim 1, wherein the second inferences comprise two or more vehicle signal attributes.

7. The vehicle of claim 1, wherein the first sensor data comprises color channels image data generated by a camera, and second sensor data comprises signal channel image data generated by the camera.

8. The vehicle of claim 1, wherein the first sensor data comprises color image data generated by a first camera, and second sensor data comprises signal image data generated by a second camera.

9. The vehicle of claim 1, wherein the first task groups comprise two or more of: a first task group to extract an emergency vehicle classification, extract emergency vehicle subtype classifications, and extract one or more emergency vehicle flashing light attributes, a second task group to extract vehicle signal attributes, a third task group to extract school bus classification, extract one or more school bus flashing light attributes, and extract one or more school bus activeness attributes, a fourth task group to extract vehicle subtype classifications and extract one or more vehicle attributes, and a fifth task group to extract vehicle subtype classifications.

10. The vehicle of claim 1, wherein the cross attention neural network encodes attention relationships between the first machine learning features and the second machine learning features, and outputs fused machine learning features based on the attention relationships.

11. The vehicle of claim 1, wherein the first shared backbone comprises a part-based backbone to output global machine learning features per frame of the sensor data, and one or more part machine learning features per frame of the sensor data.

12. The vehicle of claim 11, wherein the part-based backbone further outputs one or more bounding boxes corresponding to the one or more part machine learning features.

13. The vehicle of claim 11, wherein the first machine learning features received by the cross attention neural network comprise one or more selected part machine learning features generated by the part-based backbone.

14. The vehicle of claim 11, wherein the first node further comprises one or more task group specific masking filters to mask the one or more part machine learning features.

15. The vehicle of claim 1, wherein the second node is deactivated and does not perform processing of sensor data corresponding to tracked objects that are beyond a threshold distance from the vehicle.

16. The vehicle of claim 1, wherein the first set of sensor data has a first modality, the second set of sensor data has a second modality, and the first modality is distinct from the second modality.
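As an illustration of the fusion step in claims 10, 13 and 14, the sketch below masks part features and then fuses them with features from the second backbone using scaled dot-product cross attention. It is a simplified, assumption-laden example and not the claimed implementation: there are no learned projections, keys and values are the same vectors, the second-backbone features act as queries, and the residual sum that forms the fused output is an illustrative choice. The function names `select_parts` and `cross_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def select_parts(part_features, keep_mask):
    """Task-group-specific masking: keep only the part features flagged True."""
    return [feat for feat, keep in zip(part_features, keep_mask) if keep]


def cross_attention(queries, keys_values):
    """Fuse each query vector with an attention-weighted mix of keys_values.

    Assumptions: no learned projections, keys double as values, and the
    fused feature is the query plus the attended context (residual sum).
    """
    scale = math.sqrt(len(queries[0]))
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        context = [
            sum(w * v[i] for w, v in zip(weights, keys_values))
            for i in range(len(q))
        ]
        fused.append([qi + ci for qi, ci in zip(q, context)])
    return fused


# Hypothetical part features from the first (part-based) backbone.
part_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
salient = select_parts(part_features, keep_mask=[True, False, True])

# Hypothetical feature from the second backbone, used as the query.
second_features = [[0.5, 0.5]]
print(cross_attention(second_features, salient))
```

Masking before attention shrinks the set of vectors the attention loop iterates over, which is the cost reduction the abstract attributes to part-based understanding before fusion.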

Assignees

Inventors

Classifications

  • Image sensing, e.g. optical camera · CPC title

  • G06V10/82 · Primary

    using neural networks · CPC title

  • Data fusion · CPC title

  • Spatial relation or speed relative to objects · CPC title

  • using classification, e.g. of video objects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12528501B2 cover?
Autonomous vehicles utilize perception and understanding of vehicles to predict behaviors of the vehicles, and to plan a trajectory. Understanding of attributes of vehicles may be improved through sensor fusion. Sensor fusion can be computationally expensive and may be difficult to implement in a real-time vehicle understanding system. To limit computational complexity while benefiting from mac…
Who is the assignee on this patent?
GM Cruise Holdings LLC
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 20, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).