Joint 3d detection and segmentation using bird's eye view and perspective view

US2025054286A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2025054286-A1
Application number	US-202318475988-A
Country	US
Kind code	A1
Filing date	Sep 27, 2023
Priority date	Aug 7, 2023
Publication date	Feb 13, 2025
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An image processing method includes performing, using images obtained from one or more sensors onboard a vehicle, a 2-dimensional (2D) feature extraction; performing, a 3-dimensional (3D) feature extraction on the images; detecting objects in the images by fusing detection results from the 2D feature extraction and the 3D feature extraction.

First claim

Opening claim text (preview).

1. A method of detecting objects in sensor data, comprising: performing, using images obtained from one or more sensors onboard a vehicle, a 2-dimensional (2D) feature extraction; performing, a 3-dimensional (3D) feature extraction on the images; detecting objects in the images by fusing detection results from the 2D feature extraction and the 3D feature extraction.
2. The method of claim 1, wherein the 2D feature extraction comprises a perspective view (PV) analysis of the images.
3. The method of claim 1, wherein the 3D feature extraction comprises a bird's eye view (BEV) analysis of the images.
4. The method of any claim 1, wherein the 3D feature extraction is performed by: generating 3D features from 2D features resulting from the 2D feature extraction; and the method further includes refining 3D feature estimates using dual-space object queries that include joint proposals formed based on 2D features resulting from the 2D feature extraction and 3D features resulting from the 3D feature extraction.
5. The method of claim 4, wherein the generating the 3D features from the 2D features comprises applying a back-projection model to the 2D features.
6. The method of claim 4, wherein the refining comprises performing a multi-level refinement wherein, at each layer of the multi-level refinement, a self-attention layer that acts on both the 2D features and the 3D features a first cross-attention layer that acts only on the 2D features and a second cross-attention layer that acts only on the 3D features are used.
7. The method of claim 4, wherein a shared pose is further used during the refining.
8. The method of claim 1, wherein the 2D feature extraction method comprises a 3D objection method.
9. An apparatus for detecting objects in sensor data, the apparatus comprising at least one processor configured to: perform, from the sensor data, a 2-dimensional (2D) feature extraction; perform, from the sensor data, a 3-dimensional (3D) feature extraction; detect objects in the images by fusing detection results from the 2D feature extraction and the 3D feature extraction.
10. The apparatus of claim 9, wherein the 2D feature extraction comprises a perspective view (PV) analysis of the images and the 3D feature extraction comprises a bird's eye view (BEV) analysis of the images.
11. The method of claim 9, wherein the at least one processor performs the 3D feature extraction by: generating 3D features from 2D features resulting from the 2D feature extraction; and the method further includes refining 3D feature estimates using dual-space object queries that include joint proposals formed based on 2D features resulting from the 2D feature extraction and 3D features resulting from the 3D feature extraction.
12. The apparatus of claim 11, wherein the generating the 3D features from the 2D features comprises applying a back-projection model to the 2D features.
13. The apparatus of claim 11, wherein the refining is performed using a Conv3D or a Conv2D algorithm.
14. The apparatus of claim 11, wherein the refining comprises performing a multi-level refinement wherein, at each layer of the multi-level refinement, a self-attention layer that acts on both the 2D features and the 3D features a first cross-attention layer that acts only on the 2D features and a second cross-attention layer that acts only on the 3D features are used.
15. The apparatus of claim 9, wherein the 2D feature extraction method comprises a 3D objection method.
16. The apparatus of claim 9, wherein the 3D feature extraction method comprises a dense segmentation and/or a detection method.
17. A system for deployment on an autonomous vehicle, comprising: one or more sensors configured to generate sensor data of an environment of the autonomous vehicle; and at least one processor configured to detect objects in the sensor data by: performing, using the sensor data obtained from one or more sensors, a 2-dimensional (2D) feature extraction; performing, a 3-dimensional (3D) feature extraction on the sensor data; detecting objects in the images by fusing detection results from the 2D feature extraction and the 3D feature extraction.
18. The system of claim 17, wherein the one or more processor performs the 3D feature extraction by: generating 3D features from 2D features resulting from the 2D feature extraction; and refining 3D feature estimates using dual-space object queries that include joint proposals formed based on 2D features resulting from the 2D feature extraction and 3D features resulting from the 3D feature extraction.
19. The system of claim 17, wherein hybrid detection proposals are used for querying for the objects.
20. The system of claim 17, wherein the one or more sensors include a camera and a lidar.
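
Illustrative sketch (not part of the patent text). Claims 6 and 14 describe a multi-level refinement in which each layer uses a self-attention layer plus two cross-attention layers, one acting only on the 2D features and one acting only on the 3D features. The Python below is one possible reading of that layer order, written under loud assumptions: the names attend, refine_layer, and refine are hypothetical; the attention is single-head with identity projections (no learned weights, no normalization); the self-attention is applied to a single set of object queries; and the PV and BEV feature maps are assumed to be already flattened into token lists of the same width as the queries. It shows the data flow only and should not be taken as the applicant's implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys):
    """Single-head scaled dot-product attention.

    Assumption: identity projections, so the keys also serve as values.
    """
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * k[d] for w, k in zip(weights, keys))
                    for d in range(len(q))])
    return out


def add(a, b):
    """Element-wise sum of two equally shaped token lists (residual add)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def refine_layer(queries, pv_feats, bev_feats):
    """One refinement layer in the order suggested by claims 6 and 14."""
    # Self-attention among the object queries.
    queries = add(queries, attend(queries, queries))
    # First cross-attention: queries read only the 2D (PV) features.
    queries = add(queries, attend(queries, pv_feats))
    # Second cross-attention: queries read only the 3D (BEV) features.
    queries = add(queries, attend(queries, bev_feats))
    return queries


def refine(queries, pv_feats, bev_feats, num_layers=3):
    """Stack several refinement layers (the 'multi-level refinement')."""
    for _ in range(num_layers):
        queries = refine_layer(queries, pv_feats, bev_feats)
    return queries
```

A call such as refine(queries, pv_tokens, bev_tokens, num_layers=2) returns one refined vector per query with the same width as the input. A real detector would add learned projections, multiple heads, normalization, and prediction heads, none of which are specified on this page.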

Assignees

Tusimple Inc

Inventors

Classifications

G06V20/56 · exterior to a vehicle by using sensors mounted on the vehicle
G06F18/253 · of extracted features
G06V10/82 · using neural networks
G06V10/806 (Primary) · of extracted features
G06V20/58 · Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads

Patent family

Related publications grouped by family.

Patent family 94482267

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025054286A1 cover?: An image processing method includes performing, using images obtained from one or more sensors onboard a vehicle, a 2-dimensional (2D) feature extraction; performing, a 3-dimensional (3D) feature extraction on the images; detecting objects in the images by fusing detection results from the 2D feature extraction and the 3D feature extraction.
Who is the assignee on this patent?: Tusimple Inc
What technology area does this patent fall under?: Primary CPC classification G06V10/806. Mapped technology areas include Physics.
When was this patent published?: Publication date Feb 13, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Multi-modal encoder channel fusion with cross-modality awareness

Camera-radar data fusion for efficient object detection

Multi-task multi-sensor fusion for three-dimensional object detection
