Who is the assignee on this patent?

Toyota Res Inst Inc, Toyota Motor Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06V20/58. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 19, 2024 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Shared vision system backbone

US12148223B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12148223-B2
Application number	US-202217732421-A
Country	US
Kind code	B2
Filing date	Apr 28, 2022
Priority date	Apr 28, 2022
Publication date	Nov 19, 2024
Grant date	Nov 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating a dense light detection and ranging (LiDAR) representation by a vision system includes receiving, at a sparse depth network, one or more sparse representations of an environment. The method also includes generating a depth estimate of the environment depicted in an image captured by an image capturing sensor. The method further includes generating, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations. The method also includes fusing the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate. The method further includes generating the dense LiDAR representation based on the dense depth estimate and controlling an action of the vehicle based on identifying a three-dimensional object in the dense LiDAR representation.
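The data flow in the abstract (an image-based depth estimate, one or more sparse depth estimates, then a fused dense depth estimate) can be illustrated with a small sketch. This is not the patented depth fusion network, which is a learned model; the per-pixel rule below is a hand-written stand-in, and the names `fuse_depth`, `image_depth`, `lidar`, and `radar` are hypothetical. It assumes each sparse map is aligned to the image grid and uses `None` where a sensor has no return.

```python
from typing import List, Optional

DepthMap = List[List[float]]
SparseDepthMap = List[List[Optional[float]]]


def fuse_depth(dense: DepthMap, sparse_maps: List[SparseDepthMap]) -> DepthMap:
    """Fuse an image-based depth estimate with sparse depth estimates.

    Assumed rule (illustrative only): where at least one sparse map has a
    measurement, use the mean of those measurements; elsewhere keep the
    image-based estimate.
    """
    fused: DepthMap = []
    for r, row in enumerate(dense):
        fused_row = []
        for c, image_value in enumerate(row):
            # Collect every sparse measurement available at this pixel.
            hits = [m[r][c] for m in sparse_maps if m[r][c] is not None]
            fused_row.append(sum(hits) / len(hits) if hits else image_value)
        fused.append(fused_row)
    return fused


# Toy 2x2 example: depths in meters, None marks pixels without a return.
image_depth = [[10.0, 10.0], [20.0, 20.0]]
lidar = [[None, 8.0], [None, None]]
radar = [[None, 10.0], [22.0, None]]

fused = fuse_depth(image_depth, [lidar, radar])
print(fused)
```

Every pixel of the result has a value, which is the sense in which the fused estimate is dense while the sensor inputs are sparse.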

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating a dense light detection and ranging (LiDAR) representation by a vision system of a vehicle, comprising: receiving, at a sparse depth network, one or more sparse representations of an environment within a vicinity of the vehicle; generating, at a depth estimation network, a depth estimate of the environment depicted in an image captured by an image capturing sensor integrated with the vehicle based on receiving the one or more sparse representation; generating, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations of the environment, each sparse depth estimate associated with a respective sparse representation of the one or more sparse representations; fusing, at a depth fusion network, the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate; generating the dense LiDAR representation based on the dense depth estimate; and controlling an action of the vehicle based on identifying a three-dimensional object in the dense LiDAR representation.

2. The method of claim 1, further comprising: generating, via a feature extraction network, features associated with the image; and performing one or more vision based tasks based on a combination of the features and the one or more sparse depth estimates.

3. The method of claim 2, wherein the one or more vision based tasks include one or more of generating an instance segmentation map of the environment, identifying a two-dimensional object in the environment, or generating a semantic segmentation map of the environment.

4. The method of claim 1, wherein: generating the dense LiDAR representation comprises: decoding the depth estimate via a depth decoder; and converting a two-dimensional representation of the environment to a 3D space based on the decoded depth estimate; and the dense LiDAR representation is based on the 3D space.

5. The method of claim 1, further comprising: receiving, at the sparse depth network, a semantic segmentation map; generating, via the sparse depth network, a sparse depth estimate of the semantic segmentation map based on receiving the semantic segmentation map; generating, at a segmentation fusion block, a fused segmentation representation by fusing the depth estimate and the sparse semantic segmentation map; and generating, via a lane segmentation network, a lane segmentation map of the environment based on a combination features associated with the image and the one or more sparse depth estimates, wherein the features are generated via a feature extraction network.

6. The method of claim 1, further comprising generating each sparse representation by a respective sparse representation sensor of one or more sparse representation sensors integrated with the vehicle.

7. The method of claim 6, wherein: the one or more sparse representations include one or more of a sparse LiDAR representation or a radar representation; and the one or more sparse representation sensors include one or more of a sparse LiDAR sensor or a radar sensor.

8. An apparatus for generating a dense light detection and ranging (LiDAR) representation at a vision system of a vehicle, the apparatus comprising: at least one processor; and at least one memory coupled with the at least one processor and storing instructions operable, when executed by the at least one processor, to cause the apparatus: receive, at a sparse depth network, one or more sparse representations of an environment within a vicinity of the vehicle; generate, at a depth estimation network, a depth estimate of the environment depicted in an image captured by an image capturing sensor integrated with the vehicle based on receiving the one or more sparse representation; generate, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations of the environment, each sparse depth estimate associated with a respective sparse representation of the one or more sparse representations; fuse, at a depth fusion network, the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate; generate the dense LiDAR representation based on the dense depth estimate; and control an action of the vehicle based on identifying a three-dimensional object in the dense LiDAR representation.

9. The apparatus of claim 8, wherein execution of the instructions further cause the apparatus to: generate, via a feature extraction network, features associated with the image; and perform one or more vision based tasks based on a combination of the features and the one or more sparse depth estimates.

10. The apparatus of claim 9, wherein the one or more vision based tasks include one or more of generating an instance segmentation map of the environment, identifying a two-dimensional object in the environment, or generating a semantic segmentation map of the environment.

11. The apparatus of claim 8, wherein: execution of the instructions that cause the apparatus to generate the dense LiDAR representation further cause the apparatus to: decode the depth estimate via a depth decoder; and convert a two-dimensional representation of the environment to a 3D space based on the decoded depth estimate; and the dense LiDAR representation is based on the 3D space.

12. The apparatus of claim 8, wherein execution of the instructions further cause the apparatus to: receive, at the sparse depth network, a semantic segmentation map; generate, via the sparse depth network, a sparse depth estimate of the semantic segmentation map based on receiving the semantic segmentation map; generate, at a segmentation fusion block, a fused segmentation representation by fusing the depth estimate and the sparse semantic segmentation map; and generate, via a lane segmentation network, a lane segmentation map of the environment based on a combination features associated with the image and the one or more sparse depth estimates, wherein the features are generated via a feature extraction network.

13. The apparatus of claim 8, wherein execution of the instructions further cause the apparatus to generate each sparse representation by a respective sparse representation sensor of one or more sparse representation sensors integrated with the vehicle.

14. The apparatus of claim 13, wherein: the one or more sparse representations include one or more of a sparse LiDAR representation or a radar representation; and the one or more sparse representation sensors include one or more of a sparse LiDAR sensor or a radar sensor.

15. A non-transitory computer-readable medium having program code recorded thereon for generating a dense light detection and ranging (LiDAR) representation at a vision system of a vehicle, the program code executed by a processor and comprising: program code to receive, at a sparse depth network, one or more sparse representations of an environment within a vicinity of the vehicle; program code to generate, at a depth estimation network, a depth estimate of the environment depicted in an image captured by an image capturing sensor integrated with the vehicle based on receiving the one or more sparse representation; program code to generate, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations of the environment, each sparse depth estimate associated with a respective sparse representation of the one or more sparse representations; program code to fuse, at a depth fusion network, the dept
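Claims 4 and 11 recite converting a two-dimensional representation to a 3D space based on a decoded depth estimate. One common way to do this, shown here only as an illustration, is pinhole back-projection of a depth map into a point cloud. The claim text above does not specify a camera model, so the pinhole model, the intrinsics `fx`, `fy`, `cx`, `cy`, and the function name `depth_to_points` are all assumptions.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def depth_to_points(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[Point3D]:
    """Back-project a depth map into camera-frame 3D points.

    Assumes a pinhole camera: a pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy, z = z.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Toy 2x2 depth map in meters; 0.0 marks an invalid pixel.
depth = [[2.0, 2.0], [0.0, 4.0]]
points = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points)
```

A dense depth map yields one 3D point per valid pixel, which is why the resulting point cloud can be far denser than the returns of a sparse LiDAR or radar sensor.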

Assignees

Toyota Res Inst Inc, Toyota Motor Co Ltd

Inventors

Classifications

CPC code	Title
B60W2420/408	Radar; Laser, e.g. lidar
B60W2420/403	Image sensing, e.g. optical camera
G06V20/49	Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
B60W60/001	Planning or execution of driving tasks
G06V20/64	Three-dimensional [3D] objects
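A CPC symbol encodes a hierarchy: section letter, two-digit class, subclass letter, main group, and subgroup. The sketch below splits a symbol into those parts and looks up the section title, which is how a code such as G06V20/58 maps to the "Physics" technology area. The function name `parse_cpc`, the regular expression, and the abbreviated section titles are assumptions made for illustration, not this site's implementation.

```python
import re

# CPC section letters with abbreviated titles.
CPC_SECTIONS = {
    "A": "Human necessities",
    "B": "Performing operations; transporting",
    "C": "Chemistry; metallurgy",
    "D": "Textiles; paper",
    "E": "Fixed constructions",
    "F": "Mechanical engineering; lighting; heating; weapons; blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General tagging of new technological developments",
}

# Section letter, 2-digit class, subclass letter, main group, "/" subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> dict:
    """Split a CPC symbol such as 'G06V20/58' into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


primary = parse_cpc("G06V20/58")
print(primary["section_title"], primary["subclass"], primary["main_group"])
```

The B60W codes in the table above fall under section B by the same rule, so the page's "Physics" label reflects only the primary classification.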

Patent family

Related publications grouped by family.

View patent family 88512437

External sources


Related patents

3d surface reconstruction with point cloud densification using deep neural networks for autonomous systems and applications

Collision avoidance perception system

Multi-task multi-modal machine learning system

Multi-Task Multi-Sensor Fusion for Three-Dimensional Object Detection

Multi-modal sensor data fusion for perception systems
