Shared vision system backbone

US12148223B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12148223-B2
Application number  US-202217732421-A
Country             US
Kind code           B2
Filing date         Apr 28, 2022
Priority date       Apr 28, 2022
Publication date    Nov 19, 2024
Grant date          Nov 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating a dense light detection and ranging (LiDAR) representation by a vision system includes receiving, at a sparse depth network, one or more sparse representations of an environment. The method also includes generating a depth estimate of the environment depicted in an image captured by an image capturing sensor. The method further includes generating, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations. The method also includes fusing the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate. The method further includes generating the dense LiDAR representation based on the dense depth estimate and controlling an action of the vehicle based on identifying a three-dimensional object in the dense LiDAR representation.
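The fusing step in the abstract can be illustrated with a small sketch. This is not the patent's depth fusion network, which is a learned model; it is a hand-written stand-in that assumes the camera depth estimate and the sparse measurements are already aligned on the same pixel grid. The function name `fuse_depth`, the `sparse_weight` parameter, and the use of `None` for pixels without a measurement are all hypothetical choices made for illustration.

```python
from typing import List, Optional

Grid = List[List[float]]
SparseGrid = List[List[Optional[float]]]


def fuse_depth(dense: Grid, sparse: SparseGrid, sparse_weight: float = 0.8) -> Grid:
    """Blend a per-pixel camera depth estimate with sparse range measurements.

    Hypothetical stand-in for a learned fusion network: a pixel that has a
    sparse measurement gets a weighted average of the two values, and every
    other pixel keeps the camera-based estimate unchanged.
    """
    if not 0.0 <= sparse_weight <= 1.0:
        raise ValueError("sparse_weight must be in [0, 1]")
    fused: Grid = []
    for dense_row, sparse_row in zip(dense, sparse):
        row: List[float] = []
        for d, s in zip(dense_row, sparse_row):
            if s is None:
                # No LiDAR/radar return here: fall back to the camera estimate.
                row.append(d)
            else:
                row.append(sparse_weight * s + (1.0 - sparse_weight) * d)
        fused.append(row)
    return fused


# Depths in meters; only one pixel has a sparse measurement.
camera_depth = [[10.0, 20.0], [30.0, 40.0]]
sparse_depth = [[None, 25.0], [None, None]]
dense_depth = fuse_depth(camera_depth, sparse_depth, sparse_weight=0.5)
```

A fixed weight is the simplest possible choice; a learned network would typically also propagate each sparse measurement to neighboring pixels, which this sketch does not attempt.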

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating a dense light detection and ranging (LiDAR) representation by a vision system of a vehicle, comprising: receiving, at a sparse depth network, one or more sparse representations of an environment within a vicinity of the vehicle; generating, at a depth estimation network, a depth estimate of the environment depicted in an image captured by an image capturing sensor integrated with the vehicle based on receiving the one or more sparse representation; generating, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations of the environment, each sparse depth estimate associated with a respective sparse representation of the one or more sparse representations; fusing, at a depth fusion network, the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate; generating the dense LiDAR representation based on the dense depth estimate; and controlling an action of the vehicle based on identifying a three-dimensional object in the dense LiDAR representation.

2. The method of claim 1, further comprising: generating, via a feature extraction network, features associated with the image; and performing one or more vision based tasks based on a combination of the features and the one or more sparse depth estimates.

3. The method of claim 2, wherein the one or more vision based tasks include one or more of generating an instance segmentation map of the environment, identifying a two-dimensional object in the environment, or generating a semantic segmentation map of the environment.

4. The method of claim 1, wherein: generating the dense LiDAR representation comprises: decoding the depth estimate via a depth decoder; and converting a two-dimensional representation of the environment to a 3D space based on the decoded depth estimate; and the dense LiDAR representation is based on the 3D space.

5. The method of claim 1, further comprising: receiving, at the sparse depth network, a semantic segmentation map; generating, via the sparse depth network, a sparse depth estimate of the semantic segmentation map based on receiving the semantic segmentation map; generating, at a segmentation fusion block, a fused segmentation representation by fusing the depth estimate and the sparse semantic segmentation map; and generating, via a lane segmentation network, a lane segmentation map of the environment based on a combination features associated with the image and the one or more sparse depth estimates, wherein the features are generated via a feature extraction network.

6. The method of claim 1, further comprising generating each sparse representation by a respective sparse representation sensor of one or more sparse representation sensors integrated with the vehicle.

7. The method of claim 6, wherein: the one or more sparse representations include one or more of a sparse LiDAR representation or a radar representation; and the one or more sparse representation sensors include one or more of a sparse LiDAR sensor or a radar sensor.

8. An apparatus for generating a dense light detection and ranging (LiDAR) representation at a vision system of a vehicle, the apparatus comprising: at least one processor; and at least one memory coupled with the at least one processor and storing instructions operable, when executed by the at least one processor, to cause the apparatus: receive, at a sparse depth network, one or more sparse representations of an environment within a vicinity of the vehicle; generate, at a depth estimation network, a depth estimate of the environment depicted in an image captured by an image capturing sensor integrated with the vehicle based on receiving the one or more sparse representation; generate, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations of the environment, each sparse depth estimate associated with a respective sparse representation of the one or more sparse representations; fuse, at a depth fusion network, the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate; generate the dense LiDAR representation based on the dense depth estimate; and control an action of the vehicle based on identifying a three-dimensional object in the dense LiDAR representation.

9. The apparatus of claim 8, wherein execution of the instructions further cause the apparatus to: generate, via a feature extraction network, features associated with the image; and perform one or more vision based tasks based on a combination of the features and the one or more sparse depth estimates.

10. The apparatus of claim 9, wherein the one or more vision based tasks include one or more of generating an instance segmentation map of the environment, identifying a two-dimensional object in the environment, or generating a semantic segmentation map of the environment.

11. The apparatus of claim 8, wherein: execution of the instructions that cause the apparatus to generate the dense LiDAR representation further cause the apparatus to: decode the depth estimate via a depth decoder; and convert a two-dimensional representation of the environment to a 3D space based on the decoded depth estimate; and the dense LiDAR representation is based on the 3D space.

12. The apparatus of claim 8, wherein execution of the instructions further cause the apparatus to: receive, at the sparse depth network, a semantic segmentation map; generate, via the sparse depth network, a sparse depth estimate of the semantic segmentation map based on receiving the semantic segmentation map; generate, at a segmentation fusion block, a fused segmentation representation by fusing the depth estimate and the sparse semantic segmentation map; and generate, via a lane segmentation network, a lane segmentation map of the environment based on a combination features associated with the image and the one or more sparse depth estimates, wherein the features are generated via a feature extraction network.

13. The apparatus of claim 8, wherein execution of the instructions further cause the apparatus to generate each sparse representation by a respective sparse representation sensor of one or more sparse representation sensors integrated with the vehicle.

14. The apparatus of claim 13, wherein: the one or more sparse representations include one or more of a sparse LiDAR representation or a radar representation; and the one or more sparse representation sensors include one or more of a sparse LiDAR sensor or a radar sensor.

15. A non-transitory computer-readable medium having program code recorded thereon for generating a dense light detection and ranging (LiDAR) representation at a vision system of a vehicle, the program code executed by a processor and comprising: program code to receive, at a sparse depth network, one or more sparse representations of an environment within a vicinity of the vehicle; program code to generate, at a depth estimation network, a depth estimate of the environment depicted in an image captured by an image capturing sensor integrated with the vehicle based on receiving the one or more sparse representation; program code to generate, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations of the environment, each sparse depth estimate associated with a respective sparse representation of the one or more sparse representations; program code to fuse, at a depth fusion network, the dept
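Claim 4 describes converting a two-dimensional representation to a 3D space based on a decoded depth estimate. The claim text does not say how that conversion is done. One common way to do it is back-projection through a pinhole camera model, sketched below as an assumption rather than as the patented method. The function name `depth_to_points` and the intrinsic parameters `fx`, `fy`, `cx`, and `cy` (focal lengths and principal point, in pixels) are illustrative and do not come from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    Assumes a pinhole camera: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx and y = (v - cy) * z / fy.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # treat non-positive depth as "no estimate"
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# A 2x2 depth map in meters with one invalid pixel, using toy intrinsics.
cloud = depth_to_points([[2.0, 0.0], [4.0, 4.0]], fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

The resulting list of (x, y, z) points plays the role of a LiDAR-like point cloud. Because every valid pixel yields a point, the cloud is as dense as the depth map it came from.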

Assignees

Toyota Res Inst Inc, Toyota Motor Co Ltd

Inventors

Classifications

  • Radar; Laser, e.g. lidar

  • Image sensing, e.g. optical camera

  • Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes

  • Planning or execution of driving tasks

  • Three-dimensional [3D] objects

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12148223B2 cover?
A method for generating a dense light detection and ranging (LiDAR) representation by a vision system includes receiving, at a sparse depth network, one or more sparse representations of an environment. The method also includes generating a depth estimate of the environment depicted in an image captured by an image capturing sensor. The method further includes generating, via the sparse depth n…
Who is the assignee on this patent?
Toyota Res Inst Inc, Toyota Motor Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V20/58. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 19, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).