Method, apparatus for superimposing laser point clouds and high-precision map and electronic device
US-12313746-B2 · May 27, 2025 · US
US2025029355A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025029355-A1 |
| Application number | US-202318354074-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 18, 2023 |
| Priority date | Jul 18, 2023 |
| Publication date | Jan 23, 2025 |
| Grant date | — |
Official abstract text for this publication.
This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method includes receiving an image frame representing a scene; receiving point cloud data representing the scene; determining first sets of image frame features; determining second sets of point cloud data features based on a plurality of voxels representing the point cloud data; determining a third set of features of the image frame based on a first set of features of the plurality of first sets of features of the image frame and a second set of features of the plurality of second sets of features of the point cloud data; and outputting fused data that combines the third set of features of the image frame and a fourth set of features of the point cloud data. Other aspects and features are also claimed and described.
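The flow in the abstract (multi-stage features per modality, one stage chosen from each, a combined image feature set, then fused output) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method disclosed in the patent: features are flat lists of numbers, `encode_stages` is a toy averaging encoder, population variance stands in for the "statistical indicator" of claim 2 (the page does not say which statistic is used), the third set is an elementwise sum, and the fourth set is assumed to be the last point cloud stage. All function names are hypothetical.

```python
from statistics import pvariance


def encode_stages(signal, num_stages=3):
    """Toy multi-stage encoder: each stage halves the length by averaging pairs."""
    stages = []
    current = list(signal)
    for _ in range(num_stages):
        current = [(current[i] + current[i + 1]) / 2
                   for i in range(0, len(current) - 1, 2)]
        stages.append(current)
    return stages


def pick_stage(image_stages, cloud_stages):
    """Return the stage index with the largest combined variance.

    Variance is an assumed stand-in for the statistical indicator.
    """
    scores = [pvariance(a) + pvariance(b)
              for a, b in zip(image_stages, cloud_stages)]
    return scores.index(max(scores))


def fuse(image_signal, cloud_signal, num_stages=3):
    image_stages = encode_stages(image_signal, num_stages)  # plurality of first sets
    cloud_stages = encode_stages(cloud_signal, num_stages)  # plurality of second sets
    k = pick_stage(image_stages, cloud_stages)
    first, second = image_stages[k], cloud_stages[k]
    # Third set: image features combined with point cloud features (assumed: sum).
    third = [a + b for a, b in zip(first, second)]
    # Fourth set: assumed here to be the final-stage point cloud features.
    fourth = cloud_stages[-1]
    # Fused data: assumed here to be a simple concatenation.
    return third + fourth


fused = fuse([0, 2, 4, 6, 8, 10, 12, 14], [1] * 8)
print(fused)
```

Both modalities use the same stage index here, which mirrors the stage correspondence of claim 4; a real system would use learned encoders and spatially aligned feature maps rather than flat lists.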
Opening claim text (preview).
What is claimed is:
1. A method for image processing for use in a vehicle assistance system, comprising: receiving an image frame representing a scene; receiving point cloud data representing the scene; determining a plurality of first sets of features of the image frame; determining a plurality of second sets of features of the point cloud data based on a plurality of voxels representing the point cloud data; determining a third set of features of the image frame based on a first set of features of the plurality of first sets of features of the image frame and a second set of features of the plurality of second sets of features of the point cloud data; and outputting fused data that combines the third set of features of the image frame and a fourth set of features of the point cloud data.
2. The method of claim 1, further comprising: determining the first set of features and the second set of features based on a first statistical indicator associated with the plurality of first sets of features of the image frame and a second statistical indicator associated with the plurality of second sets of features of the point cloud data.
3. The method of claim 1, wherein each of the plurality of first sets of features of the image frame corresponds to a respective stage of a plurality of first stages of a first encoder, and wherein each of the plurality of second sets of features of the point cloud data corresponds to a respective stage of a plurality of second stages of a second encoder.
4. The method of claim 3, wherein the respective stage corresponding to the first set of features of the image frame corresponds to the respective stage corresponding to the second set of features of the point cloud data.
5. The method of claim 1, wherein the second set of features includes a plurality of pairs of perspective view features of the point cloud data and BEV features of the point cloud data.
6. The method of claim 5, wherein the third set of features of the image frame is determined by an encoder, the method further comprising: adding perspective view features of a pair of the plurality of pairs as a first channel of the encoder; and adding BEV features of the pair as a second channel of the encoder.
7. The method of claim 5, wherein the plurality of perspective view features and the plurality of BEV features are each determined from the plurality of voxels using global max pooling.
8. The method of claim 1, wherein the point cloud data is received from a ranging sensor.
9. The method of claim 1, further comprising detecting an object represented in the image frame based on the fused data.
10. The method of claim 9, further comprising controlling a function of a vehicle based on the object detected.
11. An apparatus, comprising: a memory storing processor-readable code; and at least one processor coupled to the memory, the at least one processor configured to execute the processor-readable code to cause the at least one processor to perform operations including: receiving an image frame representing a scene; receiving point cloud data representing the scene; determining a plurality of first sets of features of the image frame; determining a plurality of second sets of features of the point cloud data based on a plurality of voxels representing the point cloud data; determining a third set of features of the image frame based on a first set of features of the plurality of first sets of features of the image frame and a second set of features of the plurality of second sets of features of the point cloud data; and outputting fused data that combines the third set of features of the image frame and a fourth set of features of the point cloud data.
12. The apparatus of claim 11, the operations further comprising: determining the first set of features and the second set of features based on first statistical indicators associated with the plurality of first sets of features of the image frame and second statistical indicators associated with the plurality of second sets of features of the point cloud data.
13. The apparatus of claim 11, wherein each of the plurality of first sets of features of the image frame corresponds to a respective stage of a plurality of first stages of a first encoder, and wherein each of the plurality of second sets of features of the point cloud data corresponds to a respective stage of a plurality of second stages of a second encoder.
14. The apparatus of claim 13, wherein the respective stage corresponding to the first set of features of the image frame corresponds to the respective stage corresponding to the second set of features of the point cloud data.
15. The apparatus of claim 12, wherein the second set of features includes a plurality of pairs of perspective view features of the point cloud data and BEV features of the point cloud data.
16. The apparatus of claim 15, wherein the third set of features of the image frame is determined by an encoder, the operations further comprising: adding perspective view features of a pair of the plurality of pairs as a first channel of the encoder; and adding BEV features of the pair as a second channel of the encoder.
17. The apparatus of claim 15, wherein the plurality of perspective view features and the plurality of BEV features are each determined from the plurality of voxels using global max pooling.
18. The apparatus of claim 11, wherein the point cloud data is received from a LiDAR sensor or a radar sensor.
19. The apparatus of claim 11, wherein the operations further including detecting an object represented in the image frame based on the fused data.
20. The apparatus of claim 19, wherein the operations further include controlling a function of a vehicle based on the object detected.
21. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations comprising: receiving an image frame representing a scene; receiving point cloud data representing the scene; determining a plurality of first sets of features of the image frame; determining a plurality of second sets of features of the point cloud data based on a plurality of voxels representing the point cloud data; determining a third set of features of the image frame based on a first set of features of the plurality of first sets of features of the image frame and a second set of features of the plurality of second sets of features of the point cloud data; and outputting fused data that combines the third set of features of the image frame and a fourth set of features of the point cloud data.
22. The non-transitory, computer-readable medium of claim 21, the operations further comprising: determining the first set of features and the second set of features based on first statistical indicators associated with the plurality of first sets of features of the image frame and second statistical indicators associated with the plurality of second sets of features of the point cloud data.
23. The non-transitory, computer-readable medium of claim 21, wherein each of the plurality of third sets of features of the image frame corresponds to a respective stage of a plurality of first stages of a first encoder, and wherein each of the plurality of fourth sets of features of the point cloud data corresponds to a respective stage of a plurality of second stag
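Claims 7 and 17 derive the perspective view features and the BEV (bird's-eye view) features from the voxels using global max pooling. A minimal sketch of that idea follows, assuming a dense voxel grid indexed `voxels[x][y][z]` with one scalar per voxel, where `x` is the forward direction and `z` is height. Collapsing `z` yields a BEV map; collapsing `x` yields a front view that only approximates a perspective view, since a true perspective view would require projection onto the sensor's image plane. The function name and grid layout are hypothetical and not taken from the patent.

```python
def global_max_pool(voxels, axis):
    """Collapse a 3-D grid voxels[x][y][z] along one axis by taking the maximum."""
    nx, ny, nz = len(voxels), len(voxels[0]), len(voxels[0][0])
    if axis == 2:
        # Collapse height z: result is a BEV map indexed [x][y].
        return [[max(voxels[x][y]) for y in range(ny)] for x in range(nx)]
    if axis == 0:
        # Collapse forward depth x: result is a front view indexed [y][z].
        return [[max(voxels[x][y][z] for x in range(nx)) for z in range(nz)]
                for y in range(ny)]
    raise ValueError("axis must be 0 (front view) or 2 (BEV)")


# Tiny 2 x 2 x 2 grid of per-voxel feature values.
voxels = [[[1, 2], [3, 4]],
          [[5, 0], [0, 9]]]

bev = global_max_pool(voxels, axis=2)
front = global_max_pool(voxels, axis=0)
pair = (front, bev)  # one "pair" of view features in the sense of claim 5

print(bev)
print(front)
```

In the arrangement of claims 6 and 16, the two members of such a pair would be supplied to the image encoder as separate channels; that step is omitted here because the page does not describe the encoder.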
Target detection · CPC title
Involving statistics of pixels or of feature values, e.g. histogram matching · CPC title
of extracted features · CPC title
Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title
exterior to a vehicle by using sensors mounted on the vehicle · CPC title