Systems and methods for generating a road surface semantic segmentation map from a sequence of point clouds

US12008762B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12008762-B2
Application number: US-202217676131-A
Country: US
Kind code: B2
Filing date: Feb 19, 2022
Priority date: Feb 19, 2022
Publication date: Jun 11, 2024
Grant date: Jun 11, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

System and method for semantic segmentation of point clouds. The method may include: generating, via a first neural network, a birds-eye-view (BEV) image of the environment from the aggregated point cloud; generating, via a second neural network, a labelled BEV image from the BEV image, wherein each pixel in the labelled BEV image is associated with a class label from a set of class labels; generating a BEV feature map; and generating, via a third neural network, the road surface segmentation map in the form of a refined labelled BEV image based on the labelled BEV image by smoothing the labelled BEV image using the BEV feature map, wherein each pixel in the refined labelled BEV image is associated with a class label from the set of class labels.
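The BEV feature map named in the abstract is characterised in claims 4 and 5 as three values per pixel: height (the spread between the highest and lowest point in a pillar), intensity (the mean intensity of the pillar's points) and density (the number of points in the pillar). The following is a minimal sketch of that bookkeeping, not the patented implementation. The function name `bev_feature_map`, the `cell_size` parameter, the `(x, y, z, intensity)` point layout and the dictionary output are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float, float]   # assumed layout: (x, y, z, intensity)
Cell = Tuple[int, int]                      # grid index of a pillar in the x-y plane
Features = Tuple[float, float, int]         # (height, mean intensity, density)


def bev_feature_map(points: Iterable[Point], cell_size: float = 0.5) -> Dict[Cell, Features]:
    """Group points into vertical pillars and compute three features per pillar.

    cell_size is a hypothetical grid resolution in metres; the claims do not fix one.
    """
    pillars: Dict[Cell, List[Tuple[float, float]]] = defaultdict(list)
    for x, y, z, intensity in points:
        # A pillar spans the full z range, so only x and y select the cell.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        pillars[cell].append((z, intensity))

    features: Dict[Cell, Features] = {}
    for cell, members in pillars.items():
        zs = [z for z, _ in members]
        intensities = [i for _, i in members]
        height = max(zs) - min(zs)                       # max minus min elevation
        mean_intensity = sum(intensities) / len(intensities)
        density = len(members)                           # number of points in the pillar
        features[cell] = (height, mean_intensity, density)
    return features
```

Calling `bev_feature_map(points, cell_size=0.5)` returns one `(height, intensity, density)` tuple per occupied cell; empty cells are simply absent, which is a design choice of this sketch rather than something the claims specify.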

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method of generating a road surface segmentation map, comprising: receiving a sequence of point clouds, each respective point cloud of the sequence of point clouds representing a three-dimensional (3D) scan of an environment at a different point in time; generating an aggregated point cloud based on the sequence of point clouds; generating, via a first neural network, a birds-eye-view (BEV) image of the environment from the aggregated point cloud, wherein the first neural network is a Pillar Feature Net (PFN) neural network; generating, via a second neural network, a labelled BEV image from the BEV image, wherein each pixel in the labelled BEV image is associated with a class label from a set of class labels, wherein the second neural network is an encoder/decoder (E/D) neural network; generating a BEV feature map; generating, via a third neural network, the road surface segmentation map in the form of a refined labelled BEV image based on the labelled BEV image by smoothing the labelled BEV image using the BEV feature map, wherein each pixel in the refined labelled BEV image is associated with a class label from the set of class labels, wherein the third neural network is a Convolutional Conditional Random Field (ConvCRF) neural network; and training the PFN neural network, the E/D neural network and the ConvCRF neural network by, during each training epoch: generating a plurality of predicted labels by the PFN neural network, the E/D neural network and the ConvCRF neural network; receiving or retrieving a plurality of ground truth labels; computing a loss based on the plurality of ground truth labels and the plurality of predicted labels; and refining one or more weights in the PFN neural network, the E/D neural network and the ConvCRF neural network based on the loss.

2. The method of claim 1, wherein the BEV feature map is generated based on a set of pillars generated from the aggregated point cloud, wherein each pillar in the set of pillars is a voxel corresponding to a point in the aggregated point cloud with coordinates x, y in the x-y plane and an unlimited spatial extent in the z direction.

3. The method of claim 2, comprising: generating the set of pillars generated from the aggregated point cloud.

4. The method of claim 1, wherein the BEV feature map defines three elements for each pixel in the BEV image, the three elements being each height, intensity and density.

5. The method of claim 4, wherein the height of a pixel in the BEV feature map represents the difference between points in a pillar having a maximum and a minimum elevation, the intensity of a pixel in the BEV feature map represents a mean of the intensity of the corresponding points in a pillar, and the density of a pixel in the BEV feature map represents the number of points in the respective pillar.

6. The method of claim 1, wherein the smoothing comprises performing Gaussian kernel smoothing on the labelled BEV image using the BEV feature map, the Gaussian kernel smoothing comprising: generating Gaussian kernels based on the BEV feature map and the class labels of the labelled BEV image; and smoothing the labelled BEV image with the Gaussian kernels to generate the refined labelled BEV image.

7. The method of claim 1, further comprising: generating and storing an elevation value for each pixel in the refined labelled BEV image based on the labelled BEV image.

8. The method of claim 1, wherein generating the aggregated point cloud is based on an equation: $PC_{t\_agg} = \bigcap_{i=1}^{w} \tau_{1,i}\, PC_i$ wherein: $PC_{t\_agg}$ represents the aggregated point cloud; $w$ is a window size parameter; the point cloud having a most recent timestamp in the sequence of points clouds is a target point cloud; each point cloud having a timestamp earlier than the most recent timestamp in the sequence of points clouds is a source point cloud; $\tau_{1,i}$ represents a homogenous transformation between the target point cloud and each respective source point cloud $i$; and $PC_i$ is the respective source point cloud $i$.

9. The method of claim 8, wherein the sequence of point clouds is generated by a LIDAR sensor on a vehicle, and the homogenous transformation is computed based on odometry data from the vehicle captured between the target point cloud and each respective source point cloud $i$.

10. The method of claim 9, wherein the odometry data comprises data representing a rotation speed of one or more wheels of the vehicle.

11. The method of claim 8, wherein the value of $w$ is an integer value between 2 to 5.

12. The method of claim 1, wherein the loss $L_{surface}$ is computed based on a local loss term $L_{focal}$ and a dice coefficient loss term $L_{dice}$ based on an equation: $L_{surface} = \alpha \cdot L_{focal} + (1-\alpha) \cdot L_{dice}$.

13. The method of claim 12, wherein $\alpha$ has a value between 0.6 and 0.8.

14. The method of claim 12, wherein the local loss term $L_{focal}$ is computed based on an equation: $L_{focal} = -\mu_{\beta}\,(1-\hat{p}_{t}^{\beta})^{\gamma}\,\log(\hat{p}_{t}^{\beta})$, wherein $\beta$ is the ground truth label for a given pixel, $\mu_{\beta}$ is a class-specific weight, and $(1-\hat{p}_{t}^{\beta})^{\gamma}$ is a modulating term.

15. The method of claim 1, further comprising: displaying the refined labelled BEV image on a display of a computing system.

16. A computing system for generating a road surface segmentation map, the computing system comprising: a processor configured to: receive a sequence of point clouds, each respective point cloud of the sequence of point clouds representing a three-dimensional (3D) scan of an environment at a different point in time; generate an aggregated point cloud based on the sequence of point clouds; generate, via a first neural network, a birds-eye-view (BEV) image of the environment from the aggregated point cloud, wherein the first neural network is a Pillar Feature Net (PFN) neural network; generate, via a second neural network, a labelled BEV image from the BEV image, wherein each pixel in the labelled BEV image is associated with a class label from a set of class labels, wherein the second neural network is an encoder/decoder (E/D) neural network; generate a BEV feature map; generate, via a third neural network, the road surface segmentation map in the form of a refined labelled BEV image based on the labelled BEV image by smoothing the labelled BEV image using the BEV feature map, wherein each pixel in the refined labelled BEV image is associated with a class label from the set of class labels, wherein the third neural network is a Convolutional Conditional Random Field (ConvCRF) neural network; and train the PFN neural network, the E/D neural network and the ConvCRF neural network by, during each training epoch: gene
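The training loss in claims 12 to 14 blends a focal-style term with a dice coefficient term, weighted by α (claim 13 puts α between 0.6 and 0.8). The sketch below shows how those pieces could fit together for a single pixel and a single class. It is an illustration under stated assumptions, not the patent's implementation: the function names, the default `gamma=2.0`, the clamping constant and, in particular, the soft dice formula are assumptions, since the claims shown here give the focal term explicitly but do not spell out the dice term.

```python
import math
from typing import Sequence

EPS = 1e-7  # assumed numerical guard; not taken from the claims


def focal_loss(p_true: float, mu: float, gamma: float = 2.0) -> float:
    """Focal term of claim 14: -mu * (1 - p)^gamma * log(p).

    p_true is the predicted probability of the ground-truth class for one pixel,
    mu is the class-specific weight, gamma shapes the modulating term.
    """
    p = min(max(p_true, EPS), 1.0)  # clamp so log() never sees zero
    return -mu * (1.0 - p) ** gamma * math.log(p)


def dice_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Assumed soft dice loss for one class: 1 - 2*sum(p*t) / (sum(p) + sum(t))."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * intersection + EPS) / (sum(pred) + sum(target) + EPS)


def surface_loss(l_focal: float, l_dice: float, alpha: float = 0.7) -> float:
    """Blend of claim 12: alpha * L_focal + (1 - alpha) * L_dice."""
    return alpha * l_focal + (1.0 - alpha) * l_dice
```

In this sketch a confident, correct prediction (`p_true` close to 1) drives the focal term toward zero through the modulating factor, while the dice term measures overlap across the whole image, so `alpha` trades per-pixel difficulty weighting against region-level agreement.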

Assignees

Inventors

Classifications

  • Multiple classes · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods · CPC title

  • for mapping or imaging · CPC title

  • involving 3D image data · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12008762B2 cover?
System and method for semantic segmentation of point clouds. The method may include: generating, via a first neural network, a birds-eye-view (BEV) image of the environment from the aggregated point cloud; generating, via a second neural network, a labelled BEV image from the BEV image, wherein each pixel in the labelled BEV image is associated with a class label from a set of class labels; gen…
Who is the assignee on this patent?
Agia Christopher George R, Cheng Ran, Ren Yuan, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06T7/11. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 11, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).