Semantic segmentation neural network for point clouds

US12555366B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12555366-B2
Application number  US-202217945325-A
Country             US
Kind code           B2
Filing date         Sep 15, 2022
Priority date       Sep 15, 2022
Publication date    Feb 17, 2026
Grant date          Feb 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a semantic segmentation neural network for point clouds. One of the methods includes: obtaining a plurality of training points divided into a respective plurality of components; obtaining, for each of the respective plurality of components, data identifying a ground truth category for one or more labeled points; processing each training point using a semantic segmentation neural network to generate a semantic segmentation that includes a respective score for each of the plurality of categories; determining a gradient of a loss function that penalizes the semantic segmentation neural network for generating, for points in the component, non-zero scores for categories that are not the ground truth category for any labeled point in the component; and updating, using the gradient, the parameters of the semantic segmentation neural network.
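
The penalty described in the abstract can be illustrated with a small sketch. This is not the patent's implementation: the exact form of the loss is not given on this page, so the negative-log penalty, the function name `component_exclusion_loss`, and the input layout are all assumptions made for illustration. Following claim 1, the penalty is applied only to unlabeled points, and plain Python lists stand in for the tensors a real training loop would use.

```python
import math
from collections import defaultdict


def component_exclusion_loss(scores, component_ids, labels, eps=1e-8):
    """Average penalty for probability mass on categories absent from a component.

    scores: one list of per-category probabilities per point (each sums to 1).
    component_ids: component index for each point.
    labels: ground truth category for each point, or None if unlabeled.
    The -log(1 - mass) form is an assumed penalty, not taken from the patent.
    """
    # Collect the categories seen among each component's labeled points.
    allowed = defaultdict(set)
    for comp, label in zip(component_ids, labels):
        if label is not None:
            allowed[comp].add(label)

    total, count = 0.0, 0
    for point_scores, comp, label in zip(scores, component_ids, labels):
        categories = allowed.get(comp)
        if label is not None or not categories:
            continue  # skip labeled points and components with no labels
        # Probability mass placed on categories not labeled in this component.
        bad_mass = sum(p for cat, p in enumerate(point_scores)
                       if cat not in categories)
        total += -math.log(max(1.0 - bad_mass, eps))
        count += 1
    return total / count if count else 0.0


# Two components; component 0 is labeled with category 0, component 1 with 2.
scores = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0],
          [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
component_ids = [0, 0, 0, 1, 1]
labels = [0, None, None, 2, None]
loss = component_exclusion_loss(scores, component_ids, labels)
```

In this toy input only the third point is penalized, because it assigns half of its probability to category 1, which no labeled point in its component carries.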

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed by one or more computers for training a semantic segmentation neural network, the method comprising: obtaining a batch of one or more training point clouds, wherein each training point cloud comprises a respective plurality of points, wherein, for each training point cloud, the respective plurality of points of the training point cloud are divided into a respective plurality of components; obtaining, for each training point cloud and for each of the respective plurality of components for the training point cloud, data identifying one or more labeled points in the component and, for each labeled point, a ground truth category for the labeled point; and training the semantic segmentation neural network on the batch of the one or more training point clouds, comprising: for each training point cloud, processing the training point cloud using the semantic segmentation neural network to generate a semantic segmentation that comprises, for each point of the respective plurality of points in the training point cloud, a respective score for each category of a plurality of categories that represents a likelihood that the point is a measurement of an object that belongs to the category; for each training point cloud and for each component of the respective plurality of components for the training point cloud, identifying one or more ground truth categories for the component, wherein the one or more ground truth categories for the component comprise the ground truth category for each labeled point of the one or more labeled points in the component; for each training point cloud and for each component of the respective plurality of components for the training point cloud, determining a subset of points in the component that are not the one or more labeled points and for which the semantic segmentation neural network generates non-zero scores for categories that are not included in the one or more ground truth categories for the component; and training the semantic segmentation neural network on a loss function for the batch of the one or more training point clouds, wherein the loss function includes a first term that, for each training point cloud, for each component of the respective plurality of components for the training point cloud, and for the subset of points in the component, penalizes the semantic segmentation neural network for generating the non-zero scores for the categories that are not included in the one or more ground truth categories for the component.

2. The method of claim 1, wherein the loss function includes a second term that measures, for each training point cloud and for each labeled point in each component of the plurality of components for the training point cloud that is one of the points in the training point cloud, an error between (i) the respective score for each category of the plurality of categories for the labeled point and (ii) the ground truth category for the labeled point.

3. The method of claim 1, further comprising: identifying, as a pure component, any component in any of the training point clouds for which all of the labeled points in the any component have a same ground truth category, wherein: the loss function includes a third term that measures, for each point in the training point cloud that is in a pure component, an error between (i) the respective score for each category of the plurality of categories for the point and (ii) the ground truth category for the one or more labeled points in the pure component to which the point belongs.

4. The method of claim 1, wherein the loss function includes a fourth term that measures, for each training point cloud, a prototype feature learning loss on prototype features for each category of the plurality of categories that, for each particular category, are generated from an intermediate output of the semantic segmentation neural network for (i) labeled points that are in the training point cloud and that have the particular category as the ground truth category and (ii) points in the training point cloud that are in a component for which all of the labeled points in the component have the particular category as the ground truth category.

5. The method of claim 1, further comprising: for each training point cloud, processing a fused point cloud generated from the training point cloud and one or more additional point clouds captured within a time window of the training point cloud using a trained multi-frame semantic segmentation neural network that has been trained to process the fused point cloud to generate a fused semantic segmentation that comprises, for each point of the plurality of points in the training point cloud, a respective fused score for each category of the plurality of categories, wherein the loss function includes a fifth term that measures, for each point in each training point cloud, an error between (i) the respective score for each category of the plurality of categories for the point generated by the semantic segmentation neural network and (ii) the respective fused score for each category of the plurality of categories for the point generated by the trained multi-frame semantic segmentation neural network.

6. The method of claim 1, further comprising, for each training point cloud: providing, for presentation to a user on a user device, a visual representation of at least the respective plurality of points in the training point cloud that identifies the respective plurality of components for the training point cloud; and obtaining, from the user device, user inputs specifying the one or more labeled points in each component of the components and, for each labeled point, the ground truth category for the labeled point.

7. The method of claim 1, further comprising: generating the respective plurality of components for each training point cloud of the one or more training point clouds, comprising: for each training point cloud, generating a fused point cloud from the training point cloud and one or more additional point clouds captured within a time window of the training point cloud; detecting ground points in the fused point cloud, wherein each ground point corresponds to a measurement of a ground in an environment; generating a plurality of non-ground points by removing the ground points from the fused point cloud; generating a plurality of initial components from the plurality of non-ground points; and generating the plurality of components for the training point cloud from the ground points and the plurality of initial components.

8. The method of claim 7, wherein generating the plurality of initial components from the plurality of non-ground points comprises: for each point, identifying, as connected points for the point, each point that is within a corresponding threshold distance of the point; and generating the plurality of initial components from the connected points, wherein each initial component comprises a group of connected points.

9. The method of claim 8, wherein the corresponding threshold distance for each point is based on a distance from the point to a sensor that captured the training point cloud.

10. The method of claim 7, wherein generating the plurality of components for the training point cloud from the ground points and the plurality of initial components comprises: generating one or more fixed size components that include only the ground points.

11. The method of claim 7, wherein generating the plurality of components for the training point cloud from the ground points and the plurality of initial components comprises: dividing any initial component that exceeds a f
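
Claims 8 and 9 describe grouping non-ground points into initial components by connecting points that lie within a range-dependent threshold of each other. The sketch below shows one way this could look, using a union-find structure over all point pairs. Several details are assumptions not stated in the claims: the sensor sits at the origin, the threshold is a linear function of range (`base_threshold` and `range_scale` are hypothetical parameters), and two points are joined when they fall within the threshold of either point. Ground detection and point cloud fusion from claim 7 are assumed to have happened already.

```python
import math


def initial_components(points, base_threshold=0.3, range_scale=0.02):
    """Group non-ground points into connected components.

    points: list of (x, y, z) tuples in the sensor frame (sensor at origin).
    Returns one component index per point, numbered in order of appearance.
    The linear threshold below is an illustrative assumption.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Per-point threshold grows with distance from the sensor.
    origin = (0.0, 0.0, 0.0)
    thresholds = [base_threshold + range_scale * math.dist(p, origin)
                  for p in points]

    # Brute-force pairwise check; fine for a sketch, quadratic in point count.
    for i in range(n):
        for j in range(i + 1, n):
            limit = max(thresholds[i], thresholds[j])
            if math.dist(points[i], points[j]) <= limit:
                parent[find(i)] = find(j)

    # Relabel union-find roots as consecutive component indices.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


points = [(1.0, 0.0, 0.0), (1.2, 0.0, 0.0),
          (10.0, 0.0, 0.0), (10.4, 0.0, 0.0),
          (30.0, 5.0, 0.0)]
components = initial_components(points)
```

With these inputs the two points near 10 m are 0.4 m apart, which exceeds the base threshold but falls inside the larger range-dependent one, so they share a component. A practical version would replace the pairwise loop with a spatial index.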

Assignees

Waymo LLC

Inventors

Classifications

  • Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level (multimodal speaker identification or verification G10L17/10) · CPC title

  • by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition · CPC title

  • Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • exterior to a vehicle by using sensors mounted on the vehicle · CPC title

  • G06V10/82 · Primary

    using neural networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12555366B2 cover?
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a semantic segmentation neural network for point clouds. One of the methods includes: obtaining a plurality of training points divided into a respective plurality of components; obtaining, for each of the respective plurality of components, data identifying a ground truth category for…
Who is the assignee on this patent?
Waymo LLC
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).