Region detection and geometry prediction

US12154347B2 · US · B2

Patent metadata
Publication number: US-12154347-B2
Application number: US-202217691103-A
Country: US
Kind code: B2
Filing date: Mar 9, 2022
Priority date: Mar 9, 2021
Publication date: Nov 26, 2024
Grant date: Nov 26, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting regions of an environment. One of the methods includes receiving a representation of a scene in an environment; processing the representation using a center prediction neural network to generate: (i) features of the scene in the environment, and (ii) a respective center score corresponding to each of a plurality of locations in the environment; selecting, based on the respective center scores, one or more of the plurality of locations; and for each selected location: processing an input comprising the features of the scene in the environment and data specifying the selected location using a geometry prediction neural network to generate a geometry prediction that represents a geometry of the region that is centered at the selected location.
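The abstract does not say how locations are selected from the center scores, or how a selected location is encoded for the geometry prediction network. As a minimal sketch, assuming a score threshold combined with a 3x3 local-maximum test (an illustrative choice, not the patent's stated method) and a one-hot map as the location encoding (in the spirit of claim 10), the selection step might look like the following. The names select_centers and location_map and the 0.5 threshold are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]


def select_centers(scores: Grid, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) cells whose center score reaches the threshold and is
    strictly greater than every neighbor in the surrounding 3x3 window.

    Assumption: threshold plus local-maximum selection is only one plausible
    way to pick locations "based on the respective center scores".
    """
    rows, cols = len(scores), len(scores[0])
    selected = []
    for r in range(rows):
        for c in range(cols):
            s = scores[r][c]
            if s < threshold:
                continue
            neighbors = [
                scores[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(s > n for n in neighbors):
                selected.append((r, c))
    return selected


def location_map(rows: int, cols: int, location: Tuple[int, int]) -> Grid:
    """Build a one-hot map with the same spatial size as the feature grid.

    Assumption: a one-hot encoding is one way to produce "data specifying the
    selected location"; the source does not fix the encoding.
    """
    grid = [[0.0] * cols for _ in range(rows)]
    grid[location[0]][location[1]] = 1.0
    return grid


# Toy 4x4 center-score grid (made-up values for illustration only).
example_scores = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.7],
    [0.0, 0.1, 0.4, 0.3],
]
centers = select_centers(example_scores)  # [(1, 1), (2, 3)]
maps = [location_map(4, 4, loc) for loc in centers]
```

Each selected location yields its own location map, so the geometry prediction network would be run once per selected center on the shared scene features.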

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed by one or more computers, the method comprising: receiving a representation of a scene in an environment; processing the representation using a center prediction neural network to generate: (i) features of the scene in the environment, and (ii) a respective center score corresponding to each of a plurality of locations in the environment, wherein each respective center score represents a predicted likelihood that a center of a region is located at the corresponding location in the environment; selecting, based on the respective center scores, one or more of the plurality of locations in the environment; and for each selected location: processing an input comprising the features of the scene in the environment and data specifying the selected location using a geometry prediction neural network to generate a geometry prediction that represents a geometry of the region that is centered at the selected location as a collection of one or more convexes by specifying, for each of the one or more convexes, a respective plurality of hyperplanes that define the convex.

2. The method of claim 1, further comprising: for each selected location, generating a polygonal representation that represents the geometry of the region that is centered at the selected location from the respective plurality of hyperplanes for each of the one or more convexes.

3. The method of claim 1, wherein the representation is a top-down representation of the scene in the environment.

4. The method of claim 3, wherein the representation is generated from raw laser data collected by one or more laser sensors of a vehicle navigating through the environment.

5. The method of claim 3, wherein each of the plurality of locations corresponds to a respective portion of the top-down representation.

6. The method of claim 5, wherein each of the plurality of locations corresponds to a respective pixel in the top-down representation.

7. The method of claim 1, wherein the center prediction neural network is configured to generate a respective pixel prediction score for each of a plurality of pixels in the representation that represents a likelihood that a region instance is depicted at the pixel.

8. The method of claim 7, wherein the features of the scene comprise the respective per pixel prediction scores for the plurality of pixels.

9. The method of claim 1, wherein the features of the scene comprise outputs of one or more hidden layers of the center prediction neural network.

10. The method of claim 1, wherein the data specifying the selected location is a feature map that has a same spatial dimensionality as the features and that identifies the selected location.

11. The method of claim 1, wherein the geometry prediction generated by the geometry prediction neural network includes, for each hyperplane of each convex, parameters of a signed distance function that measures a signed distance of any given point in the environment from the hyperplane.

12. The method of claim 11, wherein the parameters of the signed distance function include a normal corresponding to the hyperplane.

13. The method of claim 11, wherein the parameters of the signed distance function include an offset of the hyperplane from the origin.

14. The method of claim 1, wherein the geometry prediction neural network comprises an encoder neural network configured to process the input to generate a set of hyperplane parameters and a decoder neural network configured to process the set of hyperplane parameters to generate the geometry prediction.

15. The method of claim 1, wherein the center prediction neural network and the geometry prediction neural network have been trained jointly on a set of training data that includes a plurality of training representations and for each training representation a set of ground truth region geometries.

16. The method of claim 15, wherein the center prediction neural network and the geometry prediction neural network have been trained jointly to minimize a loss function that includes a (i) a reconstruction loss that measures errors in geometry predictions relative to the ground truth region geometries and (ii) a center prediction loss that measures errors in center predictions generated by the center prediction neural network relative to region centers specified by the ground truth region geometries.

17. The method of claim 16, wherein the center prediction neural network is configured to generate a respective pixel prediction score for each of a plurality of pixels in the representation that represents a likelihood that a region instance is depicted at the pixel, and wherein the loss function also includes (iii) a per pixel prediction loss that measures errors in the per pixel predictions relative to region locations specified by the ground truth region geometries.

18. The method of claim 17, wherein the loss function also includes (iv) a localization loss.

19. The method of claim 16, wherein during the joint training the geometry prediction neural network receives as input locations of region centers specified by the ground truth region geometries rather than locations selected based on center predictions generated by the center prediction neural network.

20. A system comprising: one or more computers; and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising: receiving a representation of a scene in an environment; processing the representation using a center prediction neural network to generate: (i) features of the scene in the environment, and (ii) a respective center score corresponding to each of a plurality of locations in the environment, wherein each respective center score represents a predicted likelihood that a center of a region is located at the corresponding location in the environment; selecting, based on the respective center scores, one or more of the plurality of locations in the environment; and for each selected location: processing an input comprising the features of the scene in the environment and data specifying the selected location using a geometry prediction neural network to generate a geometry prediction that represents a geometry of the region that is centered at the selected location as a collection of one or more convexes by specifying, for each of the one or more convexes, a respective plurality of hyperplanes that define the convex.

21. One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving a representation of a scene in an environment; processing the representation using a center prediction neural network to generate: (i) features of the scene in the environment, and (ii) a respective center score corresponding to each of a plurality of locations in the environment, wherein each respective center score represents a predicted likelihood that a center of a region is located at the corresponding location in the environment; selecting, based on the respective center scores, one or more of the plurality of locations in the environment; and for each selected location: processing an input comprising the features of the scene in the environment and data specifying the selected location using a geometry prediction neural network to generate a geometry prediction that represents a g
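The claims describe a region as a collection of convexes, each defined by hyperplanes whose signed distance functions are parameterized by a normal and an offset (claims 1 and 11 to 13), and a polygonal representation derived from those hyperplanes (claim 2). The following is a minimal 2D sketch under stated assumptions: the sign convention (signed distance at or below zero means inside), the unit-length normals, the half-plane clipping routine used to obtain a polygon, and all function names are illustrative choices that the claims do not specify.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# In 2D a hyperplane is a line, stored as (normal_x, normal_y, offset).
Hyperplane = Tuple[float, float, float]


def signed_distance(h: Hyperplane, p: Point) -> float:
    """Signed distance n . p + offset (a true distance only for unit normals)."""
    nx, ny, offset = h
    return nx * p[0] + ny * p[1] + offset


def in_convex(convex: List[Hyperplane], p: Point) -> bool:
    """A point is inside a convex if it is on the inner side of every hyperplane."""
    return all(signed_distance(h, p) <= 0.0 for h in convex)


def in_region(convexes: List[List[Hyperplane]], p: Point) -> bool:
    """A region is modeled as the union of its convexes."""
    return any(in_convex(c, p) for c in convexes)


def clip(polygon: List[Point], h: Hyperplane) -> List[Point]:
    """Keep the part of a convex polygon on the inner side of one hyperplane
    (a single Sutherland-Hodgman clipping pass)."""
    out: List[Point] = []
    for i, cur in enumerate(polygon):
        prev = polygon[i - 1]  # index -1 wraps around to the last vertex
        d_prev, d_cur = signed_distance(h, prev), signed_distance(h, cur)
        if (d_prev <= 0.0) != (d_cur <= 0.0):
            # The edge crosses the hyperplane: add the intersection point.
            t = d_prev / (d_prev - d_cur)
            out.append((prev[0] + t * (cur[0] - prev[0]),
                        prev[1] + t * (cur[1] - prev[1])))
        if d_cur <= 0.0:
            out.append(cur)
    return out


def convex_to_polygon(convex: List[Hyperplane], extent: float = 1e3) -> List[Point]:
    """Turn hyperplanes into polygon vertices by clipping a large square.

    Assumption: the convex is bounded and lies within [-extent, extent]^2.
    """
    poly: List[Point] = [(-extent, -extent), (extent, -extent),
                         (extent, extent), (-extent, extent)]
    for h in convex:
        poly = clip(poly, h)
        if not poly:
            break
    return poly


# Toy convex: the square [-1, 1] x [-1, 1] written as four half-planes.
square = [(1.0, 0.0, -1.0), (-1.0, 0.0, -1.0), (0.0, 1.0, -1.0), (0.0, -1.0, -1.0)]
square_vertices = convex_to_polygon(square)
```

Storing hyperplane parameters rather than vertices keeps the representation fixed in size per convex; vertices are recovered only when a polygon is needed downstream.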

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12154347B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting regions of an environment. One of the methods includes receiving a representation of a scene in an environment; processing the representation using a center prediction neural network to generate: (i) features of the scene in the environment, and (ii) a respective center score correspond…
Who is the assignee on this patent?
Waymo LLC
What technology area does this patent fall under?
Primary CPC classification G06V20/58. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 26, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).