Object-centric three-dimensional auto labeling of point cloud data

US12073575B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12073575-B2
Application number  US-202117407795-A
Country             US
Kind code           B2
Filing date         Aug 20, 2021
Priority date       Aug 21, 2020
Publication date    Aug 27, 2024
Grant date          Aug 27, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for performing three-dimensional auto-labeling on sensor data. The system obtains a sensor data segment that includes a temporal sequence of three-dimensional point clouds generated from sensor readings of an environment by one or more sensors. The system identifies, from the sensor data segment, (i) a plurality of object tracks that each corresponds to a different object in the environment and (ii) for each object track, respective initial three-dimensional regions in each of one or more of the point clouds in which the corresponding object appears. The system generates, for each object track, extracted object track data that includes at least the points in the respective initial three-dimensional regions for the object track. The system further generates, for each object track and from the extracted object track data for the object track, an auto labeling output that defines respective refined three-dimensional regions in each of the one or more point clouds.
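
The extraction step in the abstract (collecting, for each object track, the points that fall inside its initial three-dimensional regions) can be illustrated with a minimal sketch. This is not the patented implementation: the `Box3D` layout, the `margin` parameter (standing in for the "additional context points" of claim 5), and every function name are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical box layout: center, size, and heading about the z axis."""
    cx: float
    cy: float
    cz: float
    length: float
    width: float
    height: float
    heading: float  # radians


def points_in_box(points, box, margin=0.0):
    """Return the points inside `box`, enlarged by `margin` on every side."""
    cos_h, sin_h = math.cos(box.heading), math.sin(box.heading)
    kept = []
    for x, y, z in points:
        dx, dy, dz = x - box.cx, y - box.cy, z - box.cz
        # Rotate the offset into the box frame (inverse heading rotation).
        lx = cos_h * dx + sin_h * dy
        ly = -sin_h * dx + cos_h * dy
        if (abs(lx) <= box.length / 2 + margin
                and abs(ly) <= box.width / 2 + margin
                and abs(dz) <= box.height / 2 + margin):
            kept.append((x, y, z))
    return kept


def extract_track_data(point_clouds, track_boxes, margin=0.0):
    """Collect per-frame points for one track.

    `track_boxes` maps a frame index to that track's initial Box3D
    (an assumed representation of the tracker output).
    """
    return {frame: points_in_box(point_clouds[frame], box, margin)
            for frame, box in track_boxes.items()}


# Two frames, each with one point on the object and one elsewhere.
clouds = [
    [(1.0, 0.0, 0.0), (5.0, 5.0, 0.0)],
    [(2.0, 0.2, 0.0), (2.0, 3.0, 0.0)],
]
track = {
    0: Box3D(1.0, 0.0, 0.0, 4.0, 2.0, 2.0, 0.0),
    1: Box3D(2.0, 0.0, 0.0, 4.0, 2.0, 2.0, 0.0),
}
extracted = extract_track_data(clouds, track)
with_context = extract_track_data(clouds, track, margin=2.5)
```

With `margin=0.0` only the on-object point of each frame is kept; a positive margin also admits nearby context points, such as the second point of frame 1.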

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining a sensor data segment comprising a temporal sequence of three-dimensional point clouds generated from sensor readings of an environment by one or more sensors, each three-dimensional point cloud comprising a respective plurality of points in a first coordinate system; identifying, from the sensor data segment, (i) a plurality of object tracks that each corresponds to a different object in the environment and (ii) for each object track, respective initial three-dimensional regions in each of one or more of the point clouds in which the corresponding object appears, wherein each initial three-dimensional region is an initial estimate of the three-dimensional region of the point cloud that includes points that are measurements of the corresponding object, wherein the identifying comprises: processing each of the point clouds in the temporal sequence using an object detector to obtain, for each point cloud, a detector output that identifies a plurality of three-dimensional regions in the point cloud that are predicted to correspond to objects; and processing the detector output using an object tracker to obtain an object tracker output that associates each of at least a subset of the three-dimensional regions in each of the point clouds with a respective one of the plurality of object tracks; generating, for each object track, extracted object track data that includes at least the points in the respective initial three-dimensional regions for the object track; and generating, for each object track and from the extracted object track data for the object track, an auto labeling output that defines respective refined three-dimensional regions in each of the one or more point clouds that is a refined estimate of the three-dimensional region of the point cloud that includes points that are measurements of the corresponding object.

2. The method of claim 1, wherein the one or more sensors are located on an object in the environment and wherein the first coordinate system is centered at the object in the environment.

3. The method of claim 2, wherein the object in the environment is an autonomous vehicle navigating through the environment.

4. The method of claim 2, wherein generating, for each object track, extracted object track data that includes at least the points in the initial three-dimensional regions for the object track comprises: extracting, from each of the one or more point clouds in which the corresponding object appears, a plurality of points including the points in the initial three-dimensional region in the point cloud for the object track; and transforming, using frame pose data for each of the one or more point clouds, each of the extracted points to a second coordinate system centered at a stationary location in the environment.

5. The method of claim 4, wherein the plurality of points includes the points in the initial three-dimensional region of the point cloud for the object track and additional context points in a vicinity of the three-dimensional region in the point cloud.

6. The method of claim 4, wherein generating, for each object track and from the extracted object track data for the object track, data defining a respective refined three-dimensional region in the point cloud that is a refined estimate of the three-dimensional region of the point cloud that includes points that are measurements of the corresponding object comprises: determining, from the extracted object track data for the object track, whether the object track corresponds to a static object or a dynamic object.

7. The method of claim 6, wherein generating, for each object track and from the extracted object track data for the object track, data defining a respective refined three-dimensional region in the point cloud that is a refined estimate of the three-dimensional region of the point cloud that includes points that are measurements of the corresponding object further comprises: in response to determining that the object track corresponds to a static object: generating an aggregate representation of the object track that includes extracted points from all of the one or more point clouds in the second coordinate system; and processing the aggregate representation using a static track auto labeling neural network to generate the data defining the refined region.

8. The method of claim 7, wherein processing the aggregate representation using a static track auto labeling neural network to generate the data defining the refined region comprises: identifying one of the initial three-dimensional regions for the object track; generating a network input by transforming each of the extracted points from the second coordinate system to a third coordinate system that is centered at a particular point in the identified initial three-dimensional region; and providing the network input as input to the static track auto labeling neural network.

9. The method of claim 8, wherein the object detector also outputs a confidence score for each three-dimensional region and wherein identifying one of the initial three-dimensional regions for the object track comprises selecting, from the initial three-dimensional regions for the object track, the initial three-dimensional region with a highest confidence score.

10. The method of claim 8, wherein the static track auto labeling neural network outputs data identifying a three-dimensional region in the third coordinate system, and wherein generating the data defining the refined region comprises transforming the data identifying the three-dimensional region into the second coordinate system.

11. The method of claim 6, wherein generating, for each object track and from the extracted object track data for the object track, data defining a refined region in the point cloud that is a refined estimate of a region of the point cloud that corresponds to the object track further comprises: in response to determining that the object track corresponds to a dynamic object: generating, for each of the one or more point clouds in which the corresponding object appears, a respective representation of the object track from the extracted point from the point cloud; and processing the respective representations using a dynamic track auto labeling neural network to generate, for each of the one or more point clouds in which the corresponding object appears, data defining the respective refined region in the point cloud.

12. The method of claim 11, wherein generating, for each of the one or more point clouds in which the corresponding object appears, a respective representation of the object track comprises: transforming each of the extracted points from the point cloud from the second coordinate system to a respective fourth coordinate system that is centered at a particular point in the initial three-dimensional region in the point cloud for the object track.

13. The method of claim 12, wherein, for each of the one or more point clouds in which the corresponding object appears, the dynamic track auto labeling neural network outputs data identifying a three-dimensional region in the respective fourth coordinate system, and wherein generating the data defining the refined region comprises transforming the data identifying the three-dimensional region from the respective fourth coordinate system to the second coordinate system.

14. The method of claim 11, wherein each respective representation also includes data specifying the initial three-dimensional region in the point cloud for the object track.

15. A system comprising one or more computers an
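
The coordinate-system chain in claims 4 and 6 to 9 can be sketched as follows: move extracted points from the sensor-centered frame to a fixed world frame using the frame pose, decide whether a track is static, then aggregate a static track's points in a frame centered on its highest-confidence box. The sketch rests on several assumptions that the claim text does not specify: poses are reduced to a hypothetical `(tx, ty, tz, yaw)` tuple, the static test is a simple displacement threshold, and all names and dictionary keys are illustrative.

```python
import math


def to_world(point, pose):
    """Map a sensor-frame point into a fixed world frame (claim 4).

    `pose` is a hypothetical (tx, ty, tz, yaw) tuple: the sensor's position
    and heading in the world frame for one point cloud.
    """
    x, y, z = point
    tx, ty, tz, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


def to_box_frame(point, center, heading):
    """Map a world-frame point into a frame centered on a box (claim 8)."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    dz = point[2] - center[2]
    c, s = math.cos(heading), math.sin(heading)
    return (c * dx + s * dy, -s * dx + c * dy, dz)


def is_static(world_centers, max_displacement=1.0):
    """Assumed heuristic for claim 6: the track is static if its box center
    never moves more than `max_displacement` from its first position."""
    first = world_centers[0]
    return all(math.dist(first, c) <= max_displacement for c in world_centers)


def aggregate_static_track(frames):
    """Merge all frames of a static track into one box-centered point set.

    Each frame is a dict with hypothetical keys: 'points' and 'box_center'
    (both in the sensor frame), 'box_heading', 'pose', and 'score'.
    """
    best = max(frames, key=lambda f: f["score"])  # claim 9: highest confidence
    center = to_world(best["box_center"], best["pose"])
    heading = best["box_heading"] + best["pose"][3]
    merged = []
    for f in frames:
        for p in f["points"]:
            merged.append(to_box_frame(to_world(p, f["pose"]), center, heading))
    return merged


# A parked object at world x = 10, observed from two sensor positions.
frames = [
    {"points": [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)], "box_center": (10.0, 0.0, 0.0),
     "box_heading": 0.0, "pose": (0.0, 0.0, 0.0, 0.0), "score": 0.6},
    {"points": [(5.0, 0.0, 0.0), (4.0, 0.0, 0.0)], "box_center": (5.0, 0.0, 0.0),
     "box_heading": 0.0, "pose": (5.0, 0.0, 0.0, 0.0), "score": 0.9},
]
centers = [to_world(f["box_center"], f["pose"]) for f in frames]
static = is_static(centers)
merged = aggregate_static_track(frames) if static else []
```

In the sensor frame the box center appears to move from x = 10 to x = 5, but both observations map to the same world position, so the track is treated as static and its points from both frames accumulate around one object-centered origin. A dynamic track would instead keep one representation per point cloud, each centered on that frame's own box (claims 11 and 12).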

Assignees

Inventors

Classifications

  • Artificial neural networks [ANN] · CPC title

  • Range image; Depth image; 3D point clouds · CPC title

  • Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title

  • exterior to a vehicle by using sensors mounted on the vehicle · CPC title

  • Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12073575B2 cover?
Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for performing three-dimensional auto-labeling on sensor data. The system obtains a sensor data segment that includes a temporal sequence of three-dimensional point clouds generated from sensor readings of an environment by one or more sensors. The system identifies, from the sensor data seg…
Who is the assignee on this patent?
Waymo LLC
What technology area does this patent fall under?
Primary CPC classification G06T7/521. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 27, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).