Generating scene flow labels for point clouds using object labels

US12106528B2 · US · B2

Patent metadata
Publication number: US-12106528-B2
Application number: US-202217684334-A
Country: US
Kind code: B2
Filing date: Mar 1, 2022
Priority date: Mar 1, 2021
Publication date: Oct 1, 2024
Grant date: Oct 1, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting scene flow. One of the methods includes obtaining a current point cloud representing an observed scene at a current time point; obtaining object label data that identifies a first three-dimensional region in the observed scene; determining, for each current three-dimensional point that is within the first three-dimensional region and using the object label data, a respective preceding position of the current three-dimensional point at a preceding time point in a reference frame of the sensor at the current time point; and generating, using the preceding positions, a scene flow label for the current point cloud that comprises a respective ground truth motion vector for each of a plurality of the current three-dimensional points.
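The last step of the abstract, turning preceding positions into ground truth motion vectors, can be illustrated with a minimal sketch. It assumes that both the current and the preceding positions are already expressed in the sensor's reference frame at the current time point, and that the time difference between the two time points is known. The function name `motion_vectors` and the sample coordinates are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def motion_vectors(current: List[Vec3], preceding: List[Vec3], dt: float) -> List[Vec3]:
    """Return one velocity vector per point: (current - preceding) / dt, per axis.

    Both position lists are assumed to be in the sensor frame at the
    current time point; dt is the time between the two time points in seconds.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if len(current) != len(preceding):
        raise ValueError("current and preceding must have the same length")
    return [
        tuple((c - p) / dt for c, p in zip(cur, prev))
        for cur, prev in zip(current, preceding)
    ]


# Hypothetical data: two points on an object that moved 1 m along x in 0.1 s.
current = [(10.0, 2.0, 0.5), (11.0, 2.0, 0.5)]
preceding = [(9.0, 2.0, 0.5), (10.0, 2.0, 0.5)]
flow = motion_vectors(current, preceding, dt=0.1)
print(flow)
```

Dividing the displacement by the time difference gives a velocity per direction, which matches the velocity components described in claims 2 and 3. A label that stores raw displacement instead would simply skip the division.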

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed by one or more computers, the method comprising: obtaining a current point cloud representing an observed scene at a current time point, wherein the current point cloud is generated from measurements of a sensor at the current time point, and wherein the current point cloud comprises a plurality of current three-dimensional points; obtaining object label data that identifies a first three-dimensional region in the observed scene that has been labeled as containing a first object in the observed scene at the current time point; determining, for each current three-dimensional point that is within the first three-dimensional region and using the object label data, a respective preceding position of the current three-dimensional point at a preceding time point in a reference frame of the sensor at the current time point; and generating a scene flow label for the current point cloud that comprises a respective ground truth motion vector for each of a plurality of the current three-dimensional points, wherein generating the scene flow label comprises: for each of the current three-dimensional points in the first three-dimensional region, generating the respective motion vector for the current three-dimensional point from a displacement between (i) a current position of the current three-dimensional point at the current time point in the reference frame of the sensor at the current time point and (ii) the preceding position of the current three-dimensional point at the preceding time point in the reference frame of the sensor at the current time point.

2. The method of claim 1, wherein the motion vector includes, for each of multiple directions, a respective velocity component in the direction in the reference frame of the sensor at the current time point.

3. The method of claim 2, wherein generating the respective motion vector for the current three-dimensional point from a displacement between (i) a current position of the current three-dimensional point at the current time point in the reference frame of the sensor at the current time point and (ii) the preceding position of the current three-dimensional point at the preceding time point comprises, for each of the multiple directions: computing the respective velocity component for the direction based on (i) a displacement along the direction between the current position and the preceding position and (ii) a time difference between the current time point and the preceding time point.

4. The method of claim 1, wherein the object label data also identifies a second three-dimensional region in the observed scene that is in a reference frame of the sensor at the preceding time point and that has been labeled as containing the first object at the preceding time point.

5. The method of claim 4, wherein determining, for each current three-dimensional point that is within the first three-dimensional region, a respective preceding position of the current three-dimensional point at a preceding time point in a reference frame of the sensor at the current time point comprises: determining, from a pose of the second three-dimensional region in the reference frame of the sensor at the preceding time point, a preceding pose of the first object at the preceding time point in the reference frame of the sensor at the preceding time point; generating, from (i) the preceding pose and (ii) ego motion data characterizing motion of the sensor from the preceding time point to the current time point, a transformed preceding pose of the first object at the preceding time point that is in the reference frame of the sensor at the current time point; determining, from a pose of the first three-dimensional region in the reference frame of the sensor at the current time point, a current pose of the first object at the current time point in the reference frame of the sensor at the current time point; and determining, from the transformed preceding pose and the current pose, the respective preceding positions for each of the current three-dimensional points in the first three-dimensional region.

6. The method of claim 5, wherein determining, from the transformed preceding pose and the current pose, the respective preceding positions for each of the current three-dimensional points in the first three-dimensional region comprises: determining, from the transformed preceding pose and the current pose, a rigid body transform from the current time point to the preceding time point for the first object; and for each of the current three-dimensional points in the first three-dimensional region, determining the preceding position of the current three-dimensional point by applying the rigid body transform to the current position of the current three-dimensional point.

7. The method of claim 1, wherein the object label data also identifies a third three-dimensional region in the observed scene in the reference frame of the sensor at the current time point that has been labeled as containing a second object in the observed scene at the current time point, and wherein generating the scene flow label for the current point comprises: determining that the object label data indicates that the second object was not detected in the observed scene at the preceding time point; and in response, including, in the scene flow label, data indicating that each current three-dimensional point within the third three-dimensional region does not have a valid motion vector at the current time point.

8. The method of claim 1, wherein generating the scene flow label for the current point cloud comprises: determining that one or more current three-dimensional points are not included in any regions identified as containing any objects at the current time point in the object label data; and in response, generating, for each of the one or more current three-dimensional points, a respective motion vector that indicates that the current three-dimensional point is stationary.

9. The method of claim 8, wherein generating the scene flow label for the current point cloud data comprises identifying each of the one or more current three-dimensional points as belonging to a background of the observed scene in the scene flow label.

10. The method of claim 1, further comprising: generating, from at least the current point cloud and the scene flow label for the current point cloud, a training example for training a machine learning model to predict scene flow of input point clouds.

11. The method of claim 10, further comprising: training the machine learning model on training data that includes the training example.

12. The method of claim 10, further comprising: providing the training example for use in training the machine learning model.

13. One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: obtaining a current point cloud representing an observed scene at a current time point, wherein the current point cloud is generated from measurements of a sensor at the current time point, and wherein the current point cloud comprises a plurality of current three-dimensional points; obtaining object label data that identifies a first three-dimensional region in the observed scene that has been labeled as containing a first object in the observed scene at the current time point; determining, for each current three-dimensional point that is within the first three-dimensional region and using the object label data, a respective preceding position of the current three-dimensional point at a prec…
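The pose chain in claims 5 and 6, together with the special cases in claims 7 to 9, can be sketched as follows. This is an illustrative reading of the claim language under stated assumptions, not the patentee's implementation. It assumes poses are 4x4 homogeneous rigid transforms with yaw-only rotation, that `ego_prev_to_cur` maps coordinates from the preceding sensor frame into the current sensor frame, and that each point already carries the identifier of the labeled box containing it (or `None`). All function names, the dictionary layout of the label, and the sample values are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Mat4 = List[List[float]]


def pose(x: float, y: float, z: float, yaw: float) -> Mat4:
    """Build a 4x4 rigid transform from a translation and a yaw angle (assumed pose format)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x], [s, c, 0.0, y], [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def matmul(a: Mat4, b: Mat4) -> Mat4:
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def invert_rigid(t: Mat4) -> Mat4:
    """Invert a rigid transform: rotation R^T, translation -R^T * t."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [r_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def apply(t: Mat4, p: Vec3) -> Vec3:
    return tuple(sum(t[i][k] * p[k] for k in range(3)) + t[i][3] for i in range(3))


def preceding_position(p: Vec3, cur_pose: Mat4, prev_pose: Mat4, ego_prev_to_cur: Mat4) -> Vec3:
    # Claim 5: express the preceding object pose in the current sensor frame.
    transformed_prev = matmul(ego_prev_to_cur, prev_pose)
    # Claim 6: rigid body transform from the current to the preceding time point.
    cur_to_prev = matmul(transformed_prev, invert_rigid(cur_pose))
    return apply(cur_to_prev, p)


def scene_flow_label(points: List[Vec3], box_ids: List[Optional[str]],
                     cur_poses: Dict[str, Mat4], prev_poses: Dict[str, Mat4],
                     ego_prev_to_cur: Mat4, dt: float) -> List[dict]:
    label = []
    for p, box_id in zip(points, box_ids):
        if box_id is None:
            # Claims 8 and 9: points outside every labeled region are stationary background.
            label.append({"flow": (0.0, 0.0, 0.0), "valid": True, "background": True})
        elif box_id not in prev_poses:
            # Claim 7: object not detected at the preceding time point, so no valid vector.
            label.append({"flow": None, "valid": False, "background": False})
        else:
            q = preceding_position(p, cur_poses[box_id], prev_poses[box_id], ego_prev_to_cur)
            flow = tuple((c - r) / dt for c, r in zip(p, q))
            label.append({"flow": flow, "valid": True, "background": False})
    return label


# Hypothetical scene: the sensor advanced 1 m along x between frames.
ego = pose(-1.0, 0.0, 0.0, 0.0)
points = [(11.5, 0.2, 0.0), (30.0, -4.0, 0.0), (5.0, 3.0, 0.0)]
box_ids = ["car", None, "cyclist"]
cur_poses = {"car": pose(11.0, 0.0, 0.0, 0.0), "cyclist": pose(5.0, 3.0, 0.0, 0.0)}
prev_poses = {"car": pose(10.0, 0.0, 0.0, 0.0)}  # the cyclist was not labeled previously
label = scene_flow_label(points, box_ids, cur_poses, prev_poses, ego, dt=0.1)
for entry in label:
    print(entry)
```

Because each point is first mapped into the object's own coordinates and then placed at the object's earlier pose, the sketch does not need point correspondences between the two point clouds. Only the labeled boxes and the ego motion are required.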

Assignees

Waymo LLC

Inventors

Classifications

  • characterised by the process organisation or structure, e.g. boosting cascade

  • Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

  • Convolutional networks [CNN, ConvNet]

  • Training; Learning

  • Artificial neural networks [ANN]

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12106528B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting scene flow. One of the methods includes obtaining a current point cloud representing an observed scene at a current time point; obtaining object label data that identifies a first three-dimensional region in the observed scene; determining, for each current three-dimensional point that…
Who is the assignee on this patent?
Waymo LLC
What technology area does this patent fall under?
Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 1, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).