US12597171B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12597171-B2 |
| Application number | US-202217973967-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 26, 2022 |
| Priority date | Mar 4, 2022 |
| Publication date | Apr 7, 2026 |
| Grant date | Apr 7, 2026 |
Official abstract text for this publication.
Devices, systems, methods, and media are disclosed for domain adaptation using data densification. Example embodiments described herein receive LiDAR 3D point clouds from a source domain and introduce interpolated 3D points inferred by a trained deep learning neural network to output a denser version of the input 3D point cloud with increased resolution. The trained domain adaptation network reconstructs the source-domain 3D point cloud data, generates translation vectors to compute interpolated 3D point cloud data, and merges the reconstructed 3D point cloud data and the interpolated 3D point cloud data to output, from the source-domain data, a densified 3D point cloud resembling 3D point clouds captured by a target LiDAR sensor.
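The densification step described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the patented implementation: `densify` and `toy_network` are hypothetical names, and the network is replaced by a stand-in callable that returns a reconstruction and one translation vector per point. The merge follows the abstract (reconstructed plus interpolated points); claim 1 instead concatenates the interpolated points with the source-domain points.

```python
import numpy as np

def densify(source_points, network):
    """Return a denser cloud from an (N, 3) source cloud.

    `network` is a hypothetical callable standing in for the trained
    domain adaptation network; it must return two (N, 3) arrays:
    the reconstructed points and one translation vector per point.
    """
    source_points = np.asarray(source_points, dtype=float)
    reconstructed, translations = network(source_points)
    # Interpolated points are locally translated copies of the reconstruction.
    interpolated = reconstructed + translations
    # Merge reconstructed and interpolated points, as the abstract describes.
    return np.concatenate([reconstructed, interpolated], axis=0)

def toy_network(points):
    # Stand-in only: identity reconstruction and a constant upward shift.
    translations = np.tile([0.0, 0.0, 0.1], (len(points), 1))
    return points.copy(), translations

cloud = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
dense = densify(cloud, toy_network)
print(dense.shape)
```

With one translation vector per point, the output holds twice as many points as the input; producing several translation vectors per point would raise the density further.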
Opening claim text (preview).
The invention claimed is:
1. A method comprising: receiving a source-domain 3D point cloud comprising a set of source-domain 3D data; encoding, using a trained domain adaptation network, the set of source-domain 3D data to generate encoded source-domain 3D data; decoding, using the trained domain adaptation network, the encoded source-domain 3D data to generate one or more 3D translation vectors; decoding, using the trained domain adaptation network, the encoded source-domain 3D data to generate a reconstructed 3D point cloud comprising a set of reconstructed 3D data; computing a set of interpolated 3D data based on the one or more 3D translation vectors and the set of reconstructed 3D data; and concatenating the set of interpolated 3D data and the set of source-domain 3D data to obtain a densified 3D point cloud comprising a set of densified 3D data.
2. The method of claim 1, wherein encoding the set of source-domain 3D data to generate encoded source-domain 3D data comprises: performing kernel point convolution on each point in the set of source-domain 3D data to generate one or more feature arrays.
3. The method of claim 2, wherein decoding the encoded source-domain 3D data to generate one or more 3D translation vectors comprises: performing a regression on the encoded source-domain 3D data to generate a 3D translation vector corresponding to each point in the set of reconstructed 3D data.
4. The method of claim 3, wherein computing the set of interpolated 3D data comprises: adding a corresponding translation vector to each corresponding data point in the set of reconstructed 3D data to obtain a locally-translated version of the reconstructed 3D data points.
5. The method of claim 1, further comprising training the domain adaptation network by: obtaining a first training data, based on the set of source-domain 3D data; computing a second training data comprising a set of target interpolated 3D data, based on the set of source-domain 3D data; and training the domain adaptation network based on the first training data and the second training data, where training of the domain adaptation network includes computing an overall loss function.
6. The method of claim 5, wherein computing the second training data comprises: for each point in the set of source-domain 3D data: computing a vertical angle and a horizontal angle; generating a histogram of vertical angle bins based on the vertical angle computed for each point in the set of source-domain 3D data; clustering groups of points of the set of source-domain 3D data into curves based on the histogram of vertical angle bins; associating a first point corresponding to a first group of clustered points with a nearest-point from a second group of clustered points, using a 3D point neighborhood-based association mechanism; generating a unit vector corresponding to the first point corresponding to a first group of clustered points based on the nearest-point from the second group of clustered points; projecting an intermediate point information based on unit vector corresponding to the first point corresponding to the first group of clustered points; and scaling the intermediate point information based on a scaling factor to define an interpolated point.
7. The method of claim 6, wherein the projected intermediate point information is a location and the scaled intermediate point location defines an interpolated position of the point.
8. The method of claim 6, wherein the scaling factor is selectable and different interpolated points are definable for the first point by scaling the intermediate point information based on different selections of the scaling factor.
9. The method of claim 5, where computing the overall loss function comprises calculating at least one of: a reconstruction loss; an interpolation loss; or a regularization loss.
10. The method of claim 9, wherein calculating the regularization loss comprises: calculating one or more vertical angles for each point in the reconstructed 3D data; calculating one or more vertical angles for each point in the target interpolated 3D data; computing one or more sine loss functions based on the one or more vertical angles for each point in the reconstructed 3D data and the one or more vertical angles for each point in the target interpolated 3D data; and comparing the one or more vertical angles for each point in the reconstructed 3D data and the one or more vertical angles for each point in the target interpolated 3D data to a corresponding sine function of the one or more sine loss functions and minimizing the one or more vertical angles with respect to a valley of the corresponding sine function.
11. The method of claim 1, wherein the set of source-domain 3D data is received from a LiDAR sensor, the LiDAR sensor being a rotating LiDAR sensor.
12. A system comprising: one or more processors; one or more memories storing machine-executable instructions, which, when executed by the one or more processors, cause the system to: receive a source-domain 3D point cloud comprising a set of source-domain 3D data; encode, using a trained domain adaptation network, the set of source-domain 3D data to generate encoded source-domain 3D data; decode, using the trained domain adaptation network, the encoded source-domain 3D data to generate one or more 3D translation vectors; decode, using the trained domain adaptation network, the encoded source-domain 3D data to generate a reconstructed 3D point cloud comprising a set of reconstructed 3D data; compute a set of interpolated 3D data based on the one or more 3D translation vectors and the set of reconstructed 3D data; and concatenate the set of interpolated 3D data and the set of source-domain 3D data to obtain a densified 3D point cloud comprising a set of densified 3D data.
13. The system of claim 12, wherein the machine-executable instructions, when executed by the one or more processors to encode the set of source-domain 3D data to generate encoded source-domain 3D data, further cause the system to: perform kernel point convolution on each point in the set of source-domain 3D data to generate one or more feature arrays.
14. The system of claim 13, wherein the machine-executable instructions, when executed by the one or more processors to decode the set of source-domain 3D data to generate one or more 3D translation vectors, further cause the system to: perform a regression on the encoded source-domain 3D data to generate a 3D translation vector corresponding to each point in the set of reconstructed 3D data.
15. The system of claim 14, wherein the machine-executable instructions, when executed by the one or more processors to compute the set of interpolated 3D data, further cause the system to: add a corresponding translation vector to each corresponding data point in the set of reconstructed 3D data to obtain a locally-translated version of the reconstructed 3D data points.
16. The system of claim 12, wherein the machine-executable instructions, when executed by the one or more processors, further cause the system to: train the domain adaptation network by: obtaining a first training data, based on the set of source-domain 3D data; computing a second training data comprising a set of target interpolated 3D data, based on the set of source-domain 3D data; and training the domain adaptation network based on the first training data and the second training data, where training of the domain adaptation network includes computing an overall loss
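One possible reading of the target-interpolation steps recited in claim 6 is sketched below in NumPy. Several details are assumptions that the claim text does not fix: the function names, the bin count, treating each non-empty vertical-angle bin as one curve, pairing each curve with the next one in angle order, using Euclidean nearest neighbour as the association mechanism, and taking the interpolated point as the first point moved along the unit vector by a fraction `scale` of the distance to its neighbour.

```python
import numpy as np

def point_angles(points):
    """Vertical (elevation) and horizontal (azimuth) angle of each point."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    vertical = np.arctan2(z, np.hypot(x, y))
    horizontal = np.arctan2(y, x)
    return vertical, horizontal

def cluster_into_curves(points, n_bins=64):
    """Group points into curves using a histogram of vertical angles."""
    vertical, _ = point_angles(points)
    edges = np.histogram_bin_edges(vertical, bins=n_bins)
    # Bin index of every point; the interior edges give n_bins buckets.
    bin_index = np.digitize(vertical, edges[1:-1])
    histogram = np.bincount(bin_index, minlength=n_bins)
    # Assumption: each non-empty bin corresponds to one scan curve.
    return [points[bin_index == b] for b in range(n_bins) if histogram[b] > 0]

def interpolate_between_curves(curves, scale=0.5):
    """Create one interpolated point per point, between adjacent curves."""
    targets = []
    for first, second in zip(curves[:-1], curves[1:]):
        # Pairwise distances between the two curves (Euclidean, assumed).
        diff = second[None, :, :] - first[:, None, :]
        dist = np.linalg.norm(diff, axis=2)
        nearest = np.argmin(dist, axis=1)
        rows = np.arange(len(first))
        gap = dist[rows, nearest][:, None]
        # Unit vector from each point towards its nearest neighbour.
        unit = diff[rows, nearest] / np.maximum(gap, 1e-12)
        # Project along the unit vector, then scale by the chosen factor.
        targets.append(first + scale * gap * unit)
    return np.concatenate(targets, axis=0) if targets else np.empty((0, 3))

lower = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
upper = np.array([[1.0, 0.0, 0.2], [0.0, 1.0, 0.2]])
curves = cluster_into_curves(np.vstack([lower, upper]), n_bins=2)
targets = interpolate_between_curves(curves, scale=0.5)
print(targets)
```

Calling `interpolate_between_curves` with different `scale` values yields different interpolated points for the same first point, which is one way to read the selectable scaling factor of claim 8.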
using neural networks · CPC title
Evaluating distance, position or velocity data · CPC title
Particle system, point based geometry or rendering · CPC title
Style variation · CPC title
Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts · CPC title