Computer-based techniques for learning compositional representations of 3D point clouds

US11869149B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11869149-B2
Application number	US-202217744467-A
Country	US
Kind code	B2
Filing date	May 13, 2022
Priority date	May 13, 2022
Publication date	Jan 9, 2024
Grant date	Jan 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In various embodiments, an unsupervised training application executes a neural network on a first point cloud to generate keys and values. The unsupervised training application generates output vectors based on a first query set, the keys, and the values and then computes spatial features based on the output vectors. The unsupervised training application computes quantized context features based on the output vectors and a first set of codes representing a first set of 3D geometry blocks. The unsupervised training application modifies the first neural network based on a likelihood of reconstructing the first point cloud, the quantized context features, and the spatial features to generate an updated neural network. A trained machine learning model includes the updated neural network, a second query set, and a second set of codes representing a second set of 3D geometry blocks and maps a point cloud to a representation of 3D geometry instances.
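The abstract's step of generating output vectors from a query set, keys, and values can be illustrated with a minimal attention sketch. This is not the patented implementation: scaled dot-product scoring, the softmax normalization, the function names `softmax` and `attend`, and the toy numbers are all assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """One attention step: each query yields a weighted average of the values."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Compatibility score between this query and every key
        # (scaled dot product is an assumption, not taken from the patent).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors under those weights.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Hypothetical toy data: three key/value pairs and two queries.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
queries = [[4.0, 0.0], [0.0, 4.0]]
output_vectors = attend(queries, keys, values)
```

Each query produces one output vector, so the number of queries fixes how many output vectors (and, downstream, how many spatial and context features) are available.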

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for training a machine learning model to generate representations of point clouds, the method comprising: executing a first neural network on a first point cloud that represents a first three-dimensional (3D) scene to generate a key set and a value set; generating an output vector set based on a first query set, the key set, and the value set; computing a plurality of spatial features based on the output vector set; computing a plurality of quantized context features based on the output vector set and a first set of codes representing a first set of 3D geometry blocks; and modifying the first neural network based on a likelihood of reconstructing the first point cloud, the plurality of quantized context features, and the plurality of spatial features to generate an updated neural network, wherein a trained machine learning model includes the updated neural network, a second query set, and a second set of codes representing a second set of 3D geometry blocks and maps a point cloud representing a 3D scene to a representation of a plurality of 3D geometry instances.
2. The computer-implemented method of claim 1, wherein the output vector set is generated by executing a second neural network that includes a plurality of attention layers on the key set, the value set, and the first query set.
3. The computer-implemented method of claim 1, wherein generating the output vector set comprises: computing a first plurality of compatibility scores between the first query set and the key set; computing a first intermediate query set based on the value set and the first plurality of compatibility scores; and computing the output vector set based on the value set, the first intermediate query set, and the key set.
4. The computer-implemented method of claim 1, further comprising generating the second query set based on at least one of the output vector set, the first query set, the key set, or the value set.
5. The computer-implemented method of claim 1, wherein a first spatial feature included in the plurality of spatial features specifies at least one of a weight, a rotation matrix, a 3D scaling factor, or a translation.
6. The computer-implemented method of claim 1, wherein a first quantized context feature included in the plurality of quantized context features is computed by: computing a first context feature based on a first output vector included in the output vector set; computing a set of distances between the first context feature and the first set of codes; and setting the first quantized context feature equal to a first code included in the first set of codes based on the set of distances.
7. The computer-implemented method of claim 1, wherein modifying the first neural network comprises replacing a first value for a first weight included in the first neural network with a second value for the first weight that increases a likelihood associated with reconstructing the first point cloud.
8. The computer-implemented method of claim 7, further comprising executing one or more backpropagation operations on the first neural network to determine the second value for the first weight.
9. The computer-implemented method of claim 1, further comprising executing the trained machine learning model on a second point cloud to generate a first representation of a first plurality of 3D geometry instances that includes at least one instance of a first 3D geometry block included in the second set of 3D geometry blocks and at least one instance of a second 3D geometry block included in the second set of 3D geometry blocks.
10. The computer-implemented method of claim 9, wherein, for each 3D geometry instance included in the first plurality of 3D geometry instances, the first representation of the first plurality of 3D geometry instances includes a different quantized context feature and a different spatial feature.
11. One or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to train a machine learning model to generate representations of point clouds by performing the steps of: executing a first neural network on a first point cloud that represents a first three-dimensional (3D) scene to generate a key set and a value set; generating an output vector set based on a first query set, the key set, and the value set; computing a plurality of spatial features based on the output vector set; computing a plurality of quantized context features based on the output vector set and a first set of codes representing a first set of 3D geometry blocks; and modifying the first neural network based on a likelihood of reconstructing the first point cloud, the plurality of quantized context features, and the plurality of spatial features to generate an updated neural network, wherein a trained machine learning model includes the updated neural network, a second query set, and a second set of codes representing a second set of 3D geometry blocks and maps a point cloud representing a 3D scene to a representation of a plurality of 3D geometry instances.
12. The one or more non-transitory computer readable media of claim 11, wherein a second neural network executes a plurality of weighted averaging operations on the value set based on the key set and the first query set to generate the output vector set.
13. The one or more non-transitory computer readable media of claim 11, wherein generating the output vector set comprises: computing a first plurality of compatibility scores between the first query set and the key set; computing a first intermediate query set based on the value set and the first plurality of compatibility scores; and computing the output vector set based on the value set, the first intermediate query set, and the key set.
14. The one or more non-transitory computer readable media of claim 11, further comprising generating the second query set based on at least one of the output vector set, the first query set, the key set, or the value set.
15. The one or more non-transitory computer readable media of claim 11, wherein each spatial feature included in the plurality of spatial features specifies spatial information associated with a different cluster of points within the first point cloud.
16. The one or more non-transitory computer readable media of claim 11, wherein a first quantized context feature included in the plurality of quantized context features is computed by: computing a first context feature based on a first output vector included in the output vector set; computing a set of distances between the first context feature and the first set of codes; and setting the first quantized context feature equal to a first code included in the first set of codes based on the set of distances.
17. The one or more non-transitory computer readable media of claim 16, further comprising: computing a second code based on the first context feature; and replacing the first code included in the first set of codes with the second code to generate the second set of codes.
18. The one or more non-transitory computer readable media of claim 11, wherein modifying the first neural network comprises replacing a first value for a first weight included in the first neural network with a second value for the first weight that increases a likelihood associated with reconstructing the first point cloud.
19. The one or more non-transitory computer readable media of claim 11, wherein a fir…
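The quantization step recited in the claim text above (compute distances between a context feature and a set of codes, then set the quantized feature equal to one of the codes) can be sketched as a nearest-code lookup. The claim says only that the code is chosen "based on the set of distances"; Euclidean distance, picking the minimum, the function name `quantize`, and the toy codebook are assumptions made here for illustration.

```python
import math

def quantize(context_feature, codes):
    """Return (index, code) for the code closest to context_feature."""
    # One distance per code (Euclidean distance is an assumption).
    distances = [math.dist(context_feature, code) for code in codes]
    # Choose the code with the smallest distance (also an assumption).
    index = min(range(len(codes)), key=distances.__getitem__)
    return index, codes[index]

# Hypothetical codebook: each code stands for one 3D geometry block.
codes = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
index, quantized = quantize([0.9, 1.2], codes)
```

Because the quantized feature is always one of the stored codes, every detected instance maps back to a specific geometry block in the codebook.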

Assignees

Nvidia Corp

Inventors

Classifications

G06T17/10 · Primary
Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/084
Backpropagation, e.g. using gradient descent · CPC title
G06T19/20
Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts · CPC title
G06T2219/2016
Rotation, translation, scaling · CPC title
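The rotation, translation, and scaling named in the last classification match the spatial feature components listed in claim 5. As a rough illustration of how such a feature could place a 3D geometry block in a scene, the sketch below scales, rotates, and then translates each point of a block. The order of operations, the function name `place_block`, and the sample values are assumptions; the page does not specify them.

```python
def place_block(points, rotation, scale, translation):
    """Apply per-axis scale, then a 3x3 rotation, then a translation to each point."""
    placed = []
    for p in points:
        # Per-axis scaling of the block's local coordinates.
        scaled = [s * c for s, c in zip(scale, p)]
        # Matrix-vector product with the rotation matrix (row by row).
        rotated = [sum(r * c for r, c in zip(row, scaled)) for row in rotation]
        # Shift the rotated point to its position in the scene.
        placed.append([r + t for r, t in zip(rotated, translation)])
    return placed

# Hypothetical spatial feature: 90-degree rotation about z, uniform scale 2, lift by 5.
rotation = [[0.0, -1.0, 0.0],
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0]]
block = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
instance = place_block(block, rotation, scale=[2.0, 2.0, 2.0],
                       translation=[0.0, 0.0, 5.0])
```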

Patent family

Related publications grouped by family.

View patent family 88699252

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11869149B2 cover?: In various embodiments, an unsupervised training application executes a neural network on a first point cloud to generate keys and values. The unsupervised training application generates output vectors based on a first query set, the keys, and the values and then computes spatial features based on the output vectors. The unsupervised training application computes quantized context features base…
Who is the assignee on this patent?: Nvidia Corp
What technology area does this patent fall under?: Primary CPC classification G06T17/10. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 9, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Reconstructing three-dimensional models of objects from real images based on depth information

Neural network training technique

Neural networks trained using event occurrences

Systems and methods for reconstructing a scene in three dimensions from a two-dimensional image

Using neural networks to perform object detection, instance segmentation, and semantic correspondence from bounding box supervision

Systems and methods for inspection and defect detection using 3-d scanning
