Canonicalized codebook for 3D object generation

US2024251103A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2024251103-A1
Application number: US-202418413163-A
Country: US
Kind code: A1
Filing date: Jan 16, 2024
Priority date: Jan 25, 2023
Publication date: Jul 25, 2024
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of machine-learning. The method includes obtaining a training dataset of 3D models of real-world objects. The method further includes learning, based on the training dataset and on a patch-decomposition of the 3D models of the training dataset, a finite codebook of quantized vectors and a neural network. The neural network comprises a rotation-invariant encoder. The rotation-invariant encoder is configured for rotation-invariant encoding of a patch of a 3D model into a quantized latent vector of the codebook. The neural network further includes a decoder. The decoder is configured for decoding a sequence of quantized latent vectors of the codebook into a 3D model. The sequence corresponds to a patch-decomposition. This constitutes an improved solution for 3D model generation.
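The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the patented encoder, which is a learned neural network: here a hand-crafted descriptor (sorted point-to-centroid distances) stands in for it, chosen only because it is unchanged by rotating or translating the patch. The function names, the descriptor size `dim`, and the toy codebook are all assumptions made for illustration.

```python
import math

def toy_invariant_descriptor(patch, dim=4):
    """Hypothetical stand-in for the learned encoder: sorted distances
    from each point to the patch centroid, padded or truncated to `dim`.
    Rotating or translating the patch leaves these distances unchanged."""
    n = len(patch)
    centroid = [sum(p[a] for p in patch) / n for a in range(3)]
    dists = sorted(math.dist(p, centroid) for p in patch)
    return (dists + [0.0] * dim)[:dim]

def quantize(z, codebook):
    """Map a latent vector to its nearest codebook entry (Euclidean).
    Returns the entry's index and the entry itself."""
    idx = min(range(len(codebook)), key=lambda k: math.dist(z, codebook[k]))
    return idx, codebook[idx]

def rotate_z(patch, angle):
    """Rotate every point of the patch about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in patch]

# Toy data (assumed): one 4-point patch and a 3-entry codebook.
patch = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
codebook = [[0.0] * 4, [1.0, 1.0, 2.0, 2.0], [3.0] * 4]

idx_original, _ = quantize(toy_invariant_descriptor(patch), codebook)
idx_rotated, _ = quantize(toy_invariant_descriptor(rotate_z(patch, 0.7)), codebook)
print(idx_original, idx_rotated)  # both patches select the same codebook entry
```

Because the descriptor ignores orientation, a patch and its rotated copy share one codebook entry, which is the property that lets a finite codebook cover patches regardless of their pose.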

First claim

Opening claim text (preview).

1. A computer-implemented method of machine-learning, the method comprising: obtaining a training dataset of 3D models of real-world objects; and learning, based on the training dataset and on a patch-decomposition of the 3D models of the training dataset, a finite codebook of quantized vectors and a neural network, the neural network including: a rotation-invariant encoder configured for rotation-invariant encoding of a patch of a 3D model into a quantized latent vector of the codebook, and a decoder configured for decoding a sequence of quantized latent vectors of the codebook, the sequence corresponding to a patch-decomposition, into a 3D model.

2. The computer-implemented method of claim 1, wherein the encoder is translation-invariant and rotation-invariant and is configured for translation-invariant and rotation-invariant encoding of a patch of a 3D model of the training dataset into a quantized latent vector of the codebook.

3. The computer-implemented method of claim 1, wherein the decoder includes: a first module configured for taking as input a sequence of quantized latent vectors of the codebook corresponding to a patch-decomposition and inferring patches rotations for reconstructing a 3D model; and a second module configured for taking as input the sequence of quantized latent vectors of the codebook corresponding to the patch-decomposition and inferring patches geometries for reconstructing a 3D model.

4. The computer-implemented method of claim 1, wherein the learning further comprises minimizing a loss, the loss including a reconstruction loss and a commitment loss, the commitment loss rewarding consistency between quantized latent vectors outputted by the encoder and vectors of the codebook.

5. The computer-implemented method of claim 4, wherein the loss is of a type:

$$\mathcal{L}(x; \phi, \psi, \mathcal{D}) = \mathcal{L}_r(o_x, \sigma_x) + \beta \, \mathcal{L}_{VQ}(Z, Z_q)$$

where $\mathcal{L}_r$ is a reconstruction binary cross-entropy loss, $\mathcal{L}_{VQ}$ is a commitment loss, $x$ represents a 3D point, $\psi$ represents a parameter of the decoder, $\beta$ is a weighting parameter, $\sigma_x$ represents a ground truth occupancy for $x$, $o_x$ represents a predicted occupancy for $x$, $\phi$ represents the parameter of the encoder, and where

$$\mathcal{L}_{VQ}(Z, Z_q) = \lVert Z - sg[Z_q] \rVert_2^2$$

where $sg[\cdot]$ denotes a stop-gradient operation, $Z = \{z_i\}_i$, $Z_q = \{z_i^q\}_i$, where $z_i$ is a non-quantized encoding of patch $X_i$ and where

$$VQ(z_i) = z_i^q = \arg\min_{e \in \mathcal{D}} \lVert z_i - e \rVert$$

where $\mathcal{D} = \{e_k \in \mathbb{R}^D\}$, $k = 1, \dots, K$, is the codebook.

6. A computer-implemented method of applying a decoder and a codebook learnable according to a machine-learning including obtaining a training dataset of 3D models of real-world objects and learning, based on the training dataset and on a patch-decomposition of the 3D models of the training dataset, a finite codebook of quantized vectors and a neural network, the neural network including: a rotation-invariant encoder configured for rotation-invariant encoding of a patch of a 3D model into a quantized latent vector of the codebook, and a decoder configured for decoding a sequence of quantized latent vectors of the codebook, the sequence corresponding to a patch-decomposition, into a 3D model, the method comprising: obtaining a sequence of quantized latent vectors of the codebook; and applying the decoder to the sequence.

7. The computer-implemented method of claim 6, wherein obtaining the sequence further comprises: applying a transformer neural network to obtain the sequence, the transformer neural network being configured for, given an input latent vector representing an input 3D model, generating a sequence of quantized latent vectors of the codebook that correspond to a patch-decomposition of the input 3D model.

8. The computer-implemented method of claim 7, wherein the latent vector representing the input 3D model corresponds to an embedding of an image or point cloud representing the input 3D model.

9. The computer-implemented method of claim 8, wherein the image is a single-view image or the point cloud is a partial point cloud.

10. The computer-implemented method of claim 6, wherein obtaining the sequence further comprises applying the encoder to a patch-decomposition of an input 3D model.

11. The computer-implemented method of claim 6, further comprising
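The loss of claims 4 and 5 (a binary cross-entropy reconstruction term plus a weighted commitment term) can be evaluated numerically with a short sketch. Several choices here are assumptions rather than anything stated in the claims: the function names, averaging the cross-entropy over sampled points, summing the squared differences over all patches, and the default weight `beta=0.25`. Plain Python has no automatic differentiation, so the stop-gradient operation is only noted in a comment; it affects gradients during training, not the value of the loss.

```python
import math

def bce(pred_occ, true_occ, eps=1e-7):
    """Mean binary cross-entropy between predicted and ground truth
    occupancies (averaging over points is an assumption of this sketch)."""
    total = 0.0
    for p, t in zip(pred_occ, true_occ):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred_occ)

def nearest(z, codebook):
    """Vector quantization: the codebook entry closest to z."""
    return min(codebook, key=lambda e: math.dist(z, e))

def commitment(Z, Zq):
    """Squared L2 distance between encodings Z and their quantized
    versions Zq. In a training framework Zq would be wrapped in a
    stop-gradient so that only the encoder receives this gradient."""
    return sum((a - b) ** 2 for z, zq in zip(Z, Zq) for a, b in zip(z, zq))

def total_loss(pred_occ, true_occ, Z, codebook, beta=0.25):
    """Reconstruction term plus beta times the commitment term."""
    Zq = [nearest(z, codebook) for z in Z]
    return bce(pred_occ, true_occ) + beta * commitment(Z, Zq)

# Toy values (assumed): two patch encodings, a two-entry codebook,
# and occupancy predictions at two sampled 3D points.
codebook = [[0.0, 0.0], [1.0, 1.0]]
Z = [[0.1, 0.0], [0.9, 1.0]]
loss = total_loss([0.5, 0.5], [1.0, 0.0], Z, codebook)
print(round(loss, 4))
```

The commitment term is small when each encoding already sits close to a codebook entry, so minimizing it pulls the encoder outputs toward the entries they are quantized to.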

Assignees

Inventors

Classifications

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06N3/0455 (Primary)

    Auto-encoder networks; Encoder-decoder networks · CPC title

  • Three-dimensional [3D] modelling for computer graphics · CPC title

  • Vector quantisation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024251103A1 cover?
A computer-implemented method of machine-learning. The method includes obtaining a training dataset of 3D models of real-world objects. The method further includes learning, based on the training dataset and on a patch-decomposition of the 3D models of the training dataset, a finite codebook of quantized vectors and a neural network. The neural network comprises a rotation-invariant encoder. Th…
Who is the assignee on this patent?
Dassault Systemes, Ecole Polytech, Centre Nat Rech Scient
What technology area does this patent fall under?
Primary CPC classification G06N3/0455. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 25, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).