Synthesizing sequences of 3D geometries for movement-based performance

US12488524B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12488524-B2
Application number  US-202117526608-A
Country             US
Kind code           B2
Filing date         Nov 15, 2021
Priority date       Nov 15, 2021
Publication date    Dec 2, 2025
Grant date          Dec 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique for generating a sequence of geometries includes converting, via an encoder neural network, one or more input geometries corresponding to one or more frames within an animation into one or more latent vectors. The technique also includes generating the sequence of geometries corresponding to a sequence of frames within the animation based on the one or more latent vectors. The technique further includes causing output related to the animation to be generated based on the sequence of geometries.
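The data flow in the abstract (a few input geometries become latent vectors, and a full sequence of frames is generated from those latent vectors) can be illustrated with a minimal sketch. The claims on this page name an encoder-decoder attention layer, so the sketch uses plain scaled dot-product attention as a stand-in. Everything concrete here is an assumption made for illustration: the function names, the two-dimensional vectors, the hand-picked query values, and the use of the latent vectors as both keys and values with no learned projections. It is not the patented implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, latents):
    """Scaled dot-product attention with the latents as keys and values.

    Each query stands for one output frame; the result for that frame is a
    weighted average of the latent vectors (weights sum to 1).
    """
    dim = len(latents[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in latents
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * k[d] for w, k in zip(weights, latents)) for d in range(dim)]
        )
    return outputs


# Hypothetical latent vectors, e.g. encodings of two input keyframes.
latents = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical per-frame queries for the start, middle, and end of a clip.
queries = [[4.0, 0.0], [1.0, 1.0], [0.0, 4.0]]

frames = cross_attention(queries, latents)
for step, frame in enumerate(frames):
    print(step, [round(x, 3) for x in frame])
```

In this toy setup the first and last frames stay close to the first and second latent vector respectively, while the middle frame is an even blend of the two. A trained decoder would map such attention outputs to actual 3D geometry, which the sketch does not attempt.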

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for generating a sequence of three-dimensional (3D) geometries, the computer-implemented method comprising: converting, via an encoder neural network, one or more input 3D geometries corresponding to one or more frames within an animation into one or more latent vectors; combining (i) a capture code that represents one or more attributes of the sequence of 3D geometries with (ii) a plurality of position encodings that represent a plurality of time steps within the animation to produce a plurality of position-encoded representations of the capture code; generating, via a decoder neural network, the sequence of 3D geometries based on input that includes (i) the one or more latent vectors and (ii) the plurality of position-encoded representations of the capture code, wherein each 3D geometry included in the sequence of 3D geometries corresponds to (i) a different time step included in the plurality of time steps and (ii) a different frame included in a sequence of frames within the animation; and causing output related to the animation to be generated based on the sequence of 3D geometries.

2. The computer-implemented method of claim 1, further comprising training the encoder neural network and the decoder neural network based on a training dataset that includes a plurality of sequences of sampled 3D geometries, wherein each sequence of sampled 3D geometries included in the plurality of sequences of sampled 3D geometries comprises a sampled subset of a plurality of 3D geometries corresponding to a plurality of time steps within a geometric representation of one or more movements.

3. The computer-implemented method of claim 2, further comprising determining the capture code based on one or more capture codes included in a plurality of capture codes associated with the plurality of sequences of sampled 3D geometries.

4. The computer-implemented method of claim 3, wherein determining the capture code comprises at least one of: selecting the capture code from the plurality of capture codes associated with the plurality of sequences of sampled 3D geometries in the training dataset; or interpolating between two or more capture codes included in the plurality of capture codes.

5. The computer-implemented method of claim 1, further comprising receiving the one or more input 3D geometries as one or more sets of blendshape weights.

6. The computer-implemented method of claim 1, wherein converting the one or more input 3D geometries into the one or more latent vectors comprises: generating one or more input representations based on the one or more input 3D geometries and one or more position encodings that are included in the plurality of position encodings and represent one or more time steps corresponding to the one or more frames within the animation; and applying a series of one or more encoder blocks included in the encoder neural network to the one or more input representations to generate the one or more latent vectors.

7. The computer-implemented method of claim 6, wherein the one or more encoder blocks comprise a self-attention layer, an addition and normalization layer, and a feed-forward layer.

8. The computer-implemented method of claim 1, wherein generating the sequence of 3D geometries comprises: generating, via a self-attention layer included in the decoder neural network, a first plurality of outputs based on relative distances between pairs of position-encoded representations of the capture code included in the plurality of position-encoded representations of the capture code; and applying an encoder-decoder attention layer included in the decoder neural network to the first plurality of outputs and the one or more latent vectors to generate a second plurality of outputs; and generating the sequence of 3D geometries based on the second plurality of outputs.

9. The computer-implemented method of claim 8, wherein the decoder neural network further comprises an addition and normalization layer and a feed-forward layer.

10. The computer-implemented method of claim 1, wherein the animation comprises at least one of a facial performance or a full-body performance.

11. One or more non-transitory computer readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of: converting, via an encoder neural network, one or more input three-dimensional (3D) geometries corresponding to one or more frames within an animation into one or more latent vectors; combining (i) a capture code that represents one or more attributes of a sequence of 3D geometries with (ii) a plurality of position encodings that represent a plurality of time steps within the animation to produce a plurality of position-encoded representations of the capture code; generating, via a decoder neural network, the sequence of 3D geometries based on input that includes (i) the one or more latent vectors and (ii) the plurality of position-encoded representations of the capture code, wherein each 3D geometry included in the sequence of 3D geometries corresponds to (i) a different time step included in the plurality of time steps and (ii) a different frame included in a sequence of frames within the animation; and causing output related to the animation to be generated based on the sequence of 3D geometries.

12. The one or more non-transitory computer readable media of claim 11, wherein the instructions further cause the one or more processors to perform the step of training the encoder neural network and the decoder neural network based on a training dataset that includes a plurality of sequences of sampled 3D geometries and a discriminator neural network, wherein each sequence of sampled 3D geometries included in the plurality of sequences of sampled 3D geometries comprises a plurality of sampled 3D geometries corresponding to a plurality of time steps within a geometric representation of one or more movements.

13. The one or more non-transitory computer readable media of claim 12, wherein the instructions further cause the one or more processors to perform the step of determining the capture code based on one or more capture codes included in a plurality of capture codes associated with the plurality of sequences of sampled 3D geometries.

14. The one or more non-transitory computer readable media of claim 13, wherein determining the capture code comprises at least one of: selecting the capture code from the plurality of capture codes; or interpolating between two or more capture codes included in the plurality of capture codes.

15. The one or more non-transitory computer readable media of claim 12, wherein the encoder neural network and the decoder neural network are included in a transformer neural network.

16. The one or more non-transitory computer readable media of claim 11, wherein converting the one or more input 3D geometries into the one or more latent vectors comprises: generating one or more input representations based on a combination of the one or more input 3D geometries with one or more position encodings that are included in the plurality of position encodings and represent one or more time steps corresponding to the one or more frames within the animation; and applying a series of one or more encoder blocks to the one or more input representations to generate the one or more latent vectors.

17. The one or more non-transitory computer readable media of claim 11, wherein generating the sequence of 3D
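Two steps in the claims lend themselves to a short illustration: determining a capture code by interpolating between existing capture codes (claims 4 and 14), and combining that capture code with position encodings to get one position-encoded representation per time step (claims 1 and 11). The sketch below is hedged throughout. The claims do not say how the combination is performed or which position encoding is used, so element-wise addition and the standard sinusoidal transformer encoding are assumptions, as are all function names, the four-dimensional codes, and the linear interpolation.

```python
import math


def position_encoding(step, dim):
    """Sinusoidal encoding of one time step (assumed; standard transformer form)."""
    enc = []
    for i in range(dim):
        angle = step / (10000 ** (2 * (i // 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def interpolate_codes(code_a, code_b, t):
    """Linear interpolation between two capture codes, with t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(code_a, code_b)]


def position_encode_capture_code(capture_code, num_steps):
    """Return one position-encoded copy of the capture code per time step.

    Combination by element-wise addition is an assumption of this sketch.
    """
    dim = len(capture_code)
    return [
        [c + p for c, p in zip(capture_code, position_encoding(step, dim))]
        for step in range(num_steps)
    ]


# Hypothetical capture codes, e.g. learned for two training sequences.
code_a = [0.0, 1.0, 0.0, 1.0]
code_b = [1.0, 0.0, 1.0, 0.0]

blended = interpolate_codes(code_a, code_b, 0.25)
decoder_inputs = position_encode_capture_code(blended, num_steps=5)

for step, vec in enumerate(decoder_inputs):
    print(step, [round(x, 3) for x in vec])
```

Each row shares the same capture code but differs in its position encoding, which is what lets a decoder produce a different geometry for each time step from a single code.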

Assignees

Disney Entpr Inc, Eth Zuerich Eidgenoessische Technische Hochschule Zuerich

Inventors

Classifications

  • Combinations of networks

  • Geometric image transformations in the plane of the image

  • Non-supervised learning, e.g. competitive learning

  • G06T9/002 (Primary): using neural networks

  • Artificial neural networks [ANN]

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12488524B2 cover?
A technique for generating a sequence of geometries includes converting, via an encoder neural network, one or more input geometries corresponding to one or more frames within an animation into one or more latent vectors. The technique also includes generating the sequence of geometries corresponding to a sequence of frames within the animation based on the one or more latent vectors. The techn…
Who is the assignee on this patent?
Disney Entpr Inc, Eth Zuerich Eidgenoessische Technische Hochschule Zuerich
What technology area does this patent fall under?
Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 2, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).