Transformer-based shape models

US12198225B2 · US · B2

Patent metadata
Publication number: US-12198225-B2
Application number: US-202217675713-A
Country: US
Kind code: B2
Filing date: Feb 18, 2022
Priority date: Oct 1, 2021
Publication date: Jan 14, 2025
Grant date: Jan 14, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique for synthesizing a shape includes generating a first plurality of offset tokens based on a first shape code and a first plurality of position tokens, wherein the first shape code represents a variation of a canonical shape, and wherein the first plurality of position tokens represent a first plurality of positions on the canonical shape. The technique also includes generating a first plurality of offsets associated with the first plurality of positions on the canonical shape based on the first plurality of offset tokens. The technique further includes generating the shape based on the first plurality of offsets and the first plurality of positions.
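
The data flow in the abstract (positions to position tokens, shape code plus position tokens to offset tokens, offset tokens to offsets, offsets plus positions to the shape) can be sketched as follows. This is a minimal illustration, not the patented implementation: the layer sizes, the random weights standing in for trained layers, the scale-and-shift modulation, and the choice to add offsets to positions are all assumptions made for the example, and `init_layer` and `synthesize_shape` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)
TOKEN_DIM, CODE_DIM = 16, 8  # hypothetical sizes, not taken from the patent


def init_layer(in_dim, out_dim):
    """Random weights standing in for a trained layer (assumption)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


params = {
    "embed": init_layer(3, TOKEN_DIM),         # position -> position token
    "scale": init_layer(CODE_DIM, TOKEN_DIM),  # shape code -> per-channel scale
    "shift": init_layer(CODE_DIM, TOKEN_DIM),  # shape code -> per-channel shift
    "head": init_layer(TOKEN_DIM, 3),          # offset token -> 3D offset
}


def synthesize_shape(positions, shape_code):
    """Toy decoder: returns (shape, offsets) for N canonical positions."""
    # 1. Position tokens: a latent representation of each canonical position.
    w, b = params["embed"]
    position_tokens = np.tanh(positions @ w + b)             # (N, TOKEN_DIM)

    # 2. Offset tokens: position tokens modulated by the shape code.
    #    A scale-and-shift modulation is assumed here for simplicity.
    w, b = params["scale"]
    scale = shape_code @ w + b                               # (TOKEN_DIM,)
    w, b = params["shift"]
    shift = shape_code @ w + b                               # (TOKEN_DIM,)
    offset_tokens = position_tokens * (1.0 + scale) + shift  # (N, TOKEN_DIM)

    # 3. Offsets: one 3D displacement per canonical position.
    w, b = params["head"]
    offsets = offset_tokens @ w + b                          # (N, 3)

    # 4. Shape: canonical positions displaced by their offsets (assumed additive).
    return positions + offsets, offsets


# Usage: the same canonical positions decoded with two different shape codes.
positions = rng.uniform(-1.0, 1.0, size=(5, 3))
shape_a, offsets_a = synthesize_shape(positions, rng.normal(size=CODE_DIM))
shape_b, offsets_b = synthesize_shape(positions, rng.normal(size=CODE_DIM))
```

Because each position is decoded independently of how many positions are supplied, the same sketch works for any sampling of the canonical shape; only the shape code changes which variation is produced.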

First claim

Full claim text as published. Claims 1, 11, and 20 are independent; the remaining claims depend on them.

What is claimed is:

1. A computer-implemented method for synthesizing a shape, the computer-implemented method comprising: generating a first plurality of offset tokens represented in a latent space based on a first shape code and a first plurality of position tokens, wherein the first shape code represents a variation of a canonical shape, and wherein the first plurality of position tokens represent a first plurality of positions on the canonical shape; generating a first plurality of offsets associated with the first plurality of positions on the canonical shape based on the first plurality of offset tokens; and generating the shape based on the first plurality of offsets and the first plurality of positions.

2. The computer-implemented method of claim 1, further comprising: executing an encoder neural network that generates a second shape code based on a second plurality of offset tokens associated with a training shape; and updating one or more parameters of the encoder neural network and a decoder neural network, wherein the decoder neural network generates the first plurality of offset tokens, and wherein the one or more parameters are updated based on a loss between a plurality of ground truth offsets associated with the second plurality of offset tokens and a second plurality of offsets outputted by the decoder neural network based on the second shape code.

3. The computer-implemented method of claim 2, wherein executing the encoder neural network comprises: for each offset token included in the second plurality of offset tokens, inputting a concatenation of the offset token with a corresponding position token into the encoder neural network; and inputting a shape token associated with the second shape code into the encoder neural network.

4. The computer-implemented method of claim 2, wherein the first shape code is randomly generated, interpolated between the second shape code and one or more shape codes generated by the encoder neural network based on one or more training shapes, or selected from the one or more training shapes.

5. The computer-implemented method of claim 1, wherein the first plurality of offset tokens are generated by one or more neural network layers.

6. The computer-implemented method of claim 5, wherein the one or more neural network layers modulate the first plurality of offset tokens based on the first shape code.

7. The computer-implemented method of claim 5, wherein the one or more neural network layers comprise a cross-covariance attention layer.

8. The computer-implemented method of claim 1, further comprising generating the first shape code based on an identity code that represents an identity associated with the shape and an expression code that represents an expression associated with the shape.

9. The computer-implemented method of claim 1, further comprising generating the first plurality of position tokens as a plurality of latent representations of the first plurality of positions on the canonical shape.

10. The computer-implemented method of claim 1, wherein the first plurality of offset tokens are converted into the first plurality of offsets via one or more neural network layers.

11. One or more non-transitory computer readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of: generating a first plurality of offset tokens represented in a latent space based on a first shape code and a first plurality of position tokens, wherein the first shape code represents a variation of a canonical shape, and wherein the first plurality of position tokens represent a first plurality of positions on the canonical shape; generating a first plurality of offsets associated with the first plurality of positions on the canonical shape based on the first plurality of offset tokens; and generating a shape based on the first plurality of offsets and the first plurality of positions.

12. The one or more non-transitory computer readable media of claim 11, wherein the instructions further cause the one or more processors to perform the steps of: executing an encoder neural network that generates a second shape code based on a second plurality of offset tokens and a second plurality of position tokens associated with a training shape; and updating one or more parameters of the encoder neural network and a decoder neural network, wherein the decoder neural network generates the first plurality of offset tokens, and wherein the one or more parameters are updated based on a loss between a plurality of ground truth offsets associated with the second plurality of offset tokens and a second plurality of offsets outputted by the decoder neural network based on the second shape code.

13. The one or more non-transitory computer readable media of claim 12, wherein the encoder neural network comprises a sequence of transformer blocks.

14. The one or more non-transitory computer readable media of claim 12, wherein the first shape code is generated by the encoder neural network based on a third plurality of offset tokens and a third plurality of position tokens associated with a first portion of the shape.

15. The one or more non-transitory computer readable media of claim 12, wherein the second plurality of position tokens represent a second plurality of positions on the canonical shape, and wherein the second plurality of positions is different from the first plurality of positions.

16. The one or more non-transitory computer readable media of claim 11, further comprising iteratively updating the first shape code based on a loss between the shape and a target shape.

17. The one or more non-transitory computer readable media of claim 11, wherein generating the first plurality of offset tokens comprises: modulating the first plurality of offset tokens based on the first shape code; generating a first plurality of output tokens based on the modulated first plurality of offset tokens; modulating the first plurality of output tokens based on a second shape code that is different from the first shape code; and generating the first plurality of offset tokens based on the modulated first plurality of output tokens.

18. The one or more non-transitory computer readable media of claim 11, wherein the instructions further cause the one or more processors to perform the step of sampling the first plurality of positions from a continuous surface representing the canonical shape.

19. The one or more non-transitory computer readable media of claim 11, wherein the canonical shape comprises at least one of a face, a hand, or a body.

20. A system, comprising: one or more memories that store instructions, and one or more processors that are coupled to the one or more memories and, when executing the instructions, are configured to: generate a first plurality of offset tokens represented in a latent space based on a first shape code and a first plurality of position tokens, wherein the first shape code represents a variation of a canonical shape, and wherein the first plurality of position tokens represent a first plurality of positions on the canonical shape; generate a first plurality of offsets associated with the first plurality of positions on the canonical shape based on the first plurality of offset tokens; and generate a shape based on the first plurality of offsets and the first plurality of positions.
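
Claim 16 recites iteratively updating the shape code based on a loss between the generated shape and a target shape. The sketch below illustrates that idea with plain gradient descent on a mean squared error. It is an assumption-laden toy, not the claimed system: the decoder is replaced by a hypothetical linear map (`basis`) so the gradient can be written by hand, and the step count, learning rate, and the names `decode` and `fit_shape_code` are chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_POINTS, CODE_DIM = 50, 4  # hypothetical sizes

# Hypothetical stand-in for a trained decoder: a linear map from a shape code
# to one 3D offset per canonical position. The patent describes a neural
# network decoder; a linear map is used here only to keep the gradient simple.
basis = rng.normal(size=(NUM_POINTS * 3, CODE_DIM))
positions = rng.uniform(-1.0, 1.0, size=(NUM_POINTS, 3))


def decode(code):
    """Shape = canonical positions plus the offsets decoded from the code."""
    offsets = (basis @ code).reshape(NUM_POINTS, 3)
    return positions + offsets


def fit_shape_code(target, steps=200, lr=0.1):
    """Iteratively update a shape code to reduce the MSE to a target shape."""
    code = np.zeros(CODE_DIM)
    for _ in range(steps):
        residual = (decode(code) - target).ravel()
        # Gradient of mean(residual ** 2) with respect to the code.
        grad = 2.0 * basis.T @ residual / residual.size
        code -= lr * grad
    return code


# Usage: recover the code behind a target shape that this decoder can represent.
true_code = rng.normal(size=CODE_DIM)
target = decode(true_code)
fitted_code = fit_shape_code(target)
final_loss = np.mean((decode(fitted_code) - target) ** 2)
```

With a neural network decoder the gradient would normally come from automatic differentiation rather than a closed-form expression, but the loop structure (decode, compare with the target, update the code) stays the same.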

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12198225B2 cover?
A technique for synthesizing a shape includes generating a first plurality of offset tokens based on a first shape code and a first plurality of position tokens, wherein the first shape code represents a variation of a canonical shape, and wherein the first plurality of position tokens represent a first plurality of positions on the canonical shape. The technique also includes generating a firs…
Who is the assignee on this patent?
Disney Enterprises, Inc. and ETH Zürich (Eidgenössische Technische Hochschule Zürich)
What technology area does this patent fall under?
Primary CPC classification G06T11/00. Mapped technology areas include Physics.
When was this patent published?
Published Jan 14, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).