Few-shot synthesis of talking heads
US-2022130111-A1 · Apr 28, 2022 · US
US12165260B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12165260-B2 |
| Application number | US-202217715646-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 7, 2022 |
| Priority date | Apr 7, 2022 |
| Publication date | Dec 10, 2024 |
| Grant date | Dec 10, 2024 |
Official abstract text for this publication.
Systems and methods are described for rendering garments. The system includes a first machine learning model trained to generate coarse garment templates of a garment and a second machine learning model trained to render garment images. The first machine learning model generates a coarse garment template based on position data. The system produces a neural texture for the garment, the neural texture comprising a multi-dimensional feature map characterizing detail of the garment. The system provides the coarse garment template and the neural texture to the second machine learning model trained to render garment images. The second machine learning model generates a rendered garment image of the garment based on the coarse garment template of the garment and the neural texture.
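The two-stage data flow in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `CoarseTemplate`, `NeuralTexture`, `render_garment`, and the two toy models are hypothetical stand-ins, and the data layouts (lists of 3D points, an H x W grid of feature vectors) are assumptions made only to keep the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vec = List[float]


@dataclass
class CoarseTemplate:
    """Hypothetical container: garment geometry without patterns or textures."""
    points: List[Vec]


@dataclass
class NeuralTexture:
    """Hypothetical container: H x W grid of C-dimensional feature vectors."""
    features: List[List[Vec]]


# Assumed call signatures for the two trained models named in the abstract.
TemplateModel = Callable[[Sequence[Vec]], CoarseTemplate]
RenderModel = Callable[[CoarseTemplate, NeuralTexture], List[List[Vec]]]


def render_garment(
    position_data: Sequence[Vec],
    template_model: TemplateModel,
    neural_texture: NeuralTexture,
    render_model: RenderModel,
) -> List[List[Vec]]:
    # Stage 1: the first model maps position data to a coarse garment template.
    coarse = template_model(position_data)
    # Stage 2: the second model renders an image from the template and texture.
    return render_model(coarse, neural_texture)


def toy_template_model(position_data: Sequence[Vec]) -> CoarseTemplate:
    # Stand-in for a trained model: copies the input points unchanged.
    return CoarseTemplate(points=[list(p) for p in position_data])


def toy_render_model(template: CoarseTemplate, texture: NeuralTexture) -> List[List[Vec]]:
    # Stand-in for a trained renderer: scales each texture feature by the
    # number of template points. The arithmetic is arbitrary and only shows
    # that both inputs reach the second stage.
    n = float(len(template.points))
    return [[[n * c for c in px] for px in row] for row in texture.features]


texture = NeuralTexture(features=[[[0.5, 0.25]]])
image = render_garment(
    [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]],
    toy_template_model,
    texture,
    toy_render_model,
)
```

Keeping the template model and the renderer as separate callables mirrors the split the abstract describes: garment dynamics are handled in the first stage, and detail is carried by the neural texture into the second.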
Opening claim text (preview).
The invention claimed is:
1. A method comprising: providing, to a first machine learning model trained to generate coarse garment templates of a garment, position data; generating, by the first machine learning model, a coarse garment template of the garment based on the position data, wherein the coarse garment template is an intermediate representation that reflects dynamics of the garment in motion with removal of garment details, wherein the garment details comprise patterns and/or textures of the garment; producing a neural texture for the garment, the neural texture comprising a multi-dimensional feature map characterizing garment details; providing the coarse garment template and the neural texture to a second machine learning model trained to render garment images; and generating, by the second machine learning model, a rendered garment image of the garment based on the coarse garment template of the garment and the neural texture.
2. The method of claim 1, wherein the second machine learning model comprises an encoder, a decoder, and a Spatially Adaptive Normalization (SPADE) block, and wherein generating the rendered garment image by the second machine learning model comprises: generating, by the encoder, a first latent space representation of a first coarse garment template at a first time; generating, by the encoder, a second latent space representation of a second coarse garment template at a second time; providing the first latent space representation of the first coarse garment template and the second latent space representation of the second coarse garment template to the SPADE block; generating, by the SPADE block, a normalized latent space representation of the coarse garment template; and generating, by the decoder based on the normalized latent space representation of the garment generated by the SPADE block, the rendered garment image.
3. The method of claim 2, wherein the encoder comprises a set of convolutional layers that encode the neural texture into latent space representations of consecutive frames.
4. The method of claim 1, further comprising: generating a motion feature image representing three-dimensional (3D) positions of points on a surface of the garment; and concatenating the motion feature image with the neural texture to form a neural texture map, wherein the neural texture map is used to generate the rendered garment image.
5. The method of claim 1, wherein generating the rendered garment image further comprises: generating, by the second machine learning model, a feature image for the garment; providing a background image to the second machine learning model; generating a mask; and blending, by the second machine learning model, the feature image and the background image using the mask to generate the rendered garment image.
6. The method of claim 1, wherein the second machine learning model comprises a decoder neural network comprising a plurality of weights, the method further comprising training the decoder neural network by performing multiple iterations of a training procedure to minimize a loss function to update values of the weights of the decoder neural network.
7. The method of claim 6, wherein the decoder neural network is part of a generative adversarial network comprising a discriminator with multiple layers, and wherein the loss function comprises: a perceptual loss based on a training image of the garment and a generated image of the garment; and an adversarial loss based on features extracted from the multiple layers of the discriminator.
8. A system comprising: a memory component; and a processing device coupled to the memory component, the processing device to perform operations comprising: generating, by a first machine learning model trained to generate coarse garment templates of a garment, a coarse garment template based on position data, wherein the coarse garment template is an intermediate representation that reflects dynamics of the garment in motion with removal of garment details, wherein the garment details comprise patterns and/or textures of the garment; producing a neural texture for the garment, the neural texture comprising a multi-dimensional feature map characterizing garment details; providing the coarse garment template and the neural texture to a second machine learning model trained to render garment images; and generating, by the second machine learning model, a rendered garment image of the garment based on the coarse garment template of the garment and the neural texture.
9. The system of claim 8, wherein the second machine learning model comprises an encoder, a decoder, and a Spatially Adaptive Normalization (SPADE) block, and wherein generating the rendered garment image by the second machine learning model comprises: generating, by the encoder, a first latent space representation of a first coarse garment template at a first time; generating, by the encoder, a second latent space representation of a second coarse garment template at a second time; providing the first latent space representation of the first coarse garment template and the second latent space representation of the second coarse garment template to the SPADE block; generating, by the SPADE block, a normalized latent space representation of the coarse garment template; and generating, by the decoder based on the normalized latent space representation of the garment generated by the SPADE block, the rendered garment image.
10. The system of claim 9, wherein the encoder comprises a set of convolutional layers that encode the neural texture into latent space representations of consecutive frames.
11. The system of claim 8, the operations further comprising: generating a motion feature image representing three-dimensional (3D) positions of points on a surface of the garment; and concatenating the motion feature image with the neural texture to form a neural texture map, wherein the neural texture map is used to generate the rendered garment image.
12. The system of claim 8, wherein generating, by the second machine learning model, a feature image for the garment; providing a background image to the second machine learning model; generating a mask; and blending, by the second machine learning model, the feature image and the background image using the mask to generate the rendered garment image.
13. The system of claim 8, wherein the second machine learning model comprises a decoder neural network comprising a plurality of weights, the operations further comprising training the decoder neural network by performing multiple iterations of a training procedure to minimize a loss function to update values of the weights of the decoder neural network.
14. The system of claim 13, wherein the decoder neural network is part of a generative adversarial network comprising a discriminator with multiple layers, and wherein the loss function comprises: a perceptual loss based on a training image of the garment and a generated image of the garment; and an adversarial loss based on features extracted from the multiple layers of the discriminator.
15. A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations comprising: providing, to a first machine learning model trained to generate coarse garment templates of a garment, position data; generating, by the first machine learning model, a coarse garment template based on the position data, wherein the coarse garment template is an intermediate representation that reflec
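Claims 4 and 5 name two image operations, concatenating a motion feature image with the neural texture and blending a feature image with a background using a mask, without giving formulas. The sketch below shows one common reading of each. It assumes per-pixel channel concatenation and linear (alpha-style) blending with mask values in [0, 1]; both choices, and the function names `concat_channels` and `blend_with_mask`, are illustrative assumptions rather than details taken from the claims.

```python
from typing import List

Pixel = List[float]
Image = List[List[Pixel]]  # H x W x C, nested lists
Mask = List[List[float]]   # H x W, assumed to hold values in [0, 1]


def concat_channels(motion_features: Image, neural_texture: Image) -> Image:
    """Append the texture channels to the motion channels at each pixel."""
    if len(motion_features) != len(neural_texture):
        raise ValueError("images must have the same height")
    out: Image = []
    for m_row, t_row in zip(motion_features, neural_texture):
        if len(m_row) != len(t_row):
            raise ValueError("images must have the same width")
        out.append([m_px + t_px for m_px, t_px in zip(m_row, t_row)])
    return out


def blend_with_mask(feature: Image, background: Image, mask: Mask) -> Image:
    """Per pixel: mask * feature + (1 - mask) * background (assumed formula)."""
    if not (len(feature) == len(background) == len(mask)):
        raise ValueError("inputs must have the same height")
    out: Image = []
    for f_row, b_row, m_row in zip(feature, background, mask):
        if not (len(f_row) == len(b_row) == len(m_row)):
            raise ValueError("inputs must have the same width")
        out.append([
            [m * f + (1.0 - m) * b for f, b in zip(f_px, b_px)]
            for f_px, b_px, m in zip(f_row, b_row, m_row)
        ])
    return out


# One pixel with 3 motion channels (a 3D position) and 2 texture channels.
texture_map = concat_channels([[[0.0, 0.0, 1.0]]], [[[0.5, 0.25]]])

# Two pixels: the first fully garment, the second mostly background.
blended = blend_with_mask(
    feature=[[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]],
    background=[[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]],
    mask=[[1.0, 0.25]],
)
```

In this reading, a mask value of 1 keeps the garment feature pixel, 0 keeps the background pixel, and intermediate values mix the two, which gives soft edges at the garment boundary.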
Shifting the patterns to accommodate for positional errors · CPC title
Training; Learning · CPC title
Artificial neural networks [ANN] · CPC title
Cloth · CPC title
Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title