Appearance synthesis of digital faces

US11257276B2 · US · B2

Patent metadata
Publication number: US-11257276-B2
Application number: US-202016895734-A
Country: US
Kind code: B2
Filing date: Jun 8, 2020
Priority date: Mar 5, 2020
Publication date: Feb 22, 2022
Grant date: Feb 22, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for generating digital faces. In some examples, a style-based generator receives as inputs initial tensor(s) and style vector(s) corresponding to user-selected semantic attribute styles, such as the desired expression, gender, age, identity, and/or ethnicity of a digital face. The style-based generator is trained to process such inputs and output low-resolution appearance map(s) for the digital face, such as a texture map, a normal map, and/or a specular roughness map. The low-resolution appearance map(s) are further processed using a super-resolution generator that is trained to take the low-resolution appearance map(s) and low-resolution 3D geometry of the digital face as inputs and output high-resolution appearance map(s) that align with high-resolution 3D geometry of the digital face. Such high-resolution appearance map(s) and high-resolution 3D geometry can then be used to render standalone images or the frames of a video that include the digital face.
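
The joint upsampling step described in the abstract can be illustrated with a toy NumPy sketch. This is not the patented super-resolution generator: nearest-neighbour upsampling stands in for the trained model, the geometry is assumed to be stored as a UV-space position map on the same grid as the texture, and the function name `joint_upsample`, the 64x64 input size, and the 4x factor are hypothetical choices made for illustration only. The point it shows is that stacking appearance and geometry before upsampling keeps the two outputs pixel-aligned.

```python
import numpy as np


def joint_upsample(appearance_lr, geometry_lr, factor=4):
    """Toy stand-in for a super-resolution generator (assumption, not the patent's model).

    Both inputs are (H, W, C) arrays on the same low-resolution grid.
    They are stacked along the channel axis and upsampled together,
    so the high-resolution outputs stay aligned with each other.
    """
    if appearance_lr.shape[:2] != geometry_lr.shape[:2]:
        raise ValueError("appearance and geometry must share the same low-res grid")
    stacked = np.concatenate([appearance_lr, geometry_lr], axis=-1)
    # Nearest-neighbour upsampling along both spatial axes.
    upsampled = stacked.repeat(factor, axis=0).repeat(factor, axis=1)
    channels = appearance_lr.shape[-1]
    return upsampled[..., :channels], upsampled[..., channels:]


rng = np.random.default_rng(0)
texture_lr = rng.random((64, 64, 3))   # hypothetical low-res texture map
geometry_lr = rng.random((64, 64, 3))  # hypothetical low-res position map (x, y, z)

texture_hr, geometry_hr = joint_upsample(texture_lr, geometry_lr, factor=4)
print(texture_hr.shape, geometry_hr.shape)  # (256, 256, 3) (256, 256, 3)
```

A trained generator would add detail rather than copy pixels, but the input and output shapes would relate in the same way.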

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for rendering one or more images of a digital face, the method comprising: generating, via a first machine learning model, one or more first appearance maps based on a user selection of one or more styles associated with one or more attributes of a digital face; generating, via a second machine learning model, one or more second appearance maps and a first three-dimensional (3D) geometry associated with the digital face based on jointly upsampling the one or more first appearance maps and a second 3D geometry associated with the digital face, wherein the one or more second appearance maps are aligned with the first 3D geometry; and rendering one or more images including the digital face based on the one or more second appearance maps and the first 3D geometry.

2. The computer-implemented method of claim 1, wherein the one or more first appearance maps have a lower resolution than the one or more second appearance maps, and the second 3D geometry has a lower resolution than the first 3D geometry.

3. The computer-implemented method of claim 1, wherein generating the one or more first appearance maps based on the user selection of one or more styles comprises controlling one or more adaptive instance normalization (AdaIN) operations based on the user selection of one or more styles.

4. The computer-implemented method of claim 3, wherein the one or more AdaIN operations are performed in conjunction with one or more convolution operations.

5. The computer-implemented method of claim 4, wherein the first machine learning model comprises a plurality of semantics transfer blocks, and performing the one or more AdaIN operations in conjunction with the one or more convolution operations comprises, for each semantics transfer block included in the plurality of semantics transfer blocks: performing multiple sets of convolution operations and AdaIN operations in parallel; determining a weighted sum of outputs of the multiple sets of convolution operations and AdaIN operations; and upscaling the weighted sum of outputs.

6. The computer-implemented method of claim 1, wherein the one or more second appearance maps include at least one of a texture map, a normal map, or a specular roughness map.

7. The computer-implemented method of claim 1, wherein the one or more attributes of the digital face include at least one of expression, gender, age, identity, or ethnicity.

8. The computer-implemented method of claim 1, wherein the second machine learning model comprises a super-resolution generator.

9. The computer-implemented method of claim 1, wherein the first 3D geometry associated with the digital face conveys a non-neutral facial expression.

10. A non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to perform steps for rendering one or more images of a digital face, the steps comprising: generating, via a first machine learning model, one or more first appearance maps based on a user selection of one or more styles associated with one or more attributes of a digital face; generating, via a second machine learning model, one or more second appearance maps and a first three-dimensional (3D) geometry associated with the digital face based on jointly upsampling the one or more first appearance maps and a second 3D geometry associated with the digital face, wherein the one or more second appearance maps are aligned with the first 3D geometry; and rendering one or more images including the digital face based on the one or more second appearance maps and the first 3D geometry.

11. The computer-readable storage medium of claim 10, wherein generating the one or more first appearance maps based on the user selection of one or more styles comprises controlling one or more adaptive instance normalization (AdaIN) operations based on the user selection of one or more styles.

12. The computer-readable storage medium of claim 11, wherein the first machine learning model comprises a plurality of semantics transfer blocks, and generating the one or more first appearance maps based on the user selection of one or more styles includes, for each semantics transfer block included in the plurality of semantics transfer blocks: performing multiple sets of convolution operations and AdaIN operations in parallel; determining a weighted sum of outputs of the multiple sets of convolution operations and AdaIN operations; and upscaling the weighted sum of outputs.

13. The computer-readable storage medium of claim 12, wherein one or more weights used in the weighted sum of outputs, the multiple sets of convolution operations and AdaIN operations, one or more initial tensors, and one or more style vectors associated with the one or more attributes of the digital face are determined while training the first machine learning model.

14. The computer-readable storage medium of claim 10, wherein the first machine learning model is trained using a progressive training technique.

15. The computer-readable storage medium of claim 10, wherein the second machine learning model is trained using ground truth and adversarial learning techniques.

16. The computer-readable storage medium of claim 10, wherein the one or more first appearance maps have a lower resolution than the one or more second appearance maps, and the second 3D geometry has a lower resolution than the first 3D geometry.

17. The computer-readable storage medium of claim 10, wherein the one or more second appearance maps include at least one of a texture map, a normal map, or a specular roughness map.

18. The computer-readable storage medium of claim 10, wherein the one or more attributes of the digital face include at least one of expression, gender, age, identity, or ethnicity.

19. The computer-readable storage medium of claim 10, wherein the second machine learning model comprises a super-resolution generator.

20. A computing device comprising: a memory storing an application; and a processor coupled to the memory, wherein when executed by the processor, the application causes the processor to: generate, via a first machine learning model, one or more first appearance maps based on a user selection of one or more styles associated with one or more attributes of a digital face; generate, via a second machine learning model, one or more second appearance maps and a first three-dimensional (3D) geometry associated with the digital face based on jointly upsampling the one or more first appearance maps and a second 3D geometry associated with the digital face, wherein the one or more second appearance maps are aligned with the first 3D geometry; and render one or more images including the digital face based on the one or more second appearance maps and the first 3D geometry.
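
The AdaIN operation of claim 3 and the semantics transfer block of claims 5 and 12 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented model: the convolution is reduced to a 1x1 convolution, the upscaling is nearest-neighbour, each style is passed directly as a per-channel scale and bias (a real style-based generator would typically derive these from a style vector through learned layers), and the names `adain`, `conv1x1`, and `semantics_transfer_block` are hypothetical. All weights here are random, whereas claim 13 states that such quantities are determined during training.

```python
import numpy as np


def adain(x, scale, bias, eps=1e-5):
    """Adaptive instance normalization on a (C, H, W) feature map.

    Each channel is normalized over its spatial extent, then rescaled
    and shifted by the per-channel style parameters (shape (C,)).
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    normalized = (x - mean) / (std + eps)
    return scale[:, None, None] * normalized + bias[:, None, None]


def conv1x1(x, weight):
    """1x1 convolution (assumed for brevity): weight is (C_out, C_in), x is (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", weight, x)


def semantics_transfer_block(x, conv_weights, styles, mix_weights, factor=2):
    """Parallel conv + AdaIN branches, weighted sum, then upscaling."""
    # One branch per (convolution, style) pair, all fed the same input.
    branches = [
        adain(conv1x1(x, w), scale, bias)
        for w, (scale, bias) in zip(conv_weights, styles)
    ]
    # Weighted sum of the branch outputs.
    mixed = sum(m * branch for m, branch in zip(mix_weights, branches))
    # Nearest-neighbour upscaling of the spatial axes (assumed method).
    return mixed.repeat(factor, axis=1).repeat(factor, axis=2)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))  # hypothetical 8-channel, 4x4 input tensor
conv_weights = [rng.standard_normal((8, 8)) for _ in range(2)]
styles = [
    (np.ones(8), np.zeros(8)),          # hypothetical style A: scale, bias
    (np.full(8, 2.0), np.full(8, 0.5)), # hypothetical style B: scale, bias
]
y = semantics_transfer_block(x, conv_weights, styles, mix_weights=[0.7, 0.3])
print(y.shape)  # (8, 8, 8)
```

Changing `mix_weights` shifts the output between the two styles, which is one way to read the weighted sum in the claims; how the patent actually chooses or learns those weights is not shown on this page beyond claim 13.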

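Assignees

Disney Entpr Inc, Eth Zuerich Eidgenoessische Technische Hochschule Zuerich, Eth Zurich Eidgenoessische Technische Hochschule Zuerich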

Inventors

Classifications

  • Probabilistic or stochastic networks

  • Combinations of networks

  • Generative networks

  • Supervised learning

  • Adversarial learning

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11257276B2 cover?
Techniques are disclosed for generating digital faces. In some examples, a style-based generator receives as inputs initial tensor(s) and style vector(s) corresponding to user-selected semantic attribute styles, such as the desired expression, gender, age, identity, and/or ethnicity of a digital face. The style-based generator is trained to process such inputs and output low-resolution appearan…
Who is the assignee on this patent?
Disney Entpr Inc, Eth Zuerich Eidgenoessische Technische Hochschule Zuerich, Eth Zurich Eidgenoessische Technische Hochschule Zuerich
What technology area does this patent fall under?
Primary CPC classification G06T17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 22, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).