Shape and appearance reconstruction with deep geometric refinement
US-2023252714-A1 · Aug 10, 2023 · US
US12561883B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-12561883-B1 |
| Application number | US-202318174477-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 24, 2023 |
| Priority date | Feb 24, 2022 |
| Publication date | Feb 24, 2026 |
| Grant date | Feb 24, 2026 |
Official abstract text for this publication.
Rendering an avatar for a user in a communication session includes obtaining enrollment data associated with the user. For each of one or more frames of the communication session, a set of expression latents is obtained for the user. A first texture for a first portion of the face of the user is generated based on the enrollment data and expression latents. A combined albedo map is generated based on one or more identity textures from the enrollment data and the first texture. A target texture is generated based on the one or more identity textures and the combined albedo map for a particular frame of the one or more frames.
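The abstract says a combined albedo map is generated from an identity texture and a region-specific texture, but it does not say how the two are combined. The following is a minimal sketch of one plausible approach, a per-texel masked linear blend. The function name `combine_albedo`, the `region_mask` parameter, and the single-channel nested-list texture layout are all assumptions made for illustration and do not come from the publication.

```python
from typing import List

# Hypothetical texture layout: a grid of single-channel texel values in [0, 1].
# A real albedo texture would typically carry RGB channels.
Texture = List[List[float]]


def combine_albedo(identity_albedo: Texture,
                   region_texture: Texture,
                   region_mask: Texture) -> Texture:
    """Blend a region-specific texture into an identity albedo texture.

    Assumed blending rule (not taken from the patent): a mask weight of 1.0
    takes the region texture, 0.0 keeps the identity texture, and values in
    between interpolate linearly.
    """
    rows = len(identity_albedo)
    cols = len(identity_albedo[0])
    combined: Texture = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            w = region_mask[r][c]
            out_row.append((1.0 - w) * identity_albedo[r][c]
                           + w * region_texture[r][c])
        combined.append(out_row)
    return combined


# Usage: a 2x2 identity texture with a brighter predicted region texture.
identity = [[0.2, 0.2], [0.2, 0.2]]
region = [[0.8, 0.8], [0.8, 0.8]]
mask = [[1.0, 0.0], [0.5, 0.0]]
combined_albedo = combine_albedo(identity, region, mask)
```

A soft mask is used here only so that the seam between the predicted region and the rest of the face can fade gradually; the publication may combine the textures in a different way.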
Opening claim text (preview).
The invention claimed is:

1. A method comprising: obtaining enrollment data associated with a user, wherein the enrollment data comprises a plurality of identity textures of a face of the user, and wherein the enrollment data is generated from sensor data captured during an enrollment period; obtaining a set of expression latents for the user for a first frame of a runtime period; generating a first targeted texture for a first portion of the face of the user based on the enrollment data and the set of expression latents; and generating a target texture for the first frame by combining the first targeted texture with a first identity texture of the plurality of identity textures to obtain a combined albedo map for the first frame.

2. The method of claim 1, further comprising: generating an expressive mesh based on the set of expression latents; and rendering an avatar by applying the target texture to the expressive mesh.

3. The method of claim 1, wherein generating the target texture further comprises: obtaining one or more neural maps based on the set of expression latents, a head pose for the user, and a selected lighting, and warping the plurality of identity textures in accordance with the one or more neural maps, wherein the one or more neural maps comprises at least one selected from a group consisting of a neural displacement map, an ambient map, a diffuse map, and a specular map.

4. The method of claim 1, wherein generating the first targeted texture comprises applying the set of expression latents to a first model trained to predict a texture for a first portion of an expressive mesh associated with the set of expression latents.

5. The method of claim 1, further comprising: obtaining a second texture for a second portion of the face based on the set of expression latents and image data for the second portion of the face, wherein the combined albedo map is further generated based on the second texture.

6. The method of claim 5, wherein the second texture is obtained by applying image data for the second portion of the face and the set of expression latents to a second model trained to predict a texture for a second portion of an expressive mesh associated with the set of expression latents.

7. The method of claim 1, wherein the plurality of identity textures comprises at least one selected from a group consisting of a pseudo normal texture, a diffuse albedo texture, and a specular albedo texture.

8. The method of claim 1, further comprising, for each of one or more additional frames: obtaining an additional set of expression latents for the user; and generating an additional texture based on the enrollment data and the additional set of expression latents; generating an additional combined albedo map based on the plurality of identity textures from the enrollment data and the additional texture; and generating an additional target texture based on the plurality of identity textures and the additional combined albedo map.

9. A non-transitory computer readable medium comprising computer readable code executable by one or more processors to: obtain enrollment data associated with a user, wherein the enrollment data comprises a plurality of identity textures of a face of the user, and wherein the enrollment data is generated from sensor data captured during an enrollment period; obtain a set of expression latents for the user for a first frame of a runtime period; generate a first targeted texture for a first portion of the face of the user based on the enrollment data and the set of expression latents; and generate a target texture for the first frame by combining the first targeted texture with a first identity texture of the plurality of identity textures to obtain a combined albedo map for the first frame.

10. The non-transitory computer readable medium of claim 9, further comprising, computer readable code to: generating an expressive mesh based on the set of expression latents; and rendering an avatar for the first frame by applying the target texture to the expressive mesh.

11. The non-transitory computer readable medium of claim 9, wherein the computer readable code to generate the target texture further comprises computer readable code to: obtain one or more neural maps based on the set of expression latents, a head pose for the user, and a selected lighting, and warp the plurality of identity textures in accordance with the one or more neural maps, wherein the one or more neural maps comprises at least one selected from a group consisting of a neural displacement map, an ambient map, a diffuse map, and a specular map.

12. The non-transitory computer readable medium of claim 9, wherein the computer readable code to generate the first targeted texture comprises computer readable code to apply the set of expression latents to a first model trained to predict a texture for a first portion of an expressive mesh associated with the set of expression latents.

13. The non-transitory computer readable medium of claim 9, further comprising computer readable code to: obtaining a second texture for a second portion of the face based on the set of expression latents and image data for the second portion of the face, wherein the combined albedo map is further generated based on the second texture.

14. The non-transitory computer readable medium of claim 13, wherein the second texture is obtained by applying image data for the second portion of the face and the set of expression latents to a second model trained to predict a texture for a second portion of an expressive mesh associated with the set of expression latents.

15. The non-transitory computer readable medium of claim 9, wherein the plurality of identity textures comprises at least one selected from a group consisting of a pseudo normal texture, a diffuse albedo texture, and a specular albedo texture.

16. The non-transitory computer readable medium of claim 9, further comprising computer readable code to, for each of one or more additional frames: obtain an additional set of expression latents for the user; and generate an additional texture based on the enrollment data and the additional set of expression latents; generate an additional combined albedo map based on the plurality of identity textures from the enrollment data and the additional texture; and generate an additional target texture based on the plurality of identity textures and the additional combined albedo map.

17. A system comprising: one or more processors; and one or more computer readable media comprising computer readable code executable by the one or more processors to: obtain enrollment data associated with a user, wherein the enrollment data comprises a plurality of identity textures of a face of the user, and wherein the enrollment data is generated from sensor data captured during an enrollment period; obtain a set of expression latents for the user for a first frame of a runtime period; generate a first targeted texture for a first portion of the face of the user based on the enrollment data and the set of expression latents; and generate a target texture for the first frame by combining the first targeted texture with a first identity texture of the plurality of identity textures to obtain a combined albedo map for the first frame.

18. The system of claim 17, further comprising, computer readable code to: generate an expressive mesh based on the set of expression latents; and render an avatar for
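Claims 3 and 11 recite warping the identity textures in accordance with one or more neural maps, one of which may be a neural displacement map. The claims do not define the warp itself, so the sketch below only illustrates one common reading: each output texel is fetched from the source texture at its own position plus a per-texel offset. The name `warp_texture`, the integer `(d_row, d_col)` offset format, and the clamp-to-edge behavior are assumptions for illustration, not details from the publication.

```python
from typing import List, Tuple

# Hypothetical layouts: a single-channel texel grid and a grid of integer
# (d_row, d_col) offsets standing in for a neural displacement map.
Texture = List[List[float]]
Displacement = List[List[Tuple[int, int]]]


def warp_texture(texture: Texture, displacement: Displacement) -> Texture:
    """Warp a texture by looking up each output texel at an offset position.

    Assumed behavior (not taken from the patent): nearest-texel lookup with
    source coordinates clamped to the texture edges.
    """
    rows = len(texture)
    cols = len(texture[0])
    warped: Texture = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            d_row, d_col = displacement[r][c]
            src_r = min(max(r + d_row, 0), rows - 1)  # clamp to valid rows
            src_c = min(max(c + d_col, 0), cols - 1)  # clamp to valid columns
            out_row.append(texture[src_r][src_c])
        warped.append(out_row)
    return warped


# Usage: shift every lookup one texel to the right on a 2x2 texture.
identity_texture = [[1.0, 2.0], [3.0, 4.0]]
shift_right = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]
warped_texture = warp_texture(identity_texture, shift_right)

# A zero displacement map leaves the texture unchanged.
no_shift = [[(0, 0), (0, 0)], [(0, 0), (0, 0)]]
unchanged_texture = warp_texture(identity_texture, no_shift)
```

A production renderer would more likely use fractional offsets with bilinear sampling on the GPU; integer offsets are used here only to keep the example short and dependency-free.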
Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts · CPC title
Texture mapping · CPC title
Multi-user, collaborative environment · CPC title
Colour editing, changing, or manipulating; Use of colour codes · CPC title
Shape modification · CPC title