Systems and methods for rendering avatar with high resolution geometry

US11423616B1 · US · B1

Patent metadata
Publication number: US-11423616-B1
Application number: US-202016833360-A
Country: US
Kind code: B1
Filing date: Mar 27, 2020
Priority date: Mar 27, 2020
Publication date: Aug 23, 2022
Grant date: Aug 23, 2022

Abstract

Official abstract text for this publication.

In one embodiment, a system may access an input image of an object captured by cameras, and the input image depicts appearance information associated with an object. The system may generate a first mesh of the object based on features identified from the input image of the object. The system may generate, by processing the first mesh using a machine-learning model, a position map that defines a contour of the object. Each pixel in the position map corresponds to a three-dimensional coordinate. The system may further generate a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh. The system may render an output image of the object based on the second mesh. The system disclosed in the present application can render a dense mesh which has a higher resolution to provide details which cannot be compensated by texture information.
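The step from position map to higher-resolution mesh can be illustrated with a minimal sketch. It assumes the position map is a row-major grid of (x, y, z) tuples and that pixels are sampled on a regular stride; the function name `mesh_from_position_map`, the `stride` parameter, and the two-triangles-per-cell connectivity are hypothetical choices for illustration, not details taken from the patent.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


def mesh_from_position_map(
    position_map: List[List[Vec3]], stride: int = 1
) -> Tuple[List[Vec3], List[Tri]]:
    """Turn an H x W position map into a triangle mesh.

    Each sampled pixel becomes one vertex, and each 2 x 2 block of
    neighbouring samples becomes two triangles. Regular-stride sampling
    is an assumption made for this sketch.
    """
    rows = list(range(0, len(position_map), stride))
    cols = list(range(0, len(position_map[0]), stride))

    # One vertex per sampled pixel, stored row by row.
    vertices = [position_map[r][c] for r in rows for c in cols]

    # Connect neighbouring samples into two triangles per grid cell.
    n_cols = len(cols)
    triangles: List[Tri] = []
    for i in range(len(rows) - 1):
        for j in range(n_cols - 1):
            top_left = i * n_cols + j
            top_right = top_left + 1
            bottom_left = top_left + n_cols
            bottom_right = bottom_left + 1
            triangles.append((top_left, bottom_left, top_right))
            triangles.append((top_right, bottom_left, bottom_right))
    return vertices, triangles


# Toy 4 x 4 position map: a flat sheet with a small bump at pixel (1, 1).
toy_map = [[(float(x), float(y), 0.0) for x in range(4)] for y in range(4)]
toy_map[1][1] = (1.0, 1.0, 0.5)

dense_vertices, dense_triangles = mesh_from_position_map(toy_map, stride=1)
coarse_vertices, coarse_triangles = mesh_from_position_map(toy_map, stride=3)
print(len(dense_vertices), len(dense_triangles))    # 16 vertices, 18 triangles
print(len(coarse_vertices), len(coarse_triangles))  # 4 vertices, 2 triangles
```

In this toy case the coarse mesh skips the bump at pixel (1, 1) entirely, while the dense mesh keeps it as a vertex. That is the kind of geometric detail the abstract says texture alone cannot supply.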

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising, by a computing system: accessing one or more input images of an object captured by one or more cameras, the one or more input images depicting appearance information associated with the object; generating a first mesh of the object based on features identified from depth measurements of the one or more input images of the object; generating, by processing the first mesh and an initial texture of the object using a machine-learning model, (a) a position map that is a two-dimensional image defining an unwrapped geometry of the object, (b) an intermediate texture of the object, and (c) a warp field, wherein each pixel in the position map specifies a corresponding three-dimensional location of the object using a three-dimensional coordinate; generating a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh; generating an output texture corresponding to the second mesh by warping the intermediate texture using the warp field; and rendering an output image of the object based on the second mesh and the output texture.

2. The method of claim 1, wherein generating the second mesh based on the position map comprises: sampling pixels in the position map based on geometry; determining vertices from each of the sampled pixels in the position map; and generating the second mesh based on the determined vertices.

3. The method of claim 1, wherein the machine-learning model is trained by: comparing the output image of the object with the one or more input images of the object; and calculating an image loss based on the comparison to update the machine-learning model.

4. The method of claim 1, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; computing output depth measurements in the output image of the object; comparing the output depth measurements with the input depth measurements; and calculating an output depth loss based on the comparison to update the machine-learning model.

5. The method of claim 1, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; smoothing depth in the output image of the object to obtain a curvature of the object; computing smoothed depth measurements in the output image of the object; comparing the smoothed depth measurements with the input depth measurements; and calculating a normal loss based on the comparison to update the machine-learning model.

6. The method of claim 1, wherein the machine-learning model is trained by: comparing each of the features in the one or more input images of the object with its corresponding feature in the output image of the object; and calculating a tracking loss based on the comparison to update the machine-learning model.

7. The method of claim 1, wherein the machine-learning model is configured to generate images for television monitors, cinema screens, computer monitors, mobile phones, or tablets.

8. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access one or more input images of an object captured by one or more cameras, the one or more input images depicting appearance information associated with the object; generate a first mesh of the object based on features identified from depth measurements of the one or more input images of the object; generate, by processing the first mesh and an initial texture of the object using a machine-learning model, (a) a position map that is a two-dimensional image defining an unwrapped geometry of the object, (b) an intermediate texture of the object, and (c) a warp field, wherein each pixel in the position map specifies a corresponding three-dimensional location of the object using a three-dimensional coordinate; generate a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh; generate an output texture corresponding to the second mesh by warping the intermediate texture using the warp field; and render an output image of the object based on the second mesh and the output texture.

9. The media of claim 8, wherein generating the second mesh based on the position map comprises: sampling pixels in the position map based on geometry; determining vertices from each of the sampled pixels in the position map; and generating the second mesh based on the determined vertices.

10. The media of claim 8, wherein the machine-learning model is trained by: comparing the output image of the object with the one or more input images of the object; and calculating an image loss based on the comparison to update the machine-learning model.

11. The media of claim 8, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; computing output depth measurements in the output image of the object; comparing the output depth measurements with the input depth measurements; and calculating an output depth loss based on the comparison to update the machine-learning model.

12. The media of claim 8, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; smoothing depth in the output image of the object to obtain a curvature of the object; computing smoothed depth measurements in the output image of the object; comparing the smoothed depth measurements with the input depth measurements; and calculating a normal loss based on the comparison to update the machine-learning model.

13. The media of claim 8, wherein the machine-learning model is trained by: comparing each of the features in the one or more input images of the object with its corresponding feature in the output image of the object; and calculating a tracking loss based on the comparison to update the machine-learning model.

14. The media of claim 8, wherein the machine-learning model is configured to generate images for television monitors, cinema screens, computer monitors, mobile phones, or tablets.

15. A system comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by one or more of the processors to cause the system to: access one or more input images of an object captured by one or more cameras, the one or more input images depicting appearance information associated with the object; generate a first mesh of the object based on features identified from depth measurements of the one or more input images of the object; generate, by processing the first mesh and an initial texture of the object using a machine-learning model, (a) a position map that is a two-dimensional image defining an unwrapped geometry of the object, (b) an intermediate texture of the object, and (c) a warp field, wherein each pixel in the position map specifies a corresponding three-dimensional location of the object using a three-dimensional coordinate; generate a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh; generate an output texture corresponding to the second mesh by warping the intermediate texture using the warp field; and render an output image of the object based on the second mesh and the output texture.

16. The system of claim 15, wherein generating the second mesh based on the position ma
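The four training signals named in the dependent claims (image loss, output depth loss, normal loss, and tracking loss) can be illustrated with a rough sketch. The claims only say that each loss is calculated "based on the comparison", so everything concrete below is an assumption made for illustration: the mean absolute difference, the 3 x 3 box filter used for smoothing, the Euclidean distance between matched feature points, and all function names.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]  # single-channel image or depth map, row-major
Point = Tuple[float, float]


def mean_abs_diff(a: Image, b: Image) -> float:
    """Mean absolute per-pixel difference between two same-sized images."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def box_smooth(depth: Image) -> Image:
    """3 x 3 box filter; border pixels average over the neighbours that exist."""
    h, w = len(depth), len(depth[0])
    smoothed = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                depth[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(window) / len(window))
        smoothed.append(row)
    return smoothed


def image_loss(output_image: Image, input_image: Image) -> float:
    """Compare the rendered output image with a captured input image."""
    return mean_abs_diff(output_image, input_image)


def depth_loss(output_depth: Image, input_depth: Image) -> float:
    """Compare output depth measurements with input depth measurements."""
    return mean_abs_diff(output_depth, input_depth)


def normal_loss(output_depth: Image, input_depth: Image) -> float:
    """Smooth the output depth first, then compare it with the input depth."""
    return mean_abs_diff(box_smooth(output_depth), input_depth)


def tracking_loss(output_features: Sequence[Point], input_features: Sequence[Point]) -> float:
    """Mean distance between each input feature and its output counterpart."""
    distances = [math.dist(p, q) for p, q in zip(output_features, input_features)]
    return sum(distances) / len(distances)


# Toy data: a 2 x 2 depth map with one wrong pixel, and one misplaced feature.
input_depth = [[1.0, 1.0], [1.0, 1.0]]
output_depth = [[1.0, 1.0], [1.0, 2.0]]
input_features = [(0.0, 0.0), (4.0, 0.0)]
output_features = [(3.0, 4.0), (4.0, 0.0)]

print(image_loss(input_depth, input_depth))
print(depth_loss(output_depth, input_depth))
print(normal_loss(output_depth, input_depth))
print(tracking_loss(output_features, input_features))
```

In a training loop these terms would typically be combined into a single weighted objective. The page does not give any weights, so none are assumed here.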

Assignees

Facebook Tech Llc

Classifications

  • G06T17/20 · Primary

    Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title

  • Texture mapping · CPC title

  • Adversarial learning · CPC title

  • Generative networks · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11423616B1 cover?
In one embodiment, a system may access an input image of an object captured by cameras, and the input image depicts appearance information associated with an object. The system may generate a first mesh of the object based on features identified from the input image of the object. The system may generate, by processing the first mesh using a machine-learning model, a position map that defines a…
Who is the assignee on this patent?
Facebook Tech Llc
What technology area does this patent fall under?
Primary CPC classification G06T17/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 23, 2022 (B1). Legal status and post-grant events are not shown on this page.