Animating avatars from headset cameras

US2020402284A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2020402284-A1
Application numberUS-201916449117-A
CountryUS
Kind codeA1
Filing dateJun 21, 2019
Priority dateJun 21, 2019
Publication dateDec 24, 2020
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a computing system may access a plurality of first captured images that are captured in a first spectral domain, generate, using a first machine-learning model, a plurality of first domain-transferred images based on the first captured images, wherein the first domain-transferred images are in a second spectral domain, render, based on a first avatar, a plurality of first rendered images comprising views of the first avatar, and update the first machine-learning model based on comparisons between the first domain-transferred images and the first rendered images, wherein the first machine-learning model is configured to translate images in the first spectral domain to the second spectral domain. The system may also generate, using a second machine-learning model, the first avatar based on the first captured images. The first avatar may be rendered using a parametric face model based on a plurality of avatar parameters.

First claim

Opening claim text (preview).

1 . A system comprising: one or more non-transitory computer-readable storage media embodying instructions; one or more processors coupled to the storage media and operable to execute the instructions to: access a plurality of first captured images of a user that are captured in a first spectral domain; generate, using a first machine-learning model, a plurality of first domain-transferred images based on the first captured images, wherein the first domain-transferred images are in a second spectral domain; render, based on a first avatar, a plurality of first rendered images comprising views of the first avatar, the first avatar being a virtual representation of the user; and update the first machine-learning model based on comparisons between the first domain-transferred images and the first rendered images, wherein the updated first machine-learning model is configured to translate images in the first spectral domain to the second spectral domain. 2 . The system of claim 1 , wherein the processors are further operable to execute the instructions to: generate, using a second machine-learning model, the first avatar based on the first captured images. 3 . The system of claim 1 , wherein the first avatar is rendered using a parametric face model based on a plurality of avatar parameters. 4 . The system of claim 3 , wherein a distribution of the avatar parameters in the first domain-transferred images corresponds to a distribution of the avatar parameters in the first rendered images. 5 . The system of claim 1 , wherein the first machine-learning model is updated using a loss function based on a difference between each first domain-transferred image and each corresponding first rendered image. 6 . The system of claim 5 , wherein the loss function is further based on one or more spatial relationships between cameras that capture the images. 7 . The system of claim 1 , wherein the processors are further operable to execute the instructions to: access a plurality of second captured images that are captured in the first spectral domain; generate, using the updated first machine-learning model, a plurality of second domain-transferred images based on the second captured images, wherein the second domain-transferred images are in the second spectral domain; generate, using a second machine-learning model, a second avatar based on the second captured images; render, based on the second avatar, a plurality of second rendered images comprising views of the second avatar; and update the second machine-learning model based on comparisons between the second domain-transferred images and the second rendered images, wherein the second machine-learning model is configured to generate, based on one or more first input images, one or more avatar parameters for rendering avatars that correspond to the first input images. 8 . The system of claim 7 , wherein the updated second machine-learning model is further configured to generate, based on the one or more first input images, pose information that represents spatial orientation of the avatars. 9 . The system of claim 7 , wherein the first and second captured images are captured by cameras associated with a training headset. 10 . The system of claim 7 , wherein the processors are further operable to execute the instructions to: access a plurality of third captured images, wherein the third captured images are from a subset of the second captured images; generate, using the updated second machine-learning model, avatar parameters that correspond to the third captured images; and train, based on the correspondence between the third captured images and the corresponding avatar parameters, a third machine-learning model to generate, based on one or more second input images, one or more avatar parameters that correspond to the second input images. 11 . The system of claim 10 , wherein the third machine-learning model generates the output avatar parameters in real-time. 12 . The system of claim 10 , wherein the third captured images are captured by a plurality of training cameras associated with a training headset, and the second input images are captured by a plurality of non-intrusive cameras associated with a non-intrusive headset. 13 . The system of claim 12 , wherein positions of the non-intrusive cameras on the non-intrusive headset correspond to positions of a subset of the training cameras on the training headset. 14 . The system of claim 10 , wherein the third machine-learning model comprises: a plurality of convolutional neural network branches that generate one-dimensional vectors, wherein each branch corresponds to a camera and converts received images captured by the corresponding camera in the first spectral domain to a corresponding one of the one-dimensional vectors; and a multilayer perceptron that converts the vectors to avatar parameters. 15 . The system of claim 10 , wherein the processors are further operable to execute the instructions to: access a plurality of third images captured in the first spectral domain, wherein the third images are captured by non-intrusive cameras, the second images are captured by intrusive cameras, and the non-intrusive cameras are fewer in number than the intrusive cameras; generate, using the third machine-learning model, avatar parameters based on the third images; render, based on the avatar parameters, a plurality of third rendered images comprising views of a third avatar; and present, to a user, the third rendered images. 16 . The system of claim 10 , wherein the system further comprises first and second Graphics Processing Units (GPUs), and the processors are further operable to execute the instructions to: access a plurality of images of a first user that are captured in the first spectral domain; generate, by executing the third machine-learning model on the first GPU, first avatar parameters based on the images of the first user; and send, via a communication network, the first avatar parameters to a computing device of a second user. 17 . The system of claim 16 , wherein the processors are further operable to execute the instructions to: receive, via the communications network, second avatar parameters from the computing device of the second user; render, using the second GPU and based on the second avatar parameters, a plurality of third rendered images comprising views of an avatar of the second user; and present, to the first user, the third rendered images. 18 . The system of claim 1 , wherein the first spectral domain is infrared, and the second spectral domain is visible light. 19 . One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access a plurality of first captured images of a user that are captured in a first spectral domain; generate, using a first machine-learning model, a plurality of first domain-transferred images based on the first captured images, wherein the first domain-transferred images are in a second spectral domain; render, based on a first avatar, a plurality of first rendered images comprising views of the first avatar, the first avatar being a virtual representation of the user; and update the first machine-learning model based on comparisons between the first domain-transferred images and the first rendered images, wherein the updated first machine-learning model is configured to translate images in the first spectral domain to the second spectral domain. 20 . A method compri

Assignees

Inventors

Classifications

  • Dynamic expression · CPC title

  • G06T17/00Primary

    Three-dimensional [3D] modelling for computer graphics · CPC title

  • Matching configurations of points or features · CPC title

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020402284A1 cover?
In one embodiment, a computing system may access a plurality of first captured images that are captured in a first spectral domain, generate, using a first machine-learning model, a plurality of first domain-transferred images based on the first captured images, wherein the first domain-transferred images are in a second spectral domain, render, based on a first avatar, a plurality of first ren…
Who is the assignee on this patent?
Facebook Tech Llc
What technology area does this patent fall under?
Primary CPC classification G06T17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Dec 24 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).