Avatar facial expression and/or speech driven animations
US-2017039750-A1 · Feb 9, 2017 · US
US11538211B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11538211-B2 |
| Application number | US-201917052161-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 1, 2019 |
| Priority date | May 7, 2018 |
| Publication date | Dec 27, 2022 |
| Grant date | Dec 27, 2022 |
Official abstract text for this publication.
A method (300) includes receiving a first facial framework (144a) and a first captured image (130a) of a face (20). The first facial framework corresponds to the face at a first frame and includes a first facial mesh (142a) of facial information (140). The method also includes projecting the first captured image onto the first facial framework and determining a facial texture (212) corresponding to the face based on the projected first captured image. The method also includes receiving a second facial framework (144b) at a second frame that includes a second facial mesh (142b) of facial information and updating the facial texture based on the received second facial framework. The method also includes displaying the updated facial texture as a three-dimensional avatar (160). The three-dimensional avatar corresponds to a virtual representation of the face.
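The abstract does not say how the captured image is projected onto the facial mesh. As a rough illustration only, the sketch below assumes a simple pinhole camera with the principal point at the image centre and samples one RGB colour per mesh vertex. The function names, the `focal` parameter, and the nearest-pixel sampling are assumptions made for this example and are not taken from the patent.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]  # camera-space (x, y, z), z > 0
Pixel = Tuple[int, int, int]         # RGB colour


def project_vertex(v: Vertex, focal: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-space vertex to pixel coordinates."""
    x, y, z = v
    if z <= 0:
        raise ValueError("vertex must lie in front of the camera (z > 0)")
    return focal * x / z + cx, focal * y / z + cy


def sample_vertex_colors(
    mesh: Sequence[Vertex],
    image: Sequence[Sequence[Pixel]],
    focal: float,
) -> List[Pixel]:
    """Return one colour per mesh vertex, read from the nearest image pixel.

    Assumes the principal point is the image centre (hypothetical choice).
    Projections that fall outside the image are clamped to the border.
    """
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    colors: List[Pixel] = []
    for v in mesh:
        u, w = project_vertex(v, focal, cx, cy)
        col = min(max(int(round(u)), 0), width - 1)
        row = min(max(int(round(w)), 0), height - 1)
        colors.append(image[row][col])
    return colors
```

A fuller pipeline would interpolate these samples across each mesh triangle to build a texture map; per-vertex sampling is used here only to keep the sketch short.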
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving, at data processing hardware, a first facial framework and a first captured image of a face of a user with a neutral facial expression, the first facial framework corresponding to the face of the user at a first frame and comprising a first facial mesh of facial information; projecting, by the data processing hardware, the first captured image of the face onto the first facial framework; determining, by the data processing hardware, a facial texture corresponding to the face of the user based on the projected first captured image; receiving, at the data processing hardware, a second facial framework, the second facial framework corresponding to the face of the user at a second frame and comprising a second facial mesh of facial information; updating, by the data processing hardware, the facial texture based on the received second facial framework; displaying, by the data processing hardware, the updated facial texture as a three-dimensional avatar, the three-dimensional avatar corresponding to a virtual representation of the face of the user; and generating, by the data processing hardware, a rendition of an eye or a mouth of the user by: detecting, by the data processing hardware, edges of the eye or the mouth; determining, by the data processing hardware, that a sum of angles associated with the edges of the eye or the mouth correspond to a value of about two pi; approximating, by the data processing hardware, a position of the eye or the mouth based on the detected edges that correspond to a value of about two pi; extracting, by the data processing hardware, the mouth or the eye at the approximated position from the captured image of the face; and rendering, by the data processing hardware, the extracted mouth or the extracted eye at the approximated position with a fill.
2. The method of claim 1, further comprising: receiving, at the data processing hardware, a second captured image of the face of the user, the second captured image capturing a smile as a facial expression of the user; receiving, at the data processing hardware, a third captured image of the face of the user, the third captured image capturing, as the facial expression of the user, both eyebrows raised; receiving, at the data processing hardware, a fourth captured image of the face of the user, the fourth captured image capturing, as the facial expression of the user, a smile and both eyebrows raised; for each captured image, determining, by the data processing hardware, a facial expression texture corresponding to the face of the user; blending, by the data processing hardware, the facial expression textures of each captured image and the updated facial texture based on the received second facial framework to generate a blended facial texture; and rendering, by the data processing hardware, the three-dimensional avatar with the blended facial texture.
3. The method of claim 2, wherein blending further comprises: determining a texture vector for each captured image, the texture vector corresponding to a vector representation of a difference from the first captured image with the neutral facial expression; determining a current texture vector based on the received second facial framework; assigning rendering weights based on a difference between the current texture vector and the texture vector of each captured image; and rendering the three-dimensional avatar with the blended facial texture based on the rendering weights.
4. The method of claim 3, wherein the rendering weights have a sum equal to one.
5. The method of claim 3, wherein each of the current texture vector and the texture vector of each captured image correspond to a fifty-two variable float vector.
6. The method of claim 5, wherein the rendering weights descend in magnitude as the difference between the current texture vector and the texture vector of each captured image increases.
7. The method of claim 1, further comprising: receiving, at the data processing hardware, a captured current image of the face of the user with a current facial expression mesh of facial information at the second frame; and updating, by the data processing hardware, the facial texture based on the captured current image.
8. The method of claim 7, wherein the received captured current image corresponds to a reduced amount of facial texture.
9. The method of claim 8, further comprising: determining, by the data processing hardware, an obstructed portion of the face of the user based on the received captured current image; and blending, by the data processing hardware, the obstructed portion of the face of the user with facial texture generated from an unobstructed captured image from a prior frame.
10. The method of claim 1, wherein the first captured image comprises a red-green- and blue image from a mobile phone.
11. The method of claim 1, wherein the three-dimensional avatar is displayed on an augmented reality device.
12. A system comprising: data processing hardware; and memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising: receiving a first facial framework and a first captured image of a face of a user with a neutral facial expression, the first facial framework corresponding to the face of the user at a first frame and comprising a first facial mesh of facial information; projecting the first captured image of the face onto the first facial framework; determining a facial texture corresponding to the face of the user based on the projected first captured image; receiving a second facial framework, the second facial framework corresponding to the face of the user at a second frame and comprising a second facial mesh of facial information; updating the facial texture based on the received second facial framework; displaying the updated facial texture as a three-dimensional avatar, the three-dimensional avatar corresponding to a virtual representation of the face of the user; and generating a rendition of an eye or a mouth of the user by: detecting edges of the eye or the mouth; determining that a sum of angles associated with the edges of the eye or the mouth correspond to a value of about two pi; approximating a position of the eye or the mouth based on the detected edges that correspond to the value of about two pi; extracting the mouth or the eye at the approximated position from the captured image of the face; and rendering the extracted mouth or the extracted eye at the approximated position with a fill.
13. The system of claim 12, wherein the operations further comprise: receiving a second captured image of the face of the user, the second captured image capturing a smile as a facial expression of the user; receiving a third captured image of the face of the user, the third captured image capturing, as the facial expression of the user, both eyebrows raised; receiving a fourth captured image of the face of the user, the fourth captured image capturing, as the facial expression of the user, a smile and both eyebrows raised; for each captured image, determining a facial expression texture corresponding to the face of the user; blending the facial expression textures of each captured image and the updated facial texture based on the received second facial framework to generate a blended facial texture; and rendering the three-dimensional avatar with the blended facial texture.
14. The system of claim 13, wherein
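Claims 3 to 6 constrain the rendering weights without giving a formula: each texture vector is a fifty-two variable float vector expressing a difference from the neutral image, the weights sum to one, and a weight shrinks as its vector moves further from the current texture vector. The sketch below shows one scheme that satisfies those properties, using normalized inverse Euclidean distance. The distance measure, the `eps` guard, and all function names are assumptions chosen for illustration; the claims do not specify them.

```python
import math
from typing import List, Sequence

VECTOR_LENGTH = 52  # claim 5: fifty-two variable float vector


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two texture vectors (assumed metric)."""
    if len(a) != VECTOR_LENGTH or len(b) != VECTOR_LENGTH:
        raise ValueError("texture vectors must have 52 components")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rendering_weights(
    current: Sequence[float],
    captured: Sequence[Sequence[float]],
    eps: float = 1e-6,
) -> List[float]:
    """Return one weight per captured-image texture vector.

    Weights are inverse distances to the current vector, normalized so
    they sum to one (claim 4) and decrease as the distance grows (claim 6).
    eps avoids division by zero when the current vector matches exactly.
    """
    raw = [1.0 / (distance(current, vec) + eps) for vec in captured]
    total = sum(raw)
    return [w / total for w in raw]


def blend_values(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of one texel channel across the expression textures."""
    return sum(v * w for v, w in zip(values, weights))
```

For example, with a zero vector for the neutral image and one vector each for the smile and raised-eyebrow images, a current vector close to the smile vector yields the largest weight for the smile texture, and `blend_values` then mixes the corresponding texel values accordingly.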
defining a virtual conference space and using avatars or agents (computer conference optimisation or adaptation H04L12/1827) · CPC title
Texture mapping · CPC title
Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title
involving 3D image data · CPC title
Human faces, e.g. facial parts, sketches or expressions · CPC title