Systems and methods for generating dynamic virtual representations of an object or event
US-2024420395-A1 · Dec 19, 2024 · US
US2020234034A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020234034-A1 |
| Application number | US-201916251436-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 18, 2019 |
| Priority date | Jan 18, 2019 |
| Publication date | Jul 23, 2020 |
| Grant date | — |
Official abstract text for this publication.
Provided are systems and a method for photorealistic real-time face reenactment. An example method includes receiving a target video including a target face and a source video including a source face. The method includes determining, based on the target face, a target facial expression. The method includes determining, based on the source face, a source facial expression. The method includes synthesizing, using a parametric face model, an output face. The output face includes the target face, wherein the target facial expression is modified to imitate the source facial expression. The method includes generating, based on a deep neural network, mouth and eyes regions, and combining the output face, the mouth region, and the eyes region to generate a frame of an output video.
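The expression-transfer step described in the abstract can be illustrated with a minimal sketch. It assumes a linear blend-shape formulation (template mesh plus weighted identity and expression offsets), which the source does not spell out; the names `FaceParams`, `synthesize_mesh`, and `reenact`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FaceParams:
    """Per-face coefficients (hypothetical container, not named in the source)."""
    identity: List[float]    # weights for the identity basis
    expression: List[float]  # weights for the expression blend shapes


def synthesize_mesh(template, identity_basis, blend_shapes, params):
    """Return flattened vertex coordinates: template + identity + expression offsets.

    Assumes a linear model; the source does not state the exact formula.
    """
    mesh = list(template)
    for weight, offsets in zip(params.identity, identity_basis):
        for i, delta in enumerate(offsets):
            mesh[i] += weight * delta
    for weight, offsets in zip(params.expression, blend_shapes):
        for i, delta in enumerate(offsets):
            mesh[i] += weight * delta
    return mesh


def reenact(target, source):
    """Keep the target's identity but imitate the source's expression."""
    return FaceParams(identity=list(target.identity),
                      expression=list(source.expression))


# Toy model: two vertices (x, y, z each), one identity direction, one blend shape.
template = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
identity_basis = [[0.0, 1.0, 0.0, 0.0, 1.0, 0.0]]
blend_shapes = [[0.0, 0.0, 0.0, 0.0, -1.0, 0.0]]  # e.g. a "mouth-open" offset

target = FaceParams(identity=[0.5], expression=[0.0])
source = FaceParams(identity=[2.0], expression=[1.0])
output_mesh = synthesize_mesh(template, identity_basis, blend_shapes,
                              reenact(target, source))
```

In this sketch the output mesh keeps the target's identity weight (0.5) and takes the source's expression weight (1.0). A real system would also need to fit these coefficients to video frames and render texture, both of which are omitted here.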
Opening claim text (preview).
What is claimed is:

1. A method for face reenactment, the method comprising: receiving, by a computing device, a target video, the target video including at least one target frame, the at least one target frame including a target face; receiving, by the computing device, a source video, the source video including a source face; determining, by a computing device and based on the target face in the at least one frame of the target video, at least a target facial expression; determining, by the computing device and based on the source face in a frame of the source video, at least a source facial expression; synthesizing, by the computing device and using a parametric face model and a texture model, an output face, the output face including the target face, wherein the target facial expression is modified to imitate the source facial expression; generating, by the computing device and based on a deep neural network (DNN), a mouth region and an eyes region; and combining, by the computing device, the output face, the mouth region, and the eyes region to generate a frame of an output video.

2. The method of claim 1, wherein the parametrical face model depends on a facial expression, a facial identity, and a facial texture.

3. The method of claim 1, wherein the parametrical face model includes a template mesh pre-generated based on historical images of faces of a plurality of individuals, the template mesh including a pre-determined number of vertices.

4. The method of claim 3, wherein the texture model includes a set of colors associated with the vertices.

5. The method of claim 3, wherein the individuals are of different ages, gender, and ethnicity.

6. The method of claim 3, wherein the historical images of faces includes at least one set of pictures belonging to a single individual having a pre-determined number of facial expressions.

7. The method of claim 6, wherein the facial expressions include at least one of a neutral expression, a mouth-open expression, a smile, and an angry expression.

8. The method of claim 7, wherein the parametrical face model includes a set of blend shapes, the blend shapes representing the facial expressions.

9. The method of claim 1, wherein an input of the DNN includes at least parameters associated with the parametric face model.

10. The method of claim 1, wherein an input of the DNN includes a previous mouth region and a previous eyes region, the previous mouth region and the previous eyes region being associated with at least one previous frame of the target video.

11. The method of claim 1, wherein the DNN is trained using historical images of faces of a plurality of individuals.

12. A system for face reenactment, the system comprising at least one processor, a memory storing processor-executable codes, wherein the at least one processor is configured to implement the following operations upon executing the processor-executable codes: receiving a target video, the target video including at least one target frame, the at least one target frame including a target face; receiving, by the computing device, a source video, the source video including a source face; determining, based on the target face in the at least one frame of the target video, at least a target facial expression; determining, based on the source face in a frame of the source video, at least a source facial expression; synthesizing, using a parametric face model and a texture model, an output face, the output face including the target face wherein the target facial expression is modified to imitate the source facial expression; generating, based on a deep neural network (DNN), a mouth region and an eyes region; and combining the output face, the mouth region, and the eyes region to generate a frame of an output video.

13. The system of claim 1, wherein parametrical face model depends on a facial expression, a facial identity, and a facial texture.

14. The system of claim 1, wherein the parametrical face model includes a template mesh pre-generated based on historical images of faces of a plurality of individuals, the template mesh including a pre-determined number of vertices, the individuals are of different ages, gender, and ethnicity.

15. The system of claim 14, wherein the texture model includes a set of colors associated with the vertices.

16. The system of claim 14, wherein the historical images of faces includes at least one set of pictures belonging to a single individual having a pre-determined number of facial expressions.

17. The system of claim 16, wherein the facial expressions include: at least one of a neutral expression, a mouth-open expression, a smile, and an angry expression; and the parametrical face model includes a set of blend shapes, the blend shapes representing the facial expressions.

18. The system of claim 1, wherein an input of the DNN includes parameters associated with the parametric face model; and a previous mouth region and a previous eyes region, the previous mouth region and the previous eyes region being associated with at least one previous frame of the target video.

19. The method of claim 1, wherein DNN is trained using historical images of faces of a plurality of individuals.

20. A non-transitory processor-readable medium having instructions stored thereon, which when executed by one or more processors, cause the one or more processors to implement a method for face reenactment, the method comprising: receiving a target video, the target video including at least one target frame, the at least one target frame including a target face; receiving, by the computing device, a source video, the source video including a source face; determining, based on the target face in the at least one frame of the target video, at least a target facial expression; determining, based on the source face in a frame of the source video, a source facial expression; synthesizing, using a parametric face model and a texture model, an output face, the output face including the target face wherein the target facial expression is modified to imitate the source facial expression; generating, based on a deep neural network (DNN), a mouth region and an eyes region; and combining the output face, the mouth region, and the eyes region to generate a frame of an output video.

21. A system for providing personalized advertisements, the system comprising: a database configured to store one or more advertisement videos, the one or more advertisement videos including at least a target face, the target face being associated with a first individual; a user information collection module configured to: receive a user data associated with a user, the user data including an image of a source face, the source face being associated with a second individual, the second individual being different from the first individual; and determine, based on the user data, parameters of the source face; and a personalized video generation module configured to: segment a frame of the one or more advertisement videos into a first part and a second part, the first part including the target face and the second part including a background; and modify, based on the parameters of the source face, the first part of the frame to replace the target face with the source face; combine the modified first part and the second part to obtain an output frame of an output advertisement video.

2
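The segment, modify, and combine steps recited in claim 21 can be sketched with a binary mask over a small grayscale frame. This is an illustration under stated assumptions only: the face mask is taken as given (a face detector or segmenter is not shown), frames are plain nested lists, and the function names `segment` and `combine` are hypothetical rather than taken from the source.

```python
def segment(frame, face_mask, fill=0):
    """Split a frame into a face part and a background part using a 0/1 mask."""
    first_part = [[px if m else fill for px, m in zip(row, mrow)]
                  for row, mrow in zip(frame, face_mask)]
    second_part = [[fill if m else px for px, m in zip(row, mrow)]
                   for row, mrow in zip(frame, face_mask)]
    return first_part, second_part


def combine(modified_first, second_part, face_mask):
    """Take face pixels from the modified part and all others from the background."""
    return [[f if m else s for f, s, m in zip(frow, srow, mrow)]
            for frow, srow, mrow in zip(modified_first, second_part, face_mask)]


# Toy 2x2 frame; mask value 1 marks pixels assumed to belong to the target face.
frame = [[10, 20], [30, 40]]
face_mask = [[1, 0], [0, 1]]

first_part, second_part = segment(frame, face_mask)

# Stand-in for "replace the target face with the source face": paint face pixels 99.
modified_first = [[99 if m else px for px, m in zip(row, mrow)]
                  for row, mrow in zip(first_part, face_mask)]

output_frame = combine(modified_first, second_part, face_mask)
```

A hard 0/1 mask keeps the example short; a practical compositor would more likely use a soft (fractional) mask to blend the seam between the face and the background.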
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Supervised learning · CPC title
Generative networks · CPC title
Adversarial learning · CPC title
Learning methods · CPC title