Method and system for implementing three-dimensional facial modeling and visual speech synthesis

US11145100B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11145100-B2
Application number	US-201816477591-A
Country	US
Kind code	B2
Filing date	Jan 12, 2018
Priority date	Jan 12, 2017
Publication date	Oct 12, 2021
Grant date	Oct 12, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Novel tools and techniques are provided for implementing three-dimensional facial modeling and visual speech synthesis. In various embodiments, a computing system might determine an orientation, size, and location of a face in a received input image; retrieve a three-dimensional model template comprising a face and head; project the input image onto the model template to generate a three-dimensional model; define, on the model, a polygon mesh in a region of facial feature corresponding to feature in the input image; adjust parameters on the model; and display the model. The computing system might parse a text string into allophonic units; encode each allophonic unit into a point(s) in linguistic space corresponding to mouth movements; retrieve, from a codebook, indexed images/morphs corresponding to encoded points in the linguistic space; render the indexed images/morphs into an animation of the three-dimensional model; synchronize, for output, the animation with audio representations of the text string.
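The second half of the abstract describes a lookup pipeline: the text is split into units, each unit is encoded as a point in a linguistic space, and a codebook maps those points to stored images or morphs that are then timed against the audio. A minimal sketch of that flow follows. It assumes a toy splitter (one unit per letter) in place of real allophone parsing, a made-up two-dimensional space, and a fixed duration per unit. The names `ENCODING`, `CODEBOOK`, `parse_units`, `nearest_morph`, and `frames_for_text` are hypothetical and do not come from the patent.

```python
from math import dist

# Hypothetical 2-D "linguistic space": (jaw opening, lip rounding), each in [0, 1].
# The patent text shown here does not define the axes; these are assumptions.
ENCODING = {
    "a": (0.9, 0.1),
    "e": (0.6, 0.0),
    "o": (0.6, 0.9),
    "m": (0.0, 0.2),
}
REST = (0.1, 0.1)  # fallback point for units without an entry

# Hypothetical codebook: morph names indexed by a point in the same space.
CODEBOOK = {
    (0.0, 0.0): "morph_closed",
    (1.0, 0.0): "morph_open",
    (0.5, 1.0): "morph_rounded",
}


def parse_units(text):
    """Toy stand-in for allophone parsing: one unit per alphabetic character."""
    return [ch for ch in text.lower() if ch.isalpha()]


def encode(unit):
    """Map a unit to its point in the assumed linguistic space."""
    return ENCODING.get(unit, REST)


def nearest_morph(point):
    """Return the codebook entry whose key is closest to the encoded point."""
    key = min(CODEBOOK, key=lambda k: dist(k, point))
    return CODEBOOK[key]


def frames_for_text(text, seconds_per_unit=0.08):
    """Return (start_time, morph_name) pairs, one per unit.

    The start times stand in for synchronization with audio: a real system
    would take unit durations from the synthesized speech instead.
    """
    frames = []
    for i, unit in enumerate(parse_units(text)):
        start = round(i * seconds_per_unit, 3)
        frames.append((start, nearest_morph(encode(unit))))
    return frames


if __name__ == "__main__":
    for start, morph in frames_for_text("mama"):
        print(f"{start:5.2f}s  {morph}")
```

A nearest-neighbor lookup is only one plausible way to retrieve an indexed morph from an encoded point; the abstract does not say how the retrieval is performed.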

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, with a computing system, an input image comprising a face; determining, with the computing system, an orientation, a size, and a location of the face in the input image; retrieving, with the computing system, a three-dimensional model template comprising a face and a head; projecting, with the computing system, the input image onto the three-dimensional model template comprising the face and the head to generate a three-dimensional model corresponding to the input image; defining, with the computing system and on the three-dimensional model, a polygon mesh in a region of at least one facial feature; adjusting, with the computing system, parameters on the three-dimensional model, the region of the at least one facial feature corresponding to at least one facial feature in the input image, wherein scaling the three-dimensional model comprises: defining, with the computing system, a first box to frame the face in the input image; defining, with the computing system, a second box to frame the face of the three-dimensional model template; and scaling, with the computing system, the second box of the three-dimensional model template by performing at least one of scaling to fit the first box of the input image or scaling so that the second box is centered on the first box in the input image; and displaying, with the computing system, the three-dimensional model with the face of the input image projected onto the three-dimensional model.

2. The method of claim 1, wherein the computing system comprises at least one of a client computer, a host computer, a user device, a server computer over a network, a cloud-based computing system, or a distributed computing system.

3. The method of claim 1, wherein the input image is captured with an image sensor of a device.

4. The method of claim 1, wherein the input image is at least one of a photograph or a drawing.

5. The method of claim 1, wherein rotating the three-dimensional model comprises: determining, with the computing system, an eye alignment on the face of the input image; and rotating, with the computing system, the three-dimensional model template to align eyes of the three-dimensional model with the eyes of the input image.

6. The method of claim 1, wherein the three-dimensional model template comprises at least one facial feature, the method further comprising: determining, with the computing system, at least one facial feature on the face of the input image; determining, with the computing system, an orientation, a size, and a location of the at least one facial feature on the face in the input image; rotating, with the computing system, the three-dimensional model template to orient the at least one facial feature to the corresponding at least one facial feature in the input image; scaling, with the computing system, the three-dimensional model template to match the size of the at least one facial feature to the corresponding at least one facial feature in the input image; translating, with the computing system, the three-dimensional model template to match the location of the at least one facial feature to the corresponding at least one facial feature in the input image; and projecting, with the computing system, the at least one facial feature in the input image onto the corresponding at least one facial feature of the three-dimensional model template to represent the at least one facial feature on the three-dimensional model.

7. The method of claim 6, wherein the at least one facial feature of the input image comprises at least one of an eye, lip, eyebrow, nose, cheek, ear, forehead, chin, or neck, and wherein the at least one facial feature of the three-dimensional model comprises at least one of an eye, lip, eyebrow, nose, cheek, ear, forehead, chin, or neck.

8. The method of claim 1, further comprising: determining, with the computing system, a perspective of an input image; and applying, with the computing system, a perspective deformation to the three-dimensional model template.

9. The method of claim 1, wherein the display of the three-dimensional model is capable of being rotated in any direction.

10. The method of claim 1, further comprising: rotating, with the computing system, the three-dimensional model template comprising the face and the head to match the orientation of the face in the input image; scaling, with the computing system, the three-dimensional model template comprising the face and the head to match the size of the face in the input image; and translating, with the computing system, the three-dimensional model template comprising the face and the head to match the location of the face in the input image.

11. A device, comprising: a display; one or more processors in communication with an image sensor, an accelerometer, and the display; and a non-transitory computer readable medium in communication with the one or more processors, the non-transitory computer readable medium having encoded thereon a set of instructions executable by the one or more processors to cause the device to: receive an input image comprising a face; determine an orientation, a size, and a location of the face in the input image; retrieve a three-dimensional model template comprising a face and a head; project the input image onto the three-dimensional model template comprising the face and the head to generate a three-dimensional model corresponding to the input image; define, on the three-dimensional model, a polygon mesh in a region of at least one facial feature; adjust parameters on the three-dimensional model, the region of the at least one facial feature corresponding to at least one facial feature in the input image, wherein scaling the three-dimensional model comprises: defining a first box to frame the face in the input image; defining a second box to frame the face of the three-dimensional model template; and scaling the second box of the three-dimensional model template by performing at least one of scaling to fit the first box of the input image or scaling so that the second box is centered on the first box in the input image; and display the three-dimensional model with the face of the input image projected onto the three-dimensional model.

12. An apparatus, comprising: one or more processors; and a non-transitory computer readable medium having encoded thereon a set of instructions executable by the one or more processors to cause the apparatus to: receive an input image comprising a face; determine an orientation, a size, and a location of the face in the input image; retrieve a three-dimensional model template comprising a face and a head; project the input image onto the three-dimensional model template comprising the face and the head to generate a three-dimensional model corresponding to the input image; define, on the three-dimensional model, a polygon mesh in a region of at least one facial feature; adjust parameters on the three-dimensional model, the region of the at least one facial feature corresponding to at least one facial feature in the input image, wherein scaling the three-dimensional model comprises: defining a first box to frame the face in the input image; defining a second box to frame the face of the three-dimensional model template; and scaling the second box of the three-dimensional model template by performing at least one of scaling to fit the first box of the input image or scaling so that the second box is centered on the first box in the input image; and display the three-dimensional model with the face of the input image projected onto the three-dimensio
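The independent claims turn on two framing boxes: one around the face in the input image and one around the face of the model template, with the template box scaled to fit the image box or centered on it. Claim 5 adds a rotation that aligns the eyes. The geometry can be sketched in 2D as below. This is an illustration under stated assumptions, not the patented implementation: the `Box` type, `fit_template_box`, and `eye_roll_degrees` are hypothetical names, the scale is assumed to be uniform, and the sketch applies both scaling and centering although the claim requires only at least one of them.

```python
from dataclasses import dataclass
from math import atan2, degrees


@dataclass
class Box:
    """Axis-aligned box in image coordinates (y grows downward)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def fit_template_box(image_box, template_box):
    """Scale the template box to fit inside the image box and center it there.

    Assumption: a single uniform scale factor, chosen as the largest one for
    which the template box still fits within the image box.
    Returns (scale, placed_box).
    """
    scale = min(image_box.w / template_box.w, image_box.h / template_box.h)
    w, h = template_box.w * scale, template_box.h * scale
    cx, cy = image_box.center
    return scale, Box(cx - w / 2, cy - h / 2, w, h)


def eye_roll_degrees(left_eye, right_eye):
    """In-plane angle of the line through both eyes, in degrees.

    0 means the eyes are level; rotating the template by this angle would
    bring its eye line parallel to the one detected in the image.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return degrees(atan2(dy, dx))


if __name__ == "__main__":
    face_in_image = Box(100, 50, 200, 300)   # first box: frames the face in the image
    face_in_template = Box(0, 0, 100, 100)   # second box: frames the template face
    scale, placed = fit_template_box(face_in_image, face_in_template)
    print(scale, placed)
    print(eye_roll_degrees((120, 140), (220, 150)))
```

With the values in the usage block, the template box is scaled by a factor of 2 (limited by the width of the image box) and centered on the image box. A non-uniform scale that matches width and height independently would also satisfy "scaling to fit", at the cost of distorting the template's proportions.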

Assignees

Univ Colorado Regents

Inventors

Classifications

G06V40/171
Local features and components; Facial parts (eye characteristics G06V40/18); Occluding parts, e.g. glasses; Geometrical relationships · CPC title
G10L13/00
Speech synthesis; Text to speech systems · CPC title
G06T15/04 (Primary)
Texture mapping · CPC title
G06T13/205
driven by audio data · CPC title
G06T17/10
Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes · CPC title

Patent family

Related publications grouped by family.

View patent family 62839529

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11145100B2 cover? Novel tools and techniques are provided for implementing three-dimensional facial modeling and visual speech synthesis. In various embodiments, a computing system might determine an orientation, size, and location of a face in a received input image; retrieve a three-dimensional model template comprising a face and head; project the input image onto the model template to generate a three-dimens…
Who is the assignee on this patent? Univ Colorado Regents
What technology area does this patent fall under? Primary CPC classification G06T15/04. Mapped technology areas include Physics.
When was this patent published? Publication date Oct 12, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and Methods for Automating the Animation of Blendshape Rigs

Avatar video apparatus and method

Avatar-based video encoding