What technology area does this patent fall under?

Primary CPC classification G06N3/006. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 19, 2015 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Computer generated emulation of a subject

US2015052084A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2015052084-A1
Application number	US-201414458556-A
Country	US
Kind code	A1
Filing date	Aug 13, 2014
Priority date	Aug 16, 2013
Publication date	Feb 19, 2015
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for emulating a subject, to allow a user to interact with a computer generated talking head with the subject's face and voice; said system comprising a processor, a user interface and a personality storage section, the user interface being configured to emulate the subject, by displaying a talking head which comprises the subject's face and output speech from the mouth of the face with the subject's voice, the user interface further comprising a receiver for receiving a query from the user, the emulated subject being configured to respond to the query received from the user, the processor comprising a dialogue section and a talking head generation section, wherein said dialogue section is configured to generate a response to a query inputted by a user from the user interface and generate a response to be outputted by the talking head, the response being generated by retrieving information from said personality storage section, said personality storage section comprising content created by or about the subject, and said talking head generation section is configured to: convert said response into a sequence of acoustic units, the talking head generation section further comprising a statistical model, said statistical model comprising a plurality of model parameters, said model parameters being derived from said personality storage section, the model parameters describing probability distributions which relate an acoustic unit to an image vector and speech vector, said image vector comprising a plurality of parameters which define the subject's face and said speech vector comprising a plurality of parameters which define the subject's voice, the talking head generation section being further configured to output a sequence of speech vectors and image vectors which are synchronised such that the head appears to talk.
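The last part of the abstract (acoustic units mapped through a statistical model to synchronised speech and image vectors) can be illustrated with a rough sketch. This is not the patent's implementation: the `UnitModel` class, the `synthesise` function, the per-unit frame counts and the toy vector sizes are all hypothetical, and the sketch simply emits the mean of each unit's distribution for every frame rather than sampling or smoothing.

```python
from dataclasses import dataclass


@dataclass
class UnitModel:
    """Hypothetical per-unit model: distribution means for both streams."""
    speech_mean: list[float]  # parameters that would define the voice
    image_mean: list[float]   # parameters that would define the face
    frames: int               # assumed duration of the unit, in frames


def synthesise(units, models):
    """Emit one speech vector and one image vector per frame.

    Both streams are driven by the same unit sequence and the same
    frame counts, so they stay aligned frame by frame.
    """
    speech_seq, image_seq = [], []
    for unit in units:
        model = models[unit]
        for _ in range(model.frames):
            # Simplification: use the distribution mean for every frame.
            speech_seq.append(list(model.speech_mean))
            image_seq.append(list(model.image_mean))
    return speech_seq, image_seq


# Toy model table with made-up values for two acoustic units.
models = {
    "h": UnitModel(speech_mean=[0.1, 0.2], image_mean=[1.0], frames=2),
    "ai": UnitModel(speech_mean=[0.5, 0.6], image_mean=[2.0], frames=3),
}
speech, image = synthesise(["h", "ai"], models)
```

Because both sequences are generated in the same loop, frame i of the speech stream always corresponds to frame i of the image stream, which is the property the abstract describes as synchronisation.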

First claim

Opening claim text (preview).

1. A system for emulating a subject, to allow a user to interact with a computer generated talking head with the subject's face and voice; said system comprising a processor, a user interface and a personality storage section, the user interface being configured to emulate the subject, by displaying a talking head which comprises the subject's face and output speech from the mouth of the face with the subject's voice, the user interface further comprising a receiver for receiving a query from the user, the emulated subject being configured to respond to the query received from the user, the processor comprising a dialogue section and a talking head generation section, wherein said dialogue section is configured to generate a response to a query inputted by a user from the user interface and generate a response to be outputted by the talking head, the response being generated by retrieving information from said personality storage section, said personality storage section comprising content created by or about the subject, and said talking head generation section is configured to: convert said response into a sequence of acoustic units, the talking head generation section further comprising a statistical model, said statistical model comprising a plurality of model parameters, said model parameters being derived from said personality storage section, the model parameters describing probability distributions which relate an acoustic unit to an image vector and speech vector, said image vector comprising a plurality of parameters which define the subject's face and said speech vector comprising a plurality of parameters which define the subject's voice, the talking head generation section being further configured to output a sequence of speech vectors and image vectors which are synchronised such that the head appears to talk.
2. A system according to claim 1, wherein the content created by or about the subject comprises posts collected from social media websites, e-mails and other content from or about the subject which has been provided to the personality storage section.
3. A system according to claim 1, wherein the dialogue section is configured to navigate a set of rules stored in said personality storage section to generate the response.
4. A system according to claim 1, wherein the dialogue section is configured to retrieve a response from said personality storage section by searching information which has been stored in said personality storage section in an unstructured form.
5. A system according to claim 4, wherein said dialogue section is configured to search said information stored in a non-hierarchical form using a word-vector or n-gram search model.
6. A system according to claim 1, wherein the dialogue section is configured to interpret said query and based on said interpretation select to generate said response using a set of rules stored in said personality storage section or by searching information stored in an unstructured form.
7. A system for creating a response to an inputted user query, said system comprising: a personality file storage section, said personality file storage section comprising a plurality of documents stored in an unstructured form; a query conversion section configured to convert said query into a word vector; a first comparison section configured to compare said word vector generated from said query with word vectors generated from the documents in said personality file storage section and output identified documents; a second comparison section configured to compare said word vector selected from said query and passages from said identified documents and to rank said selected passages, said ranking being based on the number of matches between said selected passage and said query; and a concatenation section adapted to concatenate selected passages together using sentence connectors, wherein said sentence connectors are chosen from a plurality of sentence connectors, said sentence connectors being chosen on the basis of a statistical model.
8. A system according to claim 7, wherein the said ranking is based on a normalised measure of the number of matches between said selected passage and said query.
9. A system according to claim 7, wherein said sentence connectors are chosen using a language model.
10. A system according to claim 7, wherein the system is configured to set a predetermined size for the response.
11. A system according to claim 1, configured to output an expressive response such that said face and voice demonstrate expression, said processor further comprising an expression deriving section configured to determine the expression with which to output the generated response, and wherein the said model parameters describe probability distributions which relate an acoustic unit to an image vector and speech vector for an associated expression.
12. A system according to claim 11, wherein the model parameter in each probability distribution in said associated expression is expressed as a weighted sum of parameters of the same type, and wherein the weighting used is expression dependent, such that converting said sequence of acoustic units to a sequence of image vectors comprises retrieving the expression dependent weights for said selected expression.
13. A system according to claim 12, wherein the parameters are provided in clusters and each cluster comprises at least one sub-cluster, wherein said expression dependent weights are retrieved for each cluster such that there is one weight per sub-cluster.
14. A system according to claim 11, wherein said expression deriving section is configured to extract expressive features from said response to form an expressive linguistic feature vector constructed in a first space and map said expressive linguistic feature vector to an expressive synthesis feature vector that is constructed in a second space, said expressive linguistic feature vector being related to the model parameters of said acoustical model.
15. A system according to claim 14, wherein said expression deriving section is configured to extract expressive features from said response to form an expressive linguistic feature vector constructed in a first space and map said expressive linguistic feature vector to the said expression dependent weights.
16. A system according to claim 1, wherein said image vector comprises parameters which allow the face to be constructed from a weighted sum of modes using weighting parameters, and wherein the modes represent reconstructions of a face or part thereof.
17. A system according to claim 16, wherein the modes comprise modes to represent shape and appearance of the face.
18. A system according to claim 16, wherein the same weighting parameter is used for a shape mode and its corresponding appearance mode.
19. A system for generating a personality file, said personality file being used to store information relating to the speech, face and dialogue intelligence of a subject such that the subject can be emulated using a system in accordance with claim 1, said personality file being stored in said personality storage section, the system for generating a personality file comprising: an interface for inputting information identifying content created by or about the subject; an audio-visual recording system configured to record the voice and face of a subject, when reading known text, while using a range of different emotions; and a processor being configured to: curate said information id
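The retrieval steps recited in claims 7 and 8 (query to word vector, document comparison, passage ranking by a normalised match count, concatenation with a connector) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the bag-of-words representation, the cosine comparison and the fixed `"Also,"` connector are all choices made for this sketch. The claims do not name a similarity measure, and they call for connectors chosen by a statistical model, which is not attempted here.

```python
from collections import Counter
import math
import re


def word_vector(text):
    """Bag-of-words count vector (an assumed representation)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_documents(query, documents, k=2):
    """First comparison: keep the k documents closest to the query."""
    q = word_vector(query)
    ranked = sorted(documents, key=lambda d: cosine(q, word_vector(d)), reverse=True)
    return ranked[:k]


def rank_passages(query, documents):
    """Second comparison: score sentences by matches / passage length."""
    q_words = set(word_vector(query))
    scored = []
    for doc in documents:
        for passage in re.split(r"(?<=[.!?])\s+", doc):
            words = word_vector(passage)
            total = sum(words.values())
            if total == 0:
                continue
            matches = sum(c for w, c in words.items() if w in q_words)
            scored.append((matches / total, passage))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


def respond(query, documents, connector="Also,", max_passages=2):
    """Join the best-matching passages with a fixed placeholder connector."""
    ranked = rank_passages(query, top_documents(query, documents))
    chosen = [p for score, p in ranked[:max_passages] if score > 0]
    return f" {connector} ".join(chosen)


docs = [
    "I love sailing on the lake. My boat is blue.",
    "I work as an engineer in Cambridge.",
    "Cooking pasta is fun.",
]
answer = respond("Where do you work?", docs)
```

The `max_passages` argument loosely mirrors the predetermined response size of claim 10. A closer reading of claim 9 would replace the fixed connector with one scored by a language model.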

Assignees

Toshiba KK

Inventors

Classifications

G06F18/2113
by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation · CPC title
G06N3/006 · Primary
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
G10L13/086
Detection of language · CPC title
G06F17/30979
Physics · mapped topic
G06N99/005
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 49301825

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2015052084A1 cover?: A system for emulating a subject, to allow a user to interact with a computer generated talking head with the subject's face and voice; said system comprising a processor, a user interface and a personality storage section, the user interface being configured to emulate the subject, by displaying a talking head which comprises the subject's face and output speech from the mouth of th…
Who is the assignee on this patent?: Toshiba KK