US11144597B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11144597-B2 |
| Application number | US-201815923566-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 16, 2018 |
| Priority date | Aug 16, 2013 |
| Publication date | Oct 12, 2021 |
| Grant date | Oct 12, 2021 |
Official abstract text for this publication.
A system for emulating a subject, to allow a user to interact with a computer generated talking head with the subject's face and voice; said system comprising a processor, a user interface and a personality storage section, the user interface being configured to emulate the subject, by displaying a talking head which comprises the subject's face and output speech from the mouth of the face with the subject's voice, the user interface further comprising a receiver for receiving a query from the user, the emulated subject being configured to respond to the query received from the user, the processor comprising a dialogue section and a talking head generation section, wherein said dialogue section is configured to generate a response to a query inputted by a user from the user interface and generate a response to be outputted by the talking head, the response being generated by retrieving information from said personality storage section, said personality storage section comprising content created by or about the subject, and said talking head generation section is configured to: convert said response into a sequence of acoustic units, the talking head generation section further comprising a statistical model, said statistical model comprising a plurality of model parameters, said model parameters being derived from said personality storage section, the model parameters describing probability distributions which relate an acoustic unit to an image vector and speech vector, said image vector comprising a plurality of parameters which define the subject's face and said speech vector comprising a plurality of parameters which define the subject's voice, the talking head generation section being further configured to output a sequence of speech vectors and image vectors which are synchronised such that the head appears to talk.
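The retrieval step in the abstract (generating a response by retrieving information from the personality storage section) can be illustrated with a minimal sketch. The claims mention only "a word-vector or an n-gram search model"; the bag-of-words counts, the cosine scoring, the helper names (`word_vector`, `cosine`, `retrieve`) and the sample content below are all assumptions made for illustration, not the patent's method.

```python
import math
from collections import Counter


def word_vector(text):
    """Lowercased bag-of-words counts; an assumed stand-in for a word-vector model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents):
    """Return the stored document most similar to the query (hypothetical helper)."""
    q = word_vector(query)
    return max(documents, key=lambda d: cosine(q, word_vector(d)))


# Hypothetical unstructured content, invented for illustration only.
personality_memory = [
    "I grew up near the coast and love sailing",
    "My favourite food is fresh pasta",
    "I studied physics at university",
]

best = retrieve("what food do you love", personality_memory)
print(best)
```

Because the content is scored directly as flat text, no hierarchy or rule set is needed, which is roughly what searching information "stored in an unstructured form" suggests. A real system would likely use a stronger representation than raw word counts.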
Opening claim text (preview).
The invention claimed is:
1. A system for emulating a subject, to allow a user to interact with a computer-generated talking head with a face and a voice of the subject, said system comprising: processing circuitry, a user interface, and a personality memory, the user interface being configured to emulate the subject, by displaying a talking head, which comprises the face of the subject, and output speech from the mouth of the face with the voice of the subject, the user interface further comprising a receiver to receive a query from the user, the emulated subject being configured to respond to the query received from the user, wherein the processing circuitry is configured to generate a response to the query inputted by the user from the user interface, the response to be outputted by the talking head, the response being generated by retrieving information from said personality memory, said personality memory storing content created by or about the subject, the response being an expressive response such that the face and the voice demonstrate expression, determine the expression with which to output the generated response, convert said response into a sequence of acoustic units using a statistical model, said statistical model comprising a plurality of model parameters, said model parameters being derived from said personality memory, the model parameters describing probability distributions that relate an acoustic unit to an image vector and a speech vector for an associated expression, said image vector comprising a plurality of parameters that define the face of the subject, and said speech vector comprising a plurality of parameters that define the voice of the subject, and output a sequence of speech vectors and image vectors, which are synchronized such that the head appears to talk.
2. The system according to claim 1, wherein the content created by or about the subject comprises posts collected from social media websites, e-mails, and other content from or about the subject that has been provided to the personality memory.
3. The system according to claim 1, wherein the processing circuitry is further configured to navigate a set of rules stored in said personality memory to generate the response.
4. The system according to claim 1, wherein the processing circuitry is further configured to retrieve a response from said personality memory by searching information that has been stored in said personality memory in an unstructured form.
5. The system according to claim 4, wherein the processing circuitry is further configured to search said information stored in a non-hierarchical form using a word-vector or an n-gram search model.
6. The system according to claim 1, wherein the processing circuitry is further configured to interpret said query, and based on said interpretation, generate said response using a set of rules stored in said personality memory or by searching information stored in an unstructured form.
7. The system according to claim 1, wherein the model parameter in each probability distribution in said associated expression is expressed as a weighted sum of parameters of the same type, and wherein the weighting used is expression dependent, such that converting said sequence of acoustic units to a sequence of image vectors by the processing circuitry comprises retrieving the expression dependent weights for said selected expression.
8. The system according to claim 7, wherein the parameters are provided in clusters and each cluster comprises at least one sub-cluster, and wherein said expression dependent weights are retrieved by the processing circuitry for each cluster such that there is one weight per sub-cluster.
9. The system according to claim 7, wherein the processing circuitry is further configured to extract expressive features from said response to form an expressive linguistic feature vector constructed in a first space, and map said expressive linguistic feature vector to an expressive synthesis feature vector that is constructed in a second space, said expressive linguistic feature vector being related to the model parameters of said statistical model.
10. The system according to claim 9, wherein the processing circuitry is further configured to extract the expressive features from said response to form the expressive linguistic feature vector constructed in the first space, and map said expressive linguistic feature vector to the said expression dependent weights.
11. The system according to claim 1, wherein said image vector comprises parameters that allow the face to be constructed by the processing circuitry from a weighted sum of modes using weighting parameters, and wherein the modes represent reconstructions of the face or a part thereof.
12. The system according to claim 11, wherein the modes comprise modes to represent shape and appearance of the face.
13. The system according to claim 11, wherein a same weighting parameter is used by the processing circuitry for a shape mode and a corresponding appearance mode.
14. A system for generating a personality file, said personality file being used to store information relating to the speech, the face and dialogue intelligence of the subject such that the subject can be emulated using the system for emulating the subject of claim 1, said personality file being stored in said personality memory, the system for generating a personality file comprising: a particular interface for the system for generating the particular personality file, the particular interface inputting information identifying content created by or about the subject; an audio-visual recording system configured to record the voice and the face of the subject, when reading known text, while using a range of different emotions; and circuitry configured to: curate said information identifying content created by or about said user, said curation comprising organizing said content into documents and building an n-gram language model for said documents, and a word vector model for each document; and produce said statistical model, said statistical model comprising the plurality of model parameters describing probability distributions that relate an acoustic unit to an image vector and the speech vector, said image vector comprising the plurality of parameters that define the face of the subject and said speech vector comprising a plurality of parameters that define the voice of the subject, the circuitry being further configured to train said statistical model such that a sequence of speech vectors and image vectors, which are synchronized when outputted, cause the generated head to appear to talk.
15. A method for emulating a subject, to allow a user to interact with a computer-generated talking head with a face and a voice of the subject, the method comprising: receiving a user inputted query; generating a response to the query inputted by a user from a user interface, the response to be outputted by the talking head, the response being generated by retrieving information from a personality memory, said personality memory storing content created by or about the subject, the response being an expressive response such that the face and the voice demonstrate expression; and outputting said response by displaying a talking head that comprises the face of the subject, and output speech from the mouth of the face with the voice of the subject, wherein said talking head outputs said response by converting said response into a sequence of acoustic units using a statistical model, said statistical model comprising a plurality of model parameters,
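Claims 7 and 8 describe each model parameter as a weighted sum of parameters of the same type, with weights that depend on the selected expression. The following is a minimal sketch of that idea for a mean vector only. The function name `expressive_mean`, the flat list of cluster means, the two expression labels and every numeric value are assumptions made for illustration; the claim preview does not give the actual parameterisation.

```python
def expressive_mean(cluster_means, weights):
    """Combine per-cluster mean vectors into one mean using expression weights."""
    if len(cluster_means) != len(weights):
        raise ValueError("need exactly one weight per cluster mean")
    dim = len(cluster_means[0])
    return [
        sum(w * mean[i] for w, mean in zip(weights, cluster_means))
        for i in range(dim)
    ]


# Hypothetical values, invented for illustration only.
cluster_means = [
    [1.0, 0.0],
    [0.0, 2.0],
    [4.0, 4.0],
]

# One weight per cluster mean for each expression (assumed labels).
expression_weights = {
    "neutral": [1.0, 0.0, 0.0],
    "happy": [0.5, 0.5, 0.25],
}

neutral_mean = expressive_mean(cluster_means, expression_weights["neutral"])
happy_mean = expressive_mean(cluster_means, expression_weights["happy"])
print(neutral_mean, happy_mean)
```

Under this reading, switching expression means looking up a different weight vector while the shared cluster parameters stay fixed, which matches the claim language about "retrieving the expression dependent weights for said selected expression".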
by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation · CPC title
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title
of characters, e.g. humans, animals or virtual beings · CPC title
Machine learning · CPC title