Generating audio rendering from textual content based on character models
US-2019043474-A1 · Feb 7, 2019 · US
US10600404B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10600404-B2 |
| Application number | US-201715826149-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 29, 2017 |
| Priority date | Nov 29, 2017 |
| Publication date | Mar 24, 2020 |
| Grant date | Mar 24, 2020 |
Official abstract text for this publication.
Embodiments of systems, apparatuses, and/or methods are disclosed for automatic speech imitation. An apparatus may include a machine learner to perform an analysis of tagged data that is to be generated based on a speech pattern and/or a speech context behavior in media content. The machine learner may further generate, based on the analysis, a trained speech model that is to be applied to the media content to transform speech data to mimic data. The apparatus may further include a data analyzer to perform an analysis of the speech pattern, the speech context behavior, and/or the tagged data. The data analyzer may further generate, based on the analysis, a programmed speech rule that is to be applied to transform the speech data to the mimic data.
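As an illustration of the rule-based path in the abstract, the following is a minimal sketch of how a programmed speech rule might transform speech data into mimic data. It is not taken from the patent: the `SpeechRule` class, the `apply_rules` function, the regex-based trigger, the tag strings, and the sample character are all hypothetical names chosen for this example. The sketch assumes that speech data is plain text and that mimic data carries both the modified text and tags for a speech device.

```python
import random
import re
from dataclasses import dataclass


@dataclass
class SpeechRule:
    """Hypothetical 'programmed speech rule' for one speech object (character)."""
    speech_object: str   # character whose speech is imitated (assumed field)
    trigger: str         # regex standing in for a response trigger (assumed)
    replacement: str     # replacement behavior applied when the trigger matches
    probability: float   # rate at which the rule is applied, from 0.0 to 1.0
    tag: str             # hint for how a speech device should render the output


def apply_rules(speech_data, rules, rng=None):
    """Transform plain-text speech data into a mimic-data dict (text plus tags)."""
    rng = rng or random.Random()
    text = speech_data
    tags = []
    for rule in rules:
        # Apply a rule only when its trigger matches and the random draw
        # falls under the configured probability.
        if re.search(rule.trigger, text) and rng.random() < rule.probability:
            text = re.sub(rule.trigger, rule.replacement, text)
            tags.append(rule.tag)
    return {"text": text, "tags": tags}


# Illustrative rules for a made-up "pirate" speech object.
rules = [
    SpeechRule("pirate", r"\bmy\b", "me", 1.0, "voice:gravelly"),
    SpeechRule("pirate", r"\bHello\b", "Ahoy", 0.0, "emphasis:strong"),
]

mimic = apply_rules("Hello, where is my ship?", rules, random.Random(0))
print(mimic)
```

The first rule has probability 1.0 and always fires when its trigger matches; the second has probability 0.0 and never fires. Intermediate values would make a behavior appear only some of the time. A trained speech model, the other path the abstract describes, would replace the hand-written rules with learned ones and is not shown here.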
Opening claim text (preview).
I claim:
1. A computer system comprising: a training data provider to provide training data and generate one or more of a trained speech model or a programmed speech rule, wherein the one or more of the trained speech model or the programmed speech rule are to be applied to transform speech data to mimic data, and the training data provider includes a condition evaluator to determine a factor associated with one or more of a speech pattern or a speech context behavior, wherein the factor is to include one or more of a usage frequency, a response trigger, or a bias, and one or more of a machine learner to: perform a machine learning analysis of tagged data that is to be generated based on one or more of a speech pattern or a speech context behavior in media content; and generate, based on the machine learning analysis, the trained speech model that is to be applied to transform the speech data to the mimic data, or a data analyzer to: perform a data analysis of one or more of a speech pattern, a speech context behavior, or tagged data; and generate, based on the data analysis, the programmed speech rule that is to be applied to transform the speech data to the mimic data; and a speech device to output imitated speech based on the mimic data.
2. The system of claim 1, further including: a speech pattern identifier to identify one or more of an ordered speech pattern, a literary point of view, or a disordered speech pattern in the media content; and a context behavior identifier to identify one or more of a trained behavior, a replacement behavior, or an additive behavior in the media content.
3. The system of claim 1, further including a media content tagger to: modify the media content with a speech pattern tag; and modify the media content with a speech context behavior tag.
4. The system of claim 3, wherein the machine learner is to: learn the speech pattern, associated with the machine learning analysis, based on the speech pattern tag; and learn the speech context behavior, associated with the machine learning analysis, based on the speech context behavior tag.
5. The system of claim 1, wherein the factor includes the usage frequency, the response trigger, and the bias.
6. The system of claim 1, further including a user interface to: provide a character input field for a configurable speech object value defining a speech object for which one or more of a selected speech pattern or a selected speech context behavior is to be applied; and provide a probability input field for a configurable probability value defining a rate at which one or more of the selected speech pattern, the selected speech context behavior, or a selected factor thereof is to be applied.
7. The system of claim 1, wherein the training data provider further includes a context mapper to: evaluate a reference data set that includes characteristic speech for a speech object and one or more of a context of a characteristic response by the speech object or the characteristic response, and map, for an identified speech context behavior, one or more of a feature associated with the characteristic response, available speech data based on the characteristic response that is to be available to modify audio data, or an identification of the speech object for which the available speech data is to be applied to generate the mimic data.
8. The system of claim 1, wherein the training data provider further includes a speech data modifier to modify the speech data to generate the mimic data based on one or more of the trained speech model or the programmed speech rule, wherein the mimic data includes one or more of text of the mimicked speech or a tag that is to instruct the speech device how to output the mimicked speech.
9. An apparatus comprising: a training data provider to generate one or more of a trained speech model or a programmed speech rule, wherein the one or more of the trained speech model or the programmed speech rule are to be applied to transform speech data to mimic data, and the training data provider includes a condition evaluator to determine a factor associated with one or more of a speech pattern or a speech context behavior, wherein the factor is to include one or more of a usage frequency, a response trigger, or a bias; and one or more of a machine learner to: perform a machine learning analysis of tagged data that is to be generated based on one or more of a speech pattern or a speech context behavior in media content; and generate, based on the machine learning analysis, the trained speech model that is to be applied to transform the speech data to the mimic data; or a data analyzer to: perform a data analysis of one or more of a speech pattern, a speech context behavior, or tagged data; and generate, based on the data analysis, the programmed speech rule that is to be applied to transform the speech data to the mimic data.
10. The apparatus of claim 9, further including: a speech pattern identifier to identify one or more of an ordered speech pattern, a literary point of view, or a disordered speech pattern in the media content; and a context behavior identifier to identify one or more of a trained behavior, a replacement behavior, or an additive behavior in the media content.
11. The apparatus of claim 9, further including a media content tagger to: modify the media content with a speech pattern tag; and modify the media content with a speech context behavior tag.
12. The apparatus of claim 11, wherein the machine learner is to: learn the speech pattern, associated with the machine learning analysis, based on the speech pattern tag; and learn the speech context behavior, associated with the machine learning analysis, based on the speech context behavior tag.
13. The apparatus of claim 9, wherein the factor includes the usage frequency, the response trigger, and the bias.
14. The apparatus of claim 9, further including a user interface to: provide a character input field for a configurable speech object value defining a speech object for which one or more of a selected speech pattern or a selected speech context behavior is to be applied; and provide a probability input field for a configurable probability value defining a rate at which one or more of the selected speech pattern, the selected speech context behavior, or a selected factor thereof is to be applied.
15. The apparatus of claim 9, wherein the training data provider further includes a context mapper to: evaluate a reference data set that includes characteristic speech for a speech object and one or more of a context of a characteristic response by the speech object or the characteristic response; and map, for an identified speech context behavior, one or more of a feature associated with the characteristic response, available speech data based on the characteristic response that is to be available to modify audio data, or an identification of the speech object for which the available speech data is to be applied to generate the mimic data.
16. The apparatus of claim 9, wherein the training data provider further includes a speech data modifier to modify the speech data to generate the mimic data based on one or more of the trained speech model or the programmed speech rule, wherein the mimic data includes one or more of text of the mimicked speech or a tag that is to instruct a speech device how to output the mimicked speech.
17. At least one non-transitory computer readable storage medium comprising a set of instructions which, when executed by a device, ca
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Interaction with lists of selectable items, e.g. menus · CPC title
Parsing · CPC title
Architecture of speech synthesisers · CPC title
Execution procedure of a spoken command · CPC title