Dynamic embedded recognizer and preloading on client devices grammars for recognizing user inquiries and responses
US-2015379568-A1 · Dec 31, 2015 · US
US9626695B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9626695-B2 |
| Application number | US-201414451151-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 4, 2014 |
| Priority date | Jun 26, 2014 |
| Publication date | Apr 18, 2017 |
| Grant date | Apr 18, 2017 |
Official abstract text for this publication.
An automated communication system with an associated method for presenting customized voices is disclosed. The system which performs a predetermined task accepts information regarding an intended user indicating the intended user's identity, preferences, etc. Next, the system customizes one or more voices for the intended user based on the accepted information. The system then presents to the intended user one or more audible communications converted from text associated with a predetermined task performed by the system using the one or more customized voices.
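The three steps in the abstract (accept information about the intended user, customize a voice, present audible communications) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: `UserInfo`, `Voice`, `customize_voice`, and `present` are hypothetical names, the adjustment rules and numeric scales are invented, and the text-to-speech step is replaced by a plain dictionary so the sketch runs without an audio library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UserInfo:
    """Accepted information about the intended user (hypothetical fields)."""
    identity: str
    prefers_quiet: bool = False
    prefers_low_pitch: bool = False


@dataclass(frozen=True)
class Voice:
    """A text-to-speech voice described by two audio properties."""
    name: str
    volume: float  # assumed scale: 0.0 (silent) to 1.0 (full)
    pitch: float   # assumed scale: multiplier relative to a base voice


def customize_voice(base: Voice, user: UserInfo) -> Voice:
    """Derive a customized voice from the accepted user information."""
    voice = base
    if user.prefers_quiet:
        # Invented rule: halve the volume for users who prefer quiet output.
        voice = replace(voice, volume=round(voice.volume * 0.5, 2))
    if user.prefers_low_pitch:
        # Invented rule: lower the pitch by 20 percent.
        voice = replace(voice, pitch=round(voice.pitch * 0.8, 2))
    return replace(voice, name=f"{base.name}-for-{user.identity}")


def present(text: str, voice: Voice) -> dict:
    """Stand-in for text-to-speech: return the request a synthesizer would receive."""
    return {
        "text": text,
        "voice": voice.name,
        "volume": voice.volume,
        "pitch": voice.pitch,
    }
```

A caller would build a `UserInfo`, pass it with a base `Voice` to `customize_voice`, and hand the result to `present` together with the text for the task (for example, an invitation). Keeping the voice as immutable data makes it easy to store several customized voices per user.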
Opening claim text (preview).
The invention claimed is:
1. At least one non-transitory computer-readable medium, carrying instructions, which when executed by at least one hardware data processor, performs a method of presenting customized voices by an automated communication system, the method comprising: accepting information regarding an intended user of the automated communication system, wherein accepting the information includes: i) analyzing text and voice input from the user, and ii) deducing an interaction style of the user based on the text and voice input from the user, wherein the user's interaction style indicates a mood of the user and an interaction style that the user desires, and wherein the mood includes being available or busy and wherein the interaction style includes a formal or an informal style, and iii) storing data regarding the user's interaction style; customizing or selecting one of multiple text-to-speech voices based on the accepted information, wherein the multiple voices are audio properties of the multiple voices that vary by volume and pitch, and wherein the multiple voices include an irreverent voice and a succinct voice; and presenting, to the user and for audible output to the user on a computing device of the user, one or more audible communications for a task performed by the automated communication system using the one or more customized or selected voices.
2. The non-transitory computer-readable medium of claim 1, wherein the automated communication system is a server, and wherein the task is a computerized event invitation provided by the server to the computing device of the user, a computerized greeting card provided by the server to the computing device of the user, a computerized game provided by the server to the computing device of the user, or a computerized storybook provided by the server to the computing device of the user, and wherein the server computer also provides to the user, via the computing device, a spoken advertisement.
3. The non-transitory computer-readable medium of claim 1, further comprising storing multiple predetermined voices, wherein customizing or selecting the one of multiple voices includes selecting the one voice from the stored voices, and wherein the voices are each different computer-generated voices.
4. The non-transitory computer-readable medium of claim 1, further comprising storing multiple predetermined voice components, wherein the customizing or selecting the one of multiple voices includes synthesizing the one voice using one or more of the stored voice components.
5. The non-transitory computer-readable medium of claim 1, wherein the intended user interacts with the automated communication system, and the one or more audible communications are presented in response to inquiries from the user, and wherein the method further comprises customizing or selecting the one or more voices based on a nature of one of the inquiries or a state of the task.
6. The non-transitory computer-readable medium of claim 1, wherein the one voice is selected based on an occasion and an intended audience to receive the one or more audible communications, and wherein the task is an electronic invitation to attend a child's party, to attend a woman's wedding, or to attend a professor's retirement party.
7. The non-transitory computer-readable medium of claim 1, wherein the one or more audible communications correspond to text associated with the task, wherein the task is a game that uses multiple, different voices to present different questions depending on contestants playing the game, depending on question categories, and depending on prizes involved.
8. The non-transitory computer-readable medium of claim 1, wherein the accepted information indicates a profession or a hobby of the user.
9. The non-transitory computer-readable medium of claim 1, further comprising: identifying multiple stages of or roles in the task; and assigning different customized or selected voices to the stages or roles, wherein the presenting is performed based on a current stage of or an active role in the task.
10. A method of operating an automated communication system to present customized voices, the method comprising: accepting information regarding an intended user of the automated communication system, wherein accepting the information includes: i) analyzing text and voice input from the user, and ii) deducing an interaction style of the user based on the text and voice input from the user, wherein the user's interaction style indicates a mood of the user and an interaction style that the user desires, and wherein the mood includes being available or busy and wherein the interaction style includes a formal or an informal style, and iii) storing data regarding the user's interaction style; customizing or selecting one of multiple text-to-speech voices based on the accepted information, wherein the multiple voices are audio properties of the multiple voices that vary by volume and pitch, wherein the customizing or selecting is performed by a hardware processor, and wherein the multiple voices include an irreverent voice and a succinct voice; and presenting, to the user and for audible output to the user on a computing device of the user, one or more audible communications for a task performed by the automated communication system using the one or more customized or selected voices.
11. The method of claim 10, wherein the automated communication system is a server, and wherein the task is a computerized event invitation provided by the server to the computing device of the user, a computerized greeting card provided by the server to the computing device of the user, a computerized game provided by the server to the computing device of the user, or a computerized storybook provided by the server to the computing device of the user, and wherein the server computer also provides to the user, via the computing device, a spoken advertisement.
12. The method of claim 10, further comprising storing multiple predetermined voices, wherein customizing or selecting the one of multiple voices includes selecting the one voice from the stored voices, and wherein the voices are each different computer-generated voices.
13. The method of claim 10, further comprising storing multiple predetermined voice components, wherein the customizing or selecting the one of multiple voices includes synthesizing the one voice using one or more of the stored voice components.
14. The method of claim 10, wherein the intended user interacts with the automated communication system, and the one or more audible communications are presented in response to inquiries from the user, and wherein the method further comprises customizing or selecting the one or more voices based on a nature of one of the inquiries or a state of the task.
15. The method of claim 10, wherein the one voice is selected based on an occasion and an intended audience to receive the one or more audible communications, and wherein the task is an electronic invitation to attend a child's party, to attend a woman's wedding, or to attend a professor's retirement party.
16. The method of claim 10, wherein the one or more audible communications correspond to text associated with the task, wherein the task is a game that uses multiple, different voices to present different questions depending on contestants playing the game, depending on question categories, and depending on prizes involved.
17. The method of claim 10, wherein the accepted information indicates a
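Claims 1 and 10 recite deducing a mood (available or busy) and an interaction style (formal or informal) from user input and then selecting a voice, and claim 9 recites assigning voices to stages or roles of a task. The claims do not say how the deduction is done, so the sketch below is only one possible reading: the cue-word sets, the mapping from style to voice, the `neutral` fallback voice, and all function names are hypothetical, and only text input is handled even though the claims also mention voice input.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical cue words; the claims do not specify a deduction technique.
INFORMAL_CUES = {"hey", "lol", "yeah", "gonna"}
BUSY_CUES = {"quick", "hurry", "asap", "busy", "later"}


@dataclass(frozen=True)
class InteractionStyle:
    mood: str   # "available" or "busy"
    style: str  # "formal" or "informal"


def deduce_style(user_text: str) -> InteractionStyle:
    """Deduce mood and style from text using simple keyword matching."""
    words = {w.strip(".,!?").lower() for w in user_text.split()}
    mood = "busy" if words & BUSY_CUES else "available"
    style = "informal" if words & INFORMAL_CUES else "formal"
    return InteractionStyle(mood, style)


def select_voice(s: InteractionStyle) -> str:
    """Assumed mapping from the deduced style to one of the stored voices."""
    if s.mood == "busy":
        return "succinct"    # a busy user gets the succinct voice
    if s.style == "informal":
        return "irreverent"  # an informal, available user gets the irreverent voice
    return "neutral"         # hypothetical fallback, not named in the claims


def voice_for_stage(stage_voices: Mapping[str, str],
                    current_stage: str,
                    default: str) -> str:
    """Pick the voice assigned to the current stage or role, else a default."""
    return stage_voices.get(current_stage, default)
```

In a real system the keyword matching would likely be replaced by a trained classifier over both text and audio features; the point here is only the shape of the data flow from input, to stored style, to selected voice.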
Voice editing, e.g. manipulating the voice of the synthesiser · CPC title
based on user history · CPC title
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title
Wireless devices · CPC title
User requested · CPC title