System and method for customized voice response
US-2016240191-A1 · Aug 18, 2016 · US
US10468014B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10468014-B1 |
| Application number | US-201916268937-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 6, 2019 |
| Priority date | Feb 6, 2019 |
| Publication date | Nov 5, 2019 |
| Grant date | Nov 5, 2019 |
Official abstract text for this publication.
A device causes a communication session to be established between the device and a user device to allow the device and the user device to communicate speech, and receives user speech from the user device. The device processes the user speech using a natural language processing technique to determine a plurality of characteristics of the user speech, and updates a speech generation setting of a plurality of speech generation settings based on the plurality of characteristics of the user speech. The device generates, after updating the speech generation setting, device speech using a text-to-speech technique based on the speech generation setting, and sends the device speech to the user device.
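The update step in the abstract (characteristics of user speech feeding into speech generation settings) can be pictured with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `SpeechSettings` class, its field names and default values, and the `update_settings` function. The claims recite a trained machine learning model that relates settings to characteristics; this sketch swaps in a plain direct copy of each measured value, purely for illustration.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SpeechSettings:
    """Hypothetical speech generation settings; names and defaults are assumptions."""
    rate_wpm: float = 150.0
    loudness_db: float = 60.0


def update_settings(settings: SpeechSettings, characteristics: dict) -> SpeechSettings:
    """Return new settings in which each recognised characteristic replaces the old value.

    Assumption: a direct copy stands in for the trained model recited in the claims.
    Characteristics with no corresponding setting are ignored.
    """
    known = {f.name for f in fields(settings)}
    matched = {k: v for k, v in characteristics.items() if k in known}
    return replace(settings, **matched)


# First device speech would use the defaults; after hearing a slower caller,
# the second device speech would use the updated rate.
initial = SpeechSettings()
updated = update_settings(initial, {"rate_wpm": 110.0, "accent": "unmapped"})
```

The settings object is immutable, so the values used for the first device speech remain available for comparison with those used for the second.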
Opening claim text (preview).
What is claimed is:
1. A device, comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories, configured to: cause a communication session to be established between the device and a user device to allow the device and the user device to communicate speech; generate first device speech based on a speech setting of the device, the first device speech having a first rate of device speech; send the first device speech to the user device; receive, after sending the first device speech, first user speech from the user device; determine a rate of user speech of the first user speech; update the speech setting to match the rate of user speech, wherein the one or more processors, when updating the speech setting, are to: determine, using a trained machine learning model, a relationship between the speech setting and the rate of user speech; and update the speech setting to match the rate of user speech based on the relationship; generate, after updating the speech setting, second device speech based on the speech setting, the second device speech having a second rate of device speech, different than the first rate of device speech, that matches the rate of user speech; and send the second device speech to the user device.
2. The device of claim 1, wherein the one or more processors, when determining the rate of speech of the first user speech, are to: process the first user speech using a natural language processing technique to determine one or more words of the first user speech; determine a number of words of the first user speech based on the one or more words of the first user speech; determine a length of time of the first user speech; and determine the rate of speech of the first user speech based on the number of words of the first user speech and the length of time of the first user speech.
3. The device of claim 1, wherein the one or more processors, when generating the second device speech based on the speech setting, are to: process the first user speech using a natural language processing technique to determine first user speech content; determine, based on the first user speech content, second device speech content; and generate, based on the second device speech content, the second device speech using a text-to-speech technique that utilizes the speech setting.
4. The device of claim 1, wherein the one or more processors, when causing the communication session to be established between the device and the user device to allow the device and the user device to communicate speech, are to: receive a communication request from the user device; generate a communication response based on the communication request; and send the communication response to a different device to cause the different device to establish the communication session between the device and the user device.
5. A method, comprising: generating, by a device, first device speech based on one or more speech generation settings; sending, by the device, the first device speech to a user device; receiving, by the device after sending the first device speech, first user speech from the user device; determining, by the device, one or more characteristics of the first user speech; updating, by the device, a first set of speech generation settings of the one or more speech generation settings to match the one or more characteristics of the first user speech, wherein updating the first set of speech generation settings comprises: determining, using a trained machine learning model, a relationship between the first set of speech generation settings and the one or more characteristics of the first user speech, and updating the first set of speech generation settings to match the one or more characteristics of the first user speech based on the relationship; generating, by the device and after updating the first set of speech generation settings of the one or more speech generation settings based on the one or more characteristics of the first user speech, second device speech based on the updated first set of speech generation settings of the one or more speech generation settings, wherein at least one characteristic of the second device speech is different from at least one characteristic of the first device speech and matches the one or more characteristics of the first user speech; and sending, by the device, the second device speech to the user device.
6. The method of claim 5, further comprising: receiving, by the device after sending the second device speech, second user speech from the user device; determining, by the device, one or more characteristics of the second user speech; updating, by the device, the first set of speech generation settings or a second set of speech generation settings of the one or more speech generation settings based on the one or more characteristics of the second user speech; generating, by the device and after updating the first set of speech generation settings or the second set of speech generation settings of the one or more speech generation settings based on the one or more characteristics of the second user speech, third device speech based on the updated first set of speech generation settings or second set of speech generation settings of the one or more speech generation settings, wherein at least one characteristic of the third device speech is different from the at least one characteristic of the second device speech; and sending, by the device, the third device speech to the user device.
7. The method of claim 5, wherein the one or more characteristics of the first user speech include at least one of: a rate of speech of the first user speech; a cadence of the first user speech; a loudness of the first user speech; a timbre of the first user speech; a language associated with the first user speech; a dialect associated with the first user speech; an accent associated with the first user speech; or a grammar associated with the first user speech.
8. The method of claim 5, wherein the one or more speech generation settings include at least one of: a speech setting; a speech generation cadence setting; a speech generation loudness setting; a speech generation timbre setting; a speech generation language setting; a speech generation dialect setting; a speech generation accent setting; or a speech generation grammar setting.
9. The method of claim 5, wherein determining the one or more characteristics of the first user speech comprises: determining a number of syllables of the first user speech; determining a length of time of the first user speech; and determining a rate of speech of the first user speech based on the number of syllables of the first user speech and the length of time of the first user speech.
10. The method of claim 5, wherein determining the one or more characteristics of the first user speech comprises: processing the first user speech to remove filler words; determining, after removing the filler words, a number of words of the first user speech; determining, after removing the filler words, a length of time of the first user speech; and determining a rate of speech of the first user speech based on the number of words of the first user speech and the length of time of the first user speech.
11. The method of claim 5, wherein a characteristic of the one or more characteristics of the first user speech is a language associated with the first user speech, wherein updating the first set of speech generation settings of the one or more speech generation settings based on the one or more characterist
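The rate-of-speech calculations recited in claims 2, 9, and 10 all reduce to a count divided by a length of time. The Python sketch below illustrates that arithmetic under stated assumptions: the transcript is already available as text (the patent leaves the natural language processing technique open), the filler word list is a made-up example, and the syllable counter is a crude vowel-group heuristic chosen only to keep the example short. All function names are hypothetical. For the claim 10 variant, the caller is assumed to pass the length of time measured after filler words are removed.

```python
import re

# Hypothetical filler list; the patent does not enumerate filler words.
FILLER_WORDS = {"um", "uh", "er"}


def _words(transcript: str) -> list:
    """Split a transcript into lowercase word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def words_per_minute(transcript: str, duration_s: float, drop_fillers: bool = False) -> float:
    """Word count over length of time (claim 2), optionally without fillers (claim 10)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    words = _words(transcript)
    if drop_fillers:
        words = [w for w in words if w not in FILLER_WORDS]
    return len(words) * 60.0 / duration_s


def count_syllables(transcript: str) -> int:
    """Rough syllable estimate: one syllable per run of vowels (an assumption, not exact)."""
    return sum(len(re.findall(r"[aeiouy]+", w)) for w in _words(transcript))


def syllables_per_second(transcript: str, duration_s: float) -> float:
    """Syllable count over length of time (claim 9)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return count_syllables(transcript) / duration_s


utterance = "um I would like to uh check my balance"
raw_rate = words_per_minute(utterance, 3.0)                       # 9 words in 3 s
clean_rate = words_per_minute(utterance, 3.0, drop_fillers=True)  # 7 words in 3 s
```

A production system would more likely take word timings from a speech recognizer than re-tokenize text, and would use a pronunciation dictionary for syllables; the heuristic here over-counts words with silent vowels.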
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
for comparison or discrimination · CPC title
Voice editing, e.g. manipulating the voice of the synthesiser · CPC title
to the speaker · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title