Method and system for generating textual representation of user spoken utterance
US-2020312312-A1 · Oct 1, 2020 · US
US11711469B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11711469-B2 |
| Application number | US-202117316615-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 10, 2021 |
| Priority date | May 10, 2021 |
| Publication date | Jul 25, 2023 |
| Grant date | Jul 25, 2023 |
Official abstract text for this publication.
Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving return voice string data from the user in response to the prompting data; generating a plurality of candidate text strings associated to the return voice string of the user; examining the text based data defining the prompting data; augmenting the plurality of candidate text strings in dependence on a result of the examining to provide a plurality of augmented candidate text strings associated to the return voice string data; and evaluating respective ones of the plurality of augmented candidate text strings associated to the return voice string data; and selecting one of the augmented candidate text strings as a returned transcription associated to the return voice string data.
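The flow described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the mapping table `PROMPT_TRANSFORMS`, the helper names, and the toy scorer `toy_score` (which stands in for a language model returning confidence values) are all hypothetical and chosen only to show the order of steps: transform the stored prompt text, prepend it to each candidate, score the augmented strings, and select the best one.

```python
from typing import Callable, Dict, List

# Hypothetical mapping from stored prompt text to a declarative lead-in.
# The entries are illustrative assumptions, not taken from the patent.
PROMPT_TRANSFORMS: Dict[str, str] = {
    "what state are you traveling to": "i am traveling to",
    "what is your zip code": "my zip code is",
}


def transform_prompt(prompt_text: str) -> str:
    """Look up a transformed form of the prompt; empty string if none is known."""
    key = prompt_text.lower().strip(" ?.!")
    return PROMPT_TRANSFORMS.get(key, "")


def augment(candidates: List[str], transformed: str) -> List[str]:
    """Prepend the transformed prompt text to every candidate text string."""
    if not transformed:
        return list(candidates)
    return [f"{transformed} {c}" for c in candidates]


def select_transcription(
    prompt_text: str,
    candidates: List[str],
    score: Callable[[str], float],
) -> str:
    """Score each augmented candidate and return the highest-scoring one."""
    augmented = augment(candidates, transform_prompt(prompt_text))
    return max(augmented, key=score)


def toy_score(text: str) -> float:
    """Stand-in for a language model query; the values are made up."""
    known = {
        "i am traveling to alaska": 0.9,
        "i am traveling to i'll ask her": 0.1,
    }
    return known.get(text, 0.0)


best = select_transcription(
    "What state are you traveling to?",
    ["i'll ask her", "alaska"],  # candidates an acoustic model might return
    toy_score,
)
print(best)
```

In this sketch the two candidates sound alike, and only the context supplied by the transformed prompt lets the scorer prefer one of them. A real system would replace `toy_score` with a query to an actual language model.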
Opening claim text (preview).
What is claimed is:

1. A computer implemented method comprising: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving return voice string data from the user in response to the prompting data; generating a plurality of candidate text strings associated to the return voice string data of the user; examining the text based data defining the prompting data; augmenting the plurality of candidate text strings in dependence on a result of the examining to provide a plurality of augmented candidate text strings associated to the return voice string data, wherein the augmenting includes transforming the text based data defining the prompting data to provide transformed prompting data; evaluating respective ones of the plurality of augmented candidate text strings associated to the return voice string data; and selecting one of the augmented candidate text strings as a returned transcription associated to the return voice string data.

2. The computer implemented method of claim 1, wherein the evaluating includes querying a predictive model provided by a general language model with the respective ones of the plurality of augmented candidate text strings associated to the return voice string data, and examining returned confidence level parameter values resulting from the querying.

3. The computer implemented method of claim 1, wherein the method includes, prior to the examining and the augmenting, ascertaining, using the plurality of candidate text strings, performance of a predictive language model, the predictive language model for use in performing the evaluating, determining based on the ascertaining that the predictive language model will not perform satisfactorily for the return voice string data from the user, and performing the examining and the augmenting selectively in response to the determining.

4. The computer implemented method of claim 1, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data that is referenced within a mapping data structure stored in a data repository, wherein the mapping data structure maps text strings to transformed text strings, wherein the augmenting includes using a certain transformed text string associated to the certain text string within the mapping data structure.

5. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to natural language processing for assignment of part of speech tags to the text based data, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data that matches a template text string within a mapping data structure stored in a data repository, wherein the template text string stored in data repository includes one or more term expressed in wildcard format as a part of speech.

6. The computer implemented method of claim 1, wherein the presenting includes using text to speech conversion to present the prompting data to the user in synthesized voice.

7. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to natural language processing for assignment of part of speech tags to respective terms of the text based data, and using the part of speech tags to transform text based data defining the prompting data.

8. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to part of speech tagging to provide part of speech tags associated to the text based data defining the prompting data, and wherein the augmenting includes transforming the text based data defining the prompting data using a tag of the part of speech tags to provide a transformed text string, and prepending the transformed text string to a text string of plurality candidate text strings.

9. The computer implemented method of claim 1, wherein the presenting includes using text to speech conversion to present the prompting data to the user in synthesized voice, wherein the evaluating includes querying a predictive model provided by a general language model with the respective ones of the plurality of augmented candidate text strings associated to the return voice string data.

10. The computer implemented method of claim 1, wherein the presenting includes using text to speech conversion to present the prompting data to the user in synthesized voice, wherein the generating the plurality of candidate text strings associated to the return voice string of the user includes querying a predictive acoustic model, wherein the evaluating includes querying a predictive model provided by a general language model with the respective ones of the plurality of augmented candidate text strings associated to the return voice string data, and examining returned confidence level parameter values resulting from the querying.

11. The computer implemented method of claim 1, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data.

12. The computer implemented method of claim 1, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data, and transforming the certain text string into a transformed text string using data repository stored data.

13. The computer implemented method of claim 1, wherein the augmenting includes prepending the transformed prompting data to respective candidate text strings of the plurality of candidate text strings.

14. The computer implemented method of claim 1, wherein the augmenting includes adapting respective candidate text strings of the plurality of candidate text strings using the transformed prompting data.

15. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to natural language processing for assignment of part of speech tags to the text based data.

16. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to part of speech tagging to provide part of speech tags associated to the text based data defining the prompting data.

17. A system comprising: a memory; at least one processor in communication with the memory; program instructions executable by one or more processor via the memory to perform a method comprising: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving return voice string data from the user in response to the prompting data; generating a plurality of candidate text strings associated to the return voice string data of the user; examining the text based data defining the prompting data; augmenting the plurality of candidate text strings in dependence on a result of the examining to provide a plurality of augmented candidate text strings associated to the return voice string data, wherein the augmenting includes transforming the text based data defining the prompting data to provide transformed prompting data; evaluating respective ones of the plurality of augmented candidate text strings a
Speech interaction details (speech recognition per se G10L15/00) · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
in combination with interactive voice response systems or voice portals, e.g. as front-ends · CPC title
Execution procedure of a spoken command · CPC title
using speech recognition · CPC title