Contextualized speech to text conversion

US11711469B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11711469-B2
Application number	US-202117316615-A
Country	US
Kind code	B2
Filing date	May 10, 2021
Priority date	May 10, 2021
Publication date	Jul 25, 2023
Grant date	Jul 25, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving return voice string data from the user in response to the prompting data; generating a plurality of candidate text strings associated to the return voice string of the user; examining the text based data defining the prompting data; augmenting the plurality of candidate text strings in dependence on a result of the examining to provide a plurality of augmented candidate text strings associated to the return voice string data; and evaluating respective ones of the plurality of augmented candidate text strings associated to the return voice string data; and selecting one of the augmented candidate text strings as a returned transcription associated to the return voice string data.
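
The flow described in the abstract (generate candidates, augment them using the prompt text, evaluate, select) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the one-entry rule table, and the bigram-based `toy_score` are all assumptions made for this example, standing in for the transformation logic and the language model that the abstract only names in general terms.

```python
from typing import Callable, List


def transform_prompt(prompt: str) -> str:
    """Hypothetical transform: turn a prompt question into a statement stem."""
    # Assumed rule table; the page does not list concrete transformations.
    rules = {"what state do you live in?": "i live in"}
    return rules.get(prompt.strip().lower(), "")


def augment(candidates: List[str], stem: str) -> List[str]:
    """Prepend the transformed prompt text to every candidate string."""
    return [f"{stem} {c}".strip() for c in candidates]


def toy_score(text: str) -> float:
    """Stand-in for a language model confidence value (assumption)."""
    likely = {("live", "in"), ("in", "alaska")}
    words = text.lower().split()
    bigrams = list(zip(words, words[1:]))
    if not bigrams:
        return 0.0
    # Fraction of bigrams that the toy "model" considers likely.
    return sum(b in likely for b in bigrams) / len(bigrams)


def select_transcription(
    prompt: str, candidates: List[str], score: Callable[[str], float]
) -> str:
    """Augment each candidate with prompt context and keep the best-scoring one."""
    stem = transform_prompt(prompt)
    augmented = augment(candidates, stem)
    return max(augmented, key=score)


if __name__ == "__main__":
    result = select_transcription(
        "What state do you live in?", ["I'll ask her", "Alaska"], toy_score
    )
    print(result)
```

In this toy setup the acoustically similar candidates "I'll ask her" and "Alaska" are disambiguated only after the prompt context is prepended, which is the effect the abstract attributes to the augmenting step.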

First claim

Opening claim text (preview).

What is claimed is:
1. A computer implemented method comprising: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving return voice string data from the user in response to the prompting data; generating a plurality of candidate text strings associated to the return voice string data of the user; examining the text based data defining the prompting data; augmenting the plurality of candidate text strings in dependence on a result of the examining to provide a plurality of augmented candidate text strings associated to the return voice string data, wherein the augmenting includes transforming the text based data defining the prompting data to provide transformed prompting data; evaluating respective ones of the plurality of augmented candidate text strings associated to the return voice string data; and selecting one of the augmented candidate text strings as a returned transcription associated to the return voice string data.
2. The computer implemented method of claim 1, wherein the evaluating includes querying a predictive model provided by a general language model with the respective ones of the plurality of augmented candidate text strings associated to the return voice string data, and examining returned confidence level parameter values resulting from the querying.
3. The computer implemented method of claim 1, wherein the method includes, prior to the examining and the augmenting, ascertaining, using the plurality of candidate text strings, performance of a predictive language model, the predictive language model for use in performing the evaluating, determining based on the ascertaining that the predictive language model will not perform satisfactorily for the return voice string data from the user, and performing the examining and the augmenting selectively in response to the determining.
4. The computer implemented method of claim 1, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data that is referenced within a mapping data structure stored in a data repository, wherein the mapping data structure maps text strings to transformed text strings, wherein the augmenting includes using a certain transformed text string associated to the certain text string within the mapping data structure.
5. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to natural language processing for assignment of part of speech tags to the text based data, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data that matches a template text string within a mapping data structure stored in a data repository, wherein the template text string stored in data repository includes one or more term expressed in wildcard format as a part of speech.
6. The computer implemented method of claim 1, wherein the presenting includes using text to speech conversion to present the prompting data to the user in synthesized voice.
7. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to natural language processing for assignment of part of speech tags to respective terms of the text based data, and using the part of speech tags to transform text based data defining the prompting data.
8. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to part of speech tagging to provide part of speech tags associated to the text based data defining the prompting data, and wherein the augmenting includes transforming the text based data defining the prompting data using a tag of the part of speech tags to provide a transformed text string, and prepending the transformed text string to a text string of plurality candidate text strings.
9. The computer implemented method of claim 1, wherein the presenting includes using text to speech conversion to present the prompting data to the user in synthesized voice, wherein the evaluating includes querying a predictive model provided by a general language model with the respective ones of the plurality of augmented candidate text strings associated to the return voice string data.
10. The computer implemented method of claim 1, wherein the presenting includes using text to speech conversion to present the prompting data to the user in synthesized voice, wherein the generating the plurality of candidate text strings associated to the return voice string of the user includes querying a predictive acoustic model, wherein the evaluating includes querying a predictive model provided by a general language model with the respective ones of the plurality of augmented candidate text strings associated to the return voice string data, and examining returned confidence level parameter values resulting from the querying.
11. The computer implemented method of claim 1, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data.
12. The computer implemented method of claim 1, wherein the augmenting includes identifying a certain text string within the text based data defining the prompting data, and transforming the certain text string into a transformed text string using data repository stored data.
13. The computer implemented method of claim 1, wherein the augmenting includes prepending the transformed prompting data to respective candidate text strings of the plurality of candidate text strings.
14. The computer implemented method of claim 1, wherein the augmenting includes adapting respective candidate text strings of the plurality of candidate text strings using the transformed prompting data.
15. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to natural language processing for assignment of part of speech tags to the text based data.
16. The computer implemented method of claim 1, wherein the examining includes subjecting the text based data defining the prompting data to part of speech tagging to provide part of speech tags associated to the text based data defining the prompting data.
17. A system comprising: a memory; at least one processor in communication with the memory; program instructions executable by one or more processor via the memory to perform a method comprising: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving return voice string data from the user in response to the prompting data; generating a plurality of candidate text strings associated to the return voice string data of the user; examining the text based data defining the prompting data; augmenting the plurality of candidate text strings in dependence on a result of the examining to provide a plurality of augmented candidate text strings associated to the return voice string data, wherein the augmenting includes transforming the text based data defining the prompting data to provide transformed prompting data; evaluating respective ones of the plurality of augmented candidate text strings a
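
Claims 4, 5, 8, and 13 mention a mapping data structure whose template strings may contain part-of-speech wildcards, and prepending the transformed string to the candidates. One way this could look is sketched below. Everything concrete here is an assumption for illustration: the `<NOUN>` wildcard syntax, the single template in `MAPPING`, and the hand-written part-of-speech tags (a real system would obtain tags from a natural language processing step, which is not shown).

```python
from typing import Dict, List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (term, part-of-speech tag)

# Hypothetical mapping data structure: template -> transformed text string.
# Terms in angle brackets are part-of-speech wildcards (assumed syntax).
MAPPING: Dict[Tuple[str, ...], str] = {
    ("what", "<NOUN>", "do", "you", "live", "in"): "i live in the <NOUN> of",
}


def match_template(tagged: Tagged, template: Tuple[str, ...]) -> Optional[Dict[str, str]]:
    """Return wildcard bindings if the tagged prompt matches the template."""
    if len(tagged) != len(template):
        return None
    bound: Dict[str, str] = {}
    for (term, tag), slot in zip(tagged, template):
        if slot.startswith("<") and slot.endswith(">"):
            # Wildcard slot: the term's tag must equal the wildcard's tag.
            if tag != slot[1:-1]:
                return None
            bound[slot] = term
        elif term.lower() != slot:
            # Literal slot: the term itself must match.
            return None
    return bound


def transform(tagged: Tagged) -> Optional[str]:
    """Look up the first matching template and fill in its wildcards."""
    for template, out in MAPPING.items():
        bound = match_template(tagged, template)
        if bound is not None:
            for slot, term in bound.items():
                out = out.replace(slot, term)
            return out
    return None


def prepend(stem: str, candidates: List[str]) -> List[str]:
    """Prepend the transformed string to each candidate text string."""
    return [f"{stem} {c}" for c in candidates]


if __name__ == "__main__":
    prompt = [("What", "PRON"), ("state", "NOUN"), ("do", "AUX"),
              ("you", "PRON"), ("live", "VERB"), ("in", "ADP")]
    stem = transform(prompt)
    if stem is not None:
        print(prepend(stem, ["Alaska", "I'll ask her"]))
```

Because the noun slot is a wildcard, the same template would also match a prompt such as "What city do you live in", so one mapping entry can cover a family of prompts.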

Assignees

IBM

Inventors

Classifications

H04M3/4936 · Primary
Speech interaction details (speech recognition per se G10L15/00) · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
H04M3/5166
in combination with interactive voice response systems or voice portals, e.g. as front-ends · CPC title
G10L2015/223
Execution procedure of a spoken command · CPC title
H04M2201/40
using speech recognition · CPC title
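
CPC symbols such as those listed above follow a fixed layout: a section letter, a two-digit class, a subclass letter, and a main group and subgroup separated by a slash. The sketch below splits a symbol into those parts. The `CpcSymbol` type, the regular expression, and the two-entry `SECTION_TITLES` table are illustrative assumptions, not part of this page or of any official tooling.

```python
import re
from typing import NamedTuple


class CpcSymbol(NamedTuple):
    section: str      # e.g. "H"
    cls: str          # two-digit class, e.g. "04"
    subclass: str     # e.g. "M"
    main_group: str   # e.g. "3"
    subgroup: str     # e.g. "4936"


# Only the sections that appear on this page are listed here.
SECTION_TITLES = {"G": "Physics", "H": "Electricity"}

_CPC = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol like 'H04M3/4936' into its components."""
    m = _CPC.match(symbol.strip())
    if m is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    return CpcSymbol(*m.groups())


if __name__ == "__main__":
    primary = parse_cpc("H04M3/4936")
    print(primary, SECTION_TITLES.get(primary.section))
```

The section letter is what ties the primary classification H04M3/4936 to the "Electricity" technology area mentioned in the FAQ.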

Patent family

Related publications grouped by family.

View patent family 83901820

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11711469B2 cover? Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: determining, in performance of an interactive voice response (IVR) session, prompting data for presenting to a user, and storing text based data defining the prompting data into a data repository; presenting the prompting data to the user; receiving retur…
Who is the assignee on this patent? IBM
What technology area does this patent fall under? Primary CPC classification H04M3/4936. Mapped technology areas include Electricity.
When was this patent published? Publication date Jul 25, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Method and system for generating textual representation of user spoken utterance

Dialog based speech recognition

Semantic representation and realization for conversational systems

Speech recognition with sequence-to-sequence models

Data classifier

Methods and systems for correcting transcribed audio files

Systems and methods for improving the accuracy of a transcription using auxiliary data such as personal data