Cognitive intervention for voice recognition failure

US10971147B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10971147-B2
Application number: US-201916295463-A
Country: US
Kind code: B2
Filing date: Mar 7, 2019
Priority date: Feb 1, 2017
Publication date: Apr 6, 2021
Grant date: Apr 6, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an aspect of the present disclosure, a method for providing an alternate modality of input for filling a form field in response to a failure of voice recognition is disclosed including prompting the user for information corresponding to a field of a form, generating speech data by capturing a spoken response of the user to the prompt using at least one input device, attempting to convert the speech data to text, determining that the attempted conversion has failed, evaluating the failure using at least one speech rule, selecting, based on the evaluation, an alternate input modality to be used for receiving the information corresponding to the field of the form, receiving the information corresponding to the field of the form from the alternate input modality, and injecting the received information into the field of the form.
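
The flow in the abstract (prompt, capture, attempt conversion, detect failure, evaluate with a speech rule, select an alternate modality, receive, inject) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Recognition` structure, the `SPEECH_RULES` table, the failure-reason strings, and the callables passed in for recognition and alternate input are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Recognition:
    """Result of one speech-to-text attempt (hypothetical structure)."""
    text: Optional[str]        # None when the conversion failed
    failure_reason: str = ""   # e.g. "noise" or "low_confidence" (assumed labels)


# Hypothetical speech rules: map a failure reason to an alternate input modality.
SPEECH_RULES: Dict[str, str] = {
    "noise": "keyboard",
    "low_confidence": "touchscreen",
}
DEFAULT_MODALITY = "keyboard"


def fill_field(
    field_name: str,
    form: Dict[str, str],
    recognize: Callable[[str], Recognition],
    alternate_inputs: Dict[str, Callable[[str], str]],
) -> str:
    """Fill one form field by voice, falling back to another modality on failure.

    Returns the name of the modality that supplied the value.
    """
    prompt = f"Please say your {field_name}"
    # Prompt the user, capture the spoken response, and attempt conversion.
    result = recognize(prompt)
    if result.text is not None:
        form[field_name] = result.text
        return "voice"
    # Conversion failed: evaluate the failure against the rules and pick a modality.
    modality = SPEECH_RULES.get(result.failure_reason, DEFAULT_MODALITY)
    value = alternate_inputs[modality](prompt)
    # Inject the information received from the alternate modality into the field.
    form[field_name] = value
    return modality
```

Passing the recognizer and the alternate inputs as callables keeps the sketch independent of any particular speech engine or device API.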

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented by at least one hardware processor for multi-modal re-routing of a failed voice recognition based on projected best channel, the method comprising: prompting a user for information corresponding to a field of a form; generating speech data by at least one input device, the speech data generated by capturing a spoken response of the user to the prompt; attempting to convert the speech data to text; determining that the attempted conversion has failed, and identifying the speech data as failed; determining a reason for the failed speech data; identifying, based on the determined reason for the failed speech data, an alternate channel as a projected best channel for receiving the information corresponding to the field of the form; transmitting a message to the user prompting the user to input information corresponding to the failed speech data using the identified alternate channel; receiving from the user the information corresponding to the field of the form from the identified alternate channel; and injecting the received information into the field of the form; and wherein the identifying an alternate channel as a projected best channel includes using a cognitive rules search engine to evaluate the failed speech data, based on a multitude of available alternate channels, the field of the form, user history data, and a confidence level of each of the available alternate channels, to identify one of the available alternate channels as the projected best channel for receiving the information corresponding to the field of the form.

2. The method of claim 1, wherein the determining a reason for the speech data as failed includes determining a context of the failed speech data.

3. The method of claim 2, wherein the identifying an alternate channel includes identifying the alternate channel based on the determined context of the failed speech data.

4. The method according to claim 3, wherein the context of the failed speech data is a reduced cognitive ability of the user.

5. The method according to claim 4, wherein the determining a context of the failed speech data includes determining the reduced cognitive ability of the user based on a comparison to historical speech data of the user.

6. The method according to claim 1, wherein the identifying an alternate channel for receiving the information corresponding to the field of the form includes identifying one of the multitude of alternate channels based on which one of the multitude of alternate channels has the highest success rate for the user when the user is experiencing reduced cognitive abilities.

7. The method according to claim 1, further comprising after receiving the information corresponding to the field of the form, continuing prompting the user for voice input to fill remaining fields of the form.

8. A system for providing an alternate modality of input for filling a form field in response to a failure of voice recognition comprising: at least one input device configured to generate speech data by capturing a spoken response of a user; and at least one hardware processor configured to: prompt a user for information corresponding to a field of a form; generate speech data by at least one input device, the speech data generated by capturing a spoken response of the user to the prompt; attempt to convert the speech data to text; determine that the attempted conversion has failed, and identifying the speech data as failed; determine a reason for the failed speech data; identify, based on the determined reason for the failed speech data, an alternate channel for receiving the information corresponding to the field of the form; transmit a message to the user prompting the user to input information corresponding to the failed speech data using the identified alternate channel; receive from the user the information corresponding to the field of the form from the identified alternate channel; and inject the received information into the field of the form; and wherein identifying an alternate channel as a projected best channel includes using a cognitive rules search engine to evaluate the failed speech data, based on a multitude of available alternate channels, the field of the form, user history data, and a confidence level of each of the available alternate channels, to identify one of the available alternate channels as the projected best channel for receiving the information corresponding to the field of the form.

9. The system of claim 8, wherein determining a reason for the failed speech data includes determining a context of the failed speech data.

10. The system of claim 9, wherein identifying an alternate channel includes identifying the alternate channel based on the determined context of the failed speech data.

11. The system according to claim 10, wherein the context of the failed speech data is a reduced cognitive ability of the user.

12. The system according to claim 11, wherein the determining a context of the failed speech data includes determining the reduced cognitive ability of the user based on a comparison to historical speech data of the user.

13. The system according to claim 8, wherein identifying an alternate channel for receiving the information corresponding to the field of the form includes identifying one of the multitude of alternate channels based on which one of the multitude of alternate channels has the highest success rate for the user when the user is experiencing reduced cognitive abilities.

14. The system according to claim 8, wherein the at least one hardware processor is further configured to: after receiving the information corresponding to the field of the form, continue to prompt the user for voice input to fill remaining fields of the form.

15. A non-transitory computer readable medium comprising instructions for multi-modal re-routing of a failed voice recognition based on projected best channel that, when executed by at least one processor, configures the at least one processor to: prompt a user for information corresponding to a field of a form; generate speech data by at least one input device, the speech data generated by capturing a spoken response of the user to the prompt; attempt to convert the speech data to text; determine that the attempted conversion has failed, and identifying the speech data as failed; determine a reason for the failed speech data; identify, based on the determined reason for the failed speech data, an alternate channel as a projected best channel for receiving the information corresponding to the field of the form; transmit a message to the user prompting the user to input information corresponding to the failed speech data using the identified alternate channel; receive from the user the information corresponding to the field of the form from the identified alternate channel; and inject the received information into the field of the form; and wherein identifying an alternate channel as a projected best channel includes using a cognitive rules search engine to evaluate the failed speech data, based on a multitude of available alternate channels, the field of the form, user history data, and a confidence level of each of the available alternate channels, to identify one of the available alternate channels as the projected best channel for receiving the information corresponding to the field of the form.

16. The non-transitory computer readable medium of claim 15, wherein determining a reason for the failed speech data includes determining a context of the failed speech data.
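
The independent claims name the inputs to channel selection (the available alternate channels, the form field, user history data, and a per-channel confidence level) but the claim text shown here does not give a scoring formula. The sketch below is one possible stand-in for the "cognitive rules search engine": it ranks channels by confidence multiplied by a smoothed historical success rate for the same field and failure context. The `Channel` class, the history layout, the context label, and the scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Channel:
    name: str
    confidence: float  # assumed to be a 0..1 confidence level for this channel


# Hypothetical user history: (channel, field, context) -> (successes, attempts)
History = Dict[Tuple[str, str, str], Tuple[int, int]]


def success_rate(history: History, channel: str, field_name: str, context: str) -> float:
    """Smoothed success rate for this user, field, and failure context."""
    successes, attempts = history.get((channel, field_name, context), (0, 0))
    # Add-one smoothing: a channel with no history scores a neutral 0.5, not 0.
    return (successes + 1) / (attempts + 2)


def projected_best_channel(
    channels: List[Channel],
    field_name: str,
    context: str,
    history: History,
) -> str:
    """Return the name of the channel with the highest assumed score."""
    if not channels:
        raise ValueError("no alternate channels available")
    best = max(
        channels,
        key=lambda c: c.confidence * success_rate(history, c.name, field_name, context),
    )
    return best.name
```

With a context such as `"reduced_cognitive_ability"`, the history term plays the role described in claims 6 and 13: the channel that has worked most often for this user in that state is favored, even when another channel has a higher raw confidence.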

Assignees

IBM

Inventors

Classifications

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue

  • Form filling; Merging

  • Assessment or evaluation of speech recognition systems

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10971147B2 cover?
In an aspect of the present disclosure, a method for providing an alternate modality of input for filling a form field in response to a failure of voice recognition is disclosed including prompting the user for information corresponding to a field of a form, generating speech data by capturing a spoken response of the user to the prompt using at least one input device, attempting to convert the…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 6, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).