Leveraging interaction context to improve recognition confidence scores

US9542931B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9542931-B2
Application number: US-201414521990-A
Country: US
Kind code: B2
Filing date: Oct 23, 2014
Priority date: Oct 27, 2010
Publication date: Jan 10, 2017
Grant date: Jan 10, 2017

Abstract

Official abstract text for this publication.

On a computing device a speech utterance is received from a user. The speech utterance is a section of a speech dialog that includes a plurality of speech utterances. One or more features from the speech utterance are identified. Each identified feature from the speech utterance is a specific characteristic of the speech utterance. One or more features from the speech dialog are identified. Each identified feature from the speech dialog is associated with one or more events in the speech dialog. The one or more events occur prior to the speech utterance. One or more identified features from the speech utterance and one or more identified features from the speech dialog are used to calculate a confidence score for the speech utterance.
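The combination described in the abstract can be sketched as a small scoring function. This is a minimal illustration, not the patented method: the feature names, the weights, and the logistic form are all assumptions made for the example, and a real system would learn its parameters from data.

```python
import math
from dataclasses import dataclass


@dataclass
class UtteranceFeatures:
    """Characteristics of the current utterance (hypothetical scales)."""
    acoustic_match: float  # 0.0 (poor match) to 1.0 (strong match)
    noise_level: float     # 0.0 (clean signal) to 1.0 (very noisy)


@dataclass
class DialogFeatures:
    """Features tied to events that occurred before the current utterance."""
    turn_index: int        # position of the utterance in the dialog, 0-based
    prior_reprompts: int   # number of earlier re-prompt events


# Hypothetical weights, chosen only so the example runs.
BIAS = -1.0
W_ACOUSTIC = 4.0
W_NOISE = -2.0
W_TURN = -0.1
W_REPROMPT = -0.8


def confidence_score(utt: UtteranceFeatures, dlg: DialogFeatures) -> float:
    """Combine utterance and dialog features into a score in (0, 1)."""
    z = (
        BIAS
        + W_ACOUSTIC * utt.acoustic_match
        + W_NOISE * utt.noise_level
        + W_TURN * dlg.turn_index
        + W_REPROMPT * dlg.prior_reprompts
    )
    # Logistic squashing keeps the result interpretable as a confidence.
    return 1.0 / (1.0 + math.exp(-z))


utt = UtteranceFeatures(acoustic_match=0.9, noise_level=0.1)
smooth = confidence_score(utt, DialogFeatures(turn_index=0, prior_reprompts=0))
troubled = confidence_score(utt, DialogFeatures(turn_index=3, prior_reprompts=2))
```

With these assumed weights, the same utterance scores lower when the dialog so far contains re-prompts, which is the kind of contextual effect the abstract describes.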

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for improving speech recognition on a computing device, the method comprising: accessing, on the computing device, a log file comprising a plurality of speech dialogues, wherein each of the plurality of speech dialogues is associated with an interaction context and comprises one or more speech utterances; automatically extracting from the log file: one or more features associated with the one or more speech utterances; a first confidence score associated with the one or more features associated with the one or more speech utterances; one or more dialog-level features from the speech dialog, each dialog-level feature corresponding to a specific characteristic of the speech dialog; and a value associated with each of the one or more dialog-level features from the speech dialog; and using the values associated with the one or more dialog-level features to adjust the first confidence score associated with one or more of the speech utterances.

2. The method of claim 1, wherein the dialog-level features include a position of each utterance in the speech dialog.

3. The method of claim 1, wherein the dialog-level features include a degree in which re-prompting occurred for one or more utterances in the speech dialog.

4. The method of claim 1, wherein the adjusting of the first confidence score is associated with recalibrating one or more speech recognition models on the computing device.

5. The method of claim 1, wherein the adjusting of the first confidence score is associated with recalibrating a confidence classifier module.

6. The method of claim 1, wherein the one more features associated with the one or more speech utterances includes a degree to which an acoustic match is determined for each speech utterance.

7. The method of claim 1, wherein the one or more features associated with the one or more speech utterances includes a noise of an acoustic signal associated with each speech utterance.

8. The method of claim 1, wherein the one or more features associated with the one or more speech utterances includes a degree to which a first recognition for a speech utterance is similar to a second recognition for the speech utterance.

9. A system comprising: one or more processors; and a memory coupled to the one or more processors, the memory for storing instructions which, when executed by the one or more processors, performs a method for improving speech recognition on a computing device, the method comprising: accessing, on the computing device, a log file comprising a plurality of speech dialogues, wherein each of the plurality of speech dialogues is associated with an interaction context and comprises one or more speech utterances; automatically extracting from the log file: one or more features associated with the one or more speech utterances; a first confidence score associated with the one or more features associated with the one or more speech utterances; one or more dialog-level features from the speech dialog, each dialog-level feature corresponding to a specific characteristic of the speech dialog; and a value associated with each of the one or more dialog-level features from the speech dialog; and using the values associated with the one or more dialog-level features to adjust the first confidence score associated with one or more of the speech utterances.

10. The system of claim 9, wherein the dialog-level features further include at least a degree in which re-prompting occurred for one or more utterances in the speech dialog.

11. The system of claim 9, wherein the adjusting of the first confidence score is associated with recalibrating one or more speech recognition models on the computing device.

12. The system of claim 9, wherein the adjusting of the first confidence score is associated with recalibrating a confidence classifier module.

13. The system of claim 9, wherein the one more features associated with the one or more speech utterances includes a degree to which an acoustic match is determined for each speech utterance.

14. The system of claim 9, wherein the one or more features associated with the one or more speech utterances includes a noise of an acoustic signal associated with each speech utterance.

15. The system of claim 9, wherein the one or more features associated with the one or more speech utterances includes a degree to which a first recognition for a speech utterance is similar to a second recognition for the speech utterance.

16. The system of claim 9, wherein the log file includes contextual information for the one or more speech utterances.

17. The system of claim 16, wherein the contextual information includes information from previous and future speech utterances in the speech dialog.

18. The system of claim 9, wherein the speech dialog includes a plurality of dialog events, and wherein the one or more dialog-level features are derived from the log file for a first dialog event of the plurality of dialog events occurring previous to a current speech utterance and for a second dialog event of the plurality of dialog events occurring after the current speech utterance.

19. A hardware device comprising instructions that, when executed by a computing device, cause the computing device to: access, on the computing device, a log file comprising a plurality of speech dialogues, wherein each of the plurality of speech dialogues is associated with an interaction context and comprises one or more speech utterances; automatically extract from the log file: one or more features associated with the one or more speech utterances; a first confidence score associated with the one or more features associated with the one or more speech utterances; one or more dialog-level features from the speech dialog, each dialog-level feature corresponding to a specific characteristic of the speech dialog; and a value associated with each of the one or more dialog-level features from the speech dialog; and use the values associated with the one or more dialog-level features to adjust the first confidence score associated with one or more of the speech utterances.

20. The hardware device of claim 19, wherein the dialog-level features further include a value associated with re-prompting for the one or more utterances in the speech dialog.
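The steps recited in claim 1 (access a log, extract utterance-level scores and dialog-level feature values, then adjust the first confidence score) can be sketched as follows. The JSON log layout, the field names, the `reprompt_rate` feature, and the linear `penalty` rule are all hypothetical choices for illustration; the claims do not specify a log format or an adjustment formula.

```python
import json

# Hypothetical log: one dialog with two utterances, the second one re-prompted.
SAMPLE_LOG = json.dumps([
    {
        "dialog_id": "d1",
        "utterances": [
            {"text": "call home", "confidence": 0.8, "reprompted": False},
            {"text": "call home", "confidence": 0.6, "reprompted": True},
        ],
    }
])


def extract_dialog_features(dialog: dict) -> dict:
    """Return dialog-level features and their values for one logged dialog."""
    utterances = dialog["utterances"]
    reprompts = sum(1 for u in utterances if u["reprompted"])
    return {
        "num_utterances": len(utterances),
        # Degree to which re-prompting occurred (compare claims 3, 10, 20).
        "reprompt_rate": reprompts / len(utterances) if utterances else 0.0,
    }


def adjust_confidences(log_text: str, penalty: float = 0.3) -> list:
    """Adjust each logged first confidence score using dialog-level values."""
    results = []
    for dialog in json.loads(log_text):
        features = extract_dialog_features(dialog)
        # Assumed rule: scale scores down as the re-prompt rate rises.
        scale = 1.0 - penalty * features["reprompt_rate"]
        for position, utt in enumerate(dialog["utterances"]):
            first = utt["confidence"]
            results.append({
                "dialog_id": dialog["dialog_id"],
                "position": position,  # utterance position (compare claim 2)
                "first_confidence": first,
                "adjusted_confidence": max(0.0, min(1.0, first * scale)),
            })
    return results


rows = adjust_confidences(SAMPLE_LOG)
```

In this sketch the adjustment is applied offline over logged dialogs, so features from both earlier and later turns in a dialog are available when rescoring any one utterance.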

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • G10L15/01 · Primary

    Assessment or evaluation of speech recognition systems · CPC title

  • G10L15/08 · Primary

    Speech classification or search · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9542931B2 cover?
It covers calculating a confidence score for a speech utterance received on a computing device. The score is computed from features of the utterance itself together with features of the surrounding speech dialog, where each dialog feature is associated with one or more events that occurred before the utterance.
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/01. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 10, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).