Speech recognition error diagnosis

US10019984B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10019984-B2
Application number: US-201514634714-A
Country: US
Kind code: B2
Filing date: Feb 27, 2015
Priority date: Feb 27, 2015
Publication date: Jul 10, 2018
Grant date: Jul 10, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is at least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
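The flow described in the abstract (detect an error, run two analyses, combine their results into a category) can be sketched roughly as below. This is an illustrative outline only, not the patented implementation: `detect_error`, `diagnose`, the `Analysis` and `Rules` types, and the labels `finding_a`, `finding_b`, and `category_x` are all hypothetical placeholders, and error detection is assumed here to be a simple mismatch against a reference transcript.

```python
from typing import Callable, Mapping, Sequence

# Hypothetical signature: an analysis compares the recognized text with a
# reference transcript and returns a short finding label.
Analysis = Callable[[str, str], str]
# Hypothetical rule table: a pair of findings maps to error categories.
Rules = Mapping[tuple[str, str], Sequence[str]]


def detect_error(recognized: str, reference: str) -> bool:
    """Assumed error detection: any word-level mismatch with the reference."""
    return recognized.split() != reference.split()


def diagnose(
    recognized: str,
    reference: str,
    first: Analysis,
    second: Analysis,
    rules: Rules,
) -> list[str]:
    """Run two analyses and look up categories for the combined findings."""
    if not detect_error(recognized, reference):
        return []  # nothing to diagnose
    findings = (first(recognized, reference), second(recognized, reference))
    return list(rules.get(findings, ["undetermined"]))


# Placeholder analyses and rules, purely to show the call shape.
print(diagnose(
    "call jon smith",
    "call john smith",
    first=lambda rec, ref: "finding_a",
    second=lambda rec, ref: "finding_b",
    rules={("finding_a", "finding_b"): ["category_x"]},
))  # prints ['category_x']
```

Keying the rule table on the pair of findings reflects the abstract's wording that the category depends on a combination of both results, not on either one alone.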

First claim

Opening claim text (preview).

What is claimed is:

1. A system for diagnosing speech recognition errors, comprising:
  at least one processing component; and
  one or more media operably coupled to the at least one processing component and bearing one or more instructions that, when executed by the at least one processing component, perform operations including at least:
    determine that a speech recognition result is at least partially erroneous;
    perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result;
    perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and
    determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result, including determine that the at least one category of recognition error includes at least an acoustic model error when
      (a) the first error analysis result indicates that a reference language model score associated with a reference speech is higher than a recognition language model score associated with the at least partially erroneous speech recognition result; and
      (b) the second error analysis result indicates that a reference acoustic model score associated with the reference speech is lower than a recognition acoustic model score associated with the at least partially erroneous speech recognition result;
    determine at least one corrective action to at least partially correct at least one aspect of a speech recognition component based at least partially on the at least one category of recognition error associated with the at least partially erroneous speech recognition result; and
    at least one of:
      provide an indication of the at least one corrective action; or
      adjust at least one aspect of the speech recognition component based on the at least one corrective action.

2. The system of claim 1, wherein: the first error analysis includes at least one language model scoring operation; and the second error analysis includes at least one acoustic model scoring operation.

3. The system of claim 1, wherein: the first error analysis includes at least one dictionary check operation; and the second error analysis includes at least one transcription analysis operation.

4. The system of claim 1, wherein: the first error analysis includes at least one emulation operation; and the second error analysis includes at least one grammar analysis operation.

5. The system of claim 1, wherein: the first error analysis of the at least partially erroneous speech recognition result includes a comparison of a language model score associated with the at least partially erroneous speech recognition result with a language model score associated with a reference speech recognition result; and the second error analysis of the at least partially erroneous speech recognition result includes a comparison of an acoustic model score associated with the at least partially erroneous speech recognition result with an acoustic model score associated with the reference speech recognition result.

6. The system of claim 1, wherein at least one of the first error analysis or the second error analysis comprises: one or more emulation operations that assume an ideal operation of an acoustic model to assess an actual operation of a language model.

7. The system of claim 1, wherein the operations further comprise: perform a third error analysis of the at least partially erroneous speech recognition result to provide a third error analysis result; and determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of at least the first error analysis result, the second error analysis result, and the third error analysis result.

8. The system of claim 7, wherein: the first error analysis includes at least one language model scoring operation; the second error analysis includes at least one acoustic model scoring operation; and the third error analysis includes at least one of an engine setting check operation, a penalty model setting check operation, a force alignment operation, a 1:1 alignment test operation, an emulation operation, or a dictionary check operation.

9. The system of claim 1, wherein determine at least one corrective action to at least partially correct at least one aspect of a speech recognition component based at least partially on the at least one category of recognition error associated with the at least partially erroneous speech recognition result comprises: determine at least one corrective action to at least partially correct at least one aspect of at least one of a language model, an acoustic model, a transcription model, a pruning model, a penalty model, or a grammar of a speech recognition component based at least partially on the at least one category of recognition error associated with the at least partially erroneous speech recognition result.

10. The system of claim 1, wherein provide an indication of the at least one corrective action comprises: provide at least one recommended action to at least partially correct at least one aspect of at least one of a language model, an acoustic model, a transcription model, a pruning model, a penalty model, or a grammar of the speech recognition component based at least partially on the at least one category of recognition error associated with the at least partially erroneous speech recognition result.

11. The system of claim 1, wherein adjust at least one aspect of the speech recognition component based on the at least one corrective action comprises: adjust at least one aspect of at least one of a language model, an acoustic model, a transcription model, a pruning model, a penalty model, or a grammar of a speech recognition component based at least partially on the at least one category of recognition error associated with the at least partially erroneous speech recognition result.

12. A system for diagnosing speech recognition errors, comprising:
  at least one processing component; and
  one or more media operably coupled to the at least one processing component and bearing one or more instructions that, when executed by the at least one processing component, perform operations including at least:
    determine that a speech recognition result is at least partially erroneous;
    perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result;
    perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result;
    determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result, including determine that the at least one category of recognition error includes at least an acoustic model error and a language model error when
      (a) the first error analysis result indicates that a reference language model score associated with a reference speech is lower than a recognition language model score associated with the at least partially erroneous speech recognition result; and
      (b) the second error analysis result indicates that a reference acoustic model score associated with the reference speech is lower than a recognition acoustic model score associated with the at least partially erroneous speech recognition result;
    determine at least one
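
The score comparisons recited in claims 1 and 12 can be read as a small decision rule. The sketch below is one hedged interpretation, not the patent's implementation: the `Scores` container, the `categorize` function, the category labels, and the example score values are hypothetical, and scores are assumed to be comparable numbers where higher means more likely. Only the two conditions visible in this preview are encoded; any other combination returns an empty set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Hypothetical pair of model scores for one transcript.

    lm: language model score, am: acoustic model score.
    Higher is assumed to mean "more likely".
    """
    lm: float
    am: float


def categorize(reference: Scores, recognized: Scores) -> set[str]:
    """Apply only the conditions recited in claims 1 and 12."""
    categories: set[str] = set()
    # Condition (b) in both claims: the acoustic model scores the
    # erroneous result above the reference.
    am_prefers_error = reference.am < recognized.am
    if am_prefers_error and reference.lm > recognized.lm:
        # Claim 1: the language model prefers the reference, so the
        # acoustic model is implicated.
        categories.add("acoustic_model")
    elif am_prefers_error and reference.lm < recognized.lm:
        # Claim 12: both models prefer the erroneous result.
        categories.update({"acoustic_model", "language_model"})
    return categories


# Made-up log-likelihood-style scores for illustration.
print(sorted(categorize(Scores(lm=-12.0, am=-80.0), Scores(lm=-15.0, am=-70.0))))
# prints ['acoustic_model']
print(sorted(categorize(Scores(lm=-18.0, am=-80.0), Scores(lm=-15.0, am=-70.0))))
# prints ['acoustic_model', 'language_model']
```

Returning a set keeps the result aligned with the claim wording "at least one category", since a single erroneous result may map to more than one category.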

Assignees

  • Microsoft Technology Licensing, LLC

Inventors

Classifications

  • G10L15/01 · Primary

    Assessment or evaluation of speech recognition systems · CPC title

  • using context dependencies, e.g. language models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10019984B2 cover?
Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is at least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/01. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 10, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).