Post processing of natural language automatic speech recognition

US9431012B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9431012-B2
Application number: US-201213460462-A
Country: US
Kind code: B2
Filing date: Apr 30, 2012
Priority date: Apr 30, 2012
Publication date: Aug 30, 2016
Grant date: Aug 30, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A post-processing speech system includes a natural language-based speech recognition system that compares a spoken utterance to a natural language vocabulary that includes words used to generate a natural language speech recognition result. A master conversation module engine compares the natural language speech recognition result to domain specific words and phrases. A voting engine selects a word or a phrase from the domain specific words and phrases that is transmitted to an application control system. The application control system transmits one or more control signals that are used to control an internal or an external device or an internal or an external process.
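
The flow in the abstract (recognition result, comparison against domain-specific phrases, vote, control signal) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `SUB_GRAMMARS` table, the command strings, and the `min_score` cutoff are invented for illustration, and `difflib.SequenceMatcher` is only a stand-in for whatever scoring the patent's master conversation module actually uses, which the abstract does not specify.

```python
from difflib import SequenceMatcher

# Hypothetical domain vocabulary: sub-grammar phrases mapped to
# domain-specific commands. Not taken from the patent.
SUB_GRAMMARS = {
    "call": "PHONE_DIAL",
    "play music": "MEDIA_PLAY",
    "navigate home": "NAV_HOME",
}


def score_phrases(nl_result):
    """Score every listed phrase against the natural language result.

    String similarity is an assumed stand-in for the patent's
    likelihood-based confidence score.
    """
    text = nl_result.lower()
    return {
        phrase: SequenceMatcher(None, text, phrase).ratio()
        for phrase in SUB_GRAMMARS
    }


def vote(scores, min_score=0.6):
    """Return the best-scoring phrase, or None if none reaches min_score."""
    phrase, best = max(scores.items(), key=lambda kv: kv[1])
    return phrase if best >= min_score else None


def control_signal(phrase):
    """Map the selected phrase to the command sent to the application."""
    return SUB_GRAMMARS[phrase]


selected = vote(score_phrases("please play music"))
if selected is not None:
    print(control_signal(selected))
```

Returning `None` from `vote` when no phrase scores well keeps the "no selection" case explicit, so a caller can decide what to do with the raw recognition result instead.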

First claim

Opening claim text (preview).

I claim:

1. A post-processing automated speech recognition system comprising: a natural language-based speech recognition system that compares a spoken utterance to a natural language vocabulary comprising words in one or more active grammars to generate a natural language speech recognition result and a natural language confidence score; a master conversation module engine that post processes the natural language speech recognition result by comparing the natural language speech recognition result generated by the natural language-based speech processing system to lists of words and phrases from a plurality of active sub-grammars that are mapped to a plurality of domain specific words and phrases, and generates a post processed natural language speech recognition result confidence score for each of the listed words and phrases from the plurality of active sub-grammars based on a likelihood that the respective listed words and phrases from the plurality of active sub-grammars match the natural language speech recognition result; a voting module engine that is interfaced to the master conversation module engine and that is operable to select a word or a phrase from the list of words and phrases from the plurality of active sub-grammars, based on the post processed natural language speech recognition result confidence scores; wherein: when the voting module engine selects a word or phrase from the plurality of active sub-grammars based on the post processed natural language speech recognition result confidence scores, the selected word or phrase is transmitted to an application control system that transmits control signals used to control an internal or an external device or an internal or an external process; or when both of: the voting module does not select a word or phrase from the plurality of active sub-grammars based on the post processed natural language speech recognition result confidence scores, and the natural language confidence score exceeds a specified threshold, then the master conversation module engine selects the natural language speech recognition result to be transmitted to the application control system; where the natural language-based speech recognition system, the master conversation module engine, and the voting module engine are executed by one or more processors.

2. The post-processing automated speech recognition system of claim 1 where the natural language speech recognition result and the word or the phrase from the plurality of active sub-grammars selected by the voting module engine are represented by language agnostic indicia, objects or symbols that control one or more devices or processes.

3. The post-processing automated speech recognition system of claim 1 where a portion of the natural language-based speech recognition system is remote from the master conversation module engine and the plurality of active sub-grammars are remote from a portion of the natural language-based speech processing system.

4. The post-processing automated speech recognition system of claim 1 further comprising an ambiguity resolver module engine that resolves an ambiguity arising from the comparisons between the natural language speech recognition result generated by the natural language-based speech recognition system and the list of words and phrases from the plurality of active sub-grammars.

5. The post-processing automated speech recognition system of claim 1 further comprising a grammar-based speech recognition system that compares the spoken utterance to a grammar-based vocabulary comprising the active sub-grammars that are mapped to the plurality of domain specific words and phrases.

6. The post-processing automated speech recognition system of claim 5 further comprising an alignment engine that selects between the word or the phrase selected by the voting module engine and recognition results generated by the grammar-based speech recognition system, where the alignment engine transmits the selection to one of a plurality of application control systems.

7. The post-processing automated speech recognition system of claim 6 where the voting module engine comprises a portion of the master conversation module engine and the word or the phrase selected by the voting engine comprises the recognition result of the master conversation module engine, and where the alignment engine selects between the recognition results of the master conversation module engine and the recognition result generated by the grammar-based speech recognition system based on the natural language confidence score and a grammar-based confidence score.

8. The post-processing automated speech recognition system of claim 7 where the alignment engine selects between the recognition results of the master conversation module engine and the recognition result generated by the grammar-based speech recognition system based on an order the recognition results are received at the alignment engine.

9. The post-processing automated speech recognition system of claim 6 further comprising an application engine that controls an internal or external device or process in response to the alignment engine's transmission of a selection result.

10. The post-processing automated speech recognition system of claim 1 further comprising a grammar-based speech processing system that generates a grammar-based speech recognition result and grammar-based confidence score synchronously with the natural language-based speech recognition system.

11. The post-processing automated speech recognition system of claim 10 where the grammar-based speech processing system and the natural language-based speech processing system process the same spoken utterance.

12. The post-processing automated speech recognition system of claim 10 where the grammar-based speech processing system and the natural language-based speech recognition system are executed by a plurality of processors that run simultaneously.

13. The post-processing automated speech recognition system of claim 10 where the grammar-based speech processing system and the natural language-based speech recognition system, and the voting module engine are executed by a plurality of parallel processors.

14. The post-processing automated speech recognition system of claim 10 where the grammar-based speech processing system and the natural language-based speech recognition system, and the voting module engine each comprise a computing thread executed by a multitasking processor.

15. The post-processing automated speech recognition system of claim 1 further comprising a grammar-based speech recognition system that generates grammar-based speech recognition results synchronously with a plurality of natural language-based speech recognition systems interfaced to the master conversation module engine.

16. A computer implemented method of automatically recognizing speech comprising: capturing speech utterances and converting the speech utterances into frames of speech; recognizing speech utterances by comparing the frames of speech to a list of words in active grammars; generating a natural language-based speech recognition result and a natural language confidence score; post processing the natural language-based speech recognition result by comparing the natural language-based speech recognition result to a domain specific vocabulary that comprises lists of words and phrases from sub-grammars that are mapped to domain specific words or phrases and generating a post process confidence score for each of the words and phrases from the sub-grammars, based on a likelihood that the words and phrases
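
The two-branch selection rule recited in claim 1, and the confidence-based choice that claim 7 assigns to the alignment engine, might be sketched as follows. This is an illustrative reading and not the patented implementation: the `Result` type, the function names, the `vote_floor` and `nl_threshold` values, and the tie-breaking rule in `align` are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Result:
    text: str          # word or phrase passed to the application control system
    confidence: float  # score that justified the selection
    source: str        # "sub-grammar" or "natural-language"


def post_process(nl_text: str,
                 nl_confidence: float,
                 phrase_scores: Dict[str, float],
                 vote_floor: float = 0.6,      # hypothetical voting cutoff
                 nl_threshold: float = 0.8     # hypothetical "specified threshold"
                 ) -> Optional[Result]:
    """Apply the selection rule of claim 1 under the assumptions above."""
    # Branch 1: the voting step selects a sub-grammar phrase.
    if phrase_scores:
        phrase, score = max(phrase_scores.items(), key=lambda kv: kv[1])
        if score >= vote_floor:
            return Result(phrase, score, "sub-grammar")
    # Branch 2: no phrase selected, but the natural language confidence
    # score exceeds the threshold, so the raw result is passed through.
    if nl_confidence > nl_threshold:
        return Result(nl_text, nl_confidence, "natural-language")
    # Neither condition holds: nothing is sent.
    return None


def align(mcm_text: str, nl_confidence: float,
          grammar_text: str, grammar_confidence: float) -> str:
    """Choose between the two recognizers by confidence (claim 7).

    Ties go to the master conversation module result; that tie-break
    is an assumption, not something the claim states.
    """
    return mcm_text if nl_confidence >= grammar_confidence else grammar_text
```

For example, `post_process("turn on the radio", 0.9, {"play music": 0.3})` falls through to the second branch and returns the natural language result, while a low natural language confidence with no strong phrase match returns `None`.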

Assignees

Inventors

Classifications

  • G10L15/19 · Primary

    Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules · CPC title

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

  • Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9431012B2 cover?
A post-processing speech system includes a natural language-based speech recognition system that compares a spoken utterance to a natural language vocabulary that includes words used to generate a natural language speech recognition result. A master conversation module engine compares the natural language speech recognition result to domain specific words and phrases. A voting engine selects a …
Who is the assignee on this patent?
Darrin Kenneth Fry; 2236008 Ontario Inc.
What technology area does this patent fall under?
Primary CPC classification G10L15/19. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 30, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).