Speech recognition using an operating system hooking component for context-aware recognition models

US10062375B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10062375-B2
Application number: US-201615334523-A
Country: US
Kind code: B2
Filing date: Oct 26, 2016
Priority date: Jun 19, 2011
Publication date: Aug 28, 2018
Grant date: Aug 28, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Inputs provided into user interface elements of an application are observed. Records are made of the inputs and the state(s) the application was in while the inputs were provided. For each state, a corresponding language model is trained based on the input(s) provided to the application while the application was in that state. When the application is next observed to be in a previously-observed state, a language model associated with the application's current state is applied to recognize speech input provided by a user and thereby to generate speech recognition output that is provided to the application. An application's state at a particular time may include the user interface element(s) that are displayed and/or in focus at that time, and is determined by an operating system hooking component embedded in the automatic speech recognition system.
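
The flow in the abstract (record inputs per application state, train one model per state, then apply the matching model when that state recurs) can be illustrated with a minimal sketch. This is not the patented implementation: the class name `StateLanguageModels`, the tuple used as a state key, and the relative-frequency "language model" are all assumptions chosen for brevity, and the hooking component that would actually detect the state is not modeled.

```python
from collections import Counter, defaultdict


class StateLanguageModels:
    """Hypothetical registry: one frequency-based model per application state."""

    def __init__(self):
        # state key -> Counter of input values observed in that state
        self._counts = defaultdict(Counter)

    def observe(self, state, text):
        """Record an input provided while the application was in `state`."""
        self._counts[state][text.lower()] += 1

    def known(self, state):
        """True if at least one input was previously observed in `state`."""
        return state in self._counts

    def probability(self, state, text):
        """Relative frequency of `text` among the inputs seen in `state`."""
        counts = self._counts.get(state)
        if not counts:
            return 0.0
        return counts[text.lower()] / sum(counts.values())

    def rescore(self, state, hypotheses):
        """Return the recognizer hypothesis that the state's model prefers.

        Falls back to the first hypothesis when the state was never observed.
        """
        if not self.known(state):
            return hypotheses[0]
        return max(hypotheses, key=lambda h: self.probability(state, h))


# Assumed state key: (displayed form, UI element in focus).
models = StateLanguageModels()
state = ("medication_form", "dosage_field")
for value in ["10 mg", "10 mg", "20 mg"]:
    models.observe(state, value)

# Two competing hypotheses for one utterance; the state model breaks the tie.
best = models.rescore(state, ["tin mg", "10 mg"])
```

Here `best` is the hypothesis that matches earlier inputs to the same field. A real system would combine such a score with acoustic evidence rather than use it alone.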

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method, comprising: receiving, by an automatic speech recognition system executed by one computer processor, a first plurality of inputs into an application while the application is in a first state; training, by the automatic speech recognition system, a first language model based on the received first plurality of inputs; determining, by the automatic speech recognition system, that the application is in the first state; and applying, by the automatic speech recognition system, the first language model to a first speech input in response to determining that the application is in the first state, wherein the automatic speech recognition system is executed on a first computing device, and wherein the application is executed on a second computing device different from the first computing device; wherein training the first language model further comprises: identifying a pattern of use of a user interface associated with the received first plurality of inputs, wherein the identification is performed by an operating system hooking component included in the speech recognition system by intercepting messages between the said user interface and the computer processor's operating system; and modifying the first language model based on the identified pattern of use.

2. The method of claim 1, wherein training the first language model further comprises associating a probability with a word in the first language model based on preceding words of the word.

3. The method of claim 1, wherein applying the first language model further comprises applying the first language model to the first speech input if a number of the first plurality of inputs associated with the first language model exceeds a predefined threshold.

4. The method of claim 1, wherein training the first language model further comprises: receiving a second plurality of inputs into another copy of the application, while the another copy of the application is in the first state and executing on a different computing device than a computing device on which the first plurality of inputs is received; and modifying the first language model based on the received second plurality of inputs.

5. The method of claim 1, wherein training the first language model further comprises associating a probability with a word in the first language model.

6. The method of claim 1, wherein determining that the application is in the first state further comprises: analyzing application data to determine that the application is in the first state; comparing the determined first state of the application to a state associated with the first language model; and determining that the determined first state of the application and the state associated with the first language model are substantially a same state.

7. The method of claim 1, wherein determining that the application is in the first state further comprises: comparing application data of the application to application data associated with the first language model; and determining that the application data of the application and the application data associated with the first language model are a same data.

8. The method of claim 1, wherein applying the first language model further comprises applying the first language model to the first speech input after achieving a degree of confidence in a level of accuracy of the first language model.

9. An automated speech recognition system, comprising: means for receiving a first plurality of inputs into an application while the application is in a first state; means for training a first language model based on the received first plurality of inputs; means for determining that the application is in the first state; and means for applying the first language model to a first speech input in response to determining that the application is in the first state, wherein the automatic speech recognition system is executed on a first computing device by a computer processor, and wherein the application is executed on a second computing device different from the first computing device; wherein training the first language model further comprises: identifying a pattern of use of a user interface associated with the received first plurality of inputs, wherein the identification is by an operating system hooking component included in the speech recognition system by intercepting messages between the said user interface and the computer processor's operating system; and modifying the first language model based on the identified pattern of use.

10. The automated speech recognition system of claim 9, further comprising an operating system hooking component receiving the first plurality of inputs from at least one of a text-based input device, a pointing device, and a speech input device.

11. The automated speech recognition system of claim 9, further comprising means for providing, to the application, a result of applying the first language model to the first speech input.

12. The automated speech recognition system of claim 9, further comprising means for modifying a resource accessed by the first language model, based on the received first plurality of inputs.

13. A non-transitory computer readable medium storing computer program instructions which, when executed by at least one computer processor, causes the at least one computer processor to: receive, by an automatic speech recognition system executed by the at least one computer processor, a first plurality of inputs into an application while the application is in a first state; train, by the automatic speech recognition system, a first language model based on the received first plurality of inputs; determine, by the automatic speech recognition system, that the application is in the first state; and apply, by the automatic speech recognition system, the first language model to a first speech input in response to determining that the application is in the first state, wherein the automatic speech recognition system is executed on a first computing device, and wherein the application is executed on a second computing device different from the first computing device; wherein training the first language model further comprises: identifying a pattern of use of a user interface associated with the received first plurality of inputs, wherein the identification is by an operating system hooking component included in the speech recognition system by intercepting messages between the said user interface and the computer processor's operating system; and modifying the first language model based on the identified pattern of use.

14. The non-transitory computer readable medium of claim 13, further comprising computer program instructions causing the at least one computer processor to: identify, in the received first plurality of inputs, a plurality of input values; identify a frequency with which each of the plurality of input values occurs in the first plurality of inputs; and train the first language model based on the identified frequency with which each of the plurality of input values occurs in the first plurality of inputs.

15. The non-transitory computer readable medium of claim 13, further comprising computer program instructions causing the at least one computer processor to: identify, for one of the received first plurality of inputs, an input value; determine that the input value is an instance of a concept; identify, in the received first plurality of inputs, a number of instances of the concept; identify a frequency with which the concept occur…
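
Two of the dependent claims lend themselves to a small illustration: claim 2 (a word's probability depends on preceding words) and claim 3 (the model is applied only once the number of training inputs exceeds a predefined threshold). The sketch below is an assumption-laden simplification, not the claimed method: it conditions on a single preceding word (a bigram model), and the names `BigramModel` and `MIN_INPUTS`, as well as the threshold value, are hypothetical because the claims give no concrete value.

```python
from collections import Counter, defaultdict

MIN_INPUTS = 3  # hypothetical threshold; the claims do not specify a number


class BigramModel:
    """Hypothetical bigram model trained from the inputs seen in one state."""

    def __init__(self):
        self.num_inputs = 0
        # preceding word -> Counter of the words that followed it
        self._following = defaultdict(Counter)

    def train(self, inputs):
        """Count each (preceding word, word) pair in the observed inputs."""
        for text in inputs:
            words = ["<s>"] + text.lower().split()  # "<s>" marks input start
            for prev, word in zip(words, words[1:]):
                self._following[prev][word] += 1
            self.num_inputs += 1

    def prob(self, word, prev):
        """Estimate P(word | prev) as a relative frequency (claim 2, simplified)."""
        following = self._following.get(prev)
        if not following:
            return 0.0
        return following[word] / sum(following.values())

    def ready(self, threshold=MIN_INPUTS):
        """Apply the model only if the input count exceeds the threshold (claim 3)."""
        return self.num_inputs > threshold


model = BigramModel()
model.train(["chest pain", "chest pain", "chest x-ray", "no fever"])

p_pain = model.prob("pain", "chest")  # 2 of the 3 words following "chest"
usable = model.ready()                # 4 inputs exceed the assumed threshold of 3
```

The strict `>` comparison mirrors the word "exceeds" in claim 3. A production model would also need smoothing for unseen word pairs, which is omitted here.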

Assignees

Inventors

Classifications

  • Converting codes to words; Guess-ahead of partial word inputs · CPC title

  • updating or merging of old and new templates; Mean values; Weighting · CPC title

  • Multi-language systems; Localisation; Internationalisation · CPC title

  • Execution arrangements for user interfaces · CPC title

  • using icons (graphical or visual programming using iconic symbols G06F8/34) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10062375B2 cover?
Inputs provided into user interface elements of an application are observed. Records are made of the inputs and the state(s) the application was in while the inputs were provided. For each state, a corresponding language model is trained based on the input(s) provided to the application while the application was in that state. When the application is next observed to be in a previously-observed…
Who is the assignee on this patent?
Mmodal Ip Llc
What technology area does this patent fall under?
Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 28, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).