System with multiple simultaneous speech recognizers

US10186262B2 · US · B2

Patent metadata
Publication number: US-10186262-B2
Application number: US-201313956145-A
Country: US
Kind code: B2
Filing date: Jul 31, 2013
Priority date: Jul 31, 2013
Publication date: Jan 22, 2019
Grant date: Jan 22, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A speech recognition system interprets both spoken system commands as well as application commands. Users may speak commands to an open microphone of a computing device that may be interpreted by at least two speech recognizers operating simultaneously. The first speech recognizer interprets operating system commands and the second speech recognizer interprets application commands. The system commands may include at least opening and closing an application and the application commands may include at least a game command or navigation within a menu. A reserve word may be used to identify whether the command is for the operation system or application. A user's cadence may also indicate whether the speech is a global command or application command. A speech recognizer may include a natural language software component located in a remote computing device, such as in the so-called cloud.
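
The routing idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: text strings stand in for audio, the two recognizers are called one after the other rather than truly in parallel, and the reserved word `computer`, the vocabularies, and the function names are all hypothetical choices made for this example.

```python
from typing import Optional, Tuple

RESERVED_WORD = "computer"         # hypothetical; the abstract does not name the word
SYSTEM_VERBS = {"open", "close"}   # abstract: opening and closing an application
APP_PHRASES = {"jump", "next item"}  # hypothetical game command and menu navigation


def system_recognizer(utterance: str) -> Optional[str]:
    """Return a system command only when it follows the reserved word."""
    words = utterance.lower().split()
    if len(words) >= 2 and words[0] == RESERVED_WORD and words[1] in SYSTEM_VERBS:
        return " ".join(words[1:])
    return None


def application_recognizer(utterance: str) -> Optional[str]:
    """Return an application command when the whole phrase is in its vocabulary."""
    phrase = " ".join(utterance.lower().split())
    return phrase if phrase in APP_PHRASES else None


def route(utterance: str) -> Tuple[str, Optional[str]]:
    """Give the utterance to both recognizers and pick a destination."""
    system_hit = system_recognizer(utterance)
    app_hit = application_recognizer(utterance)
    if system_hit is not None:
        return ("operating system", system_hit)
    if app_hit is not None:
        return ("application", app_hit)
    return ("ignored", None)
```

In this sketch, `route("Computer open music")` is sent to the operating system, `route("jump")` goes to the application, and `route("open music")` is ignored because the reserved word is missing.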

First claim

Opening claim text (preview).

What is claimed is:

1. A method to operate a computing device, the method comprising: receiving, by a microphone, first analog audio data that represents a global command, and second analog audio data that represents an application command; transforming, via an analog to digital converter, the first and second analog audio data into first and second digital audio data; receiving, by a first speech recognizer within the computing device, the first digital audio data, the first speech recognizer configured to recognize global commands in digital audio data and output the global commands and associated confidence levels to an operating system; receiving, by a second speech recognizer within the computing device, the second digital audio data, the second speech recognizer operating simultaneously with the first speech recognizer for at least a portion of a time, and the second speech recognizer configured to recognize application commands in digital audio data and output the application commands and associated confidence levels to an application controlled by the operating system; determining that the first digital audio data represents a global command and that the second digital data represents an application command by: interpreting the second digital audio data as the application command in the absence of detecting via the first speech recognizer a reserved word and by detecting via the second speech recognizer the application command in the second digital audio data, and interpreting the first digital audio data as the global command where the reserved word and the global command following the reserved word are detected via the first speech recognizer, and not receiving the second digital audio data by the second speech recognizer after the reserved word is detected until determining that a global system interaction is complete; and performing, by the computing device, a computing operation in response to one of the first digital audio data that represents the global command and the second digital audio data that represents the application command.

2. The method of claim 1, wherein the first and second speech recognizers are included in the operating system.

3. The method of claim 2, wherein the computing device includes an intelligent agent, and wherein the method further includes providing a voice out, by the intelligent agent, in response to one of the first digital data that represents the global command and the second digital data that represents the application command.

4. The method of claim 1, wherein the global command includes at least one of launching another application, closing another application, switch between running applications, a social command, search within the application, search across a system, controlling settings for the application, controlling settings for the system, pausing background music and controlling a voice call and controlling a playing of a video.

5. The method of claim 1, wherein the application command includes a game command, navigation within a menu, transport control and browse the application for available content.

6. The method of claim 1, wherein a single reserved word is used.

7. The method of claim 2, wherein at least the second speech recognizer includes processor readable instructions executed by another computing device that is remote from the computing device.

8. The method of claim 1, wherein at least the second speech recognizer includes a natural language software component to interpret the application command.

9. An apparatus comprising: at least one microphone to receive at least a first analog audio signal that represents a global command and a second analog audio signal that represents an application command; an analog to digital converter for converting the at least first and second analog audio signals into first and second digital audio data; at least one processor; and at least one processor readable memory to store an operating system having processor readable instructions that includes a first speech recognizer configured to recognize global commands in digital audio data and output the global commands and associated confidence levels to the operating system and a second speech recognizer configured to operate simultaneously with the first speech recognizer for at least a portion of a time to recognize application commands in digital audio data and output the application commands and associated confidence levels to an application controlled by the operating system, and the at least one processor readable memory to store the application via processor readable instructions, wherein the at least one processor executes the processor readable instructions of the operating system and application to categorize the first digital audio data as the global command where a reserved word and the global command following the reserved word are detected via the first speech recognizer, no longer receive the digital audio data by the second speech recognizer after the reserved word is detected until determining that a global system interaction is complete, and categorize the second digital audio data as the application command in the absence of detecting via the first speech recognizer the reserved word and by detecting the application command in the second digital audio data via the second speech recognizer, wherein the at least one processor executes the processor readable instructions of the operating system and application to perform a computing operation in response to one of the first digital audio data that represents the global command and the second digital audio data that represents the application command.

10. The apparatus of claim 9, wherein the at least one processor executes at least a portion of the processor readable instructions of the application in response to the second command.

11. The apparatus of claim 9, wherein the apparatus is included in a game console and the application is an electronic interactive game.

12. The method of claim 1, wherein determining that the first digital audio data represents a global command and that the second digital data represents an application command further comprises chaining together a plurality of global commands following the reserved word.

13. The method of claim 1, wherein determining that the first digital audio data represents a global command and that the second digital data represents an application command further comprises changing from interpreting digital audio data as the global command to the application command based on a cadence of the analog audio data received by the microphone.

14. The apparatus of claim 9, wherein the at least one processor executes the processor readable instructions of the operating system and application to perform at least one of changing from categorizing digital audio data as the global command to the application command based on a cadence of the analog audio signals received by the at least one microphone, and changing from categorizing digital audio data as the application command to the global command based on a cadence of the analog audio signals received by the at least one microphone.
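
The gating behavior recited in claims 1 and 12 can be sketched as a small state machine. This is only an illustration under stated assumptions: text segments stand in for digital audio data, `KeywordRecognizer` and `Dispatcher` are hypothetical names, the reserved word `computer` and the confidence values are invented, and the end of a global system interaction is signaled by an explicit `end_global_interaction()` call because the claims do not fix how that is determined (claim 13 mentions cadence as one basis for switching).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

RESERVED_WORD = "computer"  # hypothetical; the claims do not name the reserved word


@dataclass
class Recognition:
    command: str
    confidence: float


class KeywordRecognizer:
    """Toy recognizer: maps known phrases to a fixed, made-up confidence level."""

    def __init__(self, vocabulary: Dict[str, float]) -> None:
        self.vocabulary = dict(vocabulary)

    def recognize(self, segment: str) -> Optional[Recognition]:
        confidence = self.vocabulary.get(segment)
        return None if confidence is None else Recognition(segment, confidence)


class Dispatcher:
    """Routes segments and withholds them from the application recognizer
    between the reserved word and the end of the global interaction."""

    def __init__(self, global_rec: KeywordRecognizer, app_rec: KeywordRecognizer) -> None:
        self.global_rec = global_rec
        self.app_rec = app_rec
        self.in_global_interaction = False
        self.log: List[Tuple[str, str, float]] = []  # (destination, command, confidence)

    def feed(self, segment: str) -> None:
        segment = segment.lower().strip()
        if segment == RESERVED_WORD:
            self.in_global_interaction = True
            return
        if self.in_global_interaction:
            # The application recognizer does not receive this segment.
            hit = self.global_rec.recognize(segment)
            if hit is not None:
                self.log.append(("operating system", hit.command, hit.confidence))
            return
        hit = self.app_rec.recognize(segment)
        if hit is not None:
            self.log.append(("application", hit.command, hit.confidence))

    def end_global_interaction(self) -> None:
        """Assumed stand-in for 'determining that a global system interaction is complete'."""
        self.in_global_interaction = False


dispatcher = Dispatcher(
    KeywordRecognizer({"open music": 0.9, "close game": 0.8}),
    KeywordRecognizer({"jump": 0.95}),
)
for segment in ["jump", "computer", "open music", "jump", "close game"]:
    dispatcher.feed(segment)
dispatcher.end_global_interaction()
dispatcher.feed("jump")
```

In this run the two global commands after the reserved word are chained and delivered to the operating system, the `jump` spoken during the global interaction never reaches the application recognizer, and `jump` is routed to the application again once the interaction ends.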

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • of application context · CPC title

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • G06F3/167 · Primary

    Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10186262B2 cover?
A speech recognition system interprets both spoken system commands as well as application commands. Users may speak commands to an open microphone of a computing device that may be interpreted by at least two speech recognizers operating simultaneously. The first speech recognizer interprets operating system commands and the second speech recognizer interprets application commands. The system c…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/167. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 22, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).