Speech recognition power management

US9704486B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9704486-B2
Application number: US-201213711510-A
Country: US
Kind code: B2
Filing date: Dec 11, 2012
Priority date: Dec 11, 2012
Publication date: Jul 11, 2017
Grant date: Jul 11, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
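The keyword gate described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `NetworkInterface`, `handle_audio`, and the `contains_keyword` callable are hypothetical names introduced here, and the keyword detector itself is left as a pluggable function because the abstract does not specify one.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class NetworkInterface:
    """Hypothetical stand-in for a network interface that starts unpowered."""
    powered: bool = False
    sent: List[bytes] = field(default_factory=list)

    def transmit(self, audio: bytes) -> None:
        # A deactivated interface cannot send anything.
        if not self.powered:
            raise RuntimeError("network interface is deactivated")
        self.sent.append(audio)


def handle_audio(
    audio: bytes,
    contains_keyword: Callable[[bytes], bool],
    nic: NetworkInterface,
) -> bool:
    """Power the interface and forward the audio only if a keyword is found.

    Returns True when the audio was transmitted, False otherwise.
    """
    if not contains_keyword(audio):
        # No keyword: the interface stays in whatever state it was in.
        return False
    nic.powered = True   # activation modeled as providing power
    nic.transmit(audio)  # e.g. toward a remote speech recognition server
    return True
```

In this sketch the interface draws no power until `contains_keyword` returns true, which is the power-saving idea the abstract describes. The on-device alternative would replace `nic.transmit` with a call into a local recognition engine.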

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: an audio input module; an audio detection module in communication with the audio input module; a speech detection module in communication with the audio detection module; a wakeword recognition module in communication with the speech detection module; and a network interface module in communication with the wakeword recognition module, wherein: the audio detection module is configured to: receive audio input from the audio input module; determine a volume of at least a portion of the audio input; cause the audio input module to increase a sampling rate of the audio input based at least in part on the volume exceeding a threshold; and cause activation of the speech detection module based at least in part on the volume exceeding the threshold; the speech detection module is configured to determine a first score indicating a likelihood that the audio input comprises speech and cause activation of the wakeword recognition module based at least on part on the score; and the wakeword recognition module is configured to: determine a second score indicating a likelihood that the audio input comprises a wakeword; and cause activation of a network interface module based on the second score by providing power to the network interface module; and the network interface module is configured to transmit at least a portion of the obtained audio input to a remote computing device.

2. The system of claim 1, wherein the audio input device comprises a microphone, the audio detection module comprises a first digital signal processor, the speech detection module comprises a second digital signal processor, and the wakeword recognition module comprises a microprocessor.

3. The system of claim 1, wherein: the speech detection module is further configured to determine the first score using at least one of a hidden Markov model, a Gaussian mixture model, energies in a plurality of spectral bands, or signal to noise ratios in a plurality of spectral bands; and the wakeword recognition module is further configured to determine the second score using at least one of an application processing module, a hidden Markov model, and a Gaussian mixture model.

4. The system of claim 1, wherein: the wakeword recognition module is further configured to cause deactivation of the audio detection module based at least in part on the first score; and the wakeword recognition module is further configured to cause deactivation of the speech detection module based at least in part on the second score.

5. A computer-implemented method of operating a first computing device, the method comprising: receiving an audio input; determining one or more values from the audio input, wherein the one or more values comprise at least one of: a first value indicating an energy level of the audio input; or a second value indicating a likelihood that the audio input comprises speech; increasing a sampling rate of the audio input, from a first lower sampling rate to a second higher sampling rate, based at least in part on the one or more values; activating a first module of the first computing device based at least in part on the one or more values; performing an operation, by the first module, wherein the operation comprises at least one of: determining that the audio input comprises a wakeword and causing activation of a network interface module in response to determining that the audio input comprises a wakeword, wherein causing activation of the network interface module comprises providing power to the network interface module; performing speech recognition on at least a portion of the audio input to obtain speech recognition results; or causing transmission of at least a portion of the audio input to a second computing device.

6. The computer-implemented method of claim 5, wherein: the first module comprises a processor that is switchable between a low-power state and a high-power state; and the processor only performs the operation when it is in the high-power state.

7. The computer-implemented method of claim 6, wherein activating the first module comprises switching the processor from the low-power state to the high-power state.

8. The computer-implemented method of claim 6 further comprising deactivating the first module, wherein deactivating the first module comprises switching the processor from the high-power state to the low-power state.

9. The computer-implemented method of claim 6, wherein the processor comprises at least one of a digital signal processor or a microprocessor.

10. The computer-implemented method of claim 5, wherein the first module comprises a software module configured to be executed by a microprocessor.

11. The computer-implemented method of claim 10, wherein activating the first module comprises causing the microprocessor to execute the software module.

12. The computer-implemented method of claim 5, wherein the operation further comprising receiving speech recognition results from the second computing device.

13. The computer-implemented method of claim 12, wherein the speech recognition results comprise at least one of a transcription of at least a portion of the audio input and a response to an intelligent agent query included in at least a portion of the audio input.

14. The computer-implemented method of claim 12 further comprising: activating a second module of the first computing device based at least in part on the one or more values, wherein the second module is configured to implement a speech recognition application; and processing the speech recognition results with the speech recognition application.

15. The computer-implemented method of claim 5, wherein providing power to the network interface module causes the network interface module to transition from a deactivated state to an activated state.

16. The computer-implemented method of claim 15, wherein communications sent via the network interface module are prevented while the network interface module is in the deactivated state.

17. The computer-implemented method of claim 15, wherein communications sent via the network interface module are enabled while the network interface module is in the activated state.

18. The computer-implemented method of claim 5, wherein providing power to the network interface module comprises providing power to a processor of the first computing device.

19. The computer-implemented method of claim 5, further comprising determining that the energy level of the audio input satisfies a threshold, wherein the increasing the sampling rate is performed in response to determining that the energy level of the audio input satisfies the threshold.

20. A device comprising: a first processor configured to: determine one or more values, wherein the one or more values comprise at least one of a first value indicating an energy level of an audio input or a second value indicating a likelihood that the audio input comprises speech; and cause an increase in a sampling rate of the audio input, from a first lower sampling rate to a second higher sampling rate, based at least in part on the one or more values; cause activation of a second processor based at least in part on the one or more values; the second processor configured to perform an operation, wherein the operation comprises at least one of: determining that the audio input comprises a wakeword and causing activation of a network interface module in response to determining that the audio inpu

Assignees

Inventors

Classifications

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • Word spotting · CPC title

  • G10L15/28 · Primary

    Constructional details of speech recognition systems · CPC title

  • Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9704486B2 cover?
Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech rec…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/28. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 11, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).