Low-power voice command detector

US9685156B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9685156-B2
Application number: US-201514656079-A
Country: US
Kind code: B2
Filing date: Mar 12, 2015
Priority date: Mar 12, 2015
Publication date: Jun 20, 2017
Grant date: Jun 20, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A low-power voice command detection method uses an audio monitoring device to capture sound. The captured sound is analyzed in steps to determine if it fulfills a number of criteria regarding sound level, voice content and identifiable voice commands. For each step the processing is more complex and power demanding. A threshold between the first and subsequent steps is used to gate further processing. This threshold is dynamically adjusted, based on the outcome of the analysis, to avoid unnecessary processing and increase system performance.
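
The staged analysis described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the use of RMS level in dB relative to full scale, and the two callbacks `contains_voice` and `recognize` are all assumptions introduced here to show how a cheap level check can gate the more expensive steps.

```python
import math
from typing import Callable, Optional, Sequence


def level_db(frame: Sequence[float]) -> float:
    """Stage 1 helper: cheap level estimate (RMS in dB relative to full scale)."""
    if not frame:
        return -math.inf
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(rms) if rms > 0 else -math.inf


def detect_command(
    frame: Sequence[float],
    threshold_db: float,
    contains_voice: Callable[[Sequence[float]], bool],
    recognize: Callable[[Sequence[float]], Optional[str]],
) -> Optional[str]:
    """Run the three stages in order of increasing cost (hypothetical sketch)."""
    # Stage 1: frames at or below the threshold are dropped with no further work.
    if level_db(frame) <= threshold_db:
        return None
    # Stage 2: more expensive voice-content check (hypothetical callback).
    if not contains_voice(frame):
        return None
    # Stage 3: most expensive step, command recognition (hypothetical callback).
    return recognize(frame)
```

In this sketch, raising `threshold_db` in a noisy environment means fewer frames reach stages 2 and 3, which is where the abstract locates the power savings.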

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device, comprising: audio circuitry for capturing ambient sound; voice command detection circuitry for determining voice commands provided by a user, the voice command detection circuitry configured to disregard, without analysing for the presence of speech, sound captured by the audio circuitry that does not exceed a first sound pressure threshold level; analyze sound captured by the audio circuitry that exceeds the first sound threshold level using a speech recognition algorithm for detecting the presence of words or ambient noise; dynamically adjust the first sound pressure threshold level based on the analysis of the captured sound; increase the first sound pressure threshold level when at least one of the captured sound is determined to be ambient noise or a word hit rate of the captured sound is below a first prescribed hit rate threshold; and decrease the first sound pressure threshold level when the captured sound includes speech having a word hit rate exceeding a second prescribed hit rate threshold or a prescribed time period has elapsed.

2. The device according to claim 1, wherein the voice command detection circuitry is configured to conclude the captured sound is ambient noise when a difference between at least one of the sound pressure level and phase of sound captured by a first microphone and the sound pressure level and phase of sound captured by a second microphone is less than a corresponding first prescribed threshold.

3. The device according to claim 1, wherein the audio circuitry comprises a first microphone arranged at one location of the electronic device, and a second microphone arranged at a different location of the electronic device, and wherein the voice command detection circuitry is configured to compare at least one of a sound pressure level and phase of sound captured by the first microphone with a corresponding at least one of a sound pressure level and phase of sound captured by the second microphone, and conclude the captured sound is user speech when a difference between the at least one of the sound pressure level and phase of sound captured by the first microphone and the corresponding at least one of the sound pressure level and phase of sound captured by the second microphone exceeds a corresponding second prescribed threshold.

4. The device according to claim 1, wherein the voice detection circuitry is configured to: increase the first sound pressure threshold level when the captured sound exceeds the first sound pressure threshold level and a percentage of noise within the captured sound exceeds a prescribed percentage of the total captured sound; and decrease the first sound pressure threshold level when at least one of i) a prescribed time period has elapsed or ii) the captured sound includes speech and is less than a second sound pressure threshold level, the second sound pressure threshold level greater than the first sound pressure threshold level.

5. The device according to claim 4, wherein when the first sound pressure threshold level is decreased under step ii, the voice detection circuitry is configured to maintain the first and second sound pressure threshold levels when the captured sound includes speech and exceeds the second sound pressure threshold level.

6. The device according to claim 1, wherein the voice detection circuitry is configured to adjust a second sound pressure threshold level in proportion to an adjustment made to the first sound pressure threshold level.

7. The device according to claim 1, where the voice command detection circuitry is configured to disregard the sound captured by the audio circuitry that does not exceed the first sound pressure threshold by performing no further processing based on the captured sound.

8. The device according to claim 1, wherein the first sound pressure threshold level comprises a sound amplitude.

9. The device according to claim 1, wherein the voice detection circuitry is configured to: increase the first sound pressure threshold level when the captured sound exceeds the first sound pressure threshold level and a percentage of noise within the captured sound exceeds a prescribed percentage of the total captured sound; and decrease the first sound pressure threshold level when the captured sound includes speech and is less than a second sound pressure threshold level, the second sound pressure threshold level greater than the first sound pressure threshold level.

10. A method for detecting voice commands, comprising: using audio circuitry to capture sound; disregarding, without analysing for the presence of speech, sound captured by the audio circuitry that does not exceed a first sound pressure threshold level, analyzing sound captured by the audio circuitry that exceeds the first sound pressure threshold level using a speech recognition algorithm for detecting the presence of words or ambient noise in the captured sound; dynamically adjusting the first sound pressure threshold level based on the analysis of the captured sound; increasing the first sound pressure threshold level when at least one of the captured sound is determined to be ambient noise or a word hit rate of the captured sound is below a first prescribed hit rate threshold; and decreasing the first sound pressure threshold level when the captured sound includes speech having a word hit exceeding a second prescribed hit rate threshold or a prescribed time period has elapsed.

11. The method according to claim 10, where dynamically adjusting the first sound pressure threshold level comprises: increasing the first sound pressure threshold level when the captured sound exceeds the first sound pressure threshold level and a percentage of noise within the captured sound exceeds a prescribed percentage of the total captured sound; and decreasing the first sound pressure threshold level when at least one of i) a prescribed time period has elapsed or ii) the captured sound includes speech and is less than a second sound pressure threshold level, the second sound pressure threshold level greater than the first sound pressure threshold level.

12. The method according to claim 11, wherein decreasing the first sound pressure threshold level under step ii includes maintaining the first and second sound pressure threshold levels when the captured sound includes speech and exceeds the second sound pressure threshold level.

13. The method according to claim 10, wherein analyzing the captured sound includes using the captured sound in a speech recognition method to determine the presence of speech.

14. The method according to claim 10, further comprising concluding the captured sound is ambient noise when a difference between the at least one of the sound pressure level and phase of sound captured by a first microphone and the corresponding at least one of the sound pressure level and phase of sound captured by a second microphone is less than a corresponding first prescribed value.

15. The method according to claim 10, wherein the audio circuitry comprises a first microphone arranged at one location of an electronic device, and a second microphone arranged at a different location of the electronic device, and wherein analyzing the captured sound includes comparing at least one of a sound pressure level and phase of sound captured by the first microphone with a corresponding at least one of the sound pressure level and phase of sound captured by the second microphone, and concluding the captured sound is user speech when a difference between the at least one of the sound pressure level and phase captured
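
As a rough illustration of the threshold rules in claims 1 and 10 and the two-microphone comparison in claims 2, 3, 14 and 15, the sketch below may help. Everything numeric here is an assumption: the claims only speak of "prescribed" thresholds and time periods, so the step size, hit rate limits, timeout, clamping range and level differences are placeholder values. The names `ThresholdPolicy`, `adjust_threshold` and `classify_two_mic` are hypothetical, the comparison uses level only (the claims also allow phase), and checking the increase condition before the decrease condition is a choice made for this sketch, since the claims do not state a precedence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdPolicy:
    # All values are illustrative assumptions, not taken from the patent.
    low_hit_rate: float = 0.2   # stands in for the "first prescribed hit rate threshold"
    high_hit_rate: float = 0.8  # stands in for the "second prescribed hit rate threshold"
    timeout_s: float = 30.0     # stands in for the "prescribed time period"
    step_db: float = 3.0        # size of one adjustment
    min_db: float = 30.0        # clamp range for the threshold
    max_db: float = 90.0


def adjust_threshold(
    threshold_db: float,
    is_ambient_noise: bool,
    word_hit_rate: float,
    seconds_since_change: float,
    policy: ThresholdPolicy = ThresholdPolicy(),
) -> float:
    """Return the new first threshold after one analysis pass."""
    # Increase: sound was ambient noise, or too few words were recognized.
    if is_ambient_noise or word_hit_rate < policy.low_hit_rate:
        return min(threshold_db + policy.step_db, policy.max_db)
    # Decrease: speech recognized with a high hit rate, or the timer expired.
    if word_hit_rate > policy.high_hit_rate or seconds_since_change >= policy.timeout_s:
        return max(threshold_db - policy.step_db, policy.min_db)
    return threshold_db


def classify_two_mic(
    level_a_db: float,
    level_b_db: float,
    noise_max_diff_db: float = 1.0,   # assumed "first prescribed threshold"
    speech_min_diff_db: float = 6.0,  # assumed "second prescribed threshold"
) -> str:
    """Classify by the level difference between two microphones."""
    diff = abs(level_a_db - level_b_db)
    if diff < noise_max_diff_db:
        return "ambient_noise"  # similar level at both microphones
    if diff > speech_min_diff_db:
        return "user_speech"    # markedly louder at one microphone
    return "undetermined"
```

For example, under these assumed values a pass classified as ambient noise moves a 60 dB threshold to 63 dB, while a pass with a 0.9 word hit rate moves it to 57 dB.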

Assignees

Sony Corp, Sony Mobile Communications Inc

Inventors

Classifications

  • G10L25/84 (Primary): for discriminating voice from noise

  • Adaptive threshold

  • Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10)

  • Execution procedure of a spoken command

  • G10L15/22 (Primary): Procedures used during a speech recognition process, e.g. man-machine dialogue

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9685156B2 cover?
A low-power voice command detection method uses an audio monitoring device to capture sound. The captured sound is analyzed in steps to determine if it fulfills a number of criteria regarding sound level, voice content and identifiable voice commands. For each step the processing is more complex and power demanding. A threshold between the first and subsequent steps is used to gate further proc…
Who is the assignee on this patent?
Sony Corp, Sony Mobile Communications Inc
What technology area does this patent fall under?
Primary CPC classification G10L25/84. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 20, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).