Low power voice detection

US9633654B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9633654-B2
Application number: US-201113997070-A
Country: US
Kind code: B2
Filing date: Dec 6, 2011
Priority date: Dec 6, 2011
Publication date: Apr 25, 2017
Grant date: Apr 25, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods of enabling voice processing with minimal power consumption includes recording time-domain audio signal at a first clock frequency and a first voltage, and performing Fast Fourier Transform (FFT) operations on the time-domain audio signal at a second clock frequency to generate frequency-domain audio signal. The frequency domain audio signal may be enhanced to obtain better signal to noise ratio, through one or multiple filtering and enhancing techniques. The enhanced audio signal may be used to generate the total signal energy and estimate the background noise energy. Decision logic may determine from the signal energy and the background noise, the presence or absence of the human voice. The first clock frequency may be different from the second clock frequency.
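The flow in the abstract (transform to the frequency domain, compute total frame energy, estimate background noise, then decide) can be illustrated with a minimal sketch. This is not the patented implementation: the recursive FFT, the smoothing-based noise estimate, and the names and values `detect_voice`, `margin`, `noise_smoothing`, and `make_frame` are all assumptions chosen for illustration, and the enhancement and filtering stage is omitted.

```python
import cmath
import math
import random


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_energy(frame):
    """Total energy of one frame, computed from its frequency-domain bins."""
    spectrum = fft(frame)
    return sum(abs(b) ** 2 for b in spectrum) / len(frame)


def detect_voice(frames, margin=4.0, noise_smoothing=0.9):
    """Return one boolean per frame: True where energy clearly exceeds the noise estimate.

    margin and noise_smoothing are hypothetical tuning values, not taken from the patent.
    """
    noise = None
    decisions = []
    for frame in frames:
        energy = frame_energy(frame)
        if noise is None:
            noise = energy  # assumption: the first frame holds only background noise
        # Decision: subtract the noise estimate from the total frame energy.
        is_voice = (energy - noise) > margin * noise
        if not is_voice:
            # Update the background estimate only on frames judged to be noise.
            noise = noise_smoothing * noise + (1 - noise_smoothing) * energy
        decisions.append(is_voice)
    return decisions


def make_frame(n, tone_amplitude, rng):
    """Synthetic test frame: an optional sine tone plus low-level Gaussian noise."""
    return [tone_amplitude * math.sin(2 * math.pi * 8 * i / n) + rng.gauss(0.0, 0.01)
            for i in range(n)]


rng = random.Random(0)
frames = ([make_frame(256, 0.0, rng) for _ in range(3)]
          + [make_frame(256, 1.0, rng)]
          + [make_frame(256, 0.0, rng)])
decisions = detect_voice(frames)
print(decisions)
```

In this synthetic run only the fourth frame carries a tone, so it is the only frame flagged. A real detector would need a more robust noise estimate than the simple smoothing used here.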

First claim

Opening claim text (preview).

We claim:

1. An apparatus comprising: a transform component to operate between a first mode and a second mode, wherein the first mode is based on a first clock frequency and a first voltage, and wherein the second mode is based on a second clock frequency and a second voltage, and logic to, store a digital representation of a time-domain audio signal in a memory configured to operate based on the first clock frequency and the first voltage, wherein the memory is coupled to a first buffer that is coupled in series to a second buffer, transmit the time-domain audio signal to the second buffer via the first buffer, wherein the first buffer is configured to operate based on the first clock frequency and the first voltage, and wherein the second buffer is configured to operate based on the second clock frequency and the second voltage, and cause the transform component operating in the second mode to perform Fast Fourier Transform (FFT) operations on the time-domain audio signal to generate a frequency-domain audio signal, wherein the first clock frequency is to be faster than the second clock frequency, wherein the transform component operating between the first mode and the second mode obtains a balance between active and leakage power.

2. The apparatus of claim 1, wherein the logic is further to: cause the transform component operating in the first mode to perform a first set of FFT operations, perform complex number multiplication operations, and cause the transform component operating in the second mode to perform a second set of FFT operations in series with the first set of FFT operations.

3. The apparatus of claim 2, wherein the second voltage is to be lower than the first voltage.

4. The apparatus of claim 3, wherein the logic is to: perform noise suppression operations, and perform filtering operations on the frequency-domain audio signal based on the second clock frequency and the second voltage to generate an enhanced audio signal.

5. The apparatus of claim 4, wherein the complex number multiplication operations and filtering operations are to be implemented using a same hardware component.

6. The apparatus of claim 4, wherein the logic is to perform human voice detection operations on the enhanced audio signal based on the second clock frequency and the second voltage.

7. The apparatus of claim 6, wherein the logic is to determine total energy in a frame of the enhanced audio signal, and to determine background noise in the frame of the enhanced audio signal.

8. The apparatus of claim 7, wherein the logic is to perform median filtering operations, and perform contour tracking operations.

9. The apparatus of claim 7, wherein the logic is to execute a command associated with the detected human voice based on the first clock frequency and the first voltage.

10. A computer-implemented method comprising: recording a digital representation of a time-domain audio signal in a memory at a first clock frequency and a first voltage for a first mode, wherein the memory is configured to operate based on the first clock frequency and the first voltage, and wherein the memory is coupled to a first buffer that is coupled in series to a second buffer; transmitting the time-domain audio signal to the second buffer via the first buffer; and performing Fast Fourier Transform (FFT) operations, using a transform component, on the digital representation of the time-domain audio signal at a second clock frequency for a second mode to generate a frequency-domain audio signal, wherein the first buffer is configured to operate based on the first clock frequency and the first voltage, wherein the second buffer is configured to operate based on the second clock frequency and the second voltage, wherein the first clock frequency is faster than the second clock frequency, and wherein the FFT operations operating between the first mode and the second mode obtain a balance between active and leakage power.

11. The method of claim 10, wherein the FFT operations are performed at a second voltage for the second mode that is lower than the first voltage for the first mode.

12. The method of claim 11, further including: performing noise suppression operations on the frequency-domain audio signal at the second clock frequency and the second voltage to generate an enhanced audio signal.

13. The method of claim 12, further including: performing voice detection operations on the enhanced audio signal at the second clock frequency and the second voltage to detect human voice.

14. The method of claim 13, wherein performing the human voice detection operations includes: determining total energy in a frame of the enhanced audio signal; determining energy associated with background noise in the frame of the enhanced audio signal; and detecting the human voice by subtracting the energy associated with the background noise from the total energy in the frame of the enhanced audio signal.

15. The method of claim 13, further including: executing a command associated with the human voice at the first clock frequency and the first voltage.

16. The method of claim 15, wherein the time-domain audio signal is recorded continuously and converted from Pulse Density Modulation (PDM) to Pulse-code modulation (PCM) at the first clock frequency and the first voltage.

17. The method of claim 16, wherein the FFT operations are performed in series.

18. A non-transitory computer readable storage medium comprising a set of instructions which, if executed by a processor, cause a computer to: record a digital representation of a time-domain audio signal to a memory at a first clock frequency and a first voltage for a first mode, wherein the memory is configured to operate based on the first clock frequency and the first voltage, and wherein the memory is coupled to a first buffer that is coupled in series to a second buffer; transmit the time-domain audio signal to the second buffer via the first buffer; and perform Fast Fourier Transform (FFT) operations on the digital representation of the time-domain audio signal at a second clock frequency for a second mode to generate a frequency-domain audio signal, wherein the first buffer is configured to operate based on the first clock frequency and the first voltage, wherein the second buffer is configured to operate based on the second clock frequency and the second voltage, wherein the first clock frequency is to be faster than the second clock frequency, and wherein performing the FFT operations between the first mode and the second mode obtain a balance between active and leakage power.

19. The medium of claim 18, wherein the FFT operations are to be performed at a second voltage for the second mode lower than the first voltage for the first mode.

20. The medium of claim 19, further comprising a set of instructions which, if executed by the processor, cause the computer to: perform noise suppression operations on the frequency-domain audio signal at the second clock frequency and the second voltage to generate an enhanced audio signal; perform voice detection operations on the enhanced audio signal at the second clock frequency and the second voltage to detect human voice; and execute a command associated with the human voice at the first clock frequency and the first voltage.

21. The medium of claim 20, wherein the voice detection operations are to be performed by determining total energy in a frame of the enhanced audio signal, determining energy associated
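The independent claims refer to a "balance between active and leakage power" when the transform runs at a slower clock and (per the dependent claims) a lower voltage. The excerpt on this page gives no numerical model, so the sketch below uses a generic first-order textbook model as an assumption: switching energy scales with C·V² per cycle, while leakage energy grows with the time the task takes. Every constant and name (`energy_per_task`, `switched_cap`, `leak_current`, `v_threshold`, `k_freq`) is hypothetical.

```python
def energy_per_task(voltage, cycles=1_000_000, switched_cap=1e-10,
                    leak_current=1e-3, v_threshold=0.3, k_freq=1e8):
    """Energy in joules to finish a fixed workload at a given supply voltage.

    First-order model with hypothetical constants (not from the patent):
    - achievable clock frequency rises linearly with voltage above a threshold
    - active energy is C * V^2 per cycle
    - leakage energy is leakage power (I_leak * V) times the run time
    """
    frequency = k_freq * (voltage - v_threshold)
    run_time = cycles / frequency
    active = switched_cap * voltage ** 2 * cycles
    leakage = leak_current * voltage * run_time
    return active + leakage, active, leakage


# Sweep supply voltages from 0.35 V to 1.20 V in 0.05 V steps.
voltages = [v / 100 for v in range(35, 125, 5)]
totals = {v: energy_per_task(v)[0] for v in voltages}
best_voltage = min(totals, key=totals.get)

for v in voltages:
    total, active, leakage = energy_per_task(v)
    print(f"{v:.2f} V  active={active:.2e} J  leakage={leakage:.2e} J  total={total:.2e} J")
print("lowest-energy point on this grid:", best_voltage, "V")
```

Under these assumed constants the total energy is lowest at an intermediate voltage. Near the threshold the slow clock lets leakage dominate, and at high voltage the switching energy dominates, which is the kind of trade-off the claim language points to.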

Assignees

Inventors

Classifications

  • Memory allocation or algorithm optimisation to reduce hardware requirements · CPC title

  • G10L25/84 · Primary

    for discriminating voice from noise · CPC title

  • Power saving characterised by the action undertaken · CPC title

  • Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm · CPC title

  • Processing in the frequency domain · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9633654B2 cover?
Methods of enabling voice processing with minimal power consumption includes recording time-domain audio signal at a first clock frequency and a first voltage, and performing Fast Fourier Transform (FFT) operations on the time-domain audio signal at a second clock frequency to generate frequency-domain audio signal. The frequency domain audio signal may be enhanced to obtain better signal to no…
Who is the assignee on this patent?
Raychowdhury Arijit, Beltman Willem M, Tschanz James W, and 4 more
What technology area does this patent fall under?
Primary CPC classification G10L25/84. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 25, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).