Signal processing apparatus having voice activity detection unit and related signal processing methods

US8972252B2 · US · B2

Patent metadata
Field               Value
Publication number  US-8972252-B2
Application number  US-201213615515-A
Country             US
Kind code           B2
Filing date         Sep 13, 2012
Priority date       Jul 6, 2012
Publication date    Mar 3, 2015
Grant date          Mar 3, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A signal processing apparatus includes a speech recognition system and a voice activity detection unit. The voice activity detection unit is coupled to the speech recognition system, and arranged for detecting whether an audio signal is a voice signal and accordingly generating a voice activity detection result to the speech recognition system to control whether the speech recognition system should perform speech recognition upon the audio signal.
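The gating described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function name `gate_recognition` and the `is_voice` and `recognize` callables are hypothetical stand-ins for the voice activity detection unit and the speech recognition system.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]


def gate_recognition(
    frames: Iterable[Frame],
    is_voice: Callable[[Frame], bool],
    recognize: Callable[[Frame], str],
) -> List[str]:
    """Run the recognizer only on frames the detector flags as voice.

    `is_voice` plays the role of the voice activity detection unit and
    `recognize` the role of the speech recognition system. Both are
    hypothetical placeholders supplied by the caller.
    """
    results: List[str] = []
    for frame in frames:
        # The detection result controls whether recognition is performed.
        if is_voice(frame):
            results.append(recognize(frame))
    return results
```

Frames that the detector rejects never reach the recognizer, which is the control relationship the abstract describes.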

First claim

Opening claim text (preview).

What is claimed is:
1. A signal processing apparatus, comprising: a speech recognition system; and a voice activity detection unit, coupled to the speech recognition system, the voice activity detection unit arranged for detecting whether an audio signal is a voice signal or not, and outputting a voice activity detection result to the speech recognition system to control whether the speech recognition system should perform speech recognition upon the audio signal; wherein only when the speech recognition system enters a power saving mode from a normal mode, the voice activity detection unit is enabled; and when the speech recognition system is in the normal mode, the voice activity detection unit is disabled.
2. The signal processing apparatus of claim 1, wherein when the voice activity detection unit detects that the audio signal is the voice signal, the speech recognition system performs speech recognition upon the audio signal, and when the voice activity detection unit detects that the audio signal is not the voice signal, the speech recognition system does not perform speech recognition upon the audio signal.
3. The signal processing apparatus of claim 1, wherein when the voice activity detection unit detects that the audio signal is a voice signal, the speech recognition system leaves the power saving mode, and enters the normal mode to perform speech recognition upon the audio signal.
4. The signal processing apparatus of claim 3, wherein the speech recognition system performs speech recognition upon the audio signal to determine whether the audio signal contains a predetermined command information; and when the speech recognition system determines that the audio signal does not contain the predetermined command information, the speech recognition system leaves the normal mode and enters the power saving mode.
5. The signal processing apparatus of claim 4, wherein the predetermined command information is a system wake-up command.
6. The signal processing apparatus of claim 1, wherein the audio signal comprises a current audio frame, and the voice activity detection unit is arranged to compare an average power of the current audio frame with a threshold value.
7. The signal processing apparatus of claim 1, wherein the audio signal comprises a current audio frame and at least one previous audio frame, and the voice activity detection unit is arranged to determine a signal power trend value according to an average power of the at least one previous audio frame, calculate a difference between an average power of the current audio frame and the signal power trend value, and compare the difference with a threshold value.
8. The signal processing apparatus of claim 1, wherein the audio signal comprises a plurality of audio frames, and the voice activity detection unit is arranged to compare a number of successive audio frames, determined as not containing the audio signal, with a threshold value.
9. The signal processing apparatus of claim 1, wherein the audio signal comprises a previous audio frame and a current audio frame, and the voice activity unit is arranged to calculate a difference between an average power of the current audio frame and an average power of the previous audio frame, and compare the difference with a threshold value.
10. A signal processing method, comprising: detecting whether an audio signal is a voice signal or not, and outputting a voice activity detection result to a speech recognition system to control whether the speech recognition system should perform speech recognition upon the audio signal; wherein the steps of detecting whether the audio signal is the voice signal are performed only when the speech recognition system enters a power saving mode from a normal mode; and the steps of detecting whether the audio signal is the voice signal are not performed when the speech recognition system is in the normal mode.
11. The signal processing method of claim 10, wherein the steps of controlling the speech recognition system to recognize the audio signal according to the voice activity detection result comprise: when detecting that the audio signal is the voice signal, the speech recognition system performs speech recognition upon the audio signal; and when detecting that the audio signal is not the voice signal, the speech recognition system does not perform speech recognition upon the audio signal.
12. The signal processing method of claim 10, wherein the steps of controlling the speech recognition system to recognize the audio signal according to the voice activity detection result comprise: when detecting that the audio signal is the voice signal, the speech recognition system leaves the power saving mode, and enters the normal mode to perform speech recognition upon the audio signal.
13. The signal processing method of claim 12, wherein the speech recognition system performs speech recognition upon the audio signal to determine whether the audio signal contains a predetermined command information, and the signal processing method further comprises: when the speech recognition system determines that the audio signal does not contain the predetermined command information, the speech recognition system leaves the normal mode and enters the power saving mode.
14. The signal processing method of claim 13, wherein the predetermined command information is a system wake-up command.
15. The signal processing method of claim 10, wherein the audio signal comprises a current audio frame, and the voice activity detection unit is arranged to compare an average power of the current audio frame with a threshold value.
16. The signal processing method of claim 10, wherein the audio signal comprises a current audio frame and at least one previous audio frame, and the steps of detecting whether the audio signal is the voice signal comprise: determining a signal power trend value according to an average power of the at least one previous audio frame; calculating a difference between an average power of the current audio frame and the signal power trend value; and comparing the difference with a threshold value.
17. The signal processing method of claim 10, wherein the audio signal comprises a plurality of audio frames, and the steps of detecting whether the audio signal is the voice signal comprise: comparing a number of successive audio frames, determined as not containing the audio signal, with a threshold value.
18. The signal processing method of claim 10, wherein the audio signal comprises a previous audio frame and a current audio frame, and the steps of detecting whether the audio signal is the voice signal comprise: calculating a difference between an average power of the current audio frame and an average power of the previous audio frame; and comparing the difference with a threshold value.
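The frame-power tests in claims 6 to 9 and the mode switching in claims 1, 3, and 4 might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation. The claims only say that a value is compared with a threshold, so the direction of each comparison (greater than), the use of mean squared amplitude as "average power", and the use of a plain mean as the "signal power trend value" are all assumptions. Every function, class, and parameter name is hypothetical.

```python
from typing import Callable, Sequence

Frame = Sequence[float]


def average_power(frame: Frame) -> float:
    """Mean squared amplitude of one frame (assumed definition of power)."""
    return sum(s * s for s in frame) / len(frame)


def exceeds_threshold(frame: Frame, threshold: float) -> bool:
    """Claims 6/15: compare the current frame's average power with a threshold."""
    return average_power(frame) > threshold


def exceeds_trend(frame: Frame, previous: Sequence[Frame], threshold: float) -> bool:
    """Claims 7/16: compare (current power - trend) with a threshold.

    The trend is assumed to be the mean power of the previous frames.
    """
    trend = sum(average_power(f) for f in previous) / len(previous)
    return average_power(frame) - trend > threshold


def exceeds_previous(frame: Frame, previous_frame: Frame, threshold: float) -> bool:
    """Claims 9/18: compare (current power - previous power) with a threshold."""
    return average_power(frame) - average_power(previous_frame) > threshold


def silence_run_reached(voice_flags: Sequence[bool], limit: int) -> bool:
    """Claims 8/17: compare the trailing run of non-voice frames with a limit."""
    run = 0
    for flag in reversed(voice_flags):
        if flag:
            break
        run += 1
    return run >= limit


class WakeController:
    """Mode switching sketched from claims 1, 3, and 4."""

    def __init__(
        self,
        is_voice: Callable[[Frame], bool],
        contains_command: Callable[[Frame], bool],
    ) -> None:
        self.is_voice = is_voice                  # voice activity detection unit
        self.contains_command = contains_command  # speech recognition system
        self.mode = "power_saving"

    def process(self, frame: Frame) -> bool:
        """Return True when the recognizer finds the predetermined command."""
        if self.mode == "power_saving":
            # Claim 1: the detector runs only in power saving mode.
            if not self.is_voice(frame):
                return False
            self.mode = "normal"        # claim 3: voice wakes the recognizer
        # Normal mode: the detector is not consulted; the recognizer runs.
        if self.contains_command(frame):
            return True
        self.mode = "power_saving"      # claim 4: no command, return to saving
        return False
```

A controller could be built with, for example, `WakeController(lambda f: exceeds_threshold(f, 0.1), my_recognizer)`, where `my_recognizer` is whatever command matcher the system provides. Any of the other detector functions can be substituted for `exceeds_threshold` once the caller keeps the required frame history.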

Assignees

Inventors

Classifications

  • detecting a user operation or a tactile contact or a motion of the device · CPC title

  • Monitoring of peripheral devices · CPC title

  • G10L15/28 (Primary)

    Constructional details of speech recognition systems · CPC title

  • Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

  • in wireless communication networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US8972252B2 cover?
A signal processing apparatus includes a speech recognition system and a voice activity detection unit. The voice activity detection unit is coupled to the speech recognition system, and arranged for detecting whether an audio signal is a voice signal and accordingly generating a voice activity detection result to the speech recognition system to control whether the speech recognition system sh…
Who is the assignee on this patent?
Hung Chia-Yu, Yeh Tsung-Li, Tu Yi-Chang, and 1 more
What technology area does this patent fall under?
Primary CPC classification G10L15/28. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 3, 2015 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).