Methods and apparatus for broadened beamwidth beamforming and postfiltering

US9990939B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9990939-B2
Application number  US-201415306767-A
Country             US
Kind code           B2
Filing date         Jul 2, 2014
Priority date       May 19, 2014
Publication date    Jun 5, 2018
Grant date          Jun 5, 2018

Abstract

Official abstract text for this publication.

Methods and apparatus for broadening the beamwidth of beamforming and postfiltering using a plurality of beamformers and signal and power spectral density mixing, and controlling a postfilter based on spatial activity detection such that de-reverberation or noise reduction is performed when a speech source is between the first and second beams.
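
The mixing step named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the use of scalar non-negative activity scores per frame, and the particular crossfade formulas are all assumptions made for illustration. The sketch derives a fading factor from two spatial activity scores, crossfades the two beamformed spectra, and blends the directional power spectral density (PSD) estimates toward a non-directional estimate as the source moves between the beams.

```python
def fading_factor(svad_1, svad_2, eps=1e-12):
    """Map two non-negative activity scores to a crossfade weight in [0, 1].

    1.0 means only beam 1 is active, 0.0 means only beam 2 is active,
    and 0.5 means equal activity (source roughly between the beams).
    The scores and this formula are hypothetical, not from the patent.
    """
    total = svad_1 + svad_2
    if total < eps:
        return 0.5  # no spatial activity: weight both beams equally
    return svad_1 / total


def mix_beams(y1, y2, alpha):
    """Crossfade two beamformed spectra (lists of complex frequency bins)."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(y1, y2)]


def mix_psd(dir_psd_1, dir_psd_2, nondir_psd, alpha):
    """Blend directional PSDs, then pull toward the non-directional PSD.

    beta is 0 when the source sits in one beam (alpha = 0 or 1) and 1 when
    it sits midway between them (alpha = 0.5). This weighting is an assumption.
    """
    beta = 1.0 - abs(2.0 * alpha - 1.0)
    mixed = []
    for d1, d2, nd in zip(dir_psd_1, dir_psd_2, nondir_psd):
        directional = alpha * d1 + (1.0 - alpha) * d2
        mixed.append((1.0 - beta) * directional + beta * nd)
    return mixed
```

With this weighting, a source inside one beam leaves that beam's signal and directional PSD untouched, while a source midway between the beams gets an equal mix of both beamformed signals and a purely non-directional PSD. A real system would likely smooth the fading factor over time to avoid audible switching.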

First claim

Opening claim text (preview).

The invention claimed is:

1. A method, comprising:
    receiving a plurality of microphone signals from respective microphones, wherein the microphone signals comprise speech from a speaker with a command for an action to be taken by a system having an automatic speech recognition (ASR) system;
    forming, using a computer processor, a first beam and generating a first beamformed signal, a first spatial activity detection signal and a first directional power spectral density signal from the plurality of microphone signals;
    forming a second beam and generating a second beamformed signal, a second spatial activity detection signal and a second directional power spectral density signal from the plurality of microphone signals;
    determining non-directional power spectral density signals from the plurality of microphone signals;
    determining whether speech received by the microphones is from a source located within the first and second beams or between the first and second beams;
    mixing the first and second beamformed signals, the first and second directional power spectral density signals and the non-directional power spectral density signals based upon the first and second spatial activity detection signals to generate a mixed beamformed signal and a mixed power spectral density signal;
    performing postfiltering based on the mixed power spectral density signal, wherein spatial postfiltering is performed on the mixed beamformed signal when the source is within the first or second beams and non-spatial postfiltering is performed on the mixed beamformed signal when the source is in between the first and second beams; and
    performing automatic speech recognition after the postfiltering and implementing, by the system, the command from the speaker.

2. The method according to claim 1, further including forming further beams and determining whether the speech received by the microphones is from a source located within or between the first, second or further beams.

3. The method according to claim 1, further including determining that the location of the source is between the first and second beams by detecting speech in adjacent spatial voice activity detection (SVAD) sectors.

4. The method according to claim 1, further including computing a fading factor from the first and second spatial activity detection signals for use in generating the mixed beamformed signal.

5. The method according to claim 1, further using a single post filter module to perform the postfiltering.

6. The method according to claim 1, further including generating a power spectral density estimate comprising a reverberation estimate.

7. The method according to claim 6, further including generating a power spectral density estimate comprising a stationary noise estimate.

8. The method according to claim 1, further including performing non-spatial deverberation if the source is located between the first and second beams.

9. The method according to claim 1, further including using a blocking matrix to generate the first directional power spectral density signal.

10. The method according to claim 1, further including performing speech recognition on an output of the postfiltering.

11. An article, comprising: a non-transitory computer-readable medium having stored instructions that enable a machine to:
    receive a plurality of microphone signals from respective microphones, wherein the microphone signals comprise speech from a speaker with a command for an action to be taken by a system having an automatic speech recognition (ASR) system;
    form, using a computer processor, a first beam and generating a first beamformed signal, a first spatial activity detection signal and a first directional power spectral density signal from the plurality of microphone signals;
    form a second beam and generating a second beamformed signal, a second spatial activity detection signal and a second directional power spectral density signal from the plurality of microphone signals;
    determine non-directional power spectral density signals from the plurality of microphone signals;
    determine whether speech received by the microphones is from a source located within the first and second beams or between the first and second beams;
    mix the first and second beamformed signals, the first and second directional power spectral density signals and the non-directional power spectral density signals based upon the first and second spatial activity detection signals to generate a mixed beamformed signal and a mixed power spectral density signal;
    perform postfiltering based on the mixed power spectral density signal, wherein spatial postfiltering is performed on the mixed beamformed signal when the source is within the first or second beams and non-spatial postfiltering is performed on the mixed beamformed signal when the source is in between the first and second beams; and
    perform automatic speech recognition after the postfiltering and implementing, by the system, the command from the speaker.

12. The article according to claim 11, further including instructions to form further beams and determining whether the speech received by the microphones is from a source located within or between the first, second or further beams.

13. The article according to claim 11, further including instructions to determine that the location of the source is between the first and second beams by detecting speech in adjacent spatial voice activity detection (SVAD) sectors.

14. The article according to claim 11, further including instructions to compute a fading factor from the first and second spatial activity detection signals for use in generating the mixed beamformed signal.

15. The article according to claim 11, further instructions to use a single post filter module to perform the postfiltering.

16. The article according to claim 11, further including instructions to generate a power spectral density estimate comprising a reverberation estimate.

17. The article according to claim 16, further including instructions to generate a power spectral density estimate comprising a stationary noise estimate.

18. The article according to claim 11, further including instructions to perform non-spatial deverberation if the source is located between the first and second beams.

19. The article according to claim 11, further including instructions to use a blocking matrix to generate the first directional power spectral density signal.

20. A system, comprising: a processor; and a memory coupled to the processor, the processor and the memory configured to:
    receive a plurality of microphone signals from respective microphones, wherein the microphone signals comprise speech from a speaker with a command for an action to be taken by the system which includes an automatic speech recognition (ASR) system;
    form, using a computer processor, a first beam and generating a first beamformed signal, a first spatial activity detection signal and a first directional power spectral density signal from the plurality of microphone signals;
    form a second beam and generating a second beamformed signal, a second spatial activity detection signal and a second directional power spectral density signal from the plurality of microphone signals;
    determine non-directional power spectral density signals from the plurality of microphone signals;
    determine whether speech received by the microphones is from a source located within the first and second beams or between the first and second beams;
    mix the first and second beamfo
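
The between-beams decision and the postfiltering step recited in the claims can be sketched as follows. This is a generic, textbook-style spectral gain with a floor, offered only as an illustration under stated assumptions; the claim text shown here does not specify the gain rule. The names `source_between_beams` and `postfilter`, the activity threshold, and the gain floor are hypothetical. The mixed PSD is treated as an estimate of the interference (noise or reverberation) power in each frequency bin, so a single postfilter behaves spatially or non-spatially depending on which PSD dominated the mix.

```python
def source_between_beams(svad_1, svad_2, threshold=0.5):
    """Treat the source as between the beams when both adjacent sectors
    report speech activity above a (hypothetical) threshold."""
    return svad_1 > threshold and svad_2 > threshold


def postfilter(mixed_spectrum, mixed_psd, gain_floor=0.1):
    """Apply a per-bin gain to the mixed beamformed spectrum.

    mixed_spectrum: list of complex frequency bins.
    mixed_psd: list of non-negative interference power estimates per bin.
    The gain is 1 - interference/observed power, limited below by gain_floor
    to avoid over-suppression. The rule itself is an assumption.
    """
    out = []
    for y, phi in zip(mixed_spectrum, mixed_psd):
        power = abs(y) ** 2  # instantaneous power of the observed bin
        if power > 0.0:
            gain = max(gain_floor, 1.0 - phi / power)
        else:
            gain = gain_floor  # empty bin: nothing to estimate from
        out.append(gain * y)
    return out
```

For example, a bin with observed power 4 and estimated interference power 1 is scaled by 0.75, while a bin dominated by interference is attenuated only down to the floor. Practical systems usually smooth the power estimates over time before computing the gain.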

Assignees

Nuance Communications Inc

Inventors

Classifications

  • the noise being echo, reverberation of the speech · CPC title

  • Processing in the frequency domain · CPC title

  • the extracted parameters being power information · CPC title

  • G10L25/84 · Primary

    for discriminating voice from noise · CPC title

  • Microphone arrays; Beamforming · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9990939B2 cover?
Methods and apparatus for broadening the beamwidth of beamforming and postfiltering using a plurality of beamformers and signal and power spectral density mixing, and controlling a postfilter based on spatial activity detection such that de-reverberation or noise reduction is performed when a speech source is between the first and second beams.
Who is the assignee on this patent?
Nuance Communications Inc
What technology area does this patent fall under?
Primary CPC classification G10L25/84. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 5, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).