Detection of acoustic impulse events in voice applications using a neural network

US10475471B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10475471-B2
Application number  US-201715583012-A
Country             US
Kind code           B2
Filing date         May 1, 2017
Priority date       Oct 11, 2016
Publication date    Nov 12, 2019
Grant date          Nov 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In accordance with embodiments of the present disclosure, an integrated circuit for implementing at least a portion of an audio device may include an audio output configured to reproduce audio information by generating an audio output signal for communication to at least one transducer of the audio device, a microphone input configured to receive an input signal indicative of ambient sound external to the audio device, and a processor configured to implement an impulsive noise detector. The impulsive noise detector may comprise a plurality of processing blocks for determining a feature vector based on characteristics of the input signal and a neural network for determining based on the feature vector whether the impulsive event comprises a speech event or a noise event.
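The abstract does not say which signal characteristics go into the feature vector, but claims 1 and 13 on this page mention a temporal-modulation statistic based on changes in a sub-band spectral flatness measure. The sketch below illustrates that idea under stated assumptions: it uses the common textbook definition of spectral flatness (geometric mean over arithmetic mean of the power spectrum), and the function names, band edges, `eps` floor, and mean-absolute-difference statistic are hypothetical choices, not details taken from the patent.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(power, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power values.

    Close to 1 for a flat (noise-like or click-like) spectrum and close to 0
    for a peaky (tonal) one. Assumption: because of the eps floor, a band with
    almost no energy also evaluates close to 1.
    """
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power) + eps
    return math.exp(log_mean) / arith_mean


def subband_flatness(frame, bands):
    """Flatness per sub-band; `bands` is a list of (lo_bin, hi_bin) pairs."""
    power = power_spectrum(frame)
    return [spectral_flatness(power[lo:hi]) for lo, hi in bands]


def temporal_modulation(frames, bands):
    """Mean absolute frame-to-frame change in sub-band flatness (hypothetical statistic)."""
    sfm = [subband_flatness(f, bands) for f in frames]
    diffs = [
        abs(cur[b] - prev[b])
        for prev, cur in zip(sfm, sfm[1:])
        for b in range(len(bands))
    ]
    return sum(diffs) / len(diffs)
```

For a sequence of identical frames the statistic is zero. For frames that alternate between a steady tone and a click it is clearly positive, which is the kind of contrast a downstream classifier could use alongside other features.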

First claim

Opening claim text (preview).

What is claimed is:

1. An integrated circuit for implementing at least a portion of an audio device, comprising: an audio input for receiving audio information to be reproduced; an audio output configured to reproduce the audio information by generating an audio output signal for communication to at least one transducer of the audio device; a microphone input configured to receive an input signal indicative of ambient sound external to the audio device; and a processor configured to implement an impulsive noise detector comprising: a plurality of processing blocks for determining a feature vector based on characteristics of the input signal, wherein the feature vector comprises a statistic indicative of a degree of temporal modulation of a signal spectrum of the input signal; a pre-processing block configured to: augment the feature vector with at least one previous frame of the input signal to generate an augmented feature vector, wherein the augmented feature vector has an increased feature redundancy relative to the feature vector based on temporal correlations between frames; and reduce the feature redundancy of the augmented feature vector via feature dimension reduction; and a neural network for determining, based on the augmented feature vector, whether an impulsive event comprises a speech event or a noise event, wherein the neural network is trained with an augmented training data set based on amplitude, time, and frequency scaling of an initial training data set of impulsive noise events; wherein the processor is further configured to modify the generated audio output signal based on the determination of the neural network.
2. The integrated circuit of claim 1, wherein the processor is further configured to modify a characteristic associated with the audio information in response to detection of a noise event.
3. The integrated circuit of claim 2, wherein the characteristic comprises one or more of an amplitude of the audio information and spectral content of the audio information.
4. The integrated circuit of claim 2, wherein the characteristic comprises at least one coefficient of a voice-based processing algorithm including at least one of a noise suppressor, a background noise estimator, an adaptive beamformer, dynamic beam steering, always-on voice, and a conversation-based playback management system.
5. The integrated circuit of claim 1, wherein the feature vector further comprises statistics indicative of harmonicity and sparsity of the signal spectrum of the input signal to determine whether the impulsive event comprises a speech event or a noise event.
6. The integrated circuit of claim 5, wherein the harmonicity at a particular frequency is based on a ratio of total energy to harmonic energy.
7. The integrated circuit of claim 5, wherein the sparsity is based on a harmonic product spectrum and a spectral flatness measure of the input signal.
8. The integrated circuit of claim 1, wherein the feature vector comprises a statistic indicative of an acoustic energy present in the input signal.
9. The integrated circuit of claim 1, wherein the feature vector comprises a statistic indicative of an occurrence of a signal burst event of the input signal.
10. The integrated circuit of claim 9, wherein the statistic indicative of the occurrence of the signal burst event is based on a normalized signal energy normalized by an instantaneous signal dynamic range.
11. The integrated circuit of claim 1, wherein the feature vector comprises a statistic indicative of mel cepstral coefficients of the input signal.
12. The integrated circuit of claim 1, wherein the pre-processing block is further configured to: normalize statistics of the augmented feature vector with respect to each other.
13. The integrated circuit of claim 1, wherein the temporal modulation is based on changes in a sub-band spectral flatness measure of the input signal.
14. A method for impulsive noise detection comprising: receiving, at an audio input, audio information to be reproduced; receiving an input signal indicative of ambient sound external to an audio device; determining a feature vector based on characteristics of the input signal, wherein the feature vector comprises a statistic indicative of a degree of temporal modulation of a signal spectrum of the input signal; augmenting the feature vector with at least one previous frame of the input signal to generate an augmented feature vector, wherein the augmented feature vector has an increased feature redundancy relative to the feature vector based on temporal correlations between frames; reducing the feature redundancy of the augmented feature vector via feature dimension reduction; using a neural network to determine, based on the augmented feature vector, whether an impulsive event comprises a speech event or a noise event, wherein the neural network is trained with an augmented training data set based on amplitude, time, and frequency scaling of an initial training data set of impulsive noise events; and reproducing the audio information by generating an audio output signal for communication to at least one transducer of an audio device based on the input signal and the determination of whether the impulsive event comprises a speech event or a noise event, wherein the generated audio output signal is modified based on the determination of the neural network.
15. The method of claim 14, further comprising modifying a characteristic associated with the audio information in response to detection of a noise event.
16. The method of claim 15, wherein the characteristic comprises one or more of an amplitude of the audio information and spectral content of the audio information.
17. The method of claim 15, wherein the characteristic comprises at least one coefficient of a voice-based processing algorithm including at least one of a noise suppressor, a background noise estimator, an adaptive beamformer, dynamic beam steering, always-on voice, and a conversation-based playback management system.
18. The method of claim 14, wherein the feature vector comprises statistics indicative of harmonicity and sparsity of the signal spectrum of the input signal to determine whether the impulsive event comprises a speech event or a noise event.
19. The method of claim 18, wherein the harmonicity at a particular frequency is based on a ratio of total energy to harmonic energy.
20. The method of claim 18, wherein the sparsity is based on a harmonic product spectrum and a spectral flatness measure of the input signal.
21. The method of claim 14, wherein the feature vector comprises a statistic indicative of an acoustic energy present in the input signal.
22. The method of claim 14, wherein the feature vector comprises a statistic indicative of an occurrence of a signal burst event of the input signal.
23. The method of claim 22, wherein the statistic indicative of the occurrence of the signal burst event is based on a normalized signal energy normalized by an instantaneous signal dynamic range.
24. The method of claim 14, wherein the feature vector comprises a statistic indicative of mel cepstral coefficients of the input signal.
25. The method of claim 14, further comprising normalizing statistics of the augmented feature vector with respect to each other.
26. The method of claim 14, wherein the temporal modulation is based on changes in a sub-band
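
The pre-processing steps recited in claims 1, 12, 14, and 25 (augmenting each feature vector with previous frames, normalizing the statistics, then reducing the redundancy that stacking introduces) can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the patented implementation: the claims do not name a dimension-reduction method, so principal component analysis computed by power iteration is assumed here, and every function name, parameter, and padding choice is hypothetical.

```python
import math


def stack_frames(features, context=1):
    """Append the `context` previous frames to each frame's feature vector.

    Assumption: the earliest frames reuse frame 0 as padding, so the output
    keeps one row per input frame.
    """
    stacked = []
    for i in range(len(features)):
        row = []
        for lag in range(context + 1):
            row.extend(features[max(i - lag, 0)])
        stacked.append(row)
    return stacked


def standardize(rows, eps=1e-12):
    """Scale every column to zero mean and (approximately) unit variance."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    stds = [math.sqrt(sum((r[d] - means[d]) ** 2 for r in rows) / n) + eps
            for d in range(dims)]
    return [[(r[d] - means[d]) / stds[d] for d in range(dims)] for r in rows]


def top_components(rows, k, iters=200):
    """Leading k principal directions of zero-mean rows (power iteration with deflation)."""
    n, dims = len(rows), len(rows[0])
    cov = [[sum(r[a] * r[b] for r in rows) / n for b in range(dims)]
           for a in range(dims)]
    components = []
    for _ in range(k):
        # Deterministic, irregular start vector so it is unlikely to be
        # orthogonal to the direction being sought.
        v = [math.sin(d + 1.0) for d in range(dims)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(dims)) for a in range(dims)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the variance captured along v.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(dims))
                  for a in range(dims))
        # Deflate so the next pass finds the next direction.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(dims)]
               for a in range(dims)]
        components.append(v)
    return components


def project(rows, components):
    """Project each row onto the given directions."""
    return [[sum(r[d] * c[d] for d in range(len(c))) for c in components]
            for r in rows]
```

With a slowly varying feature, the stacked columns are strongly correlated, so a single projected dimension retains most of the variance. That is the sense in which stacking raises redundancy and dimension reduction removes it again before the vector reaches the classifier.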

Assignees

Cirrus Logic Int Semiconductor Ltd, Cirrus Logic Inc

Inventors

Classifications

  • characterised by the method used for estimating noise · CPC title

  • G10L25/30 · Primary

    using neural networks · CPC title

  • the extracted parameters being power information · CPC title

  • Microphone arrays; Beamforming · CPC title

  • G10L25/84 · Primary

    for discriminating voice from noise · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10475471B2 cover?
In accordance with embodiments of the present disclosure, an integrated circuit for implementing at least a portion of an audio device may include an audio output configured to reproduce audio information by generating an audio output signal for communication to at least one transducer of the audio device, a microphone input configured to receive an input signal indicative of ambient sound exte…
Who is the assignee on this patent?
Cirrus Logic Int Semiconductor Ltd, Cirrus Logic Inc
What technology area does this patent fall under?
Primary CPC classification G10L25/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 12, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).