Event detection for playback management in an audio device
US-2017040029-A1 · Feb 9, 2017 · US
US10475471B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10475471-B2 |
| Application number | US-201715583012-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 1, 2017 |
| Priority date | Oct 11, 2016 |
| Publication date | Nov 12, 2019 |
| Grant date | Nov 12, 2019 |
Official abstract text for this publication.
In accordance with embodiments of the present disclosure, an integrated circuit for implementing at least a portion of an audio device may include an audio output configured to reproduce audio information by generating an audio output signal for communication to at least one transducer of the audio device, a microphone input configured to receive an input signal indicative of ambient sound external to the audio device, and a processor configured to implement an impulsive noise detector. The impulsive noise detector may comprise a plurality of processing blocks for determining a feature vector based on characteristics of the input signal and a neural network for determining based on the feature vector whether the impulsive event comprises a speech event or a noise event.
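The data flow the abstract describes (several processing blocks each contributing a statistic to a feature vector, followed by a classifier that labels the event as speech or noise) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two statistics (`log_energy`, `crest_factor`), the single logistic unit standing in for the neural network, and the weights used in the usage note are all hypothetical and are not taken from the patent.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[float]


def log_energy(frame: Frame) -> float:
    """Log of the mean squared amplitude (small floor avoids log(0))."""
    return math.log(sum(x * x for x in frame) / len(frame) + 1e-12)


def crest_factor(frame: Frame) -> float:
    """Peak-to-RMS ratio: large for a short burst, about 1.41 for a sine."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    peak = max(abs(x) for x in frame)
    return peak / (rms + 1e-12)


# Hypothetical processing blocks; each one contributes a single statistic.
FEATURE_BLOCKS: List[Callable[[Frame], float]] = [log_energy, crest_factor]


def feature_vector(frame: Frame) -> List[float]:
    """Run every processing block on the frame and collect the results."""
    return [block(frame) for block in FEATURE_BLOCKS]


def classify(features: Sequence[float], weights: Sequence[float], bias: float) -> str:
    """Single logistic unit used here as a stand-in for the neural network."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    p_noise = 1.0 / (1.0 + math.exp(-z))
    return "noise" if p_noise >= 0.5 else "speech"
```

With hand-picked, untrained placeholder values such as `weights=[0.0, 1.0]` and `bias=-3.0`, a frame containing a single click is labeled `"noise"` and a steady sine frame is labeled `"speech"`. A real detector would learn its weights from training data rather than use values like these.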
Opening claim text (preview).
What is claimed is:
1. An integrated circuit for implementing at least a portion of an audio device, comprising: an audio input for receiving audio information to be reproduced; an audio output configured to reproduce the audio information by generating an audio output signal for communication to at least one transducer of the audio device; a microphone input configured to receive an input signal indicative of ambient sound external to the audio device; and a processor configured to implement an impulsive noise detector comprising: a plurality of processing blocks for determining a feature vector based on characteristics of the input signal, wherein the feature vector comprises a statistic indicative of a degree of temporal modulation of a signal spectrum of the input signal; a pre-processing block configured to: augment the feature vector with at least one previous frame of the input signal to generate an augmented feature vector, wherein the augmented feature vector has an increased feature redundancy relative to the feature vector based on temporal correlations between frames; and reduce the feature redundancy of the augmented feature vector via feature dimension reduction; and a neural network for determining, based on the augmented feature vector, whether an impulsive event comprises a speech event or a noise event, wherein the neural network is trained with an augmented training data set based on amplitude, time, and frequency scaling of an initial training data set of impulsive noise events; wherein the processor is further configured to modify the generated audio output signal based on the determination of the neural network.
2. The integrated circuit of claim 1, wherein the processor is further configured to modify a characteristic associated with the audio information in response to detection of a noise event.
3. The integrated circuit of claim 2, wherein the characteristic comprises one or more of an amplitude of the audio information and spectral content of the audio information.
4. The integrated circuit of claim 2, wherein the characteristic comprises at least one coefficient of a voice-based processing algorithm including at least one of a noise suppressor, a background noise estimator, an adaptive beamformer, dynamic beam steering, always-on voice, and a conversation-based playback management system.
5. The integrated circuit of claim 1, wherein the feature vector further comprises statistics indicative of harmonicity and sparsity of the signal spectrum of the input signal to determine whether the impulsive event comprises a speech event or a noise event.
6. The integrated circuit of claim 5, wherein the harmonicity at a particular frequency is based on a ratio of total energy to harmonic energy.
7. The integrated circuit of claim 5, wherein the sparsity is based on a harmonic product spectrum and a spectral flatness measure of the input signal.
8. The integrated circuit of claim 1, wherein the feature vector comprises a statistic indicative of an acoustic energy present in the input signal.
9. The integrated circuit of claim 1, wherein the feature vector comprises a statistic indicative of an occurrence of a signal burst event of the input signal.
10. The integrated circuit of claim 9, wherein the statistic indicative of the occurrence of the signal burst event is based on a normalized signal energy normalized by an instantaneous signal dynamic range.
11. The integrated circuit of claim 1, wherein the feature vector comprises a statistic indicative of mel cepstral coefficients of the input signal.
12. The integrated circuit of claim 1, wherein the pre-processing block is further configured to: normalize statistics of the augmented feature vector with respect to each other.
13. The integrated circuit of claim 1, wherein the temporal modulation is based on changes in a sub-band spectral flatness measure of the input signal.
14. A method for impulsive noise detection comprising: receiving, at an audio input, audio information to be reproduced; receiving an input signal indicative of ambient sound external to an audio device; determining a feature vector based on characteristics of the input signal, wherein the feature vector comprises a statistic indicative of a degree of temporal modulation of a signal spectrum of the input signal; augmenting the feature vector with at least one previous frame of the input signal to generate an augmented feature vector, wherein the augmented feature vector has an increased feature redundancy relative to the feature vector based on temporal correlations between frames; reducing the feature redundancy of the augmented feature vector via feature dimension reduction; using a neural network to determine, based on the augmented feature vector, whether an impulsive event comprises a speech event or a noise event, wherein the neural network is trained with an augmented training data set based on amplitude, time, and frequency scaling of an initial training data set of impulsive noise events; and reproducing the audio information by generating an audio output signal for communication to at least one transducer of an audio device based on the input signal and the determination of whether the impulsive event comprises a speech event or a noise event, wherein the generated audio output signal is modified based on the determination of the neural network.
15. The method of claim 14, further comprising modifying a characteristic associated with the audio information in response to detection of a noise event.
16. The method of claim 15, wherein the characteristic comprises one or more of an amplitude of the audio information and spectral content of the audio information.
17. The method of claim 15, wherein the characteristic comprises at least one coefficient of a voice-based processing algorithm including at least one of a noise suppressor, a background noise estimator, an adaptive beamformer, dynamic beam steering, always-on voice, and a conversation-based playback management system.
18. The method of claim 14, wherein the feature vector comprises statistics indicative of harmonicity and sparsity of the signal spectrum of the input signal to determine whether the impulsive event comprises a speech event or a noise event.
19. The method of claim 18, wherein the harmonicity at a particular frequency is based on a ratio of total energy to harmonic energy.
20. The method of claim 18, wherein the sparsity is based on a harmonic product spectrum and a spectral flatness measure of the input signal.
21. The method of claim 14, wherein the feature vector comprises a statistic indicative of an acoustic energy present in the input signal.
22. The method of claim 14, wherein the feature vector comprises a statistic indicative of an occurrence of a signal burst event of the input signal.
23. The method of claim 22, wherein the statistic indicative of the occurrence of the signal burst event is based on a normalized signal energy normalized by an instantaneous signal dynamic range.
24. The method of claim 14, wherein the feature vector comprises a statistic indicative of mel cepstral coefficients of the input signal.
25. The method of claim 14, further comprising normalizing statistics of the augmented feature vector with respect to each other.
26. The method of claim 14, wherein the temporal modulation is based on changes in a sub-band
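Claim 13 ties the temporal modulation statistic to changes in a sub-band spectral flatness measure. As a rough illustration, the sketch below uses the conventional definition of spectral flatness (geometric mean of the power spectrum divided by its arithmetic mean) and, as an assumption, takes the mean absolute frame-to-frame change of the per-band flatness as the modulation statistic. The claim text shown here does not give the exact formula, band layout, or frame size, so the function names, the equal-width bands, and the naive DFT are all illustrative choices.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Naive DFT power for bins 1..N/2 (the DC bin is skipped)."""
    n = len(frame)
    spec = []
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def spectral_flatness(power: Sequence[float], eps: float = 1e-12) -> float:
    """Geometric mean over arithmetic mean: near 1 if flat, near 0 if peaky."""
    vals = [p + eps for p in power]  # eps keeps log() defined for empty bins
    log_gm = sum(math.log(v) for v in vals) / len(vals)
    am = sum(vals) / len(vals)
    return math.exp(log_gm) / am


def subband_flatness(power: Sequence[float], n_bands: int) -> List[float]:
    """Flatness computed separately over n_bands equal-width bands (assumed layout)."""
    width = len(power) // n_bands
    return [spectral_flatness(power[b * width:(b + 1) * width])
            for b in range(n_bands)]


def temporal_modulation(frames: Sequence[Sequence[float]], n_bands: int = 4) -> float:
    """Mean absolute frame-to-frame change in sub-band flatness (assumed statistic)."""
    if len(frames) < 2:
        return 0.0  # no change can be measured from a single frame
    sfm = [subband_flatness(power_spectrum(f), n_bands) for f in frames]
    diffs = [abs(a - b)
             for prev, cur in zip(sfm, sfm[1:])
             for a, b in zip(cur, prev)]
    return sum(diffs) / len(diffs)
```

Two identical frames give a modulation of zero, while a tonal frame followed by a single-sample impulse gives a clearly positive value, because the flatness of the band containing the tone jumps from near 0 to near 1.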