US10854218B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10854218-B2 |
| Application number | US-201716469938-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 15, 2017 |
| Priority date | Dec 15, 2016 |
| Publication date | Dec 1, 2020 |
| Grant date | Dec 1, 2020 |
Official abstract text for this publication.
A multichannel microphone-based reverberation time estimation method and device which use a deep neural network (DNN) are disclosed. A multichannel microphone-based reverberation time estimation method using a DNN, according to one embodiment, comprises the steps of: receiving a voice signal through a multichannel microphone; deriving a feature vector including spatial information by using the inputted voice signal; and estimating the degree of reverberation by applying the feature vector to the DNN.
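The spatial information in the feature vector is specified in the claims as a cross-correlation function between two microphones. A minimal sketch of one way to compute such a function is given here, assuming two equal-length channels, mean removal, and normalization by the channel energies. The function name `cross_correlation`, the `max_lag` parameter, and the normalization are illustrative assumptions, not details taken from the patent.

```python
import math


def cross_correlation(x, y, max_lag):
    """Normalized cross-correlation of two channels for lags -max_lag..max_lag.

    Hypothetical helper: the lag range and normalization are assumptions.
    """
    if len(x) != len(y):
        raise ValueError("channels must have equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xc = [v - mean_x for v in x]  # remove the mean of each channel
    yc = [v - mean_y for v in y]
    norm = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    if norm == 0.0:
        return [0.0] * (2 * max_lag + 1)
    result = []
    for lag in range(-max_lag, max_lag + 1):
        # r[lag] = sum over t of x[t] * y[t + lag], using only valid indices
        acc = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                acc += xc[t] * yc[u]
        result.append(acc / norm)
    return result
```

Entry `max_lag + d` of the returned list corresponds to lag `d`, so a second channel that is a delayed copy of the first yields a peak at the index matching that delay. In a pipeline like the one described, the list would be concatenated with the other features before being passed to the DNN.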
Opening claim text (preview).
What is claimed is:

1. A method for multichannel microphone-based reverberation time estimation using a deep neural network (DNN), the method comprising: receiving an input of a voice signal through a multichannel microphone; deriving a feature vector that includes spatial information using the input voice signal, and estimating a degree of reverberation by applying the feature vector to the DNN, wherein the deriving of the feature vector comprises: deriving a negative-side variance (NSV) by deriving time and frequency information from the input voice signal using a short-time Fourier transform (STFT) and by deriving distribution of envelopes for each frequency band based on the derived time and frequency information; and deriving a cross-correlation function representing a correlation between two microphones in the input voice signal, and wherein the estimating of the degree of reverberation comprises estimating a reverberation time by using the derived NSV and the cross-correlation function as an input of the DNN.

2. The method of claim 1, wherein the receiving of the voice signal through the multichannel microphone comprises estimating relative spatial information between voice signals input using the multichannel microphone.

3. The method of claim 1, wherein the deriving of the NSV comprises: deriving a log-energy envelope from a domain of the STFT; deriving a gradient from the log-energy envelope using a least squares linear fitting; and deriving an NSV for estimating a reverberation time having a negative gradient, excluding a reverberation time having a positive gradient.

4. The method of claim 1, wherein the estimating of the degree of reverberation by applying the feature vector to the DNN comprises estimating a reverberation time by using the derived NSV as an input of the DNN.

5. The method of claim 1, wherein the DNN comprises three hidden layers, and each of the hidden layers is configured to be finely adjusted through a pre-training process using a plurality of epochs.

6. An apparatus for multichannel microphone-based reverberation time estimation using a deep neural network (DNN), the apparatus comprising: an inputter configured to receive an input of a voice signal through a multichannel microphone; a feature vector extractor configured to derive a feature vector that includes spatial information using the input voice signal; and a reverberation estimator configured to estimate a degree of reverberation by applying the feature vector to the DNN, wherein the feature vector extractor comprises: a negative-side variance (NSV) deriver configured to derive an NSV by deriving time and frequency information from the input voice signal using a short-time Fourier transform (STFT) and by deriving distribution of envelopes for each frequency band based on the derived time and frequency information; and a cross-correlation function deriver configured to derive a cross-correlation function representing a correlation between two microphones in the input voice signal, and wherein the reverberation estimator is configured to estimate a reverberation time by using the derived NSV the cross-correlation function as an input of the DNN.

7. The apparatus of claim 6, wherein the NSV deriver is configured to derive a log-energy envelope from a domain of the STFT, to derive a gradient from the log-energy envelope using a least squares linear fitting, and to derive an NSV for estimating a reverberation having a negative gradient, excluding a reverberation time having a positive gradient.

8. The apparatus of claim 6, wherein the reverberation estimator is configured to estimate a reverberation time by using the derived NSV as an input of the DNN.

9. The apparatus of claim 6, wherein the DNN comprises three hidden layers, and each of the hidden layers is configured to be finely adjusted through a pre-training process using a plurality of epochs.
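The NSV steps recited in claims 1 and 3 (STFT, per-band log-energy envelope, least-squares gradient, negative gradients only) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the frame length, hop size, fit length, Hann window, and the function names are hypothetical, and the NSV is taken here to be the mean square of the negative gradients (the variance of the negative side mirrored about zero), since the text on this page does not give the formula.

```python
import cmath
import math


def stft_log_energy(signal, frame_len=64, hop=32):
    """Log-energy envelopes from a Hann-windowed STFT, indexed env[band][frame]."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bands = frame_len // 2 + 1
    # Precomputed DFT factors for the non-negative frequency bands
    twiddle = [[cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)] for k in range(n_bands)]
    env = [[] for _ in range(n_bands)]
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bands):
            coeff = sum(f * w for f, w in zip(frame, twiddle[k]))
            env[k].append(math.log(abs(coeff) ** 2 + 1e-12))  # floor avoids log(0)
    return env


def ls_slope(values):
    """Least-squares slope of values against the frame indices 0..len-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def negative_side_variance(signal, frame_len=64, hop=32, fit_len=5):
    """One NSV value per frequency band (assumed definition, see lead-in)."""
    nsv = []
    for band in stft_log_energy(signal, frame_len, hop):
        # Gradient of the log-energy envelope over each sliding fit window
        slopes = [ls_slope(band[i:i + fit_len])
                  for i in range(len(band) - fit_len + 1)]
        negative = [s for s in slopes if s < 0]  # positive gradients are excluded
        nsv.append(sum(s * s for s in negative) / len(negative) if negative else 0.0)
    return nsv
```

Under this assumed definition, a signal whose energy decays quickly produces steeper negative gradients and therefore larger per-band values than a slowly decaying one. The resulting per-band list is the kind of vector that, together with the cross-correlation function, would be supplied as DNN input.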
Supervised learning · CPC title
Feedforward networks · CPC title
Two microphones, one receiving mainly the noise signal and the other one mainly the speech signal · CPC title
the noise being echo, reverberation of the speech · CPC title
characterised by the method used for estimating noise · CPC title