Systems, methods, and computer-readable media for improved real-time audio processing
US-2019318755-A1 · Oct 17, 2019 · US
US11308973B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11308973-B2 |
| Application number | US-202016987475-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 7, 2020 |
| Priority date | Aug 7, 2019 |
| Publication date | Apr 19, 2022 |
| Grant date | Apr 19, 2022 |
Official abstract text for this publication.
According to an embodiment, an electronic device may include a memory configured to store a noise removal neural network model and data utilized in the noise removal neural network model, and a processor electrically connected to the memory, wherein the memory may store instructions that, when executed, enable the processor to: output a first channel signal using a first beamformer for a multi-channel audio signal; output a second channel signal using a second beamformer; generate a third channel signal that compensates for a difference in noise levels between the first channel signal and the second channel signal; and train the noise removal neural network model by using the third channel signal in which the difference in noise levels is compensated for and the first channel signal as input values.
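The abstract does not say how the two beamformers are built. As a minimal sketch, assuming two microphones and a speaker whose sound reaches both at the same instant, a sum beamformer and a difference beamformer produce the kind of channel pair described: one in which the voice is kept and one in which it is cancelled. The function names, the sine-wave stand-in for speech, and the Gaussian noise are illustrative assumptions, not taken from the patent.

```python
import math
import random

def sum_beamformer(mic_a, mic_b):
    """First channel: average of both mics, which keeps an in-phase source."""
    return [(a + b) / 2.0 for a, b in zip(mic_a, mic_b)]

def difference_beamformer(mic_a, mic_b):
    """Second channel: half the difference, which cancels an in-phase source."""
    return [(a - b) / 2.0 for a, b in zip(mic_a, mic_b)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

random.seed(0)
n = 800
# Hypothetical scene: the speaker is equidistant from both mics, so the
# speech is identical at each; the noise at each mic is independent.
speech = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
noise_a = [random.gauss(0.0, 0.1) for _ in range(n)]
noise_b = [random.gauss(0.0, 0.1) for _ in range(n)]
mic_a = [s + na for s, na in zip(speech, noise_a)]
mic_b = [s + nb for s, nb in zip(speech, noise_b)]

first_channel = sum_beamformer(mic_a, mic_b)          # speech plus averaged noise
second_channel = difference_beamformer(mic_a, mic_b)  # noise only, speech cancelled

print(f"first channel RMS:  {rms(first_channel):.3f}")
print(f"second channel RMS: {rms(second_channel):.3f}")
```

In this toy scene the second channel acts as a noise reference. Real beamformers would steer using the arrival-time differences between microphones rather than assume a perfectly centered speaker.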
Opening claim text (preview).
What is claimed is:
1. An electronic device comprising: a communication circuit; at least two microphones; a processor configured to be operatively connected to the communication circuit and the at least two microphones; and a memory configured to be operatively connected to the processor and store a noise removal neural network model using a first channel signal corresponding to a first beamformer and a second channel signal corresponding to a second beamformer as an input value, wherein the memory stores instructions that, when executed, enable the processor to: acquire the first channel signal using the first beamformer and the second channel signal using the second beamformer for an audio input signal input through the at least two microphones; generate a third channel signal in which a difference in noise levels between the first channel signal and the second channel signal is compensated; and output a neural network output signal by using the first channel signal and the third channel signal as input values for the noise removal neural network model.
2. The electronic device of claim 1, wherein the memory further stores instructions that, when executed, enable the processor to: identify sound sections and silent sections from the first channel signal and the second channel signal; and compare a difference in output levels between the silent section of the first channel signal and the silent section of the second channel signal to calculate the difference in noise levels.
3. The electronic device of claim 2, wherein the memory further stores instructions that, when executed, enable the processor to: estimate an average value of the difference in noise levels; and frequency-compensate the second channel signal with a compensation value corresponding to the average value to generate the third channel signal.
4. The electronic device of claim 1, wherein the first channel signal includes a signal in which a speaker's voice level is emphasized and noise is attenuated using a difference in arrival times of multi-channel voices and a noise removal rate, and the second channel signal includes a signal in which the speaker's voice level is attenuated.
5. The electronic device of claim 1, wherein the neural network output signal includes a signal with an improved signal-to-noise-ratio (SNR) of a voice compared to a noise signal or a voice signal from which noise is removed via the noise removal neural network model, wherein the memory further stores instructions that, when executed, enable the processor to: identify a sound section of the first channel signal, compare the first channel signal in the sound section and the neural network output signal to estimate an average value of a difference in signal level for the sound section; and process the neural network output signal in order to compensate for voice distortion based on the average value of the difference in signal level in the sound section.
6. The electronic device of claim 5, wherein the memory further stores instructions that, when executed, enable the processor to: identify a sound section and a silent section of the second channel signal, estimate a noise removal amount from frequency components of the silent section of the second channel signal; and estimate a noise removal amount of the sound section of the second channel signal.
7. The electronic device of claim 6, wherein the memory further stores instructions that, when executed, enable the processor to: compensate the noise removal amount in the silent section of the second channel signal to be reduced by the noise removal amount in the sound section of the second channel signal; and process the neural network output signal according to the compensated noise removal amount in the silent section of the second channel signal so that a noise removal amount of the neural network output signal is uniform.
8. The electronic device of claim 7, wherein the memory further stores instructions that, when executed, enable the processor to remove or attenuate residual noise in the processed neural network output signal such that the compensated voice distortion and noise removal amount are uniform.
9. The electronic device of claim 1, wherein the memory further stores instructions that, when executed, enable the processor to: transmit the audio input signal input through the at least two microphones to the first beamformer and the second beamformer by performing short time Fourier transform (STFT); and perform inverse short time Fourier transform (ISTFT) on the neural network output signal whose residual noise is removed or attenuated to restore to a sound source signal.
10. The electronic device of claim 1, wherein the noise removal neural network model is a neural network model based on a convolutional neural network (CNN), a recurrent neural network (RNN), and/or a deep neural network (DNN).
11. An electronic device comprising: a memory configured to store a noise removal neural network model and data utilized in the noise removal neural network model; and a processor configured to be electrically connected to the memory, wherein the memory stores instructions that, when executed, enable the processor to: output a first channel signal using a first beamformer for a multi-channel voice signal; output a second channel signal using a second beamformer; generate a third channel signal in which a difference in noise levels between the first channel signal and the second channel signal is compensated; and train the noise removal neural network model by using the third channel signal in which the difference in noise levels is compensated and the first channel signal as input values.
12. The electronic device of claim 11, wherein the memory stores instructions that, when executed, enable the processor to: receive the multi-channel voice signal to build the noise removal neural network model that outputs a noise-removed voice signal or a signal with improved signal-to-noise-ratio (SNR) compared to a noise signal; and train the noise removal neural network model to update a noise removal rule, a noise removal function, and/or a noise removal gain coefficient of the noise removal neural network model.
13. The electronic device of claim 12, further comprising a communication circuit, wherein the memory stores instructions that, when executed, enable the processor to transmit the noise removal neural network model, the updated noise removal neural network model, and update information of the noise removal neural network model to an external electronic device through the communication circuit.
14. The electronic device of claim 11, wherein the first channel signal includes a signal in which a speaker's voice level is emphasized and noise is attenuated using a difference in arrival times of the multi-channel voice signal and a noise removal rate, and the second channel signal includes a signal in which the speaker's voice level is attenuated.
15. The electronic device of claim 11, wherein the memory stores instructions that, when executed, enable the processor to: identify sound sections and silent sections from the first channel signal and the second channel signal; and compare a difference in output levels between the silent section of the first channel signal and the silent section of the second channel signal to calculate the difference in noise levels.
16. The electronic device of claim 15, wherein the memory stores instructions that, when executed, enable the processor to: estimate an average value of an output level difference be
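Claims 2 and 3 describe estimating the noise-level difference from silent sections and frequency-compensating the second channel to obtain the third channel. The sketch below is one possible reading, not the patented implementation. It assumes the STFT has already been applied, so each channel is a list of frames holding per-bin magnitudes, and it uses a simple energy threshold as a stand-in for whatever sound/silence detector the device actually uses. All function names, the threshold, and the sample values are hypothetical.

```python
import math

def frame_energy(frame):
    """Sum of squared bin magnitudes for one frame."""
    return sum(m * m for m in frame)

def silent_frames(first_mag, threshold):
    """Indices of frames whose first-channel energy is below the threshold
    (an assumed, simplistic silence detector)."""
    return [t for t, frame in enumerate(first_mag) if frame_energy(frame) < threshold]

def average_noise_difference_db(first_mag, second_mag, silent, eps=1e-12):
    """Per-bin mean level difference (first minus second, in dB) over silent frames."""
    if not silent:
        raise ValueError("no silent frames to estimate the noise level from")
    diff_db = []
    for k in range(len(first_mag[0])):
        total = sum(
            20.0 * math.log10((first_mag[t][k] + eps) / (second_mag[t][k] + eps))
            for t in silent
        )
        diff_db.append(total / len(silent))
    return diff_db

def compensate(second_mag, diff_db):
    """Apply the per-bin average difference as a gain to every frame of the
    second channel, yielding the third channel."""
    gains = [10.0 ** (d / 20.0) for d in diff_db]
    return [[m * g for m, g in zip(frame, gains)] for frame in second_mag]

# Hypothetical magnitudes: 4 frames x 2 frequency bins; frame 2 contains speech.
first_mag = [[0.2, 0.1], [0.2, 0.1], [2.0, 1.5], [0.2, 0.1]]
second_mag = [[0.1, 0.1], [0.1, 0.1], [0.15, 0.12], [0.1, 0.1]]

silent = silent_frames(first_mag, threshold=1.0)
diff_db = average_noise_difference_db(first_mag, second_mag, silent)
third_mag = compensate(second_mag, diff_db)

print("silent frames:", silent)
print("per-bin difference (dB):", [round(d, 2) for d in diff_db])
```

After compensation, the third channel's noise floor in the silent frames matches that of the first channel bin by bin, which is the condition the claims set before both signals are passed to the noise removal model.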
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Architecture, e.g. interconnection topology · CPC title
Noise reduction using microphones having different directional characteristics · CPC title