Low-rank hidden input layer for speech recognition neural network
US-2016092766-A1 · Mar 31, 2016 · US
US9866984B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9866984-B2 |
| Application number | US-201615355053-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 18, 2016 |
| Priority date | Dec 14, 2015 |
| Publication date | Jan 9, 2018 |
| Grant date | Jan 9, 2018 |
Official abstract text for this publication.
A method includes extracting a difference value through extraction of features of a front audio channel signal and a surround channel of multichannel sound content by setting the front audio channel signal and the surround channel as input and output channel signals, respectively, training a deep neural network (DNN) model by setting the input channel signal and the difference value as an input and an output of the DNN model, respectively, normalizing a frequency-domain signal of the input channel signal by converting the input channel signal into the frequency-domain signal, and extracting estimated difference values by decoding the normalized frequency-domain signal through the DNN model, deriving an estimated spectral amplitude of the surround channel based on the front audio channel signal and the difference value, and deriving an audio signal of a final surround channel by converting the estimated spectral amplitude of the surround channel into the time domain.
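The abstract does not say how the frequency-domain signal is normalized; claim 3 only mentions scaling to a value of 0 to 1. A minimal sketch, assuming plain min-max scaling over one frame of spectral magnitudes (the function names `min_max_normalize` and `denormalize` are hypothetical and do not come from the patent):

```python
def min_max_normalize(values):
    """Scale a list of spectral magnitudes linearly into [0, 1].

    Assumption: min-max scaling; the patent text does not name the scheme.
    Returns the scaled values plus the range needed to undo the scaling.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input has no spread to scale, so map everything to 0.0.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Invert min_max_normalize using the stored range."""
    return [s * (hi - lo) + lo for s in scaled]


# Illustrative magnitudes for one frame (made-up numbers, not patent data).
magnitudes = [0.0, 2.0, 4.0, 8.0]
scaled, lo, hi = min_max_normalize(magnitudes)
restored = denormalize(scaled, lo, hi)
```

Keeping `lo` and `hi` alongside the scaled values lets a network output be mapped back to the original magnitude range before the signal is converted to the time domain.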
Opening claim text (preview).
What is claimed is:

1. A method of generating surround channel audio in a front channel only stereo audio system from a surround channel audio signal comprising a front audio channel signal and a surround channel signal, comprising: transforming the front audio channel signal and the surround channel signal into frequency-domain signals; extracting a difference value of the transformed front audio channel signal and the transformed surround channel signal; training a deep neural network (DNN) model using the difference value and the transformed front audio signal to obtain a DNN parameter; normalizing the transformed front audio channel signal; calculating an estimated difference value of the front audio channel signal and the surround channel signal from the normalized transformed front audio channel signal and the DNN parameter; deriving an estimated transformed surround channel signal based on the front audio channel signal and the estimated difference value; deriving an final audio signal for play in the front channel only stereo system by converting the estimated transformed surround channel signal into a time domain; and playing the final audio signal in the front channel only stereo system.

2. The method of generating surround channel audio according to claim 1, wherein transforming the front audio channel signal and the surround channel signal into frequency-domain signals comprises transforming the front audio channel signal and the surround channel signal by short-time Fourier transform (STFT).

3. The method of generating surround channel audio according to claim 1, further comprising: normalizing the difference value and the transformed front audio channel signal to a value of 0 to 1.

4. The method of generating surround channel audio according to claim 1, wherein the difference value is obtained by subtracting a certain proportion of the transformed front audio channel signal from the transformed surround channel signal.

5. The method of generating surround channel audio according to claim 4, wherein the certain proportion is represented by ε for limiting the range of the estimated transformed surround channel signal generated from the DNN model, and has a value of 0.5 such that the estimated transformed surround channel signal comprises a certain portion of the transformed front audio channel signal.

6. The method of generating surround channel audio according to claim 1, wherein deriving the estimated transformed surround channel signal comprises calculating a sum of a certain proportion of the transformed surround channel signal and the estimated difference value.

7. The method of generating surround channel audio according to claim 6, wherein the certain proportion is represented by ε and set to a value of 0.5, ε being a factor serving to adjust a degree of limiting the transformed front audio channel signal.

8. The method of generating surround channel audio according to claim 1, wherein deriving the final audio signal comprises converting the estimated transformed surround channel signal into a time domain by inverse STFT with reference to a phase of the front audio channel signal.
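The arithmetic in claims 4 to 8 can be sketched on a single frame of spectral magnitudes. This is an illustration under stated assumptions, not the patented implementation: all function names are hypothetical, the input numbers are made up, and the true difference value stands in for the DNN output. The reconstruction step here adds ε times the front magnitude, mirroring claim 4; claim 6 as printed refers to a proportion of the surround channel signal, so treating the front channel as the reference is an assumption of this sketch.

```python
import cmath

EPSILON = 0.5  # the proportion named in claims 5 and 7


def difference_value(front_mag, surround_mag, eps=EPSILON):
    """Training target: surround magnitude minus eps times front magnitude."""
    return [s - eps * f for f, s in zip(front_mag, surround_mag)]


def estimate_surround(front_mag, est_diff, eps=EPSILON):
    """Estimated surround magnitude: eps times front magnitude plus the
    estimated difference (assumed to be the inverse of difference_value)."""
    return [eps * f + d for f, d in zip(front_mag, est_diff)]


def with_front_phase(est_mag, front_spectrum):
    """Pair each estimated magnitude with the phase of the matching front bin,
    giving complex bins that an inverse STFT could consume."""
    return [cmath.rect(m, cmath.phase(z)) for m, z in zip(est_mag, front_spectrum)]


# One illustrative frame: complex front bins and surround magnitudes.
front_spectrum = [complex(3, 4), complex(0, 2), complex(-1, 0)]
front_mag = [abs(z) for z in front_spectrum]
surround_mag = [4.0, 1.5, 0.25]

diff = difference_value(front_mag, surround_mag)

# Stand-in for the DNN: reuse the true difference as the "estimate".
est_mag = estimate_surround(front_mag, diff)
est_spectrum = with_front_phase(est_mag, front_spectrum)
```

With a perfect difference estimate the reconstruction returns the original surround magnitudes, which makes the pair of functions easy to check before a trained model is substituted for the stand-in.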
using neural networks · CPC title
Vocoder architecture · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
using orthogonal transformation · CPC title
of the pseudo five- or more-channel type, e.g. virtual surround · CPC title