Speech coding by quantizing with random-noise signal
US-9263051-B2 · Feb 16, 2016 · US
US9530423B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9530423-B2 |
| Application number | US-58399809-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 28, 2009 |
| Priority date | Jan 6, 2009 |
| Publication date | Dec 27, 2016 |
| Grant date | Dec 27, 2016 |
Official abstract text for this publication.
A method, system and program for encoding and decoding speech according to a source-filter model whereby speech is modelled to comprise a source signal filtered by a time-varying filter. The method comprises: receiving a speech signal comprising successive frames; for each of a plurality of frames of the speech signal: adding a predetermined noise signal generated by a quantization gain multiplied by 0.5 times an inverse of a pitch correlation to the speech signal to generate a simulated signal, determining linear predictive coding coefficients based on the simulated signal frame, and determining a linear predictive coding residual signal based on the linear predictive coding coefficients and one of the speech signal and the simulated signal; and then forming an encoded signal representing said speech signal, based on the linear predictive coding coefficients and the linear predictive coding residual signal.
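The noise-injection step can be illustrated with a minimal Python sketch. It follows the gain rule recited in the claims (a constant times the square root of the residual energy, further multiplied by 0.5 times the inverse of the pitch correlation for voiced frames). The function names, the default value of `constant`, the use of unit-variance Gaussian noise, the reading of "combining" as multiplication, and the guard against a non-positive pitch correlation are all assumptions made for illustration; they are not taken from the patent.

```python
import math
import random


def quantization_gain(residual_energy, constant=1.0, voiced=False,
                      pitch_correlation=1.0):
    """Hypothetical helper: gain = constant * sqrt(residual_energy).

    For voiced frames the gain is further multiplied by
    0.5 * (1 / pitch_correlation), as recited in the claims.
    """
    gain = constant * math.sqrt(residual_energy)
    if voiced:
        if pitch_correlation <= 0.0:
            # Assumption: the inverse is only meaningful for a positive value.
            raise ValueError("pitch_correlation must be positive")
        gain *= 0.5 / pitch_correlation
    return gain


def simulated_frame(frame, gain, rng):
    """Add white noise scaled by the gain to every sample of a frame.

    Assumption: "combining" the white noise with the gain is read here as
    multiplying unit-variance Gaussian noise by the gain.
    """
    return [x + gain * rng.gauss(0.0, 1.0) for x in frame]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    speech = [0.0, 0.3, 0.5, 0.3, 0.0, -0.3, -0.5, -0.3]
    g = quantization_gain(residual_energy=0.04, constant=0.5,
                          voiced=True, pitch_correlation=0.8)
    print(g, simulated_frame(speech, g, rng))
```

In this sketch the noise variance equals the square of the gain, so a weaker pitch correlation on a voiced frame produces a louder simulated noise component.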
Opening claim text (preview).
The invention claimed is:
1. A method of encoding speech according to a source-filter model, the speech modelled to comprise a source signal filtered by a time-varying filter, the method comprising: receiving a speech signal, the speech signal comprising successive frames; for each of the frames of the speech signal: adding, by a first signal-processing module, a predetermined noise signal to the speech signal to generate a simulated signal, the predetermined noise signal generated by combining a white noise signal with a quantization gain value, the quantization gain value calculated as a constant multiplied by a square root of residual energy from a noise shaping analysis, wherein for voiced frames of the speech signal, the quantization gain value is further multiplied by 0.5 times an inverse of a pitch correlation determined by a pitch analysis; determining, by a second signal-processing module, linear predictive coding coefficients based on the simulated signal frame and determining a linear predictive coding residual signal based on the linear predictive coding coefficients and one of the speech signal or the simulated signal; generating a quantized residual signal based on the linear predictive coding residual signal; and forming, by a third signal-processing module, an encoded signal representing said speech signal by arithmetically encoding the quantized residual signal and the linear predictive coding coefficients.
2. The method according to claim 1, wherein generating the quantized residual signal further comprises generating an associated quantization noise signal, and wherein said predetermined noise signal comprises white noise having a variance equal to a variance of the quantization noise signal.
3. An encoder for encoding speech according to a source-filter model, the speech modelled to comprise a source signal filtered by a time-varying filter, the encoder comprising: an input configured to receive a speech signal, the speech signal comprising successive frames; a first signal-processing module configured to generate, for each of the frames of the speech signal, a simulated signal frame by adding a predetermined noise signal to each of the speech signal frames, the predetermined noise signal generated by combining a white noise signal with a quantization gain value, the quantization gain value calculated as a constant multiplied by a square root of residual energy from a noise shaping analysis, wherein for voiced frames of the speech signal, the quantization gain value is further multiplied by 0.5 times an inverse of a pitch correlation determined by a pitch analysis; a second signal-processing module configured to determine linear predictive coding coefficients based on the simulated signal frame, the second signal-processing module further configured to determine a linear predictive coding residual signal based on the input speech signal and the linear predictive coding coefficients; a third signal-processing module configured to generate a quantized residual signal based on the linear predictive coding residual signal; and a fourth signal-processing module configured to form an encoded signal representing the speech signal by arithmetically encoding the quantized residual signal and the linear predictive coding coefficients.
4. The encoder according to claim 3, wherein generating the quantized residual signal further generates an associated quantization noise signal, and wherein said first signal-processing module is further configured to generate the predetermined noise signal to include white noise having a variance equal to a variance of the quantization noise.
5. The encoder according to claim 3 wherein the second signal-processing module comprises a linear predictive coding analysis module.
6. The encoder of claim 3, wherein the third signal-processing module comprises a noise shaping quantizer module.
7. One or more hardware memory devices having code stored thereon that, when executed by a processor, performs a method comprising: receiving a speech signal, the speech signal comprising successive frames; for each of the frames of the speech signal: adding a predetermined noise signal to the input speech signal to generate a simulated signal, the predetermined noise signal generated by combining a white noise signal with a quantization gain value, the quantization gain value calculated as a constant multiplied by a square root of residual energy from a noise shaping analysis, wherein for voiced frames of the speech signal, the quantization gain value is further multiplied by 0.5 times an inverse of a pitch correlation determined by a pitch analysis; determining linear predictive coding coefficients based on the simulated signal frame; determining a linear predictive coding residual signal based on the speech input signal and the linear predictive coding coefficients; generating a quantized residual signal based on the linear predictive coding residual signal; and forming an encoded signal representing said speech signal by arithmetically encoding the quantized residual signal and the linear predictive coding coefficients.
8. The one or more hardware memory devices according to claim 7, wherein generating the quantized residual signal further comprises generating an associated quantization noise signal, and wherein said predetermined noise signal comprises white noise having a variance equal to a variance of the quantization noise signal.
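The analysis path recited in claims 3 and 7 (coefficients derived from the simulated frame, residual derived from the input speech) can be sketched as follows. The claims do not say how the linear predictive coding coefficients are computed; the autocorrelation method with the Levinson-Durbin recursion is used here purely as a common stand-in. All function names, the `order` parameter, and the zero-history treatment at the start of a frame are assumptions for illustration, and quantization and arithmetic coding are left out.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for predictor coefficients a[1..order].

    Convention: x[n] is predicted as sum_k a[k] * x[n - k].
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) frame: keep remaining taps at 0
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_residual(x, a):
    """Prediction error of x under predictor a, assuming zero history."""
    return [x[n] - sum(a[k] * x[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(x))]


def analyse_frame(speech, noise, order):
    """Coefficients from speech + noise, residual from the clean speech."""
    simulated = [s + v for s, v in zip(speech, noise)]
    coeffs = levinson_durbin(autocorrelation(simulated, order), order)
    return coeffs, lpc_residual(speech, coeffs)


if __name__ == "__main__":
    speech = [1.0, 0.5, 0.25, 0.125, 0.0625]
    noise = [0.01, -0.02, 0.015, -0.005, 0.0]  # placeholder noise samples
    print(analyse_frame(speech, noise, order=2))
```

Claim 1 also allows the residual to be taken from the simulated signal; in this sketch that variant would pass `simulated` rather than `speech` to `lpc_residual`.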
Pitch determination of speech signals · CPC title
using spectral analysis, e.g. transform vocoders or subband vocoders · CPC title
Spectral prediction for preventing pre-echo; Temporary noise shaping [TNS], e.g. in MPEG2 or MPEG4 · CPC title
Noise filtering · CPC title
the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders · CPC title