Speech encoding by determining a quantization gain based on inverse of a pitch correlation

US9530423B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9530423-B2
Application number  US-58399809-A
Country             US
Kind code           B2
Filing date         Aug 28, 2009
Priority date       Jan 6, 2009
Publication date    Dec 27, 2016
Grant date          Dec 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, system and program for encoding and decoding speech according to a source-filter model whereby speech is modelled to comprise a source signal filtered by a time-varying filter. The method comprises: receiving a speech signal comprising successive frames. For each of a plurality of frames of the speech signal: adding a predetermined noise signal generated by a quantization gain multiplied by 0.5 times an inverse of a pitch correlation to the speech signal to generate a simulated signal, determining linear predictive coding coefficients based on the simulated signal frame, and determining a linear predictive coding residual signal based on the linear predictive coding coefficients and one of the speech signal and the simulated signal. Then forming an encoded signal representing said speech signal, based on the linear predictive coding coefficients and the linear predictive coding residual signal.
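The noise-injection step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `constant` parameter, the use of Gaussian white noise, and the reading of "combining" as multiplication are all assumptions. Only the structure of the gain (a constant times the square root of a residual energy, further scaled by 0.5 times the inverse of the pitch correlation for voiced frames) comes from the text of the patent.

```python
import math
import random


def quantization_gain(residual_energy, pitch_correlation=None, constant=1.0):
    """Gain = constant * sqrt(residual energy).

    For voiced frames (pitch_correlation given), the gain is further
    multiplied by 0.5 times the inverse of the pitch correlation.
    `constant` is a hypothetical tuning value; the source does not state it.
    """
    gain = constant * math.sqrt(residual_energy)
    if pitch_correlation is not None:  # voiced frame
        if pitch_correlation <= 0.0:
            raise ValueError("pitch correlation must be positive")
        gain *= 0.5 / pitch_correlation
    return gain


def simulate_frame(frame, gain, rng):
    """Add white noise scaled by `gain` to every sample of the frame.

    Assumption: the white noise is unit-variance Gaussian and is
    combined with the gain by multiplication.
    """
    return [x + gain * rng.gauss(0.0, 1.0) for x in frame]


# Example: an unvoiced frame and a voiced frame with the same residual energy.
rng = random.Random(0)
speech = [0.1, -0.2, 0.3, 0.05]
unvoiced_gain = quantization_gain(4.0)
voiced_gain = quantization_gain(4.0, pitch_correlation=0.8)
simulated = simulate_frame(speech, voiced_gain, rng)
```

With this reading, a strongly periodic voiced frame (pitch correlation near 1) gets roughly half the gain of an unvoiced frame with the same residual energy, and the gain grows as the pitch correlation falls.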

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of encoding speech according to a source-filter model, the speech modelled to comprise a source signal filtered by a time-varying filter, the method comprising: receiving a speech signal, the speech signal comprising successive frames; for each of the frames of the speech signal: adding, by a first signal-processing module, a predetermined noise signal to the speech signal to generate a simulated signal, the predetermined noise signal generated by combining a white noise signal with a quantization gain value, the quantization gain value calculated as a constant multiplied by a square root of residual energy from a noise shaping analysis, wherein for voiced frames of the speech signal, the quantization gain value is further multiplied by 0.5 times an inverse of a pitch correlation determined by a pitch analysis; determining, by a second signal-processing module, linear predictive coding coefficients based on the simulated signal frame and determining a linear predictive coding residual signal based on the linear predictive coding coefficients and one of the speech signal or the simulated signal; generating a quantized residual signal based on the linear predictive coding residual signal; and forming, by a third signal-processing module, an encoded signal representing said speech signal by arithmetically encoding the quantized residual signal and the linear predictive coding coefficients.

2. The method according to claim 1, wherein generating the quantized residual signal further comprises generating an associated quantization noise signal, and wherein said predetermined noise signal comprises white noise having a variance equal to a variance of the quantization noise signal.

3. An encoder for encoding speech according to a source-filter model, the speech modelled to comprise a source signal filtered by a time-varying filter, the encoder comprising: an input configured to receive a speech signal, the speech signal comprising successive frames; a first signal-processing module configured to generate, for each of the frames of the speech signal, a simulated signal frame by adding a predetermined noise signal to each of the speech signal frames, the predetermined noise signal generated by combining a white noise signal with a quantization gain value, the quantization gain value calculated as a constant multiplied by a square root of residual energy from a noise shaping analysis, wherein for voiced frames of the speech signal, the quantization gain value is further multiplied by 0.5 times an inverse of a pitch correlation determined by a pitch analysis; a second signal-processing module configured to determine linear predictive coding coefficients based on the simulated signal frame, the second signal-processing module further configured to determine a linear predictive coding residual signal based on the input speech signal and the linear predictive coding coefficients; a third signal-processing module configured to generate a quantized residual signal based on the linear predictive coding residual signal; and a fourth signal-processing module configured to form an encoded signal representing the speech signal by arithmetically encoding the quantized residual signal and the linear predictive coding coefficients.

4. The encoder according to claim 3, wherein generating the quantized residual signal further generates an associated quantization noise signal, and wherein said first signal-processing module is further configured to generate the predetermined noise signal to include white noise having a variance equal to a variance of the quantization noise.

5. The encoder according to claim 3 wherein the second signal-processing module comprises a linear predictive coding analysis module.

6. The encoder of claim 3, wherein the third signal-processing module comprises a noise shaping quantizer module.

7. One or more hardware memory devices having code stored thereon that, when executed by a processor, performs a method comprising: receiving a speech signal, the speech signal comprising successive frames; for each of the frames of the speech signal: adding a predetermined noise signal to the input speech signal to generate a simulated signal, the predetermined noise signal generated by combining a white noise signal with a quantization gain value, the quantization gain value calculated as a constant multiplied by a square root of residual energy from a noise shaping analysis, wherein for voiced frames of the speech signal, the quantization gain value is further multiplied by 0.5 times an inverse of a pitch correlation determined by a pitch analysis; determining linear predictive coding coefficients based on the simulated signal frame; determining a linear predictive coding residual signal based on the speech input signal and the linear predictive coding coefficients; generating a quantized residual signal based on the linear predictive coding residual signal; and forming an encoded signal representing said speech signal by arithmetically encoding the quantized residual signal and the linear predictive coding coefficients.

8. The one or more hardware memory devices according to claim 7, wherein generating the quantized residual signal further comprises generating an associated quantization noise signal, and wherein said predetermined noise signal comprises white noise having a variance equal to a variance of the quantization noise signal.
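The analysis step recited in the claims (linear predictive coding coefficients derived from the simulated frame, residual derived from the input speech) might look like the sketch below. The claims do not say how the coefficients are computed; the autocorrelation method with the Levinson-Durbin recursion is used here purely as a common, assumed choice. All function names are hypothetical, and samples before the start of a frame are treated as zero for simplicity.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[lag] = sum_i frame[i] * frame[i + lag]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for a[1..order].

    Convention: x[n] is predicted as sum_k a[k] * x[n - k].
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate (e.g. all-zero) frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_residual(frame, coeffs):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n - k]."""
    residual = []
    for n, sample in enumerate(frame):
        prediction = sum(c * frame[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(sample - prediction)
    return residual


def analyze_frame(speech, simulated, order):
    """Coefficients come from the simulated (noise-added) frame,
    while the residual is computed on the original speech frame."""
    coeffs = levinson_durbin(autocorrelation(simulated, order), order)
    return coeffs, lpc_residual(speech, coeffs)
```

Passing the same frame as both `speech` and `simulated` reduces this to ordinary LPC analysis; the claimed method differs in that the coefficients are fitted to the noise-added frame before the residual is taken from the speech.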

Assignees

Inventors

Classifications

  • Pitch determination of speech signals · CPC title

  • using spectral analysis, e.g. transform vocoders or subband vocoders · CPC title

  • G10L19/03 · Primary

    Spectral prediction for preventing pre-echo; Temporal noise shaping [TNS], e.g. in MPEG2 or MPEG4 · CPC title

  • Noise filtering · CPC title

  • the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9530423B2 cover?
A method, system and program for encoding and decoding speech according to a source-filter model whereby speech is modelled to comprise a source signal filtered by a time-varying filter. The method comprises: receiving a speech signal comprising successive frames. For each of a plurality of frames of the speech signal: adding a predetermined noise signal generated by a quantization gain multipl…
Who is the assignee on this patent?
Vos Koen Bernard, Skype
What technology area does this patent fall under?
Primary CPC classification G10L19/03. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 27, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).