Speech/audio signal processing method and apparatus

US9691396B2 · US · B2

Patent metadata
Publication number: US-9691396-B2
Application number: US-201414470559-A
Country: US
Kind code: B2
Filing date: Aug 27, 2014
Priority date: Mar 1, 2012
Publication date: Jun 27, 2017
Grant date: Jun 27, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention discloses a speech/audio signal processing method and apparatus. In an embodiment, the speech/audio signal processing method includes: when a speech/audio signal switches bandwidth, obtaining an initial high frequency signal corresponding to a current frame of speech/audio signal; obtaining a time-domain global gain parameter of the initial high frequency signal; performing weighting processing on an energy ratio and the time-domain global gain parameter, and using an obtained weighted value as a predicted global gain parameter, where the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; correcting the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and synthesizing a current frame of narrow frequency time-domain signal and the corrected high frequency time-domain signal and outputting the synthesized signal.
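
The processing chain described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names, the weighting factor `alpha`, the convex-combination form of the weighting, the multiplicative correction, and the sample-wise addition used for synthesis are all assumptions made for the example. The abstract itself only states that the energy ratio and the time-domain global gain parameter are weighted, and that the weighted value is used to correct the initial high frequency signal before synthesis.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples of one frame."""
    return sum(x * x for x in frame)


def predicted_global_gain(
    hist_hf: Sequence[float],
    init_hf: Sequence[float],
    global_gain: float,
    alpha: float = 0.5,   # hypothetical weighting factor, not from the source
    eps: float = 1e-12,   # guards against division by zero on silent frames
) -> float:
    """Weight the energy ratio against the time-domain global gain.

    energy ratio = energy(historical HF frame) / energy(current initial HF frame),
    as defined in the abstract. The convex combination is an assumption.
    """
    energy_ratio = frame_energy(hist_hf) / (frame_energy(init_hf) + eps)
    return alpha * energy_ratio + (1.0 - alpha) * global_gain


def correct_high_band(init_hf: Sequence[float], gain: float) -> List[float]:
    """Assumed correction: scale every sample by the predicted global gain."""
    return [gain * x for x in init_hf]


def synthesize(narrow: Sequence[float], corrected_hf: Sequence[float]) -> List[float]:
    """Assumed synthesis: sample-wise sum of two frames at the same sampling rate."""
    return [n + h for n, h in zip(narrow, corrected_hf)]


if __name__ == "__main__":
    hist_hf = [2.0, 2.0]   # previous frame's high frequency time-domain signal
    init_hf = [1.0, 1.0]   # current frame's initial (predicted) high frequency signal
    narrow = [0.5, -0.5]   # current frame's narrow frequency time-domain signal
    gain = predicted_global_gain(hist_hf, init_hf, global_gain=1.0)
    print(synthesize(narrow, correct_high_band(init_hf, gain)))
```

Weighting against the previous frame's energy is what keeps the high band from jumping abruptly at the moment the bandwidth switches. A real decoder would typically combine the bands through a synthesis filter bank rather than plain addition.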

First claim

Opening claim text (preview).

What is claimed is:

1. A speech/audio signal processing method, comprising: when a speech/audio signal switches from a wide frequency signal to a narrow frequency signal, obtaining, by a decoder, an initial high frequency signal corresponding to the narrow frequency signal; obtaining, by the decoder, a time-domain global gain parameter of the initial high frequency signal according to a spectrum tilt parameter of the current frame of speech/audio signal and a correlation between a narrow frequency signal of the current frame and a narrow frequency signal of a historical frame; performing, by the decoder, weighting processing on an energy ratio and the time-domain global gain parameter, and using an obtained weighted value as a predicted global gain parameter, wherein the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; correcting, by the decoder, the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and synthesizing, by the decoder, a narrow frequency time-domain signal of the current frame and the corrected high frequency time-domain signal and outputting, by the decoder, the synthesized signal.

2. The method according to claim 1, wherein the obtaining a time-domain global gain parameter of the initial high frequency signal according to a spectrum tilt parameter of the current frame of speech/audio signal and a correlation between a narrow frequency signal of the current frame and a narrow frequency signal of a historical frame comprises: classifying the current frame of speech/audio signal as a first type of signal or a second type of signal according to the spectrum tilt parameter of the current frame of speech/audio signal and the correlation between the narrow frequency signal of the current frame and the narrow frequency signal of the historical frame; when the current frame of speech/audio signal is a first type of signal, limiting the spectrum tilt parameter to less than or equal to a first predetermined value, to obtain a limited spectrum tilt parameter value; when the current frame of speech/audio signal is a second type of signal, limiting the spectrum tilt parameter to a value in a first range, to obtain a limited spectrum tilt parameter value; and using the limited spectrum tilt parameter value as the time-domain global gain parameter of the initial high frequency signal.

3. The method according to claim 2, wherein the first predetermined value is 8 and the first range is [0.5, 1].

4. The method according to claim 1, further comprising: obtaining, by the decoder, a time-domain envelope parameter corresponding to the initial high frequency signal, wherein the correcting the initial high frequency signal by using the time-domain global gain parameter comprises: correcting the initial high frequency signal by using the time-domain envelope parameter and the time-domain global gain parameter.

5. A speech/audio signal processing apparatus, comprising: a processor; a predicting unit controlled by the processor, configured to: when a speech/audio signal switches from a wide frequency signal to a narrow frequency signal, obtain an initial high frequency signal corresponding to a current frame of speech/audio signal; a parameter obtaining unit controlled by the processor, configured to obtain a time-domain global gain parameter of the initial high frequency signal according to a spectrum tilt parameter of the current frame of speech/audio signal and a correlation between a narrow frequency signal of the current frame and a narrow frequency signal of a historical frame; a weighting processing unit controlled by the processor, configured to perform weighting processing on an energy ratio and the time-domain global gain parameter, and use an obtained weighted value as a predicted global gain parameter, wherein the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; a correcting unit controlled by the processor, configured to correct the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and a synthesizing unit controlled by the processor, configured to synthesize a narrow frequency time-domain signal of the current frame and the corrected high frequency time-domain signal and output the synthesized signal.

6. The apparatus according to claim 5, wherein the parameter obtaining unit is further configured to obtain a time-domain envelope parameter corresponding to the initial high frequency signal; and the correcting unit is configured to correct the initial high frequency signal by using the time-domain envelope parameter and the time-domain global gain parameter.

7. The apparatus according to claim 5, wherein the parameter obtaining unit comprises: a classifying unit, configured to classify the current frame of speech/audio signal as a first type of signal or a second type of signal according to the spectrum tilt parameter of the current frame of speech/audio signal and the correlation between the narrow frequency signal of the current frame and the narrow frequency signal of the historical frame; a first limiting unit, configured to: when the current frame of speech/audio signal is a first type of signal, limit the spectrum tilt parameter to less than or equal to a first predetermined value, to obtain a limited spectrum tilt parameter value, and use the limited spectrum tilt parameter value as the time-domain global gain parameter of the initial high frequency signal; and a second limiting unit, configured to: when the current frame of speech/audio signal is a second type of signal, limit the spectrum tilt parameter to a value in a first range, to obtain a limited spectrum tilt parameter value, and use the limited spectrum tilt parameter value as the time-domain global gain parameter of the initial high frequency signal.

8. The apparatus according to claim 7, wherein the first predetermined value is 8 and the first range is [0.5, 1].

9. A speech/audio signal processing apparatus, comprising: a processor; an acquiring unit controlled by the processor, configured to: when a speech/audio signal switches bandwidth, obtain an initial high frequency signal corresponding to a current frame of speech/audio signal; a parameter obtaining unit controlled by the processor, configured to obtain a time-domain global gain parameter corresponding to the initial high frequency signal; a weighting processing unit controlled by the processor, configured to perform weighting processing on an energy ratio and the time-domain global gain parameter, and use an obtained weighted value as a predicted global gain parameter, wherein the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; a correcting unit controlled by the processor, configured to correct the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and a synthesizing unit controlled by the processor, configured to synthesize a narrow frequency time-domain signal of the current frame and the corrected high frequency time-domain signal and output the synthesized signal.

10. The apparatus according to claim 9, wherein the bandwidth switching is switching from a narrow frequency signal to a wide frequency signal, and the apparatus further comprises: a weighting factor setting uni
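
The limiting step in claims 2 and 3 (mirrored in claims 7 and 8) amounts to a clamp on the spectrum tilt parameter. The sketch below is only an illustration under stated assumptions: the function name `limit_spectrum_tilt` and the boolean `is_first_type` are hypothetical, and the classification rule that decides between the first and second signal type is not spelled out in the claim text shown here, so it is left to the caller. The bound 8 and the range [0.5, 1] are the values given in claims 3 and 8.

```python
from typing import Tuple


def limit_spectrum_tilt(
    tilt: float,
    is_first_type: bool,                            # result of a classifier not shown here
    first_max: float = 8.0,                         # "first predetermined value" (claim 3)
    first_range: Tuple[float, float] = (0.5, 1.0),  # "first range" (claim 3)
) -> float:
    """Return the limited spectrum tilt, used as the time-domain global gain."""
    if is_first_type:
        # First type of signal: only an upper bound is applied.
        return min(tilt, first_max)
    # Second type of signal: clamp into the closed range.
    low, high = first_range
    return max(low, min(tilt, high))


if __name__ == "__main__":
    for tilt, first in [(10.0, True), (3.0, True), (0.2, False), (2.0, False)]:
        print(tilt, first, limit_spectrum_tilt(tilt, first))
```

Per claim 2, the limited value is then used directly as the time-domain global gain parameter of the initial high frequency signal.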

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • G10L19/083 · Primary

    the excitation function being an excitation gain (G10L25/90 takes precedence) · CPC title

  • using subband decomposition · CPC title

  • using band spreading techniques · CPC title

  • Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] · CPC title

  • Processing in the time domain · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9691396B2 cover?
The present invention discloses a speech/audio signal processing method and apparatus. In an embodiment, the speech/audio signal processing method includes: when a speech/audio signal switches bandwidth, obtaining an initial high frequency signal corresponding to a current frame of speech/audio signal; obtaining a time-domain global gain parameter of the initial high frequency signal; performin…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G10L19/083. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 27, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).