High-band signal generation
US-2016372125-A1 · Dec 22, 2016 · US
US9691396B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9691396-B2 |
| Application number | US-201414470559-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 27, 2014 |
| Priority date | Mar 1, 2012 |
| Publication date | Jun 27, 2017 |
| Grant date | Jun 27, 2017 |
Official abstract text for this publication.
The present invention discloses a speech/audio signal processing method and apparatus. In an embodiment, the speech/audio signal processing method includes: when a speech/audio signal switches bandwidth, obtaining an initial high frequency signal corresponding to a current frame of speech/audio signal; obtaining a time-domain global gain parameter of the initial high frequency signal; performing weighting processing on an energy ratio and the time-domain global gain parameter, and using an obtained weighted value as a predicted global gain parameter, where the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; correcting the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and synthesizing a current frame of narrow frequency time-domain signal and the corrected high frequency time-domain signal and outputting the synthesized signal.
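The processing chain in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation. The abstract does not give the weighting formula, so a convex combination with a hypothetical factor `alpha` is assumed. Applying the predicted gain as a plain per-sample multiplier and synthesizing by sample-wise addition are also assumptions, and all function names are illustrative.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples of one frame."""
    return sum(s * s for s in frame)


def predicted_global_gain(
    hist_hf: Sequence[float],
    init_hf: Sequence[float],
    global_gain: float,
    alpha: float = 0.5,   # hypothetical weighting factor, not from the source
    eps: float = 1e-12,   # guards against division by zero
) -> float:
    """Weight the energy ratio against the time-domain global gain.

    The energy ratio is (energy of the historical high-band frame) /
    (energy of the current initial high-band frame), as in the abstract.
    The convex combination below is an assumed form of the weighting.
    """
    ratio = frame_energy(hist_hf) / max(frame_energy(init_hf), eps)
    return alpha * ratio + (1.0 - alpha) * global_gain


def correct_high_band(init_hf: Sequence[float], gain: float) -> List[float]:
    """Assumed correction: scale every sample by the predicted gain."""
    return [gain * s for s in init_hf]


def synthesize(narrow: Sequence[float], high: Sequence[float]) -> List[float]:
    """Assumed synthesis: sample-wise sum of two equally long frames."""
    if len(narrow) != len(high):
        raise ValueError("frames must have the same length")
    return [a + b for a, b in zip(narrow, high)]


if __name__ == "__main__":
    hist_hf = [2.0, 2.0]    # previous frame, high band
    init_hf = [1.0, 1.0]    # current frame, initial (predicted) high band
    narrow = [0.5, -0.5]    # current frame, narrow band
    gain = predicted_global_gain(hist_hf, init_hf, global_gain=1.0)
    print(synthesize(narrow, correct_high_band(init_hf, gain)))
```

Tying the gain to the previous frame's high-band energy keeps the high-band level from jumping abruptly across a bandwidth switch. A real decoder would more likely combine the bands through a synthesis filter bank than by direct addition.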
Opening claim text (preview).
What is claimed is:
1. A speech/audio signal processing method, comprising: when a speech/audio signal switches from a wide frequency signal to a narrow frequency signal, obtaining, by a decoder, an initial high frequency signal corresponding to the narrow frequency signal; obtaining, by the decoder, a time-domain global gain parameter of the initial high frequency signal according to a spectrum tilt parameter of the current frame of speech/audio signal and a correlation between a narrow frequency signal of the current frame and a narrow frequency signal of a historical frame; performing, by the decoder, weighting processing on an energy ratio and the time-domain global gain parameter, and using an obtained weighted value as a predicted global gain parameter, wherein the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; correcting, by the decoder, the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and synthesizing, by the decoder, a narrow frequency time-domain signal of the current frame and the corrected high frequency time-domain signal and outputting, by the decoder, the synthesized signal.
2. The method according to claim 1, wherein the obtaining a time-domain global gain parameter of the initial high frequency signal according to a spectrum tilt parameter of the current frame of speech/audio signal and a correlation between a narrow frequency signal of the current frame and a narrow frequency signal of a historical frame comprises: classifying the current frame of speech/audio signal as a first type of signal or a second type of signal according to the spectrum tilt parameter of the current frame of speech/audio signal and the correlation between the narrow frequency signal of the current frame and the narrow frequency signal of the historical frame; when the current frame of speech/audio signal is a first type of signal, limiting the spectrum tilt parameter to less than or equal to a first predetermined value, to obtain a limited spectrum tilt parameter value; when the current frame of speech/audio signal is a second type of signal, limiting the spectrum tilt parameter to a value in a first range, to obtain a limited spectrum tilt parameter value; and using the limited spectrum tilt parameter value as the time-domain global gain parameter of the initial high frequency signal.
3. The method according to claim 2, wherein the first predetermined value is 8 and the first range is [0.5, 1].
4. The method according to claim 1, further comprising: obtaining, by the decoder, a time-domain envelope parameter corresponding to the initial high frequency signal, wherein the correcting the initial high frequency signal by using the time-domain global gain parameter comprises: correcting the initial high frequency signal by using the time-domain envelope parameter and the time-domain global gain parameter.
5. A speech/audio signal processing apparatus, comprising: a processor; a predicting unit controlled by the processor, configured to: when a speech/audio signal switches from a wide frequency signal to a narrow frequency signal, obtain an initial high frequency signal corresponding to a current frame of speech/audio signal; a parameter obtaining unit controlled by the processor, configured to obtain a time-domain global gain parameter of the initial high frequency signal according to a spectrum tilt parameter of the current frame of speech/audio signal and a correlation between a narrow frequency signal of the current frame and a narrow frequency signal of a historical frame; a weighting processing unit controlled by the processor, configured to perform weighting processing on an energy ratio and the time-domain global gain parameter, and use an obtained weighted value as a predicted global gain parameter, wherein the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; a correcting unit controlled by the processor, configured to correct the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and a synthesizing unit controlled by the processor, configured to synthesize a narrow frequency time-domain signal of the current frame and the corrected high frequency time-domain signal and output the synthesized signal.
6. The apparatus according to claim 5, wherein the parameter obtaining unit is further configured to obtain a time-domain envelope parameter corresponding to the initial high frequency signal; and the correcting unit is configured to correct the initial high frequency signal by using the time-domain envelope parameter and the time-domain global gain parameter.
7. The apparatus according to claim 5, wherein the parameter obtaining unit comprises: a classifying unit, configured to classify the current frame of speech/audio signal as a first type of signal or a second type of signal according to the spectrum tilt parameter of the current frame of speech/audio signal and the correlation between the narrow frequency signal of the current frame and the narrow frequency signal of the historical frame; a first limiting unit, configured to: when the current frame of speech/audio signal is a first type of signal, limit the spectrum tilt parameter to less than or equal to a first predetermined value, to obtain a limited spectrum tilt parameter value, and use the limited spectrum tilt parameter value as the time-domain global gain parameter of the initial high frequency signal; and a second limiting unit, configured to: when the current frame of speech/audio signal is a second type of signal, limit the spectrum tilt parameter to a value in a first range, to obtain a limited spectrum tilt parameter value, and use the limited spectrum tilt parameter value as the time-domain global gain parameter of the initial high frequency signal.
8. The apparatus according to claim 7, wherein the first predetermined value is 8 and the first range is [0.5, 1].
9. A speech/audio signal processing apparatus, comprising: a processor; an acquiring unit controlled by the processor, configured to: when a speech/audio signal switches bandwidth, obtain an initial high frequency signal corresponding to a current frame of speech/audio signal; a parameter obtaining unit controlled by the processor, configured to obtain a time-domain global gain parameter corresponding to the initial high frequency signal; a weighting processing unit controlled by the processor, configured to perform weighting processing on an energy ratio and the time-domain global gain parameter, and use an obtained weighted value as a predicted global gain parameter, wherein the energy ratio is a ratio between energy of a historical frame of high frequency time-domain signal and energy of a current frame of initial high frequency signal; a correcting unit controlled by the processor, configured to correct the initial high frequency signal by using the predicted global gain parameter, to obtain a corrected high frequency time-domain signal; and a synthesizing unit controlled by the processor, configured to synthesize a narrow frequency time-domain signal of the current frame and the corrected high frequency time-domain signal and output the synthesized signal.
10. The apparatus according to claim 9, wherein the bandwidth switching is switching from a narrow frequency signal to a wide frequency signal, and the apparatus further comprises: a weighting factor setting uni
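The limiting step recited in claims 2, 3, 7 and 8 can be illustrated with a short sketch. The bounds (8 and [0.5, 1]) come from claims 3 and 8. The function name and the `is_first_type` flag are illustrative assumptions. The classification rule itself is not spelled out in the claim text shown here, so the signal type is simply passed in.

```python
def limit_spectrum_tilt(
    tilt: float,
    is_first_type: bool,
    first_max: float = 8.0,   # "first predetermined value" per claims 3 and 8
    range_lo: float = 0.5,    # lower end of the "first range" per claims 3 and 8
    range_hi: float = 1.0,    # upper end of the "first range" per claims 3 and 8
) -> float:
    """Return the limited spectrum tilt, used as the time-domain global gain.

    First-type frames are only capped from above; second-type frames are
    clamped into [range_lo, range_hi]. How a frame is classified is left to
    the caller because the excerpt does not define the rule.
    """
    if is_first_type:
        return min(tilt, first_max)
    return min(max(tilt, range_lo), range_hi)


if __name__ == "__main__":
    print(limit_spectrum_tilt(10.0, is_first_type=True))
    print(limit_spectrum_tilt(0.2, is_first_type=False))
```

Under the claims, the returned value would then serve as the time-domain global gain parameter that enters the weighting step.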
the excitation function being an excitation gain (G10L25/90 takes precedence) · CPC title
using subband decomposition · CPC title
using band spreading techniques · CPC title
Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] · CPC title
Processing in the time domain · CPC title