US9368103B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9368103-B2 |
| Application number | US-201314418680-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 30, 2013 |
| Priority date | Aug 1, 2012 |
| Publication date | Jun 14, 2016 |
| Grant date | Jun 14, 2016 |
Official abstract text for this publication.
For high-accuracy analysis and high-quality synthesis of voice sound (singing and speech), provided herein are a system and a method for estimating spectral envelopes and group delays from an audio signal, for sound analysis and synthesis, with high accuracy and high temporal resolution. An estimation system of spectral envelopes and group delays includes a fundamental frequency estimation section, an amplitude spectrum acquisition section, a group delay extraction section, a spectral envelope integration section, and a group delay integration section. The spectral envelope integration section sequentially obtains a spectral envelope for sound synthesis by averaging overlapped spectra. The group delay integration section selects from a plurality of group delays a group delay corresponding to the maximum envelope of each frequency component of the spectral envelope and integrates the group delays thus selected to sequentially obtain a group delay for sound synthesis.
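The two integration steps named in the abstract can be sketched in a few lines. This is a minimal Python sketch under stated assumptions, not the patented implementation: it assumes the overlapped amplitude spectra and the per-frame group delays for one analysis period are already available as frames × bins lists, it works on linear amplitudes, and it uses the mean of the maximum and minimum envelopes (the variant recited in claim 3) as the average. The function and argument names are hypothetical.

```python
def integrate_envelope_and_group_delay(amp_frames, gd_frames):
    """Combine overlapped frames into one envelope and one group delay.

    amp_frames, gd_frames: lists of equal-length lists (frames x bins).
    Both names are illustrative; they do not come from the patent text.
    """
    n_bins = len(amp_frames[0])
    envelope = []
    group_delay = []
    for k in range(n_bins):
        # All amplitudes observed at frequency bin k across the frames.
        column = [frame[k] for frame in amp_frames]
        max_amp = max(column)  # maximum envelope at bin k
        min_amp = min(column)  # minimum envelope at bin k
        # Assumed averaging rule: mean of maximum and minimum envelopes.
        envelope.append(0.5 * (max_amp + min_amp))
        # Keep the group delay of the frame that produced the maximum
        # (the first such frame if several tie).
        best_frame = column.index(max_amp)
        group_delay.append(gd_frames[best_frame][k])
    return envelope, group_delay


amp = [[1.0, 4.0, 2.0],
       [3.0, 2.0, 2.0],
       [2.0, 6.0, 1.0]]
gd = [[0.1, 0.2, 0.3],
      [0.4, 0.5, 0.6],
      [0.7, 0.8, 0.9]]
env, sel = integrate_envelope_and_group_delay(amp, gd)
print(env)  # [2.0, 4.0, 1.5]
print(sel)  # [0.4, 0.8, 0.3]
```

Selecting the group delay per bin, rather than averaging it, means each frequency component takes its phase information from the frame where that component was strongest.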
Opening claim text (preview).
The invention claimed is:

1. An estimation system of spectral envelopes and group delays for sound analysis and synthesis comprising at least one processor operable to function as: a fundamental frequency estimation section configured to estimate F0s from an audio signal at all points of time or at all points of sampling; an amplitude spectrum acquisition section configured to divide the audio signal into a plurality of frames, centering on each point of time or each point of sampling, by using a window having a window length changing with F0 at each point of time or each point of sampling, to perform Discrete Fourier Transform (DFT) analysis on the plurality of frames of the audio signal, and thus to acquire amplitude spectra at the respective frames; a group delay extraction section configured to extract group delays as phase frequency differentials at the respective frames by performing a group delay extraction algorithm accompanied by DFT analysis on the plurality of frames of the audio signal; a spectral envelope integration section configured to obtain overlapped spectra at a predetermined time interval by overlapping the amplitude spectra corresponding to the frames included in a certain period determined based on a fundamental period of F0, and to average the overlapped spectra to sequentially obtain a spectral envelope for sound synthesis; and a group delay integration section configured to select a group delay corresponding to a maximum envelope for each frequency component of the spectral envelope from the group delays at a predetermined time interval, and to integrate the thus selected group delays to sequentially obtain a group delay for sound synthesis.

2. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 1, wherein: the fundamental frequency estimation section is configured to identify voiced segments and unvoiced segments in addition to the estimation of F0s and to interpolate the unvoiced segments with F0 values of the voiced segments or allocate predetermined values to the unvoiced segments as F0.

3. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 1, wherein: the spectral envelope integration section is configured to obtain the spectral envelope for sound synthesis by calculating a mean value of the maximum envelope and a minimum envelope of the overlapped spectra.

4. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 3, wherein: the spectral envelope integration section is configured to obtain the spectral envelope for sound synthesis by using, as the mean value, a median value of the maximum envelope and the minimum envelope of the overlapped spectra.

5. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 4, wherein: the maximum envelope is transformed to fill in valleys of the minimum envelope and a transformed minimum envelope thus obtained is used as the minimum envelope in calculating the mean value.

6. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 3, wherein: the maximum envelope is transformed to fill in valleys of the minimum envelope and a transformed minimum envelope thus obtained is used as the minimum envelope in calculating the mean value.

7. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 3, wherein: the spectral envelope integration section is configured to obtain the spectral envelope for sound synthesis by replacing amplitude values of the spectral envelope of frequency bins under F0 with an amplitude value of the spectral envelope at F0.

8. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 7, further comprising: a two-dimensional low-pass filter operable to filter the replaced spectral envelope.

9. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 1, wherein: the group delay integration section is configured to store, by frequency, the group delays in the frames corresponding to the maximum envelopes for respective frequency components of the overlapped spectra, to compensate a time-shift of analysis of the stored group delays, and to normalize the stored group delays for use in sound synthesis.

10. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 9, wherein: the group delay integration section is configured to obtain the group delay for sound synthesis by replacing values of group delay of frequency bins under F0 with a value of the group delay at F0.

11. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 10, wherein: the group delay integration section is configured to smooth the replaced group delays for use in sound synthesis.

12. The estimation system of spectral envelopes and group delays for sound analysis and synthesis according to claim 11, wherein: in smoothing the replaced group delays for use in sound synthesis, the replaced group delays are converted with sin function and cos function to remove discontinuity due to the fundamental period, the converted group delays are subsequently filtered with a two-dimensional low-pass filter, and then the filtered group delays are converted to an original state with tan⁻¹ function for use in sound synthesis.

13. An audio signal synthesis system using the spectral envelopes and group delays for sound analysis and synthesis estimated by the estimation system according to claim 1, the audio signal synthesis system comprising at least one processor operable to function as: a reading section configured to read out, in a fundamental period for sound synthesis, the spectral envelopes and group delays for sound synthesis from a data file of the spectral envelopes and group delays for sound synthesis estimated by the estimation system, wherein the fundamental period for sound synthesis is a reciprocal of the fundamental frequency for sound synthesis; a conversion section configured to convert the read-out group delays into phase spectra; a unit waveform generation section configured to generate unit waveforms based on the read-out spectral envelopes and the phase spectra; and a synthesis section configured to output a synthesized audio signal obtained by performing overlap-add calculation on the generated unit waveforms in the fundamental period for sound synthesis.

14. The audio signal synthesis system according to claim 13, further comprising: a discontinuity suppression section configured to suppress an occurrence of discontinuity of the read-out group delays along a time axis in a low frequency range before the conversion section converts the read-out group delays.

15. The audio signal synthesis system according to claim 14, wherein: the discontinuity suppression section is configured to smooth group delays in the low frequency range after adding an optimal offset to the group delay for each voiced segment.

16. The audio signal synthesis system according to claim 15, further comprising: a compensation section configured to multiply the respective group delays by the fundamental period for sound synthesis as a multiplier coefficient after the conversion section converts the group delays or before the discontinuity suppression s
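The smoothing recited in claim 12 (convert with sin and cos, low-pass filter in two dimensions, convert back with tan⁻¹) can be illustrated with a short sketch. The following Python example is an assumption-laden illustration, not the claimed implementation: it assumes the group delays are normalized so that one fundamental period maps to the interval from 0 to 1, it substitutes a simple moving-average box filter for the unspecified two-dimensional low-pass filter, and it uses `math.atan2` as the inverse tangent. The function name and the `radius` parameter are hypothetical.

```python
import math


def smooth_wrapped_group_delay(gd, radius=1):
    """Smooth period-normalized group delays without wrap-around jumps.

    gd: frames x bins list; each value is a fraction of one fundamental
    period (assumed normalization). radius: half-width of the box filter.
    """
    n_t, n_f = len(gd), len(gd[0])
    # Map each value onto the unit circle so 0.98 and 0.02 are neighbours.
    sin_part = [[math.sin(2 * math.pi * v) for v in row] for row in gd]
    cos_part = [[math.cos(2 * math.pi * v) for v in row] for row in gd]

    def box_filter(x):
        # Stand-in 2-D low-pass filter: mean over a (time, frequency) window
        # that is clipped at the edges of the grid.
        out = [[0.0] * n_f for _ in range(n_t)]
        for t in range(n_t):
            for f in range(n_f):
                vals = [x[i][j]
                        for i in range(max(0, t - radius), min(n_t, t + radius + 1))
                        for j in range(max(0, f - radius), min(n_f, f + radius + 1))]
                out[t][f] = sum(vals) / len(vals)
        return out

    sin_f = box_filter(sin_part)
    cos_f = box_filter(cos_part)
    # The inverse tangent recovers the angle; the modulo wraps it back
    # onto a single period.
    return [[(math.atan2(sin_f[t][f], cos_f[t][f]) / (2 * math.pi)) % 1.0
             for f in range(n_f)]
            for t in range(n_t)]


smoothed = smooth_wrapped_group_delay([[0.98, 0.02], [0.02, 0.98]])
```

Averaging 0.98 and 0.02 directly would give 0.5, half a period away from either input. Filtering the sine and cosine components instead yields a value at the period boundary, which is why the claim routes the smoothing through the two trigonometric conversions.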
the extracted parameters being formant information · CPC title
Methods for producing synthetic speech; Speech synthesisers · CPC title
Pitch tracking · CPC title
Adapting to target pitch · CPC title
the extracted parameters being spectral information of each sub-band · CPC title