Methods of Decoding Speech from the Brain and Systems for Practicing the Same
US-2015380009-A1 · Dec 31, 2015 · US
US9754603B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9754603-B2 |
| Application number | US-201213728287-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 27, 2012 |
| Priority date | Jan 10, 2012 |
| Publication date | Sep 5, 2017 |
| Grant date | Sep 5, 2017 |
Official abstract text for this publication.
According to one embodiment, a speech feature extraction apparatus includes an extraction unit and a calculation unit. The extraction unit extracts speech segments over a predetermined period at intervals of a unit time from either an input speech signal or a plurality of subband input speech signals obtained by extracting signal components of a plurality of frequency bands from the input speech signal, to generate either a unit speech signal or a plurality of subband unit speech signals. The calculation unit calculates either each average time of the unit speech signal in each of the plurality of frequency bands or each average time of each of the plurality of subband unit speech signals to obtain a speech feature.
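The two steps named in the abstract, cutting fixed-length segments at a regular interval and taking an energy-weighted average time for each segment, can be illustrated with a minimal sketch. The function names `extract_frames` and `average_time`, the use of sample indices as the time axis, and the choice to return 0.0 for a silent frame are all assumptions made for illustration; the abstract does not specify them. For the subband variant, the same `average_time` would be applied to each band-passed signal separately.

```python
def extract_frames(signal, frame_len, hop):
    """Cut frames of frame_len samples, one every hop samples.

    frame_len plays the role of the "predetermined period" and hop the
    "unit time"; both names are hypothetical.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    # An input shorter than one frame yields no frames.
    return [list(signal[s:s + frame_len]) for s in range(0, last_start + 1, hop)]


def average_time(frame):
    """Energy-weighted mean sample index (center of energy gravity) of a frame."""
    energy = [v * v for v in frame]
    total = sum(energy)
    if total == 0:
        return 0.0  # assumption: a silent frame is reported as time 0
    return sum(n * e for n, e in enumerate(energy)) / total


if __name__ == "__main__":
    samples = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0, 0.3, 0.8, 0.1, 0.0]
    for frame in extract_frames(samples, frame_len=4, hop=2):
        print(average_time(frame))
```

The result is expressed in samples from the start of each frame; dividing by the sampling rate would convert it to seconds.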
Opening claim text (preview).
What is claimed is:
1. A speech feature extraction apparatus, comprising: a computer programmed to comprise: an extraction unit configured to extract speech segments over a predetermined period at intervals of a unit time from an input speech signal to generate a unit speech signal; a first calculation unit configured to calculate each subband average time corresponding to time required to reach a center of energy gravity of the unit speech signal in each of a plurality of frequency bands obtained by dividing an overall frequency band into a number smaller than a bin number of frequency; a generation unit configured to generate a speech feature in each of the frequency bands based on the subband average time; and a decoder performing speech recognition processing to transform the input speech signal into words using the speech feature in the each of the frequency bands, wherein the speech feature is expressed in terms of time.
2. The apparatus according to claim 1, wherein the computer is further programmed to comprise: a second calculation unit configured to calculate a power spectrum of the unit speech signal, and wherein the extraction unit extracts speech segments over the predetermined period from the input speech signal at intervals of the unit time to generate the unit speech signal, and wherein the first calculation unit calculates the subband average time based on the power spectrum.
3. The apparatus according to claim 2, wherein the computer is further programmed to comprise: a third calculation unit configured to calculate a first product of a real part of a first spectrum of the unit speech signal and a real part of a second spectrum of a product of the unit speech signal and a time, to calculate a second product of an imaginary part of the first spectrum and an imaginary part of the second spectrum, and to add the first product and the second product together to obtain a third spectrum; and wherein the first calculation unit calculates the subband average time based on the power spectrum and the third spectrum.
4. The apparatus according to claim 3, wherein the computer is further programmed to comprise: a first application unit configured to apply a first filter bank to the power spectrum to obtain a filtered power spectrum; and a second application unit configured to apply a second filter bank to the third spectrum to obtain a filtered third spectrum, and wherein the first calculation unit calculates the subband average time based on the filtered power spectrum and the filtered third spectrum.
5. The apparatus according to claim 3, wherein the first calculation unit calculates the subband average time in a given frequency band of the frequency bands by dividing a summation of the third spectrum in the given frequency band by a summation of the power spectrum in the given frequency band.
6. The apparatus according to claim 2, wherein the computer is further programmed to comprise: a third calculation unit configured to calculate a group delay spectrum of the unit speech signal; and a multiplication unit configured to multiply the power spectrum by the group delay spectrum to obtain a multiplication spectrum, and wherein the first calculation unit calculates the subband average time based on the power spectrum and the multiplication spectrum.
7. The apparatus according to claim 6, wherein the computer is further programmed to comprise: a first application unit configured to apply a first filter bank to the power spectrum to obtain a filtered power spectrum; and a second application unit configured to apply a second filter bank to the multiplication spectrum to obtain a filtered multiplication spectrum, and wherein the first calculation unit calculates the subband average time based on the filtered power spectrum and the filtered multiplication spectrum.
8. The apparatus according to claim 2, wherein the computer is further programmed to comprise: an application unit configured to apply a filter bank to the power spectrum to obtain a filtered power spectrum, and wherein the first calculation unit calculates the subband average time based on the filtered power spectrum.
9. The apparatus according to claim 1, wherein the generation unit generates the speech feature by applying an axis transformation process on the subband average time.
10. A non-transitory computer readable storage medium storing instructions of a computer program which when executed by a computer results in performance of steps comprising: extracting speech segments over a predetermined period at intervals of a unit time from an input speech signal to generate a unit speech signal; calculating each subband average time corresponding to time required to reach a center of energy gravity of the unit speech signal in each of a plurality of frequency bands obtained by dividing an overall frequency band into a number smaller than a bin number of frequency; generating a speech feature in each of the frequency bands based on the subband average time; and transforming, by a decoder that performs speech recognition processing, the input speech signal into words using the speech feature in the each of the frequency bands, wherein the speech feature is expressed in terms of time.
11. A speech feature extraction method, comprising: controlling a computer to: extract speech segments over a predetermined period at intervals of a unit time from an input speech signal to generate a unit speech signal; calculate each subband average time corresponding to time required to reach a center of energy gravity of the unit speech signal in each of a plurality of frequency bands obtained by dividing an overall frequency band into a number smaller than a bin number of frequency; generate a speech feature in each of the frequency bands based on the subband average time; and transform, by a decoder that performs speech recognition processing, the input speech signal into words using the speech feature in the each of the frequency bands, wherein the speech feature is expressed in terms of time.
12. A speech feature extraction method, comprising: controlling a computer to: extract speech segments over a predetermined period at intervals of a unit time from the plurality of subband input speech signals obtained by extracting signal components of a plurality of frequency bands from the input speech signal to generate a plurality of subband unit speech signals; calculate each subband average time corresponding to a center of energy gravity of power of each of the plurality of subband unit speech signals within a predetermined interval; generate a speech feature in each of the frequency bands based on the subband average time; and transform, by a decoder that performs speech recognition processing, the input speech signal into words using the speech feature in the each of the frequency bands, wherein the speech feature is expressed in terms of time.
13. A non-transitory computer readable storage medium storing instructions of a computer program which when executed by a computer results in performance of steps comprising: extracting speech segments over a predetermined period at intervals of a unit time from a plurality of subband input speech signals obtained by extracting signal components of a plurality of frequency bands from an input speech signal to generate a plurality of subband unit speech signals, wherein the plurality of subband input speech signals is obtained from the input speech signal by a band-pass filter; calculating each subband average time corresponding to a center of energy gravity of power of each of the plurality of subband unit speech s
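Claims 2, 3, and 5 together describe a frequency-domain route to the subband average time: a power spectrum, a "third spectrum" built from the real and imaginary parts of two spectra, and a per-band ratio of their sums. The sketch below is one possible reading of that wording, not the patented implementation. It assumes a plain discrete Fourier transform, sample indices as the "time" that multiplies the signal, and bands given as half-open bin ranges; the names `dft`, `subband_average_times`, and `bands` are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform, adequate for short illustrative frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def subband_average_times(frame, bands):
    """Return one average time (in samples) per band.

    bands is a list of (lo, hi) bin ranges, hi exclusive (an assumed layout).
    """
    first = dft(frame)                                   # spectrum of the frame
    second = dft([t * v for t, v in enumerate(frame)])   # spectrum of time * frame
    power = [abs(a) ** 2 for a in first]
    # Third spectrum: product of real parts plus product of imaginary parts.
    third = [a.real * b.real + a.imag * b.imag for a, b in zip(first, second)]
    result = []
    for lo, hi in bands:
        denominator = sum(power[lo:hi])
        numerator = sum(third[lo:hi])
        # Assumption: a band with no energy is reported as time 0.
        result.append(numerator / denominator if denominator > 0 else 0.0)
    return result


if __name__ == "__main__":
    # A single impulse at sample 3: every band should report a time near 3.
    impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    print(subband_average_times(impulse, [(0, 2), (2, 5)]))
```

Summed over all bins, the ratio equals the energy-weighted mean sample index of the frame in the time domain (by Parseval's relation), which gives a simple sanity check on the band-wise values.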
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title
Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence) · CPC title
Feature extraction for speech recognition; Selection of recognition unit · CPC title