US9363596B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9363596-B2 |
| Application number | US-201313840667-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 15, 2013 |
| Priority date | Mar 15, 2013 |
| Publication date | Jun 7, 2016 |
| Grant date | Jun 7, 2016 |
Official abstract text for this publication.
A method of improving voice quality in a mobile device starts by receiving acoustic signals from microphones included in earbuds and the microphone array included on a headset wire. The headset may include the pair of earbuds and the headset wire. An output from an accelerometer that is included in the pair of earbuds is then received. The accelerometer may detect vibration of the user's vocal chords filtered by the vocal tract based on vibrations in bones and tissue of the user's head. A spectral mixer included in the mobile device may then perform spectral mixing of the scaled output from the accelerometer with the acoustic signals from the microphone array to generate a mixed signal. Performing spectral mixing includes scaling the output from the inertial sensor by a scaling factor based on a power ratio between the acoustic signals from the microphone array and the output from the inertial sensor. Other embodiments are also described.
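The scaling and mixing step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the clamp range (`lo`, `hi`), the filter coefficient, and the direction of the power ratio (microphone power over accelerometer power) are assumptions made for illustration, and a pair of complementary one-pole filters in the time domain stands in for whatever band-splitting the actual system uses.

```python
import math


def mean_power(frame):
    """Average power (mean of squared samples) of one frame."""
    return sum(x * x for x in frame) / len(frame)


def scaling_factor(mic_power, acc_power, lo=0.01, hi=100.0, eps=1e-12):
    """Amplitude scale for the accelerometer signal, from a power ratio.

    Assumption: the ratio is microphone power over accelerometer power,
    so the scaled accelerometer signal matches the microphone level.
    The ratio is limited to [lo, hi] and then square-rooted, because a
    power ratio corresponds to a squared amplitude ratio.
    """
    ratio = mic_power / max(acc_power, eps)
    ratio = min(max(ratio, lo), hi)
    return math.sqrt(ratio)


def one_pole_lowpass(signal, coeff):
    """Simple one-pole low-pass filter (illustrative stand-in)."""
    out, state = [], 0.0
    for x in signal:
        state = coeff * state + (1.0 - coeff) * x
        out.append(state)
    return out


def mix_bands(mic, acc, scale, coeff=0.5):
    """Low band from the scaled accelerometer, high band from the microphone."""
    acc_low = one_pole_lowpass(acc, coeff)
    mic_low = one_pole_lowpass(mic, coeff)
    # Complementary high-pass: the input minus its low-passed version.
    mic_high = [m - l for m, l in zip(mic, mic_low)]
    return [h + scale * a for h, a in zip(mic_high, acc_low)]


mic_frame = [2.0, -2.0, 2.0, -2.0]
acc_frame = [1.0, -1.0, 1.0, -1.0]
scale = scaling_factor(mean_power(mic_frame), mean_power(acc_frame))
mixed = mix_bands(mic_frame, acc_frame, scale)
```

Taking the square root converts the power ratio into an amplitude gain, which is why the accelerometer samples are multiplied by `scale` rather than by the ratio itself.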
Opening claim text (preview).
The invention claimed is:

1. A method of improving voice quality in a mobile device comprising: receiving acoustic signals from one or more microphones included with a pair of earbuds, wherein a headset includes the pair of earbuds and a headset wire; receiving an output from an inertial sensor that is included in the pair of earbuds; performing spectral mixing of the output from the inertial sensor with the acoustic signals from the one or more microphones to generate a mixed signal, wherein performing spectral mixing includes scaling the output from the inertial sensor by a scaling factor based on a power ratio between the acoustic signals from the one or more microphones and the output from the inertial sensor.

2. The method of claim 1, wherein the one or more microphones included with the pair of earbuds comprises: a front microphone and a rear microphone in each of the earbuds.

3. The method of claim 1, wherein the inertial sensor is an accelerometer that is included in each of the earbuds.

4. The method of claim 3, performing spectral mixing to generate the mixed signal further comprises: pre-emphasizing the output from the accelerometer to account for lip radiation characteristic to generate a pre-emphasized accelerometer signal.

5. The method of claim 4, performing spectral mixing to generate the mixed signal further comprises: receiving from a voice activity detector (VAD) a VAD output that is based on (i) the acoustic signals from the one or more microphones and (ii) the data output by the accelerometer; when the VAD output indicates that no voice activity is detected, computing an acoustic noise power signal and an accelerometer noise power signal, wherein the acoustic noise power signal is a noise power signal in the acoustic signal from the one or more microphones and the accelerometer noise power signal is a noise power signal in the pre-emphasized accelerometer signal; when an alternative non-stationary noise detector is employed it estimates the noise power in the acoustic signal and the accelerometer signal during intervals with either voice activity or no voice activity; when the VAD output indicates that voice activity is detected, computing an acoustic power signal and an accelerometer power signal, wherein the acoustic power signal is a power signal during speech in the acoustic signal from the one or more microphones and the accelerometer power signal is a power signal during speech in the pre-emphasized accelerometer signal; and generating (i) a final acoustic power signal by removing the acoustic noise power signal from the acoustic power signal and (ii) a final accelerometer power signal by removing the accelerometer noise power signal from the accelerometer power signal.

6. The method of claim 5, wherein performing spectral mixing to generate the mixed signal further comprises: applying limits to the noise powers subtracted by the noise subtraction module in order to generate a positive low-frequency final accelerometer power signal and a positive low-frequency final acoustic power signal; computing the power ratio between the low-frequency final accelerometer power signal and the low-frequency final acoustic power signal, wherein the low-frequency final accelerometer power signal and the low-frequency final acoustic power signal are within a same low frequency band; and computing the scaling factor by smoothing the power ratio, limiting it to an allowable range, and by extracting the square root from the smoothed and limited power ratio.

7. The method of claim 6, wherein performing spectral mixing to generate the mixed signal further comprises: applying a low-pass filter with a cutoff frequency (Fc) to the pre-emphasized accelerometer signal to generate a low-pass filtered pre-emphasized accelerometer signal; and scaling the low-pass filtered pre-emphasized accelerometer signal using the scaling factor to generate a final accelerometer signal during the time when voice activity is detected (VAD=1); and applying a certain fixed attenuation to the low-pass filtered pre-emphasized accelerometer signal when voice activity is not detected (VAD=0).

8. The method of claim 7, wherein performing spectral mixing to generate the mixed signal further comprises: applying a high-pass filter with the cutoff frequency (Fc) to the acoustic signals from the one or more microphones to generate a final acoustic signal from the one or more microphones; and mixing the scaled accelerometer signal with the final acoustic signal from the one or more microphones to generate the mixed signal.

9. The method of claim 8, further comprising: calculating a delay between the final acoustic signal and the scaled accelerometer signal based on cross-correlation; and applying the delay to the scaled accelerometer signal before mixing the scaled accelerometer signal with the final acoustic signal to generate the mixed signal.

10. The method of claim 9, further comprising: receiving by a switch (i) the mixed signal and (ii) a speech signal from a beamformer, wherein the acoustic signals from the one or more microphones are received by the beamformer; outputting by the switch the mixed signal when the acoustic noise power signal is greater than a noise threshold or when wind noise is detected by the one or more microphones; and outputting by the switch the speech signal from the beamformer when the acoustic noise power signal is lesser than or equal to the noise threshold and when wind noise is not detected by the one or more microphones.

11. The method of claim 10, further comprising: receiving by a noise suppressor (i) the output from the switch, (ii) the VAD output and (iii) a noise beam output from the beamformer; and suppressing by the noise suppressor noise included in the output from the switch based on the VAD output and using a noise estimate from the noise beam output.

12. The method of claim 11, further comprising: generating pitch estimate by a pitch detector based on autocorrelation method and using the output from the accelerometer, wherein the pitch estimate is obtained by (i) using an X, Y, or Z signal generated by the accelerometer that has a highest power level or (ii) using a combination of the X, Y, and Z signals generated by the accelerometer.

13. The method of claim 3, wherein receiving the output from the accelerometer further comprises: receiving an output signal for each of the three axes of the accelerometer, wherein the output signal for each of the three axes are X, Y, and Z signals generated by the accelerometer, respectively; determining a total power in each of the X, Y, and Z signals generated by the accelerometer, respectively; and selecting the X, Y, or Z signal having the highest power as the output from the accelerometer.

14. The method of claim 3, wherein receiving the output from the accelerometer further comprises: receiving an output signal for each of the three axes of the accelerometer, wherein the output signal for each of the three axes are X, Y, and Z signals generated by the accelerometer, respectively; and computing an average of the X, Y, and Z signals to generate the output from the accelerometer.

15. The method of claim 3, wherein receiving the output from the accelerometer further comprises: receiving an output signal for each of the three axes of the accelerometer, wherein the output signal for each of the three axes are X, Y, and Z signals generated by the accelerometer, respectively; computing using cross-correlation a delay between the X and Y signals, a delay between the X and Z signals, and a delay between the Y and
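Several steps recited in the claims above lend themselves to a short illustration: selecting or averaging accelerometer axes (claims 13 and 14), estimating a delay by cross-correlation (claim 9), and estimating pitch by autocorrelation (claim 12). The sketch below is a hedged example only. All function names, the brute-force lag search, and the pitch search range are assumptions, and the claims do not specify these details.

```python
import math


def total_power(signal):
    """Total power of a signal as the sum of squared samples."""
    return sum(x * x for x in signal)


def select_axis(x, y, z):
    """Pick the axis with the highest total power (as in claim 13)."""
    axes = {"X": x, "Y": y, "Z": z}
    name = max(axes, key=lambda k: total_power(axes[k]))
    return name, axes[name]


def average_axes(x, y, z):
    """Sample-wise average of the three axes (as in claim 14)."""
    return [(a + b + c) / 3.0 for a, b, c in zip(x, y, z)]


def estimate_delay(reference, delayed, max_lag):
    """Lag (in samples) that maximizes the cross-correlation.

    A positive result means `delayed` lags behind `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += r * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_pitch(signal, sample_rate, min_hz, max_hz):
    """Pitch in Hz from the strongest autocorrelation peak in a lag range."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Usage with synthetic data: a tone with an 8-sample period at 800 Hz sampling.
tone = [math.sin(2 * math.pi * n / 8) for n in range(64)]
axis_name, axis_signal = select_axis([0.0] * 64, tone, [0.1] * 64)
pitch_hz = estimate_pitch(axis_signal, sample_rate=800, min_hz=50, max_hz=200)
```

The lag search here is quadratic in the frame length, which keeps the example readable; a real-time system would more likely use an FFT-based correlation.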
characterised by the method used for estimating noise · CPC title
for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title
Pitch determination of speech signals · CPC title
Monophonic and stereophonic headphones with microphone for two-way hands free communication · CPC title
Special adaptations for use as contact microphones, e.g. on musical instrument, on stethoscope (throat mountings H04R1/14) · CPC title