Enhanced de-esser for in-car communications systems
US-2024062770-A1 · Feb 22, 2024 · US
US2016372135A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016372135-A1 |
| Application number | US-201615181716-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 14, 2016 |
| Priority date | Jun 19, 2015 |
| Publication date | Dec 22, 2016 |
| Grant date | — |
Official abstract text for this publication.
An apparatus for processing a speech signal is provided. The apparatus includes a communicator comprising communication circuitry configured to transmit and receive data, an actuator comprising actuation circuitry configured to generate vibration and to output a signal, a formant enhancement filter configured to increase a formant of the speech signal, and a controller comprising processing circuitry configured to control the speech signal to be received through the communicator, to estimate at least one formant frequency from the speech signal based on linear predictive coding (LPC), to estimate a bandwidth of the at least one formant frequency, to determine whether the speech signal is a voiced sound or a voiceless sound, to configure the formant enhancement filter based on the at least one formant frequency, the bandwidth of the at least one formant frequency, characteristics of the determined voiced sound or voiceless sound, and signal delivery characteristics of a human body, to apply the formant enhancement filter to the speech signal, and to control the speech signal to which the formant enhancement filter is applied to be output using the actuator through the human body.
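The LPC-based formant estimation named in the abstract can be illustrated with a minimal sketch. It assumes the textbook autocorrelation method with the Levinson-Durbin recursion and picks envelope peaks where the slope changes from positive to negative. The function names, the LPC order, and the grid size are illustrative assumptions and are not taken from the patent; no windowing or pre-emphasis is applied.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_envelope(a, n_points, sample_rate):
    """Evaluate 1/|A(e^{jw})| on a uniform grid from 0 Hz up to Nyquist."""
    freqs, mags = [], []
    for k in range(n_points):
        f = k * (sample_rate / 2.0) / n_points
        w = 2.0 * math.pi * f / sample_rate
        denom = sum(c * cmath.exp(-1j * w * m) for m, c in enumerate(a))
        freqs.append(f)
        mags.append(1.0 / abs(denom))
    return freqs, mags


def formant_peaks(freqs, mags):
    """Frequencies where the envelope slope goes from positive to negative."""
    return [freqs[i] for i in range(1, len(mags) - 1)
            if mags[i] - mags[i - 1] > 0 and mags[i + 1] - mags[i] < 0]


def estimate_formants(frame, sample_rate, order=10, n_points=512):
    """Hypothetical helper: LPC analysis followed by envelope peak picking."""
    r = autocorrelation(frame, order)
    a, _ = levinson_durbin(r, order)
    freqs, mags = lpc_envelope(a, n_points, sample_rate)
    return formant_peaks(freqs, mags)
```

The frequency resolution of this sketch is limited by `n_points`; a real system would typically window the frame first and might derive formant bandwidths from the LPC polynomial roots, which this example does not attempt.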
Opening claim text (preview).
What is claimed is:

1. An apparatus for processing a speech signal, the apparatus comprising: a communicator comprising communication circuitry, the communicator configured to transmit and receive data; an actuator comprising actuation circuitry configured to generate vibration and to output a signal; a formant enhancement filter configured to increase a formant of the speech signal; and a controller comprising processing circuitry, the controller configured to control the speech signal to be received through the communicator, to estimate at least one formant frequency from the speech signal based on linear predictive coding (LPC), to estimate a bandwidth of the at least one formant frequency, to determine whether the speech signal is a voiced sound or a voiceless sound, to configure the formant enhancement filter based on the at least one formant frequency, the bandwidth of the at least one formant frequency, characteristics of the determined voiced sound or voiceless sound, and signal delivery characteristics of a human body, to apply the formant enhancement filter to the speech signal, and to control the speech signal to which the formant enhancement filter is applied to be output through the human body using the actuator.

2. The apparatus of claim 1, wherein when the controller is configured to estimate the at least one formant frequency, the controller is further configured to obtain LPC coefficients (LPCCs) through LPC analysis, to obtain an LPC envelope based on the LPCCs, and to estimate, as the at least one formant frequency, frequency at which a slope of the LPC envelope becomes a negative value from a positive value.

3. The apparatus of claim 2, wherein when the controller is configured to estimate the at least one formant frequency, the controller is further configured to estimate the bandwidth of the at least one formant frequency based on the LPCCs.

4. The apparatus of claim 2, wherein when the controller is configured to determine whether the speech signal is the voiced sound or the voiceless sound, the controller is further configured to determine whether the speech signal is the voiced sound or the voiceless sound based on the LPC envelope.

5. The apparatus of claim 1, wherein the formant enhancement filter is configured to be implemented with a window function that reinforces a gain of a formant frequency band.

6. The apparatus of claim 1, further comprising: a microphone configured to receive an audio signal; and a noise and echo removal filter configured to remove a noise component and an echo component from the received audio signal, wherein the controller is further configured to obtain echo power by estimating power with respect to the echo component based on a speech signal to which the formant enhancement filter is applied and which is input back through the microphone, to obtain noise signal power by estimating power of a background noise input through the microphone, to obtain combined power by combining the echo power with the noise signal power, to configure the noise and echo removal filter based on the combined power, to receive the audio signal comprising a user speech signal through the microphone, and to estimate the user speech signal included in the audio signal by applying the noise and echo removal filter to the received audio signal.

7. The apparatus of claim 6, wherein when the controller is configured to obtain the echo power, the controller is further configured to estimate a gain value filter based on the speech signal to which the formant enhancement filter is applied, to obtain a magnitude spectrum of the echo component using the estimated gain value filter, and to obtain current echo power by performing smoothing using the obtained magnitude spectrum and estimated echo power.

8. The apparatus of claim 6, wherein when the controller is configured to obtain the combined power by combining the echo power with the noise signal power, the controller is further configured to obtain current combined power by performing smoothing using the echo power, the noise signal power, and previously combined power.

9. The apparatus of claim 6, wherein when the controller configures the noise and echo removal filter based on the combined power, the controller is further configured to estimate a first priori signal-to-combined power ratio (SCR) and a posteriori SCR based on the combined power, to estimate a second priori SCR in a decision-direction way based on the combined power, the posteriori SCR, and power of a previous speech signal, and to configure the noise and echo removal filter based on the second priori SCR.

10. The apparatus of claim 9, wherein when the controller is configured to estimate the user speech signal included in the audio signal, the controller is further configured to compare the posteriori SCR with a threshold value for the posteriori SCR to determine an indicator function value, to estimate a prior probability of a current speech being absent based on the indicator function value and the prior probability of the previous speech signal being absent, to determine a likelihood ratio based on the first prior SCR, the posteriori SCR, and the second prior SCR, to determine a probability of a speech signal being present based on the prior probability of the current speech signal being absent and the likelihood ratio, and to estimate the user speech signal based on the noise and echo removal filter and the probability of the speech signal being present.

11. A method of processing a speech signal, the method comprising: receiving a speech signal; estimating at least one formant frequency from the speech signal based on linear predictive coding (LPC); estimating a bandwidth of the at least one formant frequency; determining whether the speech signal is a voiced sound or a voiceless sound; configuring a formant enhancement filter based on the at least one formant frequency, the bandwidth of the at least one formant frequency, characteristics of the determined voiced sound or voiceless sound, and signal delivery characteristics of a human body; applying the formant enhancement filter to the speech signal; and outputting the speech signal to which the formant enhancement filter is applied through the human body.

12. The method of claim 11, wherein the estimating of the at least one formant frequency comprises: obtaining LPC coefficients (LPCCs) through LPC analysis; obtaining an LPC envelope based on the LPCCs; and estimating, as the at least one formant frequency, frequency at which a slope of the LPC envelope becomes a negative value from a positive value.

13. The method of claim 12, wherein the estimating of the bandwidth of the at least one formant frequency comprises: estimating the bandwidth of the at least one formant frequency based on the LPCCs.

14. The method of claim 12, wherein the determining of whether the speech signal is the voiced sound or the voiceless sound comprises: determining based on the LPC envelope whether the speech signal is the voiced sound or the voiceless sound.

15. The method of claim 11, wherein the formant enhancement filter is implemented with a window function that reinforces a gain of a formant frequency band.

16. The method of claim 11, further comprising: obtaining echo power by estimating power with respect to the echo component based on a speech signal to which the formant enhancement filter is applied and which is input back through a microphone; obtaining noise signal power by estimating power of a background noise input through the micr
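
The claims describe smoothing a combined echo-plus-noise power and estimating an a priori signal-to-combined power ratio (SCR) in a decision-directed way, but this preview gives no formulas. The sketch below is therefore only an assumption: it uses first-order recursive smoothing and the textbook decision-directed estimator with a Wiener-style gain, applied per frequency bin. The names `combined_power` and `decision_directed_gain` and the constants `beta` and `alpha` are hypothetical.

```python
def combined_power(prev_combined, echo_power, noise_power, beta=0.9):
    """Recursively smooth (echo + noise) power per frequency bin.

    Assumed form: P[t] = beta * P[t-1] + (1 - beta) * (echo + noise).
    """
    return [beta * p + (1.0 - beta) * (e + n)
            for p, e, n in zip(prev_combined, echo_power, noise_power)]


def decision_directed_gain(obs_power, combined, prev_speech_power,
                           alpha=0.98, floor=1e-12):
    """Per-bin suppression gain from a decision-directed a priori SCR.

    obs_power         : power of the current microphone frame
    combined          : smoothed echo-plus-noise power
    prev_speech_power : speech power estimated for the previous frame
    """
    gains = []
    for y, c, s_prev in zip(obs_power, combined, prev_speech_power):
        c = max(c, floor)                      # avoid division by zero
        posteriori = y / c                     # a posteriori SCR
        instantaneous = max(posteriori - 1.0, 0.0)
        # Decision-directed mix of the previous estimate and the current frame.
        priori = alpha * s_prev / c + (1.0 - alpha) * instantaneous
        gains.append(priori / (1.0 + priori))  # Wiener-style gain (assumed)
    return gains
```

With these assumptions, a bin whose observed power equals the combined interference power and that had no speech in the previous frame receives a gain of zero, while a bin dominated by speech receives a gain close to one.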
Details of processing therefor · CPC title
Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients · CPC title
the noise being echo, reverberation of the speech · CPC title
for improving intelligibility · CPC title
the extracted parameters being formant information · CPC title