Method and apparatus for processing speech signal

US2016372135A1 · US · A1

Patent metadata
Field                Value
Publication number   US-2016372135-A1
Application number   US-201615181716-A
Country              US
Kind code            A1
Filing date          Jun 14, 2016
Priority date        Jun 19, 2015
Publication date     Dec 22, 2016
Grant date           not listed

Abstract

Official abstract text for this publication.

An apparatus for processing a speech signal is provided. The apparatus includes a communicator comprising communication circuitry configured to transmit and receive data, an actuator comprising actuation circuitry configured to generate vibration and to output a signal, a formant enhancement filter configured to increase a formant of the speech signal, and a controller comprising processing circuitry configured to control the speech signal to be received through the communicator, to estimate at least one formant frequency from the speech signal based on linear predictive coding (LPC), to estimate a bandwidth of the at least one formant frequency, to determine whether the speech signal is a voiced sound or a voiceless sound, to configure the formant enhancement filter based on the at least one formant frequency, the bandwidth of the at least one formant frequency, characteristics of the determined voiced sound or voiceless sound, and signal delivery characteristics of a human body, to apply the formant enhancement filter to the speech signal, and to control the speech signal to which the formant enhancement filter is applied to be output using the actuator through the human body.
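The formant estimation step can be sketched as follows. Claim 2 describes picking, as a formant frequency, the point where the slope of the LPC envelope changes from positive to negative. This is a minimal sketch under stated assumptions, not the patented implementation: the autocorrelation method with a Levinson-Durbin recursion is one common way to obtain LPC coefficients, the -3 dB width of the envelope peak stands in for the bandwidth estimate (the page does not give the bandwidth formula), and every function name, the bin count, and the synthetic test signal are hypothetical.

```python
import cmath
import math


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """LPC coefficients a, with A(z) = 1 + sum_k a[k] z^-k.

    Assumes a non-silent frame (r[0] > 0).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_envelope(a, n_bins):
    """|1 / A(e^{jw})| on n_bins points covering 0 <= w < pi."""
    env = []
    for b in range(n_bins):
        w = math.pi * b / n_bins
        A = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        env.append(1.0 / abs(A))
    return env


def formant_bins(env):
    """Bins where the envelope slope goes from positive to negative."""
    return [b for b in range(1, len(env) - 1)
            if env[b] - env[b - 1] > 0 and env[b + 1] - env[b] < 0]


def bandwidth_3db(env, peak_bin, fs):
    """Assumed stand-in: width in Hz where env stays above peak / sqrt(2)."""
    thresh = env[peak_bin] / math.sqrt(2.0)
    lo = hi = peak_bin
    while lo > 0 and env[lo - 1] >= thresh:
        lo -= 1
    while hi < len(env) - 1 and env[hi + 1] >= thresh:
        hi += 1
    return 0.5 * fs * (hi - lo) / len(env)


def synth_resonator(f0, r, fs, n):
    """Hypothetical test signal: impulse response of a two-pole resonator."""
    theta = 2.0 * math.pi * f0 / fs
    x = [0.0] * n
    for i in range(n):
        e = 1.0 if i == 0 else 0.0
        x1 = x[i - 1] if i >= 1 else 0.0
        x2 = x[i - 2] if i >= 2 else 0.0
        x[i] = e + 2.0 * r * math.cos(theta) * x1 - r * r * x2
    return x


if __name__ == "__main__":
    fs = 8000
    frame = synth_resonator(1000.0, 0.95, fs, 400)
    a = levinson_durbin(autocorr(frame, 2), 2)
    env = lpc_envelope(a, 512)
    for b in formant_bins(env):
        print(0.5 * fs * b / len(env), bandwidth_3db(env, b, fs))
```

Real speech would typically use a windowed frame and a higher LPC order than the order-2 toy case here; those choices are left open because the page does not state them.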

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus for processing a speech signal, the apparatus comprising: a communicator comprising communication circuitry, the communicator configured to transmit and receive data; an actuator comprising actuation circuitry configured to generate vibration and to output a signal; a formant enhancement filter configured to increase a formant of the speech signal; and a controller comprising processing circuitry, the controller configured to control the speech signal to be received through the communicator, to estimate at least one formant frequency from the speech signal based on linear predictive coding (LPC), to estimate a bandwidth of the at least one formant frequency, to determine whether the speech signal is a voiced sound or a voiceless sound, to configure the formant enhancement filter based on the at least one formant frequency, the bandwidth of the at least one formant frequency, characteristics of the determined voiced sound or voiceless sound, and signal delivery characteristics of a human body, to apply the formant enhancement filter to the speech signal, and to control the speech signal to which the formant enhancement filter is applied to be output through the human body using the actuator.

2. The apparatus of claim 1, wherein when the controller is configured to estimate the at least one formant frequency, the controller is further configured to obtain LPC coefficients (LPCCs) through LPC analysis, to obtain an LPC envelope based on the LPCCs, and to estimate, as the at least one formant frequency, frequency at which a slope of the LPC envelope becomes a negative value from a positive value.

3. The apparatus of claim 2, wherein when the controller is configured to estimate the at least one formant frequency, the controller is further configured to estimate the bandwidth of the at least one formant frequency based on the LPCCs.

4. The apparatus of claim 2, wherein when the controller is configured to determine whether the speech signal is the voiced sound or the voiceless sound, the controller is further configured to determine whether the speech signal is the voiced sound or the voiceless sound based on the LPC envelope.

5. The apparatus of claim 1, wherein the formant enhancement filter is configured to be implemented with a window function that reinforces a gain of a formant frequency band.

6. The apparatus of claim 1, further comprising: a microphone configured to receive an audio signal; and a noise and echo removal filter configured to remove a noise component and an echo component from the received audio signal, wherein the controller is further configured to obtain echo power by estimating power with respect to the echo component based on a speech signal to which the formant enhancement filter is applied and which is input back through the microphone, to obtain noise signal power by estimating power of a background noise input through the microphone, to obtain combined power by combining the echo power with the noise signal power, to configure the noise and echo removal filter based on the combined power, to receive the audio signal comprising a user speech signal through the microphone, and to estimate the user speech signal included in the audio signal by applying the noise and echo removal filter to the received audio signal.

7. The apparatus of claim 6, wherein when the controller is configured to obtain the echo power, the controller is further configured to estimate a gain value filter based on the speech signal to which the formant enhancement filter is applied, to obtain a magnitude spectrum of the echo component using the estimated gain value filter, and to obtain current echo power by performing smoothing using the obtained magnitude spectrum and estimated echo power.

8. The apparatus of claim 6, wherein when the controller is configured to obtain the combined power by combining the echo power with the noise signal power, the controller is further configured to obtain current combined power by performing smoothing using the echo power, the noise signal power, and previously combined power.

9. The apparatus of claim 6, wherein when the controller configures the noise and echo removal filter based on the combined power, the controller is further configured to estimate a first priori signal-to-combined power ratio (SCR) and a posteriori SCR based on the combined power, to estimate a second priori SCR in a decision-direction way based on the combined power, the posteriori SCR, and power of a previous speech signal, and to configure the noise and echo removal filter based on the second priori SCR.

10. The apparatus of claim 9, wherein when the controller is configured to estimate the user speech signal included in the audio signal, the controller is further configured to compare the posteriori SCR with a threshold value for the posteriori SCR to determine an indicator function value, to estimate a prior probability of a current speech being absent based on the indicator function value and the prior probability of the previous speech signal being absent, to determine a likelihood ratio based on the first prior SCR, the posteriori SCR, and the second prior SCR, to determine a probability of a speech signal being present based on the prior probability of the current speech signal being absent and the likelihood ratio, and to estimate the user speech signal based on the noise and echo removal filter and the probability of the speech signal being present.

11. A method of processing a speech signal, the method comprising: receiving a speech signal; estimating at least one formant frequency from the speech signal based on linear predictive coding (LPC); estimating a bandwidth of the at least one formant frequency; determining whether the speech signal is a voiced sound or a voiceless sound; configuring a formant enhancement filter based on the at least one formant frequency, the bandwidth of the at least one formant frequency, characteristics of the determined voiced sound or voiceless sound, and signal delivery characteristics of a human body; applying the formant enhancement filter to the speech signal; and outputting the speech signal to which the formant enhancement filter is applied through the human body.

12. The method of claim 11, wherein the estimating of the at least one formant frequency comprises: obtaining LPC coefficients (LPCCs) through LPC analysis; obtaining an LPC envelope based on the LPCCs; and estimating, as the at least one formant frequency, frequency at which a slope of the LPC envelope becomes a negative value from a positive value.

13. The method of claim 12, wherein the estimating of the bandwidth of the at least one formant frequency comprises: estimating the bandwidth of the at least one formant frequency based on the LPCCs.

14. The method of claim 12, wherein the determining of whether the speech signal is the voiced sound or the voiceless sound comprises: determining based on the LPC envelope whether the speech signal is the voiced sound or the voiceless sound.

15. The method of claim 11, wherein the formant enhancement filter is implemented with a window function that reinforces a gain of a formant frequency band.

16. The method of claim 11, further comprising: obtaining echo power by estimating power with respect to the echo component based on a speech signal to which the formant enhancement filter is applied and which is input back through a microphone; obtaining noise signal power by estimating power of a background noise input through the micr
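The power combination and decision-directed steps in claims 8 and 9 can be illustrated per frequency bin. This is a hedged sketch only: the claim text on this page gives no formulas, so the recursions below follow the commonly used first-order smoothing and decision-directed forms, the Wiener-style gain is a swapped-in example of a filter configured from the second priori SCR, and the names and the constants `alpha`, `beta`, and `floor` are assumptions.

```python
def smooth_combined_power(echo_pow, noise_pow, prev_combined, alpha=0.9):
    """Current combined power from echo power, noise power, and the
    previously combined power (first-order recursive smoothing)."""
    return alpha * prev_combined + (1.0 - alpha) * (echo_pow + noise_pow)


def posteriori_scr(mic_pow, combined_pow, floor=1e-12):
    """A posteriori signal-to-combined power ratio for one bin."""
    return mic_pow / max(combined_pow, floor)


def decision_directed_priori_scr(prev_speech_pow, combined_pow, post_scr,
                                 beta=0.98, floor=1e-12):
    """Decision-directed a priori SCR: blends the previous frame's speech
    power estimate with the instantaneous estimate max(post_scr - 1, 0)."""
    return (beta * prev_speech_pow / max(combined_pow, floor)
            + (1.0 - beta) * max(post_scr - 1.0, 0.0))


def wiener_gain(priori_scr):
    """Assumed example gain rule; the page does not name the actual rule."""
    return priori_scr / (1.0 + priori_scr)


if __name__ == "__main__":
    combined = 1.0       # previously combined power for this bin
    speech_prev = 0.0    # previous frame's estimated speech power
    # Hypothetical per-frame (mic power, echo power, noise power) values.
    for mic, echo, noise in [(1.2, 0.5, 0.5), (6.0, 0.5, 0.5), (5.0, 0.4, 0.6)]:
        combined = smooth_combined_power(echo, noise, combined)
        post = posteriori_scr(mic, combined)
        prio = decision_directed_priori_scr(speech_prev, combined, post)
        gain = wiener_gain(prio)
        speech_prev = (gain ** 2) * mic
        print(round(gain, 3))
```

The speech presence probability weighting of claim 10 is not sketched here because the threshold and likelihood-ratio definitions are not given on this page.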

Assignees

Samsung Electronics Co Ltd, Industry-Univ Coop Found Hanyang Univ

Classifications

  • Details of processing therefor

  • Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

  • the noise being echo, reverberation of the speech

  • for improving intelligibility

  • the extracted parameters being formant information

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016372135A1 cover?
An apparatus for processing a speech signal is provided. The apparatus includes a communicator comprising communication circuitry configured to transmit and receive data, an actuator comprising actuation circuitry configured to generate vibration and to output a signal, a formant enhancement filter configured to increase a formant of the speech signal, and a controller comprising processing cir…
Who is the assignee on this patent?
Samsung Electronics Co Ltd, Industry-Univ Coop Found Hanyang Univ
What technology area does this patent fall under?
Primary CPC classification G10L21/0364. Mapped technology areas include Physics.
When was this patent published?
Published Dec 22, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).