Speech signal processing to improve naturalness

US9160843B2 · US · B2

Patent metadata
Field · Value
Publication number · US-9160843-B2
Application number · US-92484810-A
Country · US
Kind code · B2
Filing date · Oct 6, 2010
Priority date · Dec 8, 2009
Publication date · Oct 13, 2015
Grant date · Oct 13, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, terminal and program for processing a speech signal, in which the speech signal is received over a network from a transmitting device, wherein the frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered using a transmitter frequency response over the predetermined frequency range. The received speech signal is decoded. The decoded speech signal is filtered using a receiver frequency response which is complementary to the transmitter frequency response over the predetermined frequency range to thereby reduce distortion in the speech signal introduced over the predetermined frequency range by using said transmitter frequency response.
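As a rough illustration of the complementary filtering the abstract describes, the sketch below builds receiver gains that cancel a transmitter's gains inside a band and leave everything outside the band untouched. It is a minimal sketch, not the patented implementation: the function name `complementary_gains_db`, the 300 to 3400 Hz band (a typical narrowband telephony range, used here only as an assumed "predetermined frequency range"), and all gain values are hypothetical.

```python
# Minimal sketch of a complementary receiver response, working in dB per frequency point.
# Assumptions (not from the patent): the band limits, the sample frequencies and gains.


def complementary_gains_db(tx_gains_db, freqs_hz, band_hz=(300.0, 3400.0)):
    """Return receiver gains (dB) that cancel the transmitter gains inside band_hz.

    Outside the band the receiver gain is 0 dB, because the abstract only
    describes compensation over the predetermined frequency range.
    """
    low, high = band_hz
    return [-g if low <= f <= high else 0.0 for f, g in zip(freqs_hz, tx_gains_db)]


freqs_hz = [200.0, 500.0, 1000.0, 2000.0, 3000.0, 4000.0]
tx_gains_db = [-12.0, -6.0, 0.0, 3.0, 5.0, -20.0]  # made-up transmitter response

rx_gains_db = complementary_gains_db(tx_gains_db, freqs_hz)

# Gains in dB add when filters are cascaded, so the in-band result is flat (0 dB).
combined_db = [t + r for t, r in zip(tx_gains_db, rx_gains_db)]
```

In this toy example the cascade of transmitter and receiver responses is flat inside the assumed band, while the out-of-band attenuation at 200 Hz and 4000 Hz is left as it was. A real client would turn such a response into filter coefficients, which this sketch does not attempt.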

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of processing a speech signal, the method comprising: receiving the speech signal at a communication client running on a receiving terminal, the receiving terminal connected to a packet switched network, the speech signal originating at a telephone device coupled to a circuit switched network, the circuit switched network and the packet switched network coupled via a gateway, wherein frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered by the telephone device using a transmitter frequency response over the predetermined frequency range; the communication client decoding the received speech signal and making a determination regarding whether the telephone device is a terminal in the circuit switched network or the packet switched network; and responsive to determining that the telephone device is a terminal in the circuit switched network, filtering, by the communication client, the decoded speech signal using a receiver frequency response which is complementary to the transmitter frequency response over the predetermined frequency range to thereby reduce distortion in the speech signal by adapting the speech signal to a target power spectrum derived from long-term averaging of voiced speech signals.

2. The method of claim 1 wherein the receiver frequency response is an inverse of the transmitter frequency response.

3. The method of claim 1 wherein the step of filtering the decoded speech signal is performed with a filter, the coefficients of the filter being selected to provide the receiver frequency response.

4. The method of claim 1, further comprising dynamically adapting the receiver frequency response.

5. The method of claim 4 further comprising: storing a plurality of predetermined frequency responses associated with respective types of speech signal; and analysing the received speech signal to determine a type of the received speech signal, wherein the receiver frequency response is adapted to be the predetermined frequency response associated with the determined type of speech signal.

6. The method of claim 4, further comprising: analysing the filtered speech signal to determine a filtered power spectrum of voiced speech in the filtered speech signal; and determining the target power spectrum, wherein the receiver frequency response is adapted to reduce a difference between the filtered power spectrum and the target power spectrum.

7. The method of claim 6 wherein the step of determining the target power spectrum comprises taking a long term average of a voiced speech signal that has not been filtered using the transmitter frequency response.

8. The method of claim 4, further comprising: storing the adapted receiver frequency response with an identifier of the telephone device, wherein on initiation of a subsequent communication event with the telephone device the receiver frequency response is set to be the stored receiver frequency response.

9. The method of claim 1, wherein the receiver frequency response is static.

10. The method of claim 1 wherein the receiver frequency response is selected at design to be complementary to an average transmitting frequency response over the predetermined frequency range in accordance with a telephonic standard of the circuit switched network.

11. The method of claim 1 wherein the receiver frequency response is selected at design to be complementary to a worst case transmitting frequency response over the predetermined frequency range that reflects an expected amount of distortion when using a telephonic standard of the circuit switched network.

12. The method of claim 1, further comprising: not filtering the decoded speech signal if the telephone device is determined to be a terminal in the packet switched network.

13. The method of claim 1, further comprising performing a signal processing algorithm on the filtered speech signal.

14. The method of claim 13 wherein the signal processing algorithm is one of a speech recognition algorithm and an artificial bandwidth extension algorithm.

15. A terminal for processing a speech signal comprising: processing hardware; a voice over internet protocol (VOIP) client executable via the processing hardware to perform operations for receiving the speech signal from a telephone device, the telephone device connected to a circuit switched network, the terminal connected to a packet switched network, the circuit switched network and the packet switched network coupled via a gateway, wherein frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered by the telephone device using a transmitter frequency response over the predetermined frequency range; the VOIP client including a decoder for decoding the received speech signal and making a determination regarding whether the telephone device is a terminal in the circuit switched network or the packet switched network; and the VOIP client including a filter for filtering the decoded speech signal responsive to determining that the telephone device is a terminal in the circuit switched network, the filtering using a receiver frequency response which is complementary to the transmitter frequency response over the predetermined frequency range to thereby reduce distortion in the speech signal by adapting the speech signal to a target power spectrum derived from long-term averaging of voiced speech signals.

16. The terminal of claim 15 wherein coefficients of the filter are selected to provide the receiver frequency response.

17. The terminal of claim 15, further comprising means for performing one of a speech recognition algorithm and an artificial bandwidth extension algorithm on the filtered speech signal.

18. A computer terminal in a packet switched network, comprising: the computer terminal configured to receive a speech signal over the packet switched network, the speech signal originating at a telephone device coupled to a circuit switched network, the circuit switched network and the packet switched network coupled via a gateway, wherein frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered using a transmitter frequency response over the predetermined frequency range; a set of computer-readable instructions that, when executed by the computer terminal, cause the computer terminal to implement a voice over internet protocol (VOIP) client to perform operations including: decoding the received speech signal; making a determination regarding whether the telephone device is a terminal in the circuit switched network or the packet switched network; and responsive to determining that the telephone device is a terminal in the circuit switched network, filtering the decoded speech signal using a receiver frequency response which is complementary to the transmitter frequency response over the predetermined frequency range to thereby reduce distortion in the speech signal by adapting the speech signal to a target power spectrum derived from long-term averaging of voiced speech signals.

19. The method of claim 1 wherein the circuit switched network is the Public Switched Telephone Network (PSTN).
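The claims combine three ideas that may be easier to follow in code: filtering only when the caller sits on the circuit switched network (claims 1 and 12), adapting the receiver response so the filtered power spectrum approaches a target power spectrum (claim 6), and storing the adapted response against a device identifier for later calls (claim 8). The following is a toy sketch under stated assumptions, not the claimed implementation: every name (`stored_responses`, `initial_gains_db`, `adapt_gains_db`, `run_call`), the per-band dB representation, the step size, the device identifiers and all numeric values are hypothetical, and the simulation assumes the speaker's true long-term spectrum equals the target.

```python
# Hypothetical sketch; names, step size and spectra are assumptions, not from the patent.

stored_responses = {}  # device identifier -> adapted receiver gains in dB (claim 8)


def initial_gains_db(device_id, from_circuit_switched, default_gains_db):
    """Pick the starting receiver response, or None when no filtering applies."""
    if not from_circuit_switched:
        return None  # claim 12: callers on the packet switched network are not filtered
    return list(stored_responses.get(device_id, default_gains_db))


def adapt_gains_db(rx_gains_db, filtered_db, target_db, step=0.5):
    """One adaptation step that shrinks the gap between filtered and target spectra."""
    return [g - step * (p - t) for g, p, t in zip(rx_gains_db, filtered_db, target_db)]


def run_call(device_id, from_circuit_switched, tx_gains_db, target_db, steps=20):
    """Simulate a call whose source speech has exactly the target long-term spectrum."""
    gains = initial_gains_db(device_id, from_circuit_switched, [0.0] * len(target_db))
    if gains is None:
        return None
    for _ in range(steps):
        # In dB, filtered spectrum = source spectrum + transmitter gains + receiver gains.
        filtered_db = [s + t + g for s, t, g in zip(target_db, tx_gains_db, gains)]
        gains = adapt_gains_db(gains, filtered_db, target_db)
    stored_responses[device_id] = gains  # reused when the same device calls again
    return gains


target_db = [-20.0, -25.0, -30.0, -35.0]  # made-up long-term average of voiced speech
tx_gains_db = [-8.0, -2.0, 1.0, 4.0]      # made-up transmitter response, unknown to the client

learned = run_call("pstn-device-1", True, tx_gains_db, target_db)
skipped = run_call("voip-device-1", False, tx_gains_db, target_db)
```

In this simulation the learned gains converge toward the negative of the transmitter gains, which is the inverse response of claim 2, and the second call returns `None` because the caller is treated as a packet switched terminal. How a real client decides which network a caller is on, and how it estimates voiced-speech spectra, is outside the scope of this sketch.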

Assignees

Inventors

Classifications

  • Processing in the frequency domain · CPC title

  • H04M3/40 · Primary

    Applications of speech amplifiers · CPC title

  • for improving intelligibility · CPC title

  • adapted for voice communication over an Internet Protocol [IP] network (Voice over Internet Protocol (VoIP) network equipment and services H04M7/006; implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP] H04L69/16) · CPC title

  • comprising a residential gateway, e.g. those which provide an adapter for POTS or ISDN terminals · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9160843B2 cover?
A method, terminal and program for processing a speech signal, in which the speech signal is received over a network from a transmitting device, wherein the frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered using a transmitter frequency response over the predetermined frequency range. The received s…
Who is the assignee on this patent?
Nilsson Mattias, Strommer Stefan, Andersen Soren Vang, and 1 more
What technology area does this patent fall under?
Primary CPC classification H04M3/40. Mapped technology areas include Electricity.
When was this patent published?
Publication date Oct 13, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).