Combined voice recognition, hands-free telephony and in-car communication

US9978389B2 · US · B2

Patent metadata
Field · Value
Publication number · US-9978389-B2
Application number · US-201715442843-A
Country · US
Kind code · B2
Filing date · Feb 27, 2017
Priority date · May 16, 2012
Publication date · May 22, 2018
Grant date · May 22, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A multi-mode speech communication system is described that has different operating modes for different speech applications. A signal processing module is in communication with the speech applications and includes an input processing module and an output processing module. The input processing module processes microphone input signals to produce a set of user input signals for each speech application that are limited to currently active system users for that speech application. The output processing module processes application output communications from the speech applications to produce loudspeaker output signals to the system users, wherein for each different speech application, the loudspeaker output signals are directed only to system users currently active in that speech application.
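
The routing idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it only gates channels by seat and performs no speech enhancement, and the names `route_inputs`, `route_outputs`, the seat labels and the application labels are all hypothetical. It assumes one microphone and one loudspeaker per seat and a frame of samples per channel.

```python
from typing import Dict, List, Set

Signal = List[float]  # one frame of audio samples (assumed representation)


def route_inputs(
    mic_frames: Dict[str, Signal],
    active: Dict[str, Set[str]],
) -> Dict[str, Dict[str, Signal]]:
    """For each application, keep only microphone frames from its active seats."""
    return {
        app: {seat: mic_frames[seat] for seat in sorted(seats) if seat in mic_frames}
        for app, seats in active.items()
    }


def route_outputs(
    app_frames: Dict[str, Signal],
    active: Dict[str, Set[str]],
    seats: List[str],
) -> Dict[str, Signal]:
    """Add each application's output frame to the loudspeakers of its active seats only."""
    frame_len = max((len(frame) for frame in app_frames.values()), default=0)
    out = {seat: [0.0] * frame_len for seat in seats}
    for app, frame in app_frames.items():
        for seat in active.get(app, set()):
            if seat in out:
                for i, sample in enumerate(frame):
                    out[seat][i] += sample
    return out


if __name__ == "__main__":
    # Hypothetical cabin: the driver uses speech recognition while the driver
    # and a rear passenger share a hands-free call.
    active_users = {"asr": {"driver"}, "phone": {"driver", "rear_left"}}
    mics = {"driver": [1.0, 2.0], "rear_left": [3.0, 4.0]}
    print(route_inputs(mics, active_users))
    print(route_outputs({"asr": [0.5, 0.5], "phone": [1.0, 1.0]},
                        active_users,
                        ["driver", "rear_left", "front_passenger"]))
```

In this sketch a seat that is active in no application receives silence, which mirrors the abstract's statement that loudspeaker output is directed only to currently active users.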

First claim

Opening claim text (preview).

What is claimed is:

1. A method for multi-mode speech communication using a plurality of different operating modes, each operating mode associated with one of a plurality of different speech applications, the method comprising: receiving a plurality of microphone input signals; processing the microphone input signals with an input processing module to produce user input signals for each speech application containing speech of only currently active system users of the respective speech application; enhancing signals associated with application output communications from the speech applications based on the speech application with an output processing module to produce enhanced signals; adaptively mixing the enhanced signals with the output processing module to produce loudspeaker output signals for each currently active speech application; and dynamically controlling processing of the microphone input signals and the loudspeaker output signals to respond to changes in system users currently active in each speech application.

2. A method according to claim 1, wherein the speech applications include a hands free telephone application and the operating modes include a mode optimized for the hands free telephone application.

3. A method according to claim 1, wherein the speech applications include an in-car communication system and the operating modes include a mode optimized for the in-car communication system.

4. A method according to claim 1, wherein the speech applications include an automatic speech recognition (ASR) application and the operating modes include a mode optimized for the ASR application.

5. A method according to claim 1, wherein a plurality of different speech applications operate in parallel.

6. A method according to claim 1, wherein dynamically controlling the processing of the microphone input signals and the loudspeaker output signals is performed in response to control mechanism inputs from the system users.

7. A method according to claim 1, wherein the microphone input signals are received from a plurality of input microphones within a speech service compartment corresponding to a passenger compartment of an automobile.

8. A method according to claim 7, wherein the loudspeaker output signals are directed to output loudspeakers within the service compartment associated with system users of the currently active speech applications.

9. A method according to claim 8, wherein the output loudspeakers are located in a plurality of different locations within the service compartment and each of the system users is associated with at least one different loudspeaker.

10. A method according to claim 8, wherein substantially all input microphones and substantially all output loudspeakers are available to system users when the systems users are each active in a same speech application.

11. A method according to claim 1, wherein a number of microphone channels processed by the input processing module does not have to match a number of currently active speech applications or system users.

12. A method according to claim 1, wherein a number of loudspeaker output channels processed and developed by the output processing module does not have to match a number of received signals from different speech applications or for a total number of system users.

13. A method according to claim 1, wherein the processing modules are capable of operating in a plurality of the different operating modes at a same time, and wherein the processing of the microphone input signals comprises enhancing the user input signals for each speech application to maximize speech from system users of the speech application and to minimize audio sources other than those associated with the speech application.

14. A method according to claim 13, wherein enhancement of the user input signals includes performing at least one of noise reduction and echo cancellation on the user input signals.

15. A method according to claim 13, wherein the enhanced user input signals are further adaptively mixed to reflect different background noises, different speech signals levels, and exploit diversity effects associated with the user input signals for each speech application.

16. A multi-mode speech communication system having a plurality of different operating modes, each operating mode associated with one of a plurality of different speech applications, the system comprising: a signal processing module in communication with the speech applications and including: an input processing module to processes microphone input signals to produce user input signals for each speech application containing speech of only currently active system users of the respective speech application; an output processing module to enhance signals associated with application output communications from the speech applications based on the speech application to produce enhanced signals, and to adaptively mix the enhanced signals to produce loudspeaker output signals for each currently active speech application; and a control module to dynamically control processing of the microphone input signals and the loudspeaker output signals to respond to changes in currently active system users for each application.

17. A system according to claim 16, wherein the system operates a plurality of different speech applications in parallel.

18. A system according to claim 16, wherein the microphone input signals are received from a plurality of input microphones within a speech service compartment corresponding to a passenger compartment of an automobile.

19. A system according to claim 16, wherein the control module dynamically controls the processing of the microphone input signals and the loudspeaker output signals in response to control mechanism inputs from the system users.

20. A computer program product encoded in a non-transitory computer-readable medium for multi-mode speech communication using a plurality of different operating modes, each operated mode associated with one of a plurality of different speech applications, the product comprising: program code for developing a plurality of microphone input signals; program code for processing the microphone input signals with an input processing module to produce user input signals for each speech application containing speech of only currently active system users of the respective speech application; and program code for enhancing signals associated with application output communications from the speech applications based on the speech application with an output processing module to produce enhanced signals; program code for adaptively mixing the enhanced signals with the output processing module to produce loudspeaker output signals for each currently active speech application; and program code for dynamically controlling processing of the microphone input signals and the loudspeaker output signals with a control module to respond to changes in system users currently active in each speech application.
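
Two steps named in the claims, adaptive mixing and dynamic control of who is active in each application, can be sketched in Python as follows. The claims do not specify a mixing rule, so the inverse-noise-power weighting used here is an assumption chosen for illustration, and `mixing_weights`, `adaptive_mix` and `ModeController` are hypothetical names rather than terms from the patent.

```python
from typing import Dict, List, Set


def mixing_weights(noise_power: Dict[str, float], eps: float = 1e-12) -> Dict[str, float]:
    """Assumed rule: weight each channel by inverse noise power, normalized to sum to 1."""
    inverse = {ch: 1.0 / (power + eps) for ch, power in noise_power.items()}
    total = sum(inverse.values())
    return {ch: value / total for ch, value in inverse.items()}


def adaptive_mix(frames: Dict[str, List[float]], noise_power: Dict[str, float]) -> List[float]:
    """Mix the given channels into one frame, favoring channels with less noise."""
    if not frames:
        return []
    weights = mixing_weights({ch: noise_power[ch] for ch in frames})
    length = min(len(frame) for frame in frames.values())
    return [sum(weights[ch] * frames[ch][i] for ch in frames) for i in range(length)]


class ModeController:
    """Tracks which users are currently active in each speech application."""

    def __init__(self) -> None:
        self.active: Dict[str, Set[str]] = {}

    def join(self, app: str, user: str) -> None:
        self.active.setdefault(app, set()).add(user)

    def leave(self, app: str, user: str) -> None:
        users = self.active.get(app)
        if users is None:
            return
        users.discard(user)
        if not users:
            # No active users left, so the application is no longer active.
            del self.active[app]


if __name__ == "__main__":
    controller = ModeController()
    controller.join("phone", "driver")
    controller.join("phone", "rear_left")
    # Recompute the mix whenever the set of active users changes.
    frames = {"driver": [4.0, 0.0], "rear_left": [0.0, 4.0]}
    noise = {"driver": 1.0, "rear_left": 3.0}
    selected = {u: frames[u] for u in controller.active["phone"]}
    print(adaptive_mix(selected, noise))
```

Keeping the active-user table separate from the mixing function means a join or leave event only changes which channels are passed to the mixer, which is one simple way to respond to the changes in active users that the claims describe.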

Assignees

  • Nuance Communications Inc

Inventors

Classifications

  • Voice signal separating · CPC title

  • Physics · mapped topic

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

  • Microphone arrays; Beamforming · CPC title

  • H04M3/568 · Primary

    audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants (echo suppression in two-way loud-speaking telephone systems H04M9/02; sound field processing per se H04S7/30) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9978389B2 cover?
A multi-mode speech communication system is described that has different operating modes for different speech applications. A signal processing module is in communication with the speech applications and includes an input processing module and an output processing module. The input processing module processes microphone input signals to produce a set of user input signals for each speech applicati…
Who is the assignee on this patent?
Nuance Communications Inc
What technology area does this patent fall under?
Primary CPC classification G10L21/0205. Mapped technology areas include Physics.
When was this patent published?
Publication date May 22, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).