What technology area does this patent fall under?

Primary CPC classification G10L17/00. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 16, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for using conversational biometrics and speaker identification/verification to filter voice streams

US9870776B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9870776-B2
Application number	US-201614988884-A
Country	US
Kind code	B2
Filing date	Jan 6, 2016
Priority date	Oct 6, 2008
Publication date	Jan 16, 2018
Grant date	Jan 16, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for using conversational biometrics and speaker identification and/or verification to filter voice streams during mixed mode communication. The method includes receiving an audio stream of a communication between participants. Additionally, the method includes filtering the audio stream of the communication into separate audio streams, one for each of the participants. Each of the separate audio streams contains portions of the communication attributable to a respective participant. Furthermore, the method includes outputting the separate audio streams to a storage system.
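The filtering step in the abstract can be pictured with a small sketch. It assumes a separate identification step has already labeled each slice of the recording with a participant; the `Segment` structure, the `split_by_participant` helper, and the in-memory `storage` dictionary are hypothetical names used for illustration and do not come from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """A slice of the mixed recording (hypothetical structure, not from the patent)."""
    start: float    # seconds from the start of the communication
    end: float
    speaker: str    # label assumed to come from a separate identification step
    samples: list   # placeholder for the audio samples in this slice


def split_by_participant(segments):
    """Group segments into one time-ordered stream per participant."""
    streams = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start):
        streams[seg.speaker].append(seg)
    return dict(streams)


# A toy mixed stream with two participants.
mixed = [
    Segment(0.0, 2.5, "agent", [0.1, 0.2]),
    Segment(2.5, 6.0, "caller", [0.3]),
    Segment(6.0, 7.0, "agent", [0.4]),
]

streams = split_by_participant(mixed)

# Stand-in for "outputting the separate audio streams to a storage system":
# here only the time ranges of each participant's stream are kept.
storage = {name: [(s.start, s.end) for s in segs] for name, segs in streams.items()}
```

Each entry in `storage` holds only the portions attributable to one participant, which mirrors the one-stream-per-participant output the abstract describes.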

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented in a computing system, the method comprising: removing background noise from a communication and performing normalization of the communication to enhance accuracy of the communication; extracting a plurality of audio streams from the enhanced communication, wherein: the plurality of audio streams correspond, respectively, to a plurality of participants in the enhanced communication, and the plurality of audio streams contain portions of the enhanced communication corresponding, respectively, to the plurality of participants; and matching one or more of the portions of the enhanced communication in the plurality of audio streams to voice prints by comparing the plurality of audio streams to only a plurality of the voice prints corresponding to identified participants within the enhanced communication and by utilizing physiological traits and behavioral traits found in speech of the identified participants; adapting a speaker model of the voice prints after successfully matching the one or more of the portions of the enhanced communication in the plurality of audio streams to the voice prints, wherein adapting the speaker model includes capturing long-term voice changes of the identified participants in the voice prints used for the matching.
2. The method of claim 1, further comprising performing a verification process for at least one of the plurality of participants.
3. The method of claim 1, wherein the voice prints include seed phrases received for each of the plurality of participants.
4. The method of claim 1, wherein: providers of the voice prints are each associated with a role; and each of the voice prints is stored in one of a plurality of discrete databases based on the role of the associated provider.
5. The method of claim 1, wherein the matching comprises: matching the one or more of the portions of the enhanced communication to the voice prints using at least one of conversational biometrics, speaker identification and speaker verification; and assigning the one or more portions of the enhanced communication attributable to each of the plurality of participants to the separate audio streams corresponding to each of the plurality of participants.
6. The method of claim 1 further comprising verifying at least one of the plurality of participants by authenticating a match between a voice sample of the at least one of the plurality of participants and a previously stored voice print of the at least one of the plurality of participants.
7. The method of claim 1 further comprising receiving the audio stream of the enhanced communication from a sensor.
8. The method of claim 1 further comprising converting the enhanced communication from speech to text.
9. The method of claim 1, wherein the enhanced communication is a call center communication and the plurality of participants includes at least two of: a caller, an agent and an interactive voice response (IVR) system.
10. The method of claim 9 further comprising collecting seed phrases for the agent and the IVR system prior to the receiving the audio stream.
11. The method of claim 1, wherein the enhanced communication is a medical professional communication and the plurality of participants include at least a medical professional and a patient.
12. The method of claim 1, wherein the extracting comprises using a voice print for speaker identification of all but one of the plurality of participants and a process of elimination for the one of the plurality of participants.
13. The method of claim 1, wherein the extracting the plurality of audio streams comprises one of: performed in real-time and performed in a batch.
14. The method of claim 1, wherein the extracting comprises determining an origin of voice of at least one of the plurality of participants.
15. The method of claim 1, further comprising filtering the audio streams into separate audio streams corresponding, respectively, to each of a plurality of participants in the enhanced communication, wherein each of the separate audio streams contains portions of the enhanced communication attributable to the corresponding one of the plurality of participants.
16. The method of claim 15, further comprising identifying the plurality of participants by matching one or more of the portions of the enhanced communication in each of the separate audio streams to voice prints.
17. The method of claim 16, wherein the matching comprises comparing the separate audio streams to only the plurality of the voice prints corresponding to the identified participants within the enhanced communication.
18. The method of claim 1, wherein the physiological traits includes acoustic patterns of the identified participants which reflect an anatomy of the identified participants.
19. The method of claim 18, wherein the behavioral traits are learned behavioral patterns in the identified participants.
20. The method of claim 19, wherein the behavioral traits comprises voice pitch and speaking style of the identified participants.
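Two ideas in claim 1 lend themselves to a toy illustration: comparing a portion of speech only against the voice prints of participants already identified in the communication, and adapting the stored print after a successful match. The sketch below is not the patented implementation. It assumes voice prints and speech portions are plain feature vectors, uses cosine similarity as a stand-in distance measure, and uses an exponential moving average as a stand-in for speaker-model adaptation; `cosine`, `match_segment`, `adapt`, the threshold, and the rate are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_segment(features, voice_prints, identified, threshold=0.8):
    """Return the best-matching identified participant, or None.

    Only voice prints of participants in `identified` are compared;
    other enrolled prints are never consulted.
    """
    best_name, best_score = None, threshold
    for name in identified:
        score = cosine(features, voice_prints[name])
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def adapt(voice_print, features, rate=0.1):
    """Blend matched features into the stored print (assumed moving average)."""
    return [(1 - rate) * v + rate * f for v, f in zip(voice_print, features)]


voice_prints = {
    "agent": [1.0, 0.0, 0.0],
    "caller": [0.0, 1.0, 0.0],
    "supervisor": [0.0, 0.0, 1.0],  # enrolled, but not a participant in this call
}
identified = ["agent", "caller"]

features = [0.9, 0.1, 0.0]  # toy features for one portion of speech
who = match_segment(features, voice_prints, identified)
if who is not None:
    # Adapt only after a successful match, so the print tracks gradual change.
    voice_prints[who] = adapt(voice_prints[who], features)
```

Restricting the comparison to `identified` keeps the search small and means a portion resembling the unrelated `supervisor` print returns no match here. A small `rate` lets the stored print drift slowly, which is one simple way to picture capturing long-term voice changes.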

Assignees

Inventors

Classifications

H04M3/5166
in combination with interactive voice response systems or voice portals, e.g. as front-ends
H04M2201/41
using speaker recognition
H04M3/4936
Speech interaction details (speech recognition per se G10L15/00)
H04M3/5175
Call or contact centers supervision arrangements
G10L17/08
Use of distortion metrics or a particular distance between probe pattern and reference templates

Patent family

Related publications grouped by family.

View patent family 42075820

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9870776B2 cover?: A method and system for using conversational biometrics and speaker identification and/or verification to filter voice streams during mixed mode communication. The method includes receiving an audio stream of a communication between participants. Additionally, the method includes filtering the audio stream of the communication into separate audio streams, one for each of the participants. Each …
Who is the assignee on this patent?: IBM