System and method for voice print generation

US9721571B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9721571-B2
Application number  US-201514738891-A
Country             US
Kind code           B2
Filing date         Jun 14, 2015
Priority date       Jun 14, 2015
Publication date    Aug 1, 2017
Grant date          Aug 1, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for enrolling in a database voice prints generated from audio streams may include receiving an audio stream of a communication session and creating a preliminary association between the audio stream and an identity of a customer that has engaged in the communication session based on identification information. The method may further include determining a confidence level of the preliminary association based on authentication information related to the customer and if the confidence level is higher than a threshold, sending a request to compare the audio stream to a database of voice prints of known fraudsters. If the audio stream does not match any known fraudsters, sending a request to generate from the audio stream a current voice print associated with the customer and enrolling the voice print in a customer voice print database.
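
The decision flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Session` type, the `Outcome` values, and the two callables (`matches_known_fraudster`, `make_voice_print`) are hypothetical names introduced here, and the abstract does not say what happens when the confidence is too low or when a fraudster match is found, so those branches simply report the reason and enroll nothing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Outcome(Enum):
    SKIPPED_LOW_CONFIDENCE = "skipped_low_confidence"      # assumed behaviour
    REJECTED_FRAUDSTER_MATCH = "rejected_fraudster_match"  # assumed behaviour
    ENROLLED = "enrolled"


@dataclass
class Session:
    customer_id: str   # preliminary association from identification information
    audio: bytes       # audio stream of the communication session
    confidence: float  # confidence level derived from authentication information


def enroll_from_session(
    session: Session,
    threshold: float,
    matches_known_fraudster: Callable[[bytes], bool],  # hypothetical comparison service
    make_voice_print: Callable[[bytes], object],       # hypothetical voice print generator
    customer_db: Dict[str, object],
) -> Outcome:
    # Step 1: continue only if the confidence is strictly higher than the threshold.
    if session.confidence <= threshold:
        return Outcome.SKIPPED_LOW_CONFIDENCE
    # Step 2: compare the audio against voice prints of known fraudsters.
    if matches_known_fraudster(session.audio):
        return Outcome.REJECTED_FRAUDSTER_MATCH
    # Step 3: generate a current voice print and enroll it for the customer.
    customer_db[session.customer_id] = make_voice_print(session.audio)
    return Outcome.ENROLLED
```

The fraudster check and the voice print generator are passed in as callables because the abstract describes them as requests sent to other components rather than as local computations.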

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method of generating a text-dependent voice print for an individual by passive enrollment using a not predetermined repeated phrase to enroll the individual in a system, the method comprising:
    receiving, based on identification information of the individual from an audio server, audio data of past communication sessions involving the individual:
    searching, by a speech analytics server, the audio data of the past communication sessions that include speech by the individual for the not predetermined repeated phrase that is uttered more than at least three times;
    when the not predetermined repeated phrase is uttered more than three times, locating at least a predetermined number of utterances of said not predetermined repeated phrase in the audio data of the past communication sessions, said predetermined number being more than three times and when not found, reporting by the speech analytics server to an enrollment unit that the enrollment of the individual has failed;
    determining whether the repeated phrase contains more than three words and when not, reporting by the speech analytics to the enrollment unit that the enrollment of the individual has failed;
    when the repeated phrase contains more than three words, creating a separate audio file for each utterance of the repeated phrase;
    generating, by a voice biometric server, the text-dependent voice print for the individual based on the audio files containing located utterances of the repeated phrase; and
    storing the text-dependent voice print in association with the identification information of the individual.

2. The method of claim 1, wherein locating comprises using metadata to indicate the start of each of the utterances or the end of each of the utterances, or both.

3. The method of claim 1, wherein the repeated phrase is determined by searching one or more recordings of the audio data for any phrase that is uttered by the individual at least said predetermined number of times.

4. The method of claim 3, wherein only phrases having at least a predetermined number of words are used for generating the voice print, wherein the predetermined number is more than three times.

5. The method of claim 1, further comprising marking the audio data, using metadata to mark one or both of the start and end of an utterance of each of the repeated phrases, and storing the metadata in association with the audio data.

6. The method of claim 1, further comprising storing the voice print in association with other data related to the individual.

7. The method of claim 1, further comprising storing the repeated phrase in text form in association with other data related to the individual as a pass phrase for future authentication of the individual.

8. The method of claim 1, further comprising: in a future communication session, receiving a new utterance of the repeated phrase by the individual; and authenticating the individual from the new utterance using the voice print for the individual.

9. The method of claim 8, wherein the new utterance of the repeated phrase is used to enrich the voice print for the individual previously generated from the utterances in the one or more stored audio files.

10. The method of claim 7, further comprising prompting the individual to utter the pass phrase for seamless authentication during an interaction between the individual and a third party.

11. The method of claim 1, wherein searching for the repeated phrase comprises converting at least part of the audio data to text.

12. A system for generating a text-dependent voice print for an individual by passive enrollment using an unknown phrase to enroll the individual in the system, the system comprising:
    a speech analytics server configured to:
        receive, based on identification information of the individual from an audio server, audio data of past communication sessions involving the individual;
        search the audio data of the past communication sessions that include speech by the individual for at least one not predetermined repeated phrase that is uttered more than at least three times;
        when a repeated phrase that is uttered more than three times is found, locate at least a predetermined number of utterances of said at least one repeated phrase in the audio data of the past communication sessions, said predetermined number being more than three times and when not found, report to an enrolment unit that the enrolment of the individual has failed;
        determine whether the repeated phrase contains more than three words and when not, reporting by the speech analytics to the enrolment unit that the enrolment of the individual has failed;
        when the repeated phrase contains more than three words, create a separate audio file for each utterance of the repeated phrase; and
    a voice biometric server configured to generate the text-dependent voice print for the individual by analyzing the audio files containing the utterances of the repeated phrase located by the speech analytics server.

13. The system of claim 12, wherein the voice biometric server is further configured to receive a new utterance of said repeated phrase and to use the voice print to determine whether the new utterance was uttered by the individual.

14. The system of claim 13, wherein the voice biometric server is further configured to enrich the voice print using a new utterance following a determination that the new utterance was by the individual.

15. The system of claim 13, wherein the speech analytics server is configured to convert the phrase to text to be stored in association with other information relating to the individual.
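
The repeated-phrase search in claims 1, 3, 4 and 11 can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the claimed method itself: it works on text transcripts of the individual's speech (claim 11 allows converting audio to text) rather than on audio, and the function name `find_repeated_phrase`, the `max_words` cap, and the tie-breaking rule (prefer the longest qualifying phrase, then the most frequent) are hypothetical choices made here. Cutting a separate audio file per utterance and generating the voice print are out of scope.

```python
from collections import Counter
from typing import List, Optional

MIN_UTTERANCES = 4  # "uttered more than three times"
MIN_WORDS = 4       # "contains more than three words"


def find_repeated_phrase(transcripts: List[str], max_words: int = 8) -> Optional[str]:
    """Return a phrase qualifying for enrollment, or None if enrollment fails.

    `transcripts` holds text of past sessions spoken by one individual.
    `max_words` is an assumed upper bound that keeps the search small.
    """
    counts = Counter()
    for text in transcripts:
        words = text.lower().split()
        # Count every word sequence of MIN_WORDS..max_words words.
        for n in range(MIN_WORDS, max_words + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1

    # Keep only phrases uttered often enough; shorter phrases were never counted.
    candidates = [
        (len(phrase), count, phrase)
        for phrase, count in counts.items()
        if count >= MIN_UTTERANCES
    ]
    if not candidates:
        return None  # corresponds to reporting that enrollment has failed
    # Assumed tie-break: longest phrase first, then highest count.
    _, _, best = max(candidates)
    return " ".join(best)
```

For example, four sessions that each contain "my voice is my password" yield that phrase, while three such sessions, or a phrase of only three words, yield `None`.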

Assignees

Nice-Systems Ltd, Nice Ltd

Inventors

Classifications

  • G10L17/04 (Primary)

    Training, enrolment or model building · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9721571B2 cover?
A computer-implemented method for enrolling in a database voice prints generated from audio streams may include receiving an audio stream of a communication session and creating a preliminary association between the audio stream and an identity of a customer that has engaged in the communication session based on identification information. The method may further include determining a confidence…
Who is the assignee on this patent?
Nice-Systems Ltd, Nice Ltd
What technology area does this patent fall under?
Primary CPC classification G10L17/04. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 1, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).