Voiceprint security with messaging services

US11252152B2 · US · B2

Patent metadata
Publication number: US-11252152-B2
Application number: US-202016892024-A
Country: US
Kind code: B2
Filing date: Jun 3, 2020
Priority date: Jan 31, 2018
Publication date: Feb 15, 2022
Grant date: Feb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An online system authenticates a user through a voiceprint biometric verification process. When a user needs to be authenticated, the online system generates and provides a random phrase to the user. The online system receives an audio recording of the randomly generated phrase and retrieves a previously trained voiceprint model for the user. The online system analyzes the audio recording by applying the voiceprint model to determine whether the audio recording satisfies a first criterion of whether the voice in the audio recording belongs to the user and a second criterion of whether the audio recording includes a vocalization of the randomly generated phrase. If the audio recording satisfies both criteria, the online system authenticates the user. Therefore, the user can be provided access to a new communication session in response to being authenticated.
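
The two-criteria check described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `VoiceprintResult`, `normalize`, `authenticate`, and the 0.8 score threshold are hypothetical names and values chosen for the example, and the voiceprint model's outputs are assumed to be already available.

```python
import re
from dataclasses import dataclass


@dataclass
class VoiceprintResult:
    """Hypothetical outputs of a voiceprint model for one recording."""
    authentication_score: float  # similarity to the enrolled voice, assumed in [0, 1]
    predicted_phrase: str        # the model's prediction of the spoken phrase


def normalize(phrase: str) -> str:
    """Lowercase and drop punctuation so trivial differences do not cause a mismatch."""
    return " ".join(re.findall(r"[a-z0-9']+", phrase.lower()))


def authenticate(result: VoiceprintResult, expected_phrase: str,
                 score_threshold: float = 0.8) -> bool:
    """Return True only if both criteria hold (threshold is an assumed value)."""
    # First criterion: the voice in the recording matches the user's voice.
    voice_matches = result.authentication_score >= score_threshold
    # Second criterion: the recording contains the randomly generated phrase.
    phrase_matches = normalize(result.predicted_phrase) == normalize(expected_phrase)
    return voice_matches and phrase_matches
```

Under these assumptions, a replayed recording of an older phrase would fail the second criterion even when the voice score is high, which appears to be the purpose of issuing a fresh random phrase for each request.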

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, from a client device, a request for authentication; generating a random phrase for a user of the client device; providing, to the client device, the generated random phrase; receiving an audio recording of a vocalization of the random phrase by the user from the client device; providing the received audio recording to a neural network, the neural network trained using audio recordings obtained from the user to generate a voiceprint model for the user, the voiceprint model having learned parameters that, during training, are adjusted for the voiceprint model to identify a voice of the user or a vocalized phrase by the user in an audio recording from the user, the voiceprint model comprising a first portion and a second portion; receiving, from the first portion of the voiceprint model, an authentication score for the received audio recording, the authentication score representing a similarity between a voice in the received audio recording and the voice of the user; receiving, from the second portion of the voiceprint model, a predicted phrase, the predicted phrase representing a prediction of the vocalized random phrase in the received audio recording; and authenticating the client device based on the received authentication score and the received predicted phrase determined for the received audio recording.

2. The method of claim 1, wherein generating the random phrase for the user of the client device comprises: comparing the generated random phrase to one or more phrases previously generated for the user.

3. The method of claim 1, wherein the neural network was trained for the user by: generating a training phrase for the user; obtaining a training audio recording of the user, the training audio recording corresponding to the training phrase; extracting voice pattern features from the training audio recording; and training the neural network to generate the voiceprint model based on the extracted voice pattern features from the training audio recording.

4. The method of claim 3, wherein the extracted voice pattern features of the training audio recording comprise one or more of: a statistical measure of an amplitude of the training audio recording, a statistical measure of a frequency of the training audio recording, a cadence, a lisp, a categorization of a phoneme, a cepstral coefficient, a perceptual linear predictive coefficient, or a filter-bank feature.

5. The method of claim 1, wherein the request for authentication is a request for a communication session, and wherein the communication session is one of a new phone call, a new online chat, a new voice call, a new video call, or a new text message.

6. The method of claim 1 further comprising: receiving a first level of authentication information; and verifying the first level of authentication information, wherein authenticating the client device is further based on the verification of the first level of authentication information.

7. The method of claim 1, further comprising: extracting voice pattern features from the received audio recording, and wherein providing the received audio recording as the input to the neural network comprises providing the extracted voice pattern features as input to the neural network.

8. A non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to: receive, from a client device, a request for authentication; generate a random phrase for a user of the client device; provide, to the client device, the generated random phrase; receive an audio recording of a vocalization of the random phrase by the user from the client device; provide the received audio recording to a neural network, the neural network trained using audio recordings obtained from the user to generate a voiceprint model for the user, the voiceprint model having learned parameters that, during training, are adjusted for the voiceprint model to identify a voice of the user or a vocalized phrase by the user in an audio recording from the user, the voiceprint model comprising a first portion and a second portion; receive, from the first portion of the voiceprint model, an authentication score for the received audio recording, the authentication score representing a similarity between a voice in the received audio recording and the voice of the user; receive, from the second portion of the voiceprint model, a predicted phrase, the predicted phrase representing a prediction of the vocalized random phrase in the received audio recording; and authenticate the client device based on the authentication score and predicted phrase determined for the received audio recording.

9. The non-transitory computer readable medium of claim 8, wherein the instructions that cause the processor to generate the random phrase for the user of the client device further comprises instructions that, when executed by the processor, cause the processor to: compare the generated random phrase to one or more phrases previously generated for the user.

10. The non-transitory computer readable medium of claim 8, wherein the neural network was trained for the user by: generating a training phrase for the user; obtaining a training audio recording of the user, the audio recording corresponding to the training phrase; extracting voice pattern features from the training audio recording; and training the neural network to generate the voiceprint model based on the extracted voice pattern features from the training audio recording.

11. The non-transitory computer readable medium of claim 10, wherein the extracted features of the training audio recording comprise one or more of: a statistical measure of an amplitude of the training audio recording, a statistical measure of a frequency of the training audio recording, a cadence, a lisp, a categorization of a phoneme, a cepstral coefficient, a perceptual linear predictive coefficient, or a filter-bank feature.

12. The non-transitory computer readable medium of claim 8, wherein the request for authentication is a request for a communication session, and wherein the communication session is one of a new phone call, a new online chat, a new voice call, video call, or a new text message.

13. The non-transitory computer readable medium of claim 8, further comprising instructions that, when executed by the processor, cause the processor to: receive a first level of authentication information; and verify the first level of authentication information, wherein authenticating the client device is further based on the verification of the first level of authentication information.

14. The non-transitory computer readable medium of claim 8, further comprising instructions that, when executed by the processor, cause the processor to: extract voice pattern features from the received audio recording, and wherein providing the received audio recording as the input to the neural network comprises providing the extracted voice pattern features of the received audio recording as input to the neural network.

15. A system comprising: a processor; and a non-transitory computer-readable medium containing instructions that, when executed by the processor, cause the processor to: receive, from a client device, a request for authentication; generate a random phrase for a user of the client device; provide, to the client device, the generated random phrase; receive an audio recording of a vocalization of the random phrase by the user from the client device; provide the received audio recording to a neural netw
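
Claims 2 and 9 recite comparing a newly generated random phrase to phrases previously generated for the user. The claims shown here do not say what follows the comparison, so the sketch below assumes one plausible use: rejecting a repeat and drawing again. The word list, the four-word phrase length, and the name `generate_phrase` are all hypothetical.

```python
import secrets

# Hypothetical vocabulary; a real system would presumably use a much larger list.
WORDS = ["amber", "river", "stone", "cloud", "maple", "tiger", "orbit", "candle"]


def generate_phrase(previous_phrases, num_words=4, max_attempts=100):
    """Draw a random phrase that was not previously issued to this user."""
    seen = set(previous_phrases)
    for _ in range(max_attempts):
        # secrets.choice draws from a cryptographically strong random source.
        phrase = " ".join(secrets.choice(WORDS) for _ in range(num_words))
        # Assumed policy: compare against the user's history and retry on a repeat.
        if phrase not in seen:
            return phrase
    raise RuntimeError("could not generate an unused phrase")
```

Calling `generate_phrase(history)` with the user's stored phrase history returns a phrase outside that history, or raises an error if the vocabulary is too small to produce a new one within `max_attempts` draws.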

Assignees

Inventors

Classifications

  • using biometrical features, e.g. fingerprint, retina-scan (cryptographic mechanisms or cryptographic arrangements for entity authentication using biological data H04L9/3231) · CPC title

  • using biometric data, e.g. fingerprints, iris scans or voiceprints · CPC title

  • Use of phonemic categorisation or speech recognition prior to speaker recognition or verification · CPC title

  • Training, enrolment or model building · CPC title

  • for controlling access to devices or network resources · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11252152B2 cover?
An online system authenticates a user through a voiceprint biometric verification process. When a user needs to be authenticated, the online system generates and provides a random phrase to the user. The online system receives an audio recording of the randomly generated phrase and retrieves a previously trained voiceprint model for the user. The online system analyzes the audio recording by ap…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification H04L63/0861. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Feb 15, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).