Voice vector framework for authenticating user interactions

US11700250B2 · US · B2

Patent metadata
Publication number: US-11700250-B2
Application number: US-202017070755-A
Country: US
Kind code: B2
Filing date: Oct 14, 2020
Priority date: Oct 14, 2020
Publication date: Jul 11, 2023
Grant date: Jul 11, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There are provided systems and methods for a voice vector framework that authenticates user interactions. A service provider server receives user interaction data having audio data that is associated with an interaction between a user device and the service provider server. The server extracts user attributes from the audio data and obtains user account information associated with the user device. The server selects a classifier that corresponds to a select combination of features based on the user account information and applies the classifier to the user attributes. The server generates a voice vector that includes multiple scores indicating likelihoods that a respective user attribute corresponds to an attribute of the select combination of features. The server compares the voice vector to a baseline vector corresponding to a predetermined combination of features and sends a notification to an agent device with an indication of whether the user device is verified.
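The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `select_classifier`, `generate_voice_vector`, `compare_to_baseline`, the `feature_combination` key, the toy classifier, and the per-score tolerance check are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: a classifier maps extracted attributes to one score per feature.
Attributes = Dict[str, float]
Classifier = Callable[[Attributes], List[float]]


@dataclass
class Notification:
    """Stand-in for the message sent to the agent device."""
    verified: bool
    max_deviation: float


def select_classifier(classifiers: Dict[str, Classifier],
                      account_info: Dict[str, str]) -> Classifier:
    # Assumption: the account information names the feature combination to use.
    return classifiers[account_info["feature_combination"]]


def generate_voice_vector(classifier: Classifier, attributes: Attributes) -> List[float]:
    # Each score is treated as a likelihood, so clamp it to [0, 1].
    return [min(1.0, max(0.0, score)) for score in classifier(attributes)]


def compare_to_baseline(voice_vector: List[float], baseline: List[float],
                        tolerance: float) -> Notification:
    if len(voice_vector) != len(baseline):
        raise ValueError("voice vector and baseline must have the same length")
    # Assumption: the device is verified when no score deviates beyond the tolerance.
    max_dev = max(abs(v - b) for v, b in zip(voice_vector, baseline))
    return Notification(verified=max_dev <= tolerance, max_deviation=max_dev)


def toy_classifier(attributes: Attributes) -> List[float]:
    # Placeholder for a trained model; it simply passes three attributes through.
    return [attributes["pitch"], attributes["tempo"], attributes["accent"]]


classifiers = {"gender-age-accent": toy_classifier}
account_info = {"feature_combination": "gender-age-accent"}
attributes = {"pitch": 0.8, "tempo": 0.4, "accent": 0.9}

classifier = select_classifier(classifiers, account_info)
voice_vector = generate_voice_vector(classifier, attributes)
notification = compare_to_baseline(voice_vector, [0.75, 0.5, 0.9], tolerance=0.2)
```

In a real system the classifier would be a trained model and the tolerance would be tuned; here both are fixed values chosen only to make the comparison step concrete.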

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, by one or more hardware processors, voice data associated with a voice communication between a user of a user device and a service provider server; determining, by the one or more hardware processors, that the voice communication is associated with a user account with the service provider server; extracting, by the one or more hardware processors, a plurality of user attributes from the voice data; applying corresponding weights to one or more user attributes in the plurality of user attributes based at least in part on one or more characteristics of the user account; determining, from the plurality of user attributes, a particular user attribute combination; selecting, from a plurality of voice classifiers, a particular classifier corresponding to the particular user attribute combination; generating, by the one or more hardware processors, a voice vector for the user based on the particular classifier and the corresponding weights applied to the one or more user attributes; determining a difference between the voice vector and a baseline vector that corresponds to the user account; determining that the user is a malicious user when the difference exceeds a threshold; and in response to determining that the user is a malicious user, restricting the user from accessing the user account during the voice communication.

2. The method of claim 1, wherein the comparing the voice vector to the baseline vector comprises: determining a deviation between one or more values in the voice vector and corresponding values in the baseline vector; and determining whether the deviation between each of the one or more values and the corresponding values exceeds a predetermined distance threshold.

3. The method of claim 1, further comprising: generating a notification indicating that the user is not authenticated to access the user account; and transmitting the notification to a communication device associated with the service provider server.

4. The method of claim 3, wherein the notification prompts the communication device to prompt the user device for additional verification information.

5. The method of claim 1, further comprising: masking a subset of user attributes in the plurality of user attributes based at least in part on the one or more characteristics of the user account, wherein the generating the voice vector is further based on the masking of the subset of user attributes, wherein the masked subset of user attributes is excluded from being considered during the determining the difference.

6. The method of claim 1, further comprising determining the threshold based on the one or more characteristics of the user account.

7. The method of claim 1, further comprising: obtaining a plurality of baseline vectors associated with a plurality of the user accounts; and comparing the voice vector to each of the plurality of baseline vectors.

8. The method of claim 7, wherein the user account is a first user account, wherein the method further comprises: determining that the user was involved in a second voice communication associated with a second user account based on the comparing the voice vector to each of the plurality of baseline vectors, wherein the determining that the user is a malicious user is further based on the determining that the user was involved in the second voice communication.

9. The method of claim 1, further comprising: determining, from the plurality of user attributes, a plurality of user attribute combinations; and generating a plurality of machine learning-based networks based on the plurality of user attribute combinations, wherein the plurality of machine learning-based networks corresponds to respective ones of the plurality of user attribute combinations.

10. The method of claim 9, further comprising: training each of the plurality of machine learning-based networks with a training dataset to form a plurality of machine learning-trained classifiers, the training dataset comprising attribute data that corresponds to one of the plurality of user attribute combinations, wherein the selecting the particular classifier comprises selecting a machine learning-trained classifier from the plurality of machine learning-trained classifiers.

11. The method of claim 1, wherein the particular user attribute combination comprises a gender-age-accent combination.

12. A system, comprising: a non-transitory memory; and one or more hardware processors coupled to the non-transitory memory and configured to execute instructions from the non-transitory memory to cause the system to perform operations comprising: receiving user interaction data associated with an interaction between a user device and a service provider server associated with a service, the user interaction data comprising audio data associated with the interaction; extracting, using a feature extraction engine, a plurality of user attributes from the audio data; determining an intent of the interaction from the plurality of user attributes; selecting, from a plurality of machine learning-trained classifiers, one or more machine learning-trained classifiers that correspond to the determined intent; applying each of the one or more machine learning-trained classifiers to the plurality of user attributes; generating a plurality of voice vectors, wherein each voice vector of the plurality of voice vectors is associated with a corresponding one of the one or more machine learning-trained classifiers; selecting one of the one or more machine learning-trained classifiers that corresponds to one of the plurality of voice vectors having a highest aggregate score; determining a voice signature associated with the selected one of the one or more machine learning-trained classifiers; determining whether the user device interacted with the service provider server in a prior interaction based on the voice signature; and sending a notification to a communication device associated with the service provider server, the notification comprising an indication of whether the user device interacted with the service provider server in the prior interaction.

13. The system of claim 12, wherein the determining whether the user device interacted with the service provider server comprises: accessing user account information in a data repository communicably coupled to the service provider server; comparing the voice signature to a historical voice signature associated with the accessed user account information that is stored in the data repository; determining that a Euclidean distance between the voice signature and the historical voice signature is within a predetermined tolerance threshold; and determining that the user device interacted with the service provider server in the prior interaction based on the Euclidean distance being within the predetermined tolerance threshold.

14. The system of claim 12, wherein the determining whether the user device interacted with the service provider server further comprises: obtaining a plurality of historical voice signatures from user account information associated with respective ones of a plurality of user accounts; and comparing the voice signature to each of the plurality of historical voice signatures.

15. The system of claim 12, wherein the operations further comprise: obtaining user account information from a user account associated with the user device; determining, from the plurality of user attributes, a particular user attribute combination based on the user account information; selecting, from th
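
Several steps recited in the claims lend themselves to a short sketch: weighting and masking attributes (claims 1 and 5), flagging a user when the difference from the baseline exceeds a threshold (claim 1), picking the classifier whose voice vector has the highest aggregate score (claim 12), and the Euclidean distance test (claim 13). The code below is an illustration under assumptions, not the claimed implementation: all function names are hypothetical, the claim 1 "difference" is assumed here to be a Euclidean distance, the baseline is assumed to be stored in the same weighted space, and the "aggregate score" is assumed to be a plain sum.

```python
import math
from typing import Dict, List, Optional, Sequence


def weighted_vector(scores: Sequence[float], weights: Sequence[float],
                    mask: Optional[Sequence[bool]] = None) -> List[Optional[float]]:
    """Apply per-attribute weights; masked attributes become None."""
    mask = mask or [False] * len(scores)
    return [None if m else s * w for s, w, m in zip(scores, weights, mask)]


def difference(voice_vector: Sequence[Optional[float]],
               baseline: Sequence[float]) -> float:
    """Euclidean distance over unmasked positions (masked ones are skipped)."""
    pairs = [(v, b) for v, b in zip(voice_vector, baseline) if v is not None]
    return math.sqrt(sum((v - b) ** 2 for v, b in pairs))


def is_malicious(voice_vector: Sequence[Optional[float]],
                 baseline: Sequence[float], threshold: float) -> bool:
    """Flag the user when the difference exceeds the threshold."""
    return difference(voice_vector, baseline) > threshold


def select_by_aggregate(vectors: Dict[str, List[float]]) -> str:
    """Return the classifier name whose voice vector has the largest sum."""
    return max(vectors, key=lambda name: sum(vectors[name]))


# Third attribute is masked, so it plays no part in the distance.
vec = weighted_vector([0.9, 0.2, 0.7], [1.0, 0.5, 1.0], mask=[False, False, True])
same_user = is_malicious(vec, [0.9, 0.1, 0.0], threshold=0.25)
other_user = is_malicious(vec, [0.1, 0.1, 0.3], threshold=0.25)
best = select_by_aggregate({"intent_a": [0.2, 0.3], "intent_b": [0.6, 0.1]})
```

Representing masked attributes as `None` keeps vector positions aligned with the baseline, which makes the exclusion in the distance computation explicit.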

Assignees

Inventors

Classifications

  • for transmitting results of analysis

  • using different networks or channels, e.g. using out of band channels (cryptographic mechanisms or cryptographic arrangements for key distribution involving distinctive intermediate devices or communication paths H04L9/0827; cryptographic mechanisms or cryptographic arrangements for authentication using a plurality of channels H04L9/3215)

  • Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

  • Use of distortion metrics or a particular distance between probe pattern and reference templates

  • Machine learning

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11700250B2 cover?
There are provided systems and methods for a voice vector framework that authenticates user interactions. A service provider server receives user interaction data having audio data that is associated with an interaction between a user device and the service provider server. The server extracts user attributes from the audio data and obtains user account information associated with the user devi…
Who is the assignee on this patent?
Paypal Inc
What technology area does this patent fall under?
Primary CPC classification H04L63/0861. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jul 11, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).