Identifying duplicate user accounts in an identification document processing system

US2020311844A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2020311844-A1
Application number  US-202016832726-A
Country             US
Kind code           A1
Filing date         Mar 27, 2020
Priority date       Mar 27, 2019
Publication date    Oct 1, 2020
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system processes images of documents, for example, identification documents. The system transforms an image of a document to generate an image that represents the document in a canonical form. For example, if the input image has a document that is tilted at an angle with respect to the sides of the image, the system modifies the orientation of the document to show the document having sides aligned with the sides of the image. The system stores user accounts that include user information including images. The system generates a graph of nodes that represent user accounts with edges determined based on similarity scores between user accounts. The system determines connected components of user accounts, such that each connected component represents user accounts that have a high likelihood of being duplicates.
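The graph step in the abstract can be illustrated with a minimal sketch. It assumes similarity scores are already available as a dictionary keyed by pairs of account indices, and that a single fixed threshold decides which pairs get an edge. The function name `connected_components`, the union-find approach, and the sample scores are illustrative assumptions, not details taken from the publication.

```python
def connected_components(num_accounts, scores, threshold):
    """Group account indices linked by pairs whose score exceeds `threshold`.

    `scores` maps (account_a, account_b) index pairs to a similarity score.
    Union-find is one common way to do this; the patent does not name one.
    """
    parent = list(range(num_accounts))

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        if score > threshold:
            # An edge exists: merge the two accounts' groups.
            parent[find(a)] = find(b)

    groups = {}
    for node in range(num_accounts):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical scores for five accounts (made-up values).
scores = {(0, 1): 0.92, (1, 2): 0.88, (2, 3): 0.35, (3, 4): 0.81, (0, 4): 0.10}
components = connected_components(5, scores, threshold=0.8)
likely_duplicates = [group for group in components if len(group) > 1]
```

Accounts 0 and 2 end up in the same group even though they were never compared directly, because both are similar to account 1. That transitivity is what makes connected components a natural fit for grouping candidate duplicates.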

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for identifying duplicate user accounts comprising: receiving a plurality of user accounts; for each of a plurality of pairs of user accounts, the pair of user accounts comprising a first user account and a second user account, determining a similarity score indicative of similarity between the first user account and the second user account; determining an initial threshold similarity score, wherein the initial threshold similarity score is indicative of a particular degree of similarity between user accounts; repeating for a plurality of iterations, wherein each iteration has a threshold similarity score, the threshold similarity score initialized to the initial threshold similarity score, the steps comprising: determining one or more connected components of a graph comprising nodes and edges, wherein each node represents a user account and a pair of nodes has an edge if the similarity score of the pair of nodes indicates greater degree of similarity than that indicated by the threshold similarity score; and modifying the threshold similarity score for the next iteration to a value indicative of higher degree of similarity between user accounts compared to the threshold similarity score for the current iteration; responsive to repeating the steps for the plurality of iterations, identifying one or more connected components, each identified connected component representing a set of user accounts for a particular user; and transmitting information describing the identified one or more connected components.

2. The computer-implemented method of claim 1, wherein each iteration further comprises: responsive to modifying the threshold similarity score, removing one or more edges with a similarity score indicative of a degree of similarity less than the modified threshold similarity score.

3. The computer-implemented method of claim 1, wherein each user account is associated with an image, and wherein determining similarity scores between a pair of user accounts comprises performing facial recognition on the images and comparing results of facial recognition for the pair of user accounts.

4. The computer-implemented method of claim 1, wherein each user account is associated with text, and wherein determining similarity scores between a pair of user accounts comprises comparing text associated with each of the user accounts of the pair.

5. The computer-implemented method of claim 1, wherein the iterations are repeated until a current iteration determines a set of one or more connected components that are identical to the one or more connected components determined by a previous iteration.

6. The computer-implemented method of claim 1, further comprising: disabling one or more accounts from at least one connected component from the identified one or more connected components.

7. The computer-implemented method of claim 1, further comprising: sending a message to at least one of the user accounts of at least one connected component from the identified one or more connected components.

8. A non-transitory computer-readable storage medium comprising instructions executable by a processor, the instructions comprising: instructions for receiving a plurality of user accounts; for each of a plurality of pairs of user accounts, the pair of user accounts comprising a first user account and a second user account, instructions for determining a similarity score indicative of similarity between the first user account and the second user account; instructions for determining an initial threshold similarity score, wherein the initial threshold similarity score is indicative of a particular degree of similarity between user accounts; instructions for repeating for a plurality of iterations, wherein each iteration has a threshold similarity score, the threshold similarity score initialized to the initial threshold similarity score, the steps comprising: instructions for determining one or more connected components of a graph comprising nodes and edges, wherein each node represents a user account and a pair of nodes has an edge if the similarity score of the pair of nodes indicates greater degree of similarity than that indicated by the threshold similarity score; and instructions for modifying the threshold similarity score for the next iteration to a value indicative of higher degree of similarity between user accounts compared to the threshold similarity score for the current iteration; responsive to repeating the steps for the plurality of iterations, instructions for identifying one or more connected components, each identified connected component representing a set of user accounts for a particular user; and instructions for transmitting information describing the identified one or more connected components.

9. The non-transitory computer-readable storage medium of claim 8, wherein the instructions for each iteration further comprise: responsive to modifying the threshold similarity score, instructions for removing one or more edges with a similarity score indicative of a degree of similarity less than the modified threshold similarity score.

10. The non-transitory computer-readable storage medium of claim 8, wherein each user account is associated with an image, and wherein determining similarity scores between a pair of user accounts comprises performing facial recognition on the images and comparing results of facial recognition for the pair of user accounts.

11. The non-transitory computer-readable storage medium of claim 8, wherein each user account is associated with text, and wherein the instructions for determining similarity scores between a pair of user accounts comprise instructions for comparing text associated with each of the user accounts of the pair.

12. The non-transitory computer-readable storage medium of claim 8, wherein the iterations are repeated until a current iteration determines a set of one or more connected components that are identical to the one or more connected components determined by a previous iteration.

13. The non-transitory computer-readable storage medium of claim 8, the instructions further comprising: instructions for disabling one or more accounts from at least one connected component from the identified one or more connected components.

14. The non-transitory computer-readable storage medium of claim 8, the instructions further comprising: instructions for sending a message to at least one of the user accounts of at least one connected component from the identified one or more connected components.

15. A computer system comprising: a computer processor; and a non-transitory computer-readable storage medium storing instructions that when executed by the computer processor perform actions comprising: receiving a plurality of user accounts; for each of a plurality of pairs of user accounts, the pair of user accounts comprising a first user account and a second user account, determining a similarity score indicative of similarity between the first user account and the second user account; determining an initial threshold similarity score, wherein the initial threshold similarity score is indicative of a particular degree of similarity between user accounts; repeating for a plurality of iterations, wherein each iteration has a threshold similarity score, the threshold similarity score initialized to the initial threshold similarity score, the steps comprising: determining one or more connected components of a graph comprising nodes and edges, wherein each no
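The iteration recited in claims 1, 2 and 5 can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: it assumes scores in the range 0 to 1, a single global threshold raised by a fixed `step`, and a `max_iterations` cap. Those parameters, the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def components_at(nodes, scores, threshold):
    """Connected components using only edges whose score exceeds `threshold`."""
    adjacency = defaultdict(set)
    for (a, b), score in scores.items():
        if score > threshold:  # weaker edges are effectively removed (claim 2)
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, result = set(), set()
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        result.add(frozenset(group))
    return result


def find_duplicates(nodes, scores, initial_threshold, step=0.1, max_iterations=10):
    """Raise the threshold each iteration; stop once components repeat (claim 5)."""
    threshold = initial_threshold
    previous, current = None, set()
    for _ in range(max_iterations):
        current = components_at(nodes, scores, threshold)
        if current == previous:
            break  # same components as the previous iteration
        previous = current
        threshold += step  # demand a higher degree of similarity next time
    # Only groups with more than one account are candidate duplicates.
    return sorted(sorted(group) for group in current if len(group) > 1)


# Hypothetical accounts and pairwise scores (made-up values).
nodes = ["a", "b", "c", "d", "e"]
scores = {("a", "b"): 0.95, ("b", "c"): 0.72, ("c", "d"): 0.97, ("d", "e"): 0.55}
duplicates = find_duplicates(nodes, scores, initial_threshold=0.5)
```

With these sample values the first pass links all five accounts, the second drops the weak `d`-`e` edge, and the third reproduces the second pass's components, so the loop stops. A fixed step is the simplest choice for a sketch; the claims only require that each new threshold indicate a higher degree of similarity than the last.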

Assignees

Inventors

Classifications

  • by affine transforms, e.g. correction due to perspective effects; Quadrilaterals, e.g. trapezoids

  • Classification techniques

  • using neural networks

  • Combinations of networks

  • Matching criteria, e.g. proximity measures

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020311844A1 cover?
A system processes images of documents, for example, identification documents. The system transforms an image of a document to generate an image that represents the document in a canonical form. For example, if the input image has a document that is tilted at an angle with respect to the sides of the image, the system modifies the orientation of the document to show the document having sides ali…
Who is the assignee on this patent?
Uber Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06Q50/265. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 1, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).