Neural network based identification document processing system

US12182895B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12182895-B2
Application number  US-202217948068-A
Country             US
Kind code           B2
Filing date         Sep 19, 2022
Priority date       Mar 27, 2019
Publication date    Dec 31, 2024
Grant date          Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system processes images of documents, for example, identification documents. The system transforms an image of a document to generate an image that represents the document in a canonical form. For example, if the input image has a document that is tilted at an angle with respect to the sides of the image, the system modifies the orientation of the document to show the document having sides aligned with the sides of the image. The system stores user accounts that include user information including images. The system generates a graph of nodes that represent user accounts with edges determined based on similarity scores between user accounts. The system determines connected components of user accounts, such that each connected component represents user accounts that have a high likelihood of being duplicates.
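As a rough illustration of the "canonical form" idea in the abstract, the sketch below rotates the four corners of a tilted document so that its edges line up with the image axes. It is not the patented method: the function name `deskew_corners`, the corner ordering, and the assumption that corner coordinates are already known (for example, predicted by some upstream model) are all hypothetical choices made for this example.

```python
import math


def deskew_corners(corners):
    """Rotate document corners so the top edge is parallel to the x-axis.

    corners: four (x, y) points ordered top-left, top-right,
    bottom-right, bottom-left (an assumed convention).
    Returns the estimated tilt angle in radians and the rotated corners.
    """
    (x0, y0), (x1, y1) = corners[0], corners[1]
    # Tilt of the top edge relative to the horizontal image axis.
    angle = math.atan2(y1 - y0, x1 - x0)

    # Rotate about the centroid so the document stays in place.
    cx = sum(x for x, _ in corners) / len(corners)
    cy = sum(y for _, y in corners) / len(corners)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)

    rotated = []
    for x, y in corners:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return angle, rotated
```

A rotation about a point is one kind of affine transformation, which matches one of the classification titles listed for this publication. A real pipeline would apply the same transform to the pixels, not only to the corner points.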

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for identifying duplicate user accounts comprising: receiving a plurality of user accounts, each user account associated with both text data and image data; for each of a plurality of pairs of user accounts, the pair of user accounts comprising a first user account and a second user account, determining a similarity score indicative of similarity between the first user account and the second user account, wherein the similarity score is determined based on processing text data and image data associated with the first user account and the second user account; determining an initial threshold similarity score, wherein the initial threshold similarity score is indicative of a particular degree of similarity between user accounts; determining a graph comprising nodes and edges, wherein each node represents a user account, and a pair of nodes has an edge if the similarity score of the pair of nodes indicates a greater degree of similarity than that indicated by the initial threshold similarity score; repeating for a plurality of iterations, wherein each iteration has a threshold similarity score different from a threshold similarity score in a previous iteration, each of the plurality of iterations comprising following steps: modifying the threshold similarity score for a current iteration to a value indicative of a higher degree of similarity between user accounts compared to the threshold similarity score for a previous iteration; and modifying edges of the graph by pruning edges from a previous iteration of the graph that connect pairs of nodes with similarity scores below the modified threshold similarity score; responsive to repeating the steps for the plurality of iterations, identifying one or more subgraphs within the graph, each including connected nodes, each identified subgraph representing a set of user accounts associated with a particular user; and transmitting information describing the set of user accounts associated with the particular user.

2. The computer-implemented method of claim 1, wherein each iteration further comprises: responsive to modifying the threshold similarity score, removing one or more edges with a similarity score indicative of a degree of similarity less than the modified threshold similarity score.

3. The computer-implemented method of claim 1, wherein each user account is associated with an image, and wherein determining similarity scores between a pair of user accounts comprises performing facial recognition on the images and comparing results of facial recognition for the pair of user accounts.

4. The computer-implemented method of claim 1, wherein each user account is associated with text, and wherein determining similarity scores between a pair of user accounts comprises comparing text associated with each of the user accounts of the pair.

5. The computer-implemented method of claim 1, wherein the iterations are repeated until a current iteration determines a set of one or more connected components that are identical to the one or more connected components determined by a previous iteration.

6. The computer-implemented method of claim 1, further comprising: disabling one or more accounts associated with nodes of at least one of the one or more identified subgraphs.

7. The computer-implemented method of claim 1, further comprising: sending a message to at least one of the user accounts associated with nodes of at least one of the one or more identified subgraphs.

8. A non-transitory computer-readable storage medium comprising instructions executable by a processor, the instructions comprising: instructions for receiving a plurality of user accounts, each user account associated with both text data and image data; for each of a plurality of pairs of user accounts, the pair of user accounts comprising a first user account and a second user account, instructions for determining a similarity score indicative of similarity between the first user account and the second user account, wherein the similarity score is determined based on processing text data and image data associated with the first user account and the second user account; instructions for determining an initial threshold similarity score, wherein the initial threshold similarity score is indicative of a particular degree of similarity between user accounts; instructions for determining a graph comprising nodes and edges, wherein each node represents a user account, and a pair of nodes has an edge if the similarity score of the pair of nodes indicates a greater degree of similarity than that indicated by the initial threshold similarity score; instructions for repeating for a plurality of iterations, wherein each iteration has a threshold similarity score different from a threshold similarity score in a previous iteration, each of the plurality of iterations comprising following steps: instructions for modifying the threshold similarity score for a current iteration to a value indicative of a higher degree of similarity between user accounts compared to the threshold similarity score for a previous iteration; and instructions for modifying edges of the graph by pruning edges from a previous iteration of the graph that connect pairs of nodes with similarity scores below the modified threshold similarity score; responsive to repeating the steps for the plurality of iterations, instructions for identifying one or more subgraphs within the graph, each including connected nodes, each identified subgraph representing a set of user accounts associated with a particular user; and instructions for transmitting information describing the set of user accounts associated with the particular user.

9. The non-transitory computer-readable storage medium of claim 8, wherein the instructions for each iteration further comprise: responsive to modifying the threshold similarity score, instructions for removing one or more edges with a similarity score indicative of a degree of similarity less than the modified threshold similarity score.

10. The non-transitory computer-readable storage medium of claim 8, wherein each user account is associated with an image, and wherein determining similarity scores between a pair of user accounts comprises performing facial recognition on the images and comparing results of facial recognition for the pair of user accounts.

11. The non-transitory computer-readable storage medium of claim 8, wherein each user account is associated with text, and wherein the instructions for determining similarity scores between a pair of user accounts comprise instructions for comparing text associated with each of the user accounts of the pair.

12. The non-transitory computer-readable storage medium of claim 8, wherein the iterations are repeated until a current iteration determines a set of one or more connected components that are identical to the one or more connected components determined by a previous iteration.

13. The non-transitory computer-readable storage medium of claim 8, the instructions further comprising: instructions for disabling one or more accounts associated with nodes of at least one of the one or more identified subgraphs.

14. The non-transitory computer-readable storage medium of claim 8, the instructions further comprising: instructions for sending a message to at least one of the user accounts associated with nodes of at least one of the one or more identified subgraphs.

15. A computer system comprising: a computer processor; and a non-transitory computer-readable storage medium storing instructions that whe
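The graph procedure recited in claim 1, together with the stopping rule of claim 5, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patentee's implementation: the pairwise similarity scores are taken as already computed (the claims derive them from text and image data), and the names `find_duplicate_accounts`, `connected_components`, `step`, and `max_iterations` are hypothetical, as is the choice of a fixed additive step for raising the threshold.

```python
def connected_components(nodes, edges):
    """Group nodes into connected components with a simple union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return {frozenset(g) for g in groups.values()}


def find_duplicate_accounts(scores, initial_threshold, step, max_iterations=10):
    """scores maps (account_a, account_b) pairs to a similarity score.

    Returns sorted groups of accounts that remain connected after pruning.
    """
    nodes = sorted({n for pair in scores for n in pair})
    threshold = initial_threshold
    # Initial graph: an edge wherever the score exceeds the threshold.
    edges = {pair for pair, s in scores.items() if s > threshold}
    components = connected_components(nodes, edges)

    for _ in range(max_iterations):
        # Raise the threshold, then prune edges that fall below it.
        threshold += step
        edges = {pair for pair in edges if scores[pair] >= threshold}
        new_components = connected_components(nodes, edges)
        # Stop once the components no longer change (as in claim 5).
        if new_components == components:
            break
        components = new_components

    # Only groups with more than one account are candidate duplicates.
    return sorted(sorted(c) for c in components if len(c) > 1)


if __name__ == "__main__":
    example_scores = {
        ("a", "b"): 0.95,
        ("b", "c"): 0.60,
        ("c", "d"): 0.92,
        ("a", "d"): 0.30,
    }
    print(find_duplicate_accounts(example_scores, 0.5, 0.2))
```

In the example, all four accounts start in one component because the weak `b`-`c` link clears the initial threshold; raising the threshold prunes that link and leaves two separate groups. Starting loose and tightening gradually is what keeps a single marginal match from merging otherwise unrelated accounts.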

Assignees

Uber Technologies Inc

Classifications

CPC classification titles for this publication.

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

  • Affine transformations (for image registration G06T3/147; for image mosaicing G06T3/4038)

  • using neural networks

  • Classification techniques

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12182895B2 cover?
A system processes images of documents, for example, identification documents. The system transforms an image of a document to generate an image that represents the document in a canonical form. For example, if the input image has a document that is tilted at an angle with respect to the sides of the image, the system modifies the orientation of the document to show the document having sides ali…
Who is the assignee on this patent?
Uber Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06Q50/265. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 31, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).