Recognizing text in image data
US-2024346069-A1 · Oct 17, 2024 · US
US2022309813A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022309813-A1 |
| Application number | US-202217841571-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 15, 2022 |
| Priority date | Dec 20, 2019 |
| Publication date | Sep 29, 2022 |
| Grant date | — |
Official abstract text for this publication.
Computer systems and methods are provided for extracting information from an image of a document. A computer system receives image data, the image data including an image of a document. The computer system determines a portion of the received image data that corresponds to a predefined document field. The computer system utilizes a neural network system to assign a label to the determined portion of the received image data. The computer system performs text recognition on the portion of the received image data and stores the recognized text in association with the assigned label.
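The five steps named in the abstract (receive image data, locate a field region, label it, recognize its text, store the text under the label) can be sketched as a small pipeline. This is a minimal illustration, not the patent's implementation: `extract_fields`, `crop`, and the three injected callables (`find_field_boxes`, `assign_label`, `recognize_text`) are hypothetical names, and the image is modeled as a plain list of pixel rows so that the example stays self-contained.

```python
from typing import Callable, Dict, List, Tuple

# Assumed representations, chosen only for this sketch.
Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Image = List[List[int]]           # rows of grayscale pixel values


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box (right and bottom are exclusive)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def extract_fields(
    image: Image,
    find_field_boxes: Callable[[Image], List[Box]],  # hypothetical field locator
    assign_label: Callable[[Image], str],            # hypothetical neural labeler
    recognize_text: Callable[[Image], str],          # hypothetical OCR step
) -> Dict[str, str]:
    """Label each located region, recognize its text, and store text by label."""
    store: Dict[str, str] = {}
    for box in find_field_boxes(image):
        region = crop(image, box)
        label = assign_label(region)
        store[label] = recognize_text(region)
    return store


if __name__ == "__main__":
    # Toy stand-ins for the three stages, for demonstration only.
    image = [[0, 1, 2, 3], [4, 5, 6, 7]]
    result = extract_fields(
        image,
        find_field_boxes=lambda img: [(0, 0, 2, 1), (2, 1, 4, 2)],
        assign_label=lambda region: "name" if region[0][0] == 0 else "date",
        recognize_text=lambda region: "".join(str(v) for v in region[0]),
    )
    print(result)
```

Passing the three stages in as callables keeps the sketch independent of any particular detector, network, or OCR engine, none of which the abstract specifies.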
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: at a server system including one or more processors and memory storing one or more programs for execution by the one or more processors: receiving image data, the image data including an image of a document; determining a portion of the received image data that corresponds to a predefined document field; utilizing a neural network system to assign a label to the determined portion of the received image data; performing text recognition on the portion of the received image data; and storing recognized text in association with the assigned label.
2. The method of claim 1, further comprising: after receiving the image data, determining a document type corresponding to the image of the document, the document type including document characteristics for the document type.
3. The method of claim 1, wherein the predefined document field corresponds to at least one of a name, a location, a date, a document type, or a document number.
4. The method of claim 2, further comprising: comparing the portion of the received image data with the document type to determine respective document characteristics for the portion of the received image data.
5. The method of claim 4, further comprising: generating, based on the respective document characteristics for the portion of the received image data, sanitized document information, wherein the sanitized document stores the document characteristics corresponding to the document type with a predetermined format.
6. The method of claim 2, wherein the document characteristics corresponding to the document type include a predetermined layout for the document type, wherein the predetermined layout includes a landscape layout or a portrait layout.
7. The method of claim 2, wherein the document characteristics corresponding to the document type include at least one or more selected from the group consisting of: one or more anchors, date format, or text format.
8. The method of claim 1, further comprising: before performing the text recognition on the portion of the received image data, determining a position of the image of the document; determining whether the position of the image of the document meets orientation criteria; in accordance with a determination that the position of the document meets the orientation criteria, performing the text recognition on the portion of the received image data.
9. The method of claim 8, wherein determining the position of the image of the document includes: identifying respective corners of the image of the document; and comparing the respective corners of the image of the document with document characteristics corresponding to a document type to determine the position of the document.
10. The method of claim 8, wherein: the image of the document includes facial image data; and determining the position for the image of the document includes: determining one or more facial features corresponding to the facial image data; and determining the position of the image of the document of the image data based on the one or more facial features.
11. The method of claim 8, wherein: the predefined document field includes text; and determining the position for the image of the document includes: determining, based on the predefined document field, a text position; and determining the position of the image of the document based on the text position.
12. The method of claim 8, wherein determining the position of the image of the document includes cropping the image of the document in the image data.
13. The method of claim 8, further comprising: in accordance with a determination that the position of the image of the document does not meet the orientation criteria, adjusting the image of the document to satisfy the orientation criteria; and in accordance with a determination that the position corresponding to the adjusted image of the document meets the orientation criteria, performing the text recognition on the adjusted image of the document.
14. The method of claim 1, further comprising: determining a saliency value for the predefined document field; determining whether the saliency value for the predefined document field meets a predetermined saliency threshold; and in accordance with a determination that the saliency value does not meet the predetermined saliency threshold, requesting new image data that includes an image of the document.
15. The method of claim 14, further comprising: in accordance with a determination that the saliency value meets the predetermined saliency threshold, generating a bounding box for the predefined document field; and performing the text recognition on the generated bounding box.
16. The method of claim 15, wherein utilizing the neural network system to assign the label to the determined portion of the received image data includes: determining a label for the generated bounding box; and assigning the label to the generated bounding box.
17. The method of claim 16, wherein the neural network system includes at least one recurrent neural network (RNN), or a convolutional neural network (CNN).
18. The method of claim 17, wherein the neural network system includes a plurality of neural networks, the plurality of neural networks including both the RNN and the CNN.
19. The method of claim 17, wherein: the neural network system includes a plurality of neural networks, the plurality of neural networks including both the RNN and the CNN; and determining the label for the generated bounding box includes: determining, using a first neural network of the plurality of neural networks, a first label for the generated bounding box; determining, using a second neural network of the plurality of neural networks, a second label for the generated bounding box; and comparing the first label and the second label to determine whether the first label and the second label match; and in accordance with a determination that the first label and the second label match, assigning the first label or the second label to the generated bounding box.
20. The method of claim 19, further comprising in accordance with a determination that the first label and the second label do not match, assigning a respective label of the first label or the second label with a highest relevance score.
21. The method of claim 16, wherein the neural network system includes a registration system, the registration system including a first template, wherein the first template includes a first predetermined label, the first predetermined label associated with a first predetermined label location; and determining the label for the generated bounding box includes: determining whether the first predetermined label corresponds to the generated bounding box by superimposing the first template over the image of the document; comparing the predetermined label location with the generated bounding box to determine a template value; determining whether the template value meets similarity threshold; and in accordance with a determination that the template value meets the similarity threshold, determining a relevant label based on the first predetermined label.
in accordance with a determination that the template value for the first template does not meet the similarity threshold, determining the label for the generated bounding box based on a second template.
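Two of the labeling strategies in the claims lend themselves to a short sketch: reconciling the labels produced by two networks (claims 19 and 20), and falling back from one template to the next when a template value misses the similarity threshold (claim 21 and the passage that follows it). The code below is a rough illustration under stated assumptions, not the claimed implementation. The function names are hypothetical, each template is reduced to a single `(label, location)` pair, and intersection-over-union is an assumed stand-in for the "template value", which the claims do not define.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]   # (left, top, right, bottom)
ScoredLabel = Tuple[str, float]           # (label, relevance score)


def reconcile_labels(first: ScoredLabel, second: ScoredLabel) -> str:
    """Keep the label if both networks agree; otherwise take the higher score."""
    if first[0] == second[0]:
        return first[0]
    # On a tie in score, max() returns the first argument.
    return max(first, second, key=lambda scored: scored[1])[0]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (assumed similarity measure)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def label_from_templates(
    box: Box,
    templates: List[Tuple[str, Box]],  # (predetermined label, label location)
    threshold: float = 0.5,            # assumed similarity threshold
) -> Optional[str]:
    """Try templates in order; return the first label whose location matches."""
    for label, location in templates:
        if iou(location, box) >= threshold:
            return label
    return None  # no template met the threshold


if __name__ == "__main__":
    print(reconcile_labels(("name", 0.6), ("date", 0.9)))
    templates = [("date", (20, 20, 30, 30)), ("name", (0, 0, 10, 9))]
    print(label_from_templates((0, 0, 10, 10), templates))
```

Returning `None` when no template matches is a choice made for this sketch; the claims do not say what happens after every template has been tried.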
Classification of content, e.g. text, photographs or tables · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
Smoothing the distance, e.g. radial basis function networks [RBFN] · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Combinations of networks · CPC title