System and method for generating best potential rectified data based on past recordings of data

US12183100B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12183100-B2
Application number: US-202217653999-A
Country: US
Kind code: B2
Filing date: Mar 8, 2022
Priority date: Jan 22, 2022
Publication date: Dec 31, 2024
Grant date: Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various methods, apparatuses/systems, and media for data processing are disclosed. A processor receives a digital document; applies an optical character recognition (OCR) algorithm on said received digital document by utilizing an OCR tool; identifies defective data extracted by the OCR tool resulted from relatively inferior image quality of the received digital document; implements an auto rectification algorithm on the identified defective data; automatically generates, in response to implementing the auto rectification algorithm, corresponding auto-rectified data for each identified defective data; records the defective data and corresponding auto-rectified data at a field level; receives user input data on said recorded auto-rectified data; determines whether the auto-rectified data is correct or not; and populates, based on determining that the auto-rectified data is correct, a machine learning model with said received user input data to be utilized for subsequently received digital document.
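The review loop in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `FieldRecord`, `FeedbackStore`, `process_fields`, `apply_review`, and the two toy rules (`looks_defective`, `fix_confusions`) are hypothetical names chosen for illustration, and the OCR step is assumed to have already produced a dictionary of field names and values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldRecord:
    """One defective field recorded together with its auto-rectified value."""
    name: str
    extracted: str                    # raw OCR output flagged as defective
    rectified: str                    # output of the auto-rectification step
    user_value: Optional[str] = None  # reviewer input, once received


@dataclass
class FeedbackStore:
    """Hypothetical stand-in for the data used to populate the ML model."""
    examples: list = field(default_factory=list)

    def add(self, name: str, defective: str, correct: str) -> None:
        self.examples.append((name, defective, correct))


def looks_defective(name: str, value: str) -> bool:
    # Toy rule (assumption): an invoice number is "INV-" followed by digits.
    return name == "invoice_no" and not value[4:].isdigit()


def fix_confusions(name: str, value: str) -> str:
    # Toy rectifier (assumption): undo common letter/digit OCR confusions.
    return value[:4] + value[4:].translate(str.maketrans("IOS", "105"))


def process_fields(ocr_fields, is_defective, rectify):
    """Flag defective fields and record each one with its rectified value."""
    return [
        FieldRecord(name, value, rectify(name, value))
        for name, value in ocr_fields.items()
        if is_defective(name, value)
    ]


def apply_review(record, accepted, store, corrected=None):
    """Store confirmed rectifications; keep user corrections on the record."""
    if accepted:
        record.user_value = record.rectified
        # Only confirmed-correct data reaches the store, as in the abstract.
        store.add(record.name, record.extracted, record.rectified)
    else:
        if corrected is None:
            raise ValueError("a rejected field needs a user-defined value")
        record.user_value = corrected


store = FeedbackStore()
records = process_fields(
    {"invoice_no": "INV-00I2", "total": "100.00"},
    looks_defective,
    fix_confusions,
)
apply_review(records[0], accepted=True, store=store)
```

In this sketch only the confirmed pair of defective and rectified values is kept for later training; a rejected field simply carries the reviewer's replacement value, which mirrors the last two steps of claim 1.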

First claim

Opening claim text (preview).

What is claimed is:

1. A method for data processing by utilizing one or more processors along with allocated memory, the method comprising: receiving a digital document; applying an optical character recognition (OCR) algorithm on said received digital document by utilizing an OCR tool; identifying defective data extracted by the OCR tool resulted from relatively inferior image quality of the received digital document; implementing an auto rectification algorithm on the identified defective data; automatically generating, in response to implementing the auto rectification algorithm, corresponding auto-rectified data for each identified defective data; recording the defective data and corresponding auto-rectified data at a field level; receiving user input data on said recorded auto-rectified data; determining whether the auto-rectified data is correct or not; populating, based on determining that the auto-rectified data is correct, a machine learning model with said received user input data to be utilized for subsequently received digital document; generating a plurality of first selectable icons, wherein each of said first selectable icon is configured to display corresponding auto-rectified field data when user input is received by clicking or hovering over the first selectable icon; receiving user input data that the auto-rectified field data is not correct based on user's comparing comparison of the auto-rectified field data with a corresponding original image data of the digital document; and receiving user input data indicating a user defined correct field data replacing the auto-rectified field data.

2. The method according to claim 1, wherein the defective data includes one or more of the following data: unwanted extraction data, partial data, incomplete data, junk data, and perfect but incomplete extraction data.

3. The method according to claim 1, further comprising: receiving user input data indicating approval of the auto-rectified field data when a difference between an auto-rectified data value and user input data value is equal to or more than a predetermined threshold value.

4. The method according to claim 1, further comprising: receiving user input data indicating disapproval of the auto-rectified field data when a difference between an auto-rectified data value and user input data value is below a predetermined threshold value.

5. The method according to claim 4, further comprising: generating a plurality of second selectable icons, wherein each of said second selectable icon is configured to display, upon receiving user input via clicking or hovering over the second selectable icon, corresponding suggested potential match field data for a corresponding disapproval of the auto-rectified field data based on historical patterns data that was generated previously in correcting the disapproved auto-rectified field data; receiving user input in approving the suggested potential match field data; and populating the machine learning model with said approved suggested potential match field data to be utilized for subsequently received digital document.

6. The method according to claim 5, wherein when suggested potential match field data is not available for a certain extracted or user populated data, the method further comprising: receiving user input data that accepts the certain extracted or the user populated data as a new field data for subsequent suggestions; and populating the machine learning model with said new field data to be utilized for subsequently received digital document.

7. A system for data processing, the system comprising: a processor; and a memory operatively connected to the processor via a communication interface, the memory storing computer readable instructions, when executed, causes the processor to: receive a digital document; apply an optical character recognition (OCR) algorithm on said received digital document by utilizing an OCR tool; identify defective data extracted by the OCR tool resulted from relatively inferior image quality of the received digital document; implement an auto rectification algorithm on the identified defective data; automatically generate, in response to implementing the auto rectification algorithm, corresponding auto-rectified data for each identified defective data; record the defective data and corresponding auto-rectified data at a field level; receive user input data on said recorded auto-rectified data; determine whether the auto-rectified data is correct or not; populate, based on determining that the auto-rectified data is correct, a machine learning model with said received user input data to be utilized for subsequently received digital document; generate a plurality of first selectable icons, wherein each of said first selectable icon is configured to display corresponding auto-rectified field data when user input is received by clicking or hovering over the first selectable icon; receive user input data that the auto-rectified field data is not correct based on user's comparing comparison of the auto-rectified field data with a corresponding original image data of the digital document; and receive user input data indicating a user defined correct field data replacing the auto-rectified field data.

8. The system according to claim 7, wherein the defective data includes one or more of the following data: unwanted extraction data, partial data, incomplete data, junk data, and perfect but incomplete extraction data.

9. The system according to claim 7, wherein the processor is further configured to: receive user input data indicating approval of the auto-rectified field data when a difference between an auto-rectified data value and user input data value is equal to or more than a predetermined threshold value.

10. The system according to claim 7, wherein the processor is further configured to: receive user input data indicating disapproval of the auto-rectified field data when a difference between an auto-rectified data value and user input data value is below a predetermined threshold value.

11. The system according to claim 10, wherein the processor is further configured to: generate a plurality of second selectable icons, wherein each of said second selectable icon is configured to display, upon receiving user input via clicking or hovering over the second selectable icon, corresponding suggested potential match field data for a corresponding disapproval of the auto-rectified field data based on historical patterns data that was generated previously in correcting the disapproved auto-rectified field data; receive user input in approving the suggested potential match field data; and populate the machine learning model with said approved suggested potential match field data to be utilized for subsequently received digital document.

12. The system according to claim 11, wherein when suggested potential match field data is not available for a certain extracted or user populated data, the processor is further configured to: receive user input data that accepts the certain extracted or the user populated data as a new field data for subsequent suggestions; and populate the machine learning model with said new field data to be utilized for subsequently received digital document.

13. A non-transitory computer readable medium configured to store instructions for data processing, wherein, when executed, the instructions cause a processor to perform the following: receiving a digital document; applying an optical character recognition (OCR) algorithm on said received digital document by utilizing an OCR tool; identifying
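
Dependent claims 3 through 6 add a threshold test and a history-based suggestion step. The sketch below is one possible reading, not the patented method. The claims do not say how the "difference" between the two values is measured, so the scoring function is passed in as a parameter, and the `similarity` helper (built on Python's `difflib`) is only an assumed choice. The comparison direction follows the claim wording literally: a score at or above the threshold counts as approval, a score below it as disapproval. `review_outcome` and `SuggestionHistory` are hypothetical names.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from typing import Callable, Optional


def similarity(a: str, b: str) -> float:
    # Assumed score: difflib's ratio, from 0.0 (no overlap) to 1.0 (identical).
    return SequenceMatcher(None, a, b).ratio()


def review_outcome(
    auto_value: str,
    user_value: str,
    threshold: float,
    score: Callable[[str, str], float],
) -> str:
    """Apply the threshold comparison as worded in claims 3 and 4."""
    if score(auto_value, user_value) >= threshold:
        return "approved"
    return "disapproved"


class SuggestionHistory:
    """Hypothetical store of past corrections, keyed by the rejected value."""

    def __init__(self) -> None:
        self._seen = defaultdict(Counter)

    def record(self, rejected: str, accepted: str) -> None:
        # Called when the user approves a suggestion (claim 5) or accepts
        # a value as new field data for later suggestions (claim 6).
        self._seen[rejected][accepted] += 1

    def suggest(self, rejected: str) -> Optional[str]:
        """Return the most frequent past correction, or None if none exists."""
        counts = self._seen.get(rejected)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


history = SuggestionHistory()
first_lookup = history.suggest("INV-00I2")   # nothing recorded yet
history.record("INV-00I2", "INV-0012")       # user accepts new field data
history.record("INV-00I2", "INV-0012")
history.record("INV-00I2", "INV-0072")
second_lookup = history.suggest("INV-00I2")  # most frequent past correction
```

Ranking candidates by how often they were accepted is a design assumption; the claims only require that suggestions come from "historical patterns data" generated while correcting earlier disapproved values.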

Assignees

JPMorgan Chase Bank NA

Inventors

Classifications

  • Interaction with lists of selectable items, e.g. menus · CPC title

  • Interactive pattern learning with a human teacher · CPC title

  • G06F40/166 · Primary

    Editing, e.g. inserting or deleting · CPC title

  • Validation; Performance evaluation · CPC title

  • using icons (graphical or visual programming using iconic symbols G06F8/34) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12183100B2 cover?
Various methods, apparatuses/systems, and media for data processing are disclosed. A processor receives a digital document; applies an optical character recognition (OCR) algorithm on said received digital document by utilizing an OCR tool; identifies defective data extracted by the OCR tool resulted from relatively inferior image quality of the received digital document; implements an auto rec…
Who is the assignee on this patent?
JPMorgan Chase Bank NA
What technology area does this patent fall under?
Primary CPC classification G06F40/166. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 31, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).