Method and system for evaluating quality of a document

US2024177285A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2024177285-A1
Application number  US-202318120363-A
Country             US
Kind code           A1
Filing date         Mar 11, 2023
Priority date       Nov 28, 2022
Publication date    May 30, 2024
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system of determining quality of a document image is disclosed that includes segmenting, by one or more processors, a document image into a plurality of regions each of which comprises text data. The plurality of regions is classified into one of a plurality of image quality classes based on a determination of a highest prediction value from one of a plurality of machine learning models. The plurality of machine learning models is trained corresponding to one of the plurality of image quality classes. A cumulative quality score for the image is computed based on a weighted average of a number of regions classified into each of the plurality of image quality classes. The quality of the image is determined based on the cumulative quality score.
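
The classification and scoring steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the models, the class names, or the weights, so everything concrete below is an assumption made for illustration: the three class names (borrowed from the dependent claims), the hypothetical `CLASS_WEIGHTS` values, and the plain dictionaries of prediction values that stand in for the outputs of the per-class machine learning models. The score is read here as a weighted average over the per-class region counts, which is one plausible interpretation rather than the publication's stated formula.

```python
from collections import Counter

# Hypothetical weight per quality class; the publication gives no values.
CLASS_WEIGHTS = {"good": 1.0, "medium": 0.5, "bad": 0.0}


def classify_region(predictions):
    """Return the class whose model produced the highest prediction value."""
    return max(predictions, key=predictions.get)


def cumulative_quality_score(region_predictions, weights=CLASS_WEIGHTS):
    """Weighted average of the number of regions assigned to each class."""
    if not region_predictions:
        raise ValueError("at least one region is required")
    counts = Counter(classify_region(p) for p in region_predictions)
    total = sum(counts.values())
    return sum(weights[cls] * n for cls, n in counts.items()) / total


# One dict per segmented text region: class name -> prediction value from
# the model trained for that class (made-up numbers, not from the patent).
regions = [
    {"good": 0.9, "medium": 0.3, "bad": 0.1},
    {"good": 0.2, "medium": 0.7, "bad": 0.4},
    {"good": 0.1, "medium": 0.2, "bad": 0.8},
    {"good": 0.8, "medium": 0.6, "bad": 0.1},
]
score = cumulative_quality_score(regions)
print(score)
```

With these assumed weights the score falls between 0 (every region classified as bad) and 1 (every region classified as good); a final quality decision would then compare the score against a cut-off that the abstract leaves unspecified.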

First claim

Opening claim text (preview).

What is claimed is:

1. A method of determining quality of a document image, the method comprising: segmenting, by a computing device, the document image into a plurality of regions, wherein each of the plurality of regions comprises text data; classifying, by the computing device, each of the plurality of regions into one of a plurality of image quality classes, wherein each of the plurality of regions is classified based on a determination of a highest prediction value from one of a plurality of machine learning models, and wherein each of the plurality of machine learning models is trained corresponding to one of the plurality of image quality classes; computing, by the computing device, a cumulative quality score for the document image based on a weighted average of a number of regions classified into each of the plurality of image quality classes; and determining, by the computing device, the quality of the document image based on the cumulative quality score.

2. The method of claim 1, wherein the training of each of the plurality of machine learning models corresponding to one of the plurality of image quality classes comprises determining a training dataset for each of the plurality of image quality classes.

3. The method of claim 2, wherein the determination of the training dataset for each of the plurality of image quality classes comprises: segmenting, by a computing device, a training image into a plurality of regions, wherein each of the plurality of regions comprises text data; for each of the plurality of regions, performing, by the computing device, optical character recognition (OCR) using two or more OCR systems to determine corresponding two or more OCR text data; determining, by the computing device, text matching scores based on a comparison among the two or more OCR text data using a plurality of string matching techniques; determining, by the computing device, a plurality of threshold values for the plurality of image quality classes based on a statistical analysis of the text matching scores based on the plurality of string matching techniques and for the plurality of regions; and clustering, by the computing device, the plurality of regions into one of the plurality of image quality classes based on the plurality of threshold values.

4. The method of claim 3, wherein the determination of the plurality of threshold values comprises: computing, for each of the plurality of regions, a minimum text matching score, a maximum text matching score, and an average text matching score based on the text matching scores for the plurality of string matching techniques; and determining the plurality of threshold values based on a statistical calculation of the minimum, the maximum, and the average text matching scores for the plurality of regions.

5. The method of claim 3, wherein the plurality of threshold values comprises a lower threshold value and an upper threshold value, and wherein the plurality of image quality classes comprises a bad image quality class for document images for the average text matching score less than or equal to the lower threshold value, a good image quality class for the average text matching score greater than or equal to the upper threshold value, and a medium image quality class for the average text matching score greater than the lower threshold value and less than the upper threshold value.

6. The method of claim 1, wherein the plurality of regions comprises at least one of a region with word level text data, a region with sentence level text data, a region with paragraph level text data, and a region with page level text data.

7. The method of claim 1, wherein the computing device is configured to determine quality of a document comprising a plurality of document images by: computing a cumulative quality score for each of the plurality of document images based on the weighted average of the number of regions classified into each of the plurality of image quality classes; and determining the quality of the document based on the cumulative quality score for each of the plurality of document images.

8. The method of claim 1, wherein the computing device is configured to re-train the plurality of machine learning models based on a variance in a prediction value from each of the plurality of machine learning models.

9. A system for determining quality of a document image, comprising: one or more processors; a memory communicatively coupled to the processors, wherein the memory stores a plurality of processor-executable instructions, which, upon execution, cause the processors to: segment the document image into a plurality of regions, wherein each of the plurality of regions comprises text data; classify each of the plurality of regions into one of a plurality of image quality classes, wherein each of the plurality of regions is classified based on a determination of a highest prediction value from one of a plurality of machine learning models, and wherein each of the plurality of machine learning models is trained corresponding to one of the plurality of image quality classes; compute a cumulative quality score for the document image based on a weighted average of a number of regions classified into each of the plurality of image quality classes; and determine the quality of the document image based on the cumulative quality score.

10. The system of claim 9, wherein the training of each of the plurality of machine learning models corresponding to one of the plurality of image quality classes comprises determining a training dataset for each of the plurality of image quality classes.

11. The system of claim 9, wherein the one or more processors are further configured to determine the training dataset for each of the plurality of image quality classes by: segmenting, by a computing device, a training image into a plurality of regions, wherein each of the plurality of regions comprises text data; for each of the plurality of regions, performing optical character recognition (OCR) using two or more OCR systems to determine corresponding two or more OCR text data; determining text matching scores based on a comparison among the two or more OCR text data using a plurality of string matching techniques; determining a plurality of threshold values for the plurality of image quality classes based on a statistical analysis of the text matching scores for the plurality of string matching techniques and for the plurality of regions; and clustering the plurality of regions into one of the plurality of image quality classes based on the plurality of threshold values.

12. The system of claim 11, wherein the determination of the plurality of threshold values comprises: computing, for each of the plurality of regions, a minimum text matching score, a maximum text matching score, and an average text matching score based on the text matching scores for the plurality of string matching techniques; and determining the plurality of threshold values based on a statistical calculation of the minimum, the maximum, and the average text matching scores for the plurality of regions.

13. The system of claim 11, wherein the plurality of threshold values comprises a lower threshold value and an upper threshold value, and wherein the plurality of image quality classes comprises a bad image quality class for document images for the average text matching score less than or equal to the lower threshold value, a good image quality class for the average text matching score greater than or equal to the upper threshold value, and a medium image quality class

Assignees

L&T Technology Services Ltd

Inventors

Classifications

  • G06T7/0002 (Primary)

    Inspection of images, e.g. flaw detection · CPC title

  • Region-based segmentation · CPC title

  • using clustering, e.g. of similar faces in social networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Proximity measures, i.e. similarity or distance measures · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024177285A1 cover?
A method and system of determining quality of a document image is disclosed that includes segmenting, by one or more processors, a document image into a plurality of regions each of which comprises text data. The plurality of regions is classified into one of a plurality of image quality classes based on a determination of a highest prediction value from one of a plurality of machine learning m…
Who is the assignee on this patent?
L&T Technology Services Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/0002. Mapped technology areas include Physics.
When was this patent published?
Publication date May 30, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).