Systems and methods for automated document image orientation correction

US11776248B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11776248-B2
Application number  US-202218047045-A
Country             US
Kind code           B2
Filing date         Oct 17, 2022
Priority date       Jul 22, 2020
Publication date    Oct 3, 2023
Grant date          Oct 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are configured for correcting the orientation of an image data object subject to optical character recognition (OCR) by receiving an original image data object, generating initial machine readable text for the original image data object via OCR, generating an initial quality score for the initial machine readable text via machine-learning models, determining whether the initial quality score satisfies quality criteria, upon determining that the initial quality score does not satisfy the quality criteria, generating a plurality of rotated image data objects each comprising the original image data object rotated to a different rotational position, generating a rotated machine readable text data object for each of the plurality of rotated image data objects and generating a rotated quality score for each of the plurality of rotated machine readable text data objects, and determining that one of the plurality of rotated quality scores satisfies the quality criteria.
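
The control flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the function name `correct_orientation`, the pluggable `ocr`, `score`, and `rotate` callables, the 0.8 threshold, the 90/180/270 degree angles, and the convention that a higher score means better text are all assumptions introduced here. The toy helpers stand in for a real OCR engine and a trained quality model so that the example runs on its own.

```python
from typing import Callable, Optional, Sequence, Tuple

def correct_orientation(
    image,
    ocr: Callable,
    score: Callable[[str], float],
    rotate: Callable,
    threshold: float = 0.8,                   # assumed quality criterion
    angles: Sequence[int] = (90, 180, 270),   # assumed rotational positions
) -> Optional[Tuple[int, str, float]]:
    """Return (angle, text, score) for an orientation meeting the threshold, else None."""
    text = ocr(image)
    quality = score(text)
    if quality >= threshold:
        return 0, text, quality  # original orientation is already acceptable

    # Initial score failed: OCR and score every rotated copy.
    candidates = []
    for angle in angles:
        rotated_text = ocr(rotate(image, angle))
        candidates.append((angle, rotated_text, score(rotated_text)))

    best = max(candidates, key=lambda c: c[2])
    return best if best[2] >= threshold else None

# --- Toy stand-ins (hypothetical) so the sketch is runnable ---
def toy_rotate(image, angle):
    text, orientation = image
    return (text, (orientation + angle) % 360)

def toy_ocr(image):
    # Pretend OCR only reads upright images correctly; otherwise it garbles.
    text, orientation = image
    return text if orientation == 0 else text[::-1]

def toy_score(text):
    # Fraction of words found in a tiny word list.
    known = {"patient", "record"}
    words = text.split()
    return sum(w in known for w in words) / len(words) if words else 0.0

result = correct_orientation(("patient record", 90), toy_ocr, toy_score, toy_rotate)
print(result)
```

In this toy run the image starts 90 degrees off, so only the 270 degree rotation brings it upright and yields readable text. In practice the rotated text that passes would then be handed to downstream processing, which the claims describe as an NLP engine.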

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method comprising: providing, by one or more processors, a first rotated machine readable text data object of a plurality of rotated machine readable text data objects to a natural language processing (NLP) engine, wherein the first rotated machine readable text data object is generated by:
    (a) generating, by applying an optical character recognition (OCR) process, initial machine readable text for an original image data object,
    (b) generating, using one or more machine learning models, an initial quality score for the initial machine readable text, wherein the initial quality score indicates a probability that an error in the initial machine readable text is attributable to an image orientation associated with the original image data object,
    (c) responsive to a determination that the initial quality score does not satisfy one or more quality criteria, generating a plurality of rotated image data objects, wherein (i) each of the plurality of rotated image data objects corresponds to a different rotational position and (ii) each of the plurality of rotated image data objects comprises the original image data object rotated to a corresponding rotational position,
    (d) generating the plurality of rotated machine readable text data objects for the plurality of rotated image data objects,
    (e) generating, using one or more machine learning models, a plurality of rotated quality scores comprising a rotated quality score for each of the plurality of rotated machine readable text data objects, and
    (f) determining that a first rotated quality score of the plurality of rotated quality scores satisfies the one or more quality criteria, wherein (i) the first rotated quality score corresponds to the first rotated machine readable text data object and (ii) determining that the first rotated quality score satisfies the one or more quality criteria indicates that the first rotated machine readable text data object is to be provided to the NLP engine.

2. The computer-implemented method of claim 1, wherein generating the initial quality score comprises: identifying one or more words within the initial machine readable text based at least in part on a machine-learning model for identifying spaces between words; comparing each of the one or more words identified within the initial machine readable text against words within a dictionary retrieved for checking spelling within the initial machine readable text; generating a spelling error detection rate for the initial machine readable text; and determining the initial quality score based at least in part on the spelling error detection rate for the initial machine readable text.

3. The computer-implemented method of claim 2, further comprising: identifying, within metadata associated with the original image data object, a language associated with the original image data object; and retrieving the dictionary based at least in part on the language associated with the original image data object.

4. The computer-implemented method of claim 1, wherein generating the plurality of rotated image data objects comprises: generating a first rotated image data object comprising the original image data object rotated to a first rotational position; generating a second rotated image data object comprising the original image data object rotated to a second rotational position; generating a third rotated image data object comprising the original image data object rotated to a third rotational position; and storing each of the first rotated image data object, the second rotated image data object, and the third rotated image data object in association with the original image data object.

5. The computer-implemented method of claim 1, wherein generating the initial quality score for the initial machine readable text comprises: generating text metadata comprising text summarization metrics for the initial machine readable text; and processing the text metadata using one or more machine learning models to generate the initial quality score and associating the initial quality score with the initial machine readable text.

6. The computer-implemented method of claim 5, wherein the text summarization metrics comprise at least one of: a count of words not evaluated within the initial machine readable text, a count of words evaluated within the initial machine readable text, a count of words within the initial machine readable text not found in a dictionary, a count of words within the initial machine readable text found in the dictionary, a count of words within the initial machine readable text, or a count of space characters within the initial machine readable text.

7. A computing apparatus comprising memory and one or more processors communicatively coupled to the memory, the one or more processors configured to: provide a first rotated machine readable text data object of a plurality of rotated machine readable text data objects to a natural language processing (NLP) engine, wherein the first rotated machine readable text data object is generated by:
    (a) generating, by applying an optical character recognition (OCR) process, initial machine readable text for an original image data object,
    (b) generating, using one or more machine learning models, an initial quality score for the initial machine readable text, wherein the initial quality score indicates a probability that an error in the initial machine readable text is attributable to an image orientation associated with the original image data object,
    (c) responsive to a determination that the initial quality score does not satisfy one or more quality criteria, generating a plurality of rotated image data objects, wherein (i) each of the plurality of rotated image data objects corresponds to a different rotational position and (ii) each of the plurality of rotated image data objects comprises the original image data object rotated to a corresponding rotational position,
    (d) generating the plurality of rotated machine readable text data objects for the plurality of rotated image data objects,
    (e) generating, using one or more machine learning models, a plurality of rotated quality scores comprising a rotated quality score for each of the plurality of rotated machine readable text data objects, and
    (f) determining that a first rotated quality score of the plurality of rotated quality scores satisfies the one or more quality criteria, wherein (i) the first rotated quality score corresponds to the first rotated machine readable text data object and (ii) determining that the first rotated quality score satisfies the one or more quality criteria indicates that the first rotated machine readable text data object is to be provided to the NLP engine.

8. The computing apparatus of claim 7, wherein generating the initial quality score comprises: identifying one or more words within the initial machine readable text based at least in part on a machine-learning model for identifying spaces between words; comparing each of the one or more words identified within the initial machine readable text against words within a dictionary retrieved for checking spelling within the initial machine readable text; generating a spelling error detection rate for the initial machine readable text; and determining the initial quality score based at least in part on the spelling error detection rate for the initial machine readable text.

9. The computing apparatus of claim 8, wherein the one or more processors are further configured to: identify, within metadata associated with the original image data object, a language associated with the original image data object; and retrieve the dictionary based at l…
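
The dictionary comparison and the text summarization metrics named in the dependent claims can be illustrated with a small sketch. Several simplifications here are assumptions rather than anything the claims specify: words are found by plain whitespace splitting instead of a machine-learning model for identifying spaces, tokens containing non-letters are treated as "not evaluated", the dictionary is a hand-written set, and the names `TextMetrics` and `summarize_text` are hypothetical. The claims feed such metrics into machine learning models to produce the quality score; this sketch stops at the metrics and the spelling error rate.

```python
from dataclasses import dataclass
from typing import Set

@dataclass
class TextMetrics:
    words_total: int
    words_evaluated: int
    words_not_evaluated: int
    words_in_dictionary: int
    words_not_in_dictionary: int
    space_characters: int

    @property
    def spelling_error_rate(self) -> float:
        # Assumption: text with nothing to evaluate counts as fully erroneous.
        if self.words_evaluated == 0:
            return 1.0
        return self.words_not_in_dictionary / self.words_evaluated

def summarize_text(text: str, dictionary: Set[str]) -> TextMetrics:
    tokens = text.split()  # stand-in for the learned word-boundary model
    # Strip surrounding punctuation, lowercase, and evaluate only alphabetic tokens.
    cleaned = [t.strip(".,;:!?\"'()").lower() for t in tokens]
    evaluated = [t for t in cleaned if t.isalpha()]
    found = sum(1 for t in evaluated if t in dictionary)
    return TextMetrics(
        words_total=len(tokens),
        words_evaluated=len(evaluated),
        words_not_evaluated=len(tokens) - len(evaluated),
        words_in_dictionary=found,
        words_not_in_dictionary=len(evaluated) - found,
        space_characters=text.count(" "),
    )

# Hypothetical dictionary and OCR output for illustration only.
english = {"the", "patient", "was", "seen", "on", "by", "dr"}
m = summarize_text("The patient was seen on 12/03 by Dr. Smth", english)
print(m, m.spelling_error_rate)
```

Here the date token is skipped as not evaluated and "Smth" is the only evaluated word missing from the dictionary, so one of eight evaluated words counts as a spelling error. Text read from a sideways or upside-down page would typically push this rate far higher, which is what makes it usable as an input to an orientation-related quality score.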

Assignees

Optum Inc

Inventors

Classifications

  • G06V10/98 (Primary)

    Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns · CPC title

  • Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries · CPC title

  • Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11776248B2 cover?
It covers systems and methods for correcting the orientation of an image data object subject to optical character recognition (OCR). OCR text is generated for the original image and machine-learning models assign it a quality score. If that score does not satisfy the quality criteria, the image is rotated to several different positions, each rotated copy is run through OCR and scored, and a rotation whose score satisfies the criteria is identified.
Who is the assignee on this patent?
Optum Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/98. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 3, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).