Method and system for detecting and correcting orientation of document images

US12555261B2 · US · B2

Patent metadata
Publication number: US-12555261-B2
Application number: US-202318120364-A
Country: US
Kind code: B2
Filing date: Mar 11, 2023
Priority date: Nov 24, 2022
Publication date: Feb 17, 2026
Grant date: Feb 17, 2026


Abstract

Official abstract text for this publication.

This disclosure relates to method and system for detecting orientation. The method includes detecting a plurality of regions in a document image, each region including text data, and determining positional information of each of the regions; for each of the plurality of regions, determining a region orientation to be one of first orientation or second orientation based on height and width of the region; determining a ratio of number of regions having first orientation and number of regions having second orientation; determining page orientation of the image as third orientation or second orientation, or rotating the image by 90° in counter-clockwise direction based on the ratio; determining first optical character recognition (OCR) data and second OCR data corresponding to the image and the image rotated by 180°, respectively; and determining number of correct words in first OCR data and second OCR data based on comparison with dictionary data.
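The region-voting step summarized above can be illustrated with a minimal sketch. It uses the 3:1 height/width tests and the second-to-first ratio stated in claim 1. Everything else is an assumption made for illustration: the `Region` type, the function names, the returned labels, and the handling of cases the claims leave open (regions that pass neither test are ignored, a ratio exactly equal to the threshold is treated as "rotate", and a page with no elongated regions is treated as upright or flipped). The threshold value is not given in the source, so the caller must supply it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical container for a text region's positional information."""
    x: int       # x coordinate of the top-left corner
    y: int       # y coordinate of the top-left corner
    width: int
    height: int


def region_orientation(region: Region) -> Optional[int]:
    """Return 90 (first orientation), 180 (second orientation), or None."""
    if region.height > 3 * region.width:
        return 90    # tall, narrow region
    if region.width > 3 * region.height:
        return 180   # wide, flat region
    return None      # assumption: regions passing neither test are not counted


def page_decision(regions: Iterable[Region], threshold: float) -> str:
    """Decide whether the page needs a 90-degree counter-clockwise rotation.

    Returns one of two hypothetical labels:
      "upright_or_flipped": page is at 0 or 180 degrees (resolved later by OCR)
      "rotate_90_ccw":      page should be rotated by 90 degrees counter-clockwise
    """
    orientations = [region_orientation(r) for r in regions]
    n_first = orientations.count(90)
    n_second = orientations.count(180)

    # With no first-orientation regions the ratio is not computed at all.
    if n_first == 0:
        return "upright_or_flipped"
    if n_second == 0:
        return "rotate_90_ccw"

    ratio = n_second / n_first
    if ratio > threshold:
        return "upright_or_flipped"
    # Assumption: ratio == threshold is handled like ratio < threshold.
    return "rotate_90_ccw"
```

For example, a page whose detected regions are mostly wide text lines yields a large ratio and is kept for the later 0° versus 180° check, while a page dominated by tall regions is rotated first.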

First claim

Opening claim text (preview).

What is claimed is:

1. A method of detecting orientation of a document image, comprising: detecting, by a computing device, a plurality of regions in the document image, each region comprising text data; determining, by the computing device, positional information of each of the regions, wherein the positional information comprises x and y coordinates of a top left corner and width and height of each of the plurality of regions; for each of the plurality of regions: determining, by the computing device, a region orientation to be one of: a first orientation if height of a region is greater than three times width of the region, wherein the first orientation is equal to 90 degrees, or a second orientation if the width of the region is greater than three times the height of the region, wherein the second orientation is equal to 180 degrees; determining, by the computing device, a ratio of a number of regions determined as having the second orientation to a number of regions determined as having the first orientation; determining, by the computing device, a page orientation of the document image as a third orientation or the second orientation if the ratio is greater than a pre-determined threshold or if the number of regions determined as having the first orientation is zero; wherein the third orientation is equal to 0 degree; and wherein the ratio is computed if the number of regions determined as having the first orientation is non-zero and if the number of regions determined as having the first orientation is zero, the page orientation is determined as the third orientation or the second orientation without computing the ratio; rotating, by the computing device, the document image by 90 degrees in counter-clockwise direction if the ratio is less than the pre-determined threshold or if the number of regions determined as having the second orientation is zero; determining, by the computing device: a first OCR data by performing optical character recognition (OCR) of each of the regions of the document image, and a second OCR data by performing optical character recognition (OCR) of each of the regions of the document image rotated by 180 degrees; determining, by the computing device, a number of correct words in each of the first OCR data and the second OCR data based on a comparison with a dictionary data; determining, by the computing device, the page orientation as the third orientation if the number of correct words in the first OCR data is greater than the number of correct words in the second OCR data; and rotating, by the computing device, the document image by 180 degrees if the number of correct words in the first OCR data is less than the number of correct words in the second OCR data.

2. The method of claim 1, wherein the region orientation is determined based on contour detection.

3. The method of claim 1, wherein the dictionary data comprises data of one or more languages.

4. The method of claim 1, wherein the pre-determined threshold is determined based on an AI-based contour detection model trained based on training data with respect to the first orientation, the second orientation and the third orientation.

5. A system to detect and correct orientation of a document image, comprising: one or more processors; a memory communicatively coupled to the processors, wherein the memory stores a plurality of processor-executable instructions, which, upon execution, cause the processors to: detect a plurality of regions in the document image, each region comprising text data; determine positional information of each of the regions, wherein the positional information comprises x and y coordinates of a top left corner and width and height of each of the plurality of regions; for each of the plurality of regions: determine a region orientation to be one of: a first orientation if height of a region is greater than three times width of the region, wherein the first orientation is equal to 90 degrees, or a second orientation if the width of the region is greater than three times the height of the region, wherein the second orientation is equal to 180 degrees; determine a ratio of a number of regions determined as having the second orientation to a number of regions determined as having the first orientation; determine a page orientation of the document image as a third orientation or the second orientation if the ratio is greater than a pre-determined threshold or if the number of regions determined as having the first orientation is zero; wherein the third orientation is equal to 0 degree; and wherein the ratio is computed if the number of regions determined as having the first orientation is non-zero and if the number of regions determined as having the first orientation is zero, the page orientation is determined as the third orientation or the second orientation without computing the ratio; rotate the document image by 90 degrees in counter-clockwise direction if the ratio is less than the pre-determined threshold or if the number of regions determined as having the second orientation is zero; determine: a first OCR data by performing optical character recognition (OCR) of each of the regions of the document image, and a second OCR data by performing optical character recognition (OCR) of each of the regions of a rotated document image rotated by 180 degrees; determine a number of correct words in each of the first OCR data and the second OCR data based on a comparison with a dictionary data; determine the page orientation as the third orientation if the number of correct words in the first OCR data is greater than the number of correct words in the second OCR data; and rotate the document image by 180 degrees if the number of correct words in the first OCR data is less than the number of correct words in the second OCR data.

6. The system of claim 5, wherein the region orientation is determined based on contour detection.

7. The system of claim 5, wherein the dictionary data comprises data of one or more languages.

8. The system of claim 5, wherein the pre-determined threshold is determined based on an AI-based contour detection model trained based on training data with respect to the first orientation, the second orientation and the third orientation.

9. A non-transitory computer-readable medium storing computer-executable instructions for detecting orientation of a document image, the computer-executable instructions configured for: detecting a plurality of regions in the document image, each region comprising text data; determining positional information of each of the regions, wherein the positional information comprises x and y coordinates of a top left corner and width and height of each of the plurality of regions; for each of the plurality of regions: determining a region orientation to be one of: a first orientation if height of a region is greater than three times width of the region, wherein the first orientation is equal to 90 degrees, or a second orientation if the width of the region is greater than three times the height of the region, wherein the second orientation is equal to 180 degrees; determining a ratio of a number of regions determined as having the second orientation to a number of regions determined as having the first orientation; determining a page orientation of the document image as a third orientation or the second orientation if the ratio is greater than a pre-determined threshold or if the number of regions determined as having the first orientation is zero; wherein the third orientation is equal to 0 degree; and wherein the ratio is computed if the number of regions determined as having the first orient
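The final step of claim 1, choosing between 0° and 180° by counting dictionary words in two OCR passes, can be sketched as follows. This is only an illustration under stated assumptions: the OCR engine is not named in the source, so the sketch takes the two OCR outputs as plain strings; the function names and the tokenization rule are hypothetical; and a tie between the two counts, which the claims do not address, is assumed here to leave the image unrotated.

```python
import re
from typing import Set


def count_correct_words(ocr_text: str, dictionary: Set[str]) -> int:
    """Count tokens in the OCR output that appear in the dictionary.

    Assumption: a "word" is a run of letters, compared case-insensitively.
    The pattern matches Unicode letters, so dictionaries covering more than
    one language can be passed in.
    """
    tokens = re.findall(r"[^\W\d_]+", ocr_text.lower())
    return sum(1 for token in tokens if token in dictionary)


def resolve_flip(first_ocr_text: str, second_ocr_text: str,
                 dictionary: Set[str]) -> int:
    """Return the rotation in degrees still to be applied: 0 or 180.

    first_ocr_text:  OCR output of the image as it currently stands
    second_ocr_text: OCR output of the same image rotated by 180 degrees
    """
    first_count = count_correct_words(first_ocr_text, dictionary)
    second_count = count_correct_words(second_ocr_text, dictionary)

    if first_count > second_count:
        return 0      # page is already upright (third orientation)
    if first_count < second_count:
        return 180    # the rotated copy reads better, so flip the page
    return 0          # assumption: on a tie, leave the image unchanged


# Usage with made-up OCR strings and a tiny hypothetical dictionary.
words = {"the", "invoice", "total"}
rotation = resolve_flip("The invoice total", "lqoq ǝɔıoʌuı ǝɥq", words)
```

Passing the OCR text in as strings keeps the sketch independent of any particular OCR library; in practice the two strings would come from running the chosen engine on each detected region of the image and of its 180° rotated copy.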

Assignees

L&T Technology Services Ltd

Inventors

Classifications

  • Document · CPC title

  • Rotation of whole images or parts thereof · CPC title

  • Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text · CPC title

  • Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching (specially adapted for image segmentation G06T7/10; specially adapted for the analysis of motion G06T7/20; specially adapted for image alignment G06T7/30; specially adapted for the calculation of depth from stereo images G06T7/50; specially adapted for position determination G06T7/70) · CPC title

  • G06T7/70 · Primary

    Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12555261B2 cover?
This disclosure relates to method and system for detecting orientation. The method includes detecting a plurality of regions in a document image, each region including text data, and determining positional information of each of the regions; for each of the plurality of regions, determining a region orientation to be one of first orientation or second orientation based on height and width of th…
Who is the assignee on this patent?
L&T Technology Services Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/70. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 17, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).