Text orientation estimation in camera captured OCR

US9224061B1 · US · B1

Patent metadata
Publication number: US-9224061-B1
Application number: US-201414464365-A
Country: US
Kind code: B1
Filing date: Aug 20, 2014
Priority date: Jul 24, 2014
Publication date: Dec 29, 2015
Grant date: Dec 29, 2015

Abstract

Official abstract text for this publication.

A system estimates text orientation in images captured using a handheld camera prior to detecting text in the image. Text orientation is estimated based on edges detected within the image, and the image is rotated based on the estimated orientation. Text detection and processing is then performed on the rotated image. Non-text features along a periphery of the image may be sampled to assure that clutter will not undermine the estimation of orientation.
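
The periphery sampling mentioned in the abstract can be pictured with a minimal sketch. It assumes the edge detector has already produced a binary edge map (a list of rows of 0/1 values) and measures the share of edge pixels in a band along the image border. The function names, the use of the full border band as the sampled region, and the `band` and `threshold` defaults are illustrative assumptions; the patent text shown here does not give specific values.

```python
def border_edge_fraction(edge_map, band):
    """Fraction of pixels in a border band of width `band` that are edge pixels.

    `edge_map` is a list of equal-length rows holding 0 (no edge) or 1 (edge).
    """
    height = len(edge_map)
    width = len(edge_map[0]) if height else 0
    total = 0
    edges = 0
    for y, row in enumerate(edge_map):
        for x, value in enumerate(row):
            in_band = (y < band or y >= height - band or
                       x < band or x >= width - band)
            if in_band:
                total += 1
                if value:
                    edges += 1
    return edges / total if total else 0.0


def periphery_is_clean(edge_map, band=2, threshold=0.10):
    """True when the border band holds fewer edge pixels than `threshold`.

    Both defaults are hypothetical, chosen only for illustration.
    """
    return border_edge_fraction(edge_map, band) < threshold
```

A caller might run the orientation estimate only when `periphery_is_clean` returns True, on the reasoning that heavy edge content at the border suggests background clutter rather than text.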

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for rotating and processing text in an image, the method comprising: detecting edges in the image; estimating lines by applying a Hough transform to the detected edges; determining a length and an angle of each of the estimated lines; assigning each estimated line to a discrete angle of a plurality of discrete angles based on the angle of the respective line; determining a sum of lengths of lines assigned to each discrete angle; determining a first discrete angle of the plurality of discrete angles that is associated with a largest determined sum; determining a weighted sum angle based on a sum of the length of the lines assigned to the first discrete angle, a sum of the length of lines assigned to a first neighboring discrete angle smaller than the first discrete angle, and a sum of the length of lines assigned to a second neighboring discrete angle larger than the first discrete angle; rotating the image based on the weighted sum angle; detecting text in the rotated image; and performing optical character recognition on the detected text.

2. The method of claim 1, further comprising: selecting a region of the image, wherein the region is adjacent to a border of the image; determining a percentage of pixels corresponding to the detected edges within the region; and determining that the percentage is less than a threshold value.

3. The method of claim 1, further comprising: determining that the first discrete angle does not correspond to zero degrees; determining a ratio of a sum of the lengths of lines assigned to the first discrete angle, the first neighboring angle, and the second neighboring angle to a sum of lines assigned to the discrete angle that includes zero degrees; and determining that the ratio is greater than a threshold value.

4. A computing device comprising: at least one processor; a memory including instructions operable to be executed by the at least one processor to configure the at least one processor to: detect a plurality of edges in the image; determine, based on the plurality of edges, a first line; determine, based on the plurality of edges, a second line; determine a first angle of the first line; determine a second angle of the second line; associate, based on the first angle, the first line with a first bin, the first bin associated with a first range of angles; associate, based on the second angle, the second line with a second bin, the second bin associated with a second range of angles; determine, for the first bin, a first sum of lengths of lines associated with the first bin; determine, for the second bin, a second sum of lengths of lines associated with the second bin; determine that the first sum is greater than the second sum; determine a first discrete angle associated with the first bin; rotate the image based at least in part on the first discrete angle; and detect text in the rotated image.

5. The computing device of claim 4, the instructions further configuring the at least one processor to apply a Hough transform to detected edges in the image.

6. The computing device of claim 4, the instructions further configuring the at least one processor to: determine a third sum of lengths of lines associated with a third bin, the third bin associated with a third range of angles, the third range of angles neighboring the first range of angles; determine a fourth sum of lengths of lines associated with a fourth bin, the fourth bin associated with a fourth range of angles, the fourth range of angles neighboring the first range of angles; determine a weighted sum angle based on the first sum of lengths, the third sum of lengths and the fourth sum of lengths; and rotate the image based further at least in part on the weighted sum angle.

7. The computing device of claim 6, the instructions further configuring the at least one processor to: determine that the first discrete angle does not correspond to zero degrees prior to determining the weighted sum angle.

8. The computing device of claim 4, the instructions further configuring the at least one processor to: binarize the detected text in the rotated image; perform optical character recognition on the binarized detected text.

9. The computing device of claim 4, further comprising a communications interface, the instructions further configuring the at least one processor to: binarize the detected text in the rotated image; transmit the binarized detected text to a second device via the communications interface with an instruction to perform optical character recognition on the binarized detected text; and receive recognized text from the second device via the communication interface.

10. The computing device of claim 4, the instructions further configuring the at least one processor to: select a region of the image, wherein the region is adjacent to a border of the image; determine a percentage of pixels corresponding to the detected edges within the region; and determine that the percentage is less than a threshold value.

11. The computing device of claim 4, the instructions further configuring the at least one processor to: downscale the image; and crop the downscaled image, wherein the estimated orientation of text in the image is based on the cropped downscaled image.

12. A non-transitory computer-readable storage medium storing processor-executable instructions for controlling a computing device, to configure the computing device to: detect a plurality of edges in the image; determine, based on the plurality of edges, a first line; determine, based on the plurality of edges, a second line; determine a first angle of the first line; determine a second angle of the second line; associate, based on the first angle, the first line with a first bin, the first bin associated with a first range of angles; associate, based on the second angle, the second line with a second bin, the second bin associated with a second range of angles; determine, for the first bin, a first sum of lengths of lines associated with the first bin; determine, for the second bin, a second sum of lengths of lines associated with the second bin; determine that the first sum is greater than the second sum; determine a first discrete angle associated with the first bin; rotate the image based at least in part on the first discrete angle; and detect text in the rotated image.

13. The non-transitory computer-readable storage medium of claim 12, the instructions further configuring the computing device to apply a Hough transform to detected edges in the image.

14. The non-transitory computer-readable storage medium of claim 12, the instructions further configuring the computing device to: determine a third sum of lengths of lines associated with a third bin, the third bin associated with a third range of angles, the third range of angles neighboring the first range of angles; determine a fourth sum of lengths of lines associated with a fourth bin, the fourth bin associated with a fourth range of angles, the fourth range of angles neighboring the first range of angles; determine a weighted sum angle based on the first sum of lengths, the third sum of lengths and the fourth sum of lengths; and rotate the image based further at least in part on the weighted sum angle.

15. The non-transitory computer-readable storage medium of claim 14, the instructions further configuring the computing device to: determine that the first discrete angle does not correspond to zero degrees prior to determining the weighted
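
The binning steps recited in the claims (assign each line to a discrete angle, sum line lengths per angle, take the largest sum, then refine it with the two neighboring angles) can be sketched in plain Python. This is an illustration under stated assumptions, not the patented implementation: the input is a list of line segments `(x1, y1, x2, y2)` that some Hough-style line detector is assumed to have produced already, the 5-degree `bin_width` is hypothetical, and the "weighted sum angle" is interpreted here as a length-weighted average of the three discrete angles, since the claim text does not give the exact formula.

```python
import math


def segment_length_and_angle(x1, y1, x2, y2):
    """Return (length, angle in degrees) with the angle folded into [-90, 90)."""
    length = math.hypot(x2 - x1, y2 - y1)
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    # A line has no direction, so angles 180 degrees apart are equivalent.
    angle = (angle + 90.0) % 180.0 - 90.0
    return length, angle


def estimate_rotation(segments, bin_width=5.0):
    """Estimate the dominant line angle in degrees from line segments.

    Discrete angles are -90, -90 + bin_width, ... (bin_width is an assumption).
    """
    n_bins = int(round(180.0 / bin_width))
    sums = [0.0] * n_bins
    for seg in segments:
        length, angle = segment_length_and_angle(*seg)
        # Snap to the nearest discrete angle; wrap +90 back onto -90.
        index = int(round((angle + 90.0) / bin_width)) % n_bins
        sums[index] += length

    # Discrete angle with the largest summed line length.
    peak = max(range(n_bins), key=lambda i: sums[i])

    # Length-weighted average over the peak and its two neighbors
    # (assumed interpretation of the claimed "weighted sum angle").
    weighted = 0.0
    total = 0.0
    for offset in (-1, 0, 1):
        s = sums[(peak + offset) % n_bins]
        centre = -90.0 + (peak + offset) * bin_width
        weighted += s * centre
        total += s
    return weighted / total if total else 0.0
```

For example, segments of length 20 at 10 degrees and length 10 at 15 degrees fall into adjacent bins, and the sketch returns an angle between the two, pulled toward the longer lines. Rotating the image by the negative of the returned angle would then level the dominant lines, although the sign convention depends on the image coordinate system in use.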

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Orientation detection or correction, e.g. rotation of multiples of 90 degrees · CPC title

  • G06V20/63 · Primary

    Scene text, e.g. street names · CPC title

  • Character recognition · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9224061B1 cover?
A system estimates text orientation in images captured using a handheld camera prior to detecting text in the image. Text orientation is estimated based on edges detected within the image, and the image is rotated based on the estimated orientation. Text detection and processing is then performed on the rotated image. Non-text features along a periphery of the image may be sampled to assure that c…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/63. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 29, 2015 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).