What technology area does this patent fall under?

Primary CPC classification G06V20/62. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 27, 2016 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for detecting and recognizing text in images

US9530069B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9530069-B2
Application number	US-201514613279-A
Country	US
Kind code	B2
Filing date	Feb 3, 2015
Priority date	Jan 23, 2008
Publication date	Dec 27, 2016
Grant date	Dec 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative effects of variable lighting. This results in the creation of multiple images that are versions of the same text region. Some of these multiple versions are sent to a character-recognition system. The resulting texts from each of the versions of the image sent to the character-recognition system are then combined to a single result, wherein the single result is detected text.
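The final step of the abstract, combining the texts recognized from several versions of the same region into a single result, can be illustrated with a small sketch. The abstract does not say how the readings are aligned or combined, so the word-by-word alignment and the majority vote below are assumptions made for illustration, and the function name `combine_by_majority` and the sample readings are hypothetical.

```python
from collections import Counter
from itertools import zip_longest


def combine_by_majority(readings):
    """Combine several OCR readings of the same text region into one string.

    Assumption: readings are split on whitespace and aligned word by word.
    This alignment is a simplification, not something the abstract specifies.
    """
    token_lists = [reading.split() for reading in readings]
    combined = []
    # zip_longest pads shorter readings with None so longer ones are not cut.
    for column in zip_longest(*token_lists, fillvalue=None):
        votes = Counter(token for token in column if token is not None)
        # Pick the most frequent token; on a tie the earliest reading wins.
        combined.append(votes.most_common(1)[0][0])
    return " ".join(combined)


# Hypothetical outputs of a character recognizer on three processed versions.
readings = [
    "EXIT 24 NORTH",
    "EX1T 24 NORTH",
    "EXIT 24 N0RTH",
]
result = combine_by_majority(readings)
print(result)
```

Each version misreads a different word, so voting per word recovers a reading that no single erroneous version would give on its own. A character-level alignment would be needed when versions split words differently.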

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: under the control of one or more computer systems configured with executable instructions, receiving an input image that includes at least one image variation; filtering and segmenting the input image; selecting regions within the filtered and segmented input image having connected components; creating a mask corresponding to the regions of connected components, the mask including bounding boxes that at least partially enclose corresponding regions of the connected components; intersecting the filtered and segmented input image with the mask to produce a first output image; separately processing the filtered and segmented input image corresponding to the mask to create a binary output image; separately recognizing text in the first output image and in the binary output image using an optical character recognizer; and combining the separately recognized text from the first output image and from the binary output image to produce a single output.
2. The computer-implemented method of claim 1, further comprising: separately processing the input image to produce a third output image using a different processing technique than used to produce the first output image and the second output image, the recognized text from the first output image, the second output image, and the third output image being combined using a majority vote process to select portions from the first output image, the second output image, and the third output image.
3. The computer-implemented method of claim 1, wherein combining the separately recognized text from the first output image and the second output image comprises taking a logical OR of the first output image and the second output image.
4. The computer-implemented method of claim 1, wherein the at least one image variation includes at least one of noise, blur, or a lighting variation.
5. The computer-implemented method of claim 1, wherein selecting regions having the connected components includes identifying regions of connected pixels based on an intensity value of the pixels and a distance between the pixels.
6. The computer-implemented method of claim 5, wherein separately recognizing the text in the first output image and in the binary output image is based upon whether pixel values for pixels are above or below a threshold value.
7. The computer-implemented method of claim 1, wherein the bounding boxes are rectangular in shape.
8. A computing system, comprising: a processor; and a memory including instructions that, when executed by the processor, cause the computing system to: receive an input image that includes at least one image variation; filter and segmenting the input image; select regions within the filtered and segmented input image having connected components; create a mask corresponding to the regions of connected components, the mask including bounding boxes that at least partially enclose corresponding regions of the connected components; intersect the filtered and segmented input image with the mask to produce a first output image; separately process the filtered and segmented input image corresponding to the mask to create a binary output image; separately recognize text in the first output image and in the binary output image using an optical character recognizer; and combine the separately recognized text from the first output image and from the binary output image to produce a single output.
9. The computing system of claim 8, wherein the instructions, when executed by the processor, further cause the computing system to: separately process the input image to produce a third output image using a different processing technique than is used to produce the first output image and the second output image, the recognized text from the first output image, the second output image, and the third output image being combined using a majority vote process to select portions from the first output image, the second output image, and the third output image.
10. The computing system of claim 8, wherein the instructions, when executed by the processor, further cause the computing system to combine the separately recognized text from the first output image and the second output image by taking a logical OR of the first output image and the second output image.
11. The computing system of claim 8, wherein the at least one image variation includes at least one of noise, blur, or a lighting variation.
12. The computing system of claim 8, wherein the instructions, when executed by the processor, further cause the computing system to select regions having the connected components by identifying regions of connected pixels based on an intensity value of the pixels and a distance between the pixels.
13. The computing system of claim 8, wherein the instructions, when executed by the processor, further cause the computing system to separately recognize the text in the first output image and in the binary output image based upon whether pixel values for pixels are above or below a threshold value.
14. The computing system of claim 8, wherein the bounding boxes are rectangular in shape.
15. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to: receive an input image that includes at least one image variation; filter and segmenting the input image; select regions within the filtered and segmented input image having connected components; create a mask corresponding to the regions of connected components, the mask including bounding boxes that at least partially enclose corresponding regions of the connected components; intersect the filtered and segmented input image with the mask to produce a first output image; separately process the filtered and segmented input image corresponding to the mask to create a binary output image; separately recognize text in the first output image and in the binary output image using an optical character recognizer; and combine the separately recognized text from the first output image and from the binary output image to produce a single output.
16. The non-transitory computer-readable storage medium of claim 15, wherein the instructions, when executed by the processor, further cause the processor to: separately process the input image to produce a third output image using a different processing technique than is used to produce the first output image and the second output image, the recognized text from the first output image, the second output image, and the third output image being combined using a majority vote process to select portions from the first output image, the second output image, and the third output image.
17. The non-transitory computer-readable storage medium of claim 15, wherein the instructions, when executed by the processor, further cause the processor to combine the separately recognized text from the first output image and the second output image by taking a logical OR of the first output image and the second output image.
18. The non-transitory computer-readable storage medium of claim 15, wherein the at least one image variation includes at least one of noise, blur, or a lighting variation.
19. The non-transitory computer-readable storage medium of claim 15, wherein the instructions, when executed by the processor, further cause the processor to select regions having the connected components by identifying regions of connected pixels based on an intensity value of the pi
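The image-side steps recited in claim 1 (select connected components, build a bounding-box mask, intersect it with the image, and derive a binary image) can be sketched roughly as follows. This is a toy illustration on a grayscale grid and not the patented implementation: the 4-neighbour connectivity, the `min_pixels` size filter, the threshold of 128, and all function names are assumptions chosen for the example.

```python
from collections import deque


def connected_components(image, min_intensity):
    """Group 4-connected pixels whose intensity is >= min_intensity."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < min_intensity:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= min_intensity):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box_mask(image, components, min_pixels):
    """Mask of rectangular boxes around components with >= min_pixels pixels."""
    mask = [[0] * len(image[0]) for _ in image]
    for pixels in components:
        if len(pixels) < min_pixels:
            continue  # assumed noise filter: ignore very small components
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        for y in range(min(ys), max(ys) + 1):
            for x in range(min(xs), max(xs) + 1):
                mask[y][x] = 1
    return mask


def intersect(image, mask):
    """Keep image pixels inside the mask and zero out everything else."""
    return [[v if m else 0 for v, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


def binarize(image, threshold):
    """Map each pixel to 1 if it is >= threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]


# Hypothetical 4x6 grayscale image: one 3-pixel blob and one isolated pixel.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 200, 90, 0, 0, 0],
    [0, 210, 220, 0, 0, 180],
    [0, 0, 0, 0, 0, 0],
]
components = connected_components(image, min_intensity=128)
mask = bounding_box_mask(image, components, min_pixels=2)
first_output = intersect(image, mask)
binary_output = binarize(first_output, threshold=128)
```

In this sketch the isolated pixel is dropped by the size filter, the rectangular box keeps the dim pixel (90) that lies inside it, and the binary image then removes that pixel again. Both `first_output` and `binary_output` would then be passed to a character recognizer, which is outside the scope of this example.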

Assignees

A9 Com Inc

Inventors

Classifications

Code	CPC title
G06V30/164	Noise filtering
G06V30/162	Quantising the image signal
G06V30/155	Removing patterns interfering with the pattern to be recognised, such as ruled lines or underlines
G06V30/15	Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
G06V20/62 (primary)	Text, e.g. of license plates, overlay texts or captions on TV images

Patent family

Related publications grouped by family.

View patent family 44486312

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9530069B2 cover?: Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative…
Who is the assignee on this patent?: A9 Com Inc