US2019089849A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2019089849-A1 |
| Application number | US-201816128972-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 12, 2018 |
| Priority date | Sep 21, 2017 |
| Publication date | Mar 21, 2019 |
| Grant date | — |
Official abstract text for this publication.
An image processing apparatus including: an analysis unit configured to extract a text area by performing area division processing for a binary image obtained by binarizing the scanned image by a first binarization method; a determination unit configured to determine a binary image used in OCR processing; and a character recognition unit configured to perform the OCR processing by using the binary image determined by the determination unit for the text area extracted by the analysis unit, and the determination unit: in a case where a binary image used in the area division processing is suitable to the OCR processing, determines the binary image as a binary image used in the OCR processing; and in a case where a binary image used in the area division processing is not suitable to the OCR processing, generates a binary image by a second binarization method whose accuracy is higher than that of the first binarization method and determines the generated binary image as a binary image used in the OCR processing.
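The selection logic in the abstract can be sketched in a few lines of Python. This is an illustrative reading, not the applicant's implementation: the abstract does not name concrete binarization methods, so the sketch assumes a single global threshold for the first method and one threshold per tile for the second (the split described in claim 9). The names `binarize_global`, `binarize_tiled`, `choose_binary_for_ocr`, and the parameters `threshold`, `tile`, and `offset` are hypothetical, as is the caller-supplied `is_suitable` predicate.

```python
from typing import Callable, List, Tuple

Gray = List[List[int]]    # rows of grayscale pixels, 0 (black) to 255 (white)
Binary = List[List[int]]  # rows of 0/1 pixels, 1 = ink, 0 = background


def binarize_global(gray: Gray, threshold: int = 128) -> Binary:
    """Assumed first method: one threshold for the whole page (fast)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def binarize_tiled(gray: Gray, tile: int = 4, offset: int = 10) -> Binary:
    """Assumed second method: one threshold per tile, derived from the tile mean."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))
            values = [gray[y][x] for y in rows for x in cols]
            # A pixel counts as ink only if it is clearly darker than its tile.
            local_threshold = sum(values) / len(values) - offset
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if gray[y][x] < local_threshold else 0
    return out


def choose_binary_for_ocr(
    gray: Gray, is_suitable: Callable[[Binary], bool]
) -> Tuple[Binary, str]:
    """Reuse the area-division binary image if suitable, else re-binarize."""
    first = binarize_global(gray)   # also the image used for area division
    if is_suitable(first):
        return first, "first"
    return binarize_tiled(gray), "second"
```

On a page whose right half has a dark shaded background, the global threshold turns the whole shaded region into ink, while the tiled variant keeps only the character pixels. The slower method runs only when the suitability check fails, which is the point of the two-stage design.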
Opening claim text (preview).
What is claimed is:
1. An image processing apparatus that acquires character information from a scanned image, the image processing apparatus comprising: an analysis unit configured to extract a text area by performing area division processing for a binary image obtained by binarizing the scanned image by a first binarization method; a determination unit configured to determine a binary image used in OCR processing; and a character recognition unit configured to perform the OCR processing by using the binary image determined by the determination unit for the text area extracted by the analysis unit, wherein the determination unit: in a case where a binary image used in the area division processing is suitable to the OCR processing, determines the binary image as a binary image used in the OCR processing; and in a case where a binary image used in the area division processing is not suitable to the OCR processing, generates a binary image by a second binarization method whose accuracy is higher than that of the first binarization method and determines the generated binary image as a binary image used in the OCR processing.
2. The image processing apparatus according to claim 1, wherein the determination unit determines whether a binary image is in a state suitable to the OCR processing based on an aspect ratio of a character frame indicating a circumscribing rectangle of each character included in the extracted text area.
3. The image processing apparatus according to claim 2, wherein in the area division processing, a row area obtained by dividing the text area in units of rows is extracted, and the determination unit performs the determination of whether a binary image is in a state suitable to the OCR processing based on an average value of aspect ratios of the character frames included in the row area.
4. The image processing apparatus according to claim 3, wherein the determination unit performs the determination of whether a binary image is in a state suitable to the OCR processing by comparing an average value of aspect ratios of character frames included in the row area and an average value of aspect ratios of character frames in a predetermined font stored in advance.
5. The image processing apparatus according to claim 3, wherein the determination unit further performs the OCR processing for a part of character frames included in the row area and performs the determination of whether a binary image is in a state suitable to the OCR processing based on a reliability of obtained character recognition results.
6. The image processing apparatus according to claim 5, wherein the determination unit stores a threshold value in advance, which is a reference of the reliability, and performs the determination of whether a binary image is in a state suitable to the OCR processing by comparison processing between the reliability of the obtained character recognition results and the threshold value.
7. The image processing apparatus according to claim 6, wherein the reliability is a matching rate of feature amounts in the character recognition results.
8. The image processing apparatus according to claim 1, wherein the first binarization method is a binarization method whose processing speed is higher than that of the second binarization method.
9. The image processing apparatus according to claim 1, wherein the first binarization method is a binarization method that uses a single threshold value, and the second binarization method is a binarization method that uses a plurality of threshold values.
10. The image processing apparatus according to claim 3, further comprising: a user interface that receives a selection of an arbitrary row area of a plurality of row areas extracted in the area division processing, wherein the determination unit generates a binary image by the second binarization method by taking a row area selected by a user via the user interface as a target.
11. The image processing apparatus according to claim 10, wherein the character recognition unit performs OCR processing in advance in accordance with a predetermined condition for each row area included in a binary image obtained by binarizing the scanned image by the first binarization method before a user performs the selection, and the determination unit, in a case where a reliability of character recognition results obtained by the OCR processing performed in advance does not satisfy a predetermined reference, generates a binary image by the second binarization method and performs OCR processing again by using the generated binary image.
12. The image processing apparatus according to claim 11, wherein the predetermined condition is that priority of a row area existing in a predetermined range at least including a range displayed on the user interface is higher than that of a row area existing outside the predetermined range.
13. A method of image processing to acquire character information from a scanned image, the method comprising the steps of: extracting a text area by performing area division processing for a binary image obtained by binarizing the scanned image by a first binarization method; determining a binary image used in OCR processing; and performing the OCR processing by using the determined binary image for the extracted text area, wherein at the determination step: in a case where a binary image used in the area division processing is suitable to the OCR processing, the binary image is determined as a binary image used in the OCR processing; and in a case where a binary image used in the area division processing is not suitable to the OCR processing, a binary image is generated by a second binarization method whose accuracy is higher than that of the first binarization method and the generated binary image is determined as a binary image used in the OCR processing.
14. A non-transitory computer readable storage medium storing a program for causing a computer to perform a method of image processing to acquire character information from a scanned image, the method comprising the steps of: extracting a text area by performing area division processing for a binary image obtained by binarizing the scanned image by a first binarization method; determining a binary image used in OCR processing; and performing the OCR processing by using the determined binary image for the extracted text area, wherein at the determination step: in a case where a binary image used in the area division processing is suitable to the OCR processing, the binary image is determined as a binary image used in the OCR processing; and in a case where a binary image used in the area division processing is not suitable to the OCR processing, a binary image is generated by a second binarization method whose accuracy is higher than that of the first binarization method and the generated binary image is determined as a binary image used in the OCR processing.
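The suitability test in claims 2 to 6 can be illustrated with a small Python sketch. It rests on several assumptions that the claims leave open: aspect ratio is taken as width divided by height, "comparing" against the stored font average is read as a relative tolerance, and reliability is averaged over the sampled characters. The names `CharFrame`, `mean_aspect_ratio`, `row_is_suitable`, and the default values for `tolerance` and `reliability_threshold` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CharFrame:
    """Circumscribing rectangle of one character (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int

    @property
    def aspect_ratio(self) -> float:
        # Assumption: width / height; the claims do not fix the orientation.
        return self.width / self.height


def mean_aspect_ratio(frames: Sequence[CharFrame]) -> float:
    """Average aspect ratio of the character frames in one row area."""
    return sum(f.aspect_ratio for f in frames) / len(frames)


def row_is_suitable(
    frames: Sequence[CharFrame],
    reference_ratio: float,               # stored average for a known font
    tolerance: float = 0.3,               # assumed relative tolerance
    sample_reliabilities: Sequence[float] = (),
    reliability_threshold: float = 0.8,   # assumed stored threshold
) -> bool:
    """Return True if the row's binary image looks usable for OCR."""
    if not frames:
        return False
    # Claims 3 and 4: compare the row average with the stored font average.
    deviation = abs(mean_aspect_ratio(frames) - reference_ratio)
    if deviation > tolerance * reference_ratio:
        return False
    # Claims 5 and 6: optionally check OCR reliability on a sample of frames.
    if sample_reliabilities:
        mean_reliability = sum(sample_reliabilities) / len(sample_reliabilities)
        return mean_reliability >= reliability_threshold
    return True
```

A row whose characters have merged into one wide blob, as tends to happen when a single threshold meets a shaded background, yields a frame far wider than a typical glyph and fails the first check. A row with plausible frame shapes but low sample reliability fails the second.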
Character recognition · CPC title
with an apparatus performing optical character recognition (character recognition G06V30/10) · CPC title
Multifunctional device, i.e. a device capable of all of reading, reproducing, copying, facsimile transception, file transception · CPC title
Physics · mapped topic