Method and subsystem for identifying document subimages within digital images
US-2017372134-A1 · Dec 28, 2017 · US
US10289924B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10289924-B2 |
| Application number | US-201615359314-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 22, 2016 |
| Priority date | Oct 17, 2011 |
| Publication date | May 14, 2019 |
| Grant date | May 14, 2019 |
Official abstract text for this publication.
A system and method are provided for the correction of a warped page image. The method first accepts a camera image of a page, creates a filtered edge map, and identifies text-likely regions. The filtered edge map and text-likely regions are projected into a polar coordinate system to determine page lines and warped image page curves. An adaptive two-dimensional (2D) ruled mesh piecewise planar approximation of a warped page surface is created. A three-dimensional (3D) model is created using the adaptive 2D ruled mesh and an estimate of the camera focal length. Using the 3D model, a 2D target mesh is created for rectifying the image of the page. In one aspect, the adaptive 2D ruled mesh is projected onto a 3D warped page surface using the estimated camera focal length and an estimated surface normal of each planar strip from the adaptive 2D ruled mesh.
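The rectification step can be illustrated for a single mesh cell. Claim 1 describes it as a perspective dewarp that interpolates image values from the adaptive 2D ruled mesh into the rectilinear target mesh. The sketch below is only one possible reading: it assumes each source cell is a quadrilateral mapped to a target rectangle by a projective (unit square to quad) transform, and it uses bilinear sampling. The function names, the corner ordering, and the grayscale list-of-rows image layout are all hypothetical and do not come from the patent.

```python
def square_to_quad(quad):
    """Return f(u, v) -> (x, y) mapping the unit square onto `quad`.

    `quad` lists the source corners matching (0,0), (1,0), (1,1), (0,1).
    The corner order is an assumption made for this sketch.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = quad
    dx1, dx2, sx = x1 - x2, x3 - x2, x0 - x1 + x2 - x3
    dy1, dy2, sy = y1 - y2, y3 - y2, y0 - y1 + y2 - y3
    det = dx1 * dy2 - dx2 * dy1
    if det == 0:
        raise ValueError("degenerate quadrilateral")
    # Projective terms; both are zero when the quad is a parallelogram.
    g = (sx * dy2 - dx2 * sy) / det
    h = (dx1 * sy - sx * dy1) / det
    a, b, c = x1 - x0 + g * x1, x3 - x0 + h * x3, x0
    d, e, f = y1 - y0 + g * y1, y3 - y0 + h * y3, y0

    def mapping(u, v):
        w = g * u + h * v + 1.0
        return (a * u + b * v + c) / w, (d * u + e * v + f) / w

    return mapping


def bilinear(image, x, y):
    """Sample a grayscale image (list of rows) at a real-valued (x, y)."""
    rows, cols = len(image), len(image[0])
    x = min(max(x, 0.0), cols - 1.0)  # clamp to the image bounds
    y = min(max(y, 0.0), rows - 1.0)
    xa, ya = int(x), int(y)
    xb, yb = min(xa + 1, cols - 1), min(ya + 1, rows - 1)
    fx, fy = x - xa, y - ya
    top = image[ya][xa] * (1 - fx) + image[ya][xb] * fx
    bottom = image[yb][xa] * (1 - fx) + image[yb][xb] * fx
    return top * (1 - fy) + bottom * fy


def dewarp_quad(image, quad, out_w, out_h):
    """Fill an out_w x out_h target cell by sampling the source quad."""
    mapping = square_to_quad(quad)
    out = []
    for j in range(out_h):
        v = j / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for i in range(out_w):
            u = i / (out_w - 1) if out_w > 1 else 0.0
            x, y = mapping(u, v)
            row.append(bilinear(image, x, y))
        out.append(row)
    return out
```

Under these assumptions, a full page would be rectified by calling `dewarp_quad` once per mesh cell and tiling the results. The target cell sizes would come from the 3D model, which this sketch does not attempt to reproduce.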
Opening claim text (preview).
I claim:
1. A method for the correction of a warped page image, the method comprising: accepting a camera image of a page; creating a filtered edge map and identifying text-likely regions; projecting the filtered edge map and text-likely regions into a polar coordinate system to determine page lines and warped image page curves; subsequent to projecting the filtered edge map and text-likely regions into the polar coordinate system, histogramming the filtered and projected edge map into theta bins, computing minimum and maximum theta angles for text-likely regions, localizing a first page line as a closest significant edge bin less than minimum text-likely region theta angle and a second page line as a closest significant edge bin greater than maximum text-likely region theta angle; creating an adaptive two-dimensional (2D) ruled mesh piecewise planar approximation of a warped page surface; using the created adaptive two-dimensional (2D) ruled mesh piecewise planar approximation of a warped page surface for estimating a camera focal length; creating a three-dimensional (3D) model using the adaptive 2D ruled mesh and camera focal length estimate; and, using the 3D model, creating a 2D target mesh rectifying the image of the page; wherein rectifying the image of the page includes using a perspective dewarp to interpolate image values from the adaptive 2D ruled mesh into the image defined by the 2D target rectilinear mesh.
2. The method of claim 1 further comprising: subsequent to projecting the filtered edge map and text-likely regions into the polar coordinate system, performing a multi-scale edge localization refinement for each determined page curve.
3. The method of claim 2 wherein performing the multi-scale refinement of the each determined page curves includes: transforming top and bottom page curves into point sets having a first resolution; scaling the point sets to a second resolution, lower than the first resolution; creating a gradient image of the second resolution point sets; identifying zero crossings points in the gradient image as a page edge; and, scaling the zero crossing points to the first resolution.
4. The method of claim 1 further comprising: subsequent to projecting the filtered edge map and text-likely regions into the polar coordinate system, independently fitting a pair of warped image page curve sections to a corresponding pair of Bezier curves.
5. The method of claim 1 wherein creating the adaptive 2D ruled mesh includes creating a plurality of planar strips; and, estimating a camera focal length includes independently estimating a camera focal length for each planar strip.
6. The method of claim 5 wherein creating the adaptive 2D ruled mesh includes using a polygonal approximation of Bezier curves and polygonal simplification, with a fit tolerance (fitTH) of 0.5 pixels, to assist in creating the adaptive 2D ruled mesh.
7. The method of claim 5 wherein creating the 3D model includes projecting the adaptive 2D ruled mesh onto a 3D warped page surface using the estimated camera focal length and an estimated surface normal of each planar strip from the adaptive 2D ruled mesh.
8. The method of claim 1 wherein estimating the camera focal length includes using the adaptive 2D ruled mesh of the warped page surface.
9. The method of claim 1 wherein creating the 2D target mesh includes determining a resolution of a 2D target mesh in response to scaling 3D mesh width and height values, where the scaling is determined by measuring an overall 3D page width and height and an input region width and height.
10. The method of claim 1, further comprising: identifying page gutters for images of adjacent pages, where each gutter is determined as the closest significant theta edge bin to a theta angle halfway between the minimum text-likely region theta angle and the maximum text-likely region angle.
11. The method of claim 1, further comprising determining a contour region of the page by: filtering curves not between page boundary lines, and dividing remaining curves into a top set defined by curves that are within the region defined by boundary lines L1, L2, and LC1, and a bottom set defined by curves that are within the region defined by the boundary lines L1, L2, and LC2, where L1 is a left boundary, L2 is a right boundary, and LC2 is a bottom boundary; selecting a top page curve as a top set curve with highest completeness and lowest curvature between the boundary lines L1 and L2; and, selecting a bottom page curve as bottom set curve with the highest completeness and lowest curvature between the boundary lines L1 and L2.
12. The method in claim 11 further comprising determining contour regions on two adjacent pages of a book, by: processing an image of adjacent pages, where each page is divided into upper and lower curve sets using boundary lines and gutter, and upper and lower curve lines are favored that join at the gutter.
13. A system for the correction of a warped page image, the system comprising: a non-transitory memory; a processor; an interface to accept a camera image of a page, and to supply a rectified image of the page; an image correction application stored in the memory and enabled as a sequence of processor executable steps for: creating a filtered edge map and identifying text-likely regions; projecting the filtered edge map and text-likely regions into a polar coordinate system to determine page lines and warped image page curves; subsequent to projecting the filtered edge map and text-likely regions into the polar coordinate system, histograms the filtered and projected edge map into theta bins, computes minimum and maximum theta angles for text-likely regions, and localizes a first page line as a closest significant edge bin less than minimum text-likely region theta angle and a second page line as a closest significant edge bin greater than maximum text-likely region theta angle; creating an adaptive two-dimensional (2D) ruled mesh piecewise planar approximation of a warped page surface; using the created adaptive two-dimensional (2D) ruled mesh piecewise planar approximation of a warped page surface for estimating a camera focal length; creating a three-dimensional (3D) model using the adaptive 2D ruled mesh and camera focal length estimate; and, using the 3D model, creating a 2D target mesh rectifying the image of the page; wherein the image correction application rectifies the image of the page using a perspective dewarp to interpolate image values from the adaptive 2D ruled mesh into the image defined by the 2D target rectilinear mesh.
14. The system of claim 13 wherein the image correction application, subsequent to projecting the filtered edge map and text-likely regions into the polar coordinate system, performs a multi-scale edge localization refinement for each determined page curve.
15. The system of claim 14 wherein the image correction application performs the multi-scale refinement of the determined page curves by: transforming top and bottom page curves into point sets having a first resolution; scaling the point sets to a second resolution, lower than the first resolution; creating a gradient image of the second resolution point sets; identifying zero crossings points in the gradient image as a page edge; and, scaling the zero crossing points to the first resolution.
16. The system of claim 13 wherein the image correction application, subsequent to projecting the filtered edge map and text-likely regions into the polar coordinate system, independently fits
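The page-line localization recited in claims 1 and 13, together with the gutter rule of claim 10, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: it takes the theta angles of edge pixels and text-likely pixels as already computed (in degrees), treats a bin as "significant" when its count reaches a hypothetical `min_count` threshold, and reports each line as the center angle of its bin. The claims do not specify the bin width, the significance test, or the angle units.

```python
from collections import Counter


def localize_page_lines(edge_thetas, text_thetas, bin_width=1.0, min_count=3):
    """Return (first_line, second_line, gutter) as theta bin centers.

    edge_thetas: theta of each filtered edge pixel (degrees, assumed).
    text_thetas: theta of each text-likely pixel (degrees, assumed).
    min_count:   assumed significance threshold for an edge bin.
    An entry is None when no significant bin qualifies.
    """
    # Histogram the projected edge map into theta bins.
    hist = Counter(int(t // bin_width) for t in edge_thetas)
    significant = sorted(b for b, n in hist.items() if n >= min_count)

    def center(b):
        return (b + 0.5) * bin_width

    # Angular extent of the text-likely regions.
    t_min, t_max = min(text_thetas), max(text_thetas)

    # First page line: closest significant bin below the text extent.
    below = [b for b in significant if center(b) < t_min]
    first_line = center(below[-1]) if below else None

    # Second page line: closest significant bin above the text extent.
    above = [b for b in significant if center(b) > t_max]
    second_line = center(above[0]) if above else None

    # Gutter (claim 10): significant bin closest to the halfway angle.
    mid = 0.5 * (t_min + t_max)
    gutter = None
    if significant:
        gutter = center(min(significant, key=lambda b: abs(center(b) - mid)))

    return first_line, second_line, gutter
```

The gutter value is only meaningful for images of two adjacent pages, where the text-likely regions span both pages. For a single page, the third return value can be ignored.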
Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text · CPC title
using recognition of characters or words · CPC title
Aligning, centring, orientation detection or correction of the image · CPC title
Probabilistic image processing · CPC title
Physics · mapped topic