Systems and methods for detection and high-quality capture of documents on a cluttered tabletop with an automatically controlled camera

US2016259971A1 · US · A1

Patent metadata
Field · Value
Publication number · US-2016259971-A1
Application number · US-201514637391-A
Country · US
Kind code · A1
Filing date · Mar 3, 2015
Priority date · Mar 3, 2015
Publication date · Sep 8, 2016
Grant date · (none shown)

Abstract

Official abstract text for this publication.

Described are systems and methods for recognizing paper documents on a tabletop using an overhead camera mounted on pan-tilt servos. The described automated system first finds paper documents on a cluttered desk based on a text probability map, constructed using multiple images acquired at fixed grid positions, and then captures a sequence of high-resolution overlapping frames of the located document(s), which are then fused together and perspective-rectified, using computed homography, to reconstruct a high quality and fronto-parallel document image that is of sufficient quality required for optical character recognition. The extracted textual information may be used, for example, for indexing and search, document repository and/or language translation applications.
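The abstract mentions perspective rectification with a computed homography but does not say how the homography is obtained. The following is a minimal sketch, assuming the four corners of the document have already been located in the stitched image and are mapped to an upright rectangle. The function names (`solve_linear`, `homography_from_corners`, `apply_homography`) and the corner coordinates are illustrative assumptions, not taken from the patent.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src: Sequence[Point],
                            dst: Sequence[Point]) -> List[List[float]]:
    """Return a 3x3 matrix H (with H[2][2] fixed to 1) mapping src[i] to dst[i].

    Each correspondence (x, y) -> (u, v) contributes two linear equations in
    the eight unknown entries of H, so four correspondences determine H.
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map one point through H using homogeneous coordinates."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical corners of a skewed page (pixels) and the upright target rectangle.
skewed = [(120.0, 80.0), (520.0, 110.0), (560.0, 420.0), (90.0, 380.0)]
upright = [(0.0, 0.0), (400.0, 0.0), (400.0, 300.0), (0.0, 300.0)]
H = homography_from_corners(skewed, upright)
```

Warping every pixel of the stitched image through `H` (in practice, sampling through its inverse) would give the fronto-parallel view that is passed to optical character recognition. An image library would normally handle that resampling step.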

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method being performed in a computerized system comprising a processing unit, a memory and a camera, the camera being mounted on a turret operatively coupled to the processing unit, the computer-implemented method comprising:
    a. scanning a surface with the camera to acquire a first plurality of images of the surface;
    b. using the acquired first plurality of images of the surface to determine at least one location of a text on the surface;
    c. capturing a second plurality of images using the camera based on the determined location of the text on the surface; and
    d. extracting the text using the second captured plurality of images.

2. The computer-implemented method of claim 1, wherein the processing unit is configured to cause the turret to move the camera during the scanning and capturing.

3. The computer-implemented method of claim 1, wherein in a. the first plurality of images is acquired by moving the camera to a plurality of fixed positions along a predetermined path, wherein each image of the first plurality of images corresponds to a fixed position of the plurality of fixed positions, where the image was acquired.

4. The computer-implemented method of claim 3, wherein a. further comprises, for each image in the first plurality of images of the surface, computing feature points and a text response map.

5. The computer-implemented method of claim 4, wherein the each text response map is computed using a probability histogram, the probability histogram being pre-computed based on the feature points.

6. The computer-implemented method of claim 4, wherein a. further comprises stitching the text response maps corresponding to the first plurality of images into a single text response map based on the plurality of fixed positions corresponding to the first plurality of images.

7. The computer-implemented method of claim 6, wherein a. further comprises detecting text blobs in the single text response map.

8. The computer-implemented method of claim 7, wherein a. further comprises identifying at least one large rectangular shaped blob from the detected text blobs as the location of the text on the surface.

9. The computer-implemented method of claim 1, wherein in b. the second plurality of images is captured by moving the camera to a plurality of fixed positions along a predetermined path, wherein each image of the second plurality of images corresponds to a fixed position of the plurality of fixed positions, where the image was captured.

10. The computer-implemented method of claim 9, wherein the determined location of the text on the surface is a location of a text blob and wherein the predetermined path is a center line through a bounding box of the corresponding text blob.

11. The computer-implemented method of claim 9, wherein the images of the second plurality of images overlap with one another.

12. The computer-implemented method of claim 11, further comprising stitching and fusing the images of the second plurality of images to obtain a second stitched image.

13. The computer-implemented method of claim 12, further comprising fitting lines around a boundary of a text blob in the second stitched image and estimating vanishing points.

14. The computer-implemented method of claim 12, further comprising performing a perspective rectification on the second stitched image based on a computed homography.

15. The computer-implemented method of claim 14, wherein in d. the text is extracted using an optical character recognition performed on the perspective rectified second stitched image.

16. A non-transitory computer-readable medium embodying a set of computer-executable instructions, which, when executed in a computerized system comprising a processing unit, a memory and a camera, the camera being mounted on a turret operatively coupled to the processing unit, cause the computerized system to perform a method comprising:
    a. scanning a surface with the camera to acquire a first plurality of images of the surface;
    b. using the acquired first plurality of images of the surface to determine at least one location of a text on the surface;
    c. capturing a second plurality of images using the camera based on the determined location of the text on the surface; and
    d. extracting the text using the second captured plurality of images.

17. The non-transitory computer-readable medium of claim 16, wherein the set of computer-executable instructions configures the processing unit to cause the turret to move the camera during the scanning and capturing.

18. The non-transitory computer-readable medium of claim 16, wherein in a. the first plurality of images is acquired by moving the camera to a plurality of fixed positions along a predetermined path, wherein each image of the first plurality of images corresponds to a fixed position of the plurality of fixed positions, where the image was acquired.

19. The non-transitory computer-readable medium of claim 16, wherein a. further comprises, for each image in the first plurality of images of the surface, computing feature points and a text response map.

20. A computerized system comprising a processing unit, a memory and a camera, the camera being mounted on a turret operatively coupled to the processing unit, the memory storing a set of computer-executable instructions causing the computerized system to perform a method comprising:
    a. scanning a surface with the camera to acquire a first plurality of images of the surface;
    b. using the acquired first plurality of images of the surface to determine at least one location of a text on the surface;
    c. capturing a second plurality of images using the camera based on the determined location of the text on the surface; and
    d. extracting the text using the second captured plurality of images.
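The localization steps in claims 6 to 10 (stitching per-position text response maps, detecting blobs, keeping a large rectangular blob, and planning captures along the center line of its bounding box) can be illustrated with a small sketch. The claims do not specify thresholds, the rectangularity test, or the direction of the center line, so everything concrete here is an assumption: the max-based stitching, the 4-connected flood fill, the `min_fill` ratio, the horizontal center line, and all function names.

```python
import math
from collections import deque
from typing import Dict, List, Optional, Tuple

Grid = List[List[float]]
Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max), inclusive


def stitch_response_maps(tiles: Dict[Tuple[int, int], Grid],
                         height: int, width: int) -> Grid:
    """Paste each map at its known (top, left) offset, keeping the max on overlaps."""
    canvas = [[0.0] * width for _ in range(height)]
    for (top, left), tile in tiles.items():
        for dy, row in enumerate(tile):
            for dx, value in enumerate(row):
                canvas[top + dy][left + dx] = max(canvas[top + dy][left + dx], value)
    return canvas


def detect_blobs(response: Grid, threshold: float) -> List[Tuple[Box, int]]:
    """Return (bounding box, pixel count) for each 4-connected region >= threshold."""
    rows, cols = len(response), len(response[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or response[r][c] < threshold:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            r0, c0, r1, c1, area = r, c, r, c, 0
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                area += 1
                r0, c0, r1, c1 = min(r0, y), min(c0, x), max(r1, y), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and response[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(((r0, c0, r1, c1), area))
    return blobs


def pick_document_blob(blobs: List[Tuple[Box, int]],
                       min_area: int, min_fill: float) -> Optional[Box]:
    """Largest blob that is big enough and fills most of its bounding box."""
    best = None
    for box, area in blobs:
        r0, c0, r1, c1 = box
        box_area = (r1 - r0 + 1) * (c1 - c0 + 1)
        if area >= min_area and area / box_area >= min_fill:
            if best is None or area > best[1]:
                best = (box, area)
    return best[0] if best else None


def center_line_positions(box: Box, frame_width: float,
                          overlap: float) -> List[Tuple[float, float]]:
    """Evenly spaced (row, col) capture centers on the horizontal center line.

    Spacing never exceeds frame_width * (1 - overlap), so neighboring frames
    overlap by at least the requested fraction.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    r0, c0, r1, c1 = box
    row, span = (r0 + r1) / 2, c1 - c0
    if span == 0:
        return [(row, float(c0))]
    n = max(1, math.ceil(span / (frame_width * (1.0 - overlap))))
    return [(row, c0 + i * span / n) for i in range(n + 1)]
```

With map coordinates converted to pan and tilt angles (a calibration step not sketched here), the returned positions would drive the second, high-resolution capture pass.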

Assignees

  • Fuji Xerox Co Ltd

Inventors

Classifications

  • Document-oriented image-based pattern recognition · CPC title

  • G06T3/4038 · Primary

    Image mosaicing, e.g. composing plane images from plane sub-images · CPC title

  • for achieving an enlarged field of view, e.g. panoramic image capture · CPC title

  • Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016259971A1 cover?
Described are systems and methods for recognizing paper documents on a tabletop using an overhead camera mounted on pan-tilt servos. The described automated system first finds paper documents on a cluttered desk based on a text probability map, constructed using multiple images acquired at fixed grid positions, and then captures a sequence of high-resolution overlapping frames of the located do…
Who is the assignee on this patent?
Fuji Xerox Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T3/4038. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 8, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).