Detecting fields in document images

US2025329187A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025329187-A1
Application number: US-202519256919-A
Country: US
Kind code: A1
Filing date: Jul 1, 2025
Priority date: Jul 21, 2021
Publication date: Oct 23, 2025
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of detecting fields in document images includes: receiving, by a processing device, a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a respective keypoint region of a first set of document images; calculating, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; loading a document image for extraction of target fields; and detecting fields in the document image based on the calculated frequency distributions.
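As a rough illustration of the codebook described in the abstract, the following Python sketch clusters local descriptors with a plain k-means loop and treats each cluster center as a visual word. The text on this page does not name a clustering algorithm or a descriptor type. The use of k-means, the NumPy implementation, and the function names `build_codebook` and `assign_words` are assumptions made for illustration only.

```python
import numpy as np


def assign_words(descriptors, codebook):
    """Return, for each descriptor, the index of the nearest visual word."""
    # Squared Euclidean distance from every descriptor to every center.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)


def build_codebook(descriptors, k, iters=20, seed=0):
    """Cluster descriptors with basic k-means; the centers act as visual words.

    Assumption: k-means is only one possible clustering choice.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].copy()
    for _ in range(iters):
        labels = assign_words(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


# Toy descriptors forming two well-separated groups (hypothetical data).
toy = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
codebook = build_codebook(toy, k=2)
words = assign_words(toy, codebook)
```

In a real pipeline the descriptors would come from keypoint regions of the first set of document images. Here they are toy 2-D points so that the clustering step can be followed by hand.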

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, by a processing device, a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a respective keypoint region of a first set of document images; calculating, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; loading a document image for extraction of target fields; and detecting fields in the document image based on the calculated frequency distributions.

2. The method of claim 1, wherein the codebook is optimized on a third set of document images.

3. The method of claim 1, wherein calculating the respective frequency distribution comprises calculating an integral two-dimensional histogram of shift of a position of the specified field, and wherein the integral two-dimensional histogram incorporates a plurality of shifts relative to possible positions of each visual word.

4. The method of claim 1, wherein detecting fields in the document image further comprises: obtaining an accumulated distribution histogram based on possible positions of the target field with respect to two or more visual words of the set of visual words.

5. The method of claim 1, wherein a plurality of document images of the second set of document images have a similar layout.

6. The method of claim 1, further comprising: dividing the second set of document images into groups based on document similarity prior to at least one of: training a model or using the model.

7. The method of claim 1, further comprising: extracting the respective keypoint region by morphologically preprocessing each document image of the first set of document images.

8. A system, comprising: a memory; and a processing device coupled to the memory, the processing device configured to: receive a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a respective keypoint region of a first set of document images; calculate, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; load a document image for extraction of target fields; and detect fields in the document image based on the calculated frequency distributions.

9. The system of claim 8, wherein the codebook is optimized on a third set of document images.

10. The system of claim 8, wherein calculating the respective frequency distribution comprises calculating an integral two-dimensional histogram of shift of a position of the specified field, and wherein the integral two-dimensional histogram incorporates a plurality of shifts relative to possible positions of each visual word.

11. The system of claim 8, wherein detecting fields in the document image further comprises: obtaining an accumulated distribution histogram based on possible positions of the target field with respect to two or more visual words of the set of visual words.

12. The system of claim 8, wherein a plurality of document images of the second set of document images have a similar layout.

13. The system of claim 8, wherein the processing device is further configured to: divide the second set of document images into groups based on document similarity prior to at least one of: training a model or using the model.

14. The system of claim 8, wherein the processing device is further configured to: extract the respective keypoint region by morphologically preprocessing each document image of the first set of document images.

15. A non-transitory computer-readable storage medium comprising executable instructions that, when executed by a processing device, cause the processing device to: receive a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a respective keypoint region of a first set of document images; calculate, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; load a document image for extraction of target fields; and detect fields in the document image based on the calculated frequency distributions.

16. The non-transitory computer-readable storage medium of claim 15, wherein the codebook is optimized on a third set of document images.

17. The non-transitory computer-readable storage medium of claim 15, wherein calculating the respective frequency distribution comprises calculating an integral two-dimensional histogram of shift of a position of the specified field, and wherein the integral two-dimensional histogram incorporates a plurality of shifts relative to possible positions of each visual word.

18. The non-transitory computer-readable storage medium of claim 15, wherein detecting fields in the document image further comprises: obtaining an accumulated distribution histogram based on possible positions of the target field with respect to two or more visual words of the set of visual words.

19. The non-transitory computer-readable storage medium of claim 15, wherein a plurality of document images of the second set of document images have a similar layout.

20. The non-transitory computer-readable storage medium of claim 15, further comprising executable instructions that, when executed by the processing device, cause the processing device to: divide the second set of document images into groups based on document similarity prior to at least one of: training a model or using the model.
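The shift histograms and the accumulated histogram mentioned in the claims can be pictured with a small voting sketch. The code below is one possible reading, not the claimed implementation. It assumes keypoints arrive as hypothetical `(word_id, x, y)` tuples, quantizes positions into square grid cells of `bin_size` pixels, and uses a plain two-dimensional shift histogram per visual word. It does not attempt to model what the claims call an "integral" histogram. The names `to_cell`, `train_shift_histograms`, and `detect_field` are invented for this example.

```python
from collections import Counter, defaultdict

import numpy as np


def to_cell(x, y, bin_size):
    """Quantize pixel coordinates to a grid cell (column, row)."""
    return int(x) // bin_size, int(y) // bin_size


def train_shift_histograms(samples, bin_size=10):
    """Count, per visual word, how often the field lies at each cell shift.

    samples: iterable of (keypoints, field_xy), where keypoints is a list of
    (word_id, x, y) tuples and field_xy is the labeled field position.
    """
    hists = defaultdict(Counter)
    for keypoints, field_xy in samples:
        fcx, fcy = to_cell(*field_xy, bin_size)
        for word, x, y in keypoints:
            kcx, kcy = to_cell(x, y, bin_size)
            hists[word][(fcx - kcx, fcy - kcy)] += 1
    return hists


def detect_field(keypoints, hists, image_shape, bin_size=10):
    """Accumulate votes from all visual words and return the peak cell center."""
    height, width = image_shape
    acc = np.zeros((height // bin_size + 1, width // bin_size + 1))
    for word, x, y in keypoints:
        kcx, kcy = to_cell(x, y, bin_size)
        for (dx, dy), count in hists.get(word, {}).items():
            cx, cy = kcx + dx, kcy + dy
            if 0 <= cy < acc.shape[0] and 0 <= cx < acc.shape[1]:
                acc[cy, cx] += count  # each learned shift casts weighted votes
    cy, cx = np.unravel_index(acc.argmax(), acc.shape)
    half = bin_size // 2
    return int(cx) * bin_size + half, int(cy) * bin_size + half


# Two hypothetical training documents with a similar layout.
training = [
    ([(0, 20, 20), (1, 200, 40)], (120, 80)),
    ([(0, 30, 30), (1, 210, 50)], (130, 90)),
]
hists = train_shift_histograms(training)

# A new document whose keypoints are shifted by (30, 40) pixels.
position = detect_field([(0, 50, 60), (1, 230, 80)], hists, image_shape=(300, 300))
print(position)  # (155, 125): center of the cell containing (150, 120)
```

Because both visual words vote for the same cell, the accumulated histogram has a single clear peak. With noisier layouts the votes would spread out, which is one reason grouping similar documents before training, as in claims 6, 13, and 20, could help.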

Assignees

Abbyy Dev Inc

Inventors

Classifications

  • Extracting features based on a plurality of salient regional features, e.g. "bag of words" · CPC title

  • Salient features, e.g. scale invariant feature transforms [SIFT] · CPC title

  • Extracting features based on salient regional features, e.g. scale invariant feature transform [SIFT] keypoints · CPC title

  • Partitioning the feature space · CPC title

  • Matching criteria, e.g. proximity measures · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025329187A1 cover?
A method of detecting fields in document images includes: receiving, by a processing device, a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a respective keypoint region of a first set of document images; calculating, based on a second set of document images, for each visu…
Who is the assignee on this patent?
Abbyy Dev Inc
What technology area does this patent fall under?
Primary CPC classification G06V30/18152. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 23, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).