Three-dimensional shape expression method and device thereof
US-2020027215-A1 · Jan 23, 2020 · US
US12354397B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12354397-B2 |
| Application number | US-202318502343-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 6, 2023 |
| Priority date | Jul 21, 2021 |
| Publication date | Jul 8, 2025 |
| Grant date | Jul 8, 2025 |
Official abstract text for this publication.
A method of detecting fields in document images includes: receiving a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors; calculating, based on a set of user labeled document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified labeled field with respect to the visual word; loading a document image for extraction of target fields; calculating a statistical predicate of a possible position of a target field in the document image based on the frequency distributions; and detecting, using the trained model, fields in the document image based on the calculated statistical predicate.
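The flow described in the abstract can be illustrated with a minimal sketch. It assumes that keypoints have already been assigned to visual words (given as `(word_id, x, y)` tuples), that each labeled document carries a single field position, and that positions are quantized into square cells. The function names, the `cell` parameter, and the equal-weight voting are illustrative assumptions, not details taken from the patent.

```python
from collections import Counter, defaultdict


def train_shift_histograms(labeled_docs, cell=10):
    """Build, per visual word, a histogram of field shifts.

    labeled_docs: list of (keypoints, (field_x, field_y)) pairs, where
    keypoints is a list of (word_id, x, y) tuples (hypothetical format).
    Shifts are measured in whole cells of `cell` pixels.
    """
    hists = defaultdict(Counter)
    for keypoints, (fx, fy) in labeled_docs:
        field_cell = (fx // cell, fy // cell)
        for word_id, x, y in keypoints:
            shift = (field_cell[0] - x // cell, field_cell[1] - y // cell)
            hists[word_id][shift] += 1
    return hists


def predict_field_position(keypoints, hists, cell=10):
    """Accumulate normalized votes from every known visual word.

    Returns the top-left pixel of the cell with the highest accumulated
    vote, or None if no keypoint matches a trained visual word.
    """
    votes = Counter()
    for word_id, x, y in keypoints:
        hist = hists.get(word_id)
        if not hist:
            continue  # visual word never seen with this field
        total = sum(hist.values())
        for (dx, dy), count in hist.items():
            # Each word contributes its relative shift frequency.
            votes[(x // cell + dx, y // cell + dy)] += count / total
    if not votes:
        return None
    best_cell = max(votes, key=votes.get)
    return (best_cell[0] * cell, best_cell[1] * cell)


# Two labeled documents with the same layout, offset by (20, 10) pixels.
train = [
    ([(0, 100, 50), (1, 200, 300)], (300, 80)),
    ([(0, 120, 60), (1, 220, 310)], (320, 90)),
]
hists = train_shift_histograms(train)

# A new document with the same layout, offset by (40, 30) pixels.
new_doc = [(0, 140, 80), (1, 240, 330), (7, 5, 5)]
predicted = predict_field_position(new_doc, hists)
```

In this toy setup both known visual words vote for the same cell, so the accumulated vote map peaks at the field's shifted position, while the unknown word `7` is ignored.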
Opening claim text (preview).
What is claimed is:
1. A method, comprising: receiving, by a processing device, a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a keypoint region of a first set of document images; calculating, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; loading a document image for extraction of target fields; calculating a statistical predicate of a possible position of a target field in the document image based on the frequency distributions; and detecting fields in the document image based on the calculated statistical predicate.
2. The method of claim 1, wherein the codebook is optimized on a third set of document images.
3. The method of claim 1, wherein calculating the respective frequency distribution comprises calculating an integral two-dimensional histogram of shift of a position of the specified field, and wherein the integral two-dimensional histogram incorporates a plurality of shifts relative to possible positions of each visual word.
4. The method of claim 1, wherein calculating the statistical predicate further comprises: obtaining an accumulated distribution histogram based on possible positions of the target field with respect to two or more visual words of the set of visual words.
5. The method of claim 1, wherein a plurality of document images of the second set of document images have a similar layout.
6. The method of claim 1, further comprising: dividing the second set of document images into groups based on document similarity prior to at least one of: training a model or using the model.
7. The method of claim 1, wherein the statistical predicate is represented by a linear combination of individual predicates corresponding to a plurality of visual words detected in the document image.
8. A system, comprising: a memory; and a processing device coupled to the memory, the processing device configured to: receive a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a keypoint region of a first set of document images; calculate, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; load a document image for extraction of target fields; calculate a statistical predicate of a possible position of a target field in the document image based on the frequency distributions; and detect fields in the document image based on the calculated statistical predicate.
9. The system of claim 8, wherein the codebook is optimized on a third set of document images.
10. The system of claim 8, wherein calculating the respective frequency distribution comprises calculating an integral two-dimensional histogram of shift of a position of the specified field, and wherein the integral two-dimensional histogram incorporates a plurality of shifts relative to possible positions of each visual word.
11. The system of claim 8, wherein calculating the statistical predicate further comprises: obtaining an accumulated distribution histogram based on possible positions of the target field with respect to two or more visual words of the set of visual words.
12. The system of claim 8, wherein a plurality of document images of the second set of document images have a similar layout.
13. The system of claim 8, wherein the processing device is further configured to: dividing the second set of document images into groups based on document similarity prior to at least one of: training a model or using the model.
14. The system of claim 8, wherein the statistical predicate is represented by a linear combination of individual predicates corresponding to a plurality of visual words detected in the document image.
15. A non-transitory computer-readable storage medium comprising executable instructions that, when executed by a processing device, cause the processing device to: receive a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a keypoint region of a first set of document images; calculate, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; load a document image for extraction of target fields; calculate a statistical predicate of a possible position of a target field in the document image based on the frequency distributions; and detect fields in the document image based on the calculated statistical predicate.
16. The non-transitory computer-readable storage medium of claim 15, wherein the codebook is optimized on a third set of document images.
17. The non-transitory computer-readable storage medium of claim 15, wherein calculating the respective frequency distribution comprises calculating an integral two-dimensional histogram of shift of a position of the specified field, and wherein the integral two-dimensional histogram incorporates a plurality of shifts relative to possible positions of each visual word.
18. The non-transitory computer-readable storage medium of claim 15, wherein calculating the statistical predicate further comprises: obtaining an accumulated distribution histogram based on possible positions of the target field with respect to two or more visual words of the set of visual words.
19. The non-transitory computer-readable storage medium of claim 15, wherein a plurality of document images of the second set of document images have a similar layout.
20. The non-transitory computer-readable storage medium of claim 15, further comprising: dividing the second set of document images into groups based on document similarity prior to at least one of: training a model or using the model.
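Claims 3, 10, and 17 refer to an "integral two-dimensional histogram" of shifts without defining the term in this excerpt. One plausible reading, assumed here purely for illustration, is a summed-area table over a 2D shift histogram, which allows the number of observed shifts inside any rectangular region to be read off in constant time. The function names and the toy histogram below are hypothetical.

```python
def integral_histogram(hist):
    """Return a summed-area table for a 2D histogram (list of rows).

    The result has one extra row and column of zeros so that
    integral[r][c] equals the sum of hist[0:r][0:c].
    """
    rows, cols = len(hist), len(hist[0])
    integral = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += hist[r][c]
            integral[r + 1][c + 1] = integral[r][c + 1] + row_sum
    return integral


def rect_sum(integral, r0, c0, r1, c1):
    """Sum of hist over rows r0..r1-1 and columns c0..c1-1 (half-open)."""
    return (integral[r1][c1] - integral[r0][c1]
            - integral[r1][c0] + integral[r0][c0])


# Toy shift histogram: counts of observed (row, column) shifts.
shift_hist = [
    [1, 0, 2],
    [0, 3, 0],
    [4, 0, 1],
]
table = integral_histogram(shift_hist)
total_shifts = rect_sum(table, 0, 0, 3, 3)   # whole histogram
lower_right = rect_sum(table, 1, 1, 3, 3)    # bottom-right 2x2 block
```

Under this reading, the table is built once per visual word, and the frequency of shifts falling inside any candidate field rectangle then costs four lookups.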
Extracting features based on a plurality of salient regional features, e.g. "bag of words" · CPC title
Salient features, e.g. scale invariant feature transforms [SIFT] · CPC title
Extracting features based on salient regional features, e.g. scale invariant feature transform [SIFT] keypoints · CPC title
Partitioning the feature space · CPC title
Matching criteria, e.g. proximity measures · CPC title