Automatic key/value pair extraction from document images using deep learning
US-10896357-B1 · Jan 19, 2021 · US
US12424007B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12424007-B2 |
| Application number | US-202217814856-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 26, 2022 |
| Priority date | Jul 26, 2022 |
| Publication date | Sep 23, 2025 |
| Grant date | Sep 23, 2025 |
Official abstract text for this publication.
A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of confidence in a size of the segmented text block based, at least in part, on comparing semantic entities associated with the segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node included in the graph and representative of the first type of text block segmentation pattern; and determining that the size of the segmented text block is non-optimal based on the calculated degree of confidence in the size of the segmented text block being below a predetermined threshold.
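The confidence check described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the graph is modeled as a plain dictionary, the pattern and entity names are hypothetical, and the confidence metric (Jaccard overlap) and the 0.8 threshold are chosen for illustration because the abstract does not specify a formula or a threshold value.

```python
from typing import Dict, Set

# Hypothetical pattern graph. Each key stands for a non-leaf node (a type of
# text block segmentation pattern); each value holds the leaf nodes stemming
# from it (the semantic entities expected for that pattern).
PATTERN_GRAPH: Dict[str, Set[str]] = {
    "line": {"name"},
    "paragraph": {"name", "street", "city"},
    "section": {"name", "street", "city", "postal_code", "country"},
}


def size_confidence(block_entities: Set[str], pattern: str,
                    graph: Dict[str, Set[str]] = PATTERN_GRAPH) -> float:
    """Compare a block's entities with the pattern's leaf-node entities.

    Assumed metric: Jaccard overlap, which drops both when the block is
    missing expected entities and when it contains unexpected extras.
    """
    leaves = graph[pattern]
    union = block_entities | leaves
    if not union:
        return 0.0
    return len(block_entities & leaves) / len(union)


def is_non_optimal(confidence: float, threshold: float = 0.8) -> bool:
    """The block size is treated as non-optimal below the threshold."""
    return confidence < threshold


# Usage: a block segmented as a "paragraph" that lacks the "city" entity.
conf = size_confidence({"name", "street"}, "paragraph")
needs_resegmentation = is_non_optimal(conf)
```

Any metric that compares the two entity sets would fit the wording of the abstract; Jaccard overlap is used here only because it is symmetric with respect to missing and surplus entities.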
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for text block segmentation, comprising: receiving a digital image of a text document; determining a first text block segmentation pattern utilized to generate a digital segmented text block for the digital image based, at least in part, on a comparison of semantic information associated with the digital segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph, wherein non-leaf nodes of the graph represent types of text block segmentation pattern and leaf nodes stemming from the non-leaf nodes represent associated semantic entities; calculating a first degree of confidence in a size of the digital segmented text block for the digital image based, at least in part, on comparing semantic entities associated with the digital segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node representative of the first type of text block segmentation pattern; determining that the size of the digital segmented text block for the digital image is non-optimal based on the calculated degree of confidence in the size of the digital segmented text block being below a predetermined threshold; responsive to determining that the size of the digital segmented text block for the digital image is non-optimal, altering the size of the digital segmented text block for the digital image based, at least in part, on re-segmenting the digital segmented text block for the digital image; and performing text extraction based on the re-segmented digital text block for the digital image.
2. The computer-implemented method of claim 1, further comprising: calculating a second degree of confidence in a size of the re-segmented text block; and replacing the segmented text block with the re-segmented text block in response to the second degree of confidence in the size of the re-segmented text block being greater than the first degree of confidence in the size of the segmented text block.
3. The computer-implemented method of claim 1, wherein re-segmenting the segmented text block further includes: determining to increase the size of the segmented text block if a number of semantic entities included in the segmented text block is less than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern; and determining to decrease the size of the segmented text block if a number of semantic entities included in the segmented text block is greater than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern.
4. The computer-implemented method of claim 3, further comprising: responsive to determining to increase the size of the segmented text block, selecting a second non-leaf node representative of a second type of text block segmentation pattern, wherein the second non-leaf node is located subsequent to the first non-leaf node in the graph; and re-segmenting the segmented text block utilizing the second type of text block segmentation pattern to increase the size of the segmented text block.
5. The computer-implemented method of claim 3, further comprising: responsive to determining to decrease the size of the segmented text block, selecting a third non-leaf node representative of a third type of text block segmentation pattern, wherein the third non-leaf node is located prior to the first non-leaf node in the graph; and re-segmenting the segmented text block utilizing the third type of text block segmentation pattern to decrease the size of the segmented text block.
6. The computer-implemented method of claim 1, further comprising: generating Key-Value Pairs (KVPs) from the semantic information associated with the segmented text block, wherein a key of a KVP corresponds to a primary semantic category, and values of the KVP correspond to respective semantic entities associated with the primary semantic category.
7. The computer-implemented method of claim 6, wherein determining the first text block segmentation pattern from a plurality of predefined types of text block segmentation patterns utilized to generate the segmented text block is further based on: identifying leaf nodes in the graph that match respective values of the KVPs associated with the segmented text block; identifying non-leaf nodes connected to the leaf nodes in the graph that match the respective values of the KVPs associated with the segmented text block; and selecting the non-leaf node having a highest percentage of connected leaf nodes that match the respective values of the KVPs associated with the segmented text block.
8. A computer program product for text block segmentation, the computer program product including one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions being executable by one or more computer processors to: receive a digital image of a text document; determine a first text block segmentation pattern utilized to generate a digital segmented text block for the digital image based, at least in part, on a comparison of semantic information associated with the digital segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph, wherein non-leaf nodes of the graph represent types of text block segmentation pattern and leaf nodes stemming from the non-leaf nodes represent associated semantic entities; calculate a first degree of confidence in a size of the digital segmented text block for the digital image based, at least in part, on comparing semantic entities associated with the digital segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node representative of the first type of text block segmentation pattern; determine that the size of the digital segmented text block for the digital image is non-optimal based on the calculated degree of confidence in the size of the digital segmented text block being below a predetermined threshold; responsive to determining that the size of the digital segmented text block for the digital image is non-optimal, alter the size of the digital segmented text block for the digital image based, at least in part, on re-segmenting the digital segmented text block for the digital image; and perform text extraction based on the re-segmented digital text block for the digital image.
9. The computer program product of claim 8, further comprising program instructions to: calculate a second degree of confidence in a size of the re-segmented text block; and replace the segmented text block with the re-segmented text block in response to the second degree of confidence in the size of the re-segmented text block being greater than the first degree of confidence in the size of the segmented text block.
10. The computer program product of claim 8, wherein the program instructions to re-segment the segmented text block further include instructions to: determine to increase the size of the segmented text block if a number of semantic entities included in the segmented text block is less than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern; and determine to decrease the size of the segmented text block if a number of semantic entities included in the segmented text block is greater than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern.
11. The computer prog
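The pattern selection of claim 7, the resize decisions of claims 3 to 5, and the replacement rule of claim 2 can be sketched together as below. This is an illustrative sketch only: the pattern names, entity names, and KVP contents are hypothetical, the graph is flattened into an ordered list plus a dictionary, and the ordering of non-leaf nodes from smaller to larger blocks is an assumption about what "subsequent" and "prior" mean in the graph.

```python
from typing import Dict, List, Optional, Set

# Hypothetical non-leaf nodes, assumed to be ordered from the smallest block
# type to the largest, so a "subsequent" node yields a larger block.
PATTERNS: List[str] = ["recipient_line", "address_block", "shipping_label"]

# Hypothetical leaf nodes (semantic entities) stemming from each non-leaf node.
LEAVES: Dict[str, Set[str]] = {
    "recipient_line": {"first_name", "last_name"},
    "address_block": {"street", "city", "postal_code", "country"},
    "shipping_label": {"sender", "recipient", "address", "tracking_id", "weight"},
}


def select_pattern(kvps: Dict[str, List[str]]) -> str:
    """Claim 7: pick the non-leaf node with the highest percentage of
    connected leaf nodes that match the KVP values (ties go to the first)."""
    values = {v for vals in kvps.values() for v in vals}
    return max(PATTERNS, key=lambda p: len(LEAVES[p] & values) / len(LEAVES[p]))


def resize_direction(entity_count: int, pattern: str) -> str:
    """Claim 3: grow the block if it holds fewer entities than the pattern
    has leaf nodes, shrink it if it holds more."""
    leaf_count = len(LEAVES[pattern])
    if entity_count < leaf_count:
        return "increase"
    if entity_count > leaf_count:
        return "decrease"
    return "keep"


def next_pattern(pattern: str, direction: str) -> Optional[str]:
    """Claims 4 and 5: move to the subsequent node to increase the size, or
    to the prior node to decrease it. Returns None at either end."""
    step = {"increase": 1, "decrease": -1}.get(direction, 0)
    index = PATTERNS.index(pattern) + step
    if step == 0 or not 0 <= index < len(PATTERNS):
        return None
    return PATTERNS[index]


def accept_resegmentation(first_conf: float, second_conf: float) -> bool:
    """Claim 2: keep the re-segmented block only if its confidence is higher."""
    return second_conf > first_conf


# Usage: a KVP whose key is a primary category and whose values are entities.
kvps = {"address": ["street", "city"]}
pattern = select_pattern(kvps)
direction = resize_direction(2, pattern)
candidate = next_pattern(pattern, direction)
```

In this example the block matches two of the four `address_block` leaves, so the sketch would try the next larger pattern; whether the result is kept then depends on the second confidence value exceeding the first.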
Graphical representation, e.g. directed attributed graph · CPC title
Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text · CPC title
Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors · CPC title
using recognition of characters or words · CPC title
Postal images, e.g. labels or addresses on parcels or postal envelopes · CPC title