Text block segmentation

US12424007B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12424007-B2
Application number: US-202217814856-A
Country: US
Kind code: B2
Filing date: Jul 26, 2022
Priority date: Jul 26, 2022
Publication date: Sep 23, 2025
Grant date: Sep 23, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of confidence in a size of the segmented text block based, at least in part, on comparing semantic entities associated with the segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node included in the graph and representative of the first type of text block segmentation pattern; and determining that the size of the segmented text block is non-optimal based on the calculated degree of confidence in the size of the segmented text block being below a predetermined threshold.
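
The confidence check described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the graph contents (`PATTERN_GRAPH`), the use of a Jaccard overlap as the confidence measure, and the 0.8 threshold. The abstract says only that the confidence comes from comparing the block's semantic entities with the leaf-node entities, and that a value below a predetermined threshold marks the block size as non-optimal.

```python
# Hypothetical graph: each non-leaf node (a segmentation pattern type) maps to
# the semantic entities held by its leaf nodes. Names are illustrative only.
PATTERN_GRAPH = {
    "line": {"name"},
    "paragraph": {"name", "street", "city"},
    "page": {"name", "street", "city", "postal_code", "country"},
}


def size_confidence(block_entities, pattern, graph=PATTERN_GRAPH):
    """Assumed metric: Jaccard overlap of block entities and leaf-node entities."""
    expected = graph[pattern]
    found = set(block_entities)
    union = expected | found
    if not union:
        return 0.0
    return len(expected & found) / len(union)


def is_size_non_optimal(block_entities, pattern, threshold=0.8):
    """Flag the block size when confidence falls below the (assumed) threshold."""
    return size_confidence(block_entities, pattern) < threshold
```

Under these assumptions, a block segmented as a `paragraph` that contains only `name` and `street` would be flagged, while one that contains all three expected entities would not. The overlap measure was chosen because it lowers the score both when entities are missing and when unexpected ones are present; the patent may compute confidence differently.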

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for text block segmentation, comprising: receiving a digital image of a text document; determining a first text block segmentation pattern utilized to generate a digital segmented text block for the digital image based, at least in part, on a comparison of semantic information associated with the digital segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph, wherein non-leaf nodes of the graph represent types of text block segmentation pattern and leaf nodes stemming from the non-leaf nodes represent associated semantic entities; calculating a first degree of confidence in a size of the digital segmented text block for the digital image based, at least in part, on comparing semantic entities associated with the digital segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node representative of the first type of text block segmentation pattern; determining that the size of the digital segmented text block for the digital image is non-optimal based on the calculated degree of confidence in the size of the digital segmented text block being below a predetermined threshold; responsive to determining that the size of the digital segmented text block for the digital image is non-optimal, altering the size of the digital segmented text block for the digital image based, at least in part, on re-segmenting the digital segmented text block for the digital image; and performing text extraction based on the re-segmented digital text block for the digital image.

2. The computer-implemented method of claim 1, further comprising: calculating a second degree of confidence in a size of the re-segmented text block; and replacing the segmented text block with the re-segmented text block in response to the second degree of confidence in the size of the re-segmented text block being greater than the first degree of confidence in the size of the segmented text block.

3. The computer-implemented method of claim 1, wherein re-segmenting the segmented text block further includes: determining to increase the size of the segmented text block if a number of semantic entities included in the segmented text block is less than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern; and determining to decrease the size of the segmented text block if a number of semantic entities included in the segmented text block is greater than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern.

4. The computer-implemented method of claim 3, further comprising: responsive to determining to increase the size of the segmented text block, selecting a second non-leaf node representative of a second type of text block segmentation pattern, wherein the second non-leaf node is located subsequent to the first non-leaf node in the graph; and re-segmenting the segmented text block utilizing the second type of text block segmentation pattern to increase the size of the segmented text block.

5. The computer-implemented method of claim 3, further comprising: responsive to determining to decrease the size of the segmented text block, selecting a third non-leaf node representative of a third type of text block segmentation pattern, wherein the third non-leaf node is located prior to the first non-leaf node in the graph; and re-segmenting the segmented text block utilizing the third type of text block segmentation pattern to decrease the size of the segmented text block.

6. The computer-implemented method of claim 1, further comprising: generating Key-Value Pairs (KVPs) from the semantic information associated with the segmented text block, wherein a key of a KVP corresponds to a primary semantic category, and values of the KVP correspond to respective semantic entities associated with the primary semantic category.

7. The computer-implemented method of claim 6, wherein determining the first text block segmentation pattern from a plurality of predefined types of text block segmentation patterns utilized to generate the segmented text block is further based on: identifying leaf nodes in the graph that match respective values of the KVPs associated with the segmented text block; identifying non-leaf nodes connected to the leaf nodes in the graph that match the respective values of the KVPs associated with the segmented text block; and selecting the non-leaf node having a highest percentage of connected leaf nodes that match the respective values of the KVPs associated with the segmented text block.

8. A computer program product for text block segmentation, the computer program product including one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions being executable by one or more computer processors to: receive a digital image of a text document; determine a first text block segmentation pattern utilized to generate a digital segmented text block for the digital image based, at least in part, on a comparison of semantic information associated with the digital segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph, wherein non-leaf nodes of the graph represent types of text block segmentation pattern and leaf nodes stemming from the non-leaf nodes represent associated semantic entities; calculate a first degree of confidence in a size of the digital segmented text block for the digital image based, at least in part, on comparing semantic entities associated with the digital segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node representative of the first type of text block segmentation pattern; determine that the size of the digital segmented text block for the digital image is non-optimal based on the calculated degree of confidence in the size of the digital segmented text block being below a predetermined threshold; responsive to determining that the size of the digital segmented text block for the digital image is non-optimal, alter the size of the digital segmented text block for the digital image based, at least in part, on re-segmenting the digital segmented text block for the digital image; and perform text extraction based on the re-segmented digital text block for the digital image.

9. The computer program product of claim 8, further comprising program instructions to: calculate a second degree of confidence in a size of the re-segmented text block; and replace the segmented text block with the re-segmented text block in response to the second degree of confidence in the size of the re-segmented text block being greater than the first degree of confidence in the size of the segmented text block.

10. The computer program product of claim 8, wherein the program instructions to re-segment the segmented text block further include instructions to: determine to increase the size of the segmented text block if a number of semantic entities included in the segmented text block is less than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern; and determine to decrease the size of the segmented text block if a number of semantic entities included in the segmented text block is greater than a number of leaf nodes stemming from the first non-leaf node representative of the first type of text block segmentation pattern.

11. The computer prog
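
The dependent method claims (2 through 7) describe a selection-and-resize loop that is easier to follow as code. The sketch below is a hedged illustration, not the patented implementation: the `PATTERNS` list, its entity names, the assumption that graph order runs from smallest to largest block size, and all function names are hypothetical.

```python
# Hypothetical graph, assumed to be ordered from smallest to largest block size.
# Each non-leaf node (pattern type) is paired with its leaf-node entities.
PATTERNS = [
    ("line", {"invoice_number", "date"}),
    ("paragraph", {"name", "street", "city"}),
    ("page", {"name", "street", "city", "invoice_number", "date", "total"}),
]


def select_pattern(kvps):
    """Claim 7 sketch: index of the pattern with the highest share of matched leaves."""
    # Claim 6: each key is a primary category, each value list holds its entities.
    values = {value for entities in kvps.values() for value in entities}

    def matched_share(index):
        leaves = PATTERNS[index][1]
        return len(leaves & values) / len(leaves)

    return max(range(len(PATTERNS)), key=matched_share)


def resize_direction(entity_count, index):
    """Claim 3 sketch: compare the entity count with the pattern's leaf-node count."""
    leaf_count = len(PATTERNS[index][1])
    if entity_count < leaf_count:
        return "increase"
    if entity_count > leaf_count:
        return "decrease"
    return "keep"


def next_pattern(index, direction):
    """Claims 4 and 5 sketch: step to the subsequent or prior non-leaf node."""
    step = {"increase": 1, "decrease": -1}.get(direction, 0)
    candidate = index + step
    if step == 0 or not 0 <= candidate < len(PATTERNS):
        return None  # assumption: the claims do not say what happens at the ends
    return candidate


def keep_better(first_conf, second_conf):
    """Claim 2 sketch: replace only if the re-segmented block scores strictly higher."""
    return "re-segmented" if second_conf > first_conf else "original"
```

For example, with `kvps = {"address": ["street", "city"]}`, `select_pattern` picks the `paragraph` node (two of its three leaves match). Because the block holds two entities against three leaf nodes, `resize_direction(2, 1)` returns `"increase"`, and `next_pattern(1, "increase")` moves to the `page` node for re-segmentation.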

Assignees

Inventors

Classifications

  • Graphical representation, e.g. directed attributed graph · CPC title

  • Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text · CPC title

  • Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors · CPC title

  • G06V30/153 · Primary

    using recognition of characters or words · CPC title

  • G06V30/424 · Primary

    Postal images, e.g. labels or addresses on parcels or postal envelopes · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12424007B2 cover?
A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of conf…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06V30/153. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 23, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).