Image Table Extraction Method And Apparatus, Electronic Device, And Storgage Medium
US-2021390294-A1 · Dec 16, 2021 · US
US12400439B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12400439-B2 |
| Application number | US-202318165459-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 7, 2023 |
| Priority date | Mar 8, 2022 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
Disclosed are a table recognition method and apparatus. The table recognition method includes steps of obtaining an image vision feature and a character content feature of a table image; fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature, and carrying out recognition based on the first fusion feature to acquire a table structure; and performing, based on the table structure, character recognition on the table image to acquire table character contents.
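The character content feature named in the abstract is specified further in claim 1: recognized text vectors are filled into a zero matrix the size of the table image, at the locations where the text was found. The sketch below illustrates that filling step only, under stated assumptions. The function name `build_content_feature`, the `(top, left, bottom, right)` box layout, the explicit `dim` parameter, and the choice to write the same vector at every pixel of a region are all hypothetical; the claim does not prescribe them.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]
Feature = List[List[List[float]]]  # height x width x channels


def build_content_feature(
    height: int,
    width: int,
    dim: int,
    regions: Sequence[Tuple[Box, Sequence[float]]],
) -> Feature:
    """Fill text vectors into a zero matrix sized like the table image."""
    # Zero matrix with the same height and width as the table image.
    feature = [[[0.0] * dim for _ in range(width)] for _ in range(height)]
    for (top, left, bottom, right), vector in regions:
        if len(vector) != dim:
            raise ValueError("every text vector must have length dim")
        # Assumption: clip each region to the image and copy the vector
        # to every pixel it covers; pixels without text stay zero.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                feature[y][x] = list(vector)
    return feature


# Two recognized text regions on a 4 x 6 image with 2-dimensional vectors.
content = build_content_feature(
    4, 6, 2,
    [((0, 0, 2, 3), [1.0, 0.5]), ((2, 3, 4, 6), [0.2, 0.9])],
)
```

Because the result has the same height and width as the image, it can be aligned pixel by pixel with the first feature matrix before fusion.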
Opening claim text (preview).
What is claimed is:
1. A table recognition method comprising: obtaining an image vision feature and a character content feature of a table image; fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature, and carrying out recognition based on the first fusion feature to acquire a table structure; and performing, based on the table structure, character recognition on the table image to acquire table character contents, wherein the obtaining includes generating a first feature matrix serving as the image vision feature of the table image, the first feature matrix having a same dimension as the table image; recognizing character contents and location regions of the character contents in the table image to generate a vector representation corresponding to the character contents; and constructing a zero matrix whose dimension is same as the table image, and filling, based on the location regions of the character contents, the vector representation corresponding to the character contents into the zero matrix to acquire a second feature matrix serving as the character content feature of the table image.
2. The table recognition method in accordance with claim 1, wherein, the fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature includes inputting the image vision feature and the character content feature of the table image into a fully-connected layer of a neural network model to acquire the first fusion feature output from the fully-connected layer; performing stitching on the image vision feature and the character content feature of the table image to acquire the first fusion feature; or conducting weighted summation with respect to the image vision feature and the character content feature of the table image to acquire the first fusion feature.
3. The table recognition method in accordance with claim 1, wherein, the carrying out recognition based on the first fusion feature to acquire a table structure includes detecting, based on the first fusion feature of the table image, a location region of each cell in the table image; and recognizing, based on the location region of each cell in the table image, the table structure.
4. The table recognition method in accordance with claim 3, wherein, before recognizing, based on the location region of each cell in the table image, the table structure, the carrying out recognition based on the first fusion feature to acquire a table structure further includes building, based on the location region of each cell in the table image, a positional relationship network map of cells; and optimizing and adjusting, based on the positional relationship network map of cells, the location region of each cell.
5. The table recognition method in accordance with claim 4, wherein, the optimizing and adjusting, based on the positional relationship network map of cells, the location region of each cell includes inputting the positional relationship network map of cells into a pre-trained multi-task learning model to acquire the location region of each cell after optimization and adjustment, and wherein, the pre-trained multi-task learning model contains a classification task configured to determine whether a cell is deleted; and a coordinate regression task configured to perform adjustment on position coordinates of a cell.
6. The table recognition method in accordance with claim 1, wherein, performing, based on the table structure, character recognition on the table image to acquire table character contents includes extracting, based on location regions of cells in the table structure, a cell image corresponding to each cell from the table image to generate an image vision feature of each cell; fusing, for each cell, the image vision features of the cell and peripheral cells thereof to acquire a second fusion feature corresponding to the cell, the peripheral cells containing cells belonging to a row and/or a column of the cell; and inputting the second fusion feature corresponding to each cell into a pre-trained optical character recognition model to acquire the character content of the cell.
7. A non-transitory computer-readable medium having computer-executable instructions for execution by a processor, wherein, the computer-executable instructions cause, when executed by the processor, the processor to conduct the table recognition method in accordance with claim 1.
8. A table recognition apparatus comprising: a processor; and a storage storing computer-executable instructions, coupled to the processor, wherein the computer-executable instructions cause, when executed by the processor, the processor to: obtain an image vision feature and a character content feature of a table image; fuse the image vision feature and the character content feature of the table image to acquire a first fusion feature, and conduct recognition based on the first fusion feature to acquire a table structure; and perform, based on the table structure, character recognition on the table image to acquire table character contents, and in obtaining the image vision feature and the character content feature of the table image, the processor is caused to: generate a first feature matrix serving as the image vision feature of the table image, the first feature matrix having a same dimension as the table image: recognize character contents and location regions of the character contents in the table image to generate a vector representation corresponding to the character contents; and construct a zero matrix whose dimension is same as the table image, and fill, based on the location regions of the character contents, the vector representation corresponding to the character contents into the zero matrix to acquire a second feature matrix serving as the character content feature of the table image.
9. The table recognition apparatus in accordance with claim 8, wherein the processor is further caused to: input the image vision feature and the character content feature of the table image into a fully-connected layer of a neural network model to acquire the first fusion feature output from the fully-connected layer; perform stitching on the image vision feature and the character content feature of the table image to acquire the first fusion feature; or conduct weighted summation with respect to the image vision feature and the character content feature of the table image to acquire the first fusion feature.
10. The table recognition apparatus in accordance with claim 8, wherein the processor is further caused to: detect, based on the first fusion feature of the table image, a location region of each cell in the table image; and recognize, based on the location region of each cell in the table image, the table structure.
11. The table recognition apparatus in accordance with claim 8, wherein the processor is further caused to: extract, based on location regions of cells in the table structure, a cell image corresponding to each cell from the table image to generate an image vision feature of each cell; fuse, for each cell, the image vision features of the cell and peripheral cells thereof to acquire a second fusion feature corresponding to the cell, the peripheral cells containing cells belonging to a row and/or a column of the cell; and input the second fusion feature corresponding to each cell into a pre-trained optical character recognition model to acquire the character content of the cell.
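Claims 2 and 9 list three alternative ways to obtain the first fusion feature: a fully-connected layer, stitching, or weighted summation. The sketch below shows one plausible reading of each, operating on two feature matrices of equal height and width. It is an illustration under assumptions, not the patented implementation: the function names, the weight `alpha`, the reading of "stitching" as channel-wise concatenation, and the fully-connected layer as a per-pixel linear map over the stitched channels are all hypothetical choices, and the layer's weights would normally be learned rather than given by hand.

```python
from typing import List, Sequence

Feature = List[List[List[float]]]  # height x width x channels


def fuse_by_stitching(vision: Feature, content: Feature) -> Feature:
    """Concatenate the two features along the channel axis, pixel by pixel."""
    return [
        [list(v) + list(c) for v, c in zip(v_row, c_row)]
        for v_row, c_row in zip(vision, content)
    ]


def fuse_by_weighted_sum(
    vision: Feature, content: Feature, alpha: float = 0.5
) -> Feature:
    """Blend the features as alpha * vision + (1 - alpha) * content.

    Assumes both features have the same number of channels.
    """
    return [
        [
            [alpha * a + (1.0 - alpha) * b for a, b in zip(v, c)]
            for v, c in zip(v_row, c_row)
        ]
        for v_row, c_row in zip(vision, content)
    ]


def fuse_by_fully_connected(
    vision: Feature,
    content: Feature,
    weights: Sequence[Sequence[float]],  # out_channels x (vision + content channels)
    bias: Sequence[float],               # out_channels
) -> Feature:
    """Apply one linear layer to the stitched channels of every pixel."""
    stitched = fuse_by_stitching(vision, content)
    return [
        [
            [sum(w * x for w, x in zip(w_row, pixel)) + b
             for w_row, b in zip(weights, bias)]
            for pixel in row
        ]
        for row in stitched
    ]


# A 1 x 1 "image" with two channels per feature keeps the arithmetic visible.
vision = [[[1.0, 2.0]]]
content = [[[3.0, 4.0]]]
stitched = fuse_by_stitching(vision, content)
blended = fuse_by_weighted_sum(vision, content, alpha=0.25)
projected = fuse_by_fully_connected(
    vision, content,
    weights=[[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]],
    bias=[0.5, -0.5],
)
```

Stitching keeps both sources intact at the cost of more channels, weighted summation keeps the channel count but requires matching dimensions, and the fully-connected variant lets a model choose the mixing itself.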
Classification of content, e.g. text, photographs or tables · CPC title
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
using neural networks · CPC title
related to a structural representation of the pattern · CPC title
Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text · CPC title