Table recognition method and apparatus and non-transitory computer-readable medium

US12400439B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12400439-B2
Application number: US-202318165459-A
Country: US
Kind code: B2
Filing date: Feb 7, 2023
Priority date: Mar 8, 2022
Publication date: Aug 26, 2025
Grant date: Aug 26, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are a table recognition method and apparatus. The table recognition method includes steps of obtaining an image vision feature and a character content feature of a table image; fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature, and carrying out recognition based on the first fusion feature to acquire a table structure; and performing, based on the table structure, character recognition on the table image to acquire table character contents.
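
The fusion step can be illustrated with a minimal sketch. Claim 2 names three alternatives (a fully-connected layer, stitching, or weighted summation); the Python below shows only the latter two, applied to per-pixel feature vectors. The function names, the nested-list layout, and the weight `alpha` are assumptions made for illustration and do not come from the patent.

```python
# Illustrative only: each feature is an H x W grid of vectors stored as
# nested lists. Function names and the weight alpha are hypothetical.

def fuse_by_stitching(vision, content):
    """Concatenate the two feature vectors at every pixel position."""
    return [
        [v_vec + c_vec for v_vec, c_vec in zip(v_row, c_row)]
        for v_row, c_row in zip(vision, content)
    ]


def fuse_by_weighted_sum(vision, content, alpha=0.5):
    """Blend element-wise: alpha * vision + (1 - alpha) * content."""
    return [
        [
            [alpha * v + (1.0 - alpha) * c for v, c in zip(v_vec, c_vec)]
            for v_vec, c_vec in zip(v_row, c_row)
        ]
        for v_row, c_row in zip(vision, content)
    ]


# A 1 x 2 "image" with 2-dimensional features per pixel.
vision = [[[1.0, 2.0], [3.0, 4.0]]]
content = [[[0.0, 0.0], [1.0, 1.0]]]

stitched = fuse_by_stitching(vision, content)
blended = fuse_by_weighted_sum(vision, content, alpha=0.5)
```

Stitching doubles the per-pixel vector length, while weighted summation keeps it unchanged but requires both features to have the same vector length.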

First claim

Opening claim text (preview).

What is claimed is:

1. A table recognition method comprising: obtaining an image vision feature and a character content feature of a table image; fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature, and carrying out recognition based on the first fusion feature to acquire a table structure; and performing, based on the table structure, character recognition on the table image to acquire table character contents, wherein the obtaining includes generating a first feature matrix serving as the image vision feature of the table image, the first feature matrix having a same dimension as the table image; recognizing character contents and location regions of the character contents in the table image to generate a vector representation corresponding to the character contents; and constructing a zero matrix whose dimension is same as the table image, and filling, based on the location regions of the character contents, the vector representation corresponding to the character contents into the zero matrix to acquire a second feature matrix serving as the character content feature of the table image.

2. The table recognition method in accordance with claim 1, wherein, the fusing the image vision feature and the character content feature of the table image to acquire a first fusion feature includes inputting the image vision feature and the character content feature of the table image into a fully-connected layer of a neural network model to acquire the first fusion feature output from the fully-connected layer; performing stitching on the image vision feature and the character content feature of the table image to acquire the first fusion feature; or conducting weighted summation with respect to the image vision feature and the character content feature of the table image to acquire the first fusion feature.

3. The table recognition method in accordance with claim 1, wherein, the carrying out recognition based on the first fusion feature to acquire a table structure includes detecting, based on the first fusion feature of the table image, a location region of each cell in the table image; and recognizing, based on the location region of each cell in the table image, the table structure.

4. The table recognition method in accordance with claim 3, wherein, before recognizing, based on the location region of each cell in the table image, the table structure, the carrying out recognition based on the first fusion feature to acquire a table structure further includes building, based on the location region of each cell in the table image, a positional relationship network map of cells; and optimizing and adjusting, based on the positional relationship network map of cells, the location region of each cell.

5. The table recognition method in accordance with claim 4, wherein, the optimizing and adjusting, based on the positional relationship network map of cells, the location region of each cell includes inputting the positional relationship network map of cells into a pre-trained multi-task learning model to acquire the location region of each cell after optimization and adjustment, and wherein, the pre-trained multi-task learning model contains a classification task configured to determine whether a cell is deleted; and a coordinate regression task configured to perform adjustment on position coordinates of a cell.

6. The table recognition method in accordance with claim 1, wherein, performing, based on the table structure, character recognition on the table image to acquire table character contents includes extracting, based on location regions of cells in the table structure, a cell image corresponding to each cell from the table image to generate an image vision feature of each cell; fusing, for each cell, the image vision features of the cell and peripheral cells thereof to acquire a second fusion feature corresponding to the cell, the peripheral cells containing cells belonging to a row and/or a column of the cell; and inputting the second fusion feature corresponding to each cell into a pre-trained optical character recognition model to acquire the character content of the cell.

7. A non-transitory computer-readable medium having computer-executable instructions for execution by a processor, wherein, the computer-executable instructions cause, when executed by the processor, the processor to conduct the table recognition method in accordance with claim 1.

8. A table recognition apparatus comprising: a processor; and a storage storing computer-executable instructions, coupled to the processor, wherein the computer-executable instructions cause, when executed by the processor, the processor to: obtain an image vision feature and a character content feature of a table image; fuse the image vision feature and the character content feature of the table image to acquire a first fusion feature, and conduct recognition based on the first fusion feature to acquire a table structure; and perform, based on the table structure, character recognition on the table image to acquire table character contents, and in obtaining the image vision feature and the character content feature of the table image, the processor is caused to: generate a first feature matrix serving as the image vision feature of the table image, the first feature matrix having a same dimension as the table image; recognize character contents and location regions of the character contents in the table image to generate a vector representation corresponding to the character contents; and construct a zero matrix whose dimension is same as the table image, and fill, based on the location regions of the character contents, the vector representation corresponding to the character contents into the zero matrix to acquire a second feature matrix serving as the character content feature of the table image.

9. The table recognition apparatus in accordance with claim 8, wherein the processor is further caused to: input the image vision feature and the character content feature of the table image into a fully-connected layer of a neural network model to acquire the first fusion feature output from the fully-connected layer; perform stitching on the image vision feature and the character content feature of the table image to acquire the first fusion feature; or conduct weighted summation with respect to the image vision feature and the character content feature of the table image to acquire the first fusion feature.

10. The table recognition apparatus in accordance with claim 8, wherein the processor is further caused to: detect, based on the first fusion feature of the table image, a location region of each cell in the table image; and recognize, based on the location region of each cell in the table image, the table structure.

11. The table recognition apparatus in accordance with claim 8, wherein the processor is further caused to: extract, based on location regions of cells in the table structure, a cell image corresponding to each cell from the table image to generate an image vision feature of each cell; fuse, for each cell, the image vision features of the cell and peripheral cells thereof to acquire a second fusion feature corresponding to the cell, the peripheral cells containing cells belonging to a row and/or a column of the cell; and input the second fusion feature corresponding to each cell into a pre-trained optical character recognition model to acquire the character content of the cell.
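
The character content feature recited in claims 1 and 8 (a zero matrix matching the table image, filled with text vectors at the recognized text locations) might look roughly like the sketch below. This is an illustration under stated assumptions, not the patented implementation: the function name `build_content_feature`, the box layout `(top, left, bottom, right)` with exclusive bottom/right bounds, the per-pixel vector depth `dim`, and the example vectors are all hypothetical.

```python
def build_content_feature(height, width, dim, regions):
    """Return an H x W grid of dim-length vectors.

    Every position starts as a zero vector. Positions inside a recognized
    text region are overwritten with that text's vector representation.
    regions: list of ((top, left, bottom, right), vector); bottom and right
    are exclusive. This layout is an assumption made for the sketch.
    """
    grid = [[[0.0] * dim for _ in range(width)] for _ in range(height)]
    for (top, left, bottom, right), vector in regions:
        if len(vector) != dim:
            raise ValueError("vector length must equal dim")
        # Clip the box to the image so out-of-range boxes cannot fail.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                grid[y][x] = list(vector)
    return grid


# Two hypothetical text regions on a 2 x 3 image, 2-dimensional text vectors.
regions = [
    ((0, 0, 1, 2), [0.1, 0.9]),  # covers row 0, columns 0-1
    ((1, 2, 2, 3), [0.7, 0.3]),  # covers row 1, column 2
]
feature = build_content_feature(2, 3, 2, regions)
```

Because the resulting grid has the same height and width as the image, it lines up position by position with the image vision feature, which is what allows the two to be fused.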

Classifications

  • Classification of content, e.g. text, photographs or tables

  • Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods

  • using neural networks

  • related to a structural representation of the pattern

  • Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12400439B2 cover?
It covers a table recognition method and apparatus that obtain an image vision feature and a character content feature of a table image, fuse them into a first fusion feature, recognize the table structure from that fused feature, and then perform character recognition on the table image based on the recognized structure.
Who is the assignee on this patent?
Ding Lei, Dong Bin, Jiang Shanshan, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06V10/806. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 26, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.