Document decomposition based on determined logical visual layering of document content
US-2024403543-A1 · Dec 5, 2024 · US
US2025068831A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025068831-A1 |
| Application number | US-202218725598-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 31, 2022 |
| Priority date | Apr 11, 2022 |
| Publication date | Feb 27, 2025 |
| Grant date | — |
Official abstract text for this publication.
A text error correction method and apparatus, and an electronic device and a medium. The text error correction method includes: performing image encoding on an acquired image to be analyzed, so as to obtain image features (S101); performing text encoding on acquired noisy text, so as to obtain text features (S102); performing feature comparison on the image features and the text features according to a set attention mechanism, so as to obtain an error correction signal (S103); and predicting an initial text label according to the error correction signal by using a trained decoder, so as to obtain error-corrected text information (S104).
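The four steps S101 to S104 can be pictured as a small data-flow sketch. Everything below is a toy stand-in chosen for illustration and is not taken from the publication: the "image" is assumed to be an already-recognised glyph sequence, features are one-hot character and position vectors, the comparison is plain dot-product attention with a hypothetical `sharpness` factor, and the "trained decoder" is replaced by an argmax. The function names (`encode_image`, `encode_text`, `compare`, `decode`, `correct`) are likewise hypothetical.

```python
import math

VOCAB = "abcdefghijklmnopqrstuvwxyz"

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def embed(chars, max_len):
    # One feature vector per character: character one-hot followed by position one-hot.
    return [one_hot(VOCAB.index(c), len(VOCAB)) + one_hot(i, max_len)
            for i, c in enumerate(chars)]

def encode_image(glyphs, max_len):
    # S101 stand-in (assumption): a real system would run a visual encoder on pixels.
    return embed(glyphs, max_len)

def encode_text(noisy_text, max_len):
    # S102 stand-in: yields exactly one feature vector per character of the noisy text.
    return embed(noisy_text, max_len)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def compare(image_feats, text_feats, sharpness=4.0):
    # S103 stand-in: each text feature attends over all image features, and the
    # attention-weighted mix of image features serves as the correction signal.
    signal = []
    for t in text_feats:
        scores = [sharpness * sum(a * b for a, b in zip(t, v)) for v in image_feats]
        weights = softmax(scores)
        signal.append([sum(w * v[k] for w, v in zip(weights, image_feats))
                       for k in range(len(t))])
    return signal

def decode(signal):
    # S104 stand-in (assumption): argmax over character slots, not a trained decoder.
    out = []
    for vec in signal:
        char_part = vec[:len(VOCAB)]
        out.append(VOCAB[char_part.index(max(char_part))])
    return "".join(out)

def correct(glyphs, noisy_text):
    max_len = max(len(glyphs), len(noisy_text))
    image_feats = encode_image(glyphs, max_len)
    text_feats = encode_text(noisy_text, max_len)
    return decode(compare(image_feats, text_feats))

print(correct("cat", "cst"))  # prints "cat"
```

In this toy, the misrecognised character at position 1 shares only its position with the image features, so attention concentrates on the image feature at that position and the decoded character follows the image rather than the noisy text.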
Opening claim text (preview).
1. A text error correction method, comprising: performing image encoding on an acquired image to be analyzed, so as to obtain image features; performing text encoding on acquired noisy text, so as to obtain text features; performing feature comparison on the image features and the text features according to a set attention mechanism, so as to obtain an error correction signal; and predicting an initial text label according to the error correction signal by using a trained decoder, so as to obtain error-corrected text information.

2. The method according to claim 1, wherein a number of the text features is the same as a number of characters comprised in the noisy text.

3. The method according to claim 1, wherein the attention mechanism comprises a self-attention mechanism and a cross-attention mechanism; and the performing feature comparison on the image features and the text features according to a set attention mechanism, so as to obtain an error correction signal comprises: performing association analysis on the image features and the text features according to the self-attention mechanism, so as to obtain alignment features; and analyzing the alignment features and the text features according to the self-attention mechanism and the cross-attention mechanism, so as to obtain the error correction signal.

4. The method according to claim 3, wherein the alignment features comprise correspondence relationships between the image features and the text features.

5. The method according to claim 3, wherein the self-attention mechanism comprises a self-attention layer, a layer normalization module, and an adding module.

6. The method according to claim 3, wherein the performing association analysis on the image features and the text features according to the self-attention mechanism, so as to obtain alignment features comprises: splicing the image features with the text features, inputting spliced image features and text features to the self-attention mechanism for encoding, so as to obtain the alignment features output by the self-attention mechanism.

7. The method according to claim 3, wherein the performing association analysis on the image features and the text features according to the self-attention mechanism, so as to obtain alignment features comprises: determining self-attention vectors of the image features and the text features; and performing layer normalization and adding processing on the self-attention vectors, so as to obtain the alignment features.

8. The method according to claim 7, wherein the self-attention vectors comprise associated features between each dimension of feature of the image features and each dimension of feature of the text features.

9. The method according to claim 8, wherein the determining self-attention vectors of the image features and the text features comprises: determining the self-attention vectors of the image features and the text features according to the following formulas:

$$\mathrm{attention}(f) = \mathrm{softmax}\left(\frac{(W_q \cdot f)^{T} \times (W_k \cdot f)}{\mathrm{size}(f)}\right) \times (W_v \cdot f);$$

wherein

$$\mathrm{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}},$$

$f$ represents the spliced image features and text features; and $W_q$, $W_k$, and $W_v$ are model parameters obtained by model training.

10. The method according to claim 3, wherein the analyzing the alignment features and the text features according to the self-attention mechanism and the cross-attention mechanism, so as to obtain the error correction signal comprises: performing attention analysis on the alignment features according to the self-attention mechanism, so as to obtain self-attention features of the alignment features; performing attention analysis on the text features according to the self-attention mechanism, so as to obtain self-attention features of the text features; determining cross-attention vectors between the self-attention features of the alignment features and the self-attention features of the text features; and performing layer normalization, adding, and error correction processing on the cross-attention vectors, so as to obtain the error correction signal.

11. The method according to claim 10, wherein the error correction processing is achieved based on superimposition of a plurality of error correction layers.

12. The method according to claim 1
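The attention steps recited in claims 6, 7, 9, and 10 can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: features are stored as rows (n features by d dimensions), `size(f)` is read as the feature dimension d, the "adding" step is read as a residual addition applied before layer normalization, text features are used as the queries in the cross-attention step, and the stacked error correction layers of claim 11 are omitted. All function names are hypothetical, and the weight matrices would come from training rather than being identity matrices.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(x - mx) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention(query_src, context_src, w_q, w_k, w_v):
    # softmax(Q K^T / scale) V, with rows as features (assumed convention).
    q = matmul(query_src, w_q)
    k = matmul(context_src, w_k)
    v = matmul(context_src, w_v)
    scale = len(query_src[0])  # assumption: size(f) read as the feature dimension
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)

def add_and_norm(x, y, eps=1e-5):
    # Assumed order: residual addition first, then per-row layer normalization.
    out = []
    for rx, ry in zip(x, y):
        row = [a + b for a, b in zip(rx, ry)]
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def align(image_feats, text_feats, w_q, w_k, w_v):
    # Claim 6: splice image and text features, then apply self-attention.
    f = image_feats + text_feats
    return add_and_norm(f, attention(f, f, w_q, w_k, w_v))

def correction_signal(alignment, text_feats, w_q, w_k, w_v):
    # Claim 10, simplified: self-attend each input, then cross-attend.
    a_self = add_and_norm(alignment, attention(alignment, alignment, w_q, w_k, w_v))
    t_self = add_and_norm(text_feats, attention(text_feats, text_feats, w_q, w_k, w_v))
    cross = attention(t_self, a_self, w_q, w_k, w_v)  # text as queries (assumption)
    return add_and_norm(t_self, cross)

identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for trained W_q, W_k, W_v
image_feats = [[1.0, 0.0], [0.0, 1.0]]
text_feats = [[1.0, 0.5]]
alignment = align(image_feats, text_feats, identity, identity, identity)
signal = correction_signal(alignment, text_feats, identity, identity, identity)
print(len(alignment), len(signal))  # prints "3 1"
```

With text features as queries, the signal has one row per text feature, which matches the one-feature-per-character relationship in claim 2. Other query and key assignments would be equally consistent with the claim wording.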
using neural networks · CPC title
Proximity, similarity or dissimilarity measures · CPC title
Annotation, e.g. comment data or footnotes · CPC title
Image coding (bandwidth or redundancy reduction for static pictures H04N1/41; coding or decoding of static colour picture signals H04N1/64; methods or arrangements for coding, decoding, compressing or decompressing digital video signals H04N19/00) · CPC title
of extracted features · CPC title