Simulated handwriting image generator
US-2021166013-A1 · Jun 3, 2021 · US
US12518503B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12518503-B2 |
| Application number | US-202218077026-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 7, 2022 |
| Priority date | Dec 8, 2021 |
| Publication date | Jan 6, 2026 |
| Grant date | Jan 6, 2026 |
Official abstract text for this publication.
A method of rectifying a text image, a training method, an electronic device, and a medium, which relate to a field of an artificial intelligence technology, in particular to fields of computer vision, deep learning technology, intelligent transportation and high-precision maps. An exemplary implementation includes: performing, based on a gating strategy, a plurality of first layer-wise processing on a text image to be rectified, so as to obtain respective feature maps of a plurality of layer levels, wherein each of the feature maps includes a text structural feature related to the text image to be rectified, and the gating strategy is configured to increase an attention to the text structural feature; and performing a plurality of second layer-wise processing on the respective feature maps of the plurality of layer levels, so as to obtain a rectified text image corresponding to the text image to be rectified.
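The two processing stages in the abstract can be pictured as a small data-flow sketch. Claim 1 names them as an encoder of N cascaded down-sampling modules, N channel layer units, and a decoder of N cascaded up-sampling modules. Everything else below is an assumption made for illustration and is not taken from the patent: the function names (`down_sample`, `up_sample`, `channel_gate`, `rectify`), 2x2 average pooling as the down-sampling module, nearest-neighbour enlargement as the up-sampling module, a sigmoid-of-mean channel weight, fusion by addition, the handling of the deepest level, and the final enlargement back to input size.

```python
import numpy as np

def down_sample(x):
    """Stand-in down-sampling module: 2x2 average pooling on a (C, H, W) map."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def up_sample(x):
    """Stand-in up-sampling module: nearest-neighbour 2x enlargement."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def channel_gate(x):
    """Stand-in channel layer unit: scale each channel by a weight in (0, 1)."""
    weights = 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2))))
    return x * weights[:, None, None]

def rectify(image, n_levels=3):
    """Hypothetical walk through the layer levels 1..N described in claim 1."""
    # First layer-wise processing (encoder plus channel layer units).
    down = {1: down_sample(image)}
    for i in range(2, n_levels + 1):
        gated = channel_gate(down[i - 1])   # channel weight feature map, level i-1
        down[i] = down_sample(gated)        # down-sampling feature map, level i

    # Assumption: the deepest output is simply the gated level-N map.
    out = {n_levels: channel_gate(down[n_levels])}

    # Second layer-wise processing (decoder), for 1 <= i < N.
    for i in range(n_levels - 1, 0, -1):
        up = up_sample(out[i + 1])          # up-sampling feature map, level i
        out[i] = up + down[i]               # fusion by addition (assumption)

    # Assumption: the rectified image is the level-1 output at input size.
    return up_sample(out[1]), down

image = np.arange(3 * 16 * 16, dtype=float).reshape(3, 16, 16) / 768.0
rectified, down_maps = rectify(image, n_levels=3)
print(rectified.shape)
print([down_maps[i].shape for i in (1, 2, 3)])
```

A trained model would use learned convolutions in every module; the sketch only shows which feature map feeds which module at each layer level.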
Opening claim text (preview).
What is claimed is:

1. A method of rectifying a text image, the method comprising:

performing, based on a gating strategy, a plurality of first layer-wise processing on a text image to be rectified, so as to obtain respective feature maps of a plurality of layer levels, wherein each of the feature maps comprises a text structural feature related to the text image to be rectified, and the gating strategy is configured to increase an attention to the text structural feature; and

performing a plurality of second layer-wise processing on the respective feature maps of the plurality of layer levels, so as to obtain a rectified text image corresponding to the text image to be rectified,

wherein the performing, based on a gating strategy, a plurality of first layer-wise processing comprises performing, based on a text image rectification model, a plurality of first layer-wise processing on the text image to be rectified, so as to obtain the respective feature maps of the plurality of layer levels, wherein the text image rectification model comprises a gating module created according to the gating strategy,

wherein the text image rectification model further comprises an encoder and a decoder, the gating module comprises a plurality of channel layer units, and each of the channel layer units is configured to determine a channel weight of each channel in the feature map corresponding to the channel layer unit,

wherein the performing, based on a text image rectification model, a plurality of first layer-wise processing comprises performing, based on the encoder and the plurality of channel layer units, a plurality of first layer-wise processing on the text image to be rectified, so as to obtain the respective feature maps of the plurality of layer levels,

wherein the performing a plurality of second layer-wise processing comprises performing, based on the decoder, a plurality of second layer-wise processing on the respective feature maps of the plurality of layer levels, so as to obtain the rectified text image corresponding to the text image to be rectified,

wherein the encoder comprises N down-sampling modules connected in cascade, the decoder comprises N up-sampling modules connected in cascade, and the gating module comprises N channel layer units, where N is an integer greater than 1;

wherein the performing, based on the encoder and the plurality of channel layer units, a plurality of first layer-wise processing comprises: for 1<i≤N, processing a first down-sampling feature map of an (i−1)-th layer level by using an (i−1)-th channel layer unit, so as to obtain a channel weight feature map of the (i−1)-th layer level; and processing the channel weight feature map of the (i−1)-th layer level by using an i-th down-sampling module, so as to obtain a first down-sampling feature map of the i-th layer level; and

wherein the performing, based on the decoder, a plurality of second layer-wise processing comprises: for 1≤i<N, processing a first output feature map of an (i+1)-th layer level by using an i-th up-sampling module, so as to obtain a first up-sampling feature map of an i-th layer level; fusing the first down-sampling feature map and the first up-sampling feature map of the i-th layer level to obtain a first fusion feature map of the i-th layer level; processing the first fusion feature map of the i-th layer level by using the i-th up-sampling module, so as to obtain a first output feature map of the i-th layer level; and determining, according to the first output feature map of a first layer level, the rectified text image corresponding to the text image to be rectified.

2. The method according to claim 1, wherein the gating module further comprises a fine-grain layer unit;

further comprising processing a channel weight feature map of an N-th layer level by using the fine-grain layer unit, so as to obtain a first fine-grain feature map of the N-th layer level; and

wherein the performing, based on the decoder, a plurality of second layer-wise processing on the respective feature maps of the plurality of layer levels, so as to obtain the rectified text image corresponding to the text image to be rectified comprises: for i=N, processing the first fine-grain feature map of the N-th layer level by using an N-th up-sampling module, so as to obtain a first up-sampling feature map of the N-th layer level; fusing the first up-sampling feature map and the first down-sampling feature map of the N-th layer level to obtain a first fusion feature map of the N-th layer level; and processing the first fusion feature map of the N-th layer level by using the N-th up-sampling module, so as to obtain a first output feature map of the N-th layer level.

3. The method according to claim 1, wherein the gating module further comprises N coarse-grain layer units;

further comprising processing a first down-sampling feature map of an i-th layer level by using an i-th coarse-grain layer unit, so as to obtain a first coarse-grain feature map of the i-th layer level; and

wherein the fusing the first down-sampling feature map of the i-th layer level and the first up-sampling feature map of the i-th layer level to obtain a first fusion feature map of the i-th layer level comprises fusing the first coarse-grain feature map of the i-th layer level and the first up-sampling feature map of the i-th layer level to obtain the first fusion feature map of the i-th layer level.

4. The method according to claim 1, wherein the (i−1)-th channel layer unit comprises M first processing layer combinations connected in cascade, each first processing layer combination comprises a first processing layer and a second processing layer connected in cascade, each first processing layer comprises Q pooling layers connected in parallel, and each second processing layer comprises U first convolution layers connected in cascade, where M, Q and U are integers greater than or equal to 1; and

wherein the processing a first down-sampling feature map of the (i−1)-th layer level by using an (i−1)-th channel layer unit, so as to obtain a channel weight feature map of the (i−1)-th layer level comprises: processing a first down-sampling feature map of the (i−1)-th layer level by using the M first processing layer combinations connected in cascade of the (i−1)-th channel layer unit, so as to obtain first intermediate feature maps respectively corresponding to the Q first processing layers connected in parallel of the (i−1)-th layer level; obtaining a first gating map of the (i−1)-th layer level according to the Q first intermediate feature maps of the (i−1)-th layer level; performing a dot multiplication on the first down-sampling feature map of the (i−1)-th layer level and the first gating map of the (i−1)-th layer level to obtain a second intermediate feature map of the (i−1)-th layer level; and obtaining the channel weight feature map of the (i−1)-th layer level according to the first down-sampling feature map and the second intermediate feature map of the (i−1)-th layer level.

5. The method according to claim 1, wherein the fine-grain layer unit comprises P second processing layer combinations connected in parallel, each second processing layer combination comprises V third processing layers connected in parallel, and each third processing layer comprises S second convolution layers connected in cascade, where P, V and S are integers greater than or equal to 1; and

wherein the processing a channel weight feature map of an N-th layer level by using the fine-grain layer unit, so as to obtain a first fine-grain feature map of the N-th layer level comprises: processing the channel weight feature map of the N-th layer level by using the P second processing layer combinations connected in parallel,
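The channel layer unit recited in claim 4 can be illustrated with a minimal sketch. It assumes M = 1 processing layer combination, Q = 2 parallel pooling layers (global average and global max), and U = 2 cascaded 1x1 convolutions, written here as matrix products on the pooled channel vectors. The ReLU between the convolutions, the sigmoid that turns the summed intermediate maps into a gating map, the residual addition in the last step, and all names (`channel_layer_unit`, `w1`, `w2`) are assumptions for illustration; the claim does not fix these choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_layer_unit(x, w1, w2):
    """Hypothetical channel layer unit.

    x:  down-sampling feature map, shape (C, H, W)
    w1: first 1x1 convolution weights, shape (C_hidden, C)
    w2: second 1x1 convolution weights, shape (C, C_hidden)
    """
    # First processing layer: Q = 2 pooling layers connected in parallel.
    pooled = [x.mean(axis=(1, 2)), x.max(axis=(1, 2))]

    # Second processing layer: U = 2 convolutions connected in cascade,
    # applied to each pooled vector (ReLU in between is an assumption).
    intermediates = [w2 @ np.maximum(w1 @ p, 0.0) for p in pooled]

    # Gating map from the Q intermediate maps (sum + sigmoid is an assumption).
    gate = sigmoid(sum(intermediates))

    # Dot multiplication of the feature map with the gating map.
    second_intermediate = x * gate[:, None, None]

    # Channel weight feature map from the input and the gated map
    # (residual addition is an assumption).
    return x + second_intermediate, gate

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))
w1 = rng.normal(size=(2, 4))
w2 = rng.normal(size=(4, 2))
weighted, gate = channel_layer_unit(x, w1, w2)
print(weighted.shape, gate.shape)
```

Each entry of `gate` is the weight assigned to one channel, which matches the claim 1 requirement that a channel layer unit determine a channel weight for each channel of its feature map.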
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Music notations · CPC title
Correcting image deformation, e.g. trapezoidal deformation caused by perspective · CPC title
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
using machine learning, e.g. neural networks · CPC title