Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06T5/80. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 12, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus for correcting distorted document image

US11756170B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11756170-B2
Application number	US-202117151783-A
Country	US
Kind code	B2
Filing date	Jan 19, 2021
Priority date	Jan 20, 2020
Publication date	Sep 12, 2023
Grant date	Sep 12, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted. By inputting the distorted document image to be corrected into the correction model, the corrected image corresponding to the distorted document image can be obtained through the correction model, which realizes document image correction end-to-end, improves accuracy of the document image correction, and extends application scenarios of the document image correction.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for correcting a distorted document image, comprising: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; wherein the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted, wherein the correction model comprises a deformation parameter prediction module and a deformation correction module connected in series; wherein the deformation parameter prediction module is a U-shaped convolutional neural network model obtained by training with the set of image samples as inputs and a deformation parameter of each pixel of each image sample comprised in the set of image samples as an output, and the deformation correction module is a model obtained by training with the set of image samples and output results of the deformation parameter prediction module as inputs and the corrected image corresponding to each image sample in the set of image samples as an output; the inputting the distorted document image into the correction model, and obtaining the corrected image corresponding to the distorted document image comprises: inputting the distorted document image into the correction model, outputting an intermediate result through the deformation parameter prediction module, and obtaining, according to the intermediate result, the corrected image corresponding to the distorted document image through the deformation correction module; the intermediate result comprising a deformation parameter of each pixel in the distorted document image; wherein the deformation parameter prediction module comprises at least two stages of deformation parameter prediction sub-modules connected in series; wherein a first-stage deformation parameter prediction sub-module is a U-shaped convolutional neural network model obtained by training with the set of image samples as inputs and a deformation parameter of each pixel of each image sample comprised in the set of image samples as an output, and another stage deformation parameter prediction sub-module is a U-shaped convolutional neural network model obtained by training with the set of image samples and output results of a previous deformation parameter prediction sub-module as inputs and a deformation parameter of each pixel of each image sample comprised in the set of image samples as an output; the intermediate result is an output result of a last-stage deformation parameter prediction sub-module of the at least two stages of deformation parameter prediction sub-modules.

2. The method according to claim 1, wherein the obtaining, according to the intermediate result, the corrected image corresponding to the distorted document image through the deformation correction module comprises: obtaining an operating parameter, the operating parameter indicating a number of pixels on which correction operations are performed in parallel; obtaining, according to the operating parameter, multiple pixels in the distorted document image; and correcting, according to deformation parameters respectively corresponding to the multiple pixels, the multiple pixels in parallel through the deformation correction module, and obtaining multiple corrected pixels.

3. The method according to claim 1, wherein the U-shaped convolutional neural network model comprises an encoding unit and a decoding unit, the encoding unit and the decoding unit each comprise multiple convolutional layers, and a convolutional layer in the encoding unit comprises multiple dilation convolution operations.

4. The method according to claim 1, wherein the U-shaped convolutional neural network model comprises an encoding unit and a decoding unit, the encoding unit and the decoding unit each comprise multiple convolutional layers, and a convolutional layer in the encoding unit comprises multiple dilation convolution operations.

5. The method according to claim 2, wherein the U-shaped convolutional neural network model comprises an encoding unit and a decoding unit, the encoding unit and the decoding unit each comprise multiple convolutional layers, and a convolutional layer in the encoding unit comprises multiple dilation convolution operations.

6. The method according to claim 3, wherein dilation ratios between the multiple dilation convolution operations comprised in the convolutional layer in the encoding unit gradually increase and are coprime.

7. The method according to claim 3, wherein the U-shaped convolutional neural network model further comprises a parallel convolution unit between the encoding unit and the decoding unit, the parallel convolution unit is configured to perform multiple dilation convolution operations in parallel on a feature map outputted by a last layer of the convolutional layers in the encoding unit, and dilation ratios between the multiple dilation convolution operations performed in parallel are different.

8. The method according to claim 3, wherein a convolutional layer in the decoding unit comprises a convolution operation and a recombination operation, the convolution operation is used for up-sampling a feature map, and the recombination operation is used for reconstructing the a number of rows, columns, and dimensions of a matrix for the up-sampled feature map.

9. An apparatus for correcting a distorted document image, comprising: a memory and a processor; wherein the memory is configured to store program instructions; and the processor is configured to call the program instructions stored in the memory to: obtain a distorted document image; and input the distorted document image into a correction model, and obtain a corrected image corresponding to the distorted document image; wherein the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted, wherein the correction model comprises a deformation parameter prediction module and a deformation correction module connected in series; wherein the deformation parameter prediction module is a U-shaped convolutional neural network model obtained by training with the set of image samples as inputs and a deformation parameter of each pixel of each image sample comprised in the set of image samples as an output, and the deformation correction module is a model obtained by training with the set of image samples and output results of the deformation parameter prediction module as inputs and the corrected image corresponding to each image sample in the set of image samples as an output; the processor is specifically configured to: input the distorted document image into the correction model, output an intermediate result through the deformation parameter prediction module, and obtain, according to the intermediate result, the corrected image corresponding to the distorted document image through the deformation correction module; the intermediate result comprising a deformation parameter of each pixel in the distorted document image; wherein the deformation parameter prediction module comprises at least two stages of deformation parameter prediction sub-modules connected in series; wherein a first-stage deformation parameter prediction sub-module is a U-shaped convolutional neural network model obtained by training with the set of image samples as inputs and a deformation parameter of each pixel of each image sample comprised in the set of image samples as an output, and another stage deformation parame
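The correction step described in claims 1 and 2 (apply a per-pixel deformation parameter, handling a configurable number of pixels in parallel) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claim preview does not define the form of the deformation parameter, so the sketch assumes an integer `(dy, dx)` offset from each output pixel to its source pixel in the distorted image, with nearest-pixel lookup. The names `correct_pixel`, `correct_image`, and `flow` are hypothetical, and in the patent the offsets would come from the trained U-shaped network rather than being written by hand.

```python
from concurrent.futures import ThreadPoolExecutor


def correct_pixel(image, flow, y, x):
    """Return the corrected value for output position (y, x).

    Assumption: flow[y][x] is an integer (dy, dx) offset pointing from the
    output pixel to its source pixel in the distorted image.
    """
    height, width = len(image), len(image[0])
    dy, dx = flow[y][x]
    # Clamp the source coordinates so lookups stay inside the image.
    src_y = min(max(y + dy, 0), height - 1)
    src_x = min(max(x + dx, 0), width - 1)
    return image[src_y][src_x]


def correct_image(image, flow, operating_parameter):
    """Correct the image in batches of `operating_parameter` pixels.

    Each batch is handed to a thread pool, mirroring the idea of an
    operating parameter that sets how many pixels are corrected in parallel.
    """
    height, width = len(image), len(image[0])
    coords = [(y, x) for y in range(height) for x in range(width)]
    corrected = [[None] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=operating_parameter) as pool:
        for start in range(0, len(coords), operating_parameter):
            batch = coords[start:start + operating_parameter]
            values = pool.map(lambda p: correct_pixel(image, flow, *p), batch)
            for (y, x), value in zip(batch, values):
                corrected[y][x] = value
    return corrected


# Toy example: every output pixel reads from one column to its right.
distorted = [[1, 2, 3],
             [4, 5, 6]]
flow = [[(0, 1)] * 3 for _ in range(2)]
corrected = correct_image(distorted, flow, operating_parameter=4)
print(corrected)
```

A real system would more likely use sub-pixel offsets with interpolation and run the batches on a GPU; the thread pool here only makes the batching visible.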

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

G06T2207/30176 · Document
G06N3/0464 · Convolutional networks [CNN, ConvNet]
G06T5/80 (Primary) · Geometric correction
G06T5/30 · Erosion or dilatation, e.g. thinning
G06T5/60 (Primary) · using machine learning, e.g. neural networks

Patent family

Related publications grouped by family.

View patent family 70952492

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11756170B2 cover?: Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a…

Related patents

Method and apparatus of image-to-document conversion based on ocr, device, and readable storage medium

Systems and methods for image processing

Un-supervised convolutional neural network for distortion map estimation and correction in MRI

Systems and methods for image data processing

Generating gaze corrected images using bidirectionally trained network

Feature-preserving noise removal
