Method, system, and neural network for identifying direction of a document

US10891476B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10891476-B2
Application number  US-201815876334-A
Country             US
Kind code           B2
Filing date         Jan 22, 2018
Priority date       Jan 24, 2017
Publication date    Jan 12, 2021
Grant date          Jan 12, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, system, and neural network for identifying direction of a document where the method comprises: extracting a text line in the document; calculating a first normal direction result indicative of the text line probably being in a normal direction and a first upside-down direction result indicative of the text line probably being in a direction upside-down with respect to the normal direction; calculating a second normal direction result indicative of the text line after being rotated by 180 degrees probably being in the normal direction and a second upside-down direction result indicative of the text line after being rotated by 180 degrees probably being in the direction upside-down with respect to the normal direction; and determining the direction of the document according to the first normal direction result, the first upside-down direction result, the second normal direction result and the second upside-down direction result.
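The final determining step can be illustrated with a small Python sketch. It follows the addition rule stated in claim 5: the first normal result and the second upside-down result are summed into a normal direction confidence, and the first upside-down result and the second normal result are summed into an upside-down direction confidence. The function name `decide_direction`, the returned labels, and the tie-breaking choice (ties resolve to "normal") are assumptions made for illustration and do not come from the patent text.

```python
def decide_direction(first_normal, first_upside_down,
                     second_normal, second_upside_down):
    """Combine the four direction results into a single decision.

    Hypothetical helper: the name, return format, and tie-breaking
    are illustrative assumptions, not part of the patent.
    """
    # Claim 5: first normal + second upside-down -> normal confidence.
    normal_confidence = first_normal + second_upside_down
    # Claim 5: first upside-down + second normal -> upside-down confidence.
    upside_down_confidence = first_upside_down + second_normal

    # Assumption: pick the larger confidence; ties resolve to "normal".
    if normal_confidence >= upside_down_confidence:
        direction = "normal"
    else:
        direction = "upside-down"
    return direction, normal_confidence, upside_down_confidence


# Example: the unrotated line looks normal (0.75) and the rotated line
# looks upside-down (0.875), so both results agree on "normal".
result = decide_direction(0.75, 0.25, 0.125, 0.875)
print(result)
```

The second pair of results is computed on the text line after a 180-degree rotation, so a rotated line that looks upside-down supports the original line being in the normal direction. That is why the sums pair the results crosswise.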

First claim

Opening claim text (preview).

What is claimed is:

1. A method for identifying a direction of a document, comprising: extracting a text line in the document; calculating a first normal direction result indicative of a first probability that the text line is in a normal direction and a first upside-down direction result indicative of a second probability that the text line is in a direction upside-down with respect to the normal direction; calculating a second normal direction result indicative of a third probability that the text line after being rotated by 180 degrees is in the normal direction and a second upside-down direction result indicative of a fourth probability that the text line after being rotated by 180 degrees is in the direction upside-down with respect to the normal direction; and determining the direction of the document according to the first normal direction result and the first upside-down direction result as well as the second normal direction result and the second upside-down direction result.

2. The method according to claim 1, wherein the first normal direction result indicative of the first probability that the text line is in a normal direction and the first upside-down direction result indicative of the second probability that the text line is in a direction upside-down with respect to the normal direction are calculated using a convolutional neural network.

3. The method according to claim 2, wherein the convolutional neural network comprises a convolutional pooling part having a structure formed by superimposing several convolutional layers and pooling layers and a classifying part, and wherein the calculating of the first normal direction result indicative of the first probability that the text line in a normal direction and the first upside-down direction result indicative of the second probability that the text line is in a direction upside-down with respect to the normal direction comprises: performing convolution processing and pooling processing on the text line by the convolutional pooling part of the convolutional neural network to obtain a one-dimensional array; and performing classification processing on the one-dimensional array by the classifying part of the convolutional neural network, to output the first probability indicative of the text line being in a normal direction as the first normal direction result, and the second probability indicative of the text line being in the direction upside-down with respect to the normal direction as the first upside-down direction result.

4. The method according to claim 3, wherein the second normal direction result indicative of the third probability that the text line, after being rotated by 180 degrees, is in the normal direction and the second upside-down direction result indicative of the fourth probability that the text line, after being rotated by 180 degrees, is in the direction upside-down with respect to the normal direction are calculated using an extended convolutional neural network which comprises a rotating layer, a convolutional pooling part having a structure formed by superimposing several convolutional layers and pooling layers, an inversing layer and a classifying part, comprising: rotating the text line by 180 degrees by the rotating layer; performing convolution processing and pooling processing on a rotated text line by the convolutional pooling part of the extended convolutional neural network to obtain a one-dimensional array; inversing orders of respective elements in the one-dimensional array by the inversing layer; and performing classification processing on the inversed one-dimensional array by the classifying part of the extended convolutional neural network, to output the third probability indicative of the text line being in the normal direction as the second normal direction result, and the fourth probability indicative of the text line being in the direction upside-down with respect to the normal direction as the second upside-down direction result.

5. The method according to claim 4, wherein the determining the direction of the document according to the first normal direction result and the first upside-down direction result as well as the second normal direction result and the second upside-down direction result comprises: adding the first normal direction result and the second upside-down direction result, as a normal direction confidence; adding the first upside-down direction result and the second normal direction result, as a upside-down direction confidence; and determining the direction of the document according to the normal direction confidence and the upside-down direction confidence.

6. The method according to claim 5, wherein the extended convolutional neural network is obtained by inserting the rotating layer and the inversing layer in the convolutional neural network.

7. The method according to claim 6, wherein only the convolutional neural network shall be trained.

8. The method according to claim 6, wherein the classifying part of the convolutional neural network comprises a classifier which performs classification processing.

9. The method according to claim 8, wherein the classifier is a softmax classifier.

10. A system for identifying a direction of a document, comprising: an extracting device which extracts a text line in the document; a first calculating device which is connected to the extracting device and calculates a first normal direction result indicative of a first probability that the text line is in a normal direction and a first upside-down direction result indicative of a second probability that the text line is in a direction upside-down with respect to the normal direction; a second calculating device which is connected to the extracting device and calculates a second normal direction result indicative of a third probability that the text line after being rotated by 180 degrees is in the normal direction and a second upside-down direction result indicative of a fourth probability that the text line after being rotated by 180 degrees is in the direction upside-down with respect to the normal direction; and a determining device connected to the first calculating device and the second calculating device, which determines the direction of the document according to the first normal direction result and the first upside-down direction result as well as the second normal direction result and the second upside-down direction result.

11. The system according to claim 10, wherein the first calculating device calculates a first normal direction result indicative of the first probability that the text line is in a normal direction and a first upside-down direction result indicative of the second probability that the text line is in a direction upside-down with respect to the normal direction using a convolutional neural network.

12. The system according to claim 11, wherein the convolutional neural network comprises: a convolutional pooling part which has a structure formed by superimposing several convolutional layers and pooling layers and performs convolution processing and pooling processing on the text line to obtain a one-dimensional array; and a classifying part which performs classification processing on the one-dimensional array, to output the first probability indicative of the text line being in the normal direction as the first normal direction result, and the second probability indicative of the text line being in the direction upside-down with respect to the normal direction as the first upside-down direction result.

13. The system according to claim 12, wherein the second calculating device calculates the second
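The two networks described in claims 3 and 4 can be sketched in plain Python to show how the rotating layer and the inversing layer wrap the same convolutional pooling part and classifying part. This is a toy illustration under stated assumptions, not the patented implementation: the convolutional pooling part is reduced to a single convolution with ReLU and 2x2 max pooling, the weights are random and untrained, and every function name (`rotate_180`, `conv_pool`, `classify`, `base_cnn`, `extended_cnn`) is hypothetical. Both networks share one set of weights, in line with claim 7, which states that only the convolutional neural network is trained.

```python
import math
import random


def rotate_180(image):
    """Rotating layer: reverse the row order and each row's columns."""
    return [row[::-1] for row in image[::-1]]


def conv_pool(image, kernel):
    """Toy convolutional pooling part: one convolution with ReLU,
    then 2x2 max pooling, flattened into a one-dimensional array."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    conv = [[max(0.0, sum(image[i + a][j + b] * kernel[a][b]
                          for a in range(kh) for b in range(kw)))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]
    pooled = []
    for i in range(0, len(conv) - 1, 2):
        for j in range(0, len(conv[0]) - 1, 2):
            pooled.append(max(conv[i][j], conv[i][j + 1],
                              conv[i + 1][j], conv[i + 1][j + 1]))
    return pooled


def softmax(scores):
    """Softmax classifier output (claim 9 names a softmax classifier)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, bias):
    """Classifying part: linear scores followed by softmax.
    Returns [normal probability, upside-down probability]."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)


def base_cnn(line, kernel, weights, bias):
    """Claim 3: convolution and pooling, then classification."""
    return classify(conv_pool(line, kernel), weights, bias)


def extended_cnn(line, kernel, weights, bias):
    """Claim 4: rotate, convolve and pool, inverse the array, classify."""
    rotated = rotate_180(line)
    features = conv_pool(rotated, kernel)
    inversed = features[::-1]  # inversing layer: reverse element order
    return classify(inversed, weights, bias)


# Placeholder data: an 8x32 "text line" image and untrained random weights.
rng = random.Random(0)
line = [[rng.random() for _ in range(32)] for _ in range(8)]
kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
n_features = len(conv_pool(line, kernel))
weights = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(2)]
bias = [0.0, 0.0]

first = base_cnn(line, kernel, weights, bias)       # first normal / upside-down
second = extended_cnn(line, kernel, weights, bias)  # second normal / upside-down
print(first, second)
```

Each call returns a pair of probabilities that sum to one. With untrained weights the values carry no meaning; the sketch only shows the order of the layers and where the rotation and inversion are inserted.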

Assignees

  • Fujitsu Ltd

Inventors

Classifications

  • Classification techniques · CPC title

  • G06V10/82 · Primary

    using neural networks · CPC title

  • Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors · CPC title

  • relating to the classification model, e.g. parametric or non-parametric approaches · CPC title

  • Activation functions · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10891476B2 cover?
A method, system, and neural network for identifying direction of a document where the method comprises: extracting a text line in the document; calculating a first normal direction result indicative of the text line probably being in a normal direction and a first upside-down direction result indicative of the text line probably being in a direction upside-down with respect to the normal direc…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 12, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).