System and method for machine translation of text

US2020394431A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2020394431-A1
Application number  US-201916542392-A
Country             US
Kind code           A1
Filing date         Aug 16, 2019
Priority date       Jun 13, 2019
Publication date    Dec 17, 2020
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for machine translation of text is disclosed. The method includes processing an image comprising a text to generate a pattern associated with the text based on a trained Convolution Neural Network (CNN). The method further includes mapping the pattern to a word in a mapping table and at least one text attribute, based on a classifier network. The method further includes initiating an Optical Character Recognition (OCR) conversion for the pattern, when at least one of the mapping between at least one of the pattern and at least one word in the mapping table and the mapping between the pattern and the at least one text attribute is below a predefined threshold. The method further includes performing incremental learning for the trained CNN and the classifier network based on the OCR conversion.
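The threshold-triggered fallback described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the publication excerpt does not specify any interfaces, so `FallbackTranslator`, `Decision`, the score dictionaries, the `ocr` callable, and the `retrain_queue` are all hypothetical names, and the CNN and classifier network are replaced by precomputed confidence scores. The sketch only shows the control flow: accept the classifier's mapping when both scores clear the threshold, otherwise run OCR and keep the result as a sample for later incremental learning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Decision:
    word: str
    attribute: str
    used_ocr: bool


@dataclass
class FallbackTranslator:
    # Hypothetical stand-ins: the publication does not define these interfaces.
    threshold: float
    ocr: Callable[[str], Tuple[str, str]]  # pattern id -> (word, attribute)
    retrain_queue: List[Tuple[str, str, str]] = field(default_factory=list)

    def decide(self, pattern: str,
               word_scores: Dict[str, float],
               attr_scores: Dict[str, float]) -> Decision:
        # Pick the highest-scoring word and text attribute for this pattern.
        best_word = max(word_scores, key=word_scores.get)
        best_attr = max(attr_scores, key=attr_scores.get)
        # Fall back to OCR when either mapping score is below the threshold.
        if min(word_scores[best_word], attr_scores[best_attr]) < self.threshold:
            word, attr = self.ocr(pattern)
            # Queue the OCR result as a sample for later incremental learning.
            self.retrain_queue.append((pattern, word, attr))
            return Decision(word, attr, used_ocr=True)
        return Decision(best_word, best_attr, used_ocr=False)
```

In this sketch the queue is only filled, never consumed; how the queued samples would actually update the CNN and classifier network is left open, since the excerpt does not describe it.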

First claim

Opening claim text (preview).

What is claimed is:

1. A method of machine translation of text, the method comprising: processing, by a translation device, an image comprising a text to generate a pattern associated with the text based on a trained Convolution Neural Network (CNN), wherein the text is in a source language; mapping, by the translation device, the pattern to a word in a mapping table and at least one text attribute, based on a classifier network, wherein the word is in a target language; initiating, by the translation device, an Optical Character Recognition (OCR) conversion for the pattern, when at least one of the mapping between at least one of the pattern and at least one word in the mapping table and the mapping between the pattern and the at least one text attribute is below a predefined threshold; and performing, by the translation device, incremental learning for the trained CNN and the classifier network based on the OCR conversion.

2. The method of claim 1, wherein the at least one text attribute comprises at least one of font, style, or color, and wherein the style is associated with a text being in bold, italics, strikethrough, or underline.

3. The method of claim 1, further comprising identifying the text from the image, wherein the image comprises a plurality of words.

4. The method of claim 3, wherein the identifying comprises: identifying column and paragraph separation within the image based on difference in pixel intensity; and for each paragraph in each column, identifying each word based on difference in pixel intensity of an area designated by space between two adjacent words.

5. The method of claim 4 further comprising extracting words sequentially from each paragraph of each column in response to the identifying of each word.

6. The method of claim 5 further comprising creating sentences in response to sequential extraction of words from each paragraph of each column, using a Long Short Term Memory (LSTM).

7. The method of claim 6 further comprising rendering the sentences created using the LSTM to a user.

8. The method of claim 7 further comprising: receiving feedback from the user on the sentences rendered; and performing the incremental learning based on the feedback.

9. The method of claim 7 further comprising generating a notification for the user, wherein the notification comprises at least one of warning of incorrect decoding and inability to support an attribute of the at least one attribute.

10. A translation device for machine translation of text, the translation device comprising: a processor; and a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, causes the processor to: process an image comprising a text to generate a pattern associated with the text based on a trained Convolution Neural Network (CNN), wherein the text is in a source language; map the pattern to a word in a mapping table and at least one text attribute, based on a classifier network, wherein the word is in a target language; initiate an Optical Character Recognition (OCR) conversion for the pattern, when at least one of the mapping between at least one of the pattern and at least one word in the mapping table and the mapping between the pattern and the at least one text attribute is below a predefined threshold; and perform incremental learning for the trained CNN and the classifier network based on the OCR conversion.

11. The translation device of claim 10, wherein the at least one text attribute comprises at least one of font, style, or color, and wherein the style is associated with a text being in bold, italics, strikethrough, or underline.

12. The translation device of claim 11, wherein the processor instructions further cause the processor to identify the text from the image, wherein the image comprises a plurality of words.

13. The translation device of claim 12, wherein to identify the processor instructions further cause the processor to: identify column and paragraph separation within the image based on difference in pixel intensity; and for each paragraph in each column, identify each word based on difference in pixel intensity of an area designated by space between two adjacent words.

14. The translation device of claim 13, wherein the processor instructions further cause the processor to extract words sequentially from each paragraph of each column in response to the identifying of each word.

15. The translation device of claim 14, wherein the processor instructions further cause the processor to create sentences in response to sequential extraction of words from each paragraph of each column, using a Long Short Term Memory (LSTM).

16. The translation device of claim 15, wherein the processor instructions further cause the processor to render the sentences created using the LSTM to a user.

17. The translation device of claim 16, wherein the processor instructions further cause the processor to: receive feedback from the user on the sentences rendered; and perform the incremental learning based on the feedback.

18. The translation device of claim 17, wherein the processor instructions further cause the processor to generate a notification for the user, wherein the notification comprises at least one of warning of incorrect decoding and inability to support an attribute of the at least one attribute.

19. A non-transitory computer-readable storage medium having stored thereon, a set of computer-executable instructions causing a computer comprising one or more processors to perform steps comprising: processing, by a translation device, an image comprising a text to generate a pattern associated with the text based on a trained Convolution Neural Network (CNN), wherein the text is in a source language; mapping, by the translation device, the pattern to a word in a mapping table and at least one text attribute, based on a classifier network, wherein the word is in a target language; initiating, by the translation device, an Optical Character Recognition (OCR) conversion for the pattern, when at least one of the mapping between at least one of the pattern and at least one word in the mapping table and the mapping between the pattern and the at least one text attribute is below a predefined threshold; and performing, by the translation device, incremental learning for the trained CNN and the classifier network based on the OCR conversion.
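Claim 4 identifies words from the difference in pixel intensity across the space between adjacent words. One common way to do this kind of segmentation is a column projection over a single text line, sketched below in plain Python. This is an illustrative assumption, not the method disclosed in the patent: the function name `split_words`, the `ink_max` darkness cutoff, and the `min_gap` width are all hypothetical, and the input is assumed to be a grayscale line image with dark ink on a light background.

```python
from typing import List, Tuple


def split_words(line: List[List[int]],
                ink_max: int = 127,
                min_gap: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, of words in one text line.

    `line` is a grayscale image as rows of pixel values (0 = black, 255 = white).
    Both `ink_max` and `min_gap` are hypothetical tuning parameters.
    """
    if not line:
        return []
    width = len(line[0])
    # A column counts as ink if any pixel in it is dark enough.
    has_ink = [any(row[x] <= ink_max for row in line) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None  # first column of the word being tracked, if any
    gap = 0       # width of the current run of blank columns inside a word
    for x, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            # A blank run at least `min_gap` wide is treated as a word boundary;
            # narrower runs are assumed to be spacing between letters.
            if gap >= min_gap:
                spans.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close the last word, trimming any trailing blank columns.
        spans.append((start, width - gap))
    return spans
```

For example, a one-row line with ink in columns 1, 2, 4, 8, and 9 and `min_gap=3` yields two spans: the single blank column 3 is kept inside the first word, while the three blank columns 5 to 7 separate it from the second. The same projection idea applied to rows, or to wider blank bands, could serve for paragraph and column separation, though the excerpt does not say how the patent handles those cases.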

Assignees

Wipro Ltd

Inventors

Classifications

  • G06F40/44 · Primary

    Statistical methods, e.g. probability models · CPC title

  • Validation; Performance evaluation · CPC title

  • Classification techniques · CPC title

  • using neural networks · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020394431A1 cover?
A method and system for machine translation of text is disclosed. The method includes processing an image comprising a text to generate a pattern associated with the text based on a trained Convolution Neural Network (CNN). The method further includes mapping the pattern to a word in a mapping table and at least one text attribute, based on a classifier network. The method further includes init…
Who is the assignee on this patent?
Wipro Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/44. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 17, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).