Enhancing machine translation of handwritten documents

US12080089B2 · US · B2

Patent metadata
Publication number: US-12080089-B2
Application number: US-202117643227-A
Country: US
Kind code: B2
Filing date: Dec 8, 2021
Priority date: Dec 8, 2021
Publication date: Sep 3, 2024
Grant date: Sep 3, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method, a computer system and a computer program product enhance machine translation of a document. The method includes capturing an image of the document. The document includes a plurality of characters that are arranged in a character layout. The method also includes classifying the image by a document type based on the character layout. The method further includes determining a strategy for an intelligent character recognition (ICR) algorithm with the image based on the character layout of the image. Lastly, the method includes generating a translated document by applying the intelligent character recognition (ICR) algorithm to the plurality of characters in the image using the strategy. The translated document includes a plurality of translated characters that are arranged in the character layout.
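
The flow described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the patented implementation: the `Glyph` type, the `classify_document` heuristic, the `STRATEGIES` table, and the dictionary-based `translate` callback are all hypothetical stand-ins. In particular, a real system would run ICR on image pixels and use a trained classification model, whereas this sketch starts from already-recognized text regions and uses a simple rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """A recognized text region and its position (hypothetical stand-in for ICR output)."""
    text: str
    row: int
    col: int


def classify_document(glyphs):
    # Assumption: a toy layout rule replaces the trained classifier.
    # More than one column is treated as a "form", otherwise a "letter".
    columns = {g.col for g in glyphs}
    return "form" if len(columns) > 1 else "letter"


# Assumption: each document type maps to one reading strategy.
STRATEGIES = {"form": "column_major", "letter": "row_major"}


def reading_order(glyphs, strategy):
    # The strategy decides the order in which regions are processed.
    if strategy == "column_major":
        return sorted(glyphs, key=lambda g: (g.col, g.row))
    return sorted(glyphs, key=lambda g: (g.row, g.col))


def translate_document(glyphs, translate):
    doc_type = classify_document(glyphs)
    ordered = reading_order(glyphs, STRATEGIES[doc_type])
    # Each translated region keeps the row/col of its source region,
    # so the output is arranged in the original character layout.
    translated = [Glyph(translate(g.text), g.row, g.col) for g in ordered]
    return doc_type, translated


# Toy usage with a made-up Spanish-to-English dictionary.
TOY_DICT = {"nombre": "name", "fecha": "date", "hoy": "today"}
page = [
    Glyph("nombre", 0, 0), Glyph("Ana", 0, 1),
    Glyph("fecha", 1, 0), Glyph("hoy", 1, 1),
]
doc_type, out = translate_document(page, lambda s: TOY_DICT.get(s, s))
for g in out:
    print(g.row, g.col, g.text)
```

The point of the sketch is the separation of concerns: classification picks a strategy, the strategy controls processing order, and positions travel with each region so the layout survives translation.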

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for enhancing machine translation of a document, the method comprising: capturing an image of the document, wherein the document includes a plurality of characters that are arranged in a character layout; classifying the image by a document type using a machine learning classification model based on the character layout, wherein the machine learning classification model is trained using a plurality of training documents stored in a knowledge corpus; determining a strategy for an intelligent character recognition (ICR) algorithm with the image based on the character layout; and generating a translated document by applying the intelligent character recognition (ICR) algorithm to the plurality of characters in the image using the strategy, wherein the translated document includes a plurality of translated characters that are arranged in the character layout.

2. The computer-implemented method of claim 1, further comprising: identifying each of the plurality of characters within the document; receiving a confidence level from a machine translation algorithm for the identified character; in response to the confidence level below a threshold, displaying an output of the machine translation algorithm for the identified character to a user; monitoring user interactions with the output; and updating the translated document according to the monitored user interactions.

3. The computer-implemented method of claim 1, wherein determining the strategy for the intelligent character recognition (ICR) algorithm includes re-arranging the plurality of characters from the character layout of the document to an optimum character layout for machine translation.

4. The computer-implemented method of claim 1, wherein generating the translated document further comprises: comparing an output of the intelligent character recognition (ICR) algorithm to a database; and in response to the output matching a prior translation in the database, updating the translated document based on the prior translation.

5. The computer-implemented method of claim 1, wherein generating the translated document further comprises: displaying the translated document to a user; receiving an interaction record from the user; associating the interaction record with the translated document; and storing the interaction record with the associated translated document in a database.

6. The computer-implemented method of claim 1, wherein each of the plurality of characters are handwritten.

7. The computer-implemented method of claim 1, wherein the document type is classified by the machine learning classification model based on at least the character layout and a text within the image.

8. The computer-implemented method of claim 1, wherein the machine learning classification model additionally leverages a language of the text within the image in classifying the image by the document type.

9. The computer-implemented method of claim 1, wherein the plurality of characters in the image are re-arranged based on the document type prior to generating the translated document.

10. A computer system for enhancing machine translation of a document, the system comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage media, and program instructions stored on at least one of the one or more tangible storage media for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: capturing an image of the document, wherein the document includes a plurality of characters that are arranged in a character layout; classifying the image by a document type using a machine learning classification model based on the character layout, wherein the machine learning classification model is trained using a plurality of training documents stored in a knowledge corpus; determining a strategy for an intelligent character recognition (ICR) algorithm with the image based on the character layout; and generating a translated document by applying the intelligent character recognition (ICR) algorithm to the plurality of characters in the image using the strategy, wherein the translated document includes a plurality of translated characters that are arranged in the character layout.

11. The computer system of claim 10, further comprising: identifying each of the plurality of characters within the document; receiving a confidence level from a machine translation algorithm for the identified character; in response to the confidence level below a threshold, displaying an output of the machine translation algorithm for the identified character to a user; monitoring user interactions with the output; and updating the translated document according to the monitored user interactions.

12. The computer system of claim 10, wherein determining the strategy for the intelligent character recognition (ICR) algorithm includes re-arranging the plurality of characters from the character layout of the document to an optimum character layout for machine translation.

13. The computer system of claim 10, wherein generating the translated document further comprises: comparing an output of the intelligent character recognition (ICR) algorithm to a database; and in response to the output matching a prior translation in the database, updating the translated document based on the prior translation.

14. The computer system of claim 10, wherein generating the translated document further comprises: displaying the translated document to a user; receiving an interaction record from the user; associating the interaction record with the translated document; and storing the interaction record with the associated translated document in a database.

15. The computer system of claim 10, wherein each of the plurality of characters are handwritten.

16. A computer program product for enhancing machine translation of a document comprising: a computer readable storage device having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method comprising: capturing an image of the document, wherein the document includes a plurality of characters that are arranged in a character layout; classifying the image by a document type using a machine learning classification model based on the character layout, wherein the machine learning classification model is trained using a plurality of training documents stored in a knowledge corpus; determining a strategy for an intelligent character recognition (ICR) algorithm with the image based on the character layout; and generating a translated document by applying the intelligent character recognition (ICR) algorithm to the plurality of characters in the image using the strategy, wherein the translated document includes a plurality of translated characters that are arranged in the character layout.

17. The computer program product of claim 16, further comprising: identifying each of the plurality of characters within the document; receiving a confidence level from a machine translation algorithm for the identified character; in response to the confidence level below a threshold, displaying an output of the machine translation algorithm for the identified character to a user; monitoring user interactions with the output; and updating the translated document according to the monitored user interactions.
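
As a rough illustration of two dependent claims, the sketch below shows one way a prior-translation lookup (claim 4) and a confidence-gated user review (claim 2) might fit together. Everything here is assumed for illustration: the `Candidate` type, the in-memory dictionary standing in for the database, the `ask_user` callback standing in for the display and interaction monitoring, the 0.8 threshold, and the choice to treat a matched prior translation as fully trusted. The claims do not specify any of these details.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One recognized item with its machine translation (hypothetical type)."""
    source: str
    translation: str
    confidence: float


def apply_translation_memory(candidates, memory):
    # Claim 4 sketch: if the recognized text matches a stored prior
    # translation, use the prior translation instead of the fresh output.
    updated = []
    for c in candidates:
        prior = memory.get(c.source)
        if prior is not None:
            # Assumption: a matched prior translation is treated as trusted.
            updated.append(Candidate(c.source, prior, 1.0))
        else:
            updated.append(c)
    return updated


def review_low_confidence(candidates, threshold, ask_user):
    # Claim 2 sketch: items below the threshold are shown to a user,
    # and the document is updated from the user's response.
    final = []
    interactions = []
    for c in candidates:
        if c.confidence < threshold:
            corrected = ask_user(c)
            interactions.append((c.source, c.translation, corrected))
            final.append(corrected)
        else:
            final.append(c.translation)
    return final, interactions


# Toy usage with made-up data; the reviewer is simulated by a lookup table.
memory = {"gracias": "thank you"}
candidates = [
    Candidate("hola", "hello", 0.95),
    Candidate("gracias", "grace", 0.40),
    Candidate("adios", "adieu", 0.30),
]
fixes = {"adios": "goodbye"}


def simulated_reviewer(c):
    return fixes.get(c.source, c.translation)


final, log = review_low_confidence(
    apply_translation_memory(candidates, memory), 0.8, simulated_reviewer
)
print(final)
print(log)
```

Running the memory lookup before the review step means a user is only asked about items that have no stored prior translation. The returned interaction log is the kind of record that claim 5 describes storing alongside the translated document.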

Assignees

Inventors

Classifications

  • Classification of content, e.g. text, photographs or tables

  • characterised by the type of writing

  • Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

  • Document management systems

  • with the intervention of an operator

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12080089B2 cover?
A computer-implemented method, a computer system and a computer program product enhance machine translation of a document. The method includes capturing an image of the document. The document includes a plurality of characters that are arranged in a character layout. The method also includes classifying the image by a document type based on the character layout. The method further includes dete…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06V30/414. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 3, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).