System for training machine learning model which recognizes characters of text images

US2022327816A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022327816-A1
Application number  US-202217714322-A
Country             US
Kind code           A1
Filing date         Apr 6, 2022
Priority date       Apr 9, 2021
Publication date    Oct 13, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system trains a machine learning model which recognizes characters of text images. The system stores the machine learning model which recognizes characters of text images. The machine learning model includes a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images; a domain adaptation network configured to classify the text images into domains based on the visual features; and a text recognition network configured to recognize characters in the text images based on the character bounding boxes and the visual features. The system is configured to (1) reverse gradients in the training of the domain adaptation network to minus gradients and back-propagate the minus gradients through the character segmentation network, and (2) back-propagate gradients in the training of the text recognition network through the character segmentation network.
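The two training steps in the abstract can be illustrated with a toy, hand-differentiated model. This is a minimal sketch and not the patent's implementation: each network is reduced to a single scalar weight, both heads use a squared-error loss, and the names `shared_feature_gradients`, `w`, `v`, `u`, and the scaling factor `lam` are hypothetical. It shows only how the sign-flipped domain gradient and the unmodified recognition gradient combine at the shared feature extractor.

```python
def shared_feature_gradients(x, w, v, u, y_dom, y_rec, lam=1.0):
    """Gradients for a toy three-part model with a gradient reversal step.

    Assumed stand-ins (not from the patent):
      feature f = w * x   -> visual features of the character segmentation network
      d = v * f           -> domain adaptation head
      r = u * f           -> text recognition head
    Both heads use the loss 0.5 * (prediction - target) ** 2.
    """
    f = w * x
    d = v * f
    r = u * f

    # Gradients of each loss with respect to the shared feature f.
    g_dom_f = (d - y_dom) * v
    g_rec_f = (r - y_rec) * u

    # Gradient reversal: the domain gradient is negated (and scaled by lam)
    # before it reaches the shared feature extractor; the recognition
    # gradient passes through unchanged.
    g_f = g_rec_f - lam * g_dom_f

    return {
        "w": g_f * x,            # shared extractor: mixed gradient
        "v": (d - y_dom) * f,    # domain head: ordinary gradient
        "u": (r - y_rec) * f,    # recognition head: ordinary gradient
    }


grads = shared_feature_gradients(x=2.0, w=0.5, v=1.0, u=1.0, y_dom=0.0, y_rec=2.0)
print(grads)
```

With this arrangement the domain head still learns to tell domains apart, while the shared extractor is pushed toward features that make that task harder, which is the usual motivation for reversing the gradient.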

First claim

Opening claim text (preview).

What is claimed is:

1. A system for training a machine learning model which recognizes characters of text images, the system comprising:
    one or more processors; and
    one or more storage devices,
    wherein the one or more storage devices store the machine learning model which recognizes characters of text images,
    wherein the machine learning model which recognizes characters of text images includes:
        a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images;
        a domain adaptation network configured to classify the text images into domains based on the visual features; and
        a text recognition network configured to recognize characters in the text images based on the character bounding boxes and the visual features, and
    wherein the one or more processors are configured to:
        reverse gradients in training of the domain adaptation network to minus gradients, and to back-propagate the minus gradients through the character segmentation network; and
        back-propagate gradients in training of the text recognition network through the character segmentation network.

2. The system according to claim 1, wherein the domain adaptation network is configured to classify the text images into domains based on the character bounding boxes and the visual features.

3. The system according to claim 1, wherein the domain adaptation network includes:
    a layer configured to extract feature maps corresponding to the character bounding boxes from the visual features;
    a concatenation layer configured to concatenate the extracted feature maps; and
    a block configured to discriminate the domains of the text images based on the concatenated feature maps.

4. The system according to claim 1, wherein the text recognition network is configured to align visual features to output sequences by the character bounding box.

5. The system according to claim 1, wherein the text recognition network includes:
    an RNN encoder configured to encode the visual features;
    an RNN decoder configured to output character sequences; and
    an alignment layer provided between the RNN encoder and the RNN decoder,
    wherein the alignment layer is configured to align encoded features obtained from the RNN encoder, to a character sequences by the character bounding boxes obtained by the character segmentation network, and
    wherein the RNN decoder is configured to output character sequences from the extracted encoded features.

6. The system according to claim 1, further comprising:
    an input apparatus; and
    a monitor,
    wherein the one or more processors is configured to:
        display, on the monitor, output from at least one of the character segmentation network, the domain adaptation network, or the text recognition network; and
        receive a revision of the output which has been input from the input apparatus.

7. A method of training a machine learning model which recognizes characters of text images by a system, the system storing the machine learning model which recognizes characters of text images, the machine learning model which recognizes characters of text images including:
        a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images;
        a domain adaptation network configured to classify the text images into domains based on the visual features; and
        a text recognition network configured to recognize characters in the text images based on the character bounding boxes and the visual features,
    the method comprising:
        reversing, by the system, gradients in the training of the domain adaptation network to minus gradients, and backpropagating the minus gradients through the character segmentation network; and
        back-propagating, by the system, gradients in the training of the text recognition network through the character segmentation network.

8. The method according to claim 7, further comprising of the domain adaptation network, classifying the text images into domains based on the character bounding boxes and the visual features.
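The three parts of the domain adaptation network named in claim 3 (box-wise feature extraction, concatenation, domain discrimination) can be pictured with a small plain-Python sketch. Everything here is an illustrative assumption rather than the claimed design: the feature map is a single-channel 2-D list, boxes are `(top, left, bottom, right)` with exclusive end indices, the discriminator is a single logistic unit, and the function names `crop`, `concat_box_features`, and `domain_probability` are hypothetical.

```python
import math


def crop(feature_map, box):
    """Return the part of a 2-D feature map inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    return [row[left:right] for row in feature_map[top:bottom]]


def concat_box_features(feature_map, boxes):
    """Flatten the features under each character box and concatenate them."""
    vector = []
    for box in boxes:
        for row in crop(feature_map, box):
            vector.extend(row)
    return vector


def domain_probability(vector, weights, bias=0.0):
    """Logistic discriminator: probability of one of two domains (assumed binary)."""
    if len(weights) != len(vector):
        raise ValueError("weights and feature vector must have the same length")
    score = sum(w * v for w, v in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-score))


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8]]
boxes = [(0, 0, 2, 1), (0, 2, 2, 4)]   # two hypothetical character boxes
vector = concat_box_features(feature_map, boxes)
print(vector, domain_probability(vector, [0.0] * len(vector)))
```

In this sketch the concatenated vector grows with the number and size of the boxes, so a practical version would likely pool each cropped region to a fixed size before concatenation; the claim text does not specify how that is handled.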

Assignees

Hitachi Ltd

Inventors

Classifications

  • G06V10/82 (Primary)

    using neural networks · CPC title

  • Segmentation of character regions · CPC title

  • using classification, e.g. of video objects · CPC title

  • G06V30/18 (Primary)

    Extraction of features or characteristics of the image · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022327816A1 cover?
A system trains a machine learning model which recognizes characters of text images. The system stores the machine learning model which recognizes characters of text images. The machine learning model includes a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images, a domain adaptation networ…
Who is the assignee on this patent?
Hitachi Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 13, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).