What technology area does this patent fall under?

Primary CPC classification G06V10/82. Mapped technology areas include Physics.

When was this patent published?

Publication date Oct 13, 2022 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

System for training machine learning model which recognizes characters of text images

US2022327816A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022327816-A1
Application number	US-202217714322-A
Country	US
Kind code	A1
Filing date	Apr 6, 2022
Priority date	Apr 9, 2021
Publication date	Oct 13, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system trains a machine learning model which recognizes characters of text images. The system stores the machine learning model which recognizes characters of text images. The machine learning model includes a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images, a domain adaptation network configured to classify the text images into domains based on the visual features, and a text recognition network configured to recognize characters in the text images based on the character bounding boxes and the visual features. The system is configured to (1) reverse gradients in the training of the domain adaptation network to minus gradients and back-propagate the minus gradients through the character segmentation network, and (2) back-propagate gradients in the training of the text recognition network through the character segmentation network.
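The two gradient paths described in the abstract can be illustrated with a toy scalar model. This is only a sketch under stated assumptions, not the patented implementation: the function name `training_gradients`, the scalar weights, the choice of losses (binary cross-entropy for the domain head, squared error for the recognition head), and the scaling factor `lam` are all hypothetical and do not come from the patent text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def training_gradients(x, domain_label, target, w_feat, w_dom, w_rec, lam=1.0):
    """Gradients for a toy model with one shared feature extractor and two heads.

    feature = w_feat * x                 stands in for the character segmentation network
    p_dom   = sigmoid(w_dom * feature)   stands in for the domain adaptation network
    recog   = w_rec * feature            stands in for the text recognition network
    All names, losses, and `lam` are illustrative assumptions.
    """
    # Forward pass.
    feature = w_feat * x
    p_dom = sigmoid(w_dom * feature)
    recog = w_rec * feature

    # Domain head with binary cross-entropy: d(loss)/d(logit) = p - y.
    d_logit = p_dom - domain_label
    grad_w_dom = d_logit * feature
    domain_grad_on_feature = d_logit * w_dom

    # Recognition head with loss 0.5 * (recog - target)^2.
    d_recog = recog - target
    grad_w_rec = d_recog * feature
    recog_grad_on_feature = d_recog * w_rec

    # (1) Reverse the domain gradient to a "minus gradient" before it
    #     reaches the shared extractor.
    reversed_domain_grad = -lam * domain_grad_on_feature
    # (2) The recognition gradient flows through the shared extractor unchanged.
    grad_w_feat = (reversed_domain_grad + recog_grad_on_feature) * x

    return {
        "w_feat": grad_w_feat,
        "w_dom": grad_w_dom,
        "w_rec": grad_w_rec,
        "domain_grad_on_feature": domain_grad_on_feature,
        "reversed_domain_grad": reversed_domain_grad,
    }


if __name__ == "__main__":
    grads = training_gradients(
        x=2.0, domain_label=1.0, target=0.0,
        w_feat=0.5, w_dom=1.0, w_rec=0.0,
    )
    print(grads)
```

In this sketch the domain head is still updated with its ordinary gradient (`w_dom`), while the shared extractor receives that gradient with its sign flipped, so the extractor is pushed toward features the domain classifier cannot separate. The recognition gradient is added without modification.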

First claim

Opening claim text (preview).

What is claimed is:

1. A system for training a machine learning model which recognizes characters of text images, the system comprising: one or more processors; and one or more storage devices, wherein the one or more storage devices store the machine learning model which recognizes characters of text images, wherein the machine learning model which recognizes characters of text images includes: a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images; a domain adaptation network configured to classify the text images into domains based on the visual features; and a text recognition network configured to recognize characters in the text images based on the character bounding boxes and the visual features, and wherein the one or more processors are configured to: reverse gradients in training of the domain adaptation network to minus gradients, and to back-propagate the minus gradients through the character segmentation network; and back-propagate gradients in training of the text recognition network through the character segmentation network.

2. The system according to claim 1, wherein the domain adaptation network is configured to classify the text images into domains based on the character bounding boxes and the visual features.

3. The system according to claim 1, wherein the domain adaptation network includes: a layer configured to extract feature maps corresponding to the character bounding boxes from the visual features; a concatenation layer configured to concatenate the extracted feature maps; and a block configured to discriminate the domains of the text images based on the concatenated feature maps.

4. The system according to claim 1, wherein the text recognition network is configured to align visual features to output sequences by the character bounding box.

5. The system according to claim 1, wherein the text recognition network includes: an RNN encoder configured to encode the visual features; an RNN decoder configured to output character sequences; and an alignment layer provided between the RNN encoder and the RNN decoder, wherein the alignment layer is configured to align encoded features obtained from the RNN encoder, to a character sequences by the character bounding boxes obtained by the character segmentation network, and wherein the RNN decoder is configured to output character sequences from the extracted encoded features.

6. The system according to claim 1, further comprising: an input apparatus; and a monitor, wherein the one or more processors is configured to: display, on the monitor, output from at least one of the character segmentation network, the domain adaptation network, or the text recognition network; and receive a revision of the output which has been input from the input apparatus.

7. A method of training a machine learning model which recognizes characters of text images by a system, the system storing the machine learning model which recognizes characters of text images, the machine learning model which recognizes characters of text images including: a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images; a domain adaptation network configured to classify the text images into domains based on the visual features; and a text recognition network configured to recognize characters in the text images based on the character bounding boxes and the visual features, the method comprising: reversing, by the system, gradients in the training of the domain adaptation network to minus gradients, and backpropagating the minus gradients through the character segmentation network; and back-propagating, by the system, gradients in the training of the text recognition network through the character segmentation network.

8. The method according to claim 7, further comprising of the domain adaptation network, classifying the text images into domains based on the character bounding boxes and the visual features.
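Claims 4 and 5 describe aligning encoded features to output characters by means of the character bounding boxes. One way to picture that step is sketched below. It is a hedged illustration only: the function name `align_features_to_boxes`, the `stride` parameter, the representation of boxes as horizontal pixel extents, and the use of average pooling are assumptions made for the example and are not specified in the claim text.

```python
def align_features_to_boxes(encoded, boxes, stride=1):
    """Pool encoder outputs into one feature vector per character box.

    encoded: non-empty list of feature vectors, one per horizontal step
             of the encoder output.
    boxes:   list of (x_min, x_max) pixel extents, ordered left to right.
    stride:  pixels covered by one encoder step (hypothetical parameter).

    Average pooling is an illustrative choice, not taken from the patent.
    """
    aligned = []
    last = len(encoded) - 1
    for x_min, x_max in boxes:
        # Map pixel extents to encoder steps, clamped to the valid range.
        start = min(max(0, x_min // stride), last)
        end_step = -(-x_max // stride)  # ceiling division
        end = min(len(encoded), max(start + 1, end_step))
        window = encoded[start:end]

        # Average the encoder steps that fall inside this character box.
        dim = len(window[0])
        pooled = [sum(vec[i] for vec in window) / len(window) for i in range(dim)]
        aligned.append(pooled)
    return aligned


if __name__ == "__main__":
    encoded = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
    boxes = [(0, 2), (2, 4)]
    print(align_features_to_boxes(encoded, boxes))
```

Under these assumptions, each bounding box yields exactly one pooled vector, so a decoder consuming the result would see one input per segmented character rather than one per encoder step.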

Assignees

Hitachi Ltd

Inventors

Classifications

G06V10/82 · Primary
using neural networks · CPC title
G06V30/148
Segmentation of character regions · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G06V30/18 · Primary
Extraction of features or characteristics of the image · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 83510850

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022327816A1 cover?: A system trains a machine learning model which recognizes characters of text images. The system stores the machine learning model which recognizes characters of text images. The machine learning model includes a character segmentation network which is configured to extract visual features from text images, and to generate character bounding boxes from the text images, a domain adaptation networ…
Who is the assignee on this patent?: Hitachi Ltd