Model generation system and model generation method

US12380718B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12380718-B2
Application number  US-202318111254-A
Country             US
Kind code           B2
Filing date         Feb 17, 2023
Priority date       Mar 22, 2022
Publication date    Aug 5, 2025
Grant date          Aug 5, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a model generation system for generating a text line recognition model that recognizes a text line included in a text line image, the model generation system including a processor section, in which the text line recognition model includes a visual feature extractor and a language context relation network, the processor section determines a variable of the language context relation network by acquiring text data for training and thus training the language context relation network by using the acquired text data, determines a variable of the visual feature extractor by training the text line recognition model through the use of a labeled text line image while the variable of the language context relation network is fixed, and generates the text line recognition model while the variable of the language context relation network is set to the determined variable thereof and the variable of the visual feature extractor is set to the determined variable thereof.
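The three-stage procedure in the abstract can be illustrated with a deliberately tiny sketch. It is not the patented implementation: both networks are reduced to scalar linear maps, and every name (`fit_language_net`, `fit_visual_extractor`, `recognize`) and every number is a hypothetical placeholder. It only shows the ordering: determine the language-side variables from text-only data, hold them fixed while fitting the visual-side variables on labeled samples, then compose the two.

```python
# Toy sketch: all names and numbers here are hypothetical, not from the patent.

def fit_language_net(pairs, steps=2000, lr=0.05):
    """Stage 1: fit output = w * feature + b using text-derived data only."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        grad_w = sum(((w * h + b) - y) * h for h, y in pairs) / n
        grad_b = sum(((w * h + b) - y) for h, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_visual_extractor(samples, lang_vars, steps=2000, lr=0.05):
    """Stage 2: fit feature = a * pixel + c end to end; w and b stay frozen."""
    w, b = lang_vars  # read only: never updated in this loop
    a, c = 0.0, 0.0
    n = len(samples)
    for _ in range(steps):
        # Chain rule through the frozen language part: d(output)/d(feature) = w
        grad_a = sum(((w * (a * x + c) + b) - y) * w * x for x, y in samples) / n
        grad_c = sum(((w * (a * x + c) + b) - y) * w for x, y in samples) / n
        a -= lr * grad_a
        c -= lr * grad_c
    return a, c


def recognize(x, vis_vars, lang_vars):
    """Stage 3: the generated model composes both parts with their determined variables."""
    a, c = vis_vars
    w, b = lang_vars
    return w * (a * x + c) + b


grid = [i / 5 - 1 for i in range(11)]            # -1.0 ... 1.0
text_pairs = [(h, 2 * h + 1) for h in grid]      # stands in for text-only training data
labeled_images = [(x, x + 0.6) for x in grid]    # stands in for labeled text line images

lang_vars = fit_language_net(text_pairs)
vis_vars = fit_visual_extractor(labeled_images, lang_vars)
print(lang_vars, vis_vars, recognize(0.5, vis_vars, lang_vars))
```

In a real system the two parts would be neural networks, and freezing would typically mean excluding the language-side parameters from the optimizer; the scalar version only makes the ordering of the stages easy to follow.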

First claim

Opening claim text (preview).

What is claimed is:

1. A model generation system for generating a text line recognition model that recognizes a text line included in a text line image, the model generation system comprising: a memory; and a processor section, wherein the text line recognition model includes a visual feature extractor that, when executed by the processor section, outputs image feature values from the text line image, and a language context relation network that, when executed by the processor section, inputs the feature values outputted from the visual feature extractor, and outputs the text line, the processor section by executing a program stored in the memory that performs the following steps: (1) determining a variable of the language context relation network by acquiring text data for training and thus training the language context relation network by using the acquired text data, (2) determining a variable of the visual feature extractor by training the text line recognition model through use of an existing labeled text line image while the variable of the language context relation network is set according to step (1), and (3) generating the text line recognition model while the variable of the language context relation network is set according to step (1) and the variable of the visual feature extractor is set according to step (2), wherein the memory is configured to store the text line recognition model.

2. The model generation system according to claim 1, wherein the processor section adjusts the variable of the text line recognition model by training the text line recognition model through use of labeled text line images smaller in number than a predetermined number.

3. The model generation system according to claim 1, wherein the model generation system is connected to the Internet, and the processor section accesses the Internet to acquire the text data for the training.

4. The model generation system according to claim 3, wherein the text data for the training is formed by copyright-free text data published on the Internet.

5. The model generation system according to claim 2, wherein the processor section by executing a program stored in the memory that: receives a text line image and a label to be attached to the text line image that are inputted by a user, and adjusts a variable of the text line recognition model by training the text line recognition model through use of the received text line image and label.

6. The model generation system according to claim 1, wherein the processor section trains the language context relation network by acquiring text line data for the training, performing word embedding for quantifying the acquired text line data, convolving the quantified data, and inputting the resulting data to the language context relation network.

7. The model generation system according to claim 1, wherein the processor section trains the language context relation network by acquiring the text line data for the training, converting the acquired text line data to a text line image through use of a predetermined font, inputting the resulting text line image to a predetermined visual feature extractor, and inputting the output of the predetermined visual feature extractor to the language context relation network.

8. The model generation system according to claim 1, wherein the existing labeled text line image is managed via the processor by a plurality of style-specific image groups formed by text line images of a same style, and the processor section determines the variable of the visual feature extractor by training the text line recognition model through use of the labeled text line image in each of the style-specific image groups while the variable of the language context relation network is fixed at the determined variable.

9. A model generation method adopted by a model generation system for generating a text line recognition model that recognizes a text line included in a text line image, the text line recognition model including a visual feature extractor that, when executed by the model generation system, outputs image feature values from the text line image, and a language context relation network that, when executed by the model generation system, inputs the feature values outputted from the visual feature extractor, and outputs the text line, the model generation method comprising: by the model generation system, the method including the following steps: (1) determining a variable of the language context relation network by acquiring text data for training and thus training the language context relation network by using the acquired text data; (2) determining a variable of the visual feature extractor by training the text line recognition model through use of an existing labeled text line image while the variable of the language context relation network is set according to step (1); and (3) generating the text line recognition model while the variable of the language context relation network is set according to step (1) and the variable of the visual feature extractor is set according to step (2).
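Claim 6 describes preparing text-only training input by embedding the text and convolving the result before it reaches the language context relation network. The following is a minimal sketch of that data path under stated assumptions: the embedding is a fixed toy formula rather than a learned table, the kernel weights are arbitrary, and the function names `embed` and `conv1d` are hypothetical. The patent text shown here does not specify any of these details.

```python
# Toy sketch: the embedding values and kernel are hypothetical placeholders.

def embed(text, vocab, dim=4):
    """Quantify each character as a small vector (stand-in for a learned embedding)."""
    return [[((vocab[ch] + 1) * (k + 1) % 7) / 7.0 for k in range(dim)] for ch in text]


def conv1d(seq, kernel):
    """Slide the kernel along the character axis, one output vector per window."""
    width, dim = len(kernel), len(seq[0])
    return [
        [sum(kernel[j] * seq[t + j][c] for j in range(width)) for c in range(dim)]
        for t in range(len(seq) - width + 1)
    ]


line = "text line"
vocab = {ch: i for i, ch in enumerate(sorted(set(line)))}
features = conv1d(embed(line, vocab), kernel=[0.25, 0.5, 0.25])
print(len(features), len(features[0]))
```

The resulting sequence of vectors plays the role that image feature values play at recognition time, which is what would let the language-side variables be trained without any labeled images.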

Assignees

Hitachi Ltd

Inventors

Classifications

  • G06V30/18 · Primary

    Extraction of features or characteristics of the image · CPC title

  • using context analysis, e.g. lexical, syntactic or semantic context · CPC title

  • Interactive pattern learning with a human teacher · CPC title

  • of cursive writing · CPC title

  • using neural networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12380718B2 cover?
Provided is a model generation system for generating a text line recognition model that recognizes a text line included in a text line image, the model generation system including a processor section, in which the text line recognition model includes a visual feature extractor and a language context relation network, the processor section determines a variable of the language context relation n…
Who is the assignee on this patent?
Hitachi Ltd
What technology area does this patent fall under?
Primary CPC classification G06V30/18. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 5, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).