What technology area does this patent fall under?

Primary CPC classification G06F40/295. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 13, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Information processing apparatus, information processing system, information processing method, and storage medium

US2024193370A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2024193370-A1
Application number	US-202318533685-A
Country	US
Kind code	A1
Filing date	Dec 8, 2023
Priority date	Dec 13, 2022
Publication date	Jun 13, 2024
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

To make it possible to extract a character string corresponding to each extraction-target item with accuracy even in a case where the character string ranges of a plurality of extraction-target items overlap one another in the task of named entity recognition. By using a training model trained to extract a character string corresponding to each of a plurality of items within a document, a character string corresponding to each of the plurality of items is extracted and output for an input document image. Then, a character string corresponding to an item among the plurality of items, for which a corresponding character string is not extracted, is re-extracted from the character string output by the first extracting.
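The two-pass idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `first_extract` is a hypothetical rule-based stand-in for the trained model, and the item names (`company_name`, `branch_name`), the sample text, and the `SECOND_PASS_RULES` table are all assumptions made for illustration. The sketch only shows the control flow: items left empty by the first pass are re-extracted from a string that the first pass did return.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical rules: missing item -> (item whose first-pass string is searched, pattern).
SECOND_PASS_RULES: Dict[str, Tuple[str, str]] = {
    "branch_name": ("company_name", r"\b\w+ Branch\b"),
}


def first_extract(document_text: str) -> Dict[str, Optional[str]]:
    """Stand-in for the trained model (hypothetical, rule-based for this sketch).

    It labels the whole overlapping span as company_name and returns nothing
    for branch_name, mimicking the overlap problem the abstract describes.
    """
    match = re.search(r"Bill to: (.+)", document_text)
    return {
        "company_name": match.group(1).strip() if match else None,
        "branch_name": None,
    }


def second_extract(results: Dict[str, Optional[str]], item: str) -> Optional[str]:
    """Re-extract a missing item from a string obtained by the first pass."""
    rule = SECOND_PASS_RULES.get(item)
    if rule is None:
        return None
    source_item, pattern = rule
    source = results.get(source_item)
    if not source:
        return None
    match = re.search(pattern, source)
    return match.group(0) if match else None


def extract_items(document_text: str) -> Dict[str, Optional[str]]:
    """Run the first pass, then the second pass only for items left empty."""
    results = first_extract(document_text)
    missing = [name for name, value in results.items() if value is None]
    for item in missing:
        results[item] = second_extract(results, item)
    return results


result = extract_items("Invoice\nBill to: Example Corp Tokyo Branch\nTotal: 100")
```

Here `result["branch_name"]` is filled from the `company_name` string rather than from the full document, which is the point of the second pass: the search space is limited to text the first pass already isolated.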

First claim

Opening claim text (preview).

What is claimed is:

1. An information processing apparatus comprising: one or more memories storing instructions; and one or more processors executing the instructions to perform: first extracting to extract, by using a training model trained to extract a character string corresponding to each of a plurality of items within a document, a character string corresponding to each of the plurality of items for an input document image; and second extracting to extract a character string corresponding to an item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, from the character string obtained by the first extracting.

2. The information processing apparatus according to claim 1, wherein the second extracting is performed by using the training model used for the first extracting, whose input and output are limited.

3. The information processing apparatus according to claim 1, wherein the second extracting is performed by using a training model different from the training model used for the first extracting, which is trained to extract a character string corresponding to a second item different from a first item from a character string corresponding to the first item of the plurality of items.

4. The information processing apparatus according to claim 1, wherein in the second extracting, key-value extracting is performed, to which a keyword and a data type corresponding to an item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, are set.

5. The information processing apparatus according to claim 1, wherein the one or more processors further execute the instructions to perform setting an extraction-target item in the second extracting in advance and the second extracting is performed in a case where the item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, is the extraction-target item set in advance.

6. The information processing apparatus according to claim 1, wherein the one or more processors further execute the instructions to perform causing a display unit to display a UI screen on which results of the first extracting are shown, on the UI screen, a UI element for a user to give instructions to perform the second extracting exists, and based on user instructions via the UI screen, the second extracting is performed.

7. The information processing apparatus according to claim 6, wherein the UI element is displayed on the UI screen in association with the item among the plurality of items, for which a corresponding character string is not extracted by the first extracting and in the second extracting, a character string corresponding to the item with which the UI element is associated is extracted.

8. An information processing system comprising: a training device generating a training model by performing training for extracting a character string corresponding to each of a plurality of items from a document image; and an information processing apparatus performing first extracting to extract, by using the training model, a character string corresponding to each of the plurality of items for an input document image and second extracting to extract a character string corresponding to an item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, from the character string obtained by the first extracting.

9. The information processing system according to claim 8, wherein the second extracting is performed by using the training model used for the first extracting, whose input and output are limited.

10. The information processing system according to claim 8, wherein the second extracting is performed by using a training model different from the training model used for the first extracting and the training device further generates the other different training model by performing training for extracting a character string corresponding to a second item different from a first item from a character string corresponding to the first item of the plurality of items.

11. The information processing system according to claim 8, wherein in the second extracting, key-value extracting is performed, to which a keyword and a data type corresponding to an item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, are set.

12. An information processing method comprising the steps of: performing first extracting to extract, by using a training model trained to extract a character string corresponding to each of a plurality of items within a document, a character string corresponding to each of the plurality of items for an input document image; and performing second extracting to extract a character string corresponding to an item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, from the character string obtained by the first extracting.

13. A non-transitory computer readable storage medium storing a program for causing a computer to perform an information processing method comprising the steps of: performing first extracting to extract, by using a training model trained to extract a character string corresponding to each of a plurality of items within a document, a character string corresponding to each of the plurality of items for an input document image; and performing second extracting to extract a character string corresponding to an item among the plurality of items, for which a corresponding character string is not extracted by the first extracting, from the character string obtained by the first extracting.
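Claims 4 and 11 describe a second extraction done as key-value extraction, configured with a keyword and a data type for the missing item. One way to picture that is the Python sketch below. It is an illustration under stated assumptions, not the method disclosed in the patent: the `DATA_TYPE_PATTERNS` table, the `key_value_extract` function, the data-type names, and the sample string are all hypothetical.

```python
import re
from typing import Dict, Optional

# Hypothetical data types, each mapped to a regular expression for its values.
DATA_TYPE_PATTERNS: Dict[str, str] = {
    "date": r"\d{4}-\d{2}-\d{2}",
    "amount": r"\d{1,3}(?:,\d{3})*(?:\.\d+)?",
    "phone": r"\d{2,4}-\d{2,4}-\d{4}",
}


def key_value_extract(text: str, keyword: str, data_type: str) -> Optional[str]:
    """Find `keyword` in `text` and return the value of `data_type` that follows it.

    `text` is meant to be a string already obtained by the first extraction,
    not the whole document.
    """
    value_pattern = DATA_TYPE_PATTERNS[data_type]
    # Keyword, optional colon and whitespace, then a value of the requested type.
    pattern = re.escape(keyword) + r"\s*:?\s*(" + value_pattern + r")"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1) if match else None


# A string assumed to have been returned by the first extraction for one item.
first_pass_string = "Example Corp, Tel: 03-1234-5678, due 2024-06-13"

phone = key_value_extract(first_pass_string, "Tel", "phone")
due_date = key_value_extract(first_pass_string, "due", "date")
fax = key_value_extract(first_pass_string, "Fax", "phone")
```

In this sketch `phone` and `due_date` are recovered from the first-pass string, while `fax` is `None` because the keyword does not occur. Pairing a keyword with a data type keeps the second pass narrow: the keyword locates the value and the data type constrains what counts as one.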

Assignees

Canon KK

Inventors

Achiwa Ken

Classifications

G06V30/10: Character recognition (CPC title)
G06F40/295 (Primary): Named entity recognition (CPC title)

Patent family

Related publications grouped by family.

Patent family ID: 91380931

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024193370A1 cover?: To make it possible to extract a character string corresponding to each extraction-target item with accuracy even in a case where the character string ranges of a plurality of extraction-target items overlap one another in the task of named entity recognition. By using a training model trained to extract a character string corresponding to each of a plurality of items within a document, a chara…
Who is the assignee on this patent?: Canon KK

Related patents

Method and system for extracting information from a document image

Information processing apparatus and non-transitory computer readable medium storing information processing program

Automatic generation and population of digital interfaces based on adaptively processed image data

Extraction of expression for natural language processing
