Automatic rule prediction and generation for document classification and validation

US11810381B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11810381-B2
Application number: US-202117303914-A
Country: US
Kind code: B2
Filing date: Jun 10, 2021
Priority date: Jun 10, 2021
Publication date: Nov 7, 2023
Grant date: Nov 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method is provided. The method may include, in response to electronically receiving a document, automatically classifying the document and different parts of the document, by electronically identifying a document type associated with the document and electronically tagging data associated with the different parts of the document based on classification rules. The method may further include automatically extracting the tagged data associated with the automatically classified document based on data extraction rules. The method may further include detecting first feedback associated with the classification rules and second feedback associated with the data extraction rules. The method may further include automatically generating and updating validation rules based on the identified document type, the detected first feedback, and the detected second feedback to validate the automatically classified document and the automatically tagged and extracted data.
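
As an illustration only, the classify, tag, extract, and validate flow summarized in the abstract might be sketched in Python as follows. The rule tables, field names, regular expressions, and sample invoice text are all hypothetical and do not come from the patent; keyword counting stands in for whatever classification technique an actual implementation would use, and the feedback-driven rule updates are left out of this sketch.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule tables keyed by document type. None of these come from the patent.
CLASSIFICATION_RULES = {
    "invoice": ["invoice", "amount due"],
    "receipt": ["receipt", "paid"],
}
TAGGING_RULES = {"invoice": {"total": "amount due", "date": "date:"}}
EXTRACTION_RULES = {"total": r"\d+\.\d{2}", "date": r"\d{4}-\d{2}-\d{2}"}
VALIDATION_RULES = {
    "invoice": {
        "total": lambda value: float(value) > 0,
        "date": lambda value: len(value) == 10,
    }
}


@dataclass
class Document:
    text: str
    doc_type: str = "unknown"
    tags: dict = field(default_factory=dict)       # tag -> line of text
    extracted: dict = field(default_factory=dict)  # tag -> extracted value


def classify_and_tag(doc):
    """Pick the document type with the most keyword hits, then tag matching lines."""
    lowered = doc.text.lower()
    scores = {
        doc_type: sum(keyword in lowered for keyword in keywords)
        for doc_type, keywords in CLASSIFICATION_RULES.items()
    }
    best = max(scores, key=scores.get)
    doc.doc_type = best if scores[best] > 0 else "unknown"
    for line in doc.text.splitlines():
        for tag, keyword in TAGGING_RULES.get(doc.doc_type, {}).items():
            if keyword in line.lower():
                doc.tags.setdefault(tag, line)
    return doc


def extract(doc):
    """Apply the extraction pattern for each tag to the line carrying that tag."""
    for tag, line in doc.tags.items():
        match = re.search(EXTRACTION_RULES[tag], line)
        if match:
            doc.extracted[tag] = match.group(0)
    return doc


def validate(doc):
    """Return the fields that are missing or fail their validation rule."""
    rules = VALIDATION_RULES.get(doc.doc_type, {})
    return [
        name for name, check in rules.items()
        if name not in doc.extracted or not check(doc.extracted[name])
    ]


def process(text):
    """Run the whole sketch on raw text and return the document plus failed fields."""
    doc = extract(classify_and_tag(Document(text)))
    return doc, validate(doc)
```

Calling `process("INVOICE\nDate: 2021-06-10\nAmount due: 125.50")` classifies the text as an invoice, extracts a date and a total, and returns an empty list of validation failures. Keeping the rules in per-document-type tables is a design choice of this sketch; it makes it easy to see where generated or updated rules would be swapped in.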

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: in response to electronically receiving a document, automatically classifying the document, wherein automatically classifying the document comprises electronically identifying a document type associated with the document; based on the document type associated with the document, automatically classifying different parts of the document and electronically tagging data associated with the different parts of the document based on one or more classification rules pertaining to the identified document type and identified data type in the document, wherein automatically classifying the document and automatically classifying the different parts of the document further comprises automatically generating and applying one or more classification rule suggestions for classifying the document and the different parts of the document; in response to detecting first feedback associated with the one or more classification rule suggestions, updating the one or more classification rules corresponding to the one or more classification rule suggestions based on the first feedback, and applying the updated one or more classification rules; automatically extracting the tagged data associated with the automatically classified document based on one or more data extraction rules associated with the identified document type and the identified data type, wherein automatically extracting the tagged data associated with the automatically classified document based on one or more data extraction rules further comprises automatically generating and applying one or more data extraction rule suggestions for extracting the tagged data; in response to detecting second feedback associated with the one more data extraction rule suggestions, updating the one or more data extraction rules corresponding to the one more data extraction rule suggestions based on the second feedback, and applying the updated one or more data extraction rules; and automatically generating, updating, and applying validation rules based on the identified document type, the detected first feedback, and the detected second feedback to validate the automatically classified document and the automatically tagged and extracted data, wherein automatically generating and applying the validation rules further comprises automatically generating and applying one or more validation rule suggestions for validating the automatically classified document and the automatically tagged and extracted data.

2. The method of claim 1, wherein automatically classifying the document and the different parts of the document further comprises: generating the one or more classification rule suggestions based on the one or more classification rules; and applying at least one classification rule suggestion from the generated one or more classification rules suggestions based on a percentage threshold.

3. The method of claim 1, wherein automatically extracting the tagged data associated with the automatically classified document further comprises: generating the one or more data extraction rule suggestions based on the one or more data extraction rules; and applying at least one data extraction rule suggestion from the generated one or more data extraction rule suggestions based on a percentage threshold.

4. The method of claim 1, wherein detecting the first feedback associated with the one or more classification rules comprises detecting via a user interface at least one of a selection of a classification rule, an acceptance of a classification rule suggestion, a rejection of the classification rule suggestion, and a edit of the classification rule suggestion.

5. The method of claim 1, wherein detecting the second feedback associated with the one or more data extraction rules comprises detecting via a user interface at least one of a selection of a data extraction rule, an acceptance of a data extraction rule suggestion, a rejection of the data extraction rule suggestion, and a edit of the data extraction rule suggestion.

6. The method of claim 1, further comprising: detecting third feedback, wherein the third feedback comprises data corrections to the automatically tagged and extracted data.

7. The method of claim 1, further comprising: automatically generating and updating the validation rules based on at least one of a type of industry associated with the document, an originating geography associated with the document, a company associated with the document.

8. A computer system, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: in response to electronically receiving a document, automatically classifying the document, wherein automatically classifying the document comprises electronically identifying a document type associated with the document; based on the document type associated with the document, automatically classifying different parts of the document and electronically tagging data associated with the different parts of the document based on one or more classification rules pertaining to the identified document type and identified data type in the document, wherein automatically classifying the document and automatically classifying the different parts of the document further comprises automatically generating and applying one or more classification rule suggestions for classifying the document and the different parts of the document; in response to detecting first feedback associated with the one or more classification rule suggestions, updating the one or more classification rules corresponding to the one or more classification rule suggestions based on the first feedback, and applying the updated one or more classification rules; automatically extracting the tagged data associated with the automatically classified document based on one or more data extraction rules associated with the identified document type and the identified data type, wherein automatically extracting the tagged data associated with the automatically classified document based on one or more data extraction rules further comprises automatically generating and applying one or more data extraction rule suggestions for extracting the tagged data; in response to detecting second feedback associated with the one more data extraction rule suggestions, updating the one or more data extraction rules corresponding to the one more data extraction rule suggestions based on the second feedback, and applying the updated one or more data extraction rules; and automatically generating, updating, and applying validation rules based on the identified document type, the detected first feedback, and the detected second feedback to validate the automatically classified document and the automatically tagged and extracted data, wherein automatically generating and applying the validation rules further comprises automatically generating and applying one or more validation rule suggestions for validating the automatically classified document and the automatically tagged and extracted data.

9. The computer system of claim 8, wherein automatically classifying the document and the different parts of the document further comprises: generating the one or more classification rule suggestions based on the one or more classification rules; and applying at least one classification rule suggestion from the generated one or more classification rules suggestions
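
Claims 2 through 5 refer to applying rule suggestions "based on a percentage threshold" and to feedback in the form of an acceptance, rejection, or edit of a suggestion. A minimal Python sketch of that idea is given below, under stated assumptions: the claims do not say what the percentage measures, so a 0 to 100 confidence score is assumed here, and the `RuleSuggestion` type, the rule identifiers, the patterns, and the mapping from each feedback action to a rule update are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSuggestion:
    rule_id: str
    pattern: str
    confidence_pct: float  # assumed 0-100 score; the claims only say "percentage threshold"


def select_suggestions(suggestions, threshold_pct):
    """Keep only the suggestions whose score meets the threshold."""
    return [s for s in suggestions if s.confidence_pct >= threshold_pct]


def apply_feedback(rules, suggestion, action, edited_pattern=None):
    """Return a new rule table (rule_id -> pattern) updated from one feedback event."""
    updated = dict(rules)
    if action == "accept":
        updated[suggestion.rule_id] = suggestion.pattern
    elif action == "edit":
        if edited_pattern is None:
            raise ValueError("an edit needs the edited pattern")
        updated[suggestion.rule_id] = edited_pattern
    elif action == "reject":
        pass  # assumption: a rejected suggestion leaves the existing rule untouched
    else:
        raise ValueError(f"unknown feedback action: {action}")
    return updated


# Hypothetical suggestions and starting rule table, for illustration only.
suggestions = [
    RuleSuggestion("invoice_total", r"Total:\s*(\d+\.\d{2})", 92.0),
    RuleSuggestion("invoice_date", r"Dated\s+(\S+)", 40.0),
]
rules = {"invoice_total": r"Amount:\s*(\d+)"}

# Only suggestions at or above the 75% threshold are applied, here as acceptances.
for suggestion in select_suggestions(suggestions, threshold_pct=75.0):
    rules = apply_feedback(rules, suggestion, "accept")
```

Returning a fresh rule table from `apply_feedback` rather than mutating the input is a choice made for this sketch so that each feedback event can be inspected or undone; the patent text shown here does not prescribe it.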

Assignees

IBM

Inventors

Classifications

  • G06V30/413 (Primary)

    Classification of content, e.g. text, photographs or tables · CPC title

  • based on feedback of a supervisor · CPC title

  • Rule-based classification · CPC title

  • using rules for classification or partitioning the feature space · CPC title

  • using rules for classification or partitioning the feature space · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11810381B2 cover?
A method is provided. The method may include, in response to electronically receiving a document, automatically classifying the document and different parts of the document, by electronically identifying a document type associated with the document and electronically tagging data associated with the different parts of the document based on classification rules. The method may further include au…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06V30/413. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).