Artificial intelligence based document processor
US-2019354720-A1 · Nov 21, 2019 · US
US12118478B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12118478-B2 |
| Application number | US-202016998674-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 20, 2020 |
| Priority date | May 8, 2020 |
| Publication date | Oct 15, 2024 |
| Grant date | Oct 15, 2024 |
Official abstract text for this publication.
Systems and methods for deriving classification rules from documents and a database using rule-based machine learning. The method includes extracting first variables from documents corresponding to an organization. The method further includes extracting second variables from a database corresponding to the organization. The method also includes filtering the extracted second variables based on at least one of null values, repeat variables, location variables, ID variables, or date variables. The method further includes deriving first classification rules based on the first variables using a rule-based machine learning algorithm. The method also includes calculating an accuracy of the derived first classification rules. The method also includes deriving second classification rules based on the first variables and the filtered second variables. The method further includes determining a suggested additional variable based on the derived second classification rules and the calculated accuracy.
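The filtering step in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the column names, the token lists used to spot ID and location variables, and the reading of "null values" as "columns with no non-null entries". The patent text shown on this page does not specify how each filter is implemented.

```python
from datetime import date

# Hypothetical name tokens used to recognize ID and location variables.
ID_TOKENS = {"id"}
LOCATION_TOKENS = {"city", "state", "zip", "address"}


def filter_variables(columns):
    """Drop null, repeat, location, ID, and date variables.

    `columns` maps a variable name to its list of values (assumed layout).
    """
    kept = {}
    seen_value_lists = set()
    for name, values in columns.items():
        tokens = set(name.lower().split("_"))
        non_null = [v for v in values if v is not None]
        if not non_null:
            continue  # null values: nothing usable in this column
        if tokens & ID_TOKENS or tokens & LOCATION_TOKENS:
            continue  # ID or location variable, judged by its name
        if all(isinstance(v, date) for v in non_null):
            continue  # date variable
        signature = tuple(values)
        if signature in seen_value_lists:
            continue  # repeat variable: same values as an earlier column
        seen_value_lists.add(signature)
        kept[name] = values
    return kept


# Toy database extract (invented data).
example_columns = {
    "employee_id": [1, 2, 3],
    "salary": [50, 60, 70],
    "salary_copy": [50, 60, 70],
    "middle_name": [None, None, None],
    "home_state": ["MA", "NY", "MA"],
    "hire_date": [date(2020, 1, 1), date(2019, 5, 1), None],
    "tenure_years": [1, 5, 9],
}

print(list(filter_variables(example_columns)))
```

On this toy input only `salary` and `tenure_years` survive; the other columns each trip one of the five filters.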
Opening claim text (preview).
What is claimed:

1. A method for deriving classification rules from documents and a database using rule-based machine learning, the method comprising:
extracting, by a server computing device, a first plurality of variables from documents corresponding to an organization;
extracting, by the server computing device, a second plurality of variables from a database corresponding to the organization;
filtering, by the server computing device, the extracted second plurality of variables based on at least one of null values, repeat variables, location variables, ID variables, or date variables;
deriving, by the server computing device, a first plurality of classification rules based on the first plurality of variables using a rule-based machine learning algorithm, comprising:
a) executing the rule-based machine learning algorithm using the first plurality of variables as input to derive a classification rule and identify a subset of the first plurality of variables that satisfy the classification rule,
b) removing the identified subset of variables from the first plurality of variables,
c) repeating steps a) and b) using the remaining first plurality of variables as input to identify additional subsets of the remaining variables that satisfy additional classification rules, and
d) storing the classification rules derived by the rule-based machine learning algorithm in step a) as the first plurality of classification rules;
calculating, by the server computing device, an accuracy of the derived first plurality of classification rules;
deriving, by the server computing device, a second plurality of classification rules based on the first plurality of variables and filtered second plurality of variables using the rule-based machine learning algorithm, comprising:
e) executing the rule-based machine learning algorithm using the first plurality of variables and one of the filtered second plurality of variables as input to derive a classification rule and identify a subset of the first plurality of variables and one of the filtered second plurality of variables that satisfy the classification rule,
f) removing the identified subset of variables from the first plurality of variables and the one of the filtered second plurality of variables,
g) updating the calculated accuracy of the derived first plurality of classification rules based upon the first plurality of variables and the one of the filtered second plurality of variables,
h) repeating steps e), f) and g) using the remaining first plurality of variables and another one of the filtered second plurality of variables as input to identify additional subsets of the remaining variables that satisfy additional classification rules, and
i) storing the classification rules generated by the rule-based machine learning algorithm in step e) as the second plurality of classification rules; and
identifying, by the server computing device, a suggested additional variable from the filtered second plurality of variables for inclusion in the first plurality of variables based upon the updated accuracy of the derived first plurality of classification rules; and
generating, by the server computing device, for display the derived first plurality of classification rules, the derived second plurality of classification rules, the updated accuracy of the derived first plurality of classification rules, and the suggested additional variable.

2. The method of claim 1, wherein the server computing device is configured to calculate the accuracy of the derived first plurality of classification rules based on a known plurality of classification rules corresponding to the organization.

3. The method of claim 1, wherein the server computing device is further configured to extract the first plurality of variables using natural language processing.

4. The method of claim 1, wherein the database comprises demographic data, employment data, and benefit plan data.

5. The method of claim 1, wherein the server computing device is further configured to map the extracted first plurality of variables to corresponding entries of the database.

6. The method of claim 1, wherein the server computing device is further configured to classify each of the extracted first plurality of variables and second plurality of variables as character-based or numeric.

7. The method of claim 1, wherein the first plurality of classification rules are derived sequentially using the rule-based machine learning algorithm.

8. The method of claim 1, wherein the server computing device is further configured to sequentially derive the second plurality of classification rules based on the first plurality of variables and the filtered second plurality of variables using the rule-based machine learning algorithm.

9. The method of claim 8, wherein the server computing device is further configured to calculate an accuracy of the derived second plurality of classification rules based on a known plurality of classification rules corresponding to the organization.

10. A system for deriving classification rules from documents and a database using rule-based machine learning, the system comprising:
a server computing device communicatively coupled to a database corresponding to an organization and a display device, the server computing device configured to:
extract a first plurality of variables from documents corresponding to an organization;
extract a second plurality of variables from the database corresponding to the organization;
filter the extracted second plurality of variables based on at least one of null values, repeat variables, location variables, ID variables, or date variables;
derive a first plurality of classification rules based on the first plurality of variables using a rule-based machine learning algorithm, comprising:
a) executing the rule-based machine learning algorithm using the first plurality of variables as input to derive a classification rule and identify a subset of the first plurality of variables that satisfy the classification rule,
b) removing the identified subset of variables from the first plurality of variables,
c) repeating steps a) and b) using the remaining first plurality of variables as input to identify additional subsets of the remaining variables that satisfy additional classification rules, and
d) storing the classification rules derived by the rule-based machine learning algorithm in step a) as the first plurality of classification rules;
calculate an accuracy of the derived first plurality of classification rules;
derive a second plurality of classification rules based on the first plurality of variables and filtered second plurality of variables using a rule-based machine learning algorithm, comprising:
e) executing the rule-based machine learning algorithm using the first plurality of variables and one of the filtered second plurality of variables as input to derive a classification rule and identify a subset of the first plurality of variables and one of the filtered second plurality of variables that satisfy the classification rule,
f) removing the identified subset of variables from the first plurality of variables and the one of the filtered second plurality of variables,
g) updating the calculated accuracy of the derived first plurality of classification rules based upon the first plurality of variables and the one of the filtered second plurality of variables,
h) repeating steps e), f) and g) using the remaining first plurality of variables and another one of the filtered second plurality of variables as input to identify additional subsets of the remaining variables that satisfy additional classification rules, and
i) storing the classification rules g
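Steps a) through i) of the independent claims read like a sequential-covering loop followed by a one-variable-at-a-time comparison. The sketch below is one possible reading, not the patented implementation: the single-condition rule learner, the record layout, the field names (`employee_type`, `region`, `union_member`, `plan`), and scoring accuracy against known labels on the same records are all assumptions. The claims do not name a specific rule-based algorithm.

```python
def learn_one_rule(records, features, target):
    """Pick the (feature, value, label) rule with the best precision, then coverage."""
    best, best_score = None, (0.0, 0)
    for feature in features:
        for value in sorted({r[feature] for r in records}, key=repr):
            labels = [r[target] for r in records if r[feature] == value]
            label = max(sorted(set(labels)), key=labels.count)
            score = (labels.count(label) / len(labels), len(labels))
            if score > best_score:
                best, best_score = (feature, value, label), score
    return best


def derive_rules(records, features, target):
    """Steps a)-d): learn a rule, remove what it covers, repeat, store the rules."""
    remaining, rules = list(records), []
    while remaining:
        rule = learn_one_rule(remaining, features, target)
        if rule is None:
            break  # no features to build a rule from
        rules.append(rule)
        feature, value, _ = rule
        remaining = [r for r in remaining if r[feature] != value]
    return rules


def accuracy(rules, records, target):
    """Share of records whose first matching rule predicts the known label."""
    correct = 0
    for r in records:
        predicted = next((lab for f, v, lab in rules if r[f] == v), None)
        correct += predicted == r[target]
    return correct / len(records)


def suggest_variable(records, base_features, candidates, target):
    """Steps e)-i): retry with one extra variable at a time, keep the best gain."""
    baseline = accuracy(derive_rules(records, base_features, target), records, target)
    best_name, best_acc = None, baseline
    for name in candidates:
        rules = derive_rules(records, base_features + [name], target)
        acc = accuracy(rules, records, target)
        if acc > best_acc:
            best_name, best_acc = name, acc
    return best_name, baseline, best_acc


# Toy records (invented): one document-derived variable, two database candidates.
example_records = [
    {"employee_type": "FT", "union_member": "Y", "region": "E", "plan": "A"},
    {"employee_type": "FT", "union_member": "N", "region": "W", "plan": "B"},
    {"employee_type": "PT", "union_member": "N", "region": "E", "plan": "B"},
    {"employee_type": "PT", "union_member": "Y", "region": "W", "plan": "A"},
]

print(suggest_variable(example_records, ["employee_type"],
                       ["region", "union_member"], "plan"))
```

In this toy data `employee_type` alone cannot separate the two plans, while adding `union_member` lets the rules classify every record, so that variable is the one suggested. A real system would presumably score accuracy against held-out or independently known rules, as claims 2 and 9 indicate.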
Machine learning · CPC title
Document management systems · CPC title
Recognition of textual entities · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
Extracting rules from data · CPC title