Search item generation method and related device
US-2019370605-A1 · Dec 5, 2019 · US
US12191007B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12191007-B2 |
| Application number | US-201716618656-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 29, 2017 |
| Priority date | Aug 30, 2017 |
| Publication date | Jan 7, 2025 |
| Grant date | Jan 7, 2025 |
Official abstract text for this publication.
Example embodiments relate to a method for training a predictive model from data. The method includes defining a multitude of predicates as binary functions operating on time sequences of the features or logical operations on the time sequences of the features. The method also includes iteratively training a boosting model by generating a number of new random predicates, scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model, selecting a number of the new random predicates with the highest weighted information gain and adding them to the boosting model, computing weights for all the predicates in the boosting model, and removing one or more of the selected new predicates with the highest information gain from the boosting model in response to input from an operator. The method may include repeating the prior steps a plurality of times.
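The loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Record` layout, the `propose` and `operator_rejects` callbacks, the AdaBoost-style weight formula, and the exponential re-weighting of examples are all hypothetical stand-ins, since the abstract does not specify how weights are computed or how the operator's input is collected.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

Record = Dict[str, List[Tuple[float, float]]]  # feature name -> [(value, time), ...]
Predicate = Callable[[Record], int]            # binary function: returns 0 or 1


def entropy(pos: float, neg: float) -> float:
    """Binary entropy in bits of a weighted positive/negative split."""
    total = pos + neg
    if total == 0:
        return 0.0
    return -sum(w / total * math.log2(w / total) for w in (pos, neg) if w > 0)


def weighted_info_gain(pred: Predicate, records, labels, ex_w) -> float:
    """Information gain of a predicate about the class label, with example weights."""
    by_label = [0.0, 0.0]
    branch = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # predicate output -> weight per label
    for rec, y, w in zip(records, labels, ex_w):
        by_label[y] += w
        branch[pred(rec)][y] += w
    total = sum(by_label)
    if total == 0:
        return 0.0
    conditional = sum(sum(b) / total * entropy(b[1], b[0]) for b in branch.values())
    return entropy(by_label[1], by_label[0]) - conditional


def predicate_weight(pred: Predicate, records, labels, ex_w) -> float:
    """AdaBoost-style weight. Assumption: the source does not fix a formula."""
    err = sum(w for rec, y, w in zip(records, labels, ex_w) if pred(rec) != y)
    err = min(max(err / sum(ex_w), 1e-6), 1 - 1e-6)  # clip to avoid log(0)
    return 0.5 * math.log((1 - err) / err)


def train(records, labels, propose, operator_rejects, rounds=3, n_new=8, n_keep=2):
    model: Dict[str, Tuple[Predicate, float]] = {}  # name -> (predicate, weight)
    ex_w = [1.0] * len(records)
    for _ in range(rounds):
        # Step 1: generate new random predicates (de-duplicated by name).
        candidates = dict(propose() for _ in range(n_new))
        # Step 2: score every candidate by weighted information gain.
        ranked = sorted(
            candidates.items(),
            key=lambda c: weighted_info_gain(c[1], records, labels, ex_w),
            reverse=True,
        )
        # Step 3: add the highest-scoring candidates to the model.
        for name, pred in ranked[:n_keep]:
            model.setdefault(name, (pred, 0.0))
        # Step 4: compute weights for all predicates in the model.
        for name, (pred, _) in list(model.items()):
            model[name] = (pred, predicate_weight(pred, records, labels, ex_w))
        # Step 5: remove predicates in response to operator input.
        for name in [n for n in model if operator_rejects(n)]:
            del model[name]
        # Assumed boosting step: up-weight examples the current model gets wrong.
        for i, (rec, y) in enumerate(zip(records, labels)):
            score = sum(a * (2 * p(rec) - 1) for p, a in model.values())
            ex_w[i] = math.exp(-(2 * y - 1) * score)
    return {name: weight for name, (_, weight) in model.items()}


# Toy data: feature "a" determines the label, feature "b" is uninformative.
records = []
for i in range(8):
    rec: Record = {}
    if i < 4:
        rec["a"] = [(1.0, 0.0)]
    if i % 2 == 0:
        rec["b"] = [(1.0, 0.0)]
    records.append(rec)
labels = [1 if i < 4 else 0 for i in range(8)]

features = itertools.cycle(["a", "b"])  # deterministic stand-in for random generation


def propose():
    f = next(features)
    return f"exists({f})", (lambda rec, f=f: int(f in rec))


final = train(records, labels, propose, operator_rejects=lambda n: n == "exists(b)")
```

Here the operator callback plays the role of the human reviewer: `exists(b)` is selected in each round but deleted again, so only `exists(a)` survives in `final`.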
Opening claim text (preview).
We claim:
1. A computer-implemented method of training a predictive model from data comprising a multitude of features, each feature associated with a real value and a time component, comprising the steps of executing the following instructions in a processor of the computer:
  a) defining a multitude of predicates as binary functions operating on time sequences of the features or logical operations on the time sequences of the features;
  b) iteratively training a boosting model by performing the following:
    1) Generating a number of new random predicates as binary functions operating on at least one of (i) time sequences of the features or (ii) logical operations on the time sequences of the features;
    2) Scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model;
    3) Selecting, from the new random predicates, a number of the new random predicates that are the highest with respect to their weighted information gain scores and adding them to the boosting model;
    4) Computing weights for all the predicates in the boosting model;
    5) Removing one or more of the selected number of the new random predicates from the boosting model in response to input from an operator; and
    6) Repeating the performance of steps 1, 2, 3, 4 and 5 a plurality of times and thereby generating a final iteratively trained boosting model.
2. The method of claim 1, further comprising the step of c) evaluating the final iteratively trained boosting model.
3. The method of claim 2, wherein the evaluation step (c) comprises evaluating the final iteratively trained boosting model for at least one of accuracy, complexity, or trustworthiness.
4. The method of claim 1, wherein the data is in a tuple format of the type {X, x_i, t_i} where X is the name of feature, x_i is a real value of the feature and t_i is a time component for the real value x_i, and wherein the predicates are defined as binary functions operating on at least one of (i) sequences of tuples or (ii) logical operations on sequences of the tuples.
5. The method of claim 4, wherein the sequences of tuples are defined by time periods selected from the group consisting of 1 or more days, 1 or more hours, 1 or more minutes, or 1 or more months.
6. The method of claim 1, wherein the data comprises electronic health record data for a multitude of patients.
7. The method of claim 1, wherein the method further comprises the step of dividing the predicates into groups based on understandability, namely a first group of relatively more human understandable predicates and a second group of relatively less human understandable predicates and wherein the new random predicates are selected from the first group.
8. The method of claim 7, wherein the data comprises electronic health record data for a multitude of patients, and wherein the set of predicates are represented in a manner to show the subject matter or source within the electronic health record data of the predicates.
9. The method of claim 8, wherein the predicates comprise an existence predicate returning a result of 0 or 1 depending on whether a feature exists in the electronic health record data for a given patient in the multitude of patients; and a counts predicate returning a result of 0 or 1 depending on the number of counts of a feature in the electronic health record data for a given patient in the multitude of patients relative to a numeric parameter C.
10. The method of claim 1, wherein step b) 5) further comprises the step of graphically representing the predicates currently in the boosting model and providing the operator with the ability to remove one or more of the predicates.
11. The method of claim 10, further comprising the step of graphically representing the weights computed for each of the predicates in step b) 4).
12. The method of claim 1, further comprising the step of graphically representing a set of predicates added to the boosting model after each of the iterations of step b) 6).
13. The method of claim 1, wherein step b) further comprises the step of providing the operator with the ability to define a predicate during model training.
14. The method of claim 1, wherein step b) further comprises the step of removing redundant predicates.
15. The method of claim 1, further comprising the step of ranking the predicates selected in step b) 3).
16. The method of claim 1, further comprising the step of generating statistics of predicates in the boosting model and presenting them to the operator.
17. The method of claim 1, wherein in step b) 5) the one or more predicates are removed which are not causally related to the prediction of the boosting model.
18. A computer-implemented method of training a predictive model from electronic health record data for a multitude of patients, the data comprising a multitude of features, each feature associated with real values and a time component, wherein the data is in a tuple format of the type {X, x_i, t_i} where X is the name of feature, x_i is a real value of the feature and t_i is a time component for the real value x_i, comprising the steps of implementing the following instructions in a processor of the computer:
  a) defining a multitude of predicates as at least one of (i) binary functions operating on sequences of the tuples or (ii) logical operations on the sequences of the tuples;
  b) dividing the multitude of predicates into groups based on understandability, namely a first group of relatively more human understandable predicates and a second group of relatively less human understandable predicates;
  c) iteratively training a boosting model by performing the following:
    1) Generating a number of new random predicates from the first group of predicates as binary functions operating on at least one of (i) sequences of the tuples or (ii) logical operations on the sequences of the tuples;
    2) Scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model;
    3) Selecting, from the new random predicates, a number of the new random predicates that are the highest with respect to their weighted information gain scores and adding them to the boosting model;
    4) Computing weights for all the predicates in the boosting model;
    5) Removing one or more of the selected number of the new random predicates from the boosting model in response to input from an operator; and
    6) Repeating the performance of steps 1, 2, 3, 4 and 5 a plurality of times and thereby generating a final iteratively trained boosting model.
19. The method of claim 18, further comprising the step d) of evaluating the final iteratively trained boosting model.
20. A workstation for providing operator input into iteratively training a boosting model, wherein the workstation comprises an interface and a processor, and wherein the processor is configured to perform operations comprising:
    1) Generating a number of new random predicates as binary functions operating on at least one of (i) time sequences of input features or (ii) logical operations on the time sequences of the input features;
    2) Scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model;
    3) Selecting, from the new random predicates, a number of the new random predicates that are the highest with respect to their weighted information gain scores and adding them to the boosting
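The tuple format {X, x_i, t_i} and the existence and counts predicates recited in the claims can be illustrated with a short sketch. The names `Obs`, `exists`, `count_at_least`, and `both` are hypothetical, as are the feature names and the use of hours as the time unit. The claims say only that the counts predicate compares a count "relative to a numeric parameter C"; the `>=` comparison below is one assumed reading.

```python
import math
from typing import Callable, List, NamedTuple


class Obs(NamedTuple):
    name: str     # X: feature name
    value: float  # x_i: real value of the feature
    time: float   # t_i: time component (hours since admission, an assumed unit)


Predicate = Callable[[List[Obs]], int]  # binary function over a sequence of tuples


def exists(feature: str, start: float = -math.inf, end: float = math.inf) -> Predicate:
    """Existence predicate: 1 if the feature occurs in the window [start, end)."""
    def pred(seq: List[Obs]) -> int:
        return int(any(o.name == feature and start <= o.time < end for o in seq))
    return pred


def count_at_least(feature: str, c: int,
                   start: float = -math.inf, end: float = math.inf) -> Predicate:
    """Counts predicate: 1 if the feature occurs at least c times in the window.

    The '>=' comparison is an assumption; the claims do not fix the relation.
    """
    def pred(seq: List[Obs]) -> int:
        n = sum(1 for o in seq if o.name == feature and start <= o.time < end)
        return int(n >= c)
    return pred


def both(p: Predicate, q: Predicate) -> Predicate:
    """Logical AND of two predicates, one example of a logical operation."""
    return lambda seq: p(seq) & q(seq)


# Hypothetical record for one patient.
patient = [
    Obs("heart_rate", 110.0, 2.0),
    Obs("heart_rate", 95.0, 30.0),
    Obs("lactate", 3.1, 5.0),
]

has_lactate = exists("lactate")
two_hr_first_day = count_at_least("heart_rate", 2, start=0, end=24)
combined = both(has_lactate, count_at_least("heart_rate", 2))
```

Restricting a predicate to a window such as the first 24 hours corresponds to the time periods of claim 5. Each predicate prints as a readable name plus a window, which is what makes this family easy for an operator to review.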
Machine learning · CPC title
Knowledge-based neural networks; Logical representations of neural networks · CPC title
Methods or arrangements for processing data by operating upon the order or content of the data handled (logic circuits H03K19/00) · CPC title
Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence · CPC title
Learning methods · CPC title