Systems and techniques for determining the predictive value of a feature
US-2018060738-A1 · Mar 1, 2018 · US
US10558924B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10558924-B2 |
| Application number | US-201715790756-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 23, 2017 |
| Priority date | May 23, 2014 |
| Publication date | Feb 11, 2020 |
| Grant date | Feb 11, 2020 |
Official abstract text for this publication.
A predictive modeling method may include obtaining a fitted, first-order predictive model configured to predict values of output variables based on values of first input variables; and performing a second-order modeling procedure on the fitted, first-order model, which may include: generating input data including observations including observed values of second input variables and predicted values of the output variables; generating training data and testing data from the input data; generating a fitted second-order model of the fitted first-order model by fitting a second-order model to the training data; and testing the fitted, second-order model of the first-order model on the testing data. Each observation of the input data may be generated by (1) obtaining observed values of the second input variables, and (2) applying the first-order predictive model to corresponding observed values of the first input variables to generate the predicted values of the output variables.
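The procedure in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: `first_order_model` is a hypothetical stand-in for a fitted first-order model (here a blend of two toy linear models), the second input variable is assumed to be identical to the first input variable, the second-order model is assumed to be a simple least-squares line, and the 150/50 split is arbitrary.

```python
import random


def first_order_model(x1):
    """Hypothetical fitted first-order model: a blend of two toy linear models."""
    model_a = 2.0 * x1 + 1.0
    model_b = 2.2 * x1 + 0.6
    return 0.5 * (model_a + model_b)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (assumed second-order model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def build_second_order_data(observations):
    """Pair each observed second input value with the first-order model's prediction."""
    return [(obs["x2"], first_order_model(obs["x1"])) for obs in observations]


# Synthetic observations; assumption: the second input equals the first input.
rng = random.Random(0)
observations = []
for _ in range(200):
    x1 = rng.uniform(0.0, 10.0)
    observations.append({"x1": x1, "x2": x1})

# Generate second-order input data, then split it into training and testing data.
second_order_data = build_second_order_data(observations)
train, test = second_order_data[:150], second_order_data[150:]

# Fit the second-order model to the first-order model's predictions.
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Test the fitted second-order model against the first-order predictions it never saw.
test_mse = sum((slope * x + intercept - y) ** 2 for x, y in test) / len(test)
```

The point of the sketch is that the second-order model is trained on the first-order model's predictions rather than on the original observed outputs, so the test error measures how faithfully the second model reproduces the first.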
Opening claim text (preview).
What is claimed is:
1. A predictive modeling method comprising: obtaining a fitted, first-order predictive model, wherein the first-order predictive model is configured to predict values of one or more output variables of a prediction problem based on values of one or more first input variables; creating a fitted second-order predictive model that is more computationally efficient than the fitted first-order predictive model, wherein creating the fitted second-order predictive model comprises performing a second-order predictive modeling procedure on the fitted, first-order model, wherein the second-order modeling procedure is associated with a second-order predictive model, and wherein performing the second-order predictive modeling procedure on the fitted, first-order model includes: generating second-order input data including a plurality of second-order observations, wherein each second-order observation includes respective observed values of one or more second input variables and predicted values of the output variables, and wherein generating the second-order input data comprises, for each second-order observation: obtaining the respective observed values of the second input variables and corresponding observed values of the first input variables, and applying the first-order predictive model to the corresponding observed values of the first input variables to generate the respective predicted values of the output variables, generating, from the second-order input data, second-order training data and second-order testing data, generating the fitted second-order predictive model of the fitted first-order model by fitting the second-order predictive model to the second-order training data, and testing the fitted, second-order predictive model of the fitted first-order model on the second-order testing data; determining that the fitted second-order model is more computationally efficient than the fitted first-order model based on a measurement of a computational resource utilization of the fitted second-order model being less than a measurement of the computational resource utilization of the fitted first-order model; and deploying the more computationally efficient fitted second-order model rather than the less computationally efficient fitted first-order model, wherein deploying the fitted second-order model comprises generating a plurality of predictions by applying the fitted second-order model to other data representing instances of the prediction problem, wherein the second-order input data do not include the other data.
2. The method of claim 1, wherein obtaining the fitted, first-order model comprises blending two fitted predictive models.
3. The method of claim 1, wherein the second-order predictive model is a RuleFit model, a generalized additive model, or a blend thereof.
4. The method of claim 1, further comprising performing cross-validation of the second-order model, wherein the second-order input data comprise at least one data set, wherein generating the second-order training data comprises obtaining a first subset of the data set, and wherein generating the second-order testing data comprises obtaining a second subset of the data set.
5. The method of claim 4, wherein the second-order training data are first second-order training data, wherein the second-order testing data are first second-order testing data, wherein the fitted second-order model is a first fitted second-order model, and wherein performing the cross-validation of the second-order model comprises: (a) generating second second-order training data and second second-order testing data from the second-order input data, wherein the second second-order training data include a third subset of the data set, and wherein the second second-order testing data include a fourth subset of the data set; (b) fitting the second-order predictive model to the second second-order training data to obtain a second fitted second-order predictive model; and (c) testing the second fitted second-order predictive model on the second second-order testing data.
6. The method of claim 5, further comprising partitioning the data set into a plurality of partitions including at least a first partition and a second partition.
7. The method of claim 6, wherein partitioning the data set into a plurality of partitions comprises randomly assigning each observation in the data set to a respective partition.
8. The method of claim 7, wherein: the first second-order training data comprise the first partition of the data set; the first second-order testing data comprise all of the partitions of the data set except the first partition; the second second-order training data comprise the second partition of the data set; and the second second-order testing data comprise all of the partitions of the data set except the second partition.
9. The method of claim 7, wherein: the first second-order training data comprise a subset of the first partition of the data set; the first second-order testing data comprise respective subsets of all of the partitions of the data set except the first partition; the second second-order training data comprise a subset of the second partition of the data set; and the second second-order testing data comprise respective subsets of all the partitions of the data set except the second partition.
10. The method of claim 5, wherein: the second-order input data comprise a first partition and a second partition, the data set comprises the first partition of the second-order input data, and the method further comprises testing the first and second fitted second-order models on holdout data comprising the second partition of the second-order input data.
11. The method of claim 10, wherein no predictive model is fitted to the holdout data.
12. The method of claim 1, wherein performing the second-order predictive modeling procedure further includes performing nested cross-validation of the second-order predictive model.
13. The method of claim 12, wherein: the second-order input data comprise at least one data set; performing the nested cross-validation of the second-order predictive model comprises: partitioning the data set into a first plurality of partitions of the data set including at least a first partition of the data set and a second partition of the data set, and partitioning the first partition of the data set into a plurality of partitions of the first partition of the data set including at least a first partition of the first partition of the data set and a second partition of the first partition of the data set; the second-order training data comprise the first partition of the first partition of the data set; and the second-order testing data comprise all of the partitions of the first partition of the data set except the first partition of the first partition of the data set.
14. The method of claim 13, wherein the second-order training data are first second-order training data, the second-order testing data are first second-order testing data, the fitted second-order model is a first fitted second-order model, and performing the nested cross-validation of the second-order predictive model further comprises: (a) generating, from the first partition of the data set, second second-order training data and second second-order testing data, wherein the second second-order training data comprise the second partition of the first partition of the data set, and wherein the second second-order testing data comprise a plurality of the partitions of the first partition of the data set other than the second partition of the fir
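The partitioning scheme recited in claims 6 through 8 can be illustrated with a short Python sketch. It follows the wording of the claims as previewed here: each observation is randomly assigned to a partition, and each fold uses one partition as training data and all remaining partitions as testing data (the reverse of the conventional k-fold arrangement, which trains on the remaining partitions). The function names `partition` and `folds`, the choice of four partitions, and the integer placeholder observations are assumptions made for illustration only.

```python
import random


def partition(data, k, seed=0):
    """Randomly assign each observation to one of k partitions (hypothetical helper)."""
    rng = random.Random(seed)
    parts = [[] for _ in range(k)]
    for obs in data:
        parts[rng.randrange(k)].append(obs)
    return parts


def folds(parts):
    """Yield (training, testing) pairs: fold i trains on partition i, tests on the rest."""
    for i, part in enumerate(parts):
        testing = [obs for j, other in enumerate(parts) if j != i for obs in other]
        yield part, testing


# Placeholder second-order observations; real ones would pair observed
# second-input values with first-order model predictions.
data = list(range(20))
parts = partition(data, k=4)
fold_list = list(folds(parts))
```

Each `(training, testing)` pair in `fold_list` would then be passed to a fit-and-test step for the second-order model, one fitted model per fold.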
Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem" (market predictions or forecasting for commercial activities G06Q30/0202) · CPC title
Inference or reasoning models · CPC title
Knowledge representation; Symbolic representation · CPC title
Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling · CPC title
the resources being hardware resources other than CPUs, Servers and Terminals · CPC title