Model building for simulation of one or more target features
US-10402726-B1 · Sep 3, 2019 · US
US10963790B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10963790-B2 |
| Application number | US-201715582496-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 28, 2017 |
| Priority date | Apr 28, 2017 |
| Publication date | Mar 30, 2021 |
| Grant date | Mar 30, 2021 |
Official abstract text for this publication.
A method includes receiving input that identifies one or more data sources and determining, based on the input, a machine learning problem type of a plurality of machine learning problem types supported by an automated model building (AMB) engine. The method also includes generating an input data set of the AMB engine based on application of one or more rules to the one or more data sources. The method further includes, based on the input data set and the machine learning problem type, initiating execution of the AMB engine to generate a neural network configured to model at least a portion of the input data set.
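The rule-based step in the abstract can be illustrated with a minimal sketch of the four column-drop conditions recited in claim 1. This is not the patented implementation: the threshold values, the function names (`is_missing`, `drop_reason`, `apply_drop_rules`), and the treatment of string columns as "categorical" are all assumptions made for illustration, since the claims name thresholds without fixing them. The sketch also assumes the uniqueness rule applies only to non-numeric columns, so that continuous measurements are not discarded.

```python
import math
from statistics import pstdev

# Hypothetical thresholds: the claims refer to thresholds without giving values.
UNIQUE_ROW_FRACTION = 0.95  # "first threshold percentage of rows"
MISSING_FRACTION = 0.50     # "second threshold percentage" of missing values
MAX_CATEGORIES = 50         # "threshold number of unique values"


def is_missing(value):
    """Treat None, empty strings, and NaN as missing or corrupted."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def drop_reason(values):
    """Return why a column would be dropped, or None if it is kept."""
    n = len(values)
    present = [v for v in values if not is_missing(v)]
    if n == 0 or (n - len(present)) / n >= MISSING_FRACTION:
        return "too many missing values"
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if numeric:
        # A constant numeric column carries no signal for the model.
        return "zero standard deviation" if pstdev(present) == 0 else None
    unique = set(present)
    # Assumption: the remaining rules are applied to non-numeric columns only.
    if len(unique) / n >= UNIQUE_ROW_FRACTION:
        return "nearly every row is unique"
    if len(unique) > MAX_CATEGORIES:
        return "too many categories"
    return None


def apply_drop_rules(table):
    """Split a {column name: values} table into kept columns and drop reasons."""
    kept, dropped = {}, {}
    for name, values in table.items():
        reason = drop_reason(values)
        if reason is None:
            kept[name] = values
        else:
            dropped[name] = reason
    return kept, dropped


table = {
    "id": ["a1", "a2", "a3", "a4"],
    "const": [7, 7, 7, 7],
    "sparse": [1.0, None, None, None],
    "temp": [20.5, 21.0, 19.8, 22.1],
    "color": ["red", "blue", "red", "blue"],
}
kept, dropped = apply_drop_rules(table)
print(sorted(kept), dropped)
```

In this toy table the identifier-like, constant, and mostly empty columns are filtered out before any model is built, which is the role the abstract assigns to the rules.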
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving, at a processor of a computing device, input that identifies one or more data sources; determining, based on the input, a machine learning problem type of a plurality of machine learning problem types supported by an automated model building (AMB) engine; generating an input data set of the AMB engine based on application of one or more rules to the one or more data sources, wherein the one or more rules indicate that a column is to be dropped responsive to determining that: the column has zero standard deviation; the column includes a unique value in at least a first threshold percentage of rows; the column has at least a second threshold percentage of missing or corrupted values; the column represents categorical data and includes more than a threshold number of unique values; or any combination thereof; and based on the input data set and the machine learning problem type, initiating execution of the AMB engine to generate a neural network configured to model at least a portion of the input data set.
2. The method of claim 1, wherein determining the machine learning problem type comprises at least one of: determining a classification problem type responsive to receiving second input to predict a categorical column; determining a regression problem type responsive to receiving third input to predict a numerical column; or determining a reinforcement learning problem type responsive to receiving fourth input indicating at least one of a state data structure, an action data structure, a reward function, or an interaction function.
3. A computer system comprising: an automated model building (AMB) pre-processor configured to: receive input that identifies one or more data sources; determine, based on the input, a machine learning problem type of a plurality of machine learning problem types supported by an AMB engine; generate an input data set of the AMB engine based on application of one or more rules to the one or more data sources, wherein the one or more rules indicate that a column is to be dropped responsive to determining that: the column has zero standard deviation; the column includes a unique value in at least a first threshold percentage of rows; the column has at least a second threshold percentage of missing or corrupted values; the column represents categorical data and includes more than a threshold number of unique values; or any combination thereof; and based on the input data set and the machine learning problem type, initiate execution of the AMB engine to generate a neural network configured to model at least a portion of the input data set.
4. The computer system of claim 3, wherein the AMB pre-processor comprises a data source analyzer configured to determine a combined data source based on the one or more data sources.
5. The computer system of claim 4, wherein the data source analyzer is configured to determine whether a particular data source includes column headers.
6. The computer system of claim 3, wherein the AMB pre-processor comprises a data profiler configured to determine at least one of a data profile, an input profile, or a target profile.
7. The computer system of claim 6, wherein the data profiler is further configured to perform at least one of a data cleaning operation or a data scaling operation.
8. The computer system of claim 7, wherein performing the data cleaning operation includes performing an imputation operation to determine at least one missing data value of at least one data source.
9. The computer system of claim 3, wherein the one or more rules indicate that a column including fewer than a threshold number of unique values corresponds to categorical data.
10. The computer system of claim 3, wherein the AMB pre-processor is further configured to determine an error function based on the machine learning problem type, determine data sampling criteria used to generate the input data set of the AMB engine, or both.
11. The computer system of claim 3, wherein the AMB engine comprises a first device configured to execute a genetic algorithm.
12. The computer system of claim 11, wherein the AMB engine comprises a second device configured to execute an optimizer.
13. The computer system of claim 3, further comprising an output interface configured to send one or more graphical user interfaces (GUIs) to a display device.
14. The computer system of claim 13, wherein the one or more GUIs are configured to receive the input identifying the one or more data sources, second input indicating the machine learning problem type, third input indicating a training time threshold, or any combination thereof.
15. The computer system of claim 13, wherein the one or more GUIs are configured to receive fourth input identifying a target column to be predicted by the neural network.
16. The computer system of claim 13, wherein the one or more GUIs are configured to receive fifth input identifying a failure prediction lead time, sixth input indicating at least one failure, or both.
17. The computer system of claim 13, wherein the one or more GUIs are configured to receive seventh input indicating a reinforcement learning data structure, eighth input indicating a reinforcement learning reward function, or both.
18. A computer-readable storage device storing instructions that, when executed, cause a computer to perform operations comprising: receiving input that identifies one or more data sources; determining, based on the input, a machine learning problem type of a plurality of machine learning problem types supported by an automated model building (AMB) engine; generating an input data set of the AMB engine based on application of one or more rules to the one or more data sources, wherein the one or more rules indicate that a column is to be dropped responsive to determining that: the column has zero standard deviation; the column includes a unique value in at least a first threshold percentage of rows; the column has at least a second threshold percentage of missing or corrupted values; the column represents categorical data and includes more than a threshold number of unique values; or any combination thereof; and based on the input data set and the machine learning problem type, initiating execution of the AMB engine to generate a neural network configured to model at least a portion of the input data set.
19. The computer-readable storage device of claim 18, wherein: the operations include replacing a categorical column with a plurality of input columns in accordance with a one-hot encoding scheme; a classification output of the neural network is based on a softmax of a plurality of output nodes of the neural network; or both.
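Claim 19 names two standard techniques, one-hot encoding of a categorical input column and a softmax over the network's output nodes. The sketch below shows what each does in isolation, under stated assumptions: the helper names (`one_hot_encode`, `softmax`, `classify`) and the `is_<category>` column naming are hypothetical, and the raw output values are made-up numbers rather than results from any network described in the patent.

```python
import math


def one_hot_encode(values):
    """Replace one categorical column with one 0/1 column per category."""
    categories = sorted(set(values))
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


def softmax(logits):
    """Map raw output-node values to probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels):
    """Pick the label whose output node has the highest softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


# One categorical column becomes several binary input columns.
encoded = one_hot_encode(["red", "blue", "red"])
print(encoded)

# Illustrative raw outputs for three output nodes (not real model outputs).
label, probs = classify([2.0, 1.0, 0.1], ["a", "b", "c"])
print(label, probs)
```

One-hot encoding lets a network consume categories without implying an order among them, and softmax makes the output nodes directly comparable as class probabilities.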
Recurrent networks, e.g. Hopfield networks · CPC title
Reinforcement learning · CPC title
Supervised learning · CPC title
Feedforward networks · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title