Automated machine learning using nearest neighbor recommender systems
US-2022044078-A1 · Feb 10, 2022 · US
US12555029B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12555029-B2 |
| Application number | US-202117237379-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 22, 2021 |
| Priority date | Apr 22, 2021 |
| Publication date | Feb 17, 2026 |
| Grant date | Feb 17, 2026 |
Official abstract text for this publication.
In a method for ranking machine learning (ML) pipelines for a dataset, a processor receives first performance curves predicted by a meta learner model for a plurality of ML pipelines. A processor allocates a first subset of data points from the dataset to each of the plurality of ML pipelines. A processor receives first performance scores for each of the ML pipelines for the first subset of data points. A processor updates the meta learner model using the first performance scores. A processor receives second performance curves from the meta learner model updated with the first performance scores. A processor ranks the plurality of ML pipelines based on the second performance curves.
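The loop in the abstract (predict curves, score a small allocation, update the meta learner, re-predict, rank) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patented implementation: `ToyMetaLearner`, the curve shape `a - b / sqrt(n)`, the learning rate, and the sample scores are all hypothetical, and a single gradient step on squared error stands in for the backpropagation update.

```python
import math


class ToyMetaLearner:
    """Hypothetical stand-in for the meta learner.

    Each pipeline gets one assumed curve: score(n) = a - b / sqrt(n),
    where n is the number of data points allocated to the pipeline.
    """

    def __init__(self, params):
        self.params = {name: list(ab) for name, ab in params.items()}

    def predict_curve(self, name, sizes):
        a, b = self.params[name]
        return [a - b / math.sqrt(n) for n in sizes]

    def update(self, name, n, observed, lr=0.5):
        # One gradient step on 0.5 * (predicted - observed)^2 at allocation size n.
        a, b = self.params[name]
        err = (a - b / math.sqrt(n)) - observed
        grad_a = err
        grad_b = err * (-1.0 / math.sqrt(n))
        self.params[name] = [a - lr * grad_a, b - lr * grad_b]


def rank_pipelines(learner, pipelines, sizes):
    """Rank pipeline names by predicted score at the largest allocation size."""
    final = {p: learner.predict_curve(p, sizes)[-1] for p in pipelines}
    return sorted(pipelines, key=lambda p: final[p], reverse=True)


# Illustrative values only.
sizes = [100, 400, 1600]
pipelines = ["A", "B"]
learner = ToyMetaLearner({"A": (0.90, 1.0), "B": (0.85, 0.5)})

first_ranking = rank_pipelines(learner, pipelines, sizes)  # from first curves

# Observed scores after running each pipeline on a first subset of 100 points.
first_scores = {"A": 0.70, "B": 0.86}
for name, score in first_scores.items():
    learner.update(name, 100, score)

second_ranking = rank_pipelines(learner, pipelines, sizes)  # from second curves
```

In this toy run the observed scores contradict the initial predictions, so the ranking taken from the second curves differs from the ranking taken from the first curves.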
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for ranking machine learning (ML) pipelines for a dataset, the method comprising: receiving, by one or more processors, first performance curves made up of performance values predicted for dependent variables of data allocation size, wherein the data allocation size comprises a number of data points fed to each of a plurality of machine learning (ML) pipelines by a meta learner model, wherein the first performance curves are the predicted performance values with respect to the dependent variables for each of the ML pipelines; allocating a first subset of data points from the dataset to each of the plurality of ML pipelines selected based on changing points where each changing point is a point where each of the first performance curves cross another of the first performance curves; receiving first performance scores for each of the ML pipelines for the first subset of data points; generating, by one or more processors, a new meta learner model by backpropagating a difference in the performance value between the first performance curves at a particular dependent variable value and the first performance scores for the first subset of data points at the particular dependent variable value; receiving second performance curves from the new meta learner model; and ranking the plurality of ML pipelines based on the second performance curves.
2. The method of claim 1, comprising training the meta learner model using meta features of a training dataset.
3. The method of claim 1, comprising allocating a second subset of data points from the dataset to each of the plurality of ML pipelines.
4. The method of claim 3, wherein the second subset of data points are selected based on the changing points.
5. The method of claim 1, wherein the meta learner model is updated via backpropagation of a difference between first performance curves and the first performance scores at each point in the first subset of data points.
6. The method of claim 1, wherein the first performance curves comprise scores for a selection from the group consisting of accuracy, error, recall, memory consumption, CPU usage, and running time, at a range of data points.
7. A computer program product for ranking machine learning (ML) pipelines for a dataset, comprising: one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions comprising: program instructions to receive first performance curves made up of performance values predicted for dependent variables of data allocation size, wherein the data allocation size comprises a number of data points fed to each of a plurality of machine learning (ML) pipelines by a meta learner model, wherein the first performance curves are the predicted performance values with respect to the dependent variables for each of the ML pipelines; program instructions to allocate a first subset of data points from a dataset to each of the plurality of ML pipelines selected based on changing points where the each changing point is a point where each of the first performance curves cross another of the first performance curves; program instructions to receive first performance scores for each of the ML pipelines for the first subset of data points; program instructions to generate a new meta learner model by backpropagating a difference in the performance value between the first performance curves at a particular dependent variable value and the first performance scores for the data first subset of data points at the particular dependent variable value; program instructions to receive second performance curves from the new meta learner model; and program instructions to rank the plurality of ML pipelines based on the second performance curves.
8. The computer program product of claim 7, comprising program instructions to train the meta learner model using meta features of a training dataset.
9. The computer program product of claim 7, comprising program instructions to allocate a second subset of data points from the dataset to each of the plurality of ML pipelines.
10. The computer program product of claim 9, wherein the second subset of data points are selected based on changing points where the first performance curves cross.
11. The computer program product of claim 7, wherein the meta learner model is updated via backpropagation of a difference between first performance curves and the first performance scores at each point in the first subset of data points.
12. The computer program product of claim 7, wherein the first performance curves comprise scores for a selection from the group consisting of accuracy, error, recall, memory consumption, CPU usage, and running time.
13. A computer system comprising: one or more computer processors, one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the program instructions comprising: program instructions to receive first performance curves made up of performance values predicted for dependent variables of data allocation size, wherein the data allocation size comprises a number of data points fed to each of a plurality of machine learning (ML) pipelines by a meta learner model, wherein the first performance curves are the predicted performance values with respect to the dependent variables for each of the ML pipelines; program instructions to allocate a first subset of data points from a dataset to each of the plurality of ML pipelines selected based on changing points where each changing point is a point where each of the first performance curves cross another of the first performance curves; program instructions to receive first performance scores for each of the ML pipelines for the first subset of data points; program instructions to generate a new meta learner model by backpropagating a difference in the performance value between the first performance curves at a particular dependent variable value and the first performance scores for the first subset of data points at the particular dependent variable value; program instructions to receive second performance curves from the new meta learner model; and program instructions to rank the plurality of ML pipelines based on the second performance curves.
14. The system of claim 13, comprising program instructions to train the meta learner model using meta features of a training dataset.
15. The system of claim 13, comprising program instructions to allocate a second subset of data points from the dataset to each of the plurality of ML pipelines.
16. The system of claim 15, wherein the second subset of data points are selected based on the changing points where the first performance curves cross.
17. The system of claim 13, wherein the meta learner model is updated via backpropagation of a difference between first performance curves and the first performance scores at each point in the first subset of data points.
18. The system of claim 13, wherein the first performance curves comprise scores for performance values selected from the group consisting of accuracy, error, recall, memory consumption, CPU usage, and running time.
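The claims select allocation sizes at "changing points", where one predicted performance curve crosses another. A minimal sketch of how such points might be located on sampled curves follows. The function name `changing_points`, the sampled representation of the curves, and the choice to report a crossing at the later of the two neighboring sizes are assumptions for illustration; the claims do not specify how crossings are detected.

```python
from itertools import combinations


def changing_points(curves, sizes):
    """Return allocation sizes at which the ordering of any two curves flips.

    curves: dict mapping pipeline name -> list of predicted values,
            aligned index by index with sizes.
    A crossing between sizes[i - 1] and sizes[i] is reported as sizes[i].
    Exact ties (a difference of zero) are not treated as crossings here.
    """
    points = set()
    for p, q in combinations(curves, 2):
        diffs = [a - b for a, b in zip(curves[p], curves[q])]
        for i in range(1, len(diffs)):
            if diffs[i - 1] * diffs[i] < 0:  # sign change: the curves crossed
                points.add(sizes[i])
    return sorted(points)


# Illustrative values only: A overtakes B between 200 and 400 data points.
sizes = [100, 200, 400, 800]
curves = {
    "A": [0.60, 0.70, 0.80, 0.85],
    "B": [0.70, 0.72, 0.74, 0.75],
    "C": [0.50, 0.55, 0.60, 0.65],
}
crossings = changing_points(curves, sizes)
```

Evaluating pipelines near a crossing is informative because that is where a small error in the predicted curves is most likely to change the ranking.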
Performance evaluation by modeling · CPC title
where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title
Inference or reasoning models · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Machine learning · CPC title