System for universal hardware-neural network architecture search (co-design)
US-2022108054-A1 · Apr 7, 2022 · US
US12596908B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12596908-B2 |
| Application number | US-202017122428-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 15, 2020 |
| Priority date | Dec 15, 2020 |
| Publication date | Apr 7, 2026 |
| Grant date | Apr 7, 2026 |
Official abstract text for this publication.
A neural architecture search (NAS) with a weak predictor comprises: receiving network architecture scoring information; iteratively sampling a search space, wherein the sampling comprises: generating a set of candidate architectures within the search space; learning a first predictor; evaluating performance of the candidate architectures; and based on at least the performance of the set of candidate architectures and the network architecture scoring information, refining the search space to a smaller search space; based on at least the network architecture scoring information, thresholding the performance of candidate architectures to determine scored output candidate architectures; and reporting the scored output candidate architectures. In some examples, the candidate architectures each comprise a machine learning (ML) model, for example a neural network (NN). In some examples, searching continues to iterate until stopping criteria is met, such as a specified maximum number of iterations or a set of candidate architectures achieves a performance goal.
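The loop described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. Several pieces are assumptions made only for illustration: the three-integer architecture encoding, the synthetic `true_accuracy` function (a stand-in for training a candidate and measuring its validation accuracy), the k-nearest-neighbour predictor, and the modelling of the "network architecture scoring information" as a `keep_fraction` and a `report_threshold`. All function and parameter names are hypothetical.

```python
import itertools
import random


def true_accuracy(arch):
    """Synthetic stand-in (assumption) for training a candidate and validating it."""
    depth, width, kernel = arch
    return 1.0 - ((depth - 6) ** 2 + (width - 5) ** 2 + (kernel - 3) ** 2) / 100.0


def fit_predictor(evaluated, k=3):
    """Build a k-nearest-neighbour predictor from a {arch: accuracy} mapping."""
    items = list(evaluated.items())

    def predict(arch):
        # Average the accuracy of the k evaluated architectures closest to `arch`.
        nearest = sorted(
            items,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(arch, item[0])),
        )[:k]
        return sum(acc for _, acc in nearest) / len(nearest)

    return predict


def weak_predictor_nas(space, max_iterations=4, samples_per_iter=10,
                       keep_fraction=0.5, report_threshold=0.9,
                       goal=None, seed=0):
    """Iteratively sample, learn a predictor, and shrink the search space."""
    rng = random.Random(seed)
    space = list(space)
    evaluated = {}  # architectures whose performance has actually been measured

    for _ in range(max_iterations):
        # Sample candidates from the current space; most of it stays unsampled.
        unseen = [arch for arch in space if arch not in evaluated]
        for arch in rng.sample(unseen, min(samples_per_iter, len(unseen))):
            evaluated[arch] = true_accuracy(arch)

        # The predictor sees more samples each round, so early predictors are weak.
        predictor = fit_predictor(evaluated)

        # Refine: keep only the part of the space the predictor scores highest.
        ranked = sorted(space, key=predictor, reverse=True)
        space = ranked[:max(1, int(len(ranked) * keep_fraction))]

        # Stopping criterion: a sampled candidate already meets the goal.
        if goal is not None and max(evaluated.values()) >= goal:
            break

    # Threshold the measured performance to pick the candidates to report.
    reported = sorted(
        ((acc, arch) for arch, acc in evaluated.items() if acc >= report_threshold),
        reverse=True,
    )
    return reported, space


# Toy search space: depth 1-10, width 1-10, kernel size 1-5 (500 candidates).
space = list(itertools.product(range(1, 11), range(1, 11), range(1, 6)))
reported, final_space = weak_predictor_nas(space)
```

In this sketch, `reported` holds `(accuracy, architecture)` pairs that passed the threshold, best first, and `final_space` is the refined search space left after the last iteration. Swapping in a different predictor or refinement rule only requires changing `fit_predictor` or the ranking step.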
Opening claim text (preview).
What is claimed is:

1. A method of neural architecture search (NAS) using progressively stronger predictors to sample progressively smaller search spaces, the method comprising:
receiving network architecture scoring information;
sampling, during a first iteration, a first subset of candidate architectures within a first search space, wherein at least some candidate architectures within the first search space go unsampled during the first iteration;
learning a first predictor based on at least the first subset of candidate architectures sampled during the first iteration, the first predictor bi-furcating the first search space;
evaluating performance of the first subset of candidate architectures with the first predictor based on validation accuracy as a function of a number of samples used by the corresponding candidate architecture;
based on at least the performance of the first subset of candidate architectures and the network architecture scoring information, refining the first search space to a second search space smaller than the first search space, the second search space including fewer candidate architectures than the first search space;
sampling, during a second iteration, a second subset of candidate architectures within the second search space, wherein at least some candidate architectures within the second search space go unsampled during the second iteration, and wherein the second subset of candidate architectures includes at least one candidate architecture from the first subset of candidate architectures that has been trained based on training data from the second search space;
learning a second predictor based on at least the second subset of candidate architectures, the first predictor based on sparser samples than the second predictor such that the first predictor is weaker than the second predictor, wherein the relative strength of respective predictors is based on the relative sparsity of search space samplings used to generate the respective predictors such that weaker predictors are generated from sparser search space samplings than stronger predictors;
evaluating performance of the second subset of candidate architectures with the second predictor;
based on at least the performance of the second subset of candidate architectures and the network architecture scoring information, ranking the second subset of candidate architectures to determine ranked output candidate architectures; and
reporting the ranked output candidate architectures, wherein at least two of the ranked output candidate architectures are trained for image classification using training data and tested using testing data to select an optimal architecture for training a neural network model for image classification.

2. The method of claim 1, wherein refining the first search space comprises thresholding the performance of the first subset of candidate architectures.

3. The method of claim 1, wherein the candidate architectures each comprise a machine learning (ML) model.

4. A system for neural architecture search (NAS), the system comprising:
a processor; and
a non-transitory computer-readable medium storing instructions that are operative upon execution by the processor to:
receive network architecture scoring information;
sample, during a first iteration, a first subset of candidate architectures within a first search space, wherein at least some candidate architectures within the first search space go unsampled during the first iteration;
learn a first predictor based on at least the first subset of candidate architectures sampled during the first iteration, the first predictor bi-furcating the first search space;
evaluate performance of the first subset of candidate architectures with the first predictor based on validation accuracy as a function of a number of samples used by the corresponding candidate architecture;
based on at least the performance of the first subset of candidate architectures and the network architecture scoring information, refine the first search space to a second search space smaller than the first search space, the second search space including fewer candidate architectures than the first search space;
sample, during a second iteration, a second subset of candidate architectures within the second search space, wherein at least some candidate architectures within the second search space go unsampled during the second iteration, and wherein the second subset of candidate architectures includes at least one candidate architecture from the first subset of candidate architectures that has been trained based on training data from the second search space;
learn a second predictor based on at least the second subset of candidate architectures, the first predictor based on sparser samples than the second predictor such that the first predictor is weaker than the second predictor, wherein the relative strength of respective predictors is based on the relative sparsity of search space samplings used to generate the respective predictors such that weaker predictors are generated from sparser search space samplings than stronger predictors;
evaluate performance of the second subset of candidate architectures with the second predictor;
based on at least the performance of the second subset of candidate architectures and the network architecture scoring information, rank the second subset of candidate architectures to determine ranked output candidate architectures; and
report the ranked output candidate architectures, wherein at least two of the ranked output candidate architectures are trained for image classification using training data and tested using testing data to select an optimal architecture for training a neural network model for image classification.

5. The system of claim 4, wherein refining the first search space comprises thresholding the performance of the first subset of candidate architectures.

6. The system of claim 4, wherein the candidate architectures each comprise a machine learning (ML) model.

7. One or more computer storage devices having computer-executable instructions thereon for performing a neural architecture search (NAS) using progressively stronger predictors to sample progressively smaller search spaces, which, on execution by a computer, cause the computer to perform operations comprising:
receiving network architecture scoring information;
sampling, during a first iteration, a first subset of candidate architectures within a first search space, wherein at least some candidate architectures within the first search space go unsampled during the first iteration;
learning a first predictor based on at least the first subset of candidate architectures sampled during the first iteration, the first predictor bi-furcating the first search space;
evaluating performance of the first subset of candidate architectures with the first predictor based on validation accuracy as a function of a number of samples used by the corresponding candidate architecture;
based on at least the performance of the first subset of candidate architectures and the network architecture scoring information, refining the first search space to a second search space smaller than the first search space, the second search space including fewer candidate architectures than the first search space;
sampling, during a second iteration, a second subset of candidate architectures within the second search space, wherein at least some candidate architectures within the second search space go unsampled during the second iteration, and wherein the second subset of candidate architectures includes at least one candidate architecture from the first subset of candidate architectures that has been trained based on training data from the second s
Validation; Performance evaluation; Active pattern learning techniques · CPC title
Learning methods · CPC title
for performance assessment · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title