Weak neural architecture search (NAS) predictor

US12596908B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12596908-B2
Application number  US-202017122428-A
Country             US
Kind code           B2
Filing date         Dec 15, 2020
Priority date       Dec 15, 2020
Publication date    Apr 7, 2026
Grant date          Apr 7, 2026

Abstract

Official abstract text for this publication.

A neural architecture search (NAS) with a weak predictor comprises: receiving network architecture scoring information; iteratively sampling a search space, wherein the sampling comprises: generating a set of candidate architectures within the search space; learning a first predictor; evaluating performance of the candidate architectures; and based on at least the performance of the set of candidate architectures and the network architecture scoring information, refining the search space to a smaller search space; based on at least the network architecture scoring information, thresholding the performance of candidate architectures to determine scored output candidate architectures; and reporting the scored output candidate architectures. In some examples, the candidate architectures each comprise a machine learning (ML) model, for example a neural network (NN). In some examples, searching continues to iterate until stopping criteria is met, such as a specified maximum number of iterations or a set of candidate architectures achieves a performance goal.
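
The iterative loop in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the toy search space, the synthetic `evaluate` function (a stand-in for training a candidate and measuring its validation accuracy), the additive per-position predictor, and the parameters `samples_per_iter`, `keep_fraction`, `goal`, and `report_threshold` are all hypothetical choices made for this example.

```python
import itertools
import random


def evaluate(arch):
    """Hypothetical stand-in for training a candidate and measuring validation accuracy."""
    # Synthetic score in [0, 1]: rewards larger op indices, plus one interaction term.
    base = sum(arch) / (3 * len(arch))
    bonus = 0.1 if arch[0] == arch[-1] else 0.0
    return min(1.0, 0.9 * base + bonus)


def fit_predictor(evaluated):
    """Fit a deliberately weak additive predictor: mean observed score per (position, op)."""
    sums, counts = {}, {}
    for arch, score in evaluated.items():
        for key in enumerate(arch):  # key is a (position, op) pair
            sums[key] = sums.get(key, 0.0) + score
            counts[key] = counts.get(key, 0) + 1
    overall = sum(evaluated.values()) / len(evaluated)
    table = {key: sums[key] / counts[key] for key in sums}

    def predict(arch):
        # Unseen (position, op) pairs fall back to the overall mean score.
        return sum(table.get(key, overall) for key in enumerate(arch)) / len(arch)

    return predict


def weak_predictor_nas(space, samples_per_iter=8, keep_fraction=0.5,
                       max_iters=5, goal=0.95, report_threshold=0.8, seed=0):
    """Assumed loop: sample sparsely, fit a predictor, shrink the space, repeat."""
    rng = random.Random(seed)
    evaluated = {}          # architecture -> measured score
    sizes = [len(space)]    # search-space size before and after each refinement
    for _ in range(max_iters):
        # Sample a sparse subset; most of the current space goes unsampled.
        unseen = [arch for arch in space if arch not in evaluated]
        for arch in rng.sample(unseen, min(samples_per_iter, len(unseen))):
            evaluated[arch] = evaluate(arch)
        # Learn a predictor from every candidate evaluated so far.
        predict = fit_predictor(evaluated)
        # Refine: keep only the part of the space the predictor ranks highest.
        ranked = sorted(space, key=predict, reverse=True)
        space = ranked[:max(1, int(len(ranked) * keep_fraction))]
        sizes.append(len(space))
        # Stopping criteria: performance goal reached or nothing left to refine.
        if max(evaluated.values()) >= goal or len(space) == 1:
            break
    # Threshold measured performance to select the candidates to report.
    results = sorted(((score, arch) for arch, score in evaluated.items()
                      if score >= report_threshold), reverse=True)
    return results, sizes


# Toy search space: 4 positions with 4 possible ops each (256 architectures).
space = list(itertools.product(range(4), repeat=4))
results, sizes = weak_predictor_nas(space)
print("search-space sizes per iteration:", sizes)
print("top reported candidates:", results[:3])
```

Because each predictor is fit on all candidates evaluated so far, later predictors see denser samples than earlier ones, which mirrors the weak-to-stronger progression described in the claims. The maximum iteration count and the performance goal correspond to the two example stopping criteria named in the abstract.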

First claim

Opening claim text (preview).

What is claimed is:

1. A method of neural architecture search (NAS) using progressively stronger predictors to sample progressively smaller search spaces, the method comprising: receiving network architecture scoring information; sampling, during a first iteration, a first subset of candidate architectures within a first search space, wherein at least some candidate architectures within the first search space go unsampled during the first iteration; learning a first predictor based on at least the first subset of candidate architectures sampled during the first iteration, the first predictor bi-furcating the first search space; evaluating performance of the first subset of candidate architectures with the first predictor based on validation accuracy as a function of a number of samples used by the corresponding candidate architecture; based on at least the performance of the first subset of candidate architectures and the network architecture scoring information, refining the first search space to a second search space smaller than the first search space, the second search space including fewer candidate architectures than the first search space; sampling, during a second iteration, a second subset of candidate architectures within the second search space, wherein at least some candidate architectures within the second search space go unsampled during the second iteration, and wherein the second subset of candidate architectures includes at least one candidate architecture from the first subset of candidate architectures that has been trained based on training data from the second search space; learning a second predictor based on at least the second subset of candidate architectures, the first predictor based on sparser samples than the second predictor such that the first predictor is weaker than the second predictor, wherein the relative strength of respective predictors is based on the relative sparsity of search space samplings used to generate the respective predictors such that weaker predictors are generated from sparser search space samplings than stronger predictors; evaluating performance of the second subset of candidate architectures with the second predictor; based on at least the performance of the second subset of candidate architectures and the network architecture scoring information, ranking the second subset of candidate architectures to determine ranked output candidate architectures; and reporting the ranked output candidate architectures, wherein at least two of the ranked output candidate architectures are trained for image classification using training data and tested using testing data to select an optimal architecture for training a neural network model for image classification.

2. The method of claim 1, wherein refining the first search space comprises thresholding the performance of the first subset of candidate architectures.

3. The method of claim 1, wherein the candidate architectures each comprise a machine learning (ML) model.

4. A system for neural architecture search (NAS), the system comprising: a processor; and a non-transitory computer-readable medium storing instructions that are operative upon execution by the processor to: receive network architecture scoring information; sample, during a first iteration, a first subset of candidate architectures within a first search space, wherein at least some candidate architectures within the first search space go unsampled during the first iteration; learn a first predictor based on at least the first subset of candidate architectures sampled during the first iteration, the first predictor bi-furcating the first search space; evaluate performance of the first subset of candidate architectures with the first predictor based on validation accuracy as a function of a number of samples used by the corresponding candidate architecture; based on at least the performance of the first subset of candidate architectures and the network architecture scoring information, refine the first search space to a second search space smaller than the first search space, the second search space including fewer candidate architectures than the first search space; sample, during a second iteration, a second subset of candidate architectures within the second search space, wherein at least some candidate architectures within the second search space go unsampled during the second iteration, and wherein the second subset of candidate architectures includes at least one candidate architecture from the first subset of candidate architectures that has been trained based on training data from the second search space; learn a second predictor based on at least the second subset of candidate architectures, the first predictor based on sparser samples than the second predictor such that the first predictor is weaker than the second predictor, wherein the relative strength of respective predictors is based on the relative sparsity of search space samplings used to generate the respective predictors such that weaker predictors are generated from sparser search space samplings than stronger predictors; evaluate performance of the second subset of candidate architectures with the second predictor; based on at least the performance of the second subset of candidate architectures and the network architecture scoring information, rank the second subset of candidate architectures to determine ranked output candidate architectures; and report the ranked output candidate architectures, wherein at least two of the ranked output candidate architectures are trained for image classification using training data and tested using testing data to select an optimal architecture for training a neural network model for image classification.

5. The system of claim 4, wherein refining the first search space comprises thresholding the performance of the first subset of candidate architectures.

6. The system of claim 4, wherein the candidate architectures each comprise a machine learning (ML) model.

7. One or more computer storage devices having computer-executable instructions thereon for performing a neural architecture search (NAS) using progressively stronger predictors to sample progressively smaller search spaces, which, on execution by a computer, cause the computer to perform operations comprising: receiving network architecture scoring information; sampling, during a first iteration, a first subset of candidate architectures within a first search space, wherein at least some candidate architectures within the first search space go unsampled during the first iteration; learning a first predictor based on at least the first subset of candidate architectures sampled during the first iteration, the first predictor bi-furcating the first search space; evaluating performance of the first subset of candidate architectures with the first predictor based on validation accuracy as a function of a number of samples used by the corresponding candidate architecture; based on at least the performance of the first subset of candidate architectures and the network architecture scoring information, refining the first search space to a second search space smaller than the first search space, the second search space including fewer candidate architectures than the first search space; sampling, during a second iteration, a second subset of candidate architectures within the second search space, wherein at least some candidate architectures within the second search space go unsampled during the second iteration, and wherein the second subset of candidate architectures includes at least one candidate architecture from the first subset of candidate architectures that has been trained based on training data from the second s

Assignees

Inventors

Classifications

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • for performance assessment · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12596908B2 cover?
A neural architecture search (NAS) with a weak predictor comprises: receiving network architecture scoring information; iteratively sampling a search space, wherein the sampling comprises: generating a set of candidate architectures within the search space; learning a first predictor; evaluating performance of the candidate architectures; and based on at least the performance of the set of cand…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 7, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).