What technology area does this patent fall under?

Primary CPC classification G06N3/0985. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 19, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus for searching for neural network ensemble model, and electronic device

US2024311651A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2024311651-A1
Application number	US-202418668637-A
Country	US
Kind code	A1
Filing date	May 20, 2024
Priority date	Nov 22, 2021
Publication date	Sep 19, 2024
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a method for searching for a neural network architecture ensemble model. The method includes: obtaining a dataset, where the dataset includes a sample and an annotation in a classification task; performing search by using a distributional neural network architecture search algorithm, including: determining a hyperparameter of a neural network architecture distribution; sampling a valid neural network architecture from the architecture distribution defined by the hyperparameter; training and evaluating the neural network architecture on the dataset, to obtain a performance indicator; determining, based on the performance indicator, neural network architecture distributions that share the hyperparameter, to obtain a candidate pool of base learners; and determining a surrogate model; and predicting test performance of the base learner in the candidate pool by using the surrogate model, and determining that k diverse base learners that meet a task scenario requirement form an ensemble model.
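The last step of the abstract (score every base learner in the candidate pool with a surrogate model, then keep k of them) can be illustrated with a minimal sketch. This is not the patented method: the names `select_top_k`, `predict_performance`, and the toy architecture labels are hypothetical, the surrogate is replaced by a lookup table, and the diversity requirement mentioned in the abstract is not modelled here.

```python
from typing import Callable, Hashable, List, Sequence


def select_top_k(
    candidate_pool: Sequence[Hashable],
    predict_performance: Callable[[Hashable], float],
    k: int,
) -> List[Hashable]:
    """Return the k candidates with the highest predicted performance."""
    if not 0 < k <= len(candidate_pool):
        raise ValueError("k must be between 1 and the pool size")
    # Score every base learner with the (hypothetical) surrogate model.
    scored = [(predict_performance(arch), arch) for arch in candidate_pool]
    # Sort by predicted score, best first; ties keep their pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [arch for _, arch in scored[:k]]


# Toy usage: architectures are plain labels and the surrogate is a dict lookup.
toy_scores = {"arch_a": 0.91, "arch_b": 0.87, "arch_c": 0.93, "arch_d": 0.80}
ensemble = select_top_k(list(toy_scores), toy_scores.get, k=2)
print(ensemble)
```

In a real setting `predict_performance` would wrap a trained surrogate, and the selection rule might also penalise near-duplicate architectures so that the ensemble stays diverse.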

First claim

Opening claim text (preview).

1. A method for searching for a neural network architecture ensemble model, wherein the method comprises: obtaining a dataset, wherein the dataset comprises a sample and an annotation in a classification task; performing search by using a distributional neural network architecture search algorithm, comprising: determining a hyperparameter of a neural network architecture distribution; sampling a neural network architecture from the architecture distribution defined by the hyperparameter; training and evaluating the neural network architecture, based on the sample and the annotation in the classification task, to obtain a performance indicator; determining, based on the performance indicator, predicted neural network architecture distributions that share the hyperparameter; to obtain a candidate pool of base learners, wherein a base learner is a neural network architecture that meets an architecture distribution requirement, and the neural network architecture is formed by repeatedly stacking neural network architecture cells; and determining a surrogate model, wherein the surrogate model is used to predict test performance of an unevaluated neural network architecture; and predicting test performance of a base learner in the candidate pool by using the surrogate model, and determining that k base learners that meet a requirement of the classification task form an ensemble model, wherein a size of the ensemble model is k.

2. The method of claim 1, wherein the performing search by using a distributional neural network architecture search algorithm further comprises: performing distributional neural network architecture search by using an approximate neural network architecture search via operation distribution (ANASOD) algorithm.

3. The method of claim 1, wherein the determining a hyperparameter of a neural network architecture distribution comprises: determining that the hyperparameter of the neural network architecture distribution is an ANASOD encoding, wherein the ANASOD encoding is a vector indicating probability distributions of operators in a neural network architecture cell, and there is a one-to-many mapping between an ANASOD encoding and the neural network architecture cell.

4. The method of claim 1, wherein the determining a hyperparameter of a neural network architecture distribution comprises: optimizing the hyperparameter of the neural network architecture distribution by using a search policy, wherein the search policy is Bayesian optimization, and the search policy is used to sample, in a next iteration, a neural network cell whose performance indicator better meets a requirement than that of a current neural network architecture cell.

5. The method of claim 3, wherein the sampling a neural network architecture from the architecture distribution defined by the hyperparameter comprises: determining a specific quantity of operators in constituent cells of the neural network architecture based on an operator probability distribution defined by the ANASOD encoding; and connecting different operators based on a specified search space to obtain a valid neural network architecture.

6. The method of claim 1, wherein the training and evaluating the neural network architecture to obtain a performance indicator comprises: training the neural network architecture on a training dataset; and evaluating the neural network architecture on a validation dataset to obtain the performance indicator, wherein both training set data and validation set data belong to the dataset.

7. The method of claim 1, wherein the performing search by using a distributional neural network architecture search (distributional NAS) algorithm further comprises: determining a search policy for the neural network architecture distribution based on the performance indicator and the hyperparameter of the predicted neural network architecture distribution.

8. The method of claim 1, wherein the performing search by using a distributional neural network architecture search (distributional NAS) algorithm further comprises: determining a predicted performance value of a hyperparameter of another unknown distribution, comprising a mean value and a variance, based on a hyperparameter and a performance indicator of each found neural network architecture distribution; and determining a performance prediction policy for the neural network architecture distribution based on the mean value and the variance, wherein the performance prediction policy is used to predict the performance indicator of the neural network architecture distribution.

9. The method of claim 1, wherein the determining, based on the performance indicator, neural network architecture distributions that share the hyperparameter; to obtain a candidate pool of base learners comprises: determining a search policy for the neural network architecture distribution based on the performance indicator and the hyperparameter; determining a performance prediction policy for the neural network architecture distribution based on the performance indicator and a neural network architecture cell; and searching, according to the search policy and the performance prediction policy, the neural network architecture distributions that share the hyperparameter, to determine the candidate pool of the base learners.

10. The method of claim 1, wherein the determining, based on the performance indicator, neural network architecture distributions that share the hyperparameter; to obtain a candidate pool of base learners comprises: outputting, based on a plurality of neural network architectures in a historical search and corresponding performance indicators, a plurality of neural network architectures that share the hyperparameter; determining, based on the plurality of neural network architectures that share the hyperparameter, a neural network architecture distribution that meets a requirement; and generating a plurality of neural network architecture cells based on the neural network architecture distribution that meets the requirement, to obtain a generation distribution/the candidate pool of the base learners.

11. The method of claim 1, wherein the determining a surrogate model comprises: obtaining the surrogate model through training on the dataset based on the neural network architecture cells and the performance indicator.

12. The method of claim 1, wherein the predicting test performance of the base learner in the candidate pool by using the surrogate model, and determining that k base learners that meet a task scenario requirement form an ensemble model comprises: predicting test performance of a plurality of base learners in the candidate pool by using the surrogate model; performing local search based on a prediction result, and determining q estimated vertex architectures, wherein an estimated vertex architecture is a neural network architecture whose performance indicator predicted by the surrogate model on a validation set is higher than that of an adjacent architecture; and combining k architectures whose performance indicators meet the requirement in the q estimated vertex architectures to obtain the ensemble model.

13. The method of claim 12, wherein the combining k architectures whose performance indicators meet the requirement in the q estimated vertex architectures comprises: sorting performance indicators of the q estimated vertex architectures in descending order, and combining k architectures whose performance indicators rank top.

14. The method of claim 12, wherein the combining k architectures whose performance indicators meet the re
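Claims 3 and 5 describe an encoding that is a probability vector over operators, from which operator quantities for a cell are determined before the operators are connected. The sketch below illustrates that idea only under loud assumptions: a cell is simplified to a flat list of edges, counts are derived by largest-remainder rounding, and the names `operator_counts`, `sample_cell`, and the operator labels are hypothetical rather than taken from the patent or from any ANASOD implementation.

```python
import random
from typing import List, Sequence


def operator_counts(probs: Sequence[float], num_edges: int) -> List[int]:
    """Turn an operator probability vector into integer counts summing to num_edges."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    raw = [p * num_edges for p in probs]
    counts = [int(r) for r in raw]  # floor of each expected count
    # Give the leftover edges to the operators with the largest fractional parts.
    leftover = num_edges - sum(counts)
    by_fraction = sorted(
        range(len(probs)), key=lambda i: raw[i] - counts[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


def sample_cell(
    operators: Sequence[str],
    probs: Sequence[float],
    num_edges: int,
    rng: random.Random,
) -> List[str]:
    """Sample one cell: fix the operator counts, then place them on edges at random."""
    counts = operator_counts(probs, num_edges)
    cell = [op for op, c in zip(operators, counts) for _ in range(c)]
    rng.shuffle(cell)  # the edge assignment is the random part
    return cell


# Toy usage with hypothetical operator names and an 8-edge cell.
ops = ["conv3x3", "skip", "maxpool"]
encoding = [0.5, 0.25, 0.25]
cell = sample_cell(ops, encoding, num_edges=8, rng=random.Random(0))
print(cell)
```

Because only the counts are fixed by the encoding, different random placements yield different cells for the same vector, which is one way to picture the one-to-many mapping stated in claim 3.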

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

G06N3/0985 · Primary
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
G06N3/082
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
G06N3/045 · Primary
Combinations of networks · CPC title
G06N3/04 · Primary
Architecture, e.g. interconnection topology · CPC title
G06V10/26
Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title

Patent family

Related publications grouped by family.

View patent family 86356560

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024311651A1 cover?: Disclosed is a method for searching for a neural network architecture ensemble model. The method includes: obtaining a dataset, where the dataset includes a sample and an annotation in a classification task; performing search by using a distributional neural network architecture search algorithm, including: determining a hyperparameter of a neural network architecture distribution; sampling a v…
Who is the assignee on this patent?: Huawei Tech Co Ltd