Method and system for performance tuning and performance tuning device
US-2021398013-A1 · Dec 23, 2021 · US
US12093814B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12093814-B2 |
| Application number | US-201916544969-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 20, 2019 |
| Priority date | Aug 20, 2019 |
| Publication date | Sep 17, 2024 |
| Grant date | Sep 17, 2024 |
Official abstract text for this publication.
A method, system, and computer program product for hyper-parameter determination. In a method, a network architecture of a learning model may be determined, and the learning model may be configured for performing a computing task based on machine learning. A metric value record associated with a group of hyper-parameters may be obtained during hyper-parameter determination for the learning model. An estimation of a metric value may be obtained based on the network architecture, and the metric value record and an association relationship representing an association between network architectures and metric values for the network architectures. The group of hyper-parameters may be selected in response to the estimation of the metric value meeting a predefined criterion. With these embodiments, a group of hyper-parameters may be selected, and further the learning model may be trained based on the selected group of hyper-parameters.
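The selection step described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `select_hyper_parameters` helper, the `toy_estimator` stand-in for the trained "association relationship", and the assumption that a higher metric value is better (for a loss-style metric the comparison would be reversed).

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: an architecture encoding and a partial metric value record.
ArchVector = Sequence[float]
MetricRecord = Sequence[float]  # e.g. validation accuracy over early iterations
HyperParams = Dict[str, float]

# The "association relationship" is modelled as any callable mapping
# (architecture, partial metric record) -> estimated metric value.
Estimator = Callable[[ArchVector, MetricRecord], float]


def select_hyper_parameters(
    arch: ArchVector,
    candidates: List[Tuple[HyperParams, MetricRecord]],
    estimate: Estimator,
) -> Tuple[HyperParams, float]:
    """Return the candidate group with the highest estimated metric value."""
    if not candidates:
        raise ValueError("need at least one candidate group")
    scored = [(hp, estimate(arch, record)) for hp, record in candidates]
    # Assumed criterion: compare the estimations and keep the largest one.
    return max(scored, key=lambda pair: pair[1])


def toy_estimator(arch: ArchVector, record: MetricRecord) -> float:
    """Stand-in estimator (an assumption): last value plus last improvement.

    It ignores `arch`; a trained association relationship would use it.
    """
    if not record:
        raise ValueError("metric value record is empty")
    if len(record) < 2:
        return record[-1]
    return record[-1] + (record[-1] - record[-2])


arch = [1.0, 0.0, 2.0]
candidates = [
    ({"lr": 0.1}, [0.50, 0.60, 0.65]),
    ({"lr": 0.01}, [0.40, 0.55, 0.70]),
]
best, est = select_hyper_parameters(arch, candidates, toy_estimator)
```

In this toy run the second group is selected because its short record extrapolates to the higher estimate, even though both groups were observed for only three iterations.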
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising: determining, by one or more processors, a network architecture of a learning model, the learning model being configured for performing a computing task based on machine learning, wherein the network architecture is represented as a vector comprising data corresponding to each layer in the learning model, wherein a data structure associated with each layer in the machine learning model comprises a type of each node from among a plurality of nodes and a connection relationship associated with each node, wherein the type corresponds to a neural network cell or an activation function, and wherein the computing task includes at least one of image classification, face recognition, and text processing performed by the learning model, and wherein the network architecture is determined based on the connection relationship and the plurality of nodes by building a directed acyclic graph represented by a matrix and a value at a location in the matrix corresponding to whether two nodes from among the plurality of nodes are connected; obtaining, by the one or more processors, a metric value record associated with a group of hyper-parameters during hyper-parameter determination for the learning model; obtaining, by the one or more processors, an estimation of a metric value based on the network architecture, the metric value record and an association relationship representing an association between network architectures and metric values for the network architecture; and selecting, by the one or more processors, the group of hyper-parameters in response to the estimation of the metric value meeting a predefined criterion, wherein the predefined criterion corresponds to a comparison among multiple estimations associated with multiple groups of hyper-parameters; and training, by the one or more processors, the association relationship based on determining a number of iterations corresponding to a convergence associated with the selected group of hyper-parameters and obtaining metric value records for more iterations and less iterations than the number of iterations corresponding to the convergence, wherein the method further comprises: with respect to a sample learning model in a plurality of sample learning models, determining, by the one or more processors, a sample network architecture of the sample learning model, the plurality of sample learning models being configured for performing a plurality of sample tasks based on the machine learning, respectively; obtaining, by the one or more processors, a plurality of metric value records during a plurality of experiments for the hyper-parameter determination; and training, by the one or more processors, the association relationship based on the sample network architecture and the plurality of metric value records, such that the trained association relationship represents an association between the sample network architecture and the plurality of metric value records, wherein the plurality of metric value records corresponds to various percentages of the iterations relative to the convergence such that the association relationship has knowledge of various time points during hyper-parameters determination.

2. The method of claim 1, wherein the determining, by the one or more processors, the network architecture of the learning model comprises: extracting, by the one or more processors, the connection relationship among the plurality of nodes comprised in the learning model; and determining, by the one or more processors, the network architecture based on the connection relationship and the plurality of nodes.

3. The method of claim 2, wherein the determining, by the one or more processors, the network architecture based on the connection relationship and the plurality of nodes comprises: determining, by the one or more processors, a plurality of layers formed by the plurality of nodes; and determining, by the one or more processors, the network architecture based on the connection relationship and the plurality of layers.

4. The method of claim 1, further comprising: obtaining, by the one or more processors, a further metric value record associated with a further group of hyper-parameters during the hyper-parameter determination for the learning model; obtaining, by the one or more processors, a further estimation of a metric value based on the network architecture, the further metric value record and the association relationship; and wherein the selecting, by the one or more processors, the group of hyper-parameters in response to the estimation of the metric value meeting the predefined criterion comprises: selecting, by the one or more processors, the group of hyper-parameters in response to the estimation of the metric value being closer to the convergence during the hyper-parameter determination than the further estimation.

5. The method of claim 4, wherein the estimation of the metric value comprises an extreme value among a plurality of metric values associated with a plurality of group of hyper-parameters during the hyper-parameter determination.

6. The method of claim 1, wherein the obtaining, by the one or more processors, the plurality of metric value records comprises: obtaining, by the one or more processors, one of the plurality of the metric value records based on metric values associated with a progress of the hyper-parameter determination.

7. The method of claim 6, wherein the obtaining, by the one or more processors, one of the plurality of the metric value records comprises: determining, by the one or more processors, the convergence during the hyper-parameter determination; and obtaining, by the one or more processors, a metric value record based on the determined convergence.

8. The method of claim 1, further comprising: obtaining, by the one or more processors, a group of sample data for training the learning model; and training, by the one or more processors, the learning model based on the group of sample data and the selected group of hyper-parameters.

9. The method of claim 8, further comprising: obtaining, by the one or more processors, an object that is to be processed by the computing task; and processing, by the one or more processors, the object based on the trained learning model.

10. A computer-implemented system, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage media, and program instructions stored on at least one of the one or more computer-readable tangible storage media for execution by at least one of the one or more processors via at least one of the one or more computer-readable memories, wherein the computer system is capable of performing a method comprising: determining a network architecture of a learning model, the learning model being configured for performing a computing task based on machine learning, wherein the network architecture is represented as a vector comprising data corresponding to each layer in the learning model, wherein a data structure associated with each layer in the machine learning model comprises a type of each node from among a plurality of nodes and a connection relationship associated with each node, wherein the type corresponds to a neural network cell or an activation function, wherein the computing task includes at least one of image classification, face recognition, and text processing performed by the learning model, and wherein the network architecture is determined based on the connection relationship and the plurality of nodes by building a directed acyclic graph represented by a matrix and a value at a location in the matrix correspon
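
Claim 1 describes the network architecture as a directed acyclic graph stored in a matrix whose entries record whether two nodes are connected, together with a type for each node. The sketch below shows one way such an encoding could look. It is illustrative only: the `NODE_TYPES` vocabulary, the function names, and the per-node (rather than per-layer) flattening are assumptions, not details from the patent.

```python
from typing import List, Tuple

# Hypothetical vocabulary of node types (neural network cells and activations).
NODE_TYPES = {"conv": 0, "lstm": 1, "relu": 2, "sigmoid": 3}


def build_adjacency(num_nodes: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """matrix[i][j] == 1 when node i feeds node j, else 0."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for src, dst in edges:
        if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
            raise ValueError(f"edge ({src}, {dst}) is out of range")
        matrix[src][dst] = 1
    return matrix


def is_acyclic(matrix: List[List[int]]) -> bool:
    """Check for cycles with Kahn's topological-sort algorithm."""
    n = len(matrix)
    indegree = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for nxt in range(n):
            if matrix[node][nxt]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    # Every node is visited only if the graph has no cycle.
    return visited == n


def encode_architecture(node_types: List[str], edges: List[Tuple[int, int]]) -> List[int]:
    """Flatten node type codes plus the adjacency matrix into one vector."""
    matrix = build_adjacency(len(node_types), edges)
    if not is_acyclic(matrix):
        raise ValueError("architecture graph must be acyclic")
    type_codes = [NODE_TYPES[t] for t in node_types]
    flat = [value for row in matrix for value in row]
    return type_codes + flat


# A three-node chain: conv -> relu -> conv.
vector = encode_architecture(["conv", "relu", "conv"], [(0, 1), (1, 2)])
```

A vector like this could serve as the architecture input to an estimator of metric values, alongside a metric value record.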
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using neural networks · CPC title
Validation; Performance evaluation · CPC title