Search space exploration for deep learning

US11989656B2 · US · B2

Patent metadata
Publication number: US-11989656-B2
Application number: US-202016935445-A
Country: US
Kind code: B2
Filing date: Jul 22, 2020
Priority date: Jul 22, 2020
Publication date: May 21, 2024
Grant date: May 21, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Aspects of the invention include systems and methods to obtain meta features of a dataset for training in a deep learning application. A method includes selecting an initial search space that defines a type of deep learning architecture representation that specifies hyperparameters for two or more neural network architectures. The method also includes applying a search strategy to the initial search space. One of the two or more neural network architectures is selected based on a result of an evaluation according to the search strategy. A new search space is generated with new hyperparameters using an evolutionary algorithm and a mutation type that defines one or more changes in the hyperparameters specified by the initial search space, and, based on the mutation type, the new hyperparameters are applied to the one of the two or more neural networks or the search strategy is applied to the new search space.
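To make the first half of the abstract concrete, here is a minimal sketch of a search space that specifies hyperparameters for several candidate architectures, plus a search strategy that evaluates each candidate and selects one. Everything named here is an assumption for illustration and does not come from the patent: the `SearchSpace` fields, the exhaustive strategy, and above all `toy_score`, which stands in for a real train-and-validate evaluation.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterator, Tuple

Architecture = Dict[str, object]


@dataclass(frozen=True)
class SearchSpace:
    """Hypothetical search space: a representation type plus hyperparameter choices."""
    representation: str                # e.g. "chain" or "cell" (illustrative labels)
    operations: Tuple[str, ...]        # candidate operations
    num_layers: Tuple[int, ...]        # candidate layer counts
    min_channels: Tuple[int, ...]      # candidate minimal channel counts

    def architectures(self) -> Iterator[Architecture]:
        # Enumerate every combination of hyperparameters the space allows.
        for op, layers, channels in product(
            self.operations, self.num_layers, self.min_channels
        ):
            yield {"op": op, "layers": layers, "channels": channels}


def exhaustive_search(
    space: SearchSpace, evaluate: Callable[[Architecture], float]
) -> Tuple[float, Architecture]:
    """Assumed search strategy: score every architecture and keep the best one."""
    scored = [(evaluate(arch), arch) for arch in space.architectures()]
    return max(scored, key=lambda pair: pair[0])


def toy_score(arch: Architecture) -> float:
    # Stand-in for training and validating the architecture (pure assumption).
    bonus = 0.05 if arch["op"] == "sep_conv" else 0.0
    return 0.1 * arch["layers"] + 0.001 * arch["channels"] + bonus


if __name__ == "__main__":
    space = SearchSpace("chain", ("sep_conv", "max_pool"), (2, 4), (16, 32))
    score, best = exhaustive_search(space, toy_score)
    print(best, round(score, 3))
```

A real search strategy would rarely enumerate the whole space; exhaustive scoring is used here only because it keeps the evaluate-then-select step easy to follow.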

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising:
    obtaining, using a processor, meta features corresponding with a dataset configured to be used for training in a deep learning application;
    selecting, using the processor, an initial search space, wherein the initial search space defines a type of deep learning architecture representation to represent two or more neural network architectures and specifies hyperparameters for the two or more neural network architectures;
    applying, using the processor, a search strategy to the initial search space, wherein the search strategy performs an evaluation of each of the two or more neural network architectures represented according to the initial search space;
    selecting, using the processor, one of the two or more neural network architectures based on a result of the evaluation according to the search strategy;
    generating, using the processor, a new search space with new hyperparameters that differ from the hyperparameters specified by the initial search space using an evolutionary algorithm and a mutation type selected from a plurality of mutation types, wherein each mutation type defines one or more changes in the hyperparameters specified by the initial search space, each mutation type is part of a first category or a second category, the generating is performed iteratively for each selected mutation type, and the generating includes, for each iteration:
        randomly selecting the mutation type,
        performing a check to determine whether the selected mutation type is part of the first category or the second category,
        based on the selected mutation type being part of the first category, applying the search strategy to the new search space to re-select a neural network architecture, and
        based on the selected mutation type being part of the second category, applying the new hyperparameters to the one of the two or more neural networks to re-select the neural network architecture;
    based on a result of applying the search strategy or the new hyperparameters having a sufficient accuracy, acquiring a color image dataset and training the re-selected neural network architecture using the color image dataset;
    based on the result having an insufficient accuracy, repeating the iteration until the re-selected neural network architecture has at least the sufficient accuracy, and training the re-selected neural network architecture using the dataset; and
    inputting color image data related to an application of interest to the trained re-selected neural network architecture, and performing at least one of feature extraction and classification by the trained re-selected neural network architecture.

2. The computer-implemented method according to claim 1, wherein the selecting the initial search space is based on a supervised learning process.

3. The computer-implemented method according to claim 1 further comprising selecting the search strategy based on a supervised learning process.

4. The computer-implemented method according to claim 1, wherein the applying the search strategy to the new search space is based on the mutation type being a change in at least one of a set of operations that include separable convolution, average pooling, max pooling, skip connect, and dilated convolution.

5. The computer-implemented method according to claim 1, wherein the applying the new hyperparameters to the one of the two or more neural networks is based on the mutation type being a change in a number of layers of the one of the two or more neural network architectures.

6. The computer-implemented method according to claim 1, wherein the applying the new hyperparameters to the one of the two or more neural networks is based on the mutation type being a change in a minimal number of channels of a convolutional operation in the one of the two or more neural networks.

7. The computer-implemented method according to claim 1, wherein the applying the new hyperparameters to the one of the two or more neural networks is based on the mutation type being a change in a skip pattern that defines connections between operations in a chain when the deep learning architecture representation is a chain-structure representation.

8. The computer-implemented method according to claim 1, wherein the applying the new hyperparameters to the one of the two or more neural networks is based on the mutation type being a change in a number of input connections to a node when the deep learning architecture representation is a cell structure representation.

9. A system comprising:
    a memory having computer readable instructions; and
    one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising:
    obtaining meta features corresponding with a dataset configured to be used for training in a deep learning application;
    selecting an initial search space, wherein the initial search space defines a type of deep learning architecture representation to represent two or more neural network architectures and specifies hyperparameters for the two or more neural network architectures;
    applying a search strategy to the initial search space, wherein the search strategy performs an evaluation of each of the two or more neural network architectures represented according to the initial search space;
    selecting one of the two or more neural network architectures based on a result of the evaluation according to the search strategy;
    generating a new search space with new hyperparameters that differ from the hyperparameters specified by the initial search space using an evolutionary algorithm and a mutation type selected from a plurality of mutation types, wherein each mutation type defines one or more changes in the hyperparameters specified by the initial search space, each mutation type is part of a first category or a second category, the generating is performed iteratively for each selected mutation type, and the generating includes, for each iteration:
        randomly selecting the mutation type,
        performing a check to determine whether the selected mutation type is part of the first category or the second category,
        based on the selected mutation type being part of the first category, applying the search strategy to the new search space to re-select a neural network architecture, and
        based on the selected mutation type being part of the second category, applying the new hyperparameters to the one of the two or more neural networks to re-select the neural network architecture;
    based on a result of applying the search strategy or the new hyperparameters having a sufficient accuracy, acquiring a color image dataset and training the re-selected neural network architecture using the color image dataset;
    based on the result having an insufficient accuracy, repeating the iteration until the re-selected neural network architecture has at least the sufficient accuracy, and training the re-selected neural network architecture using the dataset; and
    inputting color image data related to an application of interest to the trained re-selected neural network architecture, and performing at least one of feature extraction and classification by the trained re-selected neural network architecture.

10. The system according to claim 9, wherein the one or more processors are configured to select the initial search space is based on a supervised learning process.

11. The system according to claim 9 further comprising the one or more processors selecting the search strategy based on a supervised learning process.

12. The system according to claim 9, wherein the one
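The iterative step recited in the independent claims (randomly pick a mutation type, check its category, then either re-run the search strategy on the new search space or apply the new hyperparameters directly to the selected network, repeating until accuracy is sufficient) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented implementation: it mutates a single candidate rather than evolving a population, the category assignments follow the dependent claims (operation changes trigger a new search; layer-count and channel-count changes are applied directly), and `toy_accuracy`, the step sizes, and the dictionary layout are all hypothetical. Training on the color image dataset is omitted.

```python
import random

# Operation names taken from the dependent claims; identifiers are illustrative.
OPERATIONS = ["separable_conv", "avg_pool", "max_pool", "skip_connect", "dilated_conv"]

FIRST_CATEGORY = {"operations"}                  # re-run the search strategy
SECOND_CATEGORY = {"num_layers", "min_channels"}  # apply hyperparameters directly


def toy_accuracy(arch):
    # Hypothetical stand-in for evaluating a trained architecture.
    op_bonus = {"separable_conv": 0.10, "dilated_conv": 0.08}.get(arch["op"], 0.0)
    raw = 0.5 + 0.03 * arch["num_layers"] + 0.002 * arch["min_channels"] + op_bonus
    return min(1.0, raw)


def search(space):
    # Assumed search strategy: evaluate one candidate per operation, keep the best.
    candidates = [
        {"op": op, "num_layers": space["num_layers"], "min_channels": space["min_channels"]}
        for op in space["operations"]
    ]
    return max(candidates, key=toy_accuracy)


def mutate_and_reselect(space, arch, rng):
    # Randomly select the mutation type, then check which category it belongs to.
    mutation = rng.choice(sorted(FIRST_CATEGORY | SECOND_CATEGORY))
    new_space = dict(space)  # shallow copy so the caller's space is left untouched
    if mutation in FIRST_CATEGORY:
        # First category: the set of operations changes, so search the new space.
        new_space["operations"] = rng.sample(OPERATIONS, k=3)
        return new_space, search(new_space)
    # Second category: change one hyperparameter and apply it to the current network.
    step = 1 if mutation == "num_layers" else 8
    new_space[mutation] = space[mutation] + step
    new_arch = dict(arch, **{mutation: new_space[mutation]})
    return new_space, new_arch


def explore(space, target, seed=0, max_iters=200):
    rng = random.Random(seed)
    arch = search(space)  # initial selection from the initial search space
    for _ in range(max_iters):
        if toy_accuracy(arch) >= target:
            break  # sufficient accuracy: training would follow here
        space, arch = mutate_and_reselect(space, arch, rng)
    return space, arch


if __name__ == "__main__":
    initial = {"operations": ["avg_pool", "max_pool"], "num_layers": 2, "min_channels": 16}
    final_space, final_arch = explore(initial, target=0.9)
    print(final_arch, round(toy_accuracy(final_arch), 3))
```

The branch on category is the point of the sketch: a first-category mutation pays for a fresh search over the new space, while a second-category mutation reuses the already selected network and only adjusts its hyperparameters.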

Assignees

IBM

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

  • Reinforcement learning · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11989656B2 cover?
Aspects of the invention include systems and methods to obtain meta features of a dataset for training in a deep learning application. A method includes selecting an initial search space that defines a type of deep learning architecture representation that specifies hyperparameters for two or more neural network architectures. The method also includes applying a search strategy to the initial s…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/086. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 21, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).