Method and system for obtaining improved structure of a target neural network
US-2015006444-A1 · Jan 1, 2015 · US
US10438112B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10438112-B2 |
| Application number | US-201514836901-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 26, 2015 |
| Priority date | May 26, 2015 |
| Publication date | Oct 8, 2019 |
| Grant date | Oct 8, 2019 |
Official abstract text for this publication.
A method for configuring a neural network is provided. The method includes: selecting a neural network including a plurality of layers, each of the layers including a plurality of neurons for processing an input and providing an output; and, incorporating at least one switch configured to randomly select and disable at least a portion of the neurons in each layer. Another method in the computer program product is disclosed.
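As an illustration of the "switch" described in the abstract, the following is a minimal NumPy sketch, not the patented implementation. The function names `random_switch` and `forward`, the ReLU activation, the layer sizes, and the fixed `disable_fraction` are all assumptions made for the example; the abstract only states that a switch randomly selects and disables a portion of the neurons in each layer.

```python
import numpy as np

def random_switch(num_neurons, disable_fraction, rng):
    """Return a 0/1 mask with a randomly chosen fraction of neurons set to 0.

    Hypothetical helper: the fraction and the rounding rule are assumptions.
    """
    num_disabled = int(round(disable_fraction * num_neurons))
    mask = np.ones(num_neurons)
    disabled = rng.choice(num_neurons, size=num_disabled, replace=False)
    mask[disabled] = 0.0
    return mask

def forward(x, weights, disable_fraction, rng):
    """Forward pass with a fresh switch mask applied to every hidden layer."""
    masks = []
    h = x
    for W in weights[:-1]:
        h = np.maximum(h @ W, 0.0)           # ReLU layer (assumed activation)
        m = random_switch(h.shape[-1], disable_fraction, rng)
        h = h * m                            # disabled neurons output zero
        masks.append(m)
    return h @ weights[-1], masks

rng = np.random.default_rng(0)
# Illustrative layer sizes: 4 inputs, hidden layers of 8 and 6, 2 outputs.
weights = [rng.normal(size=(4, 8)),
           rng.normal(size=(8, 6)),
           rng.normal(size=(6, 2))]
out, masks = forward(rng.normal(size=(3, 4)), weights, 0.5, rng)
```

Each call to `forward` draws new masks, so a different portion of each hidden layer is disabled on every pass.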
Opening claim text (preview).
What is claimed is:

1. A method to train a neural network, the method comprising: selecting, using a processor, a first subset of feature detectors from a plurality of feature detectors of a first layer of a neural network, the first subset of feature detectors and a second subset of feature detectors forming the plurality of feature detectors of the first layer; disabling each feature detector of the first subset of feature detectors; enabling each feature detector of the second subset of feature detectors; selecting a third subset of feature detectors from a plurality of feature detectors of a second layer of the neural network, the second layer being closer to an output of the neural network than the first layer, the third subset of feature detectors and a fourth subset of feature detectors forming the plurality of feature detectors of the second layer, the feature detectors of the third subset of feature detectors being connected to each disabled feature detector of the first subset of feature detectors; disabling each feature detector of the third subset of feature detectors; enabling each feature detector of the fourth subset of feature detectors; inputting training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the fourth subset of feature detectors enabled; backpropagating data output from the neural network in response to inputting the training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the third subset of feature detectors enabled; and updating at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on an optimization of a loss function.

2. The method of claim 1, wherein the third subset of feature detectors further comprises the feature detectors that are connected to each disabled feature detector of the first subset of feature detectors and at least one feature detector selected from a remaining feature detector of the plurality of feature detectors of the second layer.

3. The method of claim 1, wherein updating the at least one feature detector further comprises updating the at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on a gradient descent associated with each updated feature detector.

4. The method of claim 3, wherein updating the at least one feature detector further comprises: updating at least one feature detector of the second subset of feature detectors based on a gradient descent of the updated feature detector of the second subset of feature detectors; and updating at least one feature detector of the fourth subset of feature detectors based on a gradient descent of the updated feature detector of the fourth subset of feature detectors.

5. The method of claim 1, wherein selecting the first subset of feature detector comprises randomly selecting the first subset of feature detectors.

6. The method of claim 1, wherein the feature detectors of at least one of the first layer and the second layer comprises a plurality of channels.

7. The method of claim 6, wherein the feature detectors of the first layer comprises a plurality of channels and the feature detectors of the second layer comprises a plurality of channels.

8. A system, comprising: a processor programmed to initiate executable operations to train a neural network comprising: selecting a first subset of feature detectors from a plurality of feature detectors of a first layer of the neural network, the first subset of feature detectors and a second subset of feature detectors forming the plurality of feature detectors of the first layer; disabling each feature detector of the first subset of feature detectors; enabling each feature detector of the second subset of feature detectors; selecting a third subset of feature detectors from a plurality of feature detectors of a second layer of the neural network, the second layer being closer to an output of the neural network than the first layer, the third subset of feature detectors and a fourth subset of feature detectors forming the plurality of feature detectors of the second layer, the feature detectors of the third subset of feature detectors being connected to each disabled feature detector of the first subset of feature detectors; disabling each feature detector of the third subset of feature detectors; enabling each feature detector of the fourth subset of feature detectors; inputting training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the fourth subset of feature detectors enabled; backpropagating data output from the neural network in response to inputting the training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the third subset of feature detectors enabled; and updating at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on an optimization of a loss function.

9. The system of claim 8, wherein the third subset of feature detectors further comprises the feature detectors that are connected to each disabled feature detector of the first subset of feature detectors and at least one feature detector selected from a remaining feature detector of the plurality of feature detectors of the second layer.

10. The system of claim 8, wherein updating the at least one feature detector further comprises updating the at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on a gradient descent associated with each updated feature detector.

11. The system of claim 10, wherein updating the at least one feature detector further comprises: updating at least one feature detector of the second subset of feature detectors based on a gradient descent of the updated feature detector of the second subset of feature detectors; and updating at least one feature detector of the fourth subset of feature detectors based on a gradient descent of the updated feature detector of the fourth subset of feature detectors.

12. The system of claim 8, wherein selecting the first subset of feature detector comprises randomly selecting the first subset of feature detectors.

13. The system of claim 8, wherein the feature detectors of at least one of the first layer and the second layer comprises a plurality of channels.

14. The system of claim 13, wherein the feature detectors of the first layer comprises a plurality of channels and the feature detectors of the second layer comprises a plurality of channels.

15. A non-transitory computer-readable medium having stored thereon instructions that, if executed by a processor, result in at least the following: selecting, a first subset of feature detectors from a plurality of feature detectors of a first layer of a neural network, the first subset of feature detectors and a second subset of feature detectors forming the plurality of feature detectors of the first layer; disabling each feature detector of the first subset of feature detectors; enabling each feature detector of the second subset of feature detectors; selecting a third subset of feature detectors from a plurality of feature detectors of a second layer of the neural network, the second layer being closer to an output of the neural network than the first layer, the third subset of feature detectors and a f
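The training steps recited in claim 1 can be sketched as follows. This is a hedged NumPy illustration under several assumptions that the claim text does not fix: each feature detector is modeled as one column of a weight matrix, the connectivity pattern `conn` is a made-up sparse layout, "connected to each disabled feature detector" is read as "connected to at least one disabled detector", the loss is squared error, and the update is plain gradient descent. The names `select_disabled` and `train_step` are hypothetical.

```python
import numpy as np

def select_disabled(conn, num_disabled, rng):
    """Pick the first subset at random, then derive the third subset from it."""
    n1, _ = conn.shape
    first = rng.choice(n1, size=num_disabled, replace=False)  # first subset
    mask1 = np.ones(n1)
    mask1[first] = 0.0                      # second subset stays enabled (1)
    # Assumed reading: a second-layer detector joins the third subset when it
    # is connected to at least one disabled first-layer detector.
    third = conn[first].any(axis=0)
    mask2 = np.where(third, 0.0, 1.0)       # fourth subset stays enabled (1)
    return mask1, mask2

def train_step(x, t, W1, W2, W3, conn, mask1, mask2, lr=0.05):
    """One forward/backward pass with the disabled detectors held at zero."""
    W2c = W2 * conn                          # enforce the sparse connectivity
    pre1 = x @ W1
    h1 = np.maximum(pre1, 0.0) * mask1       # first-layer output, masked
    pre2 = h1 @ W2c
    h2 = np.maximum(pre2, 0.0) * mask2       # second-layer output, masked
    err = h2 @ W3 - t
    loss = 0.5 * np.mean(np.sum(err ** 2, axis=1))
    # Backpropagation; masked detectors receive exactly zero gradient.
    dy = err / x.shape[0]
    dW3 = h2.T @ dy
    dpre2 = (dy @ W3.T) * mask2 * (pre2 > 0)
    dW2 = (h1.T @ dpre2) * conn
    dpre1 = (dpre2 @ W2c.T) * mask1 * (pre1 > 0)
    dW1 = x.T @ dpre1
    # Gradient-descent update; only enabled detectors actually change.
    return W1 - lr * dW1, W2 - lr * dW2, W3 - lr * dW3, loss

rng = np.random.default_rng(0)
n_in, n1, n2, n_out = 5, 6, 6, 3
# Hypothetical connectivity: first-layer detector i feeds detectors i and i+1.
conn = np.zeros((n1, n2), dtype=bool)
for i in range(n1):
    conn[i, i] = conn[i, (i + 1) % n2] = True
W1 = rng.normal(scale=0.5, size=(n_in, n1))
W2 = rng.normal(scale=0.5, size=(n1, n2))
W3 = rng.normal(scale=0.5, size=(n2, n_out))
x = rng.normal(size=(8, n_in))
t = rng.normal(size=(8, n_out))

mask1, mask2 = select_disabled(conn, 2, rng)
new_W1, new_W2, new_W3, loss = train_step(x, t, W1, W2, W3, conn, mask1, mask2)
```

Because the masks zero both the activations and the backpropagated signal, the weight columns belonging to the disabled first-layer and second-layer detectors are left unchanged by the step, while detectors in the enabled subsets are updated from the loss gradient.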
Backpropagation, e.g. using gradient descent · CPC title
Modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
Architecture, e.g. interconnection topology · CPC title
Physics · mapped topic
Distributed learning, e.g. federated learning · CPC title