Method and apparatus of learning neural network via hierarchical ensemble learning

US10438112B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10438112-B2
Application number: US-201514836901-A
Country: US
Kind code: B2
Filing date: Aug 26, 2015
Priority date: May 26, 2015
Publication date: Oct 8, 2019
Grant date: Oct 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for configuring a neural network is provided. The method includes: selecting a neural network including a plurality of layers, each of the layers including a plurality of neurons for processing an input and providing an output; and, incorporating at least one switch configured to randomly select and disable at least a portion of the neurons in each layer. Another method in the computer program product is disclosed.
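
As a rough illustration of the "switch" described in the abstract, the following Python sketch applies an independent random mask to the neurons of each layer during a forward pass. It is a minimal sketch, not the patented implementation: the function names (random_switch, forward), the disable_fraction parameter, the ReLU activation, and the toy weights are all assumptions introduced for illustration.

```python
import random

def random_switch(num_neurons, disable_fraction, rng):
    """Return a 0/1 mask in which a random subset of neurons is disabled (0)."""
    num_disabled = round(disable_fraction * num_neurons)
    disabled = set(rng.sample(range(num_neurons), num_disabled))
    return [0 if i in disabled else 1 for i in range(num_neurons)]

def forward(x, weights, disable_fraction, rng):
    """Forward pass through fully connected ReLU layers with one switch per layer.

    weights[k][j] is the (hypothetical) weight vector of neuron j in layer k.
    """
    h, masks = x, []
    for layer in weights:
        mask = random_switch(len(layer), disable_fraction, rng)
        # A disabled neuron contributes an output of zero to the next layer.
        h = [m * max(0.0, sum(w * v for w, v in zip(w_j, h)))
             for w_j, m in zip(layer, mask)]
        masks.append(mask)
    return h, masks

rng = random.Random(0)
weights = [
    [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)],  # 4 neurons, 3 inputs
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)],  # 2 neurons, 4 inputs
]
out, masks = forward([0.5, -0.2, 0.8], weights, 0.5, rng)
```

With disable_fraction set to 0.5, half of the neurons in each layer are switched off for this pass; drawing a fresh mask on every training pass is what makes the selection random.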

First claim

Opening claim text (preview).

What is claimed is:

1. A method to train a neural network, the method comprising: selecting, using a processor, a first subset of feature detectors from a plurality of feature detectors of a first layer of a neural network, the first subset of feature detectors and a second subset of feature detectors forming the plurality of feature detectors of the first layer; disabling each feature detector of the first subset of feature detectors; enabling each feature detector of the second subset of feature detectors; selecting a third subset of feature detectors from a plurality of feature detectors of a second layer of the neural network, the second layer being closer to an output of the neural network than the first layer, the third subset of feature detectors and a fourth subset of feature detectors forming the plurality of feature detectors of the second layer, the feature detectors of the third subset of feature detectors being connected to each disabled feature detector of the first subset of feature detectors; disabling each feature detector of the third subset of feature detectors; enabling each feature detector of the fourth subset of feature detectors; inputting training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the fourth subset of feature detectors enabled; backpropagating data output from the neural network in response to inputting the training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the third subset of feature detectors enabled; and updating at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on an optimization of a loss function.

2. The method of claim 1, wherein the third subset of feature detectors further comprises the feature detectors that are connected to each disabled feature detector of the first subset of feature detectors and at least one feature detector selected from a remaining feature detector of the plurality of feature detectors of the second layer.

3. The method of claim 1, wherein updating the at least one feature detector further comprises updating the at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on a gradient descent associated with each updated feature detector.

4. The method of claim 3, wherein updating the at least one feature detector further comprises: updating at least one feature detector of the second subset of feature detectors based on a gradient descent of the updated feature detector of the second subset of feature detectors; and updating at least one feature detector of the fourth subset of feature detectors based on a gradient descent of the updated feature detector of the fourth subset of feature detectors.

5. The method of claim 1, wherein selecting the first subset of feature detector comprises randomly selecting the first subset of feature detectors.

6. The method of claim 1, wherein the feature detectors of at least one of the first layer and the second layer comprises a plurality of channels.

7. The method of claim 6, wherein the feature detectors of the first layer comprises a plurality of channels and the feature detectors of the second layer comprises a plurality of channels.

8. A system, comprising: a processor programmed to initiate executable operations to train a neural network comprising: selecting a first subset of feature detectors from a plurality of feature detectors of a first layer of the neural network, the first subset of feature detectors and a second subset of feature detectors forming the plurality of feature detectors of the first layer; disabling each feature detector of the first subset of feature detectors; enabling each feature detector of the second subset of feature detectors; selecting a third subset of feature detectors from a plurality of feature detectors of a second layer of the neural network, the second layer being closer to an output of the neural network than the first layer, the third subset of feature detectors and a fourth subset of feature detectors forming the plurality of feature detectors of the second layer, the feature detectors of the third subset of feature detectors being connected to each disabled feature detector of the first subset of feature detectors; disabling each feature detector of the third subset of feature detectors; enabling each feature detector of the fourth subset of feature detectors; inputting training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the fourth subset of feature detectors enabled; backpropagating data output from the neural network in response to inputting the training data into the neural network with the first subset and the third subset of feature detectors disabled and the second subset and the third subset of feature detectors enabled; and updating at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on an optimization of a loss function.

9. The system of claim 8, wherein the third subset of feature detectors further comprises the feature detectors that are connected to each disabled feature detector of the first subset of feature detectors and at least one feature detector selected from a remaining feature detector of the plurality of feature detectors of the second layer.

10. The system of claim 8, wherein updating the at least one feature detector further comprises updating the at least one feature detector of at least one of the second subset and the fourth subset of feature detectors based on a gradient descent associated with each updated feature detector.

11. The system of claim 10, wherein updating the at least one feature detector further comprises: updating at least one feature detector of the second subset of feature detectors based on a gradient descent of the updated feature detector of the second subset of feature detectors; and updating at least one feature detector of the fourth subset of feature detectors based on a gradient descent of the updated feature detector of the fourth subset of feature detectors.

12. The system of claim 8, wherein selecting the first subset of feature detector comprises randomly selecting the first subset of feature detectors.

13. The system of claim 8, wherein the feature detectors of at least one of the first layer and the second layer comprises a plurality of channels.

14. The system of claim 13, wherein the feature detectors of the first layer comprises a plurality of channels and the feature detectors of the second layer comprises a plurality of channels.

15. A non-transitory computer-readable medium having stored thereon instructions that, if executed by a processor, result in at least the following: selecting, a first subset of feature detectors from a plurality of feature detectors of a first layer of a neural network, the first subset of feature detectors and a second subset of feature detectors forming the plurality of feature detectors of the first layer; disabling each feature detector of the first subset of feature detectors; enabling each feature detector of the second subset of feature detectors; selecting a third subset of feature detectors from a plurality of feature detectors of a second layer of the neural network, the second layer being closer to an output of the neural network than the first layer, the third subset of feature detectors and a f
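
The subset selection recited in the claims can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the patentee's implementation: the connectivity map connections, the helper names select_subsets and masked_update, and the n_extra parameter are hypothetical. The sketch also reads "connected to each disabled feature detector" as "connected to at least one disabled detector", which is only one possible interpretation. Gradients are assumed to come from backpropagation through the masked network, which is not shown here.

```python
import random

def select_subsets(n_first, connections, n_disable, rng, n_extra=0):
    """Split two layers into disabled and enabled sets of feature detectors.

    connections[j] is the set of first-layer indices feeding second-layer
    detector j (a hypothetical connectivity map, assumed for illustration).
    """
    first = set(rng.sample(range(n_first), n_disable))    # disabled, layer 1
    second = set(range(n_first)) - first                  # enabled, layer 1
    # Second-layer detectors fed by a disabled first-layer detector.
    forced = {j for j, src in enumerate(connections) if src & first}
    # Optionally disable extra detectors drawn from the remainder (cf. claim 2).
    remaining = sorted(set(range(len(connections))) - forced)
    third = forced | set(rng.sample(remaining, n_extra))  # disabled, layer 2
    fourth = set(range(len(connections))) - third         # enabled, layer 2
    return first, second, third, fourth

def masked_update(weights, grads, enabled, lr):
    """Gradient-descent step applied only to enabled feature detectors.

    weights[i] and grads[i] are the weight vector and gradient of detector i.
    """
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if i in enabled else list(w_row)
        for i, (w_row, g_row) in enumerate(zip(weights, grads))
    ]

rng = random.Random(0)
# Each second-layer detector is fed by two first-layer detectors.
connections = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
first, second, third, fourth = select_subsets(4, connections, 1, rng)

# Only detector 1 is enabled, so only its weights move.
updated = masked_update([[1.0, 1.0], [2.0, 2.0]],
                        [[0.5, 0.5], [0.5, 0.5]], enabled={1}, lr=1.0)
```

Because the disabled set in the second layer is derived from the disabled set in the first layer, the two masks are coupled rather than drawn independently per layer.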

Assignees

Inventors

Classifications

  • Backpropagation, e.g. using gradient descent · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • G06N3/0454 · Primary

    Physics · mapped topic

  • G06N3/098 · Primary

    Distributed learning, e.g. federated learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10438112B2 cover?
A method for configuring a neural network is provided. The method includes: selecting a neural network including a plurality of layers, each of the layers including a plurality of neurons for processing an input and providing an output; and, incorporating at least one switch configured to randomly select and disable at least a portion of the neurons in each layer. Another method in the computer…
Who is the assignee on this patent?
Zhang Qiang, Ji Zhengping, Shi Lilong, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06N3/0454. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 8, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).