Semi-Supervised Learning for Training an Ensemble of Deep Convolutional Neural Networks
US-2019114544-A1 · Apr 18, 2019 · US
US12367661B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-12367661-B1 |
| Application number | US-202218088726-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 26, 2022 |
| Priority date | Dec 29, 2021 |
| Publication date | Jul 22, 2025 |
| Grant date | Jul 22, 2025 |
Official abstract text for this publication.
Some embodiments provide a method for training a machine-trained network that includes multiple parameters. The method propagates a batch of input training items through the network to generate output values and compute values of a loss function for each of the input training items. The method computes a weight for each input training item based on the computed loss function values for each of the input training items. The method selects input training items with larger weights more often than input training items with smaller weights for subsequent batches of input training items.
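The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the toy model `y = w * x`, the squared-error loss, the learning rate, and the names `train`, `items`, and `weights` are all hypothetical and do not come from the publication. The sketch takes each item's sampling weight to be its most recent loss value, which is one of several possible ways to base a weight on the loss.

```python
import random


def train(items, steps=300, batch_size=4, lr=0.01, seed=0):
    """Toy loss-weighted batch selection for a one-parameter model y = w * x.

    items: list of (x, expected_output) pairs (hypothetical training set).
    Returns the trained parameter and the final per-item sampling weights.
    """
    rng = random.Random(seed)
    w = 0.0                          # the single trainable parameter
    weights = [1.0] * len(items)     # uniform until loss values are observed

    for _ in range(steps):
        # Select a batch: items with larger weights are drawn more often.
        batch = rng.choices(range(len(items)), weights=weights, k=batch_size)

        grad = 0.0
        for i in batch:
            x, expected = items[i]
            err = w * x - expected   # propagate the item and compare outputs
            loss = err * err         # squared-error loss for this item
            # Assumption: weight equals the latest loss, plus a small floor
            # so that no item's selection probability drops to exactly zero.
            weights[i] = loss + 1e-8
            grad += 2.0 * err * x

        w -= lr * grad / batch_size  # plain gradient step on the batch

    return w, weights


# Usage: four items generated from the relation y = 3x.
items = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (-1.0, -3.0)]
w, weights = train(items)
print(w, weights)
```

Because `random.choices` draws with replacement, an item can appear more than once in a batch here; a per-batch uniqueness constraint would need a different sampler.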
Opening claim text (preview).
What is claimed is:

1. A method for training a machine-trained network comprising a plurality of parameters, the method comprising: propagating a batch of input training items through the network to generate output values and compute values of a loss function for each of the input training items; computing a weight for each input training item based on the computed loss function values for each of the input training items; and selecting input training items with larger weights more often than input training items with smaller weights for subsequent batches of input training items.

2. The method of claim 1, wherein: each input training item has a corresponding expected output value; and computing a value of a loss function for a particular input training item comprises comparing the corresponding expected output value to the generated output value for the input training item.

3. The method of claim 2, wherein the loss function values for the particular input training item increases as a distance between the corresponding expected output value and the generated output value for the input training item increases.

4. The method of claim 2, wherein the loss function is a measure of unhappiness.

5. The method of claim 1, wherein the weights for the input training items are proportional to the computed loss function values for the input training items.

6. The method of claim 1, wherein: the input training items are selected from a plurality of available input training items; and a number of available input training items is larger than a number of input training items in each batch of input training items.

7. The method of claim 6, wherein each input training item is selected at most once per batch of input training items.

8. The method of claim 6, wherein each input training item is selected at least once in the subsequent batches of input training items.

9. The method of claim 1, wherein: the network is trained for classifying items into a predefined set of classes; and the generated output value for a particular input training item comprises, for each class, a probability that the particular input training item belongs to the class.

10. The method of claim 1, wherein selecting input training items with larger weights more often enables the parameters of the machine-trained network to converge more quickly to optimal values.

11. A non-transitory machine-readable medium storing a program which when executed by at least one processing unit trains a machine-trained network comprising a plurality of parameters, the program comprising sets of instructions for: propagating a batch of input training items through the network to generate output values and compute values of a loss function for each of the input training items; computing a weight for each input training item based on the computed loss function values for each of the input training items; and selecting input training items with larger weights more often than input training items with smaller weights for subsequent batches of input training items.

12. The non-transitory machine-readable medium of claim 11, wherein: each input training item has a corresponding expected output value; and the set of instructions for computing a value of a loss function for a particular input training item comprises a set of instructions for comparing the corresponding expected output value to the generated output value for the input training item.

13. The non-transitory machine-readable medium of claim 12, wherein the loss function values for the particular input training item increases as a distance between the corresponding expected output value and the generated output value for the input training item increases.

14. The non-transitory machine-readable medium of claim 12, wherein the loss function is a measure of unhappiness.

15. The non-transitory machine-readable medium of claim 11, wherein the weights for the input training items are proportional to the computed loss function values for the input training items.

16. The non-transitory machine-readable medium of claim 11, wherein: the input training items are selected from a plurality of available input training items; and a number of available input training items is larger than a number of input training items in each batch of input training items.

17. The non-transitory machine-readable medium of claim 16, wherein each input training item is selected at most once per batch of input training items.

18. The non-transitory machine-readable medium of claim 16, wherein each input training item is selected at least once in the subsequent batches of input training items.

19. The non-transitory machine-readable medium of claim 11, wherein: the network is trained for classifying items into a predefined set of classes; and the generated output value for a particular input training item comprises, for each class, a probability that the particular input training item belongs to the class.

20. The non-transitory machine-readable medium of claim 11, wherein the selection of input training items with larger weights more often enables the parameters of the machine-trained network to converge more quickly to optimal values.
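Several dependent claims add concrete constraints: per-class probability outputs, weights proportional to the loss, and each item selected at most once per batch. The sketch below illustrates one way these could fit together. It is an assumption-laden example rather than the claimed implementation: the softmax output layer, the cross-entropy loss, the sequential sampler, and every name (`softmax`, `cross_entropy`, `sample_without_replacement`, `logits_per_item`, `labels`) are hypothetical choices made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into per-class probabilities that sum to 1."""
    m = max(logits)                         # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Loss grows as the probability assigned to the expected class shrinks."""
    return -math.log(max(probs[true_class], 1e-12))


def sample_without_replacement(weights, k, rng):
    """Draw k distinct indices, favouring larger weights at every draw."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        current = [weights[i] for i in remaining]
        pick = rng.choices(range(len(remaining)), weights=current, k=1)[0]
        chosen.append(remaining.pop(pick))  # remove so it cannot repeat
    return chosen


# Hypothetical network outputs (logits) and expected classes for four items.
logits_per_item = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0],
                   [0.0, 0.0, 0.1], [3.0, 0.5, 0.0]]
labels = [0, 1, 0, 0]

losses = [cross_entropy(softmax(z), y) for z, y in zip(logits_per_item, labels)]
total_loss = sum(losses)
weights = [loss / total_loss for loss in losses]   # proportional to the loss

rng = random.Random(0)
batch = sample_without_replacement(weights, k=2, rng=rng)
print(losses, weights, batch)
```

The third item is the one the hypothetical network gets wrong, so it carries the largest weight and is the most likely to appear in the next batch. The sketch does not enforce the "at least once" condition of claims 8 and 18, which would need additional bookkeeping across batches.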
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Validation; Performance evaluation · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
using classification, e.g. of video objects · CPC title
using neural networks · CPC title