Neural network recognition and training method and apparatus
US-2019102678-A1 · Apr 4, 2019 · US
US11475280B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11475280-B2 |
| Application number | US-202016808069-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 3, 2020 |
| Priority date | Nov 15, 2019 |
| Publication date | Oct 18, 2022 |
| Grant date | Oct 18, 2022 |
Official abstract text for this publication.
A system includes a computing platform having a hardware processor and a memory storing a software code and a neural network (NN) having multiple layers including a last activation layer and a loss layer. The hardware processor executes the software code to identify different combinations of layers for testing the NN, each combination including candidate function(s) for the last activation layer and candidate function(s) for the loss layer. For each different combination, the software code configures the NN based on the combination, inputs, into the configured NN, a training dataset including multiple data objects, receives, from the configured NN, a classification of the data objects, and generates a performance assessment for the combination based on the classification. The software code determines a preferred combination of layers for the NN including selected candidate functions for the last activation layer and the loss layer, based on a comparison of the performance assessments.
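The selection loop described in the abstract can be sketched as a small grid search. This is only an illustration under stated assumptions, not the patented implementation: the NN is reduced to a single linear layer, the toy dataset `DATA` is invented, `squared_error` is a hypothetical second loss candidate not named in the source, training uses plain finite-difference gradient descent, and accuracy on the training data stands in for the "performance assessment".

```python
import itertools
import math

# Candidate functions for the last activation layer (names taken from the claims).
def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def l1_normalize(z):
    # Assumption: L1-normalization is applied to sigmoid outputs so values stay positive.
    p = sigmoid(z)
    s = sum(p)
    return [v / s for v in p]

# Candidate functions for the loss layer.
def cross_entropy(p, y):
    return -sum(t * math.log(max(q, 1e-9)) for q, t in zip(p, y))

def squared_error(p, y):  # hypothetical extra candidate, not named in the source
    return sum((q - t) ** 2 for q, t in zip(p, y))

ACTIVATIONS = {"sigmoid": sigmoid, "softmax": softmax, "l1": l1_normalize}
LOSSES = {"cross_entropy": cross_entropy, "squared_error": squared_error}

# Invented toy training dataset: (features, one-hot label).
DATA = [([0.0, 1.0], [1, 0]), ([0.2, 0.9], [1, 0]),
        ([1.0, 0.1], [0, 1]), ([0.8, 0.0], [0, 1])]

def forward(w, x, act):
    # w holds two rows of (weight, weight, bias) for a 2-input, 2-output linear layer.
    z = [w[0] * x[0] + w[1] * x[1] + w[2],
         w[3] * x[0] + w[4] * x[1] + w[5]]
    return act(z)

def total_loss(w, act, loss):
    return sum(loss(forward(w, x, act), y) for x, y in DATA)

def train(act, loss, steps=200, lr=0.2, h=1e-5):
    # Gradient descent with forward-difference gradients (a stand-in for real training).
    w = [0.0] * 6
    for _ in range(steps):
        base = total_loss(w, act, loss)
        grad = []
        for i in range(len(w)):
            shifted = list(w)
            shifted[i] += h
            grad.append((total_loss(shifted, act, loss) - base) / h)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def accuracy(w, act):
    hits = 0
    for x, y in DATA:
        p = forward(w, x, act)
        hits += int(p.index(max(p)) == y.index(max(y)))
    return hits / len(DATA)

def select_preferred():
    # Configure, train, and assess the model once per (activation, loss) combination.
    results = {}
    for a, l in itertools.product(ACTIVATIONS, LOSSES):
        w = train(ACTIVATIONS[a], LOSSES[l])
        results[(a, l)] = accuracy(w, ACTIVATIONS[a])
    best = max(results, key=results.get)
    return best, results
```

Calling `select_preferred()` returns the best-scoring `(activation, loss)` pair together with the per-combination scores, mirroring the "comparison of the performance assessments" step. A real system would swap in an actual network, a held-out evaluation, and whatever metric suits the task.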
Opening claim text (preview).
What is claimed is:
1. A system for classifying data objects, the system comprising: a computing platform including a hardware processor and a system memory; a software code and a neural network (NN) stored in the system memory, the NN having a plurality of layers including one or more activation layers and a loss layer, the one or more activation layers comprising a last activation layer; the hardware processor configured to execute the software code to: identify a plurality of different combinations of layers for testing the NN, each combination of the plurality of different combinations of layers including one or more candidate functions for the last activation layer and one or more candidate functions for the loss layer; for each combination of the plurality of different combinations of layers: configure the NN based on the each combination; input, into the configured NN, a training dataset including a plurality of data objects; receive, from the configured NN, a classification of the plurality of data objects in the training dataset; generate a performance assessment for the each combination based on the classification; and determine a preferred combination of layers for the NN from among the plurality of different combinations of layers based on a comparison of the performance assessments, the preferred combination of layers comprising a selected candidate amongst the one or more candidate functions for the last activation layer and a selected candidate amongst the one or more candidate functions for the loss layer.
2. The system of claim 1, wherein the selected candidate for the last activation layer of the preferred combination of layers is a softmax activation function, and wherein the last activation layer follows a sigmoid activation layer.
3. The system of claim 1, wherein the one or more activation layers includes an additional normalization layer, and wherein each combination of the plurality of different combinations of layers further includes one or more candidate functions for the additional normalization layer.
4. The system of claim 3, wherein the selected candidate for the last activation layer is one of a sigmoid activation function, a softmax activation function, or an L1-normalization function.
5. The system of claim 3, wherein the selected candidate for the loss layer comprises a cross entropy loss function.
6. The system of claim 1, wherein the plurality of data objects in the training dataset comprises a plurality of images.
7. The system of claim 6, wherein the hardware processor is further configured to execute the software code to generate the plurality of images in the training dataset.
8. The system of claim 6, wherein the hardware processor is further configured to execute the software code to: obtain a plurality of real images; composite the plurality of real images to form a montage of the plurality of real images; identify a plurality of labels for association with the montage; label the montage using one or more of the plurality of identified labels to generate the plurality of images in the training dataset; wherein noise is parametrically introduced into the training dataset, resulting in a subset of the plurality of images being purposely mislabeled.
9. The system of claim 8, wherein a plurality of parameters utilized to introduce the noise into the training dataset comprises a number of peaks (dp) in the training dataset, a likelihood of noise (pn) in the training dataset, and a balance (pa) between false positives and false negatives in the training dataset.
10. A method for use by a system for classifying data objects, the system including a computing platform having a hardware processor and a system memory storing a software code and a neural network (NN), the NN having a plurality of layers including one or more activation layers and a loss layer, the one or more activation layers comprising a last activation layer, the method comprising: identifying, by the software code executed by the hardware processor, a plurality of different combinations of layers for testing the NN, each combination of the plurality of different combinations of layers including one or more candidate functions for the last activation layer and one or more candidate functions for the loss layer; for each combination of the plurality of different combinations of layers: configuring, by the software code executed by the hardware processor, the NN based on the each combination; inputting into the configured NN, by the software code executed by the hardware processor, a training dataset including a plurality of data objects; receiving from the configured NN, by the software code executed by the hardware processor, a classification of the plurality of data objects in the training dataset; generating, by the software code executed by the hardware processor, a performance assessment for the each combination based on the classification; and determining, by the software code executed by the hardware processor, a preferred combination of layers for the NN from among the plurality of different combinations of layers based on a comparison of the performance assessments, the preferred combination of layers comprising a selected candidate amongst the one or more candidate functions for the last activation layer and a selected candidate amongst the one or more candidate functions for the loss layer.
11. The method of claim 10, wherein the selected candidate for the last activation layer of the preferred combination of layers is a softmax activation function, and wherein the last activation layer follows a sigmoid activation layer.
12. The method of claim 10, wherein the one or more activation layers includes an additional normalization layer, and wherein each combination of the plurality of different combinations of layers further includes one or more candidate functions for the additional normalization layer.
13. The method of claim 12, wherein the selected candidate for the last activation layer is one of a sigmoid activation function, a softmax activation function, or an L1-normalization function.
14. The method of claim 12, wherein the selected candidate for the loss layer comprises a cross entropy loss function.
15. The method of claim 10, wherein the plurality of data objects in the training dataset comprises a plurality of images.
16. The method of claim 15, further comprising generating, by the software code executed by the hardware processor, the plurality of images in the training dataset.
17. The method of claim 15, further comprising: obtaining, by the software code executed by the hardware processor, a plurality of real images; compositing, by the software code executed by the hardware processor, the plurality of real images to form a montage of the plurality of real images; identifying, by the software code executed by the hardware processor, a plurality of labels for association with the montage; and labeling, by the software code executed by the hardware processor, the montage using one or more of the plurality of identified labels to generate the plurality of images in the training dataset; wherein noise is parametrically introduced into the training dataset, resulting in a subset of the plurality of images being purposely mislabeled.
18. The method of claim 17, wherein a plurality of parameters utilized to introduce the noise into the training dataset comprises a number of peaks (dp) in the training dataset, a likelihood of noise (pn) in the training dataset, and a balance (pa) between false
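The parametric label noise recited in the montage claims could be sketched as follows. The claim text names the parameters pn and pa but does not define how they are applied, so the semantics here are assumptions: `pn` is treated as the probability that an image's label vector is corrupted, and `pa` as the probability that a corruption is a false positive rather than a false negative. The number of peaks (dp) is not modeled, and the function name `add_label_noise` is hypothetical.

```python
import random

def add_label_noise(labels, pn, pa, seed=0):
    """Return a noisy copy of multi-label 0/1 vectors, leaving the input untouched.

    Assumed semantics (not taken from the patent text):
      pn: probability that a given label vector is corrupted
      pa: probability that a corruption is a false positive (0 -> 1)
          instead of a false negative (1 -> 0)
    """
    rng = random.Random(seed)  # seeded so the noise is reproducible
    noisy = []
    for vec in labels:
        vec = list(vec)
        if rng.random() < pn:
            want_false_positive = rng.random() < pa
            # A false positive flips an absent label on; a false negative flips a present one off.
            target = 0 if want_false_positive else 1
            candidates = [i for i, v in enumerate(vec) if v == target]
            if candidates:
                i = rng.choice(candidates)
                vec[i] = 1 - vec[i]
        noisy.append(vec)
    return noisy
```

For example, `add_label_noise(labels, pn=0.2, pa=0.5)` would mislabel roughly one in five montages, split evenly between spurious and missing labels under the assumptions above.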
Incorporation of unlabelled data, e.g. multiple instance learning [MIL] · CPC title
using neural networks · CPC title
Validation; Performance evaluation · CPC title
using classification, e.g. of video objects · CPC title
Activation functions · CPC title