Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US2017300811A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2017300811-A1 |
| Application number | US-201615099077-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 14, 2016 |
| Priority date | Apr 14, 2016 |
| Publication date | Oct 19, 2017 |
| Grant date | — |
Official abstract text for this publication.
In an example embodiment, a loss layer of a deep convolutional neural network is modified to include a dynamically changing function that adjusts based on statistical analysis of the samples, and specifically an analysis of which sample images showed the most deviation between their assigned professionalism score and an expected professionalism score. This allows outliers in training data to be automatically handled.
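The outlier handling described in the abstract can be illustrated with a minimal sketch. It assumes that the samples with the largest deviation between predicted and labeled score are the ones treated as outliers, and that a fixed fraction of them is dropped before the loss is averaged. The function name `trimmed_loss`, the parameter `drop_fraction`, and the error values are hypothetical and do not come from the publication.

```python
def trimmed_loss(errors, drop_fraction=0.1):
    """Mean error after discarding the highest-error samples (assumed outliers).

    Illustrative only: the publication does not specify this exact rule.
    """
    if not errors:
        raise ValueError("errors must be non-empty")
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    ranked = sorted(errors)                     # smallest deviation first
    n_drop = int(len(ranked) * drop_fraction)   # number of samples to eliminate
    kept = ranked[:len(ranked) - n_drop]        # drop the largest deviations
    return sum(kept) / len(kept), kept


# Hypothetical per-sample errors, e.g. |predicted score - labeled score|.
errors = [0.1, 0.2, 0.15, 0.05, 3.0, 0.12, 0.18, 0.09, 0.11, 0.2]
loss, kept = trimmed_loss(errors, drop_fraction=0.1)
```

With these made-up values, the single sample with error 3.0 is excluded, so it no longer dominates the averaged loss that drives the weight update.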
Opening claim text (preview).
What is claimed is:

1. A computerized method of training a deep convolutional neural network (DCNN), the method comprising:
   training the DCNN by:
   inputting a current plurality of samples to the DCNN, each of the samples having a label, the inputting including, for each sample:
   passing the sample to a convolutional layer of the DCNN, the convolutional layer comprising one or more filters having dynamically adjustable weights, the one or more filters configured to filter the sample to produce an output volume for the corresponding sample, the output volume comprising a different feature map for each of the one or more filters;
   passing the output volume from the convolutional layer through a nonlinearity layer, the nonlinearity layer applying a nonlinearity function to the output volume from the convolutional layer;
   passing the output volume from the nonlinearity layer through a pooling layer, the pooling layer lowering spatial dimensions of the output volume from the nonlinearity layer;
   passing the output volume from the pooling layer through a classification layer, the classification layer comprising a specialized convolutional layer having a filter designed to output a prediction for the sample based on the output volume from the pooling layer;
   passing the sample through a loss layer, the loss layer applying a loss function to the sample, resulting in an indication of a level of error in the prediction from the classification layer in comparison to the label of the sample;
   ranking each of the current plurality of samples based on their corresponding levels of error;
   applying a dynamic loss function to the current plurality of samples to eliminate lower ranked samples from consideration;
   determining whether a combination of the levels of error for the current plurality of samples not eliminated from consideration by the dynamic loss function transgresses a preset threshold; and
   in response to a determination that the combination of the levels of error transgresses a preset threshold, updating weights of the one or more filters in the convolutional layers of the DCNN to reduce the combination of the levels of error and repeating the training of the DCNN using a different plurality of samples and the updated weights.

2. The method of claim 1, wherein the DCNN comprises multiple stages, each stage containing a different convolutional layer, nonlinearity layer, and pooling layer.

3. The method of claim 1, wherein the dynamic loss function is based on statistics regarding the current plurality of samples.

4. The method of claim 1, wherein the dynamic loss function is based on statistics regarding the current plurality of samples and an additional plurality of samples previously used to train the DCNN, yet is applied just to the current plurality of samples.

5. The method of claim 1, wherein the dynamic loss function is designed to automatically become stricter as more iterations of the training occur.

6. The method of claim 1, wherein the dynamic loss function eliminates a preset percentage of samples from consideration.

7. The method of claim 1, wherein the dynamic loss function eliminates samples from consideration by weighting each sample's probability of being an outlier assuming a Gaussian distribution of errors.

8. A system for training a deep convolutional neural network (DCNN), the system comprising a computer-readable medium having instructions stored thereon, which, when executed by a processor, cause the system to:
   train the DCNN by:
   inputting a current plurality of samples to the DCNN, each of the samples having a label, the inputting including, for each sample:
   passing the sample to a convolutional layer of the DCNN, the convolutional layer comprising one or more filters having dynamically adjustable weights, the one or more filters configured to filter the sample to produce an output volume for the corresponding sample, the output volume comprising a different feature map for each of the one or more filters;
   passing the output volume from the convolutional layer through a nonlinearity layer, the nonlinearity layer applying a nonlinearity function to the output volume from the convolutional layer;
   passing the output volume from the nonlinearity layer through a pooling layer, the pooling layer lowering spatial dimensions of the output volume from the nonlinearity layer;
   passing the output volume from the pooling layer through a classification layer, the classification layer comprising a specialized convolutional layer having a filter designed to output a prediction for the sample based on the output volume from the pooling layer;
   passing the sample through a loss layer, the loss layer applying a loss function to the sample, resulting in an indication of a level of error in the prediction from the classification layer in comparison to the label of the sample;
   ranking each of the current plurality of samples based on their corresponding levels of error;
   applying a dynamic loss function to the current plurality of samples to eliminate lower ranked samples from consideration;
   determining whether a combination of the levels of error for the current plurality of samples not eliminated from consideration by the dynamic loss function transgresses a preset threshold; and
   in response to a determination that the combination of the levels of error transgresses a preset threshold, updating weights of the one or more filters in the convolutional layers of the DCNN to reduce the combination of the levels of error and repeating the training of the DCNN using a different plurality of samples and the updated weights.

9. The system of claim 8, wherein the DCNN comprises multiple stages, each stage containing a different convolutional layer, nonlinearity layer, and pooling layer.

10. The system of claim 8, wherein the dynamic loss function is based on statistics regarding the current plurality of samples.

11. The system of claim 8, wherein the dynamic loss function is based on statistics regarding the current plurality of samples and an additional plurality of samples previously used to train the DCNN, yet is applied just to the current plurality of samples.

12. The system of claim 8, wherein the dynamic loss function is designed to automatically become stricter as more iterations of the training occur.

13. The system of claim 8, wherein the dynamic loss function eliminates a preset percentage of samples from consideration.

14. The system of claim 8, wherein the dynamic loss function eliminates samples from consideration by weighting each sample's probability of being an outlier assuming a Gaussian distribution of errors.

15. A non-transitory machine-readable storage medium comprising instructions, which when implemented by one or more machines, cause the one or more machines to perform operations for training a deep convolutional neural network (DCNN), the operations comprising:
   training the DCNN by:
   inputting a current plurality of samples to the DCNN, each of the samples having a label, the inputting including, for each sample:
   passing the sample to a convolutional layer of the DCNN, the convolutional layer comprising one or more filters having dynamically adjustable weights, the one or more filters configured to filter the sample to produce an output volume for the corresponding sample, the output volume comprising a different feature map for each of the one or more filters;
   passing the output volume from the convolutional layer through a nonlinearity layer, the nonlinearity layer applying a nonlinearity function to the output volume from the convolutional layer; p
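The dependent claims on a Gaussian error assumption and on a loss function that becomes stricter over iterations can be pictured with the following sketch. It is one possible reading, not the claimed implementation: errors are standardized against the batch mean and standard deviation, samples whose z-score exceeds a cutoff are eliminated, and the cutoff shrinks linearly with the iteration count. The names `gaussian_keep_mask` and `z_cutoff_schedule`, the linear schedule, and all numeric defaults are assumptions.

```python
import statistics


def gaussian_keep_mask(errors, z_cutoff):
    """Return True for samples kept, False for samples treated as outliers.

    Assumes per-sample errors are roughly Gaussian within the batch.
    """
    mu = statistics.fmean(errors)
    sigma = statistics.pstdev(errors)
    if sigma == 0.0:
        return [True] * len(errors)   # identical errors: nothing stands out
    return [(e - mu) / sigma <= z_cutoff for e in errors]


def z_cutoff_schedule(iteration, start=3.0, floor=1.5, decay=0.05):
    """Hypothetical schedule: the cutoff tightens as training progresses."""
    return max(floor, start - decay * iteration)


# Hypothetical per-sample errors with one obvious outlier at the end.
batch_errors = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.13, 0.07, 0.10, 3.0]
mask = gaussian_keep_mask(batch_errors, z_cutoff=2.0)
kept_errors = [e for e, keep in zip(batch_errors, mask) if keep]
```

Only the one-sided test (unusually high error) is used here, on the assumption that unusually low errors are not a problem. A variant using statistics accumulated over earlier batches would replace `mu` and `sigma` with running estimates.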
Combinations of networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Physics · mapped topic
Learning methods · CPC title