What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 19, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Dynamic loss function based on statistics in loss layer of deep convolutional neural network

US2017300811A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017300811-A1
Application number	US-201615099077-A
Country	US
Kind code	A1
Filing date	Apr 14, 2016
Priority date	Apr 14, 2016
Publication date	Oct 19, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an example embodiment, a loss layer of a deep convolutional neural network is modified to include a dynamically changing function that adjusts based on statistical analysis of the samples, and specifically an analysis of which sample images showed the most deviation between their assigned professionalism score and an expected professionalism score. This allows outliers in training data to be automatically handled.
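As a rough illustration of the idea in the abstract, the sketch below drops the samples whose predicted score deviates most from the assigned score before averaging the loss. It is not taken from the patent: the function name `trimmed_loss`, the squared-error measure, and the `drop_fraction` parameter are all assumptions made for this example.

```python
def trimmed_loss(predicted, labels, drop_fraction=0.1):
    """Mean squared error over the samples that survive trimming.

    Hypothetical helper: the patent text does not specify this exact rule.
    Samples with the largest error are treated as outliers and ignored.
    """
    # Per-sample error between predicted score and assigned label.
    errors = [(p - y) ** 2 for p, y in zip(predicted, labels)]
    # Rank sample indices from smallest to largest error.
    ranked = sorted(range(len(errors)), key=lambda i: errors[i])
    # Keep everything except the worst `drop_fraction` of the batch
    # (always keep at least one sample).
    n_keep = max(1, len(errors) - int(drop_fraction * len(errors)))
    kept = ranked[:n_keep]
    loss = sum(errors[i] for i in kept) / n_keep
    return loss, sorted(kept)


if __name__ == "__main__":
    scores = [0.9, 0.2, 0.5, 0.8, 0.1]   # predicted professionalism scores
    assigned = [1.0, 0.0, 0.5, 1.0, 1.0]  # assigned labels; the last looks mislabeled
    loss, kept = trimmed_loss(scores, assigned, drop_fraction=0.2)
    print(loss, kept)
```

With a batch of five samples and `drop_fraction=0.2`, exactly one sample (the one with the largest error) is excluded from the averaged loss, so a single mislabeled image cannot dominate the weight update.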

First claim

Opening claim text (preview).

What is claimed is:

1. A computerized method of training a deep convolutional neural network (DCNN), the method comprising:
training the DCNN by:
inputting a current plurality of samples to the DCNN, each of the samples having a label, the inputting including, for each sample:
passing the sample to a convolutional layer of the DCNN, the convolutional layer comprising one or more filters having dynamically adjustable weights, the one or more filters configured to filter the sample to produce an output volume for the corresponding sample, the output volume comprising a different feature map for each of the one or more filters;
passing the output volume from the convolutional layer through a nonlinearity layer, the nonlinearity layer applying a nonlinearity function to the output volume from the convolutional layer;
passing the output volume from the nonlinearity layer through a pooling layer, the pooling layer lowering spatial dimensions of the output volume from the nonlinearity layer;
passing the output volume from the pooling layer through a classification layer, the classification layer comprising a specialized convolutional layer having a filter designed to output a prediction for the sample based on the output volume from the pooling layer;
passing the sample through a loss layer, the loss layer applying a loss function to the sample, resulting in an indication of a level of error in the prediction from the classification layer in comparison to the label of the sample;
ranking each of the current plurality of samples based on their corresponding levels of error;
applying a dynamic loss function to the current plurality of samples to eliminate lower ranked samples from consideration;
determining whether a combination of the levels of error for the current plurality of samples not eliminated from consideration by the dynamic loss function transgresses a preset threshold; and
in response to a determination that the combination of the levels of error transgresses a preset threshold, updating weights of the one or more filters in the convolutional layers of the DCNN to reduce the combination of the levels of error and repeating the training of the DCNN using a different plurality of samples and the updated weights.

2. The method of claim 1, wherein the DCNN comprises multiple stages, each stage containing a different convolutional layer, nonlinearity layer, and pooling layer.

3. The method of claim 1, wherein the dynamic loss function is based on statistics regarding the current plurality of samples.

4. The method of claim 1, wherein the dynamic loss function is based on statistics regarding the current plurality of samples and an additional plurality of samples previously used to train the DCNN, yet is applied just to the current plurality of samples.

5. The method of claim 1, wherein the dynamic loss function is designed to automatically become stricter as more iterations of the training occur.

6. The method of claim 1, wherein the dynamic loss function eliminates a preset percentage of samples from consideration.

7. The method of claim 1, wherein the dynamic loss function eliminates samples from consideration by weighting each sample's probability of being an outlier assuming a Gaussian distribution of errors.

8. A system for training a deep convolutional neural network (DCNN), the system comprising a computer readable medium having instructions stored thereon, which, when executed by a processor, cause the system to:
train the DCNN by:
inputting a current plurality of samples to the DCNN, each of the samples having a label, the inputting including, for each sample:
passing the sample to a convolutional layer of the DCNN, the convolutional layer comprising one or more filters having dynamically adjustable weights, the one or more filters configured to filter the sample to produce an output volume for the corresponding sample, the output volume comprising a different feature map for each of the one or more filters;
passing the output volume from the convolutional layer through a nonlinearity layer, the nonlinearity layer applying a nonlinearity function to the output volume from the convolutional layer;
passing the output volume from the nonlinearity layer through a pooling layer, the pooling layer lowering spatial dimensions of the output volume from the nonlinearity layer;
passing the output volume from the pooling layer through a classification layer, the classification layer comprising a specialized convolutional layer having a filter designed to output a prediction for the sample based on the output volume from the pooling layer;
passing the sample through a loss layer, the loss layer applying a loss function to the sample, resulting in an indication of a level of error in the prediction from the classification layer in comparison to the label of the sample;
ranking each of the current plurality of samples based on their corresponding levels of error;
applying a dynamic loss function to the current plurality of samples to eliminate lower ranked samples from consideration;
determining whether a combination of the levels of error for the current plurality of samples not eliminated from consideration by the dynamic loss function transgresses a preset threshold; and
in response to a determination that the combination of the levels of error transgresses a preset threshold, updating weights of the one or more filters in the convolutional layers of the DCNN to reduce the combination of the levels of error and repeating the training of the DCNN using a different plurality of samples and the updated weights.

9. The system of claim 8, wherein the DCNN comprises multiple stages, each stage containing a different convolutional layer, nonlinearity layer, and pooling layer.

10. The system of claim 8, wherein the dynamic loss function is based on statistics regarding the current plurality of samples.

11. The system of claim 8, wherein the dynamic loss function is based on statistics regarding the current plurality of samples and an additional plurality of samples previously used to train the DCNN, yet is applied just to the current plurality of samples.

12. The system of claim 8, wherein the dynamic loss function is designed to automatically become stricter as more iterations of the training occur.

13. The system of claim 8, wherein the dynamic loss function eliminates a preset percentage of samples from consideration.

14. The system of claim 8, wherein the dynamic loss function eliminates samples from consideration by weighting each sample's probability of being an outlier assuming a Gaussian distribution of errors.

15. A non-transitory machine-readable storage medium comprising instructions, which when implemented by one or more machines, cause the one or more machines to perform operations for training a deep convolutional neural network (DCNN), the operations comprising:
training the DCNN by:
inputting a current plurality of samples to the DCNN, each of the samples having a label, the inputting including, for each sample:
passing the sample to a convolutional layer of the DCNN, the convolutional layer comprising one or more filters having dynamically adjustable weights, the one or more filters configured to filter the sample to produce an output volume for the corresponding sample, the output volume comprising a different feature map for each of the one or more filters;
passing the output volume from the convolutional layer through a nonlinearity layer, the nonlinearity layer applying a nonlinearity function to the output volume from the convolutional layer; p
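The dependent claims mention a Gaussian assumption on the errors and a loss function that becomes stricter as training proceeds. The sketch below shows one possible reading of each, using only the Python standard library. The names `gaussian_keep_mask` and `drop_fraction_schedule`, the tail-probability cutoff, and the linear ramp are assumptions made for illustration; the claim text does not define any of them.

```python
import math
import statistics


def gaussian_keep_mask(errors, min_tail_prob=0.05):
    """Flag which samples to keep, assuming errors are roughly Gaussian.

    Hypothetical rule: a sample is treated as an outlier when the
    upper-tail probability of its error, under a Gaussian fitted to the
    batch, falls below `min_tail_prob`.
    """
    mu = statistics.mean(errors)
    sigma = statistics.pstdev(errors)
    if sigma == 0.0:
        # All errors identical: nothing can be called an outlier.
        return [True] * len(errors)
    mask = []
    for e in errors:
        z = (e - mu) / sigma
        # P(Z >= z) for a standard normal variable.
        tail = 0.5 * math.erfc(z / math.sqrt(2.0))
        mask.append(tail >= min_tail_prob)
    return mask


def drop_fraction_schedule(iteration, start=0.0, end=0.2, ramp_iters=1000):
    """Linearly raise the fraction of samples to drop (assumed schedule)."""
    t = min(1.0, iteration / ramp_iters)
    return start + t * (end - start)


if __name__ == "__main__":
    batch_errors = [0.1, 0.12, 0.09, 0.11, 0.1, 0.1, 0.11, 0.09, 0.1, 2.0]
    print(gaussian_keep_mask(batch_errors))
    print([drop_fraction_schedule(i) for i in (0, 500, 5000)])
```

The mask could be fed into whatever averaging step computes the combined level of error, and the schedule could supply the percentage used by a fixed-percentage variant. Statistics carried over from earlier batches, as in claims 4 and 11, would replace the per-batch `mu` and `sigma` here.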

Assignees

Linkedin Corp

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/09
Supervised learning · CPC title
G06N99/005
Physics · mapped topic
G06N3/08 (Primary)
Learning methods · CPC title

Patent family

Related publications grouped by family.

View patent family 60040120

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017300811A1 cover?: In an example embodiment, a loss layer of a deep convolutional neural network is modified to include a dynamically changing function that adjusts based on statistical analysis of the samples, and specifically an analysis of which sample images showed the most deviation between their assigned professionalism score and an expected professionalism score. This allows outliers in training data to…
Who is the assignee on this patent?: Linkedin Corp