Behavioral Analytic System
US-2018033024-A1 · Feb 1, 2018 · US
US12051000B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12051000-B2 |
| Application number | US-202217962789-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 10, 2022 |
| Priority date | Nov 29, 2016 |
| Publication date | Jul 30, 2024 |
| Grant date | Jul 30, 2024 |
Official abstract text for this publication.
Some embodiments provide a method for configuring a machine-trained (MT) network that includes multiple configurable weights to train. The method propagates a set of inputs through the MT network to generate a set of output probability distributions. Each input has a corresponding expected output probability distribution. The method calculates a value of a continuously-differentiable loss function that includes a term approximating an extremum function of the difference between the expected output probability distributions and generated set of output probability distributions. The method trains the weights by back-propagating the calculated value of the continuously-differentiable loss function.
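The loss described in the abstract can be illustrated with a minimal sketch. It assumes cross-entropy as the measure of difference between expected and generated distributions, and a log-sum-exp as the continuously-differentiable approximation of the maximum; the abstract names neither, and the function names and the `sharpness` parameter are hypothetical.

```python
import math


def cross_entropy(expected, generated, eps=1e-12):
    """Cross-entropy between one expected and one generated distribution.

    Assumed here as the per-input measure of difference.
    """
    return -sum(e * math.log(max(g, eps)) for e, g in zip(expected, generated))


def smooth_max(values, sharpness=10.0):
    """Log-sum-exp approximation of max(values); differentiable everywhere.

    Subtracting the true maximum before exponentiating avoids overflow.
    The result lies between max(values) and max(values) + log(n) / sharpness.
    """
    m = max(values)
    total = sum(math.exp(sharpness * (v - m)) for v in values)
    return m + math.log(total) / sharpness


def worst_case_loss(expected_batch, generated_batch, sharpness=10.0):
    """Smooth approximation of the largest per-input loss in a batch."""
    losses = [
        cross_entropy(e, g) for e, g in zip(expected_batch, generated_batch)
    ]
    return smooth_max(losses, sharpness)


if __name__ == "__main__":
    expected = [[1.0, 0.0], [0.0, 1.0]]
    generated = [[0.9, 0.1], [0.4, 0.6]]
    print(worst_case_loss(expected, generated))
```

Because the log-sum-exp is dominated by its largest argument, its gradient concentrates on the worst-performing inputs, while the function stays smooth enough to back-propagate. A larger `sharpness` tracks the true maximum more closely.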
Opening claim text (preview).
What is claimed is:
1. A method for training a classification network that classifies inputs into a plurality of different categories, the method comprising: propagating a set of inputs through the classification network to generate a set of output probability distributions, the generated output probability distribution for each input providing a probability of the input belonging to each of the categories, each input having a corresponding expected output probability distribution that specifies a particular one of the categories to which the input belongs; calculating a value of a continuously-differentiable loss function comprising a term that approximates a maximum of entropy calculations for each of the different categories; and using the calculated continuously-differentiable loss function value to train weights of the classification network, wherein the term that approximates the maximum of the entropy calculations biases the training of the weights towards reducing a difference between the expected output probability distributions and the generated output probability distributions for inputs belonging to a category with the largest entropy calculations.
2. The method of claim 1, wherein calculating the value of the continuously-differentiable loss function comprises calculating the entropy for each of the different categories.
3. The method of claim 2, wherein calculating the entropy for each of the different categories comprises, for each of the categories: calculating an average of the generated output probability distributions for the inputs belonging to the category; and calculating the entropy of the average of the generated output probability distributions for the inputs belonging to the category.
4. The method of claim 2, wherein calculating the entropy for each of the different categories comprises using a log-sum-exponent formulation that highlights inputs with the largest divergence between expected output probability distributions and generated output probability distributions.
5. The method of claim 4, wherein the term that approximates the maximum of the entropy calculations is a log-sum-exponent term that uses the log-sum-exponent formulation of the entropy as its exponent.
6. The method of claim 5, wherein: the summation in the log-sum-exponent term is a summation over the plurality of different categories; and the summation in the log-sum-exponent formulation of the entropy calculation for a particular category is a summation over the inputs belonging to the particular category.
7. The method of claim 1, wherein: the set of inputs comprises a plurality of inputs for each of the categories; and for each category, the expected output probability distribution for each input belonging to the category is 1 for the category to which the input belongs and 0 for each other category.
8. The method of claim 1, wherein using the calculated continuously-differentiable loss function value to train the weights of the classification network comprises: back-propagating the calculated loss function value to determine, for each of a plurality of the weights of the classification network, a rate of change in the calculated loss function value relative to a rate of change in the weight; and modifying each respective weight of the plurality of weights according to the respective rate of change determined for the weight.
9. The method of claim 1, wherein the classification network is for embedding into a device after the classification network is trained.
10. The method of claim 1, wherein the inputs are images and the plurality of categories are different types of objects represented in the images.
11. A non-transitory machine-readable medium storing a program which when executed by at least one processing unit trains a classification network that classifies inputs into a plurality of different categories, the program comprising sets of instructions for: propagating a set of inputs through the classification network to generate a set of output probability distributions, the generated output probability distribution for each input providing a probability of the input belonging to each of the categories, each input having a corresponding expected output probability distribution that specifies a particular one of the categories to which the input belongs; calculating a value of a continuously-differentiable loss function comprising a term that approximates a maximum of entropy calculations for each of the different categories; and using the calculated continuously-differentiable loss function value to train weights of the classification network, wherein the term that approximates the maximum of the entropy calculations biases the training of the weights towards reducing a difference between the expected output probability distributions and the generated output probability distributions for inputs belonging to a category with the largest entropy calculations.
12. The non-transitory machine-readable medium of claim 11, wherein the set of instructions for calculating the value of the continuously-differentiable loss function comprises a set of instructions for calculating the entropy for each of the different categories.
13. The non-transitory machine-readable medium of claim 12, wherein the set of instructions for calculating the entropy for each of the different categories comprises sets of instructions for, for each of the categories: calculating an average of the generated output probability distributions for the inputs belonging to the category; and calculating the entropy of the average of the generated output probability distributions for the inputs belonging to the category.
14. The non-transitory machine-readable medium of claim 12, wherein the set of instructions for calculating the entropy for each of the different categories comprises a set of instructions for using a log-sum-exponent formulation that highlights inputs with the largest divergence between expected output probability distributions and generated output probability distributions.
15. The non-transitory machine-readable medium of claim 14, wherein the term that approximates the maximum of the entropy calculations is a log-sum-exponent term that uses the log-sum-exponent formulation of the entropy as its exponent.
16. The non-transitory machine-readable medium of claim 15, wherein: the summation in the log-sum-exponent term is a summation over the plurality of different categories; and the summation in the log-sum-exponent formulation of the entropy calculation for a particular category is a summation over the inputs belonging to the particular category.
17. The non-transitory machine-readable medium of claim 11, wherein: the set of inputs comprises a plurality of inputs for each of the categories; and for each category, the expected output probability distribution for each input belonging to the category is 1 for the category to which the input belongs and 0 for each other category.
18. The non-transitory machine-readable medium of claim 11, wherein the set of instructions for using the calculated continuously-differentiable loss function value to train the weights of the classification network comprises sets of instructions for: back-propagating the calculated loss function value to determine, for each of a plurality of the weights of the classification network, a rate of change in the calculated loss function value relative to a rate of change in the weight; and modifying each respective weight of the plurality of weights according to the respecti
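
The per-category calculation recited in claims 1 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the patented formulation: it uses Shannon entropy with natural logarithms, takes the averaging route of claim 3 rather than the nested log-sum-exponent form of claims 4 to 6 (whose exact expression is not given in this excerpt), and all function names and the `sharpness` parameter are hypothetical.

```python
import math
from collections import defaultdict


def entropy(dist, eps=1e-12):
    """Shannon entropy (natural log) of one probability distribution."""
    return -sum(p * math.log(max(p, eps)) for p in dist)


def per_category_entropy(labels, generated):
    """Entropy of the averaged generated distribution for each category.

    labels[i] is the category that input i belongs to (its expected output);
    generated[i] is the network's output distribution for input i.
    """
    groups = defaultdict(list)
    for label, dist in zip(labels, generated):
        groups[label].append(dist)

    result = {}
    for label, dists in groups.items():
        n = len(dists)
        # Average the distributions element-wise across the category's inputs.
        average = [sum(column) / n for column in zip(*dists)]
        result[label] = entropy(average)
    return result


def max_entropy_term(labels, generated, sharpness=10.0):
    """Log-sum-exp approximation of the largest per-category entropy."""
    entropies = list(per_category_entropy(labels, generated).values())
    m = max(entropies)
    total = sum(math.exp(sharpness * (h - m)) for h in entropies)
    return m + math.log(total) / sharpness


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    generated = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.5, 0.5]]
    print(per_category_entropy(labels, generated))
    print(max_entropy_term(labels, generated))
```

In this toy batch, category 0 is classified with full confidence and category 1 is maximally uncertain, so the term is dominated by category 1, the category the claims describe as receiving the training bias.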