Method and system for optimizing accuracy-specificity trade-offs in large scale visual recognition
US-9158965-B2 · Oct 13, 2015 · US
US9928448B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9928448-B1 |
| Application number | US-201615273872-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 23, 2016 |
| Priority date | Sep 23, 2016 |
| Publication date | Mar 27, 2018 |
| Grant date | Mar 27, 2018 |
Official abstract text for this publication.
A method includes utilizing two or more classifiers to calculate, for an input image, probability scores for a plurality of classes based on visual information extracted from the input image and semantic relationships in a classification hierarchy, wherein each of the two or more classifiers is associated with a given one of two or more levels in the classification hierarchy with each level in the classification hierarchy comprising a subset of the plurality of classes, and classifying the input image based on the calculated probability scores.
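The arrangement in the abstract, one classifier per hierarchy level scoring the same image, can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the two-level food hierarchy, the linear heads, the weight values, and the rule of multiplying each fine-class probability by its parent's probability are all assumptions made for the example.

```python
import math

# Hypothetical two-level hierarchy: each fine class has exactly one coarse parent.
PARENT = {"apple pie": "dessert", "cheesecake": "dessert",
          "ramen": "noodles", "pad thai": "noodles"}
FINE = list(PARENT)                    # lowest level of the hierarchy
COARSE = sorted(set(PARENT.values()))  # higher level: ["dessert", "noodles"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_head(features, weights):
    """One raw score per class: dot product of the features with each weight row."""
    return [sum(f * w for f, w in zip(features, row)) for row in weights]


def classify(features, coarse_weights, fine_weights):
    """Score the same features with one classifier per hierarchy level."""
    coarse_probs = dict(zip(COARSE, softmax(linear_head(features, coarse_weights))))
    fine_probs = dict(zip(FINE, softmax(linear_head(features, fine_weights))))
    # Assumed combination rule: weight each fine class by its parent's probability.
    joint = {c: fine_probs[c] * coarse_probs[PARENT[c]] for c in FINE}
    label = max(joint, key=joint.get)
    return label, PARENT[label], coarse_probs, fine_probs


# Illustrative numbers only: the fine head alone slightly prefers "ramen",
# but the coarse head is confident the dish is a dessert.
label, parent, coarse_probs, fine_probs = classify(
    [1.0, 0.0],
    coarse_weights=[[2.0, 0.0], [0.0, 2.0]],
    fine_weights=[[1.0, 0.0], [1.2, 0.0], [1.3, 0.0], [0.0, 1.0]],
)
print(label, parent)  # cheesecake dessert
```

In this toy case the coarse-level score overrides a narrow fine-level preference, which is one way semantic relationships in the hierarchy could influence the final classification.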
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: utilizing two or more classifiers to calculate, for an input image, probability scores for respective subsets of a plurality of classes based on visual information extracted from the input image and semantic relationships in a classification hierarchy, wherein each of the two or more classifiers is associated with a given one of two or more levels in the classification hierarchy with each level in the classification hierarchy comprising a subset of the plurality of classes; and classifying the input image based on the calculated probability scores; wherein utilizing the two or more classifiers to calculate the probability scores comprises training the two or more classifiers by utilizing a deep learning neural network, the deep learning neural network comprising a first set of levels of different feature sets and a second set of levels corresponding to the two or more levels in the classification hierarchy.
2. The method of claim 1, wherein utilizing the two or more classifiers to calculate the probability scores further comprises performing label inference to refine classification probabilities in the two or more classifiers based on semantic relationships in the classification hierarchy.
3. The method of claim 1, wherein utilizing the two or more classifiers to calculate the probability scores further comprises using an objective function combining recognition results for the second set of levels in the deep learning neural network.
4. The method of claim 1, wherein the two or more classifiers share feature representation based on the first set of levels in the deep learning neural network.
5. The method of claim 2, wherein the classification hierarchy comprises a tree structure and performing label inference to refine the classification probabilities in the two or more classifiers comprises, for two or more leaf nodes having classification probabilities within a designated threshold from one another, adjusting the classification probabilities for the two or more leaf nodes based on classification probabilities for parent nodes in a higher level of the classification hierarchy relative to the two or more leaf nodes.
6. The method of claim 2, wherein the classification hierarchy comprises a tree structure and performing label inference to refine the classification probabilities in the two or more classifiers comprises, for two or more parent nodes having classification probabilities within a designated threshold from one another, adjusting the classification probabilities for the two or more parent nodes based on the classification probabilities for two or more leaf nodes corresponding to the two or more parent nodes, wherein the two or more parent nodes are in a higher level of the classification hierarchy relative to the two or more leaf nodes.
7. The method of claim 2, wherein performing label inference to refine the classification probabilities in the two or more classifiers comprises: taking as input a graph structure having initial values for nodes corresponding to classification probabilities in the two or more classifiers; and outputting the graph structure with modified values for the nodes.
8. The method of claim 1, wherein training the two or more classifiers further comprises utilizing a multi-task learning based loss function on top of the deep learning neural network that jointly optimizes classifiers associated with each of the two or more levels in the classification hierarchy.
9. The method of claim 8, wherein the multi-task learning based loss function utilizes a tradeoff parameter that adjusts contributions of fine-grained classifications from a lowest level of the classification hierarchy with semantic relationships among classes at one or more higher levels of the classification hierarchy.
10. The method of claim 8, wherein the multi-task learning based loss function trains the two or more classifiers such that misclassification of the input image based on the calculated probability scores falls within a semantically-related category of classes for a correct classification of the input image.
11. The method of claim 2, wherein performing label inference to refine the classification probabilities in the two or more classifiers comprises utilizing a random walk process that smooths classification probabilities over two or more classes in a same semantic path in the classification hierarchy.
12. The method of claim 1, wherein hierarchical relationships among the plurality of classes in the classification hierarchy are at least one of: obtained from a semantic data store; and learned using natural language processing.
13. The method of claim 1, further comprising: capturing the input image using a mobile device; and utilizing the classification of the input image to obtain additional information related to the input image.
14. The method of claim 13, wherein at least one of: the input image comprises a food dish and the additional information comprises nutritional information relating to the food dish; and the input image comprises a product and the additional information comprises information relating to ordering information for the product.
15. The method of claim 1, wherein the two or more classifiers are provided as software-as-a-service in a cloud environment.
16. The method of claim 1, wherein the two or more classifiers are provided as an on-demand self-service in a cloud environment.
17. A computer program product comprising a computer readable storage medium for storing computer readable program code which, when executed, causes a computer: to utilize two or more classifiers to calculate, for an input image, probability scores for a plurality of classes based on visual information extracted from the input image and semantic relationships in a classification hierarchy, wherein each of the two or more classifiers is associated with a given one of two or more levels in the classification hierarchy with each level in the classification hierarchy comprising a subset of the plurality of classes; and to classify the input image based on the calculated probability scores; wherein the utilization of the two or more classifiers to calculate the probability scores comprises a training of the two or more classifiers by utilizing a deep learning neural network, the deep learning neural network comprising a first set of levels of different feature sets and a second set of levels corresponding to the two or more levels in the classification hierarchy.
18. An apparatus comprising: a memory; and a processor coupled to the memory and configured: to utilize two or more classifiers to calculate, for an input image, probability scores for a plurality of classes based on visual information extracted from the input image and semantic relationships in a classification hierarchy, wherein each of the two or more classifiers is associated with a given one of two or more levels in the classification hierarchy with each level in the classification hierarchy comprising a subset of the plurality of classes; and to classify the input image based on the calculated probability scores; wherein, in utilizing the two or more classifiers to calculate the probability scores, the processor is configured to train the two or more classifiers by utilizing a deep learning neural network, the deep learning neural network comprising a first set of levels of different feature sets and a second set of levels corresponding to the two or more levels in the classification hierarchy.
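Two ideas from the dependent claims lend themselves to a short sketch: the multi-task loss with a tradeoff parameter (claims 8 and 9) and label inference that takes a graph of initial node values and returns modified values (claims 7 and 11). The code below is only one plausible reading. The convex combination of per-level cross-entropy terms, the random walk with restart, the `restart` and `steps` parameters, and the example hierarchy and scores are all assumptions, not details taken from the claims.

```python
import math


def cross_entropy(probs, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_index])


def multitask_loss(fine_probs, fine_true, coarse_probs, coarse_true, tradeoff=0.5):
    """Assumed form: weighted sum of per-level losses. `tradeoff` in [0, 1]
    shifts weight from the fine (leaf) level toward the coarse (parent) level."""
    fine_term = cross_entropy(fine_probs, fine_true)
    coarse_term = cross_entropy(coarse_probs, coarse_true)
    return (1.0 - tradeoff) * fine_term + tradeoff * coarse_term


def random_walk_smooth(scores, edges, restart=0.5, steps=20):
    """Random walk with restart over the hierarchy graph (an assumed variant).

    scores: node -> initial value (the classifiers' probabilities).
    edges: list of (parent, child) pairs forming a tree.
    Returns node -> modified value; nodes on the same path reinforce each other.
    """
    neighbors = {node: [] for node in scores}
    for parent, child in edges:
        neighbors[parent].append(child)
        neighbors[child].append(parent)
    current = dict(scores)
    for _ in range(steps):
        updated = {}
        for node in scores:
            # Mass arriving from each neighbor, split evenly over that neighbor's edges.
            incoming = sum(current[n] / len(neighbors[n]) for n in neighbors[node])
            updated[node] = restart * scores[node] + (1.0 - restart) * incoming
        current = updated
    return current


# Hypothetical scores: two leaves are nearly tied, but their parents are not.
scores = {"dessert": 0.88, "noodles": 0.12,
          "apple pie": 0.25, "cheesecake": 0.31, "ramen": 0.34, "pad thai": 0.10}
edges = [("dessert", "apple pie"), ("dessert", "cheesecake"),
         ("noodles", "ramen"), ("noodles", "pad thai")]
smoothed = random_walk_smooth(scores, edges)
print(smoothed["cheesecake"] > smoothed["ramen"])  # True
```

In this toy run the near-tie between "cheesecake" and "ramen" is resolved in favor of the leaf whose parent scored higher, which is the kind of adjustment claim 5 describes for leaf nodes within a threshold of one another. Setting `tradeoff` to 0 in `multitask_loss` reduces it to the fine-level loss alone, and setting it to 1 leaves only the coarse-level term.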
Classification techniques · CPC title
using classification, e.g. of video objects · CPC title
based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title
relating to the classification model, e.g. parametric or non-parametric approaches · CPC title
relating to the decision surface · CPC title