Two dimensional to three dimensional moving image converter
US-12058306-B1 · Aug 6, 2024 · US
US9075824B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9075824-B2 |
| Application number | US-201213458183-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 27, 2012 |
| Priority date | Apr 27, 2012 |
| Publication date | Jul 7, 2015 |
| Grant date | Jul 7, 2015 |
Official abstract text for this publication.
An instance-level retrieval method and system are provided. A representation of a query image is embedded in a multi-dimensional space using a learned projection. The projection is learned using category-labeled training data to optimize a classification rate on the training data. The joint learning of the projection and the classifiers improves the computation of similarity/distance between images by embedding them in a subspace where the similarity computation outputs more accurate results. An input query image can thus be used to retrieve similar instances in a database by computing the comparison measure in the embedding space.
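The retrieval step described in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the patented implementation: the projection `U` below is a random placeholder standing in for a learned projection, the function name `retrieve` and all sizes are hypothetical, and the L2 normalization of the embedded vectors is an added assumption so that the dot product behaves like a cosine similarity.

```python
import numpy as np

def retrieve(U, query, database, top_k=3):
    """Rank database items by dot product with the query in the embedding space.

    U        : (R, D) projection matrix (assumed to have been learned elsewhere)
    query    : (D,)   original representation of the query image
    database : (N, D) original representations of the database images
    """
    q_emb = U @ query                      # embed the query, shape (R,)
    db_emb = database @ U.T                # embed every database item, shape (N, R)
    # Assumption of this sketch: L2-normalize so the dot product is a cosine.
    q_emb = q_emb / np.linalg.norm(q_emb)
    db_emb = db_emb / np.linalg.norm(db_emb, axis=1, keepdims=True)
    scores = db_emb @ q_emb                # comparison measure per database item
    order = np.argsort(-scores)[:top_k]    # indices of the best matches first
    return order, scores[order]

rng = np.random.default_rng(0)
D, R, N = 64, 8, 100                       # hypothetical sizes, with R < D
U = rng.standard_normal((R, D)) / np.sqrt(D)   # placeholder, not a learned projection
database = rng.standard_normal((N, D))
query = database[17].copy()                # query is an exact copy of item 17

idx, top_scores = retrieve(U, query, database)
print(idx, top_scores)
```

Because the query duplicates database item 17, that item ranks first even under a random projection; a learned projection is meant to make the ranking meaningful for near-duplicates and other views of the same instance.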
Opening claim text (preview).
What is claimed is:
1. A retrieval method comprising: learning a projection for embedding an original image representation in an embedding space, the original image representation being based on features extracted from the image, the projection being learned from category-labeled training data to optimize a classification rate on the training data, the learning of the projection including, for a plurality of iterations: selecting a sample from the training data; embedding the sample with a current projection; scoring the embedded sample with current first and second classifiers, the first classifier corresponding to a category of the label of the sample, the second classifier corresponding to a different category, selected from a set of categories; updating the current projection and at least one of the current first and second classifiers for iterations where the second classifier generates a higher score than the first classifier, the updated projection serving as the current projection for a subsequent iteration, each of the updated classifiers serving as the current classifier for the respective category for a subsequent iteration; and storing one of the updated projections as the learned projection; and with a processor, for each of a plurality of database images, computing a comparison measure between a query image and the database image, the comparison measure being computed in the embedding space, respective original image representations of the query image and the database image being embedded in the embedding space with the projection; and providing for retrieving at least one of the database images based on the comparison.
2. The method of claim 1, wherein the learning of the projection is performed jointly with learning a respective classifier for each of a set of the categories.
3. The method of claim 1, wherein the learning of the projection includes optimizing an objective function which sums, over a set of samples and categories, a function of a score of the sample on the classifier corresponding to its category and a score of the sample on a classifier not corresponding to its category.
4. The method of claim 3, wherein the learning of the projection includes optimizing the objective function with stochastic gradient descent.
5. The method of claim 1, wherein the classifiers are updated as a function of the embedded sample and a learning rate.
6. The method of claim 1, wherein the classifiers are updated according to the expressions: $w_{y^+} \leftarrow w_{y^+} + \eta U q$ and $w_{y^-} \leftarrow w_{y^-} - \eta U q$, where $\eta$ represents a learning rate, $Uq$ represents the sample embedded with the projection, $w_{y^+}$ represents the first classifier and $w_{y^-}$ represents the second classifier.
7. The method of claim 1, wherein the projection is updated as a function of the first and second classifiers and a learning rate.
8. The method of claim 1, wherein the projection is updated according to the expression: $U \leftarrow U + \eta (w_{y^+} - w_{y^-}) q'$, where $\eta$ represents a learning rate, $U$ represents the projection matrix, $q$ is a feature-based representation of the sample, $w_{y^+}$ represents the first classifier and $w_{y^-}$ represents the second classifier.
9. The method of claim 1, further comprising generating an original representation of the query image based on the extracted features and wherein the computing of the comparison between the query image and the database image comprises embedding the original representation of the query image with the projection matrix.
10. The method of claim 1, wherein the original representation comprises a statistical representation of the extracted features.
11. The method of claim 10, wherein the original representation comprises at least one of a Fisher Vector and a Bag-of-Visual-words representation.
12. The method of claim 1, wherein the original representation is of higher dimensionality than the embedded representation.
13. The method of claim 1, wherein the comparison measure is a distance measure and the computing of the distance measure includes computing a dot product between the query image and the database image embedded in the embedding space.
14. The method of claim 1, wherein the projection comprises a projection matrix.
15. The method of claim 1, further comprising outputting at least one of: at least one of the retrieved images, and a decision based on at least one of the retrieved images.
16. The method of claim 15, wherein the decision is used for at least one of duplicate removal and copy detection.
17. A computer program product comprising a non-transitory recording medium storing instructions which, when executed by a computer, perform the method of claim 1.
18. A system comprising memory which stores instructions for performing the method of claim 1 and a processor in communication with the memory which implements the instructions.
19. A retrieval method comprising: with a processor, learning a projection for embedding an original image representation in an embedding space, the original image representation being based on features extracted from the image, the projection being learned from category-labeled training data to optimize a classification rate on the training data, the learning of the projection including optimizing an objective function of the form: $\sum_{(q, y^+, y^-)} \min\{0,\; t - s(q, y^+) + s(q, y^-)\}$, where $t$ represents a predetermined threshold, $q$ represents a sample, $s(q, y^+)$ represents a score of the sample on the classifier corresponding to its category and $s(q, y^-)$ represents a score of the sample on a classifier not corresponding to its category; and for each of a plurality of database images, computing a comparison measure between a query image and the database image, the comparison measure being computed in the embedding space, respective original image representations of the query image and the database image being embedded in the embedding space with the projection; and providing for retrieving at least one of the database images based on the comparison.
20. The method of claim 19, wherein the learning of the projection includes, for a plurality of iterations: selecting a sample from the training data; embedding the sample with a current projection; scoring the embedded sample with current first and second classifiers, the first classifier corresponding to a category of the label of the sample, the second classifier corresponding to a different category, selected from a set of categories; and updating the current projection and at least one of the current first and second classifiers for iterations where the second classifier generates a higher score than the first classifier, the updated projection serving as the current projection for a subsequent iteration, each of the updated classifiers serving as the current classifier for the respective category for a subsequent iteration; and storing an updated projection as the learned projection.
21. A retrieval system comprising: memory which stores: a projection matrix for embedding image features in an embedding space, the projection matrix having been learned from category-labeled training data to optimize a classification rate on the training data, including, for a plurality of iterations: selecting a sample from the training data; embedding the sample with a current projection; scoring the embedded sample with current first and second classifiers, the first classifier corresponding to a category of the label of the sample, the second class
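The iterative learning recited in the claims (select a sample, embed it, score it with two classifiers, update on a violation) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patentee's implementation: it uses the update expressions of claims 6 and 8, while the function name `learn_projection`, the synthetic data, the hyperparameter values, the initialization, and the use of a margin `t` in the update test are all assumptions made for illustration. Setting `t=0` gives the plain "second classifier scores higher" test of claim 1, although with the zero-initialized classifiers used here that setting would never trigger an update.

```python
import numpy as np

def learn_projection(X, y, n_classes, dim, lr=0.1, t=1.0, n_iters=5000, seed=0):
    """Jointly learn a projection U (dim x D) and per-category classifiers W."""
    rng = np.random.default_rng(seed)
    n, D = X.shape
    U = rng.standard_normal((dim, D)) / np.sqrt(D)   # initial projection (assumed init)
    W = np.zeros((n_classes, dim))                   # one linear classifier per category
    for _ in range(n_iters):
        i = rng.integers(n)                          # select a training sample
        q, pos = X[i], y[i]
        # Pick a different category uniformly at random.
        neg = (pos + rng.integers(1, n_classes)) % n_classes
        Uq = U @ q                                   # embed with the current projection
        if W[pos] @ Uq - W[neg] @ Uq < t:            # margin violated: update
            diff = W[pos] - W[neg]                   # uses classifiers before their update
            W[pos] += lr * Uq                        # claim 6, first classifier
            W[neg] -= lr * Uq                        # claim 6, second classifier
            U += lr * np.outer(diff, q)              # claim 8, projection
    return U, W

# Synthetic, well-separated data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
n_classes, D, dim = 3, 20, 5
means = 3.0 * rng.standard_normal((n_classes, D))
y = rng.integers(n_classes, size=300)
X = means[y] + rng.standard_normal((300, D))
X /= np.linalg.norm(X, axis=1, keepdims=True)        # unit-norm inputs (assumption)

U, W = learn_projection(X, y, n_classes, dim)
pred = np.argmax(X @ U.T @ W.T, axis=1)              # classify in the embedding space
accuracy = float(np.mean(pred == y))
print(f"training accuracy: {accuracy:.2f}")
```

After training, only `U` is needed for retrieval; the classifiers `W` serve to shape the embedding during learning and can be discarded.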
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
based on approximation criteria, e.g. principal component analysis · CPC title
using metadata automatically derived from the content · CPC title
Physics · mapped topic