Retrieval system and method leveraging category-level labels

US9075824B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9075824-B2
Application number  US-201213458183-A
Country             US
Kind code           B2
Filing date         Apr 27, 2012
Priority date       Apr 27, 2012
Publication date    Jul 7, 2015
Grant date          Jul 7, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An instance-level retrieval method and system are provided. A representation of a query image is embedded in a multi-dimensional space using a learned projection. The projection is learned using category-labeled training data to optimize a classification rate on the training data. The joint learning of the projection and the classifiers improves the computation of similarity/distance between images by embedding them in a subspace where the similarity computation outputs more accurate results. An input query image can thus be used to retrieve similar instances in a database by computing the comparison measure in the embedding space.
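
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes the learned projection is a matrix `U` applied to fixed-length image representations, and that the comparison measure is a dot product in the embedding space (as in claim 13). The L2 normalization of the embedded vectors, the function names `embed` and `retrieve`, and the random data are assumptions made for illustration only; they are not taken from the patent.

```python
import numpy as np

def embed(U, x):
    """Embed representation(s) x, shape (D,) or (N, D), with projection U, shape (R, D).

    The L2 normalization is an assumption of this sketch, not part of the abstract.
    """
    z = x @ U.T
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def retrieve(U, query, database, top_k=3):
    """Rank database images by dot product with the query in the embedding space."""
    q = embed(U, query)        # shape (R,)
    db = embed(U, database)    # shape (N, R)
    scores = db @ q            # one comparison measure per database image
    order = np.argsort(-scores, kind="stable")[:top_k]
    return order, scores[order]

# Hypothetical data: 10 database images with 16-dimensional original
# representations, embedded into an 8-dimensional space.
rng = np.random.default_rng(0)
U = rng.standard_normal((8, 16))
database = rng.standard_normal((10, 16))
order, scores = retrieve(U, database[7], database)
```

Because the query here is a copy of database entry 7, that entry is ranked first. In practice `U` would come from the learning procedure recited in the claims rather than from a random generator.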

First claim

Opening claim text (preview).

What is claimed is:

1. A retrieval method comprising: learning a projection for embedding an original image representation in an embedding space, the original image representation being based on features extracted from the image, the projection being learned from category-labeled training data to optimize a classification rate on the training data, the learning of the projection including, for a plurality of iterations: selecting a sample from the training data; embedding the sample with a current projection; scoring the embedded sample with current first and second classifiers, the first classifier corresponding to a category of the label of the sample, the second classifier corresponding to a different category, selected from a set of categories; updating the current projection and at least one of the current first and second classifiers for iterations where the second classifier generates a higher score than the first classifier, the updated projection serving as the current projection for a subsequent iteration, each of the updated classifiers serving as the current classifier for the respective category for a subsequent iteration; and storing one of the updated projections as the learned projection; and with a processor, for each of a plurality of database images, computing a comparison measure between a query image and the database image, the comparison measure being computed in the embedding space, respective original image representations of the query image and the database image being embedded in the embedding space with the projection; and providing for retrieving at least one of the database images based on the comparison.

2. The method of claim 1, wherein the learning of the projection is performed jointly with learning a respective classifier for each of a set of the categories.

3. The method of claim 1, wherein the learning of the projection includes optimizing an objective function which sums, over a set of samples and categories, a function of a score of the sample on the classifier corresponding to its category and a score of the sample on a classifier not corresponding to its category.

4. The method of claim 3, wherein the learning of the projection includes optimizing the objective function with stochastic gradient descent.

5. The method of claim 1, wherein the classifiers are updated as a function of the embedded sample and a learning rate.

6. The method of claim 1, wherein the classifiers are updated according to the expressions: $w_{y^+} \leftarrow w_{y^+} + \eta U q$ and $w_{y^-} \leftarrow w_{y^-} - \eta U q$, where $\eta$ represents a learning rate, $Uq$ represents the sample embedded with the projection, $w_{y^+}$ represents the first classifier and $w_{y^-}$ represents the second classifier.

7. The method of claim 1, wherein the projection is updated as a function of the first and second classifiers and a learning rate.

8. The method of claim 1, wherein the projection is updated according to the expression: $U \leftarrow U + \eta (w_{y^+} - w_{y^-}) q'$, where $\eta$ represents a learning rate, $U$ represents the projection matrix, $q$ is a feature-based representation of the sample, $w_{y^+}$ represents the first classifier and $w_{y^-}$ represents the second classifier.

9. The method of claim 1, further comprising generating an original representation of the query image based on the extracted features and wherein the computing of the comparison between the query image and the database image comprises embedding the original representation of the query image with the projection matrix.

10. The method of claim 1, wherein the original representation comprises a statistical representation of the extracted features.

11. The method of claim 10, wherein the original representation comprises at least one of a Fisher Vector and a Bag-of-Visual-words representation.

12. The method of claim 1, wherein the original representation is of higher dimensionality than the embedded representation.

13. The method of claim 1, wherein the comparison measure is a distance measure and the computing of the distance measure includes computing a dot product between the query image and the database image embedded in the embedding space.

14. The method of claim 1, wherein the projection comprises a projection matrix.

15. The method of claim 1, further comprising outputting at least one of: at least one of the retrieved images, and a decision based on at least one of the retrieved images.

16. The method of claim 15, wherein the decision is used for at least one of duplicate removal and copy detection.

17. A computer program product comprising a non-transitory recording medium storing instructions which, when executed by a computer, perform the method of claim 1.

18. A system comprising memory which stores instructions for performing the method of claim 1 and a processor in communication with the memory which implements the instructions.

19. A retrieval method comprising: with a processor, learning a projection for embedding an original image representation in an embedding space, the original image representation being based on features extracted from the image, the projection being learned from category-labeled training data to optimize a classification rate on the training data, the learning of the projection including optimizing an objective function of the form: $\sum_{(q, y^+, y^-)} \min\{0,\; t - s(q, y^+) + s(q, y^-)\}$, where $t$ represents a predetermined threshold, $q$ represents a sample, $s(q, y^+)$ represents a score of the sample on the classifier corresponding to its category and $s(q, y^-)$ represents a score of the sample on a classifier not corresponding to its category; and for each of a plurality of database images, computing a comparison measure between a query image and the database image, the comparison measure being computed in the embedding space, respective original image representations of the query image and the database image being embedded in the embedding space with the projection; and providing for retrieving at least one of the database images based on the comparison.

20. The method of claim 19, wherein the learning of the projection includes, for a plurality of iterations: selecting a sample from the training data; embedding the sample with a current projection; scoring the embedded sample with current first and second classifiers, the first classifier corresponding to a category of the label of the sample, the second classifier corresponding to a different category, selected from a set of categories; and updating the current projection and at least one of the current first and second classifiers for iterations where the second classifier generates a higher score than the first classifier, the updated projection serving as the current projection for a subsequent iteration, each of the updated classifiers serving as the current classifier for the respective category for a subsequent iteration; and storing an updated projection as the learned projection.

21. A retrieval system comprising: memory which stores: a projection matrix for embedding image features in an embedding space, the projection matrix having been learned from category-labeled training data to optimize a classification rate on the training data, including, for a plurality of iterations: selecting a sample from the training data; embedding the sample with a current projection; scoring the embedded sample with current first and second classifiers, the first classifier corresponding to a category of the label of the sample, the second class
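
The iterative learning recited in claims 1, 6, and 8 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `sgd_step` and `train`, the random initialization, the uniform sampling of the sample and of the competing category, and the default learning rate are all assumptions. The threshold `t` loosely follows claim 19; with `t = 0` the update condition reduces to the one in claim 1 (the second classifier scores higher than the first).

```python
import numpy as np

def sgd_step(U, W, q, y_pos, y_neg, eta=0.1, t=0.0):
    """One iteration: embed, score, and update in place on a violation.

    U: (R, D) projection, W: (C, R) one classifier per category,
    q: (D,) original representation of the sample.
    Returns True if an update was made.
    """
    e = U @ q                       # sample embedded with the current projection
    s_pos = W[y_pos] @ e            # score on the classifier of the sample's category
    s_neg = W[y_neg] @ e            # score on a classifier of a different category
    if s_neg + t <= s_pos:          # no violation: leave U and W unchanged
        return False
    d = W[y_pos] - W[y_neg]         # uses the classifiers before their update
    W[y_pos] += eta * e             # w_{y+} <- w_{y+} + eta * U q
    W[y_neg] -= eta * e             # w_{y-} <- w_{y-} - eta * U q
    U += eta * np.outer(d, q)       # U <- U + eta * (w_{y+} - w_{y-}) q'
    return True

def train(X, y, n_classes, dim, eta=0.1, t=0.0, n_iters=2000, seed=0):
    """Learn U and W jointly from category-labeled data (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((dim, X.shape[1]))   # assumed initialization
    W = 0.1 * rng.standard_normal((n_classes, dim))    # assumed initialization
    for _ in range(n_iters):
        i = rng.integers(len(X))                       # select a sample
        y_pos = int(y[i])
        others = [c for c in range(n_classes) if c != y_pos]
        y_neg = int(rng.choice(others))                # pick a different category
        sgd_step(U, W, X[i], y_pos, y_neg, eta, t)
    return U, W
```

After training, only `U` is needed for retrieval; the classifiers `W` serve to shape the embedding space and can be discarded.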

Assignees

Inventors

Classifications

  • characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title

  • based on approximation criteria, e.g. principal component analysis · CPC title

  • G06F16/583 · Primary

    using metadata automatically derived from the content · CPC title

  • Physics · mapped topic


Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9075824B2 cover?
An instance-level retrieval method and system are provided. A representation of a query image is embedded in a multi-dimensional space using a learned projection. The projection is learned using category-labeled training data to optimize a classification rate on the training data. The joint learning of the projection and the classifiers improves the computation of similarity/distance between im…
Who is the assignee on this patent?
Gordo Albert, Rodriguez Serrano Jose Antonio, Perronnin Florent, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F16/583. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 7, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).