Learning beautiful and ugly visual attributes

US9082047B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9082047-B2
Application number: US-201313971092-A
Country: US
Kind code: B2
Filing date: Aug 20, 2013
Priority date: Aug 20, 2013
Publication date: Jul 14, 2015
Grant date: Jul 14, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for learning visual attribute labels for images includes, from textual comments associated with a corpus of images, identifying a set of candidate textual labels that are predictive of aesthetic scores associated with images in the corpus. The candidate labels in the set are clustered into a plurality of visual attribute clusters based on similarity and each of the clusters assigned a visual attribute label. For each of the visual attribute labels, a classifier is trained using visual representations of images in the corpus and respective visual attribute labels. The visual attribute labels are evaluated, based on performance of the trained classifier. A subset of the visual attribute labels is retained, based on the evaluation. The visual attribute labels can be used in processes such as image retrieval, image labeling, and the like.
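The final evaluate-and-retain step of the abstract can be sketched as follows. The abstract does not say which performance measure or cutoff is used, so the AUC metric, the 0.7 threshold, the attribute names, and the held-out scores below are all assumptions made up for illustration, not details from the patent.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def retain_labels(held_out, threshold):
    """Keep only the attribute labels whose classifier meets the threshold."""
    performance = {
        label: auc(scores, truth) for label, (scores, truth) in held_out.items()
    }
    return {label: perf for label, perf in performance.items() if perf >= threshold}


# Hypothetical held-out results: label -> (classifier scores, ground truth).
held_out = {
    "nice colors": ([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]),  # separates well
    "too dark": ([0.4, 0.6, 0.5, 0.5], [1, 0, 1, 0]),     # worse than chance
}

retained = retain_labels(held_out, threshold=0.7)  # assumed cutoff
print(retained)
```

With these made-up numbers only the first label survives. A label whose classifier cannot be learned from visual features is dropped, which is the sense in which the retained labels are visual attributes.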

First claim

Opening claim text (preview).

What is claimed is:

1. A method for learning visual attribute labels for images comprising: from textual comments associated with a corpus of images, identifying a set of candidate textual labels that are predictive of aesthetic scores associated with images in the corpus; clustering the candidate labels in the set into a plurality of visual attribute clusters based on similarity and assigning each of the clusters a visual attribute label; for each of the visual attribute labels, training a visual attribute classifier using visual representations of images in the corpus and respective visual attribute labels; evaluating the visual attribute labels based on performance of the trained visual attribute classifiers, the evaluating comprising comparing performance of each of the visual attribute classifiers with a predefined threshold; and retaining a subset of the visual attribute labels based on the evaluation, including retaining the visual attribute labels for the visual attribute classifiers that meet the performance threshold; wherein at least one of the identifying a set of candidate textual labels, clustering the candidate labels, training the classifier, and evaluating the classifier performance is performed with a processor.

2. The method of claim 1, wherein the identifying of the set of candidate textual labels comprises, for each image, generating a text-based representation of a set of textual features from which the candidate labels are selected.

3. The method of claim 2, wherein the text-based representation of the image is based on the textual features extracted from textual comments associated with the image.

4. The method of claim 3, wherein the identifying of the set of candidate textual labels comprises, for each image in the corpus, generating a document based on textual comments associated with the image in which stop words and punctuation have been removed to generate a sequence of words, and from the sequence of words, extracting a set of textual features, the text-based representation being based on occurrence of the textual features in the document.

5. The method of claim 2, wherein each of the textual features in the text-based representation corresponds to a respective sequence of at least one word extracted from the textual comments associated with the corpus of images.

6. The method of claim 5, wherein at least some of the textual features represented in the text-based representation each correspond to a bigram.

7. The method of claim 2, wherein the identifying of the set of candidate textual labels comprises: optimizing a regression function that outputs a regression coefficient for each the textual features represented in the text-based representation; and ranking at least some of the textual features based on the regression coefficients.

8. The method of claim 7, wherein the regression function is an Elastic Net.

9. The method of claim 1, wherein the assigning each of the clusters a visual attribute label comprises selecting one of the textual features assigned to the cluster as the visual attribute label.

10. The method of claim 1, wherein each of the visual representations comprises a statistical representation of low level features extracted from patches of the respective image.

11. The method of claim 1, further comprising, with the trained classifiers, assigning visual attribute labels to a query image based on a visual representation of the query image.

12. The method of claim 1, further comprising receiving one of the retained visual attribute labels as a query and retrieving images from a collection of images that are labeled with visual attribute labels selected from the set of visual attribute labels.

13. A computer program product comprising a non-transitory storage medium which stores instructions, which when executed by a computer, performs the method of claim 1.

14. A system comprising memory which stores instructions for performing the method of claim 1 and a processor in communication with the memory for executing the instructions.

15. A method for learning visual attribute labels for images comprising: from textual comments associated with a corpus of images, identifying a set of candidate textual labels that are predictive of aesthetic scores associated with images in the corpus, the identifying comprising, for each image, generating a text-based representation of a set of textual features from which the candidate labels are selected, optimizing a regression function that outputs a regression coefficient for each the textual features represented in the text-based representation, the optimizing of the regression function is being based on the text-based representations for images in the corpus and respective aesthetic scores for the images, and ranking at least some of the textual features based on the regression coefficients; clustering the candidate labels in the set into a plurality of visual attribute clusters based on similarity and assigning each of the clusters a visual attribute label; for each of the visual attribute labels, training a classifier using visual representations of images in the corpus and respective visual attribute labels; evaluating the visual attribute labels based on performance of the trained classifier; and retaining a subset of the visual attribute labels based on the evaluation; wherein at least one of the identifying a set of candidate textual labels, clustering the candidate labels, training the classifier, and evaluating the classifier performance is performed with a processor.

16. The method of claim 15, wherein the evaluating the visual attribute labels based on a performance criterion comprises comparing performance of each of the visual attribute classifiers with a predefined threshold and the retaining includes retaining the visual attribute labels for the visual attribute classifiers that meet the performance threshold.

17. A method for learning visual attribute labels for images comprising: from textual comments associated with a corpus of images, identifying a set of candidate textual labels that are predictive of aesthetic scores associated with images in the corpus, the identifying comprising, for each image, generating a text-based representation of a set of textual features from which the candidate labels are selected, optimizing a regression function that outputs a regression coefficient for each the textual features represented in the text-based representation, identifying a first group of positive regression coefficients and a second group of negative regression coefficients, the ranking of the at least some of the textual features comprising separately ranking the first and second groups, and ranking of the at least some of the textual features based on the regression coefficients, comprising separately ranking the first and second groups; clustering the candidate labels in the set into a plurality of visual attribute clusters based on similarity and assigning each of the clusters a visual attribute label; for each of the visual attribute labels, training a classifier using visual representations of images in the corpus and respective visual attribute labels; evaluating the visual attribute labels based on performance of the trained classifier; and retaining a subset of the visual attribute labels based on the evaluation; wherein at least one of the identifying a set of candidate textual labels, clustering the candidate labels, training the classifier, and evaluating the classifier performance is performed with a processor.
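Claims 4 to 8 and 17 outline how candidate labels are found: comments become a stop-word-free word sequence, unigrams and bigrams are extracted as occurrence features, an Elastic Net regresses the aesthetic score on those features, and positive and negative coefficients are ranked separately. The following is a minimal sketch of that flow under stated assumptions, not the patented implementation. The stop-word list, example comments, scores, hyperparameters (`lam`, `l1_ratio`, `sweeps`), and the plain coordinate-descent solver are illustrative choices that do not come from the claims.

```python
import re

# Tiny illustrative stop-word list (an assumption, not from the patent).
STOP_WORDS = {"the", "a", "is", "are", "and", "of", "this", "i", "it", "very"}


def text_features(comments):
    """Unigrams and bigrams of one image's comments, minus punctuation and stop words."""
    text = " ".join(comments).lower()
    words = [w for w in re.findall(r"[a-z']+", text) if w not in STOP_WORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return set(words) | set(bigrams)


def soft_threshold(value, amount):
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def elastic_net(X, y, lam=0.05, l1_ratio=0.5, sweeps=200):
    """Coordinate descent on
    (1/2n)||y - b0 - Xb||^2 + lam*(l1_ratio*|b|_1 + (1 - l1_ratio)/2*|b|_2^2)."""
    n, d = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(d)]
    coef = [0.0] * d
    intercept = sum(y) / n
    residual = [yi - intercept for yi in y]  # y - intercept - X @ coef
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # Correlation with the residual, adding feature j's own part back in.
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam * l1_ratio) / (norm + lam * (1 - l1_ratio))
            delta = new - coef[j]
            if delta:
                residual = [r - c * delta for c, r in zip(col, residual)]
                coef[j] = new
        shift = sum(residual) / n  # unpenalized intercept update
        intercept += shift
        residual = [r - shift for r in residual]
    return coef, intercept


def rank_candidates(vocab, coef):
    """Rank positive and negative coefficients separately, strongest first."""
    positive = sorted(((c, t) for t, c in zip(vocab, coef) if c > 0), reverse=True)
    negative = sorted((c, t) for t, c in zip(vocab, coef) if c < 0)
    return [t for _, t in positive], [t for _, t in negative]


# Hypothetical corpus: (comments on one image, its aesthetic score).
corpus = [
    (["Great colors and nice composition!"], 7.5),
    (["Great colors."], 7.0),
    (["Too dark, blurry."], 3.0),
    (["Too dark and the horizon is tilted"], 3.5),
]
docs = [text_features(comments) for comments, _ in corpus]
vocab = sorted(set().union(*docs))
X = [[1.0 if term in doc else 0.0 for term in vocab] for doc in docs]
y = [score for _, score in corpus]

coef, intercept = elastic_net(X, y)
positive, negative = rank_candidates(vocab, coef)
print("candidate 'beautiful' labels:", positive)
print("candidate 'ugly' labels:", negative)
```

Ranking the two sign groups separately keeps strongly negative terms from being crowded out by positive ones, so both "beautiful" and "ugly" candidates reach the clustering step.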

Assignees

Xerox Corp

Inventors

Classifications

  • based on feedback from supervisors

  • Labelling scene content, e.g. deriving syntactic or semantic representations

  • using classification, e.g. of video objects

  • based on distances to training or reference patterns

  • G06F16/50 (Primary): of still image data

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9082047B2 cover?
A method for learning visual attribute labels for images includes, from textual comments associated with a corpus of images, identifying a set of candidate textual labels that are predictive of aesthetic scores associated with images in the corpus. The candidate labels in the set are clustered into a plurality of visual attribute clusters based on similarity and each of the clusters assigned a …
Who is the assignee on this patent?
Xerox Corp
What technology area does this patent fall under?
Primary CPC classification G06F16/50. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 14, 2015 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).