Image-based faceted system and method

US9411829B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9411829-B2
Application number: US-201313913943-A
Country: US
Kind code: B2
Filing date: Jun 10, 2013
Priority date: Jun 10, 2013
Publication date: Aug 9, 2016
Grant date: Aug 9, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein is a system and method that facilitate searching and/or browsing of images by clustering, or grouping, the images into a set of image clusters using facets, such as without limitation visual properties or visual characteristics, of the images, and representing each image cluster by a representative image selected for the image cluster. A map-reduce based probabilistic topic model may be used to identify one or more images belonging to each image cluster and update model parameters.
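The final step in the abstract, assigning images to clusters and choosing one representative per cluster, can be illustrated with a minimal sketch. It assumes each image already has a list of cluster membership probabilities, and it assumes the representative is the image with the highest membership probability in its cluster; the abstract does not state the selection rule. The function name and sample data are hypothetical.

```python
from collections import defaultdict

def assign_and_pick_representatives(memberships):
    """Assign each image to its most probable cluster and pick one
    representative per cluster.

    memberships: dict image_id -> list of cluster membership probabilities.
    Assumption: the representative is the member with the highest
    membership probability (the source does not specify this rule).
    """
    clusters = defaultdict(list)
    for image_id, probs in memberships.items():
        # Index of the largest probability is the assigned cluster.
        best = max(range(len(probs)), key=probs.__getitem__)
        clusters[best].append((probs[best], image_id))
    assignments = {
        image_id: cluster
        for cluster, members in clusters.items()
        for _, image_id in members
    }
    representatives = {
        cluster: max(members)[1] for cluster, members in clusters.items()
    }
    return assignments, representatives

# Hypothetical membership probabilities for three images and two clusters.
memberships = {
    "img_a": [0.9, 0.1],
    "img_b": [0.6, 0.4],
    "img_c": [0.2, 0.8],
}
assignments, representatives = assign_and_pick_representatives(memberships)
```

Other selection rules, such as the image closest to a cluster centroid, would fit the same interface.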

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: representing, by at least one computing device, each of a plurality of images as a plurality of quantized gradient-related feature vectors; building, by the at least one computing device, a vocabulary of words using each image's plurality of quantized gradient-related feature vectors; generating, by the at least one computing device and using the vocabulary, a probabilistic topic model comprising image-specific parameters for each image in the plurality of images and cluster-specific parameters for each of a plurality of clusters, the image-specific parameters and cluster-specific parameters being learned in parallel using a map-reduce architecture; assigning, by the at least one computing device, each image from the plurality of images to a cluster from the plurality of clusters using the image's image-specific parameters and the probabilistic topic model, each cluster corresponding to a word in the vocabulary; and identifying, by the at least one computing device and for each cluster from the plurality of clusters, at least one image assigned to the cluster as a representative image for the cluster.

2. The method of claim 1, the method further comprising: the at least one computing device comprising a number of first computing devices as mappers in the map-reduce architecture and a number of second computing devices as reducers in the map-reduce architecture, the mappers and the reducers operating in parallel; learning the image-specific parameters for each image in the plurality of images processed by a first computing device as a mapper in the map-reduce architecture; and learning the cluster-specific parameters for each cluster in the plurality of clusters processed by a second computing device as a reducer in the map-reduce architecture.

3. The method of claim 2, a first computing device from the number of first computing devices being further configured for use as a reducer and a second computing device from the number of second computing devices being further configured for use as a mapper.

4. The method of claim 2, the method further comprising: each mapper learning an image's image-specific parameters by performing operations comprising: retrieving data associated with the image from a distributed file system; retrieving the cluster-specific parameters from a distributed cache; and learning the image's image-specific parameters using the image's retrieved data and the retrieved cluster-specific parameters; each reducer learning the cluster-specific parameters by performing operations comprising: receiving data from at least one mapper, the data received from each mapper comprising the image's image-specific parameters learned by the mapper; retrieving the cluster-specific parameters from the distributed cache; and making any updates to the cluster-specific parameters learned using the received image-specific parameters and the retrieved cluster-specific parameters.

5. The method of claim 1, the image-specific parameters for an image comprising a probability distribution over the plurality of clusters, the probability distribution comprising a cluster membership probability for each cluster of the plurality of clusters, each cluster membership probability indicating a probability that the image belongs to the cluster.

6. The method of claim 1, the cluster-specific parameters comprising a probability distribution for each cluster over a plurality of visual word vectors, each visual word vector corresponding to an image of the plurality of images, a plurality of visual word vectors determined using the plurality of quantized gradient-related feature vectors determined for images from the plurality of images, the probability distribution for a cluster comprising a probability for each visual word vector of the plurality of visual word vectors, each probability indicating a probability that the visual word vector is related to the cluster.

7. The method of claim 1, the representing each of a plurality of images as a plurality of quantized gradient-related feature vectors further comprising: partitioning an image into a plurality of partitions; extracting gradient feature vectors from each partition of the plurality of partitions; and quantizing the gradient feature vectors using k-means clustering, where k corresponds to a number of words in the vocabulary of words, and each of the quantized gradient feature vectors corresponds to a word in the vocabulary of words.

8. A system comprising: at least one computing device, each computing device comprising a processor and a storage medium for tangibly storing thereon program logic for execution by the processor, the stored program logic comprising: representing logic executed by the processor for representing each of a plurality of images as a plurality of quantized gradient-related feature vectors; building logic executed by the processor for building a vocabulary of words using each image's plurality of quantized gradient-related feature vectors; generating logic executed by the processor for generating, using the vocabulary, a probabilistic topic model comprising image-specific parameters for each image in the plurality of images and cluster-specific parameters for each of a plurality of clusters, the image-specific parameters and cluster-specific parameters being learned in parallel using a map-reduce architecture; assigning logic executed by the processor for assigning each image from the plurality of images to a cluster from the plurality of clusters using the image's image-specific parameters and the probabilistic topic model, each cluster corresponding to a word in the vocabulary; and identifying logic executed by the processor for identifying, for each cluster from the plurality of clusters, at least one image assigned to the cluster as a representative image for the cluster.

9. The system of claim 8: the at least one computing device comprising a number of first computing devices as mappers in the map-reduce architecture and a number of second computing devices as reducers in the map-reduce architecture, the mappers and the reducers operating in parallel; each first computing device's storage medium tangibly storing thereon program logic for execution by the processor, the stored program code comprising learning logic executed by the processor for learning the image-specific parameters for each image in the plurality of images processed by the first computing device as a mapper in the map-reduce architecture; and each second computing device's storage medium tangibly storing thereon program logic for execution by the processor, the stored program code comprising learning logic executed by the processor for learning the cluster-specific parameters for each cluster in the plurality of clusters processed by the second computing device as a reducer in the map-reduce architecture.

10. The system of claim 9, a first computing device from the number of first computing devices being further configured for use as a reducer and a second computing device from the number of second computing devices being configured for use as a mapper.

11. The system of claim 9, the stored program logic further comprising: for each mapper, the learning logic executed by the processor for learning an image's image-specific parameters further comprising: retrieving logic executed by the processor for retrieving data associated with the image from a distributed file system; retrieving logic executed by the processor for retrieving the cluster-specific parameters from a distributed cache; and learning logic executed by the processor for learning the image's image-specific pa

Assignees

Inventors

Classifications

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • with fixed number of clusters, e.g. K-means clustering · CPC title

  • using statistics or function optimisation, e.g. modelling of probability density functions · CPC title

  • using shape and object relationship · CPC title

  • Physics · mapped topic
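The K-means classification listed above corresponds to the quantization step in claim 7, where k equals the number of words in the vocabulary. A minimal sketch of that step follows, assuming gradient feature vectors have already been extracted from each image partition. It uses plain Lloyd's algorithm with random initialization, which is one common way to run k-means and not necessarily the patent's. All names and the toy vectors are hypothetical.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, vector):
    """Index of the closest centroid, i.e. the visual word for this vector."""
    return min(
        range(len(centroids)),
        key=lambda i: squared_distance(centroids[i], vector),
    )

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; k plays the role of the vocabulary size."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if no vector was assigned
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def bag_of_words(image_vectors, centroids):
    """Histogram of visual words for one image."""
    histogram = [0] * len(centroids)
    for v in image_vectors:
        histogram[nearest(centroids, v)] += 1
    return histogram

# Hypothetical 2-D stand-ins for per-partition gradient feature vectors.
features = {
    "img_a": [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]],
    "img_b": [[5.1, 4.9], [4.9, 5.0], [0.0, 0.0]],
}
all_vectors = [v for vectors in features.values() for v in vectors]
vocabulary = kmeans(all_vectors, k=2)
histograms = {name: bag_of_words(vs, vocabulary) for name, vs in features.items()}
```

The resulting histograms are the kind of per-image word counts a topic model could take as input.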

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9411829B2 cover?
Disclosed herein is a system and method that facilitate searching and/or browsing of images by clustering, or grouping, the images into a set of image clusters using facets, such as without limitation visual properties or visual characteristics, of the images, and representing each image cluster by a representative image selected for the image cluster. A map-reduce based probabilistic topic mod…
Who is the assignee on this patent?
Yahoo Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30274. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 9, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).