Image-based faceted system and method

US9922051B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9922051-B2
Application number: US-201615225908-A
Country: US
Kind code: B2
Filing date: Aug 2, 2016
Priority date: Jun 10, 2013
Publication date: Mar 20, 2018
Grant date: Mar 20, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein is a system and method that facilitate searching and/or browsing of images by clustering, or grouping, the images into a set of image clusters using facets, such as without limitation visual properties or visual characteristics, of the images, and representing each image cluster by a representative image selected for the image cluster. A map-reduce based probabilistic topic model may be used to identify one or more images belonging to each image cluster and update model parameters.
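The map-reduce learning loop mentioned in the abstract can be sketched roughly as follows. This is a minimal single-process Python sketch, assuming a PLSA-style topic model trained with EM. The text on this page says only "probabilistic topic model", so the choice of PLSA, the function names (`mapper`, `reducer`, `fit`, `assign_and_represent`), the smoothing constant, and the rule of picking the highest-membership image as each cluster's representative are all illustrative assumptions, not the patented method.

```python
import random
from collections import defaultdict


def mapper(counts, theta, phi, inner_iters=5):
    """Refine one image's cluster distribution (theta) with the cluster-word
    distributions (phi) held fixed, then emit expected word counts per cluster."""
    k, v = len(phi), len(counts)
    expected = [[0.0] * v for _ in range(k)]
    for _ in range(inner_iters):
        expected = [[0.0] * v for _ in range(k)]
        for w, n in enumerate(counts):
            if n == 0:
                continue
            post = [theta[z] * phi[z][w] for z in range(k)]
            s = sum(post)
            for z in range(k):
                expected[z][w] = n * post[z] / s
        totals = [sum(row) for row in expected]
        total = sum(totals)
        if total == 0:  # image with no visual words: leave theta unchanged
            break
        theta = [t / total for t in totals]
    return theta, expected


def reducer(rows, smoothing=1e-3):
    """Sum the expected counts emitted for one cluster and renormalize them
    into a probability distribution over visual words."""
    v = len(rows[0])
    summed = [smoothing + sum(r[w] for r in rows) for w in range(v)]
    total = sum(summed)
    return [x / total for x in summed]


def fit(images, k, iters=20, seed=0):
    """images: list of visual-word count vectors, one per image."""
    rng = random.Random(seed)
    v = len(images[0])
    phi = []
    for _ in range(k):
        row = [rng.random() + 0.1 for _ in range(v)]
        s = sum(row)
        phi.append([x / s for x in row])
    thetas = [[1.0 / k] * k for _ in images]
    for _ in range(iters):
        shuffled = defaultdict(list)  # stands in for the shuffle phase
        for i, counts in enumerate(images):
            thetas[i], expected = mapper(counts, thetas[i], phi)
            for z in range(k):
                shuffled[z].append(expected[z])
        phi = [reducer(shuffled[z]) for z in range(k)]
    return thetas, phi


def assign_and_represent(thetas):
    """Assign each image to its most probable cluster; pick as representative
    the member with the highest membership probability (an assumed rule)."""
    clusters = defaultdict(list)
    for i, theta in enumerate(thetas):
        best = max(range(len(theta)), key=theta.__getitem__)
        clusters[best].append(i)
    reps = {z: max(m, key=lambda i: thetas[i][z]) for z, m in clusters.items()}
    return dict(clusters), reps
```

In a real deployment the mapper calls would run in parallel over images and the reducer calls in parallel over clusters; the sketch runs them in plain loops so the data flow between the two stages stays visible.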

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: representing, by a computing device, each digital image of a plurality of digital images as a plurality of quantized gradient-related feature vectors; building, by the computing device, a vocabulary of words using the plurality of quantized gradient-related feature vectors of each digital image of the plurality of digital images; generating, by the computing device and using the vocabulary, a probabilistic topic model comprising image-specific parameters for each digital image in the plurality of digital images and cluster-specific parameters for each cluster of a plurality of clusters; assigning, by the computing device, each digital image from the plurality of digital images to a cluster of the plurality of clusters using the digital image's image-specific parameters and the probabilistic topic model, each cluster corresponding to a word in the vocabulary; identifying, by the computing device and for each cluster of the plurality of clusters, a representative digital image for the cluster; receiving, by the computing device, a digital image in a digital image search request of a user; generating, via the computing device, a set of words, from the vocabulary of words, for the received digital image; identifying, via the computing device and using the set of words generated for the received digital image as a query and using each word in the vocabulary corresponding to each cluster of the plurality, a number of representative digital images, each representative digital image of the number being representative of a cluster of the plurality of clusters; and generating, via the computing device, a response to the digital image search request of the user, the response comprising the number of representative digital images.

2. The method of claim 1, further comprising: communicating, via the computing device, the number of representative digital images to the user for display on a device of the user.

3. The method of claim 2, further comprising: receiving, at the computing device, input indicative of a selection of the user of a representative digital image of the number; and retrieving, via the computing device, a number of digital images from the cluster of the plurality being represented by the user-selected representative digital image; and communicating, via the computing device, the number of digital images to the user for display on the device of the user.

4. The method of claim 1, generation of the set of words for the received digital image further comprising: representing, via the computing device, a digital image received with the digital image search request of the user as a plurality of quantized gradient-related feature vectors; the generating further comprising generating, via the computing device, the set of words, for the received digital image, from the vocabulary of words using the received digital image's plurality of quantized gradient-related feature vectors.

5. The method of claim 1, the probabilistic topic model generation further comprising: learning the image-specific parameters and cluster-specific parameters in parallel using a map-reduce architecture and a plurality of processors, the learning comprising: assigning a first number of the plurality of processors as mappers in the map-reduce architecture and a second number of the processors as reducers in the map-reduce architecture, the mappers and the reducers operating in parallel; learning the image-specific parameters for each digital image in the plurality of digital images processed by a mapper in the map-reduce architecture; and learning the cluster-specific parameters for each cluster in the plurality of clusters processed by a reducer in the map-reduce architecture.

6. The method of claim 5, further comprising: each mapper learning a digital image's image-specific parameters by performing operations comprising: retrieving data associated with the digital image from a distributed file system; retrieving the cluster-specific parameters from a distributed cache; and learning the digital image's image-specific parameters using the digital image's retrieved data and the retrieved cluster-specific parameters; each reducer learning the cluster-specific parameters by performing operations comprising: receiving data from at least one mapper, the data received from each mapper comprising the digital image's image-specific parameters learned by the mapper; retrieving the cluster-specific parameters from the distributed cache; and making any updates to the cluster-specific parameters learned using the received image-specific parameters and the retrieved cluster-specific parameters.

7. The method of claim 1, the image-specific parameters for a digital image comprising a probability distribution over the plurality of clusters, the probability distribution comprising a cluster membership probability for each cluster of the plurality of clusters, each cluster membership probability indicating a probability that the digital image belongs to the cluster.

8. The method of claim 1, the cluster-specific parameters comprising a probability distribution for each cluster over a plurality of visual word vectors, each visual word vector corresponding to a digital image of the plurality of digital images, a plurality of visual word vectors determined using the plurality of quantized gradient-related feature vectors determined for digital images from the plurality of digital images, the probability distribution for a cluster comprising a probability for each visual word vector of the plurality of visual word vectors, each probability indicating a probability that the visual word vector is related to the cluster.

9. The method of claim 1, representation of a digital image of the plurality of digital images as a plurality of quantized gradient-related feature vectors further comprising: partitioning the digital image into a plurality of partitions; extracting gradient feature vectors from each partition of the plurality of partitions; and quantizing the gradient feature vectors using k-means clustering, where k corresponds to a number of words in the vocabulary of words, and each of the quantized gradient feature vectors corresponds to a word in the vocabulary of words.

10. A non-transitory computer-readable storage medium tangibly encoded with computer-executable instructions, that when executed by a processor associated with a computing device, performs a method comprising: representing each digital image of a plurality of digital images as a plurality of quantized gradient-related feature vectors; building a vocabulary of words using the plurality of quantized gradient-related feature vectors of each digital image of the plurality of digital images; generating, using the vocabulary, a probabilistic topic model comprising image-specific parameters for each digital image in the plurality of digital images and cluster-specific parameters for each cluster of a plurality of clusters; assigning each digital image from the plurality of digital images to a cluster of the plurality of clusters using the digital image's image-specific parameters and the probabilistic topic model, each cluster corresponding to a word in the vocabulary; identifying, for each cluster of the plurality of clusters, a representative digital image for the cluster; receiving a digital image in a digital image search request of a user; generating a set of words, from the vocabulary of words, for the received digital image; identifying, using the set of words generated for the digital image as a query and using each word in the vocabulary corresponding to each cluster of the plurality, a num
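The partition, extract, and quantize steps recited in claim 9 can be illustrated with a small sketch. This is a minimal pure-Python example under stated assumptions: the claim does not name a specific gradient descriptor, so a per-partition histogram of gradient orientations is used here as a stand-in, and the names `gradient_features`, `kmeans`, `to_words`, and `word_histogram`, along with the cell size and bin count, are hypothetical.

```python
import math
import random


def gradient_features(image, cell=4, bins=4):
    """Split a grayscale image (list of rows) into cell x cell partitions and
    return one normalized gradient-orientation histogram per partition."""
    h, w = len(image), len(image[0])
    feats = []
    for top in range(0, h - cell + 1, cell):
        for left in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(top, top + cell):
                for x in range(left, left + cell):
                    # Central differences, clamped at the image border.
                    gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
                    gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
                    angle = math.atan2(gy, gx) % math.pi  # unsigned orientation
                    b = min(int(angle / math.pi * bins), bins - 1)
                    hist[b] += math.hypot(gx, gy)
            norm = math.sqrt(sum(v * v for v in hist)) or 1.0
            feats.append([v / norm for v in hist])
    return feats


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the closest centroid, i.e. the visual word for this vector."""
    return min(range(len(centroids)), key=lambda j: sq_dist(vec, centroids[j]))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; k is the vocabulary size."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for j, g in enumerate(groups):
            if g:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def to_words(image, centroids):
    """Quantize each partition's feature vector to a vocabulary word index."""
    return [nearest(f, centroids) for f in gradient_features(image)]


def word_histogram(words, k):
    """Bag-of-visual-words count vector for one image."""
    hist = [0] * k
    for w in words:
        hist[w] += 1
    return hist
```

A query image received in a search request would pass through the same `to_words` step against the already-built vocabulary, which is one way to read the "generating a set of words ... for the received digital image" step of claim 1.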

Assignees

Yahoo Inc, Oath Inc

Inventors

Classifications

  • using statistics or function optimisation, e.g. modelling of probability density functions · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • with fixed number of clusters, e.g. K-means clustering · CPC title

  • using shape and object relationship · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9922051B2 cover?
Disclosed herein is a system and method that facilitate searching and/or browsing of images by clustering, or grouping, the images into a set of image clusters using facets, such as without limitation visual properties or visual characteristics, of the images, and representing each image cluster by a representative image selected for the image cluster. A map-reduce based probabilistic topic mod…
Who is the assignee on this patent?
Yahoo Inc, Oath Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30274. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 20, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).