Automatic canonical digital image selection method and apparatus

US10163041B2 · US · B2

Patent metadata
Publication number: US-10163041-B2
Application number: US-201615198295-A
Country: US
Kind code: B2
Filing date: Jun 30, 2016
Priority date: Jun 30, 2016
Publication date: Dec 25, 2018
Grant date: Dec 25, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are systems and methods for automatic selection of canonical digital images from a large corpus of digital images, such as the corpus of digital images available on the web, for an entity, such as and without limitation a person, a point of interest, object, etc. The automated, unsupervised approach for selecting a diverse set of high quality, canonical digital images, is well suited for processing a large corpus of digital images. A set of canonical digital images identified for an entity can be retrieved in response to a digital image request for digital images depicting the entity.
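
The abstract does not say how "diverse" and "high quality" are combined. Going by claims 1 and 10 on this page, candidates are grouped into clusters and a measure of quality is computed for each one. The sketch below is one possible reading, not the patented implementation: it keeps the best-scoring candidates from every cluster, using the object-to-image pixel proportion from claim 10 as the only quality consideration. The `Candidate` fields, `quality`, `select_canonical`, and the `per_cluster` parameter are all hypothetical names introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # All fields are hypothetical; the patent does not prescribe a data model.
    image_id: str
    cluster_id: int      # cluster the image's feature vector was assigned to
    object_pixels: int   # pixels covered by the detected object (e.g. a face)
    total_pixels: int    # total pixels in the image


def quality(c: Candidate) -> float:
    """One consideration of quality: share of the image occupied by the object."""
    return c.object_pixels / c.total_pixels if c.total_pixels else 0.0


def select_canonical(candidates, per_cluster=1):
    """Keep the `per_cluster` highest-quality candidates from every cluster."""
    by_cluster = defaultdict(list)
    for c in candidates:
        by_cluster[c.cluster_id].append(c)

    selected = []
    for cluster_id in sorted(by_cluster):
        ranked = sorted(by_cluster[cluster_id], key=quality, reverse=True)
        selected.extend(ranked[:per_cluster])
    return selected


candidates = [
    Candidate("a.jpg", 0, 40_000, 100_000),
    Candidate("b.jpg", 0, 10_000, 100_000),
    Candidate("c.jpg", 1, 30_000, 120_000),
    Candidate("d.jpg", 1, 60_000, 120_000),
]
print([c.image_id for c in select_canonical(candidates)])
```

Taking one image per cluster is what gives the result its diversity in this sketch; a real scorer would presumably blend several of the considerations listed in claims 10 to 14 rather than rely on a single ratio.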

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: receiving, at a computing device, a request for a set of canonical digital images of an entity; generating, via the computing device, a number of digital image search result sets, the search result set generation comprising querying a number of digital image data stores using a number of queries, each query comprising a number of search terms; selecting, via the computing device, a plurality of candidate digital images from the number of digital image search result sets, the plurality of candidate digital images being selected using a relevancy score associated with each candidate digital image of the plurality; analyzing, via the computing device, each candidate digital image to detect an object of a type corresponding to an object type of the entity and to detect a number of fiducial points of, the object type, in the detected object; determining, via the computing device, an n-dimensional feature vector for a candidate digital image of the plurality using data of pixels corresponding to the number of fiducial points of the object detected in the candidate digital image, the feature vector determination being performed for each candidate digital image of the plurality to determine a plurality of feature vectors; forming, via the computing device, a plurality of clusters using the plurality of feature vectors, each cluster of the plurality comprising a number of feature vectors, each feature vector in each cluster corresponding to a candidate digital image of the plurality; and selecting, via the computing device, a set of candidate digital images for the set of canonical digital images using a number of clusters of the plurality, the candidate digital image selection comprising determining, for each candidate digital image with a corresponding feature vector belonging to a cluster of the number of clusters, a measure of quality based on at least one consideration of quality, each candidate digital image of the set of candidate digital images having a higher measure of quality relative to the measure of quality associated with each unselected candidate digital image.

2. The method of claim 1, further comprising: communicating, via the computing device and to a client computing device over an electronic communications network, the set of canonical digital images of the entity for display at the client computing device.

3. The method of claim 2, the request for the set of canonical digital images of the entity being received from the client computing device over the electronic communications network.

4. The method of claim 1, the feature vector determination further comprising: determining, via the computing device and for the candidate digital image of the plurality, a number of feature descriptors for each fiducial point of the number of fiducial points using a number of pixel regions, each feature descriptor being generated by analyzing a pixel region of the number of pixel regions using a feature descriptor algorithm; and aggregating the number of feature descriptors to form the feature vector for the candidate digital image.

5. The method of claim 4, the candidate digital image analysis further comprising: analyzing, via the computing device, a candidate digital image of the plurality using a face detector algorithm to detect a plurality of fiducial points of the object, wherein the detected object is a face object; and selecting the number of fiducial points to represent the face object.

6. The method of claim 4, the feature descriptor aggregation further comprising: reducing, via the computing device, the feature vector's dimensionality using a principal component analysis transformation.

7. The method of claim 1, the cluster formation further comprising: forming, via the computing device, the plurality of clusters using a Markov Cluster (MCL) and an input graph representing the plurality of candidate digital images, the input graph comprising a node for each candidate digital image of the plurality and an edge for each pair of candidate digital images of the plurality, the edge having an associated edge weight that is based on a measure of similarity determined using the feature vectors corresponding to the candidate digital images of the pair, input to the MCL for the node corresponding to a candidate digital image including the feature vector determined for the candidate digital image.

8. The method of claim 7, further comprising: setting, via the computing device, each edge weight determined to be less than a similarity to a predetermined minimum edge weight.

9. The method of claim 1, the canonical digital image set selection further comprising: grouping, via the computing device, the number of feature vectors in a given cluster of the number of clusters based on similarity, the grouping forming a number of feature vector groups in the given cluster, the canonical digital image set selection comprising selecting the set of canonical digital images from each group in the given cluster.

10. The method of claim 1, the measure of quality, for a given candidate digital image, is at least based on a proportion of the pixels corresponding to the object detected in the given candidate digital image relative to a total number of pixels of the candidate digital image.

11. The method of claim 1, the measure of quality, for a given candidate digital image, is at least based on a location of the object detected in the given candidate digital image.

12. The method of claim 1, the measure of quality, for a given candidate digital image, is at least based on a determination whether or not the candidate digital image depicts text.

13. The method of claim 1, the measure of quality, for a given candidate digital image, is at least based on a determination whether or not the candidate digital image is a natural image.

14. The method of claim 1, the measure of quality, for a given candidate digital image, is at least based on a determination of an aesthetic quality of the given candidate digital image.

15. A non-transitory computer-readable storage medium tangibly encoded with computer-executable instructions, that when executed by a processor associated with a computing device, performs a method comprising: receiving a request for a set of canonical digital images of an entity; generating a number of digital image search result sets, the search result set generation comprising querying a number of digital image data stores using a number of queries, each query comprising a number of search terms; selecting a plurality of candidate digital images from the number of digital image search result sets, the plurality of candidate digital images being selected using a relevancy score associated with each candidate digital image of the plurality; analyzing each candidate digital image to detect an object of a type corresponding to an object type of the entity and to detect a number of fiducial points of, the object type, in the detected object; determining an n-dimensional feature vector for a candidate digital image of the plurality using data of pixels corresponding to the number of fiducial points of the object detected in the candidate digital image, the feature vector determination being performed for each candidate digital image of the plurality to determine a plurality of feature vectors; forming a plurality of clusters using the plurality of feature vectors, each cluster of the plurality comprising a number of feature vectors, each feature vector in each cluster corresponding to a candidate digital image of the plurality; and selecting a set o
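
Claims 7 and 8 describe clustering with a Markov Cluster (MCL) process over a graph whose edge weights come from feature-vector similarity. The following is a minimal, standard-library sketch of that idea, not the patent's implementation. Several choices are assumptions made only for illustration: cosine similarity as the similarity measure, a `threshold` below which an edge is clamped to `min_weight` (one reading of claim 8, whose wording does not name the threshold explicitly), self-loops of weight 1, and the `inflation` and `iterations` values. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(vectors, threshold=0.5, min_weight=0.0):
    """Dense adjacency matrix with one node per image and one edge per pair.

    Edge weights below `threshold` are set to `min_weight`.
    """
    n = len(vectors)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                graph[i][j] = 1.0  # self-loop, a common MCL convention
            else:
                s = cosine(vectors[i], vectors[j])
                graph[i][j] = s if s >= threshold else min_weight
    return graph


def normalize_columns(m):
    """Scale every column to sum to 1 so the matrix is column-stochastic."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mcl(graph, inflation=2.0, iterations=50):
    """Alternate expansion (matrix squaring) and inflation (elementwise power)."""
    n = len(graph)
    m = normalize_columns(graph)
    for _ in range(iterations):
        # Expansion: flow spreads along two-step walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strong flows are boosted, weak flows fade.
        m = normalize_columns([[x ** inflation for x in row] for row in m])

    # Rows that still carry flow act as attractors; their non-zero
    # columns are the cluster members. Duplicate rows collapse in the set.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(mcl(build_graph(vectors)))
```

With these four toy vectors the weak cross-pair similarities fall below the threshold and are clamped to zero, so the process settles on two clusters, one for the first two vectors and one for the last two. A fixed iteration count keeps the sketch short; a practical version would stop once the matrix stops changing and would use sparse matrices for a large corpus.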

Assignees

Oath Inc

Inventors

Classifications

  • Markov-related models; Markov random fields

  • based on graphs, e.g. graph cuts or spectral clustering

  • Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods

  • Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models

  • using shape and object relationship

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10163041B2 cover?
Disclosed are systems and methods for automatic selection of canonical digital images from a large corpus of digital images, such as the corpus of digital images available on the web, for an entity, such as and without limitation a person, a point of interest, object, etc. The automated, unsupervised approach for selecting a diverse set of high quality, canonical digital images, is well suited …
Who is the assignee on this patent?
Oath Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/5854. Mapped technology areas include Physics.
When was this patent published?
Published Dec 25, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).