What technology area does this patent fall under?

Primary CPC classification G06V10/464. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 23, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Weighting scheme for pooling image descriptors

US9424492B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9424492-B2
Application number	US-201314141612-A
Country	US
Kind code	B2
Filing date	Dec 27, 2013
Priority date	Dec 27, 2013
Publication date	Aug 23, 2016
Grant date	Aug 23, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating an image representation includes generating a set of embedded descriptors, comprising, for each of a set of patches of an image, extracting a patch descriptor which is representative of the pixels in the patch and embedding the patch descriptor in a multidimensional space to form an embedded descriptor. An image representation is generated by aggregating the set of embedded descriptors. In the aggregation, each descriptor is weighted with a respective weight in a set of weights, the set of weights being computed based on the patch descriptors for the image. Information based on the image representation is output. At least one of the extracting of the patch descriptors, embedding the patch descriptors, and generating the image representation is performed with a computer processor.
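The weighted aggregation summarized in the abstract can be illustrated with a minimal NumPy sketch. It assumes the embedded descriptors are already available as the columns of a D×M matrix Φ and uses the objective stated in the first claim, ∥ΦᵀΨ − c_M∥² + λ∥Ψ∥². The minimizer of that ridge-style objective can be written as Ψ = Φw with w = (K + λI)⁻¹ c_M, where K = ΦᵀΦ is the linear-kernel matrix between patches. The function name `weighted_pool`, the parameter names `lam` and `c`, and the toy data are hypothetical choices made for illustration and do not come from the patent.

```python
import numpy as np


def weighted_pool(phi, lam=1.0, c=1.0):
    """Aggregate embedded patch descriptors with data-dependent weights.

    phi : (D, M) array, one D-dimensional embedded patch descriptor per column.
    lam : regularization parameter lambda (assumed > 0 so the system is solvable).
    c   : constant that every entry of phi.T @ psi should approach.
    Returns (psi, w) with psi of shape (D,), w of shape (M,), and psi == phi @ w.
    """
    _, num_patches = phi.shape
    kernel = phi.T @ phi                      # (M, M) linear kernel between patches
    target = np.full(num_patches, c)          # the constant vector c_M
    # Weights that solve (K + lam * I) w = c_M.
    w = np.linalg.solve(kernel + lam * np.eye(num_patches), target)
    psi = phi @ w                             # weighted sum of embedded descriptors
    return psi, w


# Toy image (illustrative only): nine identical patches and one distinct patch.
phi = np.array([[1.0] * 9 + [0.0],
                [0.0] * 9 + [1.0]])
psi, w = weighted_pool(phi, lam=0.01)
print("sum pooling     :", phi.sum(axis=1))
print("weighted pooling:", psi)
print("phi.T @ psi     :", phi.T @ psi)
```

In this toy case plain sum pooling is dominated by the repeated patch, while the weighted representation gives the frequent and the rare descriptor nearly the same similarity to Ψ, which is the behavior the constant vector c_M is meant to encourage.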

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating an image representation comprising: generating a set of embedded patch descriptors, comprising, for each of a set of patches of an image, extracting a patch descriptor which is representative of the pixels in the patch; and embedding the patch descriptor in a multidimensional space to form an embedded patch descriptor; generating an image representation comprising aggregating the set of embedded patch descriptors, wherein in the aggregation, each patch descriptor is weighted with a respective weight in a set of weights, the set of weights being computed based on the extracted patch descriptors for the image, wherein the generating of the image representation comprises identifying an image representation that optimizes the probability that when a matrix of the embedded patch descriptors is multiplied by the image representation, the result is a vector in which each element of the vector has a constant same value, the optimization including identifying an image representation Ψ that minimizes the expression $\|\Phi^{T}\Psi - c_M\|^2 + \lambda\|\Psi\|^2$, where Φ is a D×M matrix that contains the D-dimensional patch embeddings, $c_M$ is the vector in which all values are the same, and λ represents a non-zero regularization parameter; and outputting information based on the image representation, wherein at least one of the extracting of the patch descriptors, embedding of the patch descriptors, generating the image representation, and outputting information is performed with a computer processor.

2. The method of claim 1, wherein the generating of the image representation comprises learning the set of weights such that when the evaluation of a kernel function between a first patch descriptor selected from the set of patch descriptors and one other patch descriptor from the set of descriptors is weighted by the weight of the other patch descriptor and summed over all the patch descriptors, the sum is a constant value for each of the patch descriptors when treated as the first patch descriptor.

3. The method of claim 1, wherein λ is selected from a range of 1 to 10,000.

4. The method of claim 1, wherein the optimization is performed by Conjugate Gradient Descent.

5. The method of claim 1, wherein the generating an image representation comprises normalizing the aggregation of weighted image descriptors.

6. The method of claim 1, wherein the aggregation of image weighted descriptors comprises a sum of the weighted image descriptors.

7. The method of claim 1, wherein the method further comprises extracting the patches from the image.

8. The method of claim 1, wherein embedding of the patch descriptor comprises computing higher-order statistics which assume the patch descriptor is emitted by a generative model.

9. The method of claim 1, wherein the set of patches comprises at least 100 patches.

10. The method of claim 1, wherein the extracting of the patch descriptor comprises extracting at least one of an intensity gradient-based descriptor and a color descriptor.

11. The method of claim 1, further comprising classifying the image based on the image representation and wherein the outputting information comprises outputting information based on the classification.

12. The method of claim 11, wherein the classification is performed with a linear classifier.

13. The method of claim 1, wherein the outputting information comprises computing a similarity between two images as a function of a dot product between image representations of the two images generated by the method of claim 1.

14. A computer program product comprising a non-transitory recording medium storing instructions, which when executed on a computer causes the computer to perform a method comprising: generating a set of embedded patch descriptors, comprising, for each of a set of patches of an image, extracting a patch descriptor which is representative of the pixels in the patch; and embedding the patch descriptor in a multidimensional space to form an embedded patch descriptor; generating an image representation comprising aggregating the set of embedded patch descriptors, wherein in the aggregation, each patch descriptor is weighted with a respective weight in a set of weights, the set of weights being computed based on the extracted patch descriptors for the image, which includes optimizing one of: $\Phi^{T}\Psi = c_M$, and $Kw = c_M$, where Φ is a D×M matrix that contains M of the D-dimensional embedded patch descriptors, Ψ is the image representation, and $c_M$ is a vector in which each of the M elements has a constant, same value, K is an M×M kernel matrix between individual patch descriptors and w is an M×1 vector of the weights; and outputting information based on the image representation, wherein at least one of the extracting of the patch descriptors, embedding of the patch descriptors, generating the image representation, and outputting information is performed with a computer processor.

15. A system comprising memory storing instructions for performing the method of claim 1 and a processor in communication with the memory which executes the instructions.

16. A system for generating an image representation comprising: a descriptor extractor which extracts a set of patch descriptors, each patch descriptor being representative of the pixels in a patch of an image; an embedding component which embeds each of the patch descriptors in a multidimensional space to form a respective embedded patch descriptor; a pooling component which aggregates the set of embedded descriptors, wherein in the aggregation, each patch descriptor is weighted with a respective weight in a set of weights, the set of weights being computed based on the extracted patch descriptors for the image, which includes optimizing one of: $\Phi^{T}\Psi = c_M$, and $Kw = c_M$, where Φ is a D×M matrix that contains M of the D-dimensional embedded patch descriptors, Ψ is the image representation, and $c_M$ is a vector in which each of the M elements has a constant, same value, K is an M×M kernel matrix between individual patch descriptors and w is an M×1 vector of the weights; and a processor which implements the descriptor extractor, embedding component, and pooling component.

17. A method for generating an image representation comprising: for each of a set of M patches of an image, extracting a patch descriptor which is representative of the pixels in the patch and embedding the patch descriptor in a multidimensional space with an embedding function to form a D-dimensional embedded descriptor; with a processor, generating a representation of the image comprising aggregating the embedded descriptors as $\Psi = \sum_{i=1}^{M} w_i \varphi(x_i)$, where Ψ is the aggregated representation, $\varphi(x_i)$ represents one of the M embedded patch descriptors and $w_i$ represents a respective weight, the weights being selected by one of: a) finding a vector $w = [w_1, \ldots, w_M]$ that minimizes the expression: $\|\Phi^{T}\Phi w - c_M\|^2 + \lambda\|w\|^2$ where Φ is a D×M matrix that contains the D-dimensional embedded patch descriptors, $c_M$ is a vector in which all values are a same constant value, and λ is a non-negative regularization parameter; and b) finding the aggregated representation Ψ that minimizes the expression: $\|\Phi^{T}\Psi - c_M\|^2 + \lambda\|\Psi\|^2$ (Eqn. 11), where Φ is a D×M matrix that contains the D-dimensional embedded patch descriptors, $c_M$ is a vector in which all values are all a same constant value, and λ is a non-negative regularization parameter; and generating an image representation based on Ψ.

18. A computer pr
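
Claim 4 names Conjugate Gradient Descent as one way to carry out the optimization of claim 1. The sketch below is one possible reading of that step, not the patented implementation: it assumes a standard linear conjugate gradient iteration applied to the normal equations (ΦΦᵀ + λI)Ψ = Φc_M, which characterize the minimizer of ∥ΦᵀΨ − c_M∥² + λ∥Ψ∥² when λ > 0. It also shows an L2-normalized dot product as one example of the similarity mentioned in claims 5 and 13. The function names, the tolerance, the iteration cap, and the random test data are all hypothetical.

```python
import numpy as np


def pool_by_conjugate_gradient(phi, lam=1.0, c=1.0, tol=1e-10, max_iter=1000):
    """Solve (phi @ phi.T + lam * I) psi = phi @ c_M by conjugate gradient.

    phi : (D, M) array of embedded patch descriptors, one per column.
    lam : regularization parameter lambda (assumed > 0).
    Returns psi with shape (D,).
    """
    dim, num_patches = phi.shape
    rhs = phi @ np.full(num_patches, c)       # right-hand side Phi c_M

    def apply_system(v):
        # (Phi Phi^T + lam I) v without forming the D x D matrix explicitly.
        return phi @ (phi.T @ v) + lam * v

    psi = np.zeros(dim)
    residual = rhs - apply_system(psi)
    direction = residual.copy()
    rs_old = residual @ residual
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:             # converged (or zero right-hand side)
            break
        a_dir = apply_system(direction)
        step = rs_old / (direction @ a_dir)
        psi += step * direction
        residual -= step * a_dir
        rs_new = residual @ residual
        direction = residual + (rs_new / rs_old) * direction
        rs_old = rs_new
    return psi


def normalized_dot_similarity(psi_a, psi_b):
    """Dot product of L2-normalized representations (assumes nonzero inputs)."""
    return float((psi_a / np.linalg.norm(psi_a)) @ (psi_b / np.linalg.norm(psi_b)))


# Illustrative random "images": 20 and 30 patches embedded in 5 dimensions.
rng = np.random.default_rng(0)
psi_a = pool_by_conjugate_gradient(rng.normal(size=(5, 20)), lam=2.0)
psi_b = pool_by_conjugate_gradient(rng.normal(size=(5, 30)), lam=2.0)
print("similarity:", normalized_dot_similarity(psi_a, psi_b))
```

An iterative solver of this kind only needs matrix-vector products with Φ and Φᵀ, which can matter when D is large enough that forming or factorizing ΦΦᵀ is expensive.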

Assignees

Xerox Corp

Inventors

Classifications

G06V10/806
of extracted features · CPC title
G06F18/253
of extracted features · CPC title
G06V10/464 · Primary
using a plurality of salient features, e.g. bag-of-words [BoW] representations · CPC title
G06K9/629 · Primary
Physics · mapped topic
G06K9/4676
Physics · mapped topic

Patent family

Related publications grouped by family.

Patent family ID: 53396020

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9424492B2 cover?: A method for generating an image representation includes generating a set of embedded descriptors, comprising, for each of a set of patches of an image, extracting a patch descriptor which is representative of the pixels in the patch and embedding the patch descriptor in a multidimensional space to form an embedded descriptor. An image representation is generated by aggregating the set of embed…
Who is the assignee on this patent?: Xerox Corp