Fine-grained image similarity

US10181091B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10181091-B2
Application number: US-201515504870-A
Country: US
Kind code: B2
Filing date: Jun 19, 2015
Priority date: Jun 20, 2014
Publication date: Jan 15, 2019
Grant date: Jan 15, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, a first, second and third representations of the features of the first, second and third images; determining, based on the first representation of features and the second representation of features, a first similarity measure for the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure for the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; and adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets.
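The per-triplet computation in the abstract can be illustrated with a small Python sketch. It is not the patented implementation: the toy linear `embed` function, the `margin` value, and all names are assumptions made for illustration. It uses Euclidean distance as the similarity measure and a hinge loss as the performance measure, which the claims mention as one option.

```python
import math

def embed(weights, image):
    """Toy linear embedding (an assumption): one output feature per weight row."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]

def euclidean(a, b):
    """Euclidean distance between two feature representations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_hinge_loss(weights, first, second, third, margin=1.0):
    """Performance measure for one triplet.

    The loss is zero when the first image lies closer to the second image
    than to the third by at least `margin` (a hypothetical gap parameter).
    """
    f1, f2, f3 = (embed(weights, img) for img in (first, second, third))
    d_second = euclidean(f1, f2)  # first similarity measure, as a distance
    d_third = euclidean(f1, f3)   # second similarity measure, as a distance
    return max(0.0, margin + d_second - d_third)

# Identity weights, so the embedding returns the input unchanged.
weights = [[1.0, 0.0], [0.0, 1.0]]

# Well-ordered triplet: the second image is near, the third is far.
loss_good = triplet_hinge_loss(weights, [0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Badly ordered triplet: the roles of the second and third images are swapped.
loss_bad = triplet_hinge_loss(weights, [0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In a training loop, the parameter weights would then be adjusted to reduce the loss accumulated over the selected triplets; the update rule is left out here because the abstract does not specify one.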

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method performed by data processing apparatus, the method comprising: iteratively training an image embedding function on image triplets, the embedding function comprising a set of parameter weights that operate on an input image to produce as output a representation of features of the image, each iteration of the training comprising: selecting image triplets, each image triplet being a combination of a first image, a second image and a third image, wherein a first pairwise relevance score that measures a similarity of the first image to the second image is greater than a second pairwise relevance score that measures the similarity of the first image to the third image; for each image triplet: providing each of the first, second and third images as input to the image embedding function, generating, by the image embedding function, a first representation of the features of the first image, a second representation of the features of the second image, and a third representation of the features of the third image; determining, based on the first representation of features and the second representation of features, a first similarity measure that measures a similarity of the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure that measures a similarity of the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets; and performing another iteration of the training until a cessation event occurs.

2. The computer-implemented method of claim 1, wherein: determining the first similarity measure that measures the similarity of the first image to the second image comprises determining a first distance measure from the first representation of the features of the first image and the second representation of features of the second image; and determining the second similarity measure that measures the similarity of the first image to the third image comprises determining a second distance measure from the first representation of the features of the first image and the second representation of features of the second image.

3. The computer-implemented method of claim 1, wherein the image embedding function generates a mapping of the image in Euclidean space as the output representation of features; wherein determining the first similarity measure that measures the similarity of the first image to the second image comprises determining a first Euclidean distance between the first representation of the features of the first image and the second representation of features of the second image; and wherein determining the second similarity measure that measures the similarity of the first image to the third image comprises determining a second Euclidean distance between the first representation of the features of the first image and the third representation of features of the third image.

4. The computer-implemented method of claim 3, wherein determining a performance measure of the image embedding function for the image triplet comprises determining the performance measure based on the first Euclidean distance and the second Euclidean distance.

5. The computer-implemented method of claim 4, wherein determining the performance measure based on the first Euclidean distance and the second Euclidean distance comprises determining a hinge loss based on a difference of the first Euclidean distance and the second Euclidean distance.

6. The computer-implemented method of claim 5, further comprising: summing the hinge losses for the image triplets, determining whether the summation of the hinge losses meets a minimization criterion; and determining the cessation event occurs when the summation of the hinge losses meets the minimization criterion.

7. The computer-implemented method of claim 1, wherein the image embedding function comprises; a first convolutional neural network having a first quantity of convolutional layers and trained to classify a plurality of images into a plurality of different classes, and configured to receive as input an image at a first resolution; a second convolution neural network having a second quantity of convolutional layers and trained to extract low-resolution features of a second resolution that is less than the first resolution, and configured to receive as input an image at the second resolution, wherein the second quantity of convolutional layers is less than the first quantity of convolutional layers; and wherein generating by the image embedding function a representation of the features of the image comprises: providing the image to the first convolution neural network at the first resolution; down sampling the image to the second resolution to generate a down sampled image; and providing the down sampled image to the second convolutional neural network.

8. The computer-implemented method of claim 7, wherein the image embedding function further comprises: a first normalization layer that normalizes the output of the first convolutional neural network; a second normalization layer that normalizes the output of the second convolutional neural network; and a linear embedding layer that combines the normalized outputs of the first convolutional neural network and the second convolutional neural network.

9. The computer-implemented method of claim 1, further comprising: accessing a plurality of images, the images collected into respective classes of images; and for at least a class of images of the respective classes of images: determining, for each image, a pairwise relevance total that is based on pairwise relevance scores that respectively measure the similarity of the image to a respective other image in the class of images; selecting an image in the class of images as a first image in the image triplet according to a likelihood that is proportional to its pairwise relevance total; selecting another image in the class of images as a second image in the image triplet according to a likelihood based on a selection threshold and a pairwise relevance score that measures a similarity of the first image to the second image; and selecting another image in the class of images as a third image in the image triplet according to a likelihood based on the section threshold and the pairwise relevance score that measures a similarity of the first image to the third image.

10. The computer-implemented method of claim 9, wherein selecting another image in the class of images as the second image in the image triplet comprises selecting another image in the class of images based on a minimum of the selection threshold and the pairwise relevance score; and wherein selecting another image in the class of images as the third image in the image triplet comprises selecting another image in the class of images based on a minimum of the selection threshold and the pairwise relevance score.

11. The computer-implemented method of claim 9, further comprising: for at least one image triplet, selecting an image in another class of images of the respective classes of images as a third image in the image triplet.

12. The computer-implemented method of claim 9, wherein, for each image triplet, the first, second and third images are selected such that the difference resulting from subtracting the second pairwise relevance score from the first pairwise releva
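The in-class triplet selection described in claims 9 and 10 can be sketched in Python as follows. This is an illustrative reading, not the patented implementation: the function names, the `scores` matrix layout, the `max_tries` limit, and the use of rejection sampling to enforce the ordering required by claim 1 are all assumptions.

```python
import random

def relevance_totals(scores):
    """Pairwise relevance total per image: sum of its scores to every other image.

    `scores[i][j]` is assumed to hold the pairwise relevance of image i to
    image j within a single class.
    """
    n = len(scores)
    return [sum(scores[i][j] for j in range(n) if j != i) for i in range(n)]

def sample_triplet(scores, threshold, rng, max_tries=100):
    """Return (first, second, third) image indices, or None if sampling fails."""
    n = len(scores)
    totals = relevance_totals(scores)
    for _ in range(max_tries):
        # First image: likelihood proportional to its pairwise relevance total.
        first = rng.choices(list(range(n)), weights=totals, k=1)[0]
        others = [j for j in range(n) if j != first]
        # Second and third images: likelihood based on the minimum of the
        # selection threshold and the pairwise relevance score.
        capped = [min(threshold, scores[first][j]) for j in others]
        second, third = rng.choices(others, weights=capped, k=2)
        # Keep the draw only if the first image is more relevant to the
        # second image than to the third (the ordering stated in claim 1).
        if scores[first][second] > scores[first][third]:
            return first, second, third
    return None

# Hypothetical relevance scores for a class of four images.
scores = [
    [0.0, 0.9, 0.5, 0.1],
    [0.9, 0.0, 0.4, 0.2],
    [0.5, 0.4, 0.0, 0.3],
    [0.1, 0.2, 0.3, 0.0],
]
triplet = sample_triplet(scores, threshold=0.6, rng=random.Random(0))
```

Capping each score at the threshold keeps a few highly relevant images from dominating the draw. Claim 11 additionally allows the third image to come from a different class, which this sketch does not cover.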

Assignees

Inventors

Classifications

  • Matching criteria, e.g. proximity measures · CPC title

  • G06N20/10 · Primary

    using kernel methods, e.g. support vector machines [SVM] · CPC title

  • based on distances to training or reference patterns · CPC title

  • Combinations of networks · CPC title

  • using colour · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10181091B2 cover?
Methods, systems, and apparatus, for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, a first, second and third representations of the features of the first, second and third images; determining, based on th…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 15, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).