Localized visual similarity

US11809520B1 · US · B1

Patent metadata
Field               Value
Publication number  US-11809520-B1
Application number  US-202117216234-A
Country             US
Kind code           B1
Filing date         Mar 29, 2021
Priority date       Mar 29, 2021
Publication date    Nov 7, 2023
Grant date          Nov 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Devices and techniques are generally described for determining localized visual similarity. In some examples, a selection of a first location of interest on a first image data depicting at least one article of clothing may be received. In some examples, a first machine learning model may generate a feature map representing the first image data. In some examples, a reduced feature map may be generated based at least in part on a mapping of the first location of interest to the feature map. In some examples, a second image depicting at least a second article of clothing may be determined based at least in part on the reduced feature map.
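The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the feature map is a plain nested list standing in for the output of the first machine learning model, the local vector is read out by bilinear interpolation at the selected location, and the "reduced" representation is assumed here to be a simple concatenation of an average-pooled global vector with that local vector. The function names (`bilinear_sample`, `localized_descriptor`, `most_similar`) and the use of cosine similarity for retrieval are illustrative assumptions.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Interpolate a feature vector at fractional feature-map coordinates.

    feature_map[row][col] is a list of channel values; x indexes columns
    and y indexes rows.
    """
    height, width = len(feature_map), len(feature_map[0])
    # Clamp so the four surrounding cells always lie inside the map.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0
    # Four corners of the enclosing rectangle with their bilinear weights.
    corners = [
        (feature_map[y0][x0], (1 - wx) * (1 - wy)),
        (feature_map[y0][x1], wx * (1 - wy)),
        (feature_map[y1][x0], (1 - wx) * wy),
        (feature_map[y1][x1], wx * wy),
    ]
    channels = len(feature_map[0][0])
    return [sum(vec[c] * w for vec, w in corners) for c in range(channels)]


def global_average(feature_map):
    """Average-pool every cell of the map into one global vector."""
    cells = [vec for row in feature_map for vec in row]
    channels = len(cells[0])
    return [sum(v[c] for v in cells) / len(cells) for c in range(channels)]


def to_feature_coord(pixel, image_extent, map_extent):
    """Scale a pixel coordinate to feature-map space (assumes uniform scaling)."""
    if image_extent <= 1:
        return 0.0
    return pixel / (image_extent - 1) * (map_extent - 1)


def localized_descriptor(feature_map, click_xy, image_size):
    """Concatenate the global vector with the local vector at the click.

    The concatenation is an assumption made for illustration only.
    """
    (px, py), (img_w, img_h) = click_xy, image_size
    fx = to_feature_coord(px, img_w, len(feature_map[0]))
    fy = to_feature_coord(py, img_h, len(feature_map))
    return global_average(feature_map) + bilinear_sample(feature_map, fx, fy)


def cosine_similarity(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, catalog):
    """Return the key of the catalog descriptor closest to the query."""
    return max(catalog, key=lambda name: cosine_similarity(query, catalog[name]))
```

In use, a descriptor would be computed for the selected point on the query image and compared against descriptors stored for catalog images, with the best match returned as the second image. A production system would more likely pass the reduced representation through further learned layers before searching.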

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a selection of a first location of interest on a first image data representing at least one object; sending the first image data to a first machine learning model; generating, by the first machine learning model, a feature map representing the first image data, wherein the feature map comprises global representation data representing the at least one object; generating a reduced feature map based at least in part on a mapping of the first location of interest to the feature map, wherein the reduced feature map comprises the global representation data and local representation data representing the first image data at the first location of interest; and determining a second image of a different object based at least in part on the reduced feature map.

2. The method of claim 1, further comprising: determining a first coordinate of the first location of interest in a pixel space of the first image data; determining a portion of the feature map spatially corresponding to the first coordinate; and determining the reduced feature map using the portion of the feature map.

3. The method of claim 2, further comprising: determining four coordinates in the feature map defining a rectangle that correspond to a region surrounding the first location of interest in the pixel space; and generating a feature vector representing the first location of interest by bilinearly interpolating feature vectors of the four coordinates.

4. The method of claim 1, further comprising: determining a grid of pixels surrounding the first location of interest in the first image data, wherein the reduced feature map represents each pixel of the grid of pixels; and determining an embedding representing the reduced feature map using at least convolutional layer.

5. The method of claim 1, further comprising: generating a set of training images that, for a first fashion attribute, comprise: a reference sample that identifies the first fashion attribute; a positive sample that includes the first fashion attribute and that is labeled as a positive match with respect to the reference sample; a first negative sample that is labeled as a negative match with respect to the reference sample; and a second negative sample that is labeled as the negative match with respect to the reference sample.

6. The method of claim 5, wherein: a first margin of loss associated with the first negative sample comprises a first loss value; and a second margin of loss associated with the second negative sample comprises a second loss value, wherein the first loss value is greater than the second loss value.

7. The method of claim 1, further comprising: generating a first embedding using the reduced feature map; searching an embedding space using the first embedding; and determining a second embedding based at least in part on a similarity between the first embedding and the second embedding, wherein the second image is associated with the second embedding.

8. The method of claim 1, further comprising generating training instances used to train the first machine learning model by moving the first location of interest from a first coordinate in a pixel space of the first image data to a second coordinate in the pixel space of the first image data, wherein the second coordinate is less than or equal to a threshold distance from the first coordinate.

9. A system comprising: at least one processor; and non-transitory computer-readable memory storing instructions that, when executed by the at least one processor, are effective to: receive a selection of a first location of interest on a first image data representing at least one object; generate, by a first machine learning model, a feature map representing the first image data, wherein the feature map comprises global representation data representing the at least one object; generate a reduced feature map based at least in part on a mapping of the first location of interest to the feature map, wherein the reduced feature map comprises the global representation data and local representation data representing the first image data at the first location of interest; and determine a second image depicting at least a second object based at least in part on the reduced feature map.

10. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: determine a first coordinate of the first location of interest in a pixel space of the first image data; determine a portion of the feature map spatially corresponding to the first coordinate; and determine the reduced feature map using the portion of the feature map.

11. The system of claim 10, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: determine four coordinates in the feature map defining a rectangle that correspond to a region surrounding the first location of interest in the pixel space; and generate a feature vector representing the first location of interest by bilinearly interpolating feature vectors of the four coordinates.

12. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: determine a grid of pixels surrounding the first location of interest in the first image data, wherein the reduced feature map represents each pixel of the grid of pixels; and determine an embedding representing the reduced feature map using at least convolutional layer.

13. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: generate a set of training images that, for a first fashion attribute, comprise: a reference sample that identifies the first fashion attribute; a positive sample that includes the first fashion attribute and that is labeled as a positive match with respect to the reference sample; a first negative sample that is labeled as a negative match with respect to the reference sample; and a second negative sample that is labeled as the negative match with respect to the reference sample.

14. The system of claim 13, wherein: a first margin of loss associated with the first negative sample comprises a first loss value; and a second margin of loss associated with the second negative sample comprises a second loss value, wherein the first loss value is greater than the second loss value.

15. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: generate a first embedding using the reduced feature map; search an embedding space using the first embedding; and determine a second embedding based at least in part on a similarity between the first embedding and the second embedding, wherein the second image is associated with the second embedding.

16. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to generate training instances used to train the first machine learning model by moving the first location of interest from a first coordinate in a pixel space of the first image data to a second coordinate in the pixel space of the first image data, wherein the second co
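
The training-related claims (a reference, a positive and two negative samples with different loss margins, plus moving the location of interest within a threshold distance) can be sketched as follows. This is one possible reading, not the patent's stated formula: it assumes a hinge-style triplet loss applied once per negative sample with Euclidean distance between embeddings, and it assumes uniform sampling inside a disc for the moved location. The names `two_margin_loss` and `jitter_location` and the default margin values are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def two_margin_loss(reference, positive, negative_1, negative_2,
                    margin_1=0.5, margin_2=0.2):
    """Hinge loss with a separate margin per negative sample.

    Assumption: the first negative is pushed away with the larger margin,
    mirroring the claim that the first loss value exceeds the second.
    """
    if margin_1 <= margin_2:
        raise ValueError("margin_1 must be greater than margin_2")
    d_pos = euclidean(reference, positive)
    loss_1 = max(0.0, d_pos - euclidean(reference, negative_1) + margin_1)
    loss_2 = max(0.0, d_pos - euclidean(reference, negative_2) + margin_2)
    return loss_1 + loss_2


def jitter_location(click_xy, max_distance, image_size, rng=random):
    """Move a click to a nearby point no farther than max_distance away.

    The point is drawn uniformly from a disc around the click and then
    clamped to the image bounds. Clamping cannot increase the distance as
    long as the original click lies inside the image.
    """
    x, y = click_xy
    width, height = image_size
    radius = max_distance * math.sqrt(rng.random())
    angle = 2.0 * math.pi * rng.random()
    new_x = min(max(x + radius * math.cos(angle), 0.0), width - 1.0)
    new_y = min(max(y + radius * math.sin(angle), 0.0), height - 1.0)
    return new_x, new_y
```

Each jittered location could be paired with the same image and label to form an additional training instance, which is one way a model might be made less sensitive to exactly where a user selects.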

Assignees

Amazon Tech Inc

Inventors

Classifications

  • G06F18/214 · Primary

    Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • using ranking · CPC title

  • Browsing; Visualisation therefor · CPC title

  • having vectorial format · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11809520B1 cover?
Devices and techniques are generally described for determining localized visual similarity. In some examples, a selection of a first location of interest on a first image data depicting at least one article of clothing may be received. In some examples, a first machine learning model may generate a feature map representing the first image data. In some examples, a reduced feature map may be gen…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F18/214. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 7, 2023 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).