Systems and methods for image feature extraction
US-2022180476-A1 · Jun 9, 2022 · US
US11809520B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11809520-B1 |
| Application number | US-202117216234-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 29, 2021 |
| Priority date | Mar 29, 2021 |
| Publication date | Nov 7, 2023 |
| Grant date | Nov 7, 2023 |
Official abstract text for this publication.
Devices and techniques are generally described for determining localized visual similarity. In some examples, a selection of a first location of interest on a first image data depicting at least one article of clothing may be received. In some examples, a first machine learning model may generate a feature map representing the first image data. In some examples, a reduced feature map may be generated based at least in part on a mapping of the first location of interest to the feature map. In some examples, a second image depicting at least a second article of clothing may be determined based at least in part on the reduced feature map.
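As a rough illustration of the mapping step described above, the following Python sketch scales a selected pixel location into feature-map coordinates and bilinearly interpolates the four surrounding feature vectors, in the manner recited in claims 2 and 3. The function names (`pixel_to_feature_coords`, `bilinear_sample`), the nested-list feature-map layout, and the corner-aligned scaling are assumptions made for illustration; the publication text shown here does not specify them.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: feature_map[row][col] is a list of channel values.
FeatureMap = Sequence[Sequence[Sequence[float]]]


def pixel_to_feature_coords(
    x_px: float,
    y_px: float,
    image_size: Tuple[int, int],
    map_size: Tuple[int, int],
) -> Tuple[float, float]:
    """Scale a pixel-space point into continuous feature-map coordinates.

    Assumes corner-aligned grids and an image of at least 2x2 pixels.
    Sizes are given as (width, height).
    """
    img_w, img_h = image_size
    map_w, map_h = map_size
    fx = x_px * (map_w - 1) / (img_w - 1)
    fy = y_px * (map_h - 1) / (img_h - 1)
    return fx, fy


def bilinear_sample(feature_map: FeatureMap, fx: float, fy: float) -> List[float]:
    """Interpolate one feature vector from the four cells surrounding (fx, fy)."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp the query point so it always lies inside the feature map.
    fx = min(max(fx, 0.0), float(width - 1))
    fy = min(max(fy, 0.0), float(height - 1))
    # Corners of the rectangle that encloses the query point.
    x0, y0 = int(fx), int(fy)
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)
    tx, ty = fx - x0, fy - y0  # fractional offsets inside the rectangle

    sampled: List[float] = []
    for c in range(len(feature_map[0][0])):
        top = (1 - tx) * feature_map[y0][x0][c] + tx * feature_map[y0][x1][c]
        bottom = (1 - tx) * feature_map[y1][x0][c] + tx * feature_map[y1][x1][c]
        sampled.append((1 - ty) * top + ty * bottom)
    return sampled


# Usage: a 2x2 single-channel feature map for a 5x5 image, sampled at the
# image centre.
fmap = [[[0.0], [2.0]], [[4.0], [6.0]]]
fx, fy = pixel_to_feature_coords(2, 2, image_size=(5, 5), map_size=(2, 2))
vector = bilinear_sample(fmap, fx, fy)
```

Sampling at a continuous coordinate, rather than rounding to the nearest cell, keeps the local feature vector stable when the selected location moves by a few pixels.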
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving a selection of a first location of interest on a first image data representing at least one object; sending the first image data to a first machine learning model; generating, by the first machine learning model, a feature map representing the first image data, wherein the feature map comprises global representation data representing the at least one object; generating a reduced feature map based at least in part on a mapping of the first location of interest to the feature map, wherein the reduced feature map comprises the global representation data and local representation data representing the first image data at the first location of interest; and determining a second image of a different object based at least in part on the reduced feature map.
2. The method of claim 1, further comprising: determining a first coordinate of the first location of interest in a pixel space of the first image data; determining a portion of the feature map spatially corresponding to the first coordinate; and determining the reduced feature map using the portion of the feature map.
3. The method of claim 2, further comprising: determining four coordinates in the feature map defining a rectangle that correspond to a region surrounding the first location of interest in the pixel space; and generating a feature vector representing the first location of interest by bilinearly interpolating feature vectors of the four coordinates.
4. The method of claim 1, further comprising: determining a grid of pixels surrounding the first location of interest in the first image data, wherein the reduced feature map represents each pixel of the grid of pixels; and determining an embedding representing the reduced feature map using at least one convolutional layer.
5. The method of claim 1, further comprising: generating a set of training images that, for a first fashion attribute, comprise: a reference sample that identifies the first fashion attribute; a positive sample that includes the first fashion attribute and that is labeled as a positive match with respect to the reference sample; a first negative sample that is labeled as a negative match with respect to the reference sample; and a second negative sample that is labeled as the negative match with respect to the reference sample.
6. The method of claim 5, wherein: a first margin of loss associated with the first negative sample comprises a first loss value; and a second margin of loss associated with the second negative sample comprises a second loss value, wherein the first loss value is greater than the second loss value.
7. The method of claim 1, further comprising: generating a first embedding using the reduced feature map; searching an embedding space using the first embedding; and determining a second embedding based at least in part on a similarity between the first embedding and the second embedding, wherein the second image is associated with the second embedding.
8. The method of claim 1, further comprising generating training instances used to train the first machine learning model by moving the first location of interest from a first coordinate in a pixel space of the first image data to a second coordinate in the pixel space of the first image data, wherein the second coordinate is less than or equal to a threshold distance from the first coordinate.
9. A system comprising: at least one processor; and non-transitory computer-readable memory storing instructions that, when executed by the at least one processor, are effective to: receive a selection of a first location of interest on a first image data representing at least one object; generate, by a first machine learning model, a feature map representing the first image data, wherein the feature map comprises global representation data representing the at least one object; generate a reduced feature map based at least in part on a mapping of the first location of interest to the feature map, wherein the reduced feature map comprises the global representation data and local representation data representing the first image data at the first location of interest; and determine a second image depicting at least a second object based at least in part on the reduced feature map.
10. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: determine a first coordinate of the first location of interest in a pixel space of the first image data; determine a portion of the feature map spatially corresponding to the first coordinate; and determine the reduced feature map using the portion of the feature map.
11. The system of claim 10, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: determine four coordinates in the feature map defining a rectangle that correspond to a region surrounding the first location of interest in the pixel space; and generate a feature vector representing the first location of interest by bilinearly interpolating feature vectors of the four coordinates.
12. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: determine a grid of pixels surrounding the first location of interest in the first image data, wherein the reduced feature map represents each pixel of the grid of pixels; and determine an embedding representing the reduced feature map using at least one convolutional layer.
13. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: generate a set of training images that, for a first fashion attribute, comprise: a reference sample that identifies the first fashion attribute; a positive sample that includes the first fashion attribute and that is labeled as a positive match with respect to the reference sample; a first negative sample that is labeled as a negative match with respect to the reference sample; and a second negative sample that is labeled as the negative match with respect to the reference sample.
14. The system of claim 13, wherein: a first margin of loss associated with the first negative sample comprises a first loss value; and a second margin of loss associated with the second negative sample comprises a second loss value, wherein the first loss value is greater than the second loss value.
15. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to: generate a first embedding using the reduced feature map; search an embedding space using the first embedding; and determine a second embedding based at least in part on a similarity between the first embedding and the second embedding, wherein the second image is associated with the second embedding.
16. The system of claim 9, the non-transitory computer-readable memory storing further instructions that, when executed by the at least one processor, are further effective to generate training instances used to train the first machine learning model by moving the first location of interest from a first coordinate in a pixel space of the first image data to a second coordinate in the pixel space of the first image data, wherein the second co
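The retrieval step recited in claims 7 and 15 (generate a first embedding, search an embedding space, and select a second embedding by similarity) might be sketched as follows. Cosine similarity, the in-memory dictionary index, and the names `cosine_similarity`, `nearest_images`, and `catalog` are assumptions for illustration; the claims require only "a similarity" and do not name a metric or an index structure.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_images(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the top_k (image_id, similarity) pairs, most similar first.

    `index` is a hypothetical mapping from image identifiers to stored
    embeddings; a brute-force scan stands in for a real search structure.
    """
    scored = [
        (image_id, cosine_similarity(query, embedding))
        for image_id, embedding in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Usage: the query embedding would come from the reduced feature map; the
# catalog entries here are made-up two-dimensional embeddings.
catalog = {
    "image_a": [1.0, 0.0],
    "image_b": [0.0, 1.0],
    "image_c": [1.0, 1.0],
}
matches = nearest_images([0.9, 0.1], catalog, top_k=2)
```

A linear scan is adequate for a small catalog; a larger embedding space would typically call for an approximate nearest-neighbor index, which this sketch does not attempt.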
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
using ranking · CPC title
Browsing; Visualisation therefor · CPC title
having vectorial format · CPC title
Search customisation based on user profiles and personalisation · CPC title