Systems and methods for modeling item similarity using converted image information

US11294971B1 · US · B1

Patent metadata
Publication number: US-11294971-B1
Application number: US-202117157284-A
Country: US
Kind code: B1
Filing date: Jan 25, 2021
Priority date: Jan 25, 2021
Publication date: Apr 5, 2022
Grant date: Apr 5, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for correlating item data are disclosed. A system for correlating item data may include a memory storing instructions and at least one processor configured to execute instructions to perform operations including: receiving text and image data associated with a reference item from a remote device; converting, using a computer-modeled embedding layer, at least one image to an image embedding; comparing the image embedding to reference embeddings stored in a database; selecting a subset of the candidate item text as candidate text data based on the comparison; selecting a subset of the candidate item images as candidate image data based on the comparison; selecting a text correlation model; determining a first similarity score; selecting an image correlation model; determining a second similarity score; calculating a confidence score based on the first and second similarity scores; and performing a responsive action based on the calculated confidence score.
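
The last two steps of the abstract (calculating a confidence score from the two similarity scores, then acting on it) can be sketched as follows. This is a minimal illustration only: the abstract does not say how the scores are combined, so the weighted average, the 0.8 threshold, the action labels, and the function names `confidence_score` and `responsive_action` are all assumptions.

```python
def confidence_score(text_sim: float, image_sim: float, text_weight: float = 0.5) -> float:
    """Combine two similarity scores in [0, 1] into one confidence score.

    Assumption: a weighted average. The patent text only says the confidence
    score is "based on" the first and second similarity scores.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    return text_weight * text_sim + (1.0 - text_weight) * image_sim


def responsive_action(score: float, threshold: float = 0.8) -> str:
    """Map a confidence score to a hypothetical action label.

    At or above the threshold the two items are associated; below it the pair
    is set aside. Both labels and the threshold value are illustrative.
    """
    return "associate_items" if score >= threshold else "flag_for_review"


# Example: a strong text match and a weaker image match.
score = confidence_score(text_sim=1.0, image_sim=0.5)
action = responsive_action(score)
```

With equal weights the example pair scores 0.75, which falls below the assumed threshold, so the sketch would not associate the two items.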

First claim

Opening claim text (preview).

What is claimed is:

1. A system for correlating item data, the system comprising: at least one processor; and a non-transitory computer-readable medium containing a set of instructions that, when executed by the at least one processor, cause the processor to perform steps comprising: receiving text data associated with a reference item from a remote device; receiving image data comprising at least one image associated with the reference item from the remote device; converting, using a computer-modeled embedding layer, the at least one image to an image embedding; comparing the image embedding to reference embeddings stored in a database by determining a subset of the reference embeddings within a Euclidean distance of the image embedding, the stored reference embeddings being associated with pairs of candidate item images and candidate item text of candidate items; selecting, based on the comparison, candidate text data, wherein the candidate text data: is a subset of the candidate item text, and includes a price within a predetermined range; selecting, based on the comparison, candidate image data, the candidate image data being a subset of the candidate item images; selecting a text correlation model; determining a first similarity score by applying the selected text correlation model to the received text data and the subset of the candidate item text; selecting an image correlation model; determining a second similarity score by applying the selected image correlation model to the received image data and the subset of the candidate item images; calculating a confidence score based on the determined first and second similarity scores; and performing a responsive action based on the calculated confidence score, wherein the responsive action comprises creating an association between the reference item and one of the candidate items; wherein at least one of the text correlation model or the image correlation model is selected based on a category of the reference item.

2. The system of claim 1, wherein converting the at least one image to the image embedding comprises removing background pixels from the at least one image.

3. The system of claim 1, wherein converting the at least one image to the image embedding comprises performing at least one of a cropping operation, a re-sizing operation, a brightness alteration operation, a contrast alteration operation, or an interpolation operation on the at least one image.

4. The system of claim 1, wherein the image embedding is a vector containing floating-point values associated with pixel information of the at least one image.

5. The system of claim 4, wherein the image embedding further contains values associated with metadata of the at least one image.

6. The system of claim 4, wherein comparing the image embedding to reference embeddings comprises performing a Euclidean-space nearest-neighbor search.

7. The system of claim 6, the steps further comprising determining that the reference embeddings are associated with an item category of the reference item.

8. The system of claim 7, wherein the compared reference embeddings are part of a Euclidean-space embedding cluster among a plurality of Euclidean-space reference embedding clusters.

9. The system of claim 8, wherein the responsive action further comprises updating at least one of the plurality of Euclidean-space reference embedding clusters by performing at least one of: adding a reference embedding to one of the Euclidean-space reference embedding clusters; removing a reference embedding from one of the Euclidean-space reference embedding clusters; or adjusting a boundary of one of the Euclidean-space reference embedding clusters.

10. The system of claim 1, wherein: the candidate text data includes a set of canonical attributes of the one of the candidate items; and determining the first similarity score comprises: determining a set of reference attributes of the reference item corresponding in part to the set of canonical attributes; and comparing the reference set of attributes to the set of canonical attributes.

11. The system of claim 10, wherein at least one of the canonical and reference attributes corresponds to: a color, a dimension, a model number, a weight, a shape, a scent, a material, a time of production, a multi-part item, or an item feature.

12. The system of claim 1, wherein the steps further comprise: determining whether the calculated confidence score falls below a threshold; and when the calculated confidence score falls below the threshold, determining a differentiation factor indicating a difference between the reference item and at least one of the candidate items; and wherein the responsive action comprises adjusting a parameter of the text model or the image model using the differentiation factor or adding a new parameter to the text model or image model based on the differentiation factor.

13. The system of claim 1, wherein the steps further comprise: determining whether the calculated confidence score is equal to or greater than a threshold; and when the calculated confidence score is equal to or greater than the threshold: creating an association between the reference item and the one of the candidate items; monitoring a webpage associated with the reference item to detect a change at the webpage; and transmitting a notification to a user device upon detecting a change to information associated with the reference item at the webpage.

14. The system of claim 13, wherein the detected change is associated with a price of the reference item at the monitored webpage.

15. The system of claim 1, wherein the selected image correlation model is a random forest model.

16. The system of claim 1, wherein the selected text correlation model contains a text frequency parameter having a weight that is inversely related to a frequency of a character combination in a reference dataset.

17. The system of claim 1, wherein the selected text correlation model is trained to ignore a property of the received text data when determining the first similarity score or the image correlation model is trained to ignore a property of the received image data or the reference image data when determining the second similarity score.

18. The system of claim 17, wherein the ignored property is based on a user input.

19. A computer-implemented method for correlating item data comprising: receiving text data associated with a reference item from a remote device; receiving image data comprising at least one image associated with the reference item from the remote device; converting, using a computer-modeled embedding layer, the at least one image to an image embedding; comparing the image embedding to reference embeddings stored in a database by determining a subset of the reference embeddings within a Euclidean distance of the image embedding, the stored reference embeddings being associated with pairs of candidate item images and candidate item text of candidate items; selecting, based on the comparison, candidate text data, wherein the candidate text data: is a subset of the candidate item text, and includes a price within a predetermined range; selecting, based on the comparison, candidate image data, the candidate image data being a subset of the candidate item images; selecting a text correlation model; determining a first similarity score by applying the selected text correlation model to the received text data and the subset of the candidate item text; selectin
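
The candidate-selection step recited in claims 1 and 19 (keep reference embeddings within a Euclidean distance of the image embedding, then keep candidates whose price lies in a predetermined range) can be sketched roughly as below. This is an illustrative assumption, not the patented implementation: the `Candidate` record, the `select_candidates` helper, the brute-force scan, and the example values are all hypothetical, and a production system would more likely use an indexed nearest-neighbor search.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Candidate:
    item_id: str                # hypothetical identifier for a candidate item
    embedding: Sequence[float]  # reference embedding stored for the candidate
    price: float                # price taken from the candidate item text


def select_candidates(
    query: Sequence[float],
    candidates: List[Candidate],
    max_distance: float,
    price_range: Tuple[float, float],
) -> List[Candidate]:
    """Return candidates within max_distance of query whose price is in range.

    Results are ordered nearest first. Distances are plain Euclidean (L2).
    """
    low, high = price_range
    hits = []
    for cand in candidates:
        dist = math.dist(query, cand.embedding)
        if dist <= max_distance and low <= cand.price <= high:
            hits.append((dist, cand))
    hits.sort(key=lambda pair: pair[0])
    return [cand for _, cand in hits]


# Example with made-up two-dimensional embeddings.
catalog = [
    Candidate("A", [0.0, 0.0], 10.0),
    Candidate("B", [3.0, 4.0], 12.0),  # too far for a radius of 1.0
    Candidate("C", [0.1, 0.0], 99.0),  # close, but price is out of range
]
matches = select_candidates([0.0, 0.0], catalog, max_distance=1.0, price_range=(5.0, 20.0))
```

In this example only candidate "A" passes both filters; "B" is excluded by distance and "C" by price.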

Assignees

Coupang Corp

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

  • Distances to closest patterns, e.g. nearest neighbour classification

  • Clustering techniques

  • Matching criteria, e.g. proximity measures

  • G06F16/583 (Primary): using metadata automatically derived from the content

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11294971B1 cover?
Systems and methods for correlating item data are disclosed. A system for correlating item data may include a memory storing instructions and at least one processor configured to execute instructions to perform operations including: receiving text and image data associated with a reference item from a remote device; converting, using a computer-modeled embedding layer, at least one image to an …
Who is the assignee on this patent?
Coupang Corp
What technology area does this patent fall under?
Primary CPC classification G06F16/583. Mapped technology areas include Physics.
When was this patent published?
Publication date: April 5, 2022 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).