Gender attribute assignment using a multimodal neural graph

US12062081B2 · US · B2

Patent metadata
Publication number: US-12062081-B2
Application number: US-202318103862-A
Country: US
Kind code: B2
Filing date: Jan 31, 2023
Priority date: Jan 31, 2020
Publication date: Aug 13, 2024
Grant date: Aug 13, 2024

Abstract

Official abstract text for this publication.

A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform functions including: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective text embedding; generating a graph of the set of items based on at least co-view data to create pairs of items that are co-viewed by joining respective pairs of items; training the text embedding model and a machine learning model using a neural loss function based on the graph; and automatically determining, using the machine learning model, as trained, a label for each item of the set of items. Other embodiments are disclosed.
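The graph-construction step in the abstract can be illustrated with a minimal sketch. It assumes browsing sessions arrive as lists of item IDs and that a pair is counted once per session in which both items appear; the function name `build_coview_graph`, the sample item IDs, and the once-per-session counting rule are illustrative assumptions, not details taken from the patent.

```python
from collections import Counter
from itertools import combinations


def build_coview_graph(sessions):
    """Return {(item_a, item_b): co_view_count}, with item_a < item_b.

    Hypothetical helper: each edge joins two items viewed in the same
    session, and its weight is the number of sessions containing both.
    """
    edge_weights = Counter()
    for viewed_items in sessions:
        # Deduplicate within a session so a pair counts once per session
        # (an assumption; the source only says "co-view count").
        unique_items = sorted(set(viewed_items))
        for item_a, item_b in combinations(unique_items, 2):
            edge_weights[(item_a, item_b)] += 1
    return dict(edge_weights)


# Made-up sessions for illustration only.
sessions = [
    ["shirt_1", "shirt_2", "dress_9"],
    ["shirt_1", "shirt_2"],
    ["dress_9", "skirt_4"],
]
graph = build_coview_graph(sessions)
```

Storing each edge under a sorted pair keeps the graph undirected, so a co-view of (A, B) and one of (B, A) accumulate on the same edge.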

First claim

Claim text as published (claims 1 to 20).

What is claimed:

1. A system comprising: one or more processors; and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform functions comprising: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective text embedding using a text embedding model for each item of the set of items; generating a graph of the set of items based on at least co-view data to create pairs of items that are co-viewed by joining respective pairs of items that are connected by a set of edges, wherein each pair of items joined by a respective edge of the set of edges in the graph has been viewed together in one or more respective sessions, and the respective edge comprises a respective weight comprising a co-view count of a respective pair of items; training the text embedding model and a machine learning model using a neural loss function based on the graph; and automatically determining, using the machine learning model, as trained, a label for each item of the set of items.

2. The system of claim 1, wherein: the text embedding model is a Bidirectional Encoder Representations from Transformers (“BERT”); and an output from the text embedding model comprises a vector representation.

3. The system of claim 1, wherein the set of edges comprises (a) one or more unlabeled-unlabeled edges, (b) one or more labeled-unlabeled edges, and (c) one or more labeled-labeled edges.

4. The system of claim 1, wherein training the text embedding model and the machine learning model using the neural loss function based on the graph further comprises: training the machine learning model with the neural loss function based on first distances between first text embeddings for first pairs of nodes connected by one or more labeled-labeled edges, second distances between second text embeddings for second pairs of nodes connected by one or more labeled-unlabeled edges, third distances between third text embeddings for third pairs of nodes connected by one or more unlabeled-unlabeled edges, and a softmax loss cost function for fourth text embeddings of nodes of the graph that are labeled.

5. The system of claim 1, wherein the computing instructions, when executed on the one or more processors, further cause the one or more processors to perform a function comprising: determining, based on an image embedding model, as trained, a label for each second item of the set of items that does not meet a predetermined threshold.

6. The system of claim 5, wherein the predetermined threshold is 5.

7. The system of claim 5, wherein the computing instructions, when executed on the one or more processors, further cause the one or more processors to perform a function comprising: transforming an image into a vector representing the image using a residual neural network (“ResNet”).

8. The system of claim 1, wherein the computing instructions when executed on the one or more processors, further cause the one or more processors, to perform a function comprising: training an image embedding model based on images of items from an item catalog database using loss equations to minimize a distance between text representations and image representations for the items.

9. The system of claim 8, wherein the images of the items from depict items of clothing.

10. The system of claim 1, wherein the at least one respective attribute value comprises a gender classification.

11. A method being implemented via execution of computing instructions configured to run on one or more processors and stored at one or more non-transitory computer-readable media, the method comprising: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective text embedding using a text embedding model for each item of the set of items; generating a graph of the set of items based on at least co-view data to create pairs of items that are co-viewed by joining respective pairs of items that are connected by a set of edges, wherein each pair of items joined by a respective edge of the set of edges in the graph has been viewed together in one or more respective sessions, and the respective edge comprises a respective weight comprising a co-view count of a respective pair of items; training the text embedding model and a machine learning model using a neural loss function based on the graph; and automatically determining, using the machine learning model, as trained, a label for each item of the set of items.

12. The method of claim 11, wherein: the text embedding model is a Bidirectional Encoder Representations from Transformers (“BERT”); and an output from the text embedding model comprises a vector representation.

13. The method of claim 11, wherein the set of edges comprises (a) one or more unlabeled-unlabeled edges, (b) one or more labeled-unlabeled edges, and (c) one or more labeled-labeled edges.

14. The method of claim 11, wherein training the text embedding model and the machine learning model using the neural loss function based on the graph further comprises: training the machine learning model with the neural loss function based on first distances between first text embeddings for first pairs of nodes connected by one or more labeled-labeled edges, second distances between second text embeddings for second pairs of nodes connected by one or more labeled-unlabeled edges, third distances between third text embeddings for third pairs of nodes connected by one or more unlabeled-unlabeled edges, and a softmax loss cost function for fourth text embeddings of nodes of the graph that are labeled.

15. The method of claim 11 further comprising: determining, based on an image embedding model, as trained, a label for each second item of the set of items that does not meet a predetermined threshold.

16. The method of claim 15, wherein the predetermined threshold is 5.

17. The method of claim 15 further comprising: transforming an image into a vector representing the image using a residual neural network (“ResNet”).

18. The method of claim 11 further comprising: training an image embedding model based on images of items from an item catalog database using loss equations to minimize a distance between text representations and image representations for the items.

19. The method of claim 18, wherein the images of the items from the item catalog database depict items of clothing.

20. The method of claim 11, wherein the at least one respective attribute value comprises a gender classification.
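Claims 4 and 14 describe a loss that combines a softmax term on labeled nodes with distance terms over the three edge types. The sketch below shows one plausible shape for such a loss. The squared Euclidean distance, the per-edge-type coefficients in `alphas`, the scaling of each distance by the co-view weight, and all function names are assumptions made for illustration; the claims do not specify them.

```python
import math


def softmax_cross_entropy(logits, label_index):
    """Cross-entropy of softmax(logits) against one true class index."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label_index]


def squared_distance(u, v):
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def edge_type(node_a, node_b, labels):
    """Classify an edge as 'LL', 'LU', or 'UU' by how many ends are labeled."""
    labeled_ends = (node_a in labels) + (node_b in labels)
    return ("UU", "LU", "LL")[labeled_ends]


def graph_loss(embeddings, logits, labels, edges, alphas):
    """Supervised softmax loss on labeled nodes plus weighted edge distances."""
    supervised = sum(
        softmax_cross_entropy(logits[node], y) for node, y in labels.items()
    )
    neighbor = 0.0
    for (node_a, node_b), weight in edges.items():
        kind = edge_type(node_a, node_b, labels)
        distance = squared_distance(embeddings[node_a], embeddings[node_b])
        neighbor += alphas[kind] * weight * distance
    return supervised + neighbor


# Made-up toy data: nodes "a" and "b" are labeled, "c" is unlabeled.
embeddings = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 1.0]}
logits = {"a": [0.0, 0.0], "b": [0.0, 0.0]}
labels = {"a": 0, "b": 1}
edges = {("a", "b"): 2, ("b", "c"): 1}
alphas = {"LL": 1.0, "LU": 0.5, "UU": 0.25}
loss = graph_loss(embeddings, logits, labels, edges, alphas)
```

In this toy setup the distance terms pull co-viewed items toward each other in embedding space, so an unlabeled node such as "c" is influenced by its labeled neighbor "b" even though it contributes nothing to the softmax term.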

Assignees

Walmart Apollo LLC

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Activation functions · CPC title

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12062081B2 cover?
A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform functions including: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective t…
Who is the assignee on this patent?
Walmart Apollo LLC
What technology area does this patent fall under?
Primary CPC classification G06Q30/0631. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 13, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).