Gender attribute assignment using a multimodal neural graph

US12062081B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12062081-B2
Application number	US-202318103862-A
Country	US
Kind code	B2
Filing date	Jan 31, 2023
Priority date	Jan 31, 2020
Publication date	Aug 13, 2024
Grant date	Aug 13, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform functions including: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective text embedding; generating a graph of the set of items based on at least co-view data to create pairs of items that are co-viewed by joining respective pairs of items; training the text embedding model and a machine learning model using a neural loss function based on the graph; and automatically determining, using the machine learning model, as trained, a label for each item of the set of items. Other embodiments are disclosed.
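The graph-building step in the abstract can be illustrated with a minimal sketch. It assumes that co-view data arrives as a list of browsing sessions, each a list of item identifiers, and that a pair is counted at most once per session. The function name `build_coview_graph` and the sample item IDs are hypothetical; the patent does not specify a data format or an implementation.

```python
from collections import Counter
from itertools import combinations


def build_coview_graph(sessions):
    """Return {(item_a, item_b): co_view_count}, with item_a < item_b.

    Assumption: a pair viewed several times within one session still
    contributes only 1 to its co-view count.
    """
    edge_weights = Counter()
    for viewed_items in sessions:
        # Deduplicate and sort so each unordered pair has one canonical key.
        unique_items = sorted(set(viewed_items))
        for pair in combinations(unique_items, 2):
            edge_weights[pair] += 1
    return dict(edge_weights)


# Hypothetical sessions, for illustration only.
sessions = [
    ["shirt_1", "shirt_2", "jeans_7"],
    ["shirt_1", "shirt_2"],
    ["jeans_7", "dress_3"],
]
graph = build_coview_graph(sessions)
```

Each key in `graph` is an edge joining two co-viewed items, and each value is that edge's weight, matching the co-view count described in the first claim.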

First claim

Opening claim text (preview).

What is claimed:

1. A system comprising: one or more processors; and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform functions comprising: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective text embedding using a text embedding model for each item of the set of items; generating a graph of the set of items based on at least co-view data to create pairs of items that are co-viewed by joining respective pairs of items that are connected by a set of edges, wherein each pair of items joined by a respective edge of the set of edges in the graph has been viewed together in one or more respective sessions, and the respective edge comprises a respective weight comprising a co-view count of a respective pair of items; training the text embedding model and a machine learning model using a neural loss function based on the graph; and automatically determining, using the machine learning model, as trained, a label for each item of the set of items.

2. The system of claim 1, wherein: the text embedding model is a Bidirectional Encoder Representations from Transformers (“BERT”); and an output from the text embedding model comprises a vector representation.

3. The system of claim 1, wherein the set of edges comprises (a) one or more unlabeled-unlabeled edges, (b) one or more labeled-unlabeled edges, and (c) one or more labeled-labeled edges.

4. The system of claim 1, wherein training the text embedding model and the machine learning model using the neural loss function based on the graph further comprises: training the machine learning model with the neural loss function based on first distances between first text embeddings for first pairs of nodes connected by one or more labeled-labeled edges, second distances between second text embeddings for second pairs of nodes connected by one or more labeled-unlabeled edges, third distances between third text embeddings for third pairs of nodes connected by one or more unlabeled-unlabeled edges, and a softmax loss cost function for fourth text embeddings of nodes of the graph that are labeled.

5. The system of claim 1, wherein the computing instructions, when executed on the one or more processors, further cause the one or more processors to perform a function comprising: determining, based on an image embedding model, as trained, a label for each second item of the set of items that does not meet a predetermined threshold.

6. The system of claim 5, wherein the predetermined threshold is 5.

7. The system of claim 5, wherein the computing instructions, when executed on the one or more processors, further cause the one or more processors to perform a function comprising: transforming an image into a vector representing the image using a residual neural network (“ResNet”).

8. The system of claim 1, wherein the computing instructions when executed on the one or more processors, further cause the one or more processors, to perform a function comprising: training an image embedding model based on images of items from an item catalog database using loss equations to minimize a distance between text representations and image representations for the items.

9. The system of claim 8, wherein the images of the items from depict items of clothing.

10. The system of claim 1, wherein the at least one respective attribute value comprises a gender classification.

11. A method being implemented via execution of computing instructions configured to run on one or more processors and stored at one or more non-transitory computer-readable media, the method comprising: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective text embedding using a text embedding model for each item of the set of items; generating a graph of the set of items based on at least co-view data to create pairs of items that are co-viewed by joining respective pairs of items that are connected by a set of edges, wherein each pair of items joined by a respective edge of the set of edges in the graph has been viewed together in one or more respective sessions, and the respective edge comprises a respective weight comprising a co-view count of a respective pair of items; training the text embedding model and a machine learning model using a neural loss function based on the graph; and automatically determining, using the machine learning model, as trained, a label for each item of the set of items.

12. The method of claim 11, wherein: the text embedding model is a Bidirectional Encoder Representations from Transformers (“BERT”); and an output from the text embedding model comprises a vector representation.

13. The method of claim 11, wherein the set of edges comprises (a) one or more unlabeled-unlabeled edges, (b) one or more labeled-unlabeled edges, and (c) one or more labeled-labeled edges.

14. The method of claim 11, wherein training the text embedding model and the machine learning model using the neural loss function based on the graph further comprises: training the machine learning model with the neural loss function based on first distances between first text embeddings for first pairs of nodes connected by one or more labeled-labeled edges, second distances between second text embeddings for second pairs of nodes connected by one or more labeled-unlabeled edges, third distances between third text embeddings for third pairs of nodes connected by one or more unlabeled-unlabeled edges, and a softmax loss cost function for fourth text embeddings of nodes of the graph that are labeled.

15. The method of claim 11 further comprising: determining, based on an image embedding model, as trained, a label for each second item of the set of items that does not meet a predetermined threshold.

16. The method of claim 15, wherein the predetermined threshold is 5.

17. The method of claim 15 further comprising: transforming an image into a vector representing the image using a residual neural network (“ResNet”).

18. The method of claim 11 further comprising: training an image embedding model based on images of items from an item catalog database using loss equations to minimize a distance between text representations and image representations for the items.

19. The method of claim 18, wherein the images of the items from the item catalog database depict items of clothing.

20. The method of claim 11, wherein the at least one respective attribute value comprises a gender classification.
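Claims 4 and 14 list the terms of the neural loss function without giving a formula. The sketch below shows one way those terms could be combined: a softmax cross-entropy term over labeled nodes plus embedding-distance terms for labeled-labeled, labeled-unlabeled, and unlabeled-unlabeled edges. The squared Euclidean distance, the scaling of each distance by the edge's co-view count, the per-edge-type coefficients in `alphas`, and all function names are assumptions made for illustration, not details taken from the patent.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def softmax_cross_entropy(logits, label_index):
    """Numerically stable softmax cross-entropy for a single labeled node."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label_index]


def graph_regularized_loss(embeddings, logits, labels, edges, alphas):
    """Supervised loss on labeled nodes plus distance penalties on graph edges.

    embeddings: {item: vector}        text embedding per node
    logits:     {item: class scores}  classifier output per labeled node
    labels:     {item: class index}   present only for labeled nodes
    edges:      {(a, b): weight}      co-view count per edge
    alphas:     {"LL", "LU", "UU"}    hypothetical coefficient per edge type
    """
    supervised = sum(
        softmax_cross_entropy(logits[item], y) for item, y in labels.items()
    )
    neighbor = 0.0
    for (a, b), weight in edges.items():
        # 0, 1, or 2 labeled endpoints selects the UU, LU, or LL edge type.
        kind = ("UU", "LU", "LL")[(a in labels) + (b in labels)]
        neighbor += alphas[kind] * weight * squared_distance(
            embeddings[a], embeddings[b]
        )
    return supervised + neighbor


# Toy inputs, for illustration only.
embeddings = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 1.0]}
labels = {"a": 0}
logits = {"a": [0.0, 0.0]}
edges = {("a", "b"): 2, ("b", "c"): 1}
alphas = {"LL": 1.0, "LU": 0.5, "UU": 0.25}
loss = graph_regularized_loss(embeddings, logits, labels, edges, alphas)
```

Minimizing a loss of this shape pulls the embeddings of frequently co-viewed items together, so that label information from labeled nodes can influence their unlabeled neighbors.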

Assignees

Walmart Apollo, LLC

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0895
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
G06N3/048
Activation functions · CPC title
G06F16/9024
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

Patent family

Related publications grouped by family.

View patent family 77062271

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12062081B2 cover?: A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform functions including: receiving a respective item description and at least one respective attribute value for each item of a set of items; generating at least one respective t…
Who is the assignee on this patent?: Walmart Apollo, LLC
What technology area does this patent fall under?: Primary CPC classification G06Q30/0631. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 13, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Multimodal Image Classifier using Textual and Visual Embeddings

Search engine use of neural network regressor for multi-modal item recommendations based on visual semantic embeddings

Systems for modeling uncertainty in multi-modal retrieval and methods thereof

Text field detection using neural networks

Natural language recommendation feedback

Product recommendations based on augmented reality viewpoints