What technology area does this patent fall under?

Primary CPC classification G06Q30/0631. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 24, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Generating digital recommendations utilizing collaborative filtering, reinforcement learning, and inclusive sets of negative feedback

US12586114B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12586114-B2
Application number	US-202117367134-A
Country	US
Kind code	B2
Filing date	Jul 2, 2021
Priority date	Jul 2, 2021
Publication date	Mar 24, 2026
Grant date	Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize collaborative filtering and a reinforcement learning model having an actor-critic framework to provide digital content items across client devices. In particular, in one or more embodiments, the disclosed systems monitor interactions of a client device with one or more digital content items to generate item embeddings (e.g., utilizing a collaborative filtering model). The disclosed systems further utilize a reinforcement learning model to generate a recommendation (e.g., determine one or more additional digital content items to provide to the client device) based on the user interactions. In some implementations, the disclosed systems utilize the reinforcement learning model to analyze every negative and positive interaction observed when generating the recommendation. Further, the disclosed systems utilize the reinforcement learning model to analyze item embeddings, which encode the relationships among the digital content items, when generating the recommendation.
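As a rough illustration of the abstract's point that every observed positive and negative interaction is used, the sketch below splits one session into a positive and a negative interaction map without sampling. It then extends the negative map with the nearest items the client device did not interact with, as the first claim describes. The function names, the toy two-dimensional embeddings, the Euclidean distance, and the choice of `k` are all assumptions made for illustration; the patent does not prescribe them, and in the disclosed systems the embeddings would come from a collaborative filtering model.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_unseen(item_id, embeddings, seen, k):
    """Return the k items outside `seen` that lie closest to `item_id`."""
    anchor = embeddings[item_id]
    candidates = [
        (euclidean(anchor, vec), other)
        for other, vec in embeddings.items()
        if other not in seen
    ]
    return [other for _, other in sorted(candidates)[:k]]


def build_interaction_maps(session, embeddings, k=1):
    """Build (positive_map, negative_map) as lists of (item_id, embedding).

    `session` is a list of (item_id, liked) pairs. Every interaction is
    kept (no negative sampling). The negative map is then extended with
    the k nearest non-interacted neighbours of each negative item.
    """
    seen = {item for item, _ in session}
    positive_ids = [item for item, liked in session if liked]
    negative_ids = [item for item, liked in session if not liked]

    extra = []
    for item in negative_ids:
        for neighbour in nearest_unseen(item, embeddings, seen, k):
            if neighbour not in extra:
                extra.append(neighbour)

    positive_map = [(i, embeddings[i]) for i in positive_ids]
    negative_map = [(i, embeddings[i]) for i in negative_ids + extra]
    return positive_map, negative_map


# Hypothetical toy embeddings; not taken from the patent.
item_embeddings = {
    "a": (0.0, 0.0),
    "b": (0.1, 0.0),
    "c": (1.0, 1.0),
    "d": (0.9, 1.0),
    "e": (5.0, 5.0),
}
session = [("a", False), ("c", True)]  # "a" was skipped, "c" was clicked
pos_map, neg_map = build_interaction_maps(session, item_embeddings, k=1)
print([i for i, _ in pos_map], [i for i, _ in neg_map])
```

Here item "b" enters the negative map although the device never saw it, because it sits next to the skipped item "a" in the embedding space.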

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable medium storing instructions thereon that, when executed by at least one processor, cause the at least one processor to perform operations comprising: generating, for a plurality of digital content items, a set of item embeddings that encode interactions across client devices associated with the plurality of digital content items; monitoring user interactions of a client device with one or more digital content items from the plurality of digital content items during an interaction session; determining, utilizing the set of item embeddings, a negative interaction map and a positive interaction map from the user interactions of the client device during the interaction session by: determining the positive interaction map or the negative interaction map using, from the set of item embeddings, item embeddings that correspond to the user interactions of the client device during the interaction session; and determining at least one of the positive interaction map or the negative interaction map using, from the set of item embeddings, additional item embeddings selected based on distances within an item embedding space between the additional item embeddings and the item embeddings that correspond to the user interactions, the additional item embeddings corresponding to digital content items with which the client device did not interact during the interaction session; determining, utilizing a reinforcement learning model, one or more additional digital content items from the plurality of digital content items to provide for display by: generating, utilizing a first convolutional gated recurrent unit neural network layer of the reinforcement learning model, a negative state for the client device based on the negative interaction map; generating, utilizing a second convolutional gated recurrent unit neural network layer of the reinforcement learning model, a positive state for the client device based on the positive interaction map; and generating a recommendation for the one or more additional digital content items based on the set of item embeddings, the negative state, and the positive state; determining, using one or more rectified linear interaction neural network layers of the reinforcement learning model, a value function indicating a measure of quality of the one or more additional digital content items; and modifying parameters of the reinforcement learning model using the value function.

2. The non-transitory computer-readable medium of claim 1, wherein: determining the positive interaction map or the negative interaction map using the item embeddings that correspond to the user interactions comprises determining, for the negative interaction map and from the set of item embeddings, one or more negative item embeddings by determining an item embedding for each negative interaction from the user interactions; and determining at least one of the positive interaction map or the negative interaction map using the additional item embeddings selected based on the distances within the item embedding space comprises determining, for the negative interaction map and from the set of item embeddings, one or more additional negative item embeddings based on a proximity to the one or more negative item embeddings within the item embedding space.

3. The non-transitory computer-readable medium of claim 1, wherein modifying the parameters of the reinforcement learning model using the value function comprises: modifying, using the value function, a first set of parameters for the first convolutional gated recurrent unit neural network layer; and modifying, using the value function, a second set of parameters for the second convolutional gated recurrent unit neural network layer.

4. The non-transitory computer-readable medium of claim 3, wherein: modifying, using the value function, the first set of parameters for the first convolutional gated recurrent unit neural network layer comprises back propagating the value function to the first convolutional gated recurrent unit neural network layer; and modifying, using the value function, the second set of parameters for the second convolutional gated recurrent unit neural network layer comprises back propagating the value function to the second convolutional gated recurrent unit neural network layer.

5. The non-transitory computer-readable medium of claim 1, wherein determining, utilizing the reinforcement learning model, the one or more additional digital content items based on the negative state, the positive state, and the set of item embeddings comprises: generating a first similarity metric between the positive state and the set of item embeddings; generating a second similarity metric between the negative state and the set of item embeddings; and determining the one or more additional digital content items utilizing the first similarity metric and the second similarity metric.

6. The non-transitory computer-readable medium of claim 1, wherein generating the set of item embeddings for the plurality of digital content items comprises generating the set of item embeddings via collaborative filtering or graph embedding to encode the interactions across the client devices associated with the plurality of digital content items.

7. The non-transitory computer-readable medium of claim 1, wherein: determining the positive interaction map or the negative interaction map using the item embeddings that correspond to the user interactions comprises determining, for the negative interaction map and from the set of item embeddings, one or more negative item embeddings by determining an item embedding for each negative interaction from the user interactions; and determining at least one of the positive interaction map or the negative interaction map using the additional item embeddings selected based on the distances within the item embedding space comprises determining, for the positive interaction map and from the set of item embeddings, one or more positive item embeddings based on a distance from the one or more negative item embeddings within the item embedding space.

8. The non-transitory computer-readable medium of claim 1, wherein determining the negative interaction map from the user interactions of the client device during the interaction session comprises determining the negative interaction map without sampling a subset of negative interactions from the user interactions.

9. The non-transitory computer-readable medium of claim 1, wherein: generating, utilizing the first convolutional gated recurrent unit neural network layer, the negative state for the client device based on the negative interaction map comprises generating, using the first convolutional gated recurrent unit neural network layer having a set of shared weights, the negative state for the client device based on the negative interaction map and a previous negative state corresponding to a previous interaction session of the client device; and generating, utilizing the second convolutional gated recurrent unit neural network layer, the positive state for the client device based on the positive interaction map comprises generating, using the second convolutional gated recurrent unit neural network layer having the set of shared weights, the positive state for the client device based on the positive interaction map and a previous positive state corresponding to the previous interaction session of the client device.

10. The non-transitory computer-readable medium of claim 9, wherein determining at least one of the positive interaction map or the negative interaction map using the additional item embeddings corr
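Claim 5 recites two similarity metrics, one between the positive state and the item embeddings and one between the negative state and the item embeddings, and leaves open how they are combined. The sketch below is one hedged reading of it. It assumes cosine similarity for both metrics and ranks items by the difference between the two, which is an illustrative choice and not something the claim specifies. The states are plain hypothetical vectors here; in the claimed system they would be produced by the convolutional gated recurrent unit layers.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(positive_state, negative_state, embeddings, top_n=1):
    """Rank items by (similarity to positive state) - (similarity to negative state).

    The subtraction is an assumed way to combine the two metrics.
    """
    scores = {}
    for item_id, vec in embeddings.items():
        first_metric = cosine(positive_state, vec)   # positive state vs. item
        second_metric = cosine(negative_state, vec)  # negative state vs. item
        scores[item_id] = first_metric - second_metric
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy embeddings and states; not taken from the patent.
item_embeddings = {"x": (1.0, 0.0), "y": (0.0, 1.0), "z": (-1.0, 0.0)}
positive_state = (1.0, 0.2)
negative_state = (0.0, 1.0)
ranked = recommend(positive_state, negative_state, item_embeddings, top_n=3)
print(ranked)
```

Item "x" ranks first because it points the same way as the positive state and is orthogonal to the negative state.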

Assignees

Adobe Inc

Inventors

Classifications

G06Q30/0202: Market predictions or forecasting for commercial activities
G06N3/088: Non-supervised learning, e.g. competitive learning
G06N3/044: Recurrent networks, e.g. Hopfield networks
G06F18/2178: based on feedback of a supervisor
G06Q30/0282: Rating or review of business operators or products

Patent family

Related publications grouped by family.

View patent family 84492970

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12586114B2 cover?: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize collaborative filtering and a reinforcement learning model having an actor-critic framework to provide digital content items across client devices. In particular, in one or more embodiments, the disclosed systems monitor interactions of a client device with one or more digital content ite…
Who is the assignee on this patent?: Adobe Inc

Related patents

Recommendation with neighbor-aware hyperbolic embedding

Systems and methods for generating music recommendations

Learning to schedule control fragments for physics-based character simulation and robots using deep Q-learning

Learning to schedule control fragments for physics-based character simulation and robots using deep q-learning

Convolutional gated recurrent neural networks
