Diversity aware media content recommendation

US2022012565A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022012565-A1
Application number	US-202117320439-A
Country	US
Kind code	A1
Filing date	May 14, 2021
Priority date	May 15, 2020
Publication date	Jan 13, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A reinforcement learning ranker can take into account previously-recommended media content items to produce a ranked list of media content items to recommend next. The ranker finds a policy that gives the probability of sampling a media content item given a state. The policy is learned such that it maximizes a reward. A reward function associated with the media content item can be defined with respect to whether the user finds the media content item relevant (likelihood that the user will like the media content item) and a diversity score of the media content item.
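The policy described in the abstract can be illustrated with a minimal sketch. It assumes a softmax policy over per-item scores and builds the ranked list by sampling items without replacement. The softmax form, the function names, and the toy scores are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def softmax_policy(scores):
    """Turn raw item scores into sampling probabilities (assumed softmax form)."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {item: math.exp(s - peak) for item, s in scores.items()}
    total = sum(exps.values())
    return {item: e / total for item, e in exps.items()}


def sample_ranked_list(scores, k, rng):
    """Sample up to k items without replacement; sampling order is the ranking."""
    remaining = dict(scores)
    ranked = []
    for _ in range(min(k, len(remaining))):
        probs = softmax_policy(remaining)
        items = list(probs)
        choice = rng.choices(items, weights=[probs[i] for i in items], k=1)[0]
        ranked.append(choice)
        # A learned ranker would re-score here, because the state now
        # includes the item just recommended; this sketch keeps scores fixed.
        del remaining[choice]
    return ranked


# Hypothetical scores for a single state (not data from the patent).
toy_scores = {"track_a": 2.0, "track_b": 1.0, "track_c": 0.1}
ranking = sample_ranked_list(toy_scores, k=3, rng=random.Random(0))
```

Sampling from the policy, rather than always taking the top score, is one common way to let a ranker explore less obvious items while it learns which choices maximize the reward.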

First claim

Opening claim text (preview).

What is claimed is:

1. A method for selecting a media content item, the method comprising: obtaining data describing feedback from previous content consumption sessions of a user account; obtaining data regarding media content items previously recommended during a current content consumption session of the user account; generating a score for a potential media content item with a reinforcement learning model based on: the data regarding media content items previously recommended during the current content consumption session of the user account; and the data describing feedback from the previous playback sessions of the user account; and selecting, for the user account, the potential media content item based on the score, wherein the reinforcement learning model applies a reward function that takes into account relevance and diversity.

2. The method of claim 1, wherein the potential media content item is a potential track.

3. The method of claim 1, wherein the data describing feedback from previous playback sessions of the user account comprises a feedback aware embedding.

4. The method of claim 3, further comprising calculating the feedback aware embedding with a feedback aware embedder based on a meta feature, a media content item, and a dynamic user embedding.

5. The method of claim 4, further comprising calculating the dynamic user embedding with a dynamic user embedder based on representations of prior sessions.

6. The method of claim 1, wherein generating the score for the potential media content item includes applying a stacked LSTM initialed based on a session meta feature.

7. The method of claim 1, wherein the reward function includes the calculation: R(t, s) = r(t, u) − c + α d(t, u) × r(t, u), where R(t, s) is a reward function for a given media content item t and session s; where r(t, u) is a reward function for the given media content item t and user u; where c is a value configured to ensure a negative reward for non-relevant media content items; where α is a weighting parameter; and where d(t, u) is a diversity function for a given media content item t and user u.

8. The method of claim 1, further comprising: calculating the diversity of the potential media content item based on a popularity of the potential media content item.

9. The method of claim 1, further comprising: calculating the diversity of the potential media content item based on a similarity of the potential media content item to other media content items played by the user account.

10. A non-transitory computer-readable medium having instructions stored thereon that, when executed by one or more processors, cause the one or more processors to: obtain data describing feedback from previous content consumption sessions of a user account; obtain data regarding media content items previously recommended during a current content consumption session of the user account; generate a score for a potential media content item with a reinforcement learning model based on: the data regarding media content items previously recommended during the current content consumption session of the user account; and the data describing feedback from the previous playback sessions of the user account; and select, for the user account, the potential media content item based on the score, wherein the reinforcement learning model applies a reward function that takes into account relevance and diversity.

11. The non-transitory computer-readable medium of claim 10, wherein the data describing feedback from previous playback sessions of the user account comprises a feedback aware embedding.

12. The non-transitory computer-readable medium of claim 11, wherein the instructions further cause the one or more processors to calculate the feedback aware embedding with a feedback aware embedder based on a meta feature, a media content item, and a dynamic user embedding.

13. The non-transitory computer-readable medium of claim 12, wherein the instructions further cause the one or more processors to calculate the dynamic user embedding with a dynamic user embedder based on representations of prior sessions.

14. The non-transitory computer-readable medium of claim 10, wherein to generate the score for the potential media content item includes to apply a stacked LSTM initialed based on a session meta feature.

15. The non-transitory computer-readable medium of claim 10, wherein the reward function includes the calculation: R(t, s) = r(t, u) − c + α d(t, u) × r(t, u), where R(t, s) is a reward function for a given media content item t and session s; where r(t, u) is a reward function for the given media content item t and user u; where c is a value configured to ensure a negative reward for non-relevant media content items; where α is a weighting parameter; and where d(t, u) is a diversity function for a given media content item t and user u.

16. The non-transitory computer-readable medium of claim 10, wherein the instructions further cause the one or more processors to calculate the diversity of the potential media content item based on a popularity of the potential media content item or based on a similarity of the potential media content item to other media content items played by the user account.

17. A system comprising: a media-playback device; and a media-delivery system configured to: obtain data describing feedback from previous content consumption sessions of a user account; obtain data regarding media content items previously recommended during a current content consumption session of the user account; generate a score for a potential media content item with a reinforcement learning model based on: the data regarding media content items previously recommended during the current content consumption session of the user account; and the data describing feedback from the previous playback sessions of the user account; and select, for the user account, the potential media content item based on the score; and transmit the selected media content item to the media-playback device for playback, wherein the reinforcement learning model applies a reward function that takes into account relevance and diversity.

18. The system of claim 17, wherein to generate the score for the potential media content item includes to apply a stacked LSTM initialed based on a session meta feature.

19. The system of claim 17, wherein the reward function includes the calculation: R(t, s) = r(t, u) − c + α d(t, u) × r(t, u), where R(t, s) is a reward function for a given media content item t and session s; where r(t, u) is a reward function for the given media content item t and user u; where c is a value configured to ensure a negative reward for non-relevant media content items; where α is a weighting parameter; and where d(t, u) is a diversity function for a given media content item t and user u.

20. The system of claim 17, wherein the media-delivery system is further configured to: calculate the diversity of the potential media content item based on a popularity of the potential media content item or based on a similarity of the potential media content item to other media content items played by the user account.
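The reward calculation recited in claims 7, 15, and 19 can be sketched as follows, together with two possible diversity functions in the spirit of claims 8, 9, 16, and 20. The claims only say that diversity is "based on" popularity or similarity, so the specific forms below (one minus popularity, and one minus mean cosine similarity), the default parameter values, and all function names are assumptions for illustration.

```python
import math


def reward(relevance, diversity, c=0.5, alpha=1.0):
    """R(t, s) = r(t, u) - c + alpha * d(t, u) * r(t, u), as written in the claims."""
    return relevance - c + alpha * diversity * relevance


def popularity_diversity(popularity):
    """Assumed form: less popular items are more diverse (popularity in [0, 1])."""
    return 1.0 - popularity


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_diversity(candidate, played):
    """Assumed form: 1 minus the mean cosine similarity to items already played."""
    if not played:
        return 1.0
    mean_sim = sum(cosine(candidate, p) for p in played) / len(played)
    return 1.0 - mean_sim


# Toy values (hypothetical): a relevant item of medium popularity,
# and a non-relevant item with the same diversity score.
relevant = reward(1.0, popularity_diversity(0.5), c=0.5, alpha=2.0)
non_relevant = reward(0.0, 0.5, c=0.5, alpha=2.0)
```

Because the diversity term is multiplied by r(t, u), diversity adds to the reward only when the item is relevant. With r(t, u) = 0 the reward reduces to −c, which is how c keeps the reward negative for non-relevant items.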

Assignees

Spotify AB

Inventors

Classifications

G06N3/044 · Primary
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/042 · Primary
Knowledge-based neural networks; Logical representations of neural networks · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/092
Reinforcement learning · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 79172770

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022012565A1 cover?
A reinforcement learning ranker can take into account previously-recommended media content items to produce a ranked list of media content items to recommend next. The ranker finds a policy that gives the probability of sampling a media content item given a state. The policy is learned such that it maximizes a reward. A reward function associated with the media content item can be defined with …
Who is the assignee on this patent?
Spotify AB
What technology area does this patent fall under?
Primary CPC classification G06N3/044. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 13, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

System and method for a recommender

Automatically predicting relevant contexts for media items

Adapting a sequence model for use in predicting future device interactions with a computing system

Global Vector Recommendations Based on Implicit Interaction and Profile Data

Automatically Predicting Relevant Contexts For Media Items
