Generating and providing personalized digital content in real time based on live user context
US-2020288204-A1 · Sep 10, 2020 · US
US11367120B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11367120-B2 |
| Application number | US-202016834815-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 30, 2020 |
| Priority date | Nov 8, 2019 |
| Publication date | Jun 21, 2022 |
| Grant date | Jun 21, 2022 |
Official abstract text for this publication.
Business goals may be achieved using adaptive rewarding for the personalization of contents. In response to receiving user information, personalized contents for the user can be recommended using a reinforcement learning algorithm. In response to presenting the personalized content to the user, an action by the user selecting a particular content may be received. A reward value can be calculated for the action based on a reward function. The reward function can be based, at least in part, upon the action, the selected content, and/or the user. The reward function can be based upon one or more business goals, such as user engagement, monetization, and/or security. The calculated reward value can be provided to the reinforcement learning algorithm, which can be adapted based upon the reward value for future selection of personalized contents.
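The loop the abstract describes (recommend, observe the user's selection, compute a reward, feed it back) can be sketched in a few lines. This is a minimal illustration, not the patented method: it swaps in a plain epsilon-greedy bandit for the unspecified reinforcement learning algorithm, ignores user context, and treats the reward as a number supplied by the caller. The class name `EpsilonGreedyRecommender`, its methods, and the `epsilon` and `seed` parameters are all hypothetical.

```python
import random


class EpsilonGreedyRecommender:
    """Toy stand-in (an assumption) for the reinforcement learning algorithm."""

    def __init__(self, content_ids, epsilon=0.1, seed=0):
        self.epsilon = epsilon                      # probability of exploring at random
        self.rng = random.Random(seed)
        self.value = {c: 0.0 for c in content_ids}  # running mean reward per content
        self.count = {c: 0 for c in content_ids}    # number of rewards seen per content

    def recommend(self, k):
        """Return k content ids: usually the highest-valued, occasionally random."""
        if self.rng.random() < self.epsilon:
            return self.rng.sample(list(self.value), k)
        return sorted(self.value, key=self.value.get, reverse=True)[:k]

    def update(self, content_id, reward):
        """Fold the reward for a selected content into its running mean."""
        self.count[content_id] += 1
        n = self.count[content_id]
        self.value[content_id] += (reward - self.value[content_id]) / n
```

A caller would present `recommend(k)` to the user, compute a reward for whatever the user selects, and pass it to `update`, so that contents earning higher rewards are recommended more often in later rounds.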
Opening claim text (preview).
The invention claimed is:
1. A computer-readable storage medium storing instructions which, when executed by a hardware processor, cause the hardware processor to: receive content information regarding available contents; receive user information regarding a user; input content vectors based on the content information and a user vector based on the user information to a reinforcement learning model; recommend a set of personalized contents for the user from the available contents, the set of personalized contents being output by the reinforcement learning model; receive a user action in response to a presentation of the set of personalized contents to the user; train the reinforcement learning model by: calculating a reward value for the user action based on a reward function that includes a monetization term and an engagement term, the monetization term including a monetization tuning parameter that is manually set as a weight for targeting a monetization business goal, the engagement term including an engagement tuning parameter that is manually set as a weight for targeting an engagement business goal; and adapting the reinforcement learning model using the reward value to increase a probability of future occurrences of the user action that help achieve the monetization business goal and the engagement business goal.
2. The computer-readable storage medium of claim 1, wherein the instructions further cause the hardware processor to: generate the content vectors associated with the available contents based on the content information; and generate the user vector associated with the user based on the user information, wherein the set of personalized contents is selected by the reinforcement learning model based on the content vectors and the user vector.
3. The computer-readable storage medium of claim 1, wherein the reinforcement learning model selects the set of personalized contents that maximize the reward value.
4. The computer-readable storage medium of claim 1, wherein the reinforcement learning model selects the set of personalized contents based at least on randomness.
5. The computer-readable storage medium of claim 1, wherein the reinforcement learning model uses a contextual bandit algorithm to select the set of personalized contents.
6. A system, comprising: a hardware processor; and storage having instructions which, when executed by the hardware processor, cause the hardware processor to: receive game information regarding available games; generate game vectors associated with the available games based on the game information; receive user information regarding a user; generate a user vector associated with the user based on the user information; input the game vectors and the user vector to a machine learning model; recommend a personalized set of games for the user from the available games, the personalized set of games being output by the machine learning model; receive a user action associated with a selected game from the personalized set of games; calculate a reward value for the user action using a reward function that includes terms associated with business goals, the terms having tuning parameters that are manually set as weights for targeting the associated business goals; and train the machine learning model using the reward value as feedback to improve future recommendations of the available games that promote the business goals.
7. The system of claim 6, wherein the machine learning model uses a reinforcement learning algorithm to select the personalized set of games.
8. The system of claim 6, wherein the personalized set of games includes ranking.
9. The system of claim 8, wherein the personalized set of games is displayed using heterogenous sizes that depend on the ranking of the personalized set of games.
10. The system of claim 8, wherein the personalized set of games is displaying using heterogenous levels of interaction that depends on the ranking of the personalized set of games.
11. A method, comprising: receiving user information about a user; inputting a user vector based on the user information to a reinforcement learning model; recommending personalized contents for the user from available contents, the personalized contents being output by the reinforcement learning model; receiving an action relating to a selected content from the personalized contents; calculating a reward value for the action by using a reward function that includes terms associated with goals and tuning parameters associated with the terms, the tuning parameters being manually set as weights for targeting the goals; and training the reinforcement learning model using the reward value as feedback to select future personalized contents that further the goals.
12. The method of claim 11, further comprising: monitoring content features associated with the available contents including the selected content; and automatically adjusting the reward function based on a particular monitored content feature associated with the selected content.
13. The method of claim 12, wherein the reward function is based on an average of a particular monitored feature associated with the available contents.
14. The method of claim 11, further comprising: monitoring user features associated with the user; and automatically adjusting the reward function based on a particular monitored user feature associated with the user.
15. The method of claim 11, wherein the goals include one or more of monetization, engagement, inclusiveness, safety, or toxicity.
16. The method of claim 11, wherein the tuning parameters are automatically adjusted based on time using a machine learning model.
17. The method of claim 11, wherein the reward function is based on a probability of the user who performed the action will perform a subsequent action.
18. The method of claim 17, wherein the subsequent action includes one or more of purchasing the selected content, purchasing another content, or playing the selected content.
19. The method of claim 11, wherein training the reinforcement learning model includes adapting the reinforcement learning model to increase an occurrence of the action.
20. The method of claim 11, wherein the reward function includes one or more of: an estimated value of the action for a particular content, a probability of the action converting to a particular goal, a utility of the particular goal for the particular content, or an average utility of the particular goal for the available contents.
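Claims 1, 6, and 11 describe a reward function built from per-goal terms, each carrying a manually set tuning parameter that acts as a weight. The claims do not give a formula, so the sketch below assumes the simplest reading: a linear weighted sum of a monetization signal and an engagement signal. The names `RewardWeights`, `reward`, and the `"monetization"` and `"engagement"` dictionary keys are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Manually set tuning parameters, one per business goal (hypothetical names)."""
    monetization: float
    engagement: float


def reward(action_signals, weights):
    """Weighted sum of the per-goal signals observed for one user action.

    The linear form is an assumption; missing signals count as 0.0.
    """
    return (weights.monetization * action_signals.get("monetization", 0.0)
            + weights.engagement * action_signals.get("engagement", 0.0))
```

Under this reading, raising one weight relative to the other shifts which user actions earn the largest rewards, and therefore which contents the model learns to favor.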
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Recommending goods or services · CPC title
Filtering based on additional data, e.g. user or group profiles · CPC title
using advertising information · CPC title
using advertisements · CPC title