US11314529B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11314529-B2 |
| Application number | US-202016748452-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 21, 2020 |
| Priority date | Jan 21, 2020 |
| Publication date | Apr 26, 2022 |
| Grant date | Apr 26, 2022 |
Official abstract text for this publication.
A system and method for content selection and presentation is disclosed. A plurality of content elements configured for presentation in at least one content container is received and one of the plurality of content elements is selected for presentation in the at least one content container. The one of the plurality of content elements is selected by a trained selection model based on an optimal impression allocation. An interface is generated that includes the selected one of the plurality of content elements.
Opening claim text (preview).
What is claimed is:

1. A system for content selection and presentation, comprising: a memory having instructions stored thereon, and a processor configured to read the instructions to: receive a plurality of content elements configured for presentation in at least one content container; select one of the plurality of content elements for presentation in the at least one content container, wherein the one of the plurality of content elements is selected by a trained selection model based on an optimal impression allocation, wherein the optimal impression allocation is selected using testing data used to compare calculated reward values, wherein the optimal impression allocation is configured to balance a short-term reward value and a long-term reward value of each of the plurality of content elements, wherein the short-term reward value indicates immediate rewards, and wherein the long-term reward value indicates a user return rate and is calculated as a sum of discounted short term rewards; and generate an interface including the one of the plurality of content elements selected for presentation.

2. The system of claim 1, wherein the long-term reward value is determined by a Markov Decision Process (S,C,P,R,γ), where S represents a state space, C represents a content space, P represents a transition function, and R represents the immediate reward function.

3. The system of claim 1, wherein the short-term reward value is determined based on Thompson sampling of a posterior distribution reward function.

4. The system of claim 1, wherein the optimal impression allocation includes an estimated impression allocation generated according to an equation:

$$\sum_{C_i} \hat{I}_{S_i, C_i} \times \hat{R}_{(i,\mathrm{Test})}(S_i, C_i)$$

where C_i is the content element, I is an impression value, S_i is a state, and R is a reward function.

5. The system of claim 4, where the impression value I is calculated as:

$$\hat{I}_{S_i, C_i} = \frac{w\left(\hat{R}_{(i,\mathrm{Train})}(S_i, C_i)\right)}{\sum_{C_j} w\left(\hat{R}_{(i,\mathrm{Train})}(S_i, C_j)\right)}$$

6. The system of claim 1, wherein the trained selection model includes a plurality of impression allocations, and wherein the optimal impression allocation is selected from the plurality of impression allocations based on one or more predetermined selection criteria.

7. The system of claim 1, wherein the long-term reward value is determined by a Markov Decision Process (S,C,P,R,γ), where S represents a state space, C represents a content space, P represents a transition function, and R represents the immediate reward function and the short-term reward value is determined based on Thompson sampling of a posterior distribution reward function.

8. A non-transitory computer readable medium having instructions stored thereon, wherein the instructions, when executed by a processor cause a device to perform operations comprising: receiving a plurality of content elements configured for presentation in at least one content container; selecting one of the plurality of content elements for presentation in the at least one content container, wherein the one of the plurality of content elements is selected by a trained selection model based on an optimal impression allocation, wherein the optimal impression allocation is configured to balance a short-term reward value and a long-term reward value of each of the plurality of content elements, wher
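
The quantities recited in claims 1 and 3 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims do not specify the weighting function w, the posterior family, or any data layout, so the following are assumptions made purely for illustration: w defaults to `math.exp` (a softmax-style weighting), the Thompson sampling step uses a Beta posterior over a click-like reward, and all function names, content identifiers, and reward numbers are hypothetical.

```python
import math
import random
from typing import Callable, Dict, Sequence, Tuple


def discounted_return(short_term_rewards: Sequence[float], gamma: float) -> float:
    """Long-term value as a sum of discounted short-term rewards (claim 1)."""
    return sum((gamma ** t) * r for t, r in enumerate(short_term_rewards))


def thompson_select(posteriors: Dict[str, Tuple[float, float]],
                    rng: random.Random) -> str:
    """Thompson sampling (claim 3), ASSUMING a Beta(alpha, beta) posterior
    per content element: draw one sample each and pick the largest draw."""
    draws = {c: rng.betavariate(a, b) for c, (a, b) in posteriors.items()}
    return max(draws, key=draws.get)


def impression_allocation(train_rewards: Dict[str, float],
                          w: Callable[[float], float] = math.exp) -> Dict[str, float]:
    """Impression share per content element for one state (claim 5):
    w(R_train(S, C)) normalized over all candidate content elements.
    The default w = exp is an assumption; the claim leaves w unspecified."""
    weights = {c: w(r) for c, r in train_rewards.items()}
    total = sum(weights.values())
    return {c: wt / total for c, wt in weights.items()}


def allocation_score(allocation: Dict[str, float],
                     test_rewards: Dict[str, float]) -> float:
    """Estimated value of an allocation on test data (claim 4):
    sum over content of impression share times test reward."""
    return sum(allocation[c] * test_rewards[c] for c in allocation)


# Hypothetical estimated rewards for two content elements in one state.
train = {"banner_a": 0.10, "banner_b": 0.30}
test = {"banner_a": 0.20, "banner_b": 0.40}

# Two hypothetical weighting functions yield two candidate allocations;
# the one with the higher test-data score is kept.
candidates = {"linear": lambda r: r, "softmax": math.exp}
scores = {
    name: allocation_score(impression_allocation(train, w), test)
    for name, w in candidates.items()
}
best = max(scores, key=scores.get)
```

Comparing `scores` across weighting functions mirrors the claim language about selecting the optimal impression allocation by comparing reward values calculated on testing data. A real system would aggregate this score over many states rather than a single one.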