US12222937B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12222937-B2 |
| Application number | US-202217668358-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 9, 2022 |
| Priority date | Feb 9, 2022 |
| Publication date | Feb 11, 2025 |
| Grant date | Feb 11, 2025 |
Official abstract text for this publication.
An online concierge system maintains various items and an item embedding for each item. When the online concierge system receives a query for retrieving one or more items, the online concierge system generates an embedding for the query. The online concierge system trains a machine-learned model to determine a measure of relevance of an embedding for a query to item embeddings by generating training data of examples including queries and items with which users performed a specific interaction. The online concierge system generates a subset of the training data including examples satisfying one or more criteria and further trains the machine-learned model by application to the examples of the subset of the training data and stores parameters resulting from the further training as parameters of the machine-learned model.
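As a rough illustration of the two-stage training the abstract describes, the sketch below trains a toy relevance model on a noisier set of examples, stores the resulting parameters, and then continues training from those stored parameters on a smaller high-quality subset. Everything in it is an assumption made for illustration: the two-parameter sigmoid scorer over a dot product of fixed embeddings merely stands in for the multi-layer network, and the names `relevance` and `train`, the learning rate, the loss targets, and the toy embeddings do not come from the patent.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def relevance(params, query_emb, item_emb):
    """Predicted relevance in (0, 1): sigmoid(scale * <query, item> + bias)."""
    scale, bias = params
    z = scale * dot(query_emb, item_emb) + bias
    z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-z))

def train(params, examples, lr=0.5, max_epochs=500, loss_target=0.1):
    """Per-example gradient descent on log loss, starting from `params`.

    Stops once the mean loss accumulated over an epoch is at or below
    `loss_target` (or after `max_epochs`) and returns the updated
    parameters as a new list.
    """
    scale, bias = params
    for _ in range(max_epochs):
        total_loss = 0.0
        for query_emb, item_emb, label in examples:
            p = relevance([scale, bias], query_emb, item_emb)
            total_loss -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
            error = p - label  # gradient of the log loss w.r.t. the logit
            scale -= lr * error * dot(query_emb, item_emb)
            bias -= lr * error
        if total_loss / len(examples) <= loss_target:
            break
    return [scale, bias]

# Hypothetical 2-d embeddings for two queries and two items.
milk_q, bread_q = [1.0, 0.0], [0.0, 1.0]
milk_item, bread_item = [0.9, 0.1], [0.1, 0.9]

# Toy high-quality subset: labels agree with the embeddings.
high_quality = [
    (milk_q, milk_item, 1), (milk_q, bread_item, 0),
    (bread_q, bread_item, 1), (bread_q, milk_item, 0),
]
# Toy noisy subset: the same kind of examples plus one contradictory label.
noisy = high_quality + [(milk_q, bread_item, 1)]

# Stage 1: train from scratch on the noisy examples, then store the parameters.
stage1_params = train([0.0, 0.0], noisy, loss_target=0.5)
stored = list(stage1_params)  # stands in for writing parameters to storage

# Stage 2: initialize from the stored parameters and train on the subset.
final_params = train(stored, high_quality, loss_target=0.1)
```

After both stages, `relevance(final_params, milk_q, milk_item)` can be compared against `relevance(final_params, milk_q, bread_item)` to rank candidate items for a query.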
Opening claim text (preview).
What is claimed is:

1. A machine-learned model stored on a non-transitory computer readable storage medium, wherein the machine-learned model is manufactured by a process comprising:
generating training data comprising a plurality of examples, each example comprising a query received by an online concierge system and an item with which a user of the online concierge system performed a specific interaction, wherein a label applied to each example of the training data indicates whether the specific interaction was performed with the item after the online concierge system received the query;
generating a noisy subset of training data and a high-quality subset of training data based on a metric that defines a quality of training data, wherein the high-quality subset has a higher metric value than the noisy subset;
initializing the machine-learned model comprising a network of a plurality of layers, the machine-learned model configured to receive a query and an item and to generate a predicted measure of relevance of the item to the query;
for each of a plurality of the examples of the noisy subset of the training data:
applying, by one or more processors, the machine-learned model to the query of the example of the noisy subset of the training data and to the item of the example of the noisy subset of the training data;
backpropagating, in one or more iterations and by the one or more processors, one or more error terms obtained from one or more loss functions to update a set of parameters of the network, the backpropagating performed through the network and one or more of the error terms based on a difference between the label applied to the example of the noisy subset of the training data and a predicted measure of relevance of the item of the example of the noisy subset of the training data and to the query of the example of the training data;
stopping, by the one or more processors, the backpropagation after the one or more loss functions satisfy one or more criteria;
storing, in the computer readable storage medium, the set of parameters of the network that are updated in the one or more iterations;
initializing the network to the stored set of parameters;
for each of the plurality of the examples of the high-quality subset of the training data:
applying, by the one or more processors, the machine-learned model to the query of the example of the high-quality subset of the training data and to the item of the example of the high-quality subset of the training data;
backpropagating, by the one or more processors, one or more error terms obtained from one or more loss functions to generate a modified set of parameters of the network, the backpropagating performed through the network and one or more of the error terms based on a difference between a label applied to the example of the high-quality subset of the training data and a predicted measure of relevance of the item of the example of the high-quality subset of the training data and to the query of the example of the subset of the training data;
stopping, by the one or more processors, the backpropagation after the one or more loss functions satisfy one or more criteria; and
storing, in the computer readable storage medium, the modified set of parameters of the network trained from the subset of the training data as parameters of the machine-learned model.

2. The machine-learned model of claim 1, wherein generating the high-quality subset of training data comprises:
selecting examples of the training data including items with which the specific interaction was performed with at least a threshold frequency.

3. The machine-learned model of claim 2, wherein generating the high-quality subset of training data further comprises:
determining an example of the training data includes an item with which the specific frequency was performed with at least an additional threshold frequency; and
including a specific number of replicas of the example determined to include the item with which the specific frequency was performed with at least the additional threshold frequency in the subset of the training data in response to the determining.

4. The machine-learned model of claim 1, wherein generating the high-quality subset of training data comprises:
ranking examples of the training data based on frequencies with which the specific interaction was performed with items included in the examples of the training data;
selecting examples of the training data having at least a threshold position in the ranking.

5. The machine-learned model of claim 4, wherein generating the high-quality subset of training data further comprises:
determining an example of the training data includes an item with which the specific frequency was performed with at least a threshold frequency; and
including a specific number of replicas of the example determined to include the item with which the specific frequency was performed with at least the threshold frequency in the subset of the training data in response to the determining.

6. The machine-learned model of claim 1, wherein the specific interaction comprises including the item in an order received by the online concierge system.

7. The machine-learned model of claim 1, wherein backpropagating one or more error terms obtained from one or more loss functions to modify the set of parameters of the network comprises:
generating the one or more error terms from application of the machine-learned model to the example of the high-quality subset of the training data using an alternative loss function than a loss function generating the error term from application of the machine-learned model to the example of the training data.

8. The machine-learned model of claim 7, wherein the alternative loss function applies a higher weight to an error term from application of the machine-learned model to the example of the high-quality subset of the training data than the loss function generating the error term from application of the machine-learned model to the noisy subset of the training data.

9. The machine-learned model of claim 1, wherein applying the machine-learned model to the query of the example of the noisy subset of the training data and to the item of the example of the noisy subset of the training data comprises:
applying the machine-learned model with a particular architecture to the example of the noisy subset of the training data and to the item of the example of the noisy subset of the training data.

10. The machine-learned model of claim 9, wherein applying the machine-learned model to the query of the example of the high-quality subset of the training data and to the item of the example of the high-quality subset of the training data comprises:
applying the machine-learned model with a different architecture than the particular architecture to the example of the high-quality subset of the training data and to the item of the high-quality subset of the example of the training data.

11. A method comprising:
generating training data comprising a plurality of examples, each example comprising a query received by an online concierge system and an item with which a user of the online concierge system performed a specific interaction, wherein a label applied to each example of the training data indicates whether the specific interaction was performed with the item after the online concierge system received the query;
generating a noisy subset of training data and a high-quality subset of training data based on a metric that defines a quality of training data, wherein the high-quality subset has a higher metric value than the noisy subset;
initializing a machine-learned model comprising a networ
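
The subset-selection options recited in claims 2 through 5 (a frequency threshold, replicas of frequently chosen items, and a cut-off position in a frequency ranking) can be sketched as below. This is only one possible reading under stated assumptions: examples are modeled as `(query, item, label)` tuples, the quality metric is taken to be the number of positively labeled examples per item, and the function names, thresholds, and grocery data are all hypothetical rather than taken from the patent.

```python
from collections import Counter

def interaction_counts(examples):
    """Count, per item, the examples labeled as having the interaction."""
    return Counter(item for _query, item, label in examples if label == 1)

def split_by_frequency(examples, min_freq):
    """Split examples into (noisy, high_quality) by item interaction count."""
    counts = interaction_counts(examples)
    high_quality = [ex for ex in examples if counts[ex[1]] >= min_freq]
    noisy = [ex for ex in examples if counts[ex[1]] < min_freq]
    return noisy, high_quality

def add_replicas(examples, counts, replica_freq, replicas):
    """Repeat examples whose item count reaches an additional threshold."""
    boosted = []
    for ex in examples:
        copies = replicas if counts[ex[1]] >= replica_freq else 1
        boosted.extend([ex] * copies)
    return boosted

def top_ranked(examples, top_k):
    """Rank examples by item interaction count and keep the first top_k."""
    counts = interaction_counts(examples)
    ranked = sorted(examples, key=lambda ex: counts[ex[1]], reverse=True)
    return ranked[:top_k]

# Hypothetical (query, item, label) examples; label 1 means the item was
# added to an order after the query was received.
examples = [
    ("milk", "whole milk", 1),
    ("milk", "whole milk", 1),
    ("dairy", "whole milk", 1),
    ("milk", "oat milk", 1),
    ("oat milk", "oat milk", 1),
    ("milk", "milk chocolate", 1),
    ("milk", "milk chocolate", 0),
]

counts = interaction_counts(examples)
noisy, high_quality = split_by_frequency(examples, min_freq=2)
boosted = add_replicas(high_quality, counts, replica_freq=3, replicas=2)
top3 = top_ranked(examples, top_k=3)
```

With this toy data, the "whole milk" and "oat milk" examples form the high-quality subset, the "milk chocolate" examples form the noisy subset, and each "whole milk" example appears twice in `boosted`.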
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title
using ranking · CPC title
Machine learning · CPC title
for particular applications; for extensibility, e.g. user defined types · CPC title