Sibling search queries
US-2021334314-A1 · Oct 28, 2021 · US
US12450277B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12450277-B2 |
| Application number | US-202418932301-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 30, 2024 |
| Priority date | Nov 2, 2023 |
| Publication date | Oct 21, 2025 |
| Grant date | Oct 21, 2025 |
Official abstract text for this publication.
An online system updates the labels on negative examples to account for the possibility that the example is a false negative. The system generates a set of initial training examples that each include a query input by the user and item data for an item presented as a result to the user's query. Each training example also includes an initial label, which represents whether the user interacted with the item presented as a search result. The online system updates the initial label for a negative training example by identifying a set of bridge queries and computing a similarity score between the query for the training example and the bridge queries. The online system computes an updated label for the negative example based on the similarity scores and updates the training example with the updated label.
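The relabeling step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `embed` mapping, the use of cosine similarity, the default threshold of 0.5, and the clamping of the result to [0, 1] are all assumptions made for illustration. Averaging the similarity scores is one option (it is the one recited in claim 7); the abstract only says the updated label is "based on" the scores.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relabel_negatives(examples, embed, threshold=0.5):
    """Return examples with negative labels replaced by soft labels.

    examples: list of (query, item, label) tuples derived from search logs.
    embed: hypothetical mapping from query text to an embedding vector.
    Assumption: a label exactly at the threshold is treated as negative.
    """
    # Bridge queries for an item: queries for which that item was selected.
    bridges = defaultdict(set)
    for query, item, label in examples:
        if label > threshold:
            bridges[item].add(query)

    relabeled = []
    for query, item, label in examples:
        candidates = sorted(bridges.get(item, set()) - {query})
        if label > threshold or not candidates:
            # Positives, and negatives without bridge queries, stay unchanged.
            relabeled.append((query, item, label))
            continue
        scores = [cosine(embed[query], embed[b]) for b in candidates]
        # Assumption: updated label = mean similarity, clamped to [0, 1].
        soft = min(1.0, max(0.0, sum(scores) / len(scores)))
        relabeled.append((query, item, soft))
    return relabeled
```

With this sketch, a negative example whose query is close to the queries that led other users to select the same item receives a label near 1, which marks it as a likely false negative. A negative whose query is unrelated to every bridge query keeps a label near 0.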
Opening claim text (preview).
What is claimed is:

1. A non-transitory computer-readable medium storing a set of parameters for a machine-learning model, wherein the parameters are produced by a process comprising: initializing the set of parameters for the machine-learning model; accessing search log data captured by an online system, wherein the search log data describes a plurality of queries placed by users of the online system and a plurality of items; generating a set of initial training examples based on the search log data, wherein each initial training example comprises a query of the plurality of queries, an item of the plurality of items, and an initial label, wherein the initial label represents whether the item was presented as a search result to a user and was selected by the user; identifying a set of initial negative examples by identifying a subset of the set of initial training examples with initial labels below a threshold value; updating the set of initial negative examples to generate a set of updated negative examples, wherein the set of updated negative examples is generated by, for each initial negative example: identifying a plurality of bridge queries for the query of the initial negative example, wherein the plurality of bridge queries is a subset of the plurality of queries for which the item of the initial negative example was presented as a search result to a user and was selected by the user; computing a similarity score between the query of the initial negative example and each of the plurality of bridge queries; computing an updated label for the initial negative example based on the computed similarity scores; and generating an updated negative example comprising the computed updated label and the query and item of the initial negative example; generating a final set of training examples comprising the set of updated negative examples and a subset of the set of initial training examples with initial labels above the threshold value; updating the set of parameters by processing each training example in the final set of training examples, wherein updating the set of parameters results in an updated set of parameters for the machine-learning model, and wherein processing each training example in the set of training examples comprises: applying the machine-learning model to the item data and the query of the training example to generate an item prediction score, wherein the item prediction score represents a predicted likelihood that a user would select the item when the item is presented as a search result for the query; computing a loss score by comparing the item prediction score to a label of the training example; and updating the set of parameters for the machine-learning model through a backpropagation process using the computed loss score; and storing the updated set of parameters on the computer-readable medium.

2. The computer-readable medium of claim 1, wherein the machine-learning model is a cross encoder model.

3. The computer-readable medium of claim 1, wherein each initial training example of the set of initial training examples comprises user data describing a user corresponding to the query of the initial training example.

4. The computer-readable medium of claim 1, wherein each initial training example of the set of initial training examples comprises context data describing a context of the query of the initial training example.

5. The computer-readable medium of claim 1, wherein computing a similarity score between the query of the initial negative example and a bridge query comprises: applying a query embedding model to the query and the bridge query to generate embeddings for the query and bridge query; and computing a distance between the embedding for the query and the embedding for the bridge query.

6. The computer-readable medium of claim 5, wherein the embedding model is part of a bi-encoder model that is trained to generate query embeddings and item embeddings for use in selecting items for search results.

7. The computer-readable medium of claim 1, wherein computing an updated label for the initial negative examples comprises: computing an average of the computed similarity scores.

8. The computer-readable medium of claim 1, further comprising: filtering the set of updated negative examples based on the updated labels.

9. A method, performed by a computing system comprising a processor and a non-transitory computer-readable medium, comprising: initializing a set of parameters for a machine-learning model; accessing search log data captured by an online system, wherein the search log data describes a plurality of queries placed by users of the online system and a plurality of items; generating a set of initial training examples based on the search log data, wherein each initial training example comprises a query of the plurality of queries, an item of the plurality of items, and an initial label, wherein the initial label represents whether the item was presented as a search result to a user and was selected by the user; identifying a set of initial negative examples by identifying a subset of the set of initial training examples with initial labels below a threshold value; updating the set of initial negative examples to generate a set of updated negative examples, wherein the set of updated negative examples is generated by, for each initial negative example: identifying a plurality of bridge queries for the query of the initial negative example, wherein the plurality of bridge queries is a subset of the plurality of queries for which the item of the initial negative example was presented as a search result to a user and was selected by the user; computing a similarity score between the query of the initial negative example and each of the plurality of bridge queries; computing an updated label for the initial negative example based on the computed similarity scores; and generating an updated negative example comprising the computed updated label and the query and item of the initial negative example; generating a final set of training examples comprising the set of updated negative examples and a subset of the set of initial training examples with initial labels above the threshold value; updating the set of parameters by processing each training example in the final set of training examples, wherein updating the set of parameters results in an updated set of parameters for the machine-learning model, and wherein processing each training example in the set of training examples comprises: applying the machine-learning model to the item data and the query of the training example to generate an item prediction score, wherein the item prediction score represents a predicted likelihood that a user would select the item when the item is presented as a search result for the query; computing a loss score by comparing the item prediction score to a label of the training example; and updating the set of parameters for the machine-learning model through a backpropagation process using the computed loss score; and storing the updated set of parameters on the computer-readable medium.

10. The method of claim 9, wherein the machine-learning model is a cross encoder model.

11. The method of claim 9, wherein each initial training example of the set of initial training examples comprises user data describing a user corresponding to the query of the initial training example.

12. The method of claim 9, wherein each initial training example of the set of initial training examples comprises context data describing a context of the query of the initial training example.

13. The method of claim 9
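
The parameter-update step recited in claims 1 and 9 (score the query-item pair, compare the score to the label, update parameters from the loss gradient) can be sketched as below. This is an illustration only: a one-layer logistic scorer stands in for the cross encoder of claims 2 and 10, binary cross-entropy is an assumed loss (the claims say only "a loss score"), and the feature vectors, learning rate, and function names are hypothetical. For a single layer, backpropagation reduces to the closed-form gradient `(p - label) * x`.

```python
from math import exp, log


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def predict(weights, features):
    """Item prediction score: estimated chance the item is selected."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def bce_loss(p, label, eps=1e-12):
    """Binary cross-entropy; works for soft labels between 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * log(p) + (1.0 - label) * log(1.0 - p))


def train_step(weights, features, label, lr=0.5):
    """One gradient step on a single training example."""
    p = predict(weights, features)
    loss = bce_loss(p, label)
    # For sigmoid + cross-entropy, d(loss)/d(w_i) = (p - label) * x_i.
    new_weights = [w - lr * (p - label) * x for w, x in zip(weights, features)]
    return new_weights, loss


def train(examples, n_features, epochs=200, lr=0.5):
    """examples: list of (features, label); features encode a query-item pair."""
    weights = [0.0] * n_features  # initialize the set of parameters
    for _ in range(epochs):
        for features, label in examples:
            weights, _ = train_step(weights, features, label, lr)
    return weights
```

Because the loss accepts labels between 0 and 1, an updated negative example with a soft label such as 0.7 pulls the prediction score toward 0.7 instead of toward 0, which is how the relabeled examples would influence training in this sketch.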
Query formulation · CPC title
Search customisation based on user profiles and personalisation · CPC title
using ranking · CPC title
Clustering; Classification · CPC title
using metadata automatically derived from the content · CPC title