Click noise characterization model

US9355095B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9355095-B2
Application number: US-201113341653-A
Country: US
Kind code: B2
Filing date: Dec 30, 2011
Priority date: Dec 30, 2011
Publication date: May 31, 2016
Grant date: May 31, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The techniques discussed herein consider a degree of noise associated with user clicks performed during search sessions. The techniques then generate a model that characterizes click noise so that search engines can more accurately infer document relevance.
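As an illustration of what "a model that characterizes click noise" could look like, the sketch below fits a small logistic regression that maps click features to the probability that a click landed on an irrelevant document. This is an assumption made for illustration: the text on this page does not specify the model form, and the claims refer to learning distributions over weight parameters, which this point-estimate version does not do. The feature names, the toy data, and the function names `train_noise_predictor` and `noise_probability` are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def noise_probability(weights, bias, features):
    """Probability that a click with these features is a noisy click."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def train_noise_predictor(rows, labels, epochs=500, lr=0.5):
    """Fit weights by batch gradient descent on the logistic loss.

    rows:   feature vectors, one per clicked query-document pair
    labels: 1 if the clicked document was rated irrelevant, else 0
    """
    n = len(rows[0])
    weights = [0.0] * n
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = noise_probability(weights, bias, x) - y
            for i in range(n):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Made-up toy data (not from the patent). Hypothetical features:
# [normalized dwell time, 1 if first query in the session else 0].
rows = [[0.10, 1], [0.20, 0], [0.15, 1], [0.90, 0], [0.80, 1], [0.95, 0]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = human rating says "irrelevant"

weights, bias = train_noise_predictor(rows, labels)
p_short_dwell = noise_probability(weights, bias, [0.10, 1])
p_long_dwell = noise_probability(weights, bias, [0.90, 0])
print(p_short_dwell, p_long_dwell)
```

In this toy setup, short dwell times are labeled as noisy clicks, so the fitted model assigns them a higher noise probability. A search engine could use such a probability to discount clicks when inferring relevance.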

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: under control of a processor configured with computer-executable instructions,
    receiving, for each of a plurality of query-document pairs identified from one or more search sessions, one or more labels indicating that each query-document pair is relevant or irrelevant;
    extracting feature data for each labeled query-document pair;
    generating, based at least in part on the feature data, a model that characterizes a context in which a user decides to click on an irrelevant document; and
    using the model to determine that an individual click event of a plurality of individual click events associated with a search query is associated with the irrelevant document.

2. The method recited in claim 1, wherein the feature data includes one or more user class features that characterize a behavior of the user when the user performs online searches.

3. The method recited in claim 2, wherein the user corresponds to a device identifier.

4. The method recited in claim 2, wherein the one or more user class features are selected from a group comprising:
    a dwell time feature indicating an average amount of time that passes between two consecutive user actions;
    an interval time feature indicating an average amount of time between two consecutive document link selections for the user;
    a user skip feature indicating whether the user skipped a document link displayed on a search engine result page (SERP);
    a user click feature indicating whether the user selected a document link displayed on a SERP;
    a user first click feature indicating whether a document was the first document selected by the user within a query session;
    a user last click feature indicating whether a document was the last document selected by the user within a query session;
    a user only click feature indicating whether a document was the only document selected by the user within a query session;
    a fraction query no click feature indicating a percentage of query sessions in which the user does not select any document links displayed on a SERP;
    a fraction query one click feature indicating a percentage of query sessions in which the user selects only one document link displayed on a SERP; and
    a fraction query multi-clicks feature indicating a percentage of query sessions in which the user selects multiple document links displayed on a SERP.

5. The method recited in claim 1, wherein the feature data includes one or more context class features that specify the context in which the user makes a click decision when the user performs an online search.

6. The method recited in claim 5, wherein the one or more context class features are selected from a group comprising:
    a submit time feature indicating a recorded time the user submits a search query to a search engine;
    a query subset previous feature indicating whether a current query is a subset of a previous query;
    a query superset previous feature indicating whether a current query is a superset of a previous query;
    a query distance previous feature indicating an edit distance between a current query and a previous query;
    a click in last session feature indicating whether there was a selection in a previous query session;
    a dwell time in last session feature indicating an amount of time the user spends on a previous query session;
    a first query feature indicating whether a current query session is a first query session in a search session;
    a time in search feature indicating a time the user has spent on a search engine up to a current document link selection;
    a Uniform Resource Locator (URL) in search feature indicating a number of URLs that have been displayed to the user in a search session up to a current document link selection;
    a query in session feature indicating a number of submitted search queries within a search session up to a current document link selection;
    a click in session feature indicating a total number of user clicks realized in a search session up to a current document link selection;
    an average time between queries feature indicating an average time between two submitted search queries; and
    a time to last action feature indicating an amount of time that has passed since a previous action occurred.

7. The method recited in claim 1, wherein receiving the one or more labels comprises receiving one or more human ratings indicating each query-document pair as relevant or irrelevant.

8. The method recited in claim 1, wherein generating the model comprises learning one or more noise predictor parameters, the one or more noise predictor parameters being based at least in part on a history of click behavior specific to the user.

9. The method recited in claim 8, wherein learning the one or more noise predictor parameters includes determining distributions of one or more weight parameters corresponding, respectively, to one or more extracted features.

10. The method recited in claim 1, wherein the search query is a first search query, and the method further comprises:
    modifying the model, based at least in part on additional extracted feature data associated with click events of the first search query; and
    generating, using the model as modified, a probability of a subsequent click event for a given query-document pair associated with a second search query.

11. The method recited in claim 10, wherein the probability of the subsequent click event for the given query-document pair is generated based at least in part on search behavior specific to the user.

12. The method recited in claim 10, further comprising:
    ranking a plurality of documents associated with the second search query based at least in part on the generated probability; and
    providing search results, comprising at least a portion of the plurality of documents, according to the ranking.

13. The method recited in claim 10, further comprising:
    incorporating the model into a Dynamic Bayesian Network (DBN) model;
    ranking the plurality of documents based at least in part on the DBN model; and
    providing search results, comprising at least a portion of the plurality of documents, according to the ranking.

14. The method recited in claim 10, further comprising:
    incorporating the model into a user browsing model (UBM);
    ranking the plurality of documents based at least in part on the UBM; and
    providing search results, comprising at least a portion of the plurality of documents, according to the ranking.

15. One or more computer-readable storage media comprising computer-executable instructions that, when executed on one or more processors, configure the one or more processors to perform operations comprising:
    building a noise-aware click model based at least in part on: (i) human provided relevance ratings for each of a plurality of query-document pairs in a first group of one or more search sessions, and (ii) a first set of features associated with the first group of one or more search sessions;
    employing the noise-aware click model to characterize a context in which a user decides to click on an irrelevant document within a second group of one or more search sessions; and
    modifying the noise-aware click model based at least in part on extracted feature data associated with the second group of one or more search sessions and the context in which the user decides to click on the irrelevant document.

16. The one or more computer-readable storage media recited in claim 15, wherein the operations further comprise:
    receiving a search query; and
    generating, using the noise-aware click model as modified, a probability of a click
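
Three of the context class features named in claim 6 (query subset previous, query superset previous, and query distance previous) depend only on the current and previous query strings, so they are easy to illustrate. The sketch below is one possible reading, not the patent's implementation: it assumes "subset" and "superset" compare sets of lowercased whitespace-separated terms, and it assumes the edit distance is a character-level Levenshtein distance. The function names and the example queries are hypothetical.

```python
def edit_distance(a, b):
    """Character-level Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        cur = [i]
        for j, ch_b in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                   # delete ch_a
                cur[j - 1] + 1,                # insert ch_b
                prev[j - 1] + (ch_a != ch_b),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def context_query_features(current_query, previous_query):
    """Compute three query-reformulation features for a pair of queries.

    Assumption: subset/superset are judged on sets of lowercased terms.
    """
    cur_terms = set(current_query.lower().split())
    prev_terms = set(previous_query.lower().split())
    return {
        "query_subset_previous": cur_terms <= prev_terms,
        "query_superset_previous": cur_terms >= prev_terms,
        "query_distance_previous": edit_distance(current_query, previous_query),
    }


# Hypothetical example: the user refines "cheap flights" by adding a term.
features = context_query_features("cheap flights paris", "cheap flights")
print(features)
```

Under these assumptions, a refinement that adds terms to the previous query sets the superset feature and leaves the subset feature unset. Other tokenization or distance choices would be equally consistent with the claim wording.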

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9355095B2 cover?
The techniques discussed herein consider a degree of noise associated with user clicks performed during search sessions. The techniques then generate a model that characterizes click noise so that search engines can more accurately infer document relevance.
Who is the assignee on this patent?
Chen Weizhu, Chen Zheng, Singla Adish, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F17/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 31, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).