Query expansion and query-document matching using path-constrained random walks

US9286396B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9286396-B2
Application numberUS-201313951574-A
CountryUS
Kind codeB2
Filing dateJul 26, 2013
Priority dateJul 26, 2013
Publication dateMar 15, 2016
Grant dateMar 15, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various technologies described herein pertain to use of path-constrained random walks for query expansion and/or query document matching. Clickthrough data from search logs is represented as a labeled and directed graph. Path-constrained random walks are executed over the graph based upon an input query. The graph includes a first set of nodes that represent queries included in the clickthrough data from search logs, a second set of nodes that represent documents included in the clickthrough data from the search logs, a third set of nodes that represent words from the queries and the documents, and edges between nodes that represent relationships between queries, documents, and words. The path-constrained random walks include traversals over edges of the graph between nodes. Further, a score for a relationship between a target node and a source node representative of the input query is computed based at least in part upon the path-constrained random walks.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising the following computer-executable acts: receiving an input query; executing path-constrained random walks over a computer-implemented labeled and directed graph based upon the input query, wherein the labeled and directed graph comprises: a first set of nodes that are representative of queries comprised in clickthrough data from search logs; a second set of nodes that are representative of documents comprised in the clickthrough data from the search logs; a third set of nodes that are representative of words from the queries and the documents; and edges between nodes that are representative of relationships between the queries, the documents, and the words; wherein the path-constrained random walks comprise traversals over edges of the graph between nodes, the path-constrained random walks traverse the edges of the graph between the nodes in accordance with predefined path types, an each of the predefined path types comprises a respective sequence of relations between the nodes in the graph for traversing as part of a corresponding path-constrained random walk from the path-constrained random walks; and computing a score for a relationship between a target node and a source node representative of the input query based at least in part upon the path-constrained random walks. 2. The method of claim 1 , wherein the third set of nodes comprises the target node, and wherein the target node is representative of a candidate query expansion term. 3. The method of claim 2 , wherein the input query is desirably input to a search engine, and wherein the method further comprises: selecting the candidate query expansion term based at least in part upon the score for the relationship between the target node representative of the candidate query expansion term and the source node representative of the input query; and responsive to selecting the candidate query expansion term, causing the search engine to execute a search over a plurality of documents based at least in part upon the candidate query expansion term. 4. The method of claim 2 , wherein the input query is desirably input to a search engine, and wherein the method further comprises: selecting the candidate query expansion term based at least in part upon the score for the relationship between the target node representative of the candidate query expansion term and the source node representative of the input query; and responsive to selecting the candidate query expansion term, causing the search engine to display the candidate query expansion term as a suggested query. 5. The method of claim 1 , further comprising outputting a ranked list of candidate query expansion terms based upon respective scores for corresponding relationships between target nodes representative of the candidate query expansion terms and the source node representative of the input query. 6. The method of claim 1 , wherein the second set of nodes comprises the target node, and wherein the target node is representative of a candidate document. 7. The method of claim 6 , wherein the input query is desirably input to a search engine, and wherein the method further comprises: returning the candidate document responsive to execution of a search over a plurality of documents performed by the search engine, wherein the candidate document is returned by the search engine based at least in part upon the score for the relationship between the target node representative of the candidate document and the source node representative of the input query. 8. The method of claim 1 , wherein computing the score for the relationship between the target node and the source node representative of the input query further comprising: determining respective values for the path-constrained random walks between the target node and the source node representative of the input query, wherein the path-constrained random walks traverse the edges of the graph between the nodes from the source node representative of the input query to the target node in accordance with the predefined path types; and combining the respective values for the path-constrained random walks that traverse the edges of the graph between the nodes from the source node representative of the input query to the target node in accordance with the predefined path types to compute the score for the relationship between the target node and the source node representative of the input query. 9. The method of claim 8 , wherein weights are assigned to the predefined path types, and wherein the respective values for the path-constrained random walks that traverse the edges of the graph between the nodes from the source node representative of the input query to the target node in accordance with the predefined path types are combined as a function of the weights. 10. The method of claim 1 , wherein the edges in the graph are labeled by respective relations, and wherein the edges in the graph are assigned respective edge scores based upon relation-specific probabilistic models for the respective relations. 11. The method of claim 10 , wherein an edge score between a particular source node and a particular target node is a probability of reaching the particular target node from the particular source node with a one-step random walk, the probability being based on a type of relation between the particular target node and the particular source node. 12. The method of claim 1 , further comprising constructing the labeled and directed graph based upon the clickthrough data from the search logs. 13. A computing apparatus, comprising: at least one processor; and memory that comprises computer excutable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including: executing path-constrained random walks over a labeled and directed graph based upon an input query, wherein the labeled and directed graph comprises: a first set of nodes that represent queries comprised in clickthrough data from search logs; a second set of nodes that represent documents comprised in the clickthrough data from the search logs; a third set of nodes that represent words from the queries and the documents; and edges between nodes that represent relationships between the queries, the documents, and the words; wherein the path-constrained random walks traverse edges of the graph between nodes in accordance with predefined path types, and each of the predefined path comprises a respective sequence of relations between the nodes in the graph for traversing as part of a corresponding path-constrained random walk from the path-constrained random walks; and computing a score for a relationship between a target node that represents a candidate query expansion term and a source node that represents the input query based at least in part upon the path-constrained random walks. 14. The computing apparatus of claim 13 , the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including: outputting a ranked list of candidate query expansion terms based upon respective scores for relationships between target nodes that represent the candidate query expansion terms and the source node that represents the input query. 15. The computing apparatus of claim 13 , wherein the path-constrained random walks respectively instantiate the predefined path types, and the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9286396B2 cover?
Various technologies described herein pertain to use of path-constrained random walks for query expansion and/or query document matching. Clickthrough data from search logs is represented as a labeled and directed graph. Path-constrained random walks are executed over the graph based upon an input query. The graph includes a first set of nodes that represent queries included in the clickthrough…
Who is the assignee on this patent?
Microsoft Corp, Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/951. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Mar 15 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).