Running Time Prediction Algorithm for WAND Queries
US-2016189026-A1 · Jun 30, 2016 · US
US10459959B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10459959-B2 |
| Application number | US-201615345277-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 7, 2016 |
| Priority date | Nov 7, 2016 |
| Publication date | Oct 29, 2019 |
| Grant date | Oct 29, 2019 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Methods and apparatus for performing top-k query processing include pruning a list of documents to identify a subset of the list of documents, where pruning includes, for other query terms in the set of query terms, skipping a document in the list of documents based, at least in part, on the contribution of the query term to the score of the corresponding document and the term upper bound for each other query term, in the set of query terms, that matches the document.
Opening claim text (preview).
What is claimed is: 1. A method, comprising: obtaining, by one or more servers, a set of query terms of a search query; identifying, by the one or more servers, a list of documents pertaining to the set of query terms, wherein each document in the list of documents is identified by a document identifier, wherein each query term in the set of query terms has associated therewith a posting list of elements representing documents including the query term, wherein each element in the posting list indicates the corresponding document identifier and a contribution of the query term to a score of the document, and wherein each query term in the set of query terms has a corresponding term upper bound on its potential contribution to at least one document in the list of documents; organizing the list of documents using a tree data structure, each of a plurality of nodes of the tree data structure representing a corresponding document heap of a plurality of document heaps; and pruning, by the one or more servers, the list of documents to identify a subset of the list of documents, wherein pruning includes, for each query term in the set of query terms, skipping a document in the list of documents based, at least in part, on the contribution of the query term to the score of the corresponding document and a term upper bound for each other query term, in the set of query terms, that matches the document, wherein pruning includes traversing the tree data structure. 2. The method of claim 1 , further comprising: for a query term in the set of query terms: for an element in the corresponding posting list, summing the contribution of the query term to the score of the corresponding document and a term upper bound for each other query term, in the set of query terms, that matches the document to ascertain a total; and skipping the element within the corresponding posting list according to whether the total is lower than a lowest document score of a set of top scored documents of the list of documents. 3. The method of claim 2 , further comprising: updating the set of top scored documents to include the document based, at least in part, on the total. 4. The method of claim 1 , further comprising: for a query term in the set of query terms: summing a term upper bound for each of the other query terms in the set of query terms to ascertain a total; and skipping an element within the corresponding posting list based, at least in part, on whether the total is lower than a lowest document score of a set of top scored documents of the list of documents. 5. The method of claim 1 , each of the document heaps corresponding to a different set of one or more documents in the list of documents. 6. The method of claim 1 , further comprising: ascertaining, for each posting list, for each element of the corresponding posting list, the contribution of the corresponding query term to the score of the document represented by the element. 7. The method of claim 1 , further comprising: for each posting list, sorting the corresponding elements according to the document identifier of each of the respective elements. 8. The method of claim 1 , further comprising: for each posting list, sorting the corresponding elements according to the contribution of the query term to the scores of the corresponding documents. 9. A system, comprising: one or more servers; the one or more servers including one or more processors and one or memories, the one or more servers being configured to: obtain a set of query terms of a search query; identify a list of documents pertaining to the set of query terms, wherein each document in the list of documents is identified by a document identifier, wherein each query term in the set of query terms has associated therewith a posting list of elements representing documents including the query term, wherein each element in the posting list indicates the corresponding document identifier and a contribution of the query term to a score of the document, and wherein each query term in the set of query terms has a corresponding term upper bound on its potential contribution to at least one document in the list of documents; organize the list of documents using a tree data structure such that each of a plurality of nodes of the tree data structure represents a corresponding document heap of a plurality of document heaps; and prune the list of documents to identify a subset of the list of documents, wherein pruning includes, for each query term in the set of query terms, skipping a document in the list of documents based, at least in part, on the contribution of the query term to the score of the corresponding document and a term upper bound for each other query term, in the set of query terms, that matches the document, wherein pruning includes traversing the tree data structure. 10. The system of claim 9 , the one or more servers being further configured to: for a query term in the set of query terms: for an element in the corresponding posting list, sum the contribution of the query term to the score of the corresponding document and a term upper bound for each other query term, in the set of query terms, that matches the document to ascertain a total; and skip the element within the corresponding posting list according to whether the total is lower than a lowest document score of a set of top scored documents of the list of documents. 11. The system of claim 10 , the one or more servers being further configured to: update the set of top scored documents to include the document based, at least in part, on the total. 12. The system of claim 9 , the one or more servers being further configured to: for a query term in the set of query terms: sum a term upper bound for each of the other query terms in the set of query terms to ascertain a total; and skip an element within the corresponding posting list based, at least in part, on whether the total is lower than a lowest document score of a set of top scored documents of the list of documents. 13. The system of claim 9 , each of the document heaps corresponding to a different set of one or more documents in the list of documents. 14. The system of claim 9 , the one or more servers being further configured to: ascertain, for each posting list, for each element of the corresponding posting list, the contribution of the corresponding query term to the score of the document represented by the element. 15. The system of claim 9 , the one or more servers being further configured to: for each posting list, sort the corresponding elements according to the document identifier of each of the respective elements. 16. The system of claim 9 , the one or more servers being further configured to: for each posting list, sort the corresponding elements according to the contribution of the query term to the scores of the corresponding documents. 17. A computer program product comprising one or more non-transitory computer-readable media having computer program instructions stored therein, the computer program instructions being configured such that, when executed by one or more computing devices, the computer program instructions cause the one or more computing devices to: obtain a set of query terms of a search query; identify a list of documents pertaining to the set of query terms, wherein each document in the list of documents is identified by a document identifier, wherein each query term in the set of query terms has associated therewith a posting list of elements representing documents including the query te
Query execution (filtering based on additional data G06F16/335) · CPC title
Document management systems · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.