Methods for completing a user search
US-2015347423-A1 · Dec 3, 2015 · US
US9727637B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9727637-B2 |
| Application number | US-201414462662-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 19, 2014 |
| Priority date | Aug 19, 2014 |
| Publication date | Aug 8, 2017 |
| Grant date | Aug 8, 2017 |
Official abstract text for this publication.
A mechanism is provided for retrieving candidate answers from a corpus of documents. The mechanism receives an input question for which an answer is sought. The mechanism extracts features of the input question based on a natural language processing. The mechanism executes a first search of the corpus of documents based on a first subset of the extracted features of the input question and an initial evaluation of a utility of the first subset of extracted features to generate a subset of documents. The mechanism executes a second search of a set of passages extracted from the subset of documents based on a second subset of the extracted features of the input question and a reevaluation of the utility of the second subset of extracted features thereby forming a subset of passages. The mechanism generates query results from the subset of passages matching from which candidate answers are identified.
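The two searches described in the abstract can be illustrated with a small sketch. It is not the patented implementation. The abstract does not say how utility is computed, so the sketch assumes a smoothed inverse-frequency weight, treats sentences as passages, and uses the same features in both searches instead of two distinct subsets. The names `tokenize`, `utility`, `search`, and `two_stage_search` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def utility(features, units):
    """Assumed utility measure: smoothed inverse frequency over the units.

    A feature found in few units discriminates between them more strongly
    than a feature found in most of them.
    """
    n = len(units)
    token_sets = [set(tokenize(u)) for u in units]
    return {
        f: math.log((n + 1) / (1 + sum(f in s for s in token_sets)))
        for f in features
    }


def search(features, units, top_k):
    """Rank units by the summed utility of the features they contain."""
    weights = utility(features, units)
    scored = []
    for unit in units:
        counts = Counter(tokenize(unit))
        score = sum(w for f, w in weights.items() if counts[f])
        if score > 0:
            scored.append((score, unit))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [unit for _, unit in scored[:top_k]]


def two_stage_search(question_features, corpus, top_docs=2, top_passages=2):
    # First search: utility is evaluated against whole documents.
    docs = search(question_features, corpus, top_docs)
    # Passages are sentences here, which is a simplifying assumption.
    passages = [p.strip() for d in docs for p in d.split(".") if p.strip()]
    # Second search: utility is re-evaluated against the passages only.
    return search(question_features, passages, top_passages)


corpus = [
    "The capital of France is Paris. France is in Europe.",
    "Berlin is the capital of Germany. Germany borders France.",
    "Bananas are yellow. Apples are red.",
]
result = two_stage_search(["capital", "france"], corpus)
print(result)
```

In this toy corpus the two features carry equal weight at the document level, but "france" appears in three of the four passages and so counts for less in the second search. That shift is the kind of reevaluation the abstract refers to.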
Opening claim text (preview).
What is claimed is:

1. A method, in a question and answer (QA) system comprising a processor and a memory, for retrieving candidate answers from a corpus of documents, the method comprising: receiving, by the QA system, an input question for which an answer is sought; extracting, by the QA system, features of the input question based on a natural language processing of the input question; executing, by the QA system, a first search of the corpus of documents based on a first subset of the extracted features of the input question and an initial evaluation of a utility of the first subset of extracted features to generate a subset of documents matching the first subset of extracted features, wherein the utility of the first subset of extracted features identifies a degree to which each feature of the first subset of extracted features of the input question discriminates between documents in the corpus of documents that are sources of candidate answers to the input question; executing, by the QA system, a second search of a set of passages extracted from the subset of documents based on a second subset of the extracted features of the input question and a reevaluation of the utility of the second subset of extracted features thereby forming a subset of passages, wherein the utility of the second subset of extracted features identifies a degree to which each feature of the second subset of extracted features of the input question discriminates between passages in the set of passages that are sources of candidate answers to the input question; and generating, by the QA system, query results from the subset of passages from which a set of candidate answers for the input question are identified.

2. The method of claim 1, wherein the set of passages extracted from the subset of documents is less than all of the passages included in the subset of documents.

3. The method of claim 1, wherein executing the first search of the corpus of documents based on the first subset of the extracted features of the input question and the initial evaluation of the utility of the first subset of extracted features to generate the subset of documents matching the first subset of extracted features comprises: generating, by the QA system, a first statistical data structure for the corpus of documents; and identifying, by the QA system, the subset of documents from the corpus of documents comprised within the first statistical data structure relevant to the first subset of the extracted features utilizing the initial evaluation of the utility of the first subset of extracted features.

4. The method of claim 1, wherein executing the second search of the set of passages extracted from the subset of documents based on the second subset of the extracted features of the input question and the reevaluation of the utility of the second subset of extracted features comprises: generating, by the QA system, a second statistical data structure for the set of passages; and identifying, by the QA system, the query results from the subset of passages comprised within the second statistical data structure relevant to the second subset of the extracted features utilizing the reevaluation of the utility of the second subset of extracted features.

5. The method of claim 1, wherein the extracted features of the input question are identified by: identifying, by the QA system, a utility of each term in the input question; eliminating, by the QA system, zero or more terms within the input question that comprise a utility less than a predetermined value; and adding, by the QA system, the remaining terms in the input question to the extracted features.

6. The method of claim 5, wherein the extracted features of the input question are further identified by: identifying, by the QA system, one or more synonyms associated with the terms added to the extracted features; and adding, by the QA system, the one or more synonyms associated with the terms to the extracted features.

7. The method of claim 5, wherein the extracted features of the input question are further identified by: identifying, by the QA system, one or more tenses associated with the terms added to the extracted features; and adding, by the QA system, the one or more tenses associated with the terms to the extracted features.

8. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to: receive an input question for which an answer is sought; extract features of the input question based on a natural language processing of the input question; execute a first search of a corpus of documents based on a first subset of the extracted features of the input question and an initial evaluation of a utility of the first subset of extracted features to generate a subset of documents matching the first subset of extracted features, wherein the utility of the first subset of extracted features identifies a degree to which each feature of the first subset of extracted features of the input question discriminates between documents in the corpus of documents that are sources of candidate answers to the input question; execute a second search of a set of passages extracted from the subset of documents based on a second subset of the extracted features of the input question and a reevaluation of the utility of the second subset of extracted features thereby forming a subset of passages, wherein the utility of the second subset of extracted features identifies a degree to which each feature of the second subset of extracted features of the input question discriminates between passages in the set of passages that are sources of candidate answers to the input question; and generate query results from the subset of passages from which a set of candidate answers for the input question are identified.

9. The computer program product of claim 8, wherein the set of passages extracted from the subset of documents is less than all of the passages included in the subset of documents.

10. The computer program product of claim 8, wherein the computer readable program to execute the first search of the corpus of documents based on the first subset of the extracted features of the input question and the initial evaluation of the utility of the first subset of extracted features to generate the subset of documents matching the first subset of extracted features further causes the computing device to: generate a first statistical data structure for the corpus of documents; and identify the subset of documents from the corpus of documents comprised within the first statistical data structure relevant to the first subset of the extracted features utilizing the initial evaluation of the utility of the first subset of extracted features.

11. The computer program product of claim 8, wherein the computer readable program to execute the second search of the set of passages extracted from the subset of documents based on the second subset of the extracted features of the input question and the reevaluation of the utility of the second subset of extracted features further causes the computing device to: generate a second statistical data structure for the set of passages; and identify the query results from the subset of passages comprised within the second statistical data structure relevant to the second subset of the extracted features utilizing the reevaluation of the utility of the second subset of extracted features.

12. The computer program product of claim 8, wherein the extracted features of the i
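
The feature-extraction steps recited in claims 5 to 7 (score each term, drop terms below a predetermined utility, then add synonyms and tenses of the surviving terms) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the function name `extract_features`, the dictionary-based lookups, and the example utility values are all hypothetical, and the claims do not say how term utility, synonyms, or tenses are obtained.

```python
def extract_features(terms, term_utility, threshold, synonyms=None, tenses=None):
    """Keep terms whose utility meets the threshold, then expand them.

    term_utility, synonyms, and tenses are plain dictionaries supplied by
    the caller; a real system would derive them from linguistic resources.
    """
    synonyms = synonyms or {}
    tenses = tenses or {}
    # Claim 5: eliminate zero or more terms whose utility is below the threshold.
    kept = [t for t in terms if term_utility.get(t, 0.0) >= threshold]
    features = list(kept)
    for term in kept:
        # Claims 6 and 7: add synonyms and tense variants of the kept terms.
        for variant in synonyms.get(term, []) + tenses.get(term, []):
            if variant not in features:
                features.append(variant)
    return features


features = extract_features(
    ["who", "wrote", "the", "novel"],
    {"who": 0.1, "wrote": 0.8, "the": 0.0, "novel": 0.9},  # made-up utilities
    threshold=0.5,
    synonyms={"novel": ["book"]},
    tenses={"wrote": ["write", "written"]},
)
print(features)
```

Expanding only the kept terms follows the order of the claims, where synonyms and tenses are tied to "the terms added to the extracted features" and not to every term in the question.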
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Physics · mapped topic
Machine learning · CPC title
Summarisation for human users · CPC title