Database query generation using natural language text
US-11860916-B2 · Jan 2, 2024 · US
US2020050621A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020050621-A1 |
| Application number | US-201916522727-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 26, 2019 |
| Priority date | Aug 9, 2018 |
| Publication date | Feb 13, 2020 |
| Grant date | — |
Official abstract text for this publication.
A system verifies textual claims using a document corpus. The system includes a memory for storing program code and a processor device for running the code to retrieve documents from the corpus based on Term Frequency Inverse Document Frequency (TFIDF) similarity to a set of textual claims. The processor extracts named entities and capitalized phrases from the textual claims. The processor retrieves documents from the corpus with titles matching any of the extracted named entities and capitalized phrases. The processor extracts premise sentences from the retrieved documents. The processor classifies the premise sentences together with sources of the premises sentences against the textual claims to obtain classifications from among possible classifications including a supported, an unverified, or a contradicted classification. The processor aggregates the classifications over the premise sentences to selectively output, for each textual claim, an overall decision of the supported classification, the unverified classification, or the contradicted classification.
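The two retrieval steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the smoothed IDF formula, the `top_k` cutoff, and the regular expression that stands in for named entity extraction are all assumptions made for illustration. The corpus is assumed to be a plain mapping from document title to text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (a dict) per text, using a smoothed IDF."""
    tokenized = [tokenize(t) for t in texts]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_by_tfidf(claim, corpus, top_k=2):
    """Return up to top_k titles whose text is most TF-IDF-similar to the claim.

    Assumption: the claim is vectorized together with the documents, so it
    also contributes to the document frequencies.
    """
    titles = list(corpus)
    vectors = tfidf_vectors([corpus[t] for t in titles] + [claim])
    claim_vec = vectors[-1]
    scored = sorted(
        ((cosine(claim_vec, vec), title) for vec, title in zip(vectors, titles)),
        reverse=True,
    )
    return [title for score, title in scored[:top_k] if score > 0.0]


def capitalized_phrases(claim):
    """Find runs of capitalized words: a crude stand-in for named entity extraction."""
    return re.findall(r"[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*", claim)


def retrieve_by_title(claim, corpus):
    """Return titles that exactly match (case-insensitively) an extracted phrase."""
    phrases = {p.lower() for p in capitalized_phrases(claim)}
    return [title for title in corpus if title.lower() in phrases]
```

The `top_k` parameter plays the role of a cap on the number of retrieved documents. The regular expression also captures sentence-initial words such as "The", so a real system would presumably pair it with a proper named entity recognizer.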
Opening claim text (preview).
What is claimed is:
1. A system for verifying textual claims using a document corpus, comprising: a memory for storing program code; and a processor device for running the program code to retrieve documents from the document corpus based on Term Frequency Inverse Document Frequency (TFIDF) similarity to a set of textual claims; extract named entities and capitalized phrases from the textual claims; retrieve documents from the document corpus with titles matching any of the extracted named entities and capitalized phrases; extract premise sentences from the retrieved documents; classify the premise sentences together with sources of the premises sentences against the textual claims to obtain classifications from among possible classifications including a supported classification, an unverified classification, or a contradicted classification; and aggregate the classifications over the premise sentences to selectively output, for each of the textual claims, an overall decision of the supported classification, the unverified classification, or the contradicted classification.
2. The system of claim 1, wherein the processor device further outputs supporting statements for the overall decision from the document corpus.
3. The system of claim 1, wherein the processor adds a title of a source document to each of the premise sentences.
4. The system of claim 1, wherein the title of the source document is added before a corresponding one of the premise statements.
5. The system of claim 1, wherein the system is comprised in a news summarization system.
6. The system of claim 1, wherein the processor retrieves the documents from the document corpus using a threshold on an overall number of the documents retrieved from the document corpus to limit a number of processed document from the document corpus for each of multiple retrievals.
7. The system of claim 1, wherein each of the premise sentences is individually classified together with a corresponding one of the sources against the textual claims to obtain the classifications for each of the textual claims.
8. The system of claim 1, wherein concatenated sets of the premise statements are classified together with corresponding ones of the sources against the textual claims to obtain the classifications for each of the textual claims.
9. The system of claim 1, wherein the processor device biases the overall decision by resolving conflicts between supporting and refuting information in favor of the supporting information.
10. A computer-implemented method for verifying textual claims using a document corpus, comprising: retrieving, by a processor device, documents from the document corpus based on Term Frequency Inverse Document Frequency (TFIDF) similarity to a set of textual claims; extracting, by the processor device, named entities and capitalized phrases from the textual claims; retrieving, by the processor device, documents from the document corpus with titles matching any of the extracted named entities and capitalized phrases; extracting, by the processor device, premise sentences from the retrieved documents; classifying, by the processor device, the premise sentences together with sources of the premises sentences against the textual claims to obtain classifications from among possible classifications including a supported classification, an unverified classification, or a contradicted classification; and aggregating, by the processor device, the classifications over the premise sentences to selectively output, for each of the textual claims, an overall decision of the supported classification, the unverified classification, or the contradicted classification.
11. The computer-implemented method of claim 10, wherein the processor device further outputs supporting statements for the overall decision from the document corpus.
12. The computer-implemented method of claim 10, wherein the processor adds a title of a source document to each of the premise sentences.
13. The computer-implemented method of claim 10, wherein the title of the source document is added before a corresponding one of the premise statements.
15. The computer-implemented method of claim 10, wherein the system is comprised in a news summarization system.
16. The computer-implemented method of claim 10, wherein each of the retrieving steps involve a respective threshold on an overall number of the documents retrieved from the document corpus to limit a number of processed document from the document corpus.
17. The computer-implemented method of claim 10, wherein each of the premise sentences is individually classified together with a corresponding one of the sources against the textual claims to obtain the classifications for each of the textual claims.
18. The computer-implemented method of claim 10, wherein concatenated sets of the premise statements are classified together with corresponding ones of the sources against the textual claims to obtain the classifications for each of the textual claims.
19. The computer-implemented method of claim 10, further comprising biasing the overall decision by resolving conflicts between supporting and refuting information in favor of the supporting information.
20. A computer program product for verifying textual claims using a document corpus, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising: retrieving, by a processor device, documents from the document corpus based on Term Frequency Inverse Document Frequency (TFIDF) similarity to a set of textual claims; extracting, by the processor device, named entities and capitalized phrases from the textual claims; retrieving, by the processor device, documents from the document corpus with titles matching any of the extracted named entities and capitalized phrases; extracting, by the processor device, premise sentences from the retrieved documents; classifying, by the processor device, the premise sentences together with sources of the premises sentences against the textual claims to obtain classifications from among possible classifications including a supported classification, an unverified classification, or a contradicted classification; and aggregating, by the processor device, the classifications over the premise sentences to selectively output, for each of the textual claims, an overall decision of the supported classification, the unverified classification, or the contradicted classification.
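The title-prefixing and aggregation steps recited in the claims can be sketched as follows. This is one possible reading, not the claimed implementation. The label strings, the `": "` separator, and the function names are hypothetical. The classifier that produces the per-premise labels is left out, because the claims do not specify it.

```python
SUPPORTED = "supported"
UNVERIFIED = "unverified"
CONTRADICTED = "contradicted"


def with_source_title(title, premise):
    """Place the source document title before the premise sentence.

    The ": " separator is an assumption; the claims only say the title
    is added before the premise.
    """
    return f"{title}: {premise}"


def aggregate(labels):
    """Combine per-premise labels into one overall decision for a claim.

    Conflicts between supporting and refuting premises are resolved in
    favor of support. With no premises, or only unverified ones, the
    decision is unverified.
    """
    if SUPPORTED in labels:
        return SUPPORTED
    if CONTRADICTED in labels:
        return CONTRADICTED
    return UNVERIFIED


def supporting_statements(premises, labels, decision):
    """Return the premises whose label matches the overall decision."""
    return [p for p, label in zip(premises, labels) if label == decision]
```

For example, `aggregate(["contradicted", "supported", "unverified"])` returns `"supported"` under this reading, because any supporting premise outweighs a refuting one. Other tie-breaking rules, such as majority voting, would also fit the independent claims.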
Summarisation for human users · CPC title
Creation or modification of classes or clusters · CPC title
Document management systems · CPC title