Automated cognitive processing of source agnostic data
US-2019102375-A1 · Apr 4, 2019 · US
US10776587B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10776587-B2 |
| Application number | US-201615206326-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 11, 2016 |
| Priority date | Jul 11, 2016 |
| Publication date | Sep 15, 2020 |
| Grant date | Sep 15, 2020 |
Official abstract text for this publication.
A computer-implemented method, computerized apparatus and computer program product for claim generation, the method comprising: selecting at least one subject according to a given topic; selecting at least one verb from a first data source; selecting at least one object from a second data source; generating one or more candidate claim sentences, each of which is composed of a subject selected from the at least one subject, a verb selected from the at least one verb and an object selected from the at least one object; and determining validity of the candidate claim sentences using a machine learning process.
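The generation step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the function names `select_subjects` and `generate_candidates`, the substring match used in place of real noun-phrase extraction, and the toy verb and object lists are all assumptions made for illustration.

```python
from itertools import product


def select_subjects(topic, known_noun_phrases):
    """Hypothetical stand-in for noun-phrase extraction from the topic.

    Keeps each known phrase that literally appears in the topic text.
    A real system would use a syntactic parser instead.
    """
    topic_lower = topic.lower()
    return [np for np in known_noun_phrases if np.lower() in topic_lower]


def generate_candidates(subjects, verbs, objects):
    """Combine every subject, verb and object into a candidate sentence."""
    return [f"{s} {v} {o}" for s, v, o in product(subjects, verbs, objects)]


topic = "Ban the sale of violent video games to minors"
subjects = select_subjects(topic, ["violent video games", "minors", "tax reform"])
verbs = ["cause", "increase"]          # toy "first data source"
objects = ["aggression", "addiction"]  # toy "second data source"

candidates = generate_candidates(subjects, verbs, objects)
for sentence in candidates:
    print(sentence)
```

Because every subject is paired with every verb and object, most candidates will be incoherent or off-topic, which is why the method follows generation with a validity check.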
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: selecting at least one subject according to a given topic; selecting at least one verb from a first data source; selecting at least one object from a second data source different from the first data source; automatically synthesizing one or more automatically-generated candidate claim sentences, comprising combining for each automatically-generated candidate claim sentence: a subject selected from the at least one subject, a verb selected from the at least one verb and an object selected from the at least one object; and determining validity of the candidate automatically-generated claim sentences using a machine learning process, comprising applying a trained classifier on one or more features extracted from a candidate claim sentence selected from the automatically-generated candidate claim sentences, to obtain a predicted validity labeling, the predicated validity labeling referring to the coherency and relevancy to the given topic.
2. The computer-implemented method of claim 1, wherein said selecting at least one subject comprises analyzing the given topic to extract one or more noun phrases.
3. The computer-implemented method of claim 1, wherein the first data source is selected from the group consisting of: verbs extracted from a set of pre-labeled claim sentences; predetermined causation verbs; and any combination thereof.
4. The computer-implemented method of claim 1, wherein the second data source is selected from the group consisting of: objects extracted from a set of pre-labeled claim sentences; objects extracted from a set of claim sentences detected in a corpus; concepts related to the given topic in a knowledge base; objects mined from a corpus based on predetermined criteria; and any combination thereof.
5. The computer-implemented method of claim 1, wherein the trained classifier is trained using a training dataset comprising pre-labeled claim sentences.
6. The computer-implemented method of claim 1, wherein at least one of the features relates to similarity between constituents of the candidate claim sentence and the given topic.
7. The computer-implemented method of claim 1, wherein at least one of the features relates to co-occurrence of constituents of the candidate claim sentence within a corpus.
8. The computer-implemented method of claim 1, wherein the first and second data sources comprise a same set of pre-labeled claim sentences associated each with a topic, wherein a verb and an object of a candidate claim sentence are selected from a same pre-labeled claim sentence of the set, wherein at least one of the features relates to similarity between constituents of the pre-labeled claim sentence and associated topic thereof.
9. The computer-implemented method of claim 1, further comprising: using the machine learning process to rank candidate claim sentences determined as valid, and outputting a top ranked subset thereof.
10. A computerized apparatus having a processor, the processor being adapted to perform the steps of: selecting at least one subject according to a given topic; selecting at least one verb from a first data source; selecting at least one object from a second data source different from the first data source; automatically synthesizing one or more automatically-generated candidate claim sentences, comprising combining for each automatically-generated candidate claim sentence: a subject selected from the at least one subject, a verb selected from the at least one verb and an object selected from the at least one object; and determining validity of the candidate automatically-generated claim sentences using a machine learning process, comprising applying a trained classifier on one or more features extracted from a candidate claim sentence selected from the automatically-generated candidate claim sentences, to obtain a predicted validity labeling, the predicated validity labeling referring to the coherency and relevancy to the given topic.
11. The computerized apparatus of claim 10, wherein said selecting at least one subject comprises analyzing the given topic to extract one or more noun phrases.
12. The computerized apparatus of claim 10, wherein the first data source is selected from the group consisting of: verbs extracted from a set of pre-labeled claim sentences; predetermined causation verbs; and any combination thereof.
13. The computerized apparatus of claim 10, wherein the second data source is selected from the group consisting of: objects extracted from a set of pre-labeled claim sentences; objects extracted from a set of claim sentences detected in a corpus; concepts related to the given topic in a knowledge base; objects mined from a corpus based on predetermined criteria; and any combination thereof.
14. The computerized apparatus of claim 10, wherein at least one of the features relates to similarity between constituents of the candidate claim sentence and the given topic.
15. The computerized apparatus of claim 10, wherein at least one of the features relates to co-occurrence of constituents of the candidate claim sentence within a corpus.
16. The computerized apparatus of claim 10, wherein the first and second data sources comprise a same set of pre-labeled claim sentences associated each with a topic, wherein a verb and an object of a candidate claim sentence are selected from a same pre-labeled claim sentence of the set, wherein at least one of the features relates to similarity between constituents of the pre-labeled claim sentence and associated topic thereof.
17. The computerized apparatus of claim 10, wherein said processor is further configured for using the machine learning process to rank candidate claim sentences determined as valid, and outputting a top ranked subset thereof.
18. A computer program product comprising a non-transitory computer readable storage medium retaining program instructions, which program instructions when read by a processor, cause the processor to perform a method comprising: selecting at least one subject according to a given topic; selecting at least one verb from a first data source; selecting at least one object from a second data source different from the first data source; automatically synthesizing one or more automatically-generated candidate claim sentences, comprising combining for each automatically-generated candidate claim sentence: a subject selected from the at least one subject, a verb selected from the at least one verb and an object selected from the at least one object; and determining validity of the candidate automatically-generated claim sentences using a machine learning process, comprising applying a trained classifier on one or more features extracted from a candidate claim sentence selected from the automatically-generated candidate claim sentences, to obtain a predicted validity labeling, the predicated validity labeling referring to the coherency and relevancy to the given topic.
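The validity and ranking steps recited in the claims (a similarity feature, a co-occurrence feature, a trained classifier, and output of a top-ranked subset) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: Jaccard token overlap stands in for the similarity feature, sentence-level co-occurrence stands in for the corpus feature, a small logistic regression stands in for the trained classifier, and every function name, the toy corpus, and the hand-written training features are hypothetical.

```python
import math


def jaccard(a, b):
    """Token-overlap similarity (assumed stand-in for the similarity feature)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cooccurrence(subject, obj, corpus):
    """Fraction of corpus sentences that mention both subject and object."""
    if not corpus:
        return 0.0
    hits = sum(1 for s in corpus
               if subject.lower() in s.lower() and obj.lower() in s.lower())
    return hits / len(corpus)


def extract_features(candidate, topic, corpus):
    subject, _verb, obj = candidate
    return [jaccard(subject, topic), cooccurrence(subject, obj, corpus)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(X, y, lr=0.5, epochs=2000):
    """Fit a tiny logistic regression by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def rank_valid(candidates, topic, corpus, model, threshold=0.5, top_k=2):
    """Keep candidates predicted valid, then return the top_k by score."""
    w, b = model
    scored = []
    for cand in candidates:
        x = extract_features(cand, topic, corpus)
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
        if p >= threshold:
            scored.append((cand, p))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy features standing in for pre-labeled claim sentences (1 = valid).
train_X = [[0.5, 0.5], [0.4, 0.3], [0.0, 0.0], [0.1, 0.0]]
train_y = [1, 1, 0, 0]
model = train_classifier(train_X, train_y)

topic = "violent video games"
corpus = [
    "violent video games increase aggression in teens",
    "studies link violent video games and aggression",
    "tax reform affects the economy",
    "the weather is nice",
]
candidates = [
    ("tax reform", "cause", "aggression"),
    ("video games", "cause", "aggression"),
    ("violent video games", "cause", "aggression"),
]
top = rank_valid(candidates, topic, corpus, model)
for (subject, verb, obj), score in top:
    print(f"{subject} {verb} {obj}")
```

Reusing the classifier's probability as the ranking score is one simple reading of using the machine learning process to rank valid candidates; a separate ranking model would be an equally plausible design.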