Unsupervised template extraction

US10558760B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10558760-B2
Application number: US-201715791009-A
Country: US
Kind code: B2
Filing date: Oct 23, 2017
Priority date: Jul 28, 2017
Publication date: Feb 11, 2020
Grant date: Feb 11, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
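The step from clusters to templates can be pictured with a minimal sketch. It assumes the event patterns have already been extracted and assigned cluster ids; the class names, field names, and role labels are hypothetical illustrations and do not come from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventPattern:
    """Hypothetical record: an event word plus the argument roles seen with it."""
    event: str
    roles: tuple


@dataclass
class RelationshipTemplate:
    """Hypothetical template: the events of one cluster and their combined roles."""
    events: list
    roles: list


def build_templates(patterns, cluster_ids):
    """Create one template per cluster of related event patterns."""
    grouped = defaultdict(list)
    for pattern, cluster_id in zip(patterns, cluster_ids):
        grouped[cluster_id].append(pattern)

    templates = {}
    for cluster_id, members in grouped.items():
        events = sorted({p.event for p in members})
        roles = sorted({role for p in members for role in p.roles})
        templates[cluster_id] = RelationshipTemplate(events, roles)
    return templates


# Toy input: cluster ids are assumed to come from an earlier clustering step.
patterns = [
    EventPattern("acquire", ("buyer", "target")),
    EventPattern("purchase", ("buyer", "price")),
    EventPattern("diagnose", ("doctor", "patient")),
]
templates = build_templates(patterns, cluster_ids=[0, 0, 1])
```

In this sketch, a template's `events` list corresponds to what a first graphical representation would show, and its `roles` list to what would be shown once that representation is selected.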

First claim

Opening claim text (preview).

The invention claimed is:

1. A method implemented by an information handling system that includes a memory and a processor, that improves a question answering (QA) computer system by automatically generating relationship templates, the method comprising: extracting a plurality of event patterns corresponding to a plurality of events from data in a corpus utilized by the QA computer system; analyzing the extracted event patterns resulting in a plurality of clusters of related event patterns; creating one or more relationship templates from the plurality of clusters of related event patterns, wherein a first one of the or more relationship templates comprises a first set of the plurality of events included in a first one of the plurality of clusters; displaying the first relationship template as a first graphical representation on a display, wherein the first graphical representation displays the first set of events; and in response to receiving an input selection that selects the first graphical representation, displaying a set of second graphical representations on the display that represent a set of roles within the first relationship template.

2. The method of claim 1 further comprising: expanding the corpus utilized by the QA system, wherein the expanding further comprises: receiving a plurality of text data outside the corpus; retrieving a plurality of sub-clusters, wherein each of the sub-clusters is based on the created one or more relationship templates; and expanding a plurality of input arguments included in the event patterns by using a set of related event patterns found in the plurality of text data outside the corpus, wherein the method further comprises: matching portions of the corpus with the cluster that includes the related event pattern corresponding to the plurality of input arguments.

3. The method of claim 1 wherein the extracting further comprises: clustering the event patterns using hierarchical agglomerative clustering techniques so that the event patterns that are closer together are clustered in the same event pattern, wherein each cluster of event patterns forms the basis of one of the one or more relationship templates.

4. The method of claim 1 wherein the analyzing further comprises: converting an argument in each of the extracted event patterns into vectors by using distributional semantics, wherein the converting results in a plurality of word values each corresponding to one of a plurality of words in the extracted event pattern; calculating a similarity score between the plurality of words based on the word values pertaining to the respective words; and identifying a plurality of sets of similar words based on a comparison of the calculated similarities.

5. The method of claim 4 further comprising: selecting each of the arguments and a successive argument to the selected argument; and calculating the similarity score between the selected argument and the successive argument.

6. The method of claim 5 further comprising: performing the selecting and calculating on each pair of arguments and storing the similarity scores of all of the pairs; comparing the similarity scores to a threshold; in response to the threshold revealing a high similarity score between the arguments in one or more of the pairs of arguments: clustering the event pattern corresponding with the selected arguments with the event pattern corresponding with the selected arguments' respective successive arguments.

7. The method of claim 5 wherein the similarity is calculated using a cosine similarity algorithm.
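The similarity scoring and threshold comparison described in claims 4 to 7 can be illustrated with a short sketch. It is not the patented implementation: the vectors are hand-made toy values standing in for distributional-semantics embeddings, the threshold of 0.9 is an arbitrary assumption, and the merge step is a simple single-link grouping used here as a stand-in for hierarchical agglomerative clustering.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_by_threshold(vectors, threshold):
    """Group keys whose vectors score at or above the threshold (single-link merge)."""
    parent = {key: key for key in vectors}

    def find(key):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    # Score every pair and merge the clusters of any pair that is similar enough.
    for a, b in combinations(vectors, 2):
        if cosine_similarity(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for key in vectors:
        clusters.setdefault(find(key), []).append(key)
    return sorted(sorted(members) for members in clusters.values())


# Toy argument vectors (assumed values, not real embeddings).
ARGUMENT_VECTORS = {
    "acquire": [0.9, 0.1, 0.0],
    "purchase": [0.8, 0.2, 0.0],
    "diagnose": [0.0, 0.1, 0.9],
    "treat": [0.1, 0.0, 0.8],
}
clusters = cluster_by_threshold(ARGUMENT_VECTORS, threshold=0.9)
```

With these toy values, "acquire" and "purchase" end up in one group and "diagnose" and "treat" in another, because only those pairs score above the assumed threshold.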

Assignees

Inventors

Classifications

  • Natural language query formulation · CPC title

  • Discourse or dialogue representation · CPC title

  • using natural language analysis · CPC title

  • for mining of medical data, e.g. analysing previous cases of other patients · CPC title

  • Templates · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10558760B2 cover?
An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the pluralit…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/3329. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 11, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).