Information relation generation

US10198431B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10198431-B2
Application number  US-201113214291-A
Country             US
Kind code           B2
Filing date         Aug 22, 2011
Priority date       Sep 28, 2010
Publication date    Feb 5, 2019
Grant date          Feb 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

For generating a word space, manual thresholding of word scores is used. Rather than requiring the user to select the threshold arbitrarily or review each word, the user is iteratively requested to indicate the relevance of a given word. Words with greater or lesser scores are labeled in the same way depending upon the response. For determining the relationship between named entities, Latent Dirichlet Allocation (LDA) is performed on text associated with the named entities rather than on an entire document. LDA for relationship mining may include context information and/or supervised learning.
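The iterative thresholding described in the abstract can be illustrated with a short sketch. The abstract does not say how the next word to ask about is chosen; the version below assumes a bisection over the score-ranked word list and assumes that relevance is monotone in score. The function name `label_by_threshold`, the `is_relevant` callback, and the sample scores are all hypothetical.

```python
from typing import Callable, Dict


def label_by_threshold(
    scores: Dict[str, float],
    is_relevant: Callable[[str], bool],
) -> Dict[str, bool]:
    """Label every word as relevant or not by asking about only a few words.

    Assumption (not from the patent text): the word to ask about is picked
    by bisection over the words ranked by score.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
    lo, hi = 0, len(ranked)  # ranked[:lo] are relevant, ranked[hi:] are not
    while lo < hi:
        mid = (lo + hi) // 2
        if is_relevant(ranked[mid]):
            # This word and every higher-scoring word get the same label.
            lo = mid + 1
        else:
            # This word and every lower-scoring word get the same label.
            hi = mid
    return {word: rank < lo for rank, word in enumerate(ranked)}


if __name__ == "__main__":
    # Hypothetical scores; a fixed set stands in for the user's answers.
    sample_scores = {"turbine": 0.9, "blade": 0.7, "rotor": 0.6, "the": 0.2, "of": 0.1}
    user_says_relevant = {"turbine", "blade", "rotor"}
    print(label_by_threshold(sample_scores, lambda w: w in user_says_relevant))
```

With bisection the user answers roughly log2(n) questions instead of reviewing all n words, which is one way to avoid both an arbitrary threshold and a word-by-word review.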

First claim

Opening claim text (preview).

We claim:

1. A method for mining a relationship of at least a first and a second named entity comprising: identifying a sentence with at least the first and the second named entity in a document; defining, by a processor, a first instance comprising the first and the second named entity, a type of named entity for each of the first and the second named entity, and text in the sentence between the first and the second named entity; applying, by a processor, latent Dirichlet allocation (LDA) to the document, the LDA including an input of the first instance, and then determining a distribution of types of relationship as an output, the types of relationship comprising labels of how the first named entity relates to the second named entity; and selecting one of the types of the relationship as the relationship for the first and the second named entity, wherein applying the LDA comprises applying a supervised maximum entropy discrimination LDA with the characteristic types of relationships as observed response variables of an output for supervision of the supervised maximum entropy discrimination LDA.

2. The method of claim 1, wherein identifying the sentence comprises pairing named entities in sentences of the document, and wherein defining comprises defining a plurality of instances including the first instance.

3. The method of claim 2 wherein applying the LDA comprises identifying the relationship for each instance of the plurality of instances.

4. The method of claim 1, wherein determining the distribution of types of relationship comprises discriminating between types of relationship by using a machine learning classifier.

5. The method of claim 4, wherein for discriminating, the machine learnt classifier uses training data with known types of relationship and specific instances.

6. The method of claim 4, wherein for the first instance, the type of relationship indicated is identified by a support vector machine.

7. The method of claim 1, wherein the selection of one of the types of the relationship as the relationship for the first and the second named entities is based on an average over all possible models and latent topics, wherein latent topics are hidden semantic features discovered by topic models.

8. A method for mining a relationship of at least a first and a second named entity comprising: identifying a sentence with at least the first and the second named entity in a document; defining, by a processor, a first instance comprising the first and the second named entity, a type of named entity for each of the first and the second named entity, and text in the sentence between the first and the second named entity; applying, by a processor, latent Dirichlet allocation (LDA) to the document, the LDA including an input of the first instance, and then determining a distribution of types of relationship as an output, the types of relationship comprising labels of how the first named entity relates to the second named entity; and selecting one of the types of the relationship as the relationship for the first and the second named entity, wherein applying the LDA comprises applying a labeled LDA without a labeling prior probability.

9. A method for mining a relationship of at least a first named entity and a second named entity on a non-transitory computer readable storage media having stored therein data representing instructions executable by a programmed processor, the method comprising: applying latent Dirichlet allocation (LDA) to a document, which is stored on the non-transitory computer readable storage media, with identified sentences with the first and the second named entity, the LDA including an input of the first instance, the first instance comprising the first and the second named entity, a type of named entity for the first and the second named entity and text in a sentence between the first and the second named entity, and then determining a distribution of types of relationship as an output of the LDA, the types of relationships comprising labels of how the first named entity relates to the second named entity; and selecting one of the types of relationship as the relationship for the first and the second named entity, wherein applying the LDA comprises applying a supervised maximum entropy discrimination LDA with the characteristic types of relationships as observed response variables of an output for supervision of the supervised maximum entropy discrimination LDA.
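The "instance" recited in the claims (two named entities, the type of each, and the sentence text between them) and the final selection step can be sketched as plain data handling. This is only an illustration under stated assumptions: the entity list is taken as already recognized, entities are located by simple substring search, and the relationship distribution is a hand-written placeholder standing in for the output of the supervised LDA, which is not implemented here. `Instance`, `build_instances`, `select_relationship`, and the relationship labels are hypothetical names.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instance:
    first: str
    first_type: str
    second: str
    second_type: str
    between: str  # sentence text between the two entities


def build_instances(sentence: str, entities: List[Tuple[str, str]]) -> List[Instance]:
    """Pair the named entities found in one sentence into instances.

    `entities` holds (name, type) pairs from an upstream recognizer, which
    is assumed to exist and is not part of this sketch.
    """
    located = []
    for name, entity_type in entities:
        start = sentence.find(name)  # naive lookup: first occurrence only
        if start >= 0:
            located.append((start, name, entity_type))
    located.sort()  # left-to-right order within the sentence

    instances = []
    for (s1, n1, t1), (s2, n2, t2) in combinations(located, 2):
        between = sentence[s1 + len(n1):s2].strip()
        instances.append(Instance(n1, t1, n2, t2, between))
    return instances


def select_relationship(distribution: Dict[str, float]) -> str:
    """Pick the relationship type with the highest probability."""
    return max(distribution, key=distribution.get)


if __name__ == "__main__":
    sentence = "Alice Smith works for Acme Corp in Berlin."
    found = [("Alice Smith", "PERSON"), ("Acme Corp", "ORG")]
    for instance in build_instances(sentence, found):
        print(instance)
    # Placeholder distribution; in the claims this is the output of the LDA.
    print(select_relationship({"employed_by": 0.7, "located_in": 0.2, "founder_of": 0.1}))
```

Pairing every two entities in a sentence mirrors claim 2 ("pairing named entities in sentences of the document"), and taking the highest-probability label is one simple reading of "selecting one of the types"; claim 7 describes a selection based on an average over models and latent topics, which this sketch does not attempt.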

Assignees

Inventors

Classifications

  • into predefined classes · CPC title

  • Lexical analysis, e.g. tokenisation or collocates · CPC title

  • G06F40/295 · Primary

    Named entity recognition · CPC title

  • Creation of semantic tools, e.g. ontology or thesauri · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10198431B2 cover?
For generating a word space, manual thresholding of word scores is used. Rather than requiring the user to select the threshold arbitrarily or review each word, the user is iteratively requested to indicate the relevance of a given word. Words with greater or lesser scores are labeled in the same way depending upon the response. For determining the relationship between named entities, Latent Di…
Who is the assignee on this patent?
Somasundaran Swapna, Li Dingcheng, Chakraborty Amit, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F40/295. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 5, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).