Unsupervised relation detection model training

US10073840B2 · US · B2

Patent metadata
Publication number: US-10073840-B2
Application number: US-201314136919-A
Country: US
Kind code: B2
Filing date: Dec 20, 2013
Priority date: Dec 20, 2013
Publication date: Sep 11, 2018
Grant date: Sep 11, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A relation detection model training solution. The relation detection model training solution mines freely available resources from the World Wide Web to train a relationship detection model for use during linguistic processing. The relation detection model training system searches the web for pairs of entities extracted from a knowledge graph that are connected by a specific relation. Performance is enhanced by clipping search snippets to extract patterns that connect the two entities in a dependency tree and refining the annotations of the relations according to other related entities in the knowledge graph. The relation detection model training solution scales to other domains and languages, pushing the burden from natural language semantic parsing to knowledge base population. The relation detection model training solution exhibits performance comparable to supervised solutions, which require design, collection, and manual labeling of natural language data.
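The snippet-clipping step mentioned in the abstract can be illustrated with a minimal sketch. The abstract describes clipping along a dependency tree; the sketch below swaps in a simpler stand-in, taking the shortest character span that contains both entities (closer in spirit to the "smallest sequence" language of claim 9) and replacing each entity with a placeholder token. The function name `clip_snippet`, the placeholder strings, and the example sentence are hypothetical and do not come from the patent.

```python
import re


def clip_snippet(snippet, entity_a, entity_b,
                 token_a="<ENTITY_A>", token_b="<ENTITY_B>"):
    """Return the shortest span of `snippet` that contains both entities,
    with each entity mention replaced by a placeholder token.

    Returns None when either entity is missing from the snippet.
    Simplification: plain substring matching, not dependency parsing.
    """
    spans_a = [m.span() for m in re.finditer(re.escape(entity_a), snippet, re.IGNORECASE)]
    spans_b = [m.span() for m in re.finditer(re.escape(entity_b), snippet, re.IGNORECASE)]
    if not spans_a or not spans_b:
        return None

    # Try every pairing of mentions and keep the tightest enclosing span.
    best = None
    for a_start, a_end in spans_a:
        for b_start, b_end in spans_b:
            start, end = min(a_start, b_start), max(a_end, b_end)
            if best is None or (end - start) < (best[1] - best[0]):
                best = (start, end)

    clipped = snippet[best[0]:best[1]]
    # Replace the surface forms with placeholders so the pattern generalizes.
    clipped = re.sub(re.escape(entity_a), lambda _: token_a, clipped, flags=re.IGNORECASE)
    clipped = re.sub(re.escape(entity_b), lambda _: token_b, clipped, flags=re.IGNORECASE)
    return clipped


# Hypothetical entity pair connected by a "directed by" relation.
pattern = clip_snippet(
    "In 1997, Titanic was directed by James Cameron and won 11 Oscars.",
    "Titanic", "James Cameron",
)
print(pattern)  # <ENTITY_A> was directed by <ENTITY_B>
```

A dependency-based clipper would likely keep fewer irrelevant words than this span-based version when the two entities sit far apart in the sentence.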

First claim

Opening claim text (preview).

What is claimed is:

1. A method of automatically generating natural language patterns based on a knowledge graph, the method comprising: selecting a relation from a knowledge graph; extracting at least a first pair of words from the knowledge graph, wherein the first pair of words is connected by the relation; receiving a set of documents as a search result based on a first query, wherein the first query comprises at least one instruction to select documents based on the first pair of words; extracting, from the set of documents, at least one textual snippet based on the first query, wherein the at least one textual snippet includes at least in part the first pair of words; extracting a second query from a query click log, wherein the query click log comprises at least one search query against at least a part of the set of documents and at least one link to at least one document, and wherein the second query is associated with at least one link to the at least one document containing the at least one textual snippet; generating a first set of training patterns, wherein the first set of training patterns is based on association between the at least one textual snippet and the relation; generating a second set of training patterns, wherein the second set of training patterns is based on association between the second query and the relation; generating a third set of natural language patterns for the knowledge graph, wherein generating the set of natural language patterns further comprises selectively combining the first set of training patterns and the second set of training patterns based on at least one weight between the first set of training patterns and the second set of training patterns; and applying the generated third set of natural language patterns to the knowledge graph to automatically train a natural language dialog system.

2. The method of claim 1, further comprising training a relation detection model using the set of natural language patterns for the knowledge graph.

3. The method of claim 1 wherein associating the at least one textual snippet further comprises: retrieving all properties associated with the relation, wherein the properties comprise entities and corresponding relations; comparing the properties to the at least one textual snippet; and annotating the at least one textual snippet containing matches to entities from the properties with the corresponding relations.

4. The method of claim 1 further comprising interpolating the set of natural language patterns; and labeling the at least one snippet with additional relations from a set of additional relations using the relation classifier.

5. The method of claim 4 wherein the set of additional relations comprises relations with a high probability of appearing in a conversational input.

6. The method of claim 1 wherein the set of documents is on the World Wide Web.

7. The method of claim 1 wherein the relation and the pair of words form a triple.

8. A computer readable storage device containing computer executable instructions which, when executed by a computer, perform a method for training a relation detection model without supervision, the method comprising: selecting a relation from a knowledge graph; extracting at least a first pair of words from the knowledge graph, wherein the first pair of words is connected by the relation; receiving a set of documents as a search result based on a first query, wherein the first query comprises at least one instruction to select documents based on the first pair of words; extracting, from the set of documents, at least one textual snippet based on the first query, wherein the at least one textual snippet includes at least in part the first pair of words; extracting a second query from a query click log, wherein the query click log comprises at least one search query against at least a part of the set of documents and at least one link to at least one document, and wherein the second query is associated with at least one link to the at least one document containing the at least one textual snippet; generating a first set of training patterns, wherein the first set of training patterns is based on association between the at least one textual snippet and the relation; generating a second set of training patterns, wherein the second set of training patterns is based on association between the second query and the relation; generating a third set of natural language patterns for the knowledge graph, wherein generating the set of natural language patterns further comprises selectively combining the first set of training patterns and the second set of training patterns based on at least one weight between the first set of training patterns and the second set of training patterns; and applying the generated third set of natural language patterns to the knowledge graph to automatically train a natural language dialog system.

9. The computer readable storage device of claim 8 wherein associating the at least one textual snippet further comprises: selecting the smallest sequence of constituent elements in the at least one textual snippet that contains both words from the pair as the set of natural language patterns; and replacing the both words in the set of natural language patterns with tokens from the knowledge graph corresponding to each of both words.

10. The computer readable storage device of claim 9 wherein the method further comprises associating a pair of words from the set of natural language patterns with additional relations when the pair of words corresponds to more than one relation.

11. A system comprising at least one processor in electronic communication with a computer readable storage device, the computer readable storage device storing instructions that, when executed, are capable of performing a method, the method comprising: selecting a relation from a knowledge graph; extracting at least a first pair of words from the knowledge graph, wherein the first pair of words is connected by the relation; receiving a set of documents as a search result based on a first query, wherein the first query comprises at least one instruction to select documents based on the first pair of words; extracting, from the set of documents, at least one textual snippet based on the first query, wherein the at least one textual snippet includes at least in part the first pair of words; extracting a second query from a query click log, wherein the query click log comprises at least one search query against at least a part of the set of documents and at least one link to at least one document, and wherein the second query is associated with at least one link to the at least one document containing the at least one textual snippet; generating a first set of training patterns, wherein the first set of training patterns is based on association between the at least one textual snippet and the relation; generating a second set of training patterns, wherein the second set of training patterns is based on association between the second query and the relation; generating a third set of natural language patterns for the knowledge graph, wherein generating the set of natural language patterns further comprises selectively combining the first set of training patterns and the second set of training patterns based on at least one weight between the first set of training patterns and the second set of training patterns; and applying the generated third set of natural language patterns to the knowledge graph to automatically train a natural language dialog system.

12. The system of claim 11, further comprising training a relation detection model
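The independent claims combine snippet-derived and query-derived pattern sets "based on at least one weight" without specifying a formula in this preview. One plausible reading is a linear interpolation of normalized pattern frequencies, sketched below for a single relation. The function `combine_patterns`, its `weight` and `min_score` parameters, and the sample counts are illustrative assumptions, not the patented method.

```python
def combine_patterns(snippet_patterns, query_patterns, weight=0.5, min_score=0.0):
    """Interpolate two pattern-count tables for a single relation.

    snippet_patterns / query_patterns: dict mapping pattern string -> raw count.
    weight: share given to snippet patterns; (1 - weight) goes to query patterns.
    min_score: patterns scoring at or below this value are dropped, which makes
               the combination selective (an assumed interpretation).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")

    def normalize(counts):
        # Convert raw counts to relative frequencies; empty tables stay empty.
        total = sum(counts.values())
        return {p: c / total for p, c in counts.items()} if total else {}

    snippet_freq = normalize(snippet_patterns)
    query_freq = normalize(query_patterns)

    combined = {}
    for pattern in set(snippet_freq) | set(query_freq):
        score = (weight * snippet_freq.get(pattern, 0.0)
                 + (1.0 - weight) * query_freq.get(pattern, 0.0))
        if score > min_score:
            combined[pattern] = score
    return combined


# Hypothetical counts for a "directed by" relation.
from_snippets = {"<A> was directed by <B>": 3, "<B> 's film <A>": 1}
from_queries = {"who directed <A>": 2, "<A> was directed by <B>": 2}

merged = combine_patterns(from_snippets, from_queries, weight=0.5, min_score=0.2)
for pattern, score in sorted(merged.items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {pattern}")
```

Normalizing before interpolating keeps the larger source (typically web snippets) from swamping the click-log queries simply because it has more raw counts.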

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • G06F40/40 (Primary)

    Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • G06F17/28 (Primary)

    Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10073840B2 cover?
A relation detection model training solution. The relation detection model training solution mines freely available resources from the World Wide Web to train a relationship detection model for use during linguistic processing. The relation detection model training system searches the web for pairs of entities extracted from a knowledge graph that are connected by a specific relation. Performan…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/40. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 11, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).