Automated ontology building

US2016188564A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016188564-A1
Application number  US-201514748324-A
Country             US
Kind code           A1
Filing date         Jun 24, 2015
Priority date       Dec 29, 2014
Publication date    Jun 30, 2016
Grant date

Abstract

Official abstract text for this publication.

A method and system are provided for automated ontology building. The method includes creating contextual tokens from text, parsing the text into at least one parse tree, and calculating a dependency graph across the contextual tokens using the at least one parse tree. The method further includes generating concept instance candidates and parent-child relationships based on pattern matching and transformation of the at least one parse tree. The method also includes grouping concept instance candidates into concept candidates. The method additionally includes arranging the concept candidates into a tree having tree nodes and creating predicate-based relationships between the tree nodes based on patterns and predicates identified in the text. The method further includes scoring and sorting the tree nodes. The method also includes performing an analysis of the tree nodes and rebalancing the tree based on the analysis to provide an ontology based on the text.
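Two of the steps named in the abstract, grouping concept instance candidates into concept candidates and arranging the concepts into a tree from parent-child relationships, can be illustrated with a minimal Python sketch. This is not the patented implementation: the helper `toy_lemma` is a hypothetical stand-in for a real lemmatizer, and the input lists are hand-written where the method would derive them from parse trees.

```python
from collections import defaultdict


def toy_lemma(word: str) -> str:
    """Hypothetical lemmatizer stand-in: lowercase and drop a trailing 's'."""
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def group_candidates(instances):
    """Group surface forms (instance candidates) under one concept key."""
    groups = defaultdict(list)
    for surface in instances:
        groups[toy_lemma(surface)].append(surface)
    return dict(groups)


def build_tree(parent_child_pairs):
    """Arrange concepts into a mapping of parent -> sorted child concepts."""
    tree = defaultdict(set)
    for parent, child in parent_child_pairs:
        tree[toy_lemma(parent)].add(toy_lemma(child))
    return {parent: sorted(children) for parent, children in tree.items()}


# Hand-written inputs (assumed); the method would extract these from text.
instances = ["Dogs", "dog", "Cats", "animal", "Animals"]
pairs = [("animal", "Dogs"), ("Animals", "cat")]

concepts = group_candidates(instances)
tree = build_tree(pairs)
print(concepts)
print(tree)
```

Here the grouping key plays the role of the equality test between text and lemma; a synonym set could be substituted for `toy_lemma` without changing the rest of the sketch.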

First claim

Opening claim text (preview).

What is claimed is:

1. A method for automated ontology building, comprising: creating contextual tokens from text; parsing the text into at least one parse tree; calculating a dependency graph across the contextual tokens using the at least one parse tree; generating concept instance candidates and parent-child relationships based on pattern matching and transformation of the at least one parse tree; grouping concept instance candidates into concept candidates; arranging the concept candidates into a tree having tree nodes and creating predicate-based relationships between the tree nodes based on patterns and predicates identified in the text; scoring and sorting the tree nodes; and performing an analysis of the tree nodes and rebalancing the tree based on the analysis to provide an ontology based on the text.

2. The method of claim 1, further comprising: analyzing the text to determine enumeration candidates therein based on a set of rules; categorizing and assigning priority values to the enumeration candidates; computing assignment trees for the enumeration candidates to obtain a plurality of admissible candidate layouts; and pruning the enumeration candidates from the text based on the plurality of admissible candidate layouts and the priority values.

3. The method of claim 1, wherein said step of creating the contextual tokens from the text comprises annotating the text using rule-based state machines.

4. The method of claim 1, wherein said step of generating the concept instance candidates and the parent-child relationships comprises tagging words in the at least one parse tree as an applicable one of an instance or a class.

5. The method of claim 1, wherein the concept instance candidates are grouped responsive to a configurable equality expression between the text and at least one lemma.

6. The method of claim 5, wherein the configurable equality expression comprises a synonym set.

7. The method of claim 1, wherein the concept instance candidates are grouped responsive to the contextual tokens.

8. The method of claim 1, wherein the predicates are determined responsive to the dependency graph.

9. The method of claim 1, wherein the concept candidates are arranged into the tree using subclassOf, hyponymOf, and instanceOf relations.

10. The method of claim 1, wherein a given node from among the tree nodes is scored based on a number of children of the given node, a number of times the given node appears in the text, and a number of times the given node appears in the predicate-based relationships.

11. The method of claim 1, wherein the predicates in the text identified by said identifying step consist of predicates having at least two mandatory arguments.

12. The method of claim 1, wherein the ontology is formed as an output graph comprising a plurality of nodes, and the method further comprises providing a user interface for editing the ontology by at least one of adding a new node to the output graph, removing an existing node from the output graph, moving one of the plurality of nodes or a sub-graph across a parent-child hierarchy in the output graph, creating a new relation across the plurality of nodes, and removing an existing relation edges from the graph.

13. The method of claim 1, wherein the at least one parse tree detects applicable parts of speech of the text.
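Claim 10 names three inputs to a node's score: its number of children, its number of appearances in the text, and its number of appearances in predicate-based relationships. The claim does not say how these are combined, so the following Python sketch assumes a weighted sum with equal default weights. The `Node` class, the weight parameters, and the sample counts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    text_count: int = 0        # times the node appears in the text
    predicate_count: int = 0   # times it appears in predicate relationships


def score(node, w_children=1.0, w_text=1.0, w_pred=1.0):
    """Assumed combination: a weighted sum of the three counts in claim 10."""
    return (w_children * len(node.children)
            + w_text * node.text_count
            + w_pred * node.predicate_count)


def sort_nodes(nodes):
    """Sort nodes from highest to lowest score."""
    return sorted(nodes, key=score, reverse=True)


# Made-up counts for illustration only.
dog = Node("dog", text_count=5, predicate_count=2)
cat = Node("cat", text_count=1)
animal = Node("animal", children=[dog, cat], text_count=3, predicate_count=1)

ranked = [n.name for n in sort_nodes([animal, cat, dog])]
print(ranked)
```

With these counts the scores are 7 for `dog`, 6 for `animal`, and 1 for `cat`. Different weights would change the ranking, which is one reason the combination rule is flagged here as an assumption.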

Assignees

IBM

Inventors

Classifications

  • Ontology · CPC title

  • G06F40/40 · Primary

    Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • Semantic analysis · CPC title

  • G06F40/211 · Primary

    Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • G06F17/277 · Primary

    Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016188564A1 cover?
A method and system are provided for automated ontology building. The method includes creating contextual tokens from text, parsing the text into at least one parse tree, and calculating a dependency graph across the contextual tokens using the at least one parse tree. The method further includes generating concept instance candidates and parent-child relationships based on pattern matching and…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/40. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 30, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).