Adding to a knowledge base using an ontological analysis of unstructured text

US9965726B1 · US · B1

Patent metadata
Publication number: US-9965726-B1
Application number: US-201514696214-A
Country: US
Kind code: B1
Filing date: Apr 24, 2015
Priority date: Apr 24, 2015
Publication date: May 8, 2018
Grant date: May 8, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are described for adding knowledge to a knowledge base.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product for adding knowledge to a knowledge base in a computing network, the knowledge base including a plurality of known entity objects and a plurality of known relation objects represented in a machine-readable format of the knowledge base, the knowledge base also including a plurality of known facts, each known fact including one of the known relation objects and at least one of the known entity objects, the computer program product comprising one or more non-transitory computer-readable media having computer program instructions stored therein, the computer program instructions being configured such that, when executed, the computer program instructions cause one or more computing devices to: identify one or more online information sources relating to subject matter within a particular information domain; obtain textual information from the one or more online information sources; identify known strings of text from the textual information based on corresponding known facts in the knowledge base that relate each of the known strings of text to one or more of the known entity objects or one or more of the known relation objects; identify candidate strings of text from the textual information based on correspondence of each of the candidate strings of text to at least one of a plurality of entity classes represented in the knowledge base; generate candidate entity objects in the machine-readable format of the knowledge base by extracting corresponding entity information from the candidate strings of text; identify, from the textual information using natural language processing, relationships among the strings of text corresponding to the known entity objects, the known relation objects, and the candidate entity objects; generate candidate facts by forming fact triples using the known entity objects, the known relation objects, and the candidate entity objects and based on the relationships among the corresponding strings of text in the textual information; identify temporal information in the textual information, the temporal information being a natural language representation of time; generate a temporal constraint using the machine-readable format of the knowledge base from the temporal information for at least one of the candidate facts, the temporal constraint representing a time or a period of time during which the corresponding candidate fact is valid; filter the candidate facts to remove each of the candidate facts for which an associated known entity object or an associated candidate entity object is ontologically incompatible with a corresponding one of the known relation objects in that the associated known entity object or the associated known candidate entity object corresponds to an entity class that is incompatible with a relation expressed by the known relation object; and add one of the remaining candidate facts to the knowledge base.

2. The computer program product of claim 1, wherein the computer program instructions are further configured to cause the one or more computing devices to filter the candidate facts to remove each of the candidate facts that is determined to be unlikely to be true before adding the remaining candidate facts to the knowledge base.

3. A computer-implemented method for adding knowledge to a knowledge base, the knowledge base including a plurality of known entity objects and a plurality of known relation objects represented in a machine-readable format of the knowledge base, the knowledge base also including a plurality of known facts, each known fact including one of the known relation objects and at least one of the known entity objects, the method comprising: identifying known strings of text from textual information based on corresponding known facts in the knowledge base that relate each of the known strings of text to one or more of the known entity objects or one or more of the known relation objects; identifying candidate strings of text from the textual information based on correspondence of each of the candidate strings of text to at least one of a plurality of entity classes represented in the knowledge base; generating candidate entity objects in the machine-readable format of the knowledge base based on the candidate strings of text; based on natural language processing of the textual information, identifying relationships among the strings of text corresponding to the known entity objects, the known relation objects, and the candidate entity objects; generating candidate facts using the known entity objects, the known relation objects, and the candidate entity objects and based on the relationships among the known and candidate strings of text in the textual information; identifying temporal information in the textual information, the temporal information being a natural language representation of time; generating a temporal constraint from the temporal information for at least one of the candidate facts using the machine-readable format of the knowledge base, the temporal constraint representing a time or a period of time during which the corresponding candidate fact is valid; and adding at least some of the candidate facts to the knowledge base.

4. The method of claim 3, wherein generating the candidate facts includes filtering pre-candidate facts to remove each pre-candidate fact for which an associated known entity object or an associated candidate entity object is ontologically incompatible with a corresponding one of the known relation objects.

5. The method of claim 3, wherein generating the candidate facts includes preventing formation of candidate facts for which an associated known entity object or an associated candidate entity object is ontologically incompatible with a corresponding one of the known relation objects.

6. The method of claim 3, wherein a subset of the candidate facts associates two of the known entity objects with one of the known relation objects, the method further comprising filtering the subset of the candidate facts to remove preexisting facts already represented in the knowledge base.

7. The method of claim 3, wherein processing the textual information includes one or more of the following natural language processing techniques: sentence splitting, part of speech tagging, lemmatization, stemming, named entity recognition, or syntactic parsing.

8. The method of claim 3, wherein a set including multiple candidate facts represents a plurality of different interpretations of a same set of text strings, the method further comprising disambiguating the set of candidate facts to remove unlikely interpretations of the corresponding text strings.

9. The method of claim 3, further comprising identifying a first candidate fact as corresponding to a preexisting fact represented in the knowledge base, and endorsing the preexisting fact if the first candidate fact was derived from a different information source than the preexisting fact.

10. The method of claim 3, further comprising identifying a first candidate fact as corresponding to a preexisting fact represented in the knowledge base, and identifying a second candidate fact as reliable based on an association between the first candidate fact and the second candidate fact.

11. A computing system, comprising: one or more data stores in a computing network, the one or more data stores having a knowledge base stored therein including a plurality of known entity objects and a plurality of known relation objects represented in a machine-readable format of the knowledge base, the knowledge base also including a plurality of known facts, each known fact including one of the known relation objects and at least one of
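
As an illustration only, two of the claimed steps (generating a temporal constraint from natural-language time expressions, and filtering out ontologically incompatible candidate facts) might be sketched as follows. Nothing here is taken from the patent's description: the names Entity, Fact, RELATION_SIGNATURES, parse_temporal, and filter_candidates are hypothetical, the toy ontology is invented, and a few regular expressions at year granularity stand in for the natural language processing the claims refer to.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical toy ontology: each relation names the entity classes it
# accepts for its subject and object. A real knowledge base would be richer.
RELATION_SIGNATURES = {
    "is_ceo_of": ("person", "company"),
    "is_married_to": ("person", "person"),
}


@dataclass(frozen=True)
class Entity:
    name: str
    entity_class: str


@dataclass(frozen=True)
class Fact:
    subject: Entity
    relation: str
    obj: Entity
    valid_from: Optional[int] = None  # year the fact starts being valid
    valid_to: Optional[int] = None    # year it stops; None means open-ended


def parse_temporal(text: str) -> Tuple[Optional[int], Optional[int]]:
    """Turn a simple English time expression into a (from, to) year pair."""
    m = re.search(r"\bfrom (\d{4}) to (\d{4})\b", text)
    if m:
        return int(m.group(1)), int(m.group(2))
    m = re.search(r"\bsince (\d{4})\b", text)
    if m:
        return int(m.group(1)), None
    m = re.search(r"\bin (\d{4})\b", text)
    if m:
        year = int(m.group(1))
        return year, year
    return None, None


def is_compatible(fact: Fact) -> bool:
    """True when the entity classes match the relation's signature."""
    signature = RELATION_SIGNATURES.get(fact.relation)
    if signature is None:
        return False
    return (fact.subject.entity_class, fact.obj.entity_class) == signature


def filter_candidates(candidates: List[Fact]) -> List[Fact]:
    """Drop candidate facts that are ontologically incompatible."""
    return [fact for fact in candidates if is_compatible(fact)]


alice = Entity("Alice", "person")
acme = Entity("Acme", "company")
start, end = parse_temporal("Alice has been CEO of Acme since 2012.")

good = Fact(alice, "is_ceo_of", acme, start, end)
bad = Fact(acme, "is_married_to", alice)  # a company cannot be married
kept = filter_candidates([good, bad])
```

In this sketch the filter runs after the triples are formed, which mirrors the wording of claims 1 and 4; claim 5 describes the alternative of never forming the incompatible triple in the first place, which would amount to calling is_compatible before constructing each Fact.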

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title

  • Physics · mapped topic

  • G06N99/005 · Primary

    Physics · mapped topic

  • Machine learning · CPC title

  • G06N5/022 · Primary

    Knowledge engineering; Knowledge acquisition · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9965726B1 cover?
Techniques are described for adding knowledge to a knowledge base.
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N99/005. Mapped technology areas include Physics.
When was this patent published?
Published May 8, 2018 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).