What technology area does this patent fall under?

Primary CPC classification G06F16/313. Mapped technology areas include Physics.

When was this patent published?

Published Jan 14, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Automatically linking text to concepts in a knowledge base

US2016012336A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016012336-A1
Application number	US-201514657343-A
Country	US
Kind code	A1
Filing date	Mar 13, 2015
Priority date	Jul 14, 2014
Publication date	Jan 14, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to an aspect, automatically linking text to concepts in a knowledge base using differential analysis includes receiving a text string and selecting, based on contents of the text string, a plurality of data sources that correspond to concepts in the knowledge base. In a further aspect, automatically linking the text to the concepts includes calculating, for each of the selected data sources, a probability that the text string is output by a language model built using the selected data source, calculating a probability that the text string is output by a generic language model, calculating link confidence scores for each concept based on a differential analysis of the probabilities, and creating a link from the text string to one of the concepts in the knowledge base. The creating is based on a link confidence score of the concept being more than a threshold value away from a prescribed threshold.
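
The abstract does not give a formula for the "differential analysis". One plausible reading, offered here only as an illustration, is a log-likelihood ratio between a concept-specific language model and the generic one. The symbols are assumptions introduced for this sketch: $s$ is the text string, $M_c$ the language model built from the data source for concept $c$, $M_G$ the generic language model, $\tau$ the prescribed threshold, and $\delta$ the threshold value (margin). The one-sided test is also an assumption; the wording "away from" could be read as two-sided.

```latex
\[
\mathrm{score}(c \mid s) \;=\; \log P(s \mid M_c) \;-\; \log P(s \mid M_G),
\qquad
\text{link } s \to c \ \text{ if } \ \mathrm{score}(c \mid s) - \tau > \delta .
\]
```

Under this reading, a positive score means the concept's data source explains the text better than generic language does, and the margin $\delta$ suppresses links whose score sits too close to the cutoff.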

First claim

Opening claim text (preview).

What is claimed is:
1. A method for automatically linking text to concepts in a knowledge base using a differential analysis, the method comprising: receiving a text string; selecting a plurality of data sources that correspond to concepts in the knowledge base, the selecting based on contents of the text string; calculating, for each of the selected data sources, a probability that the text string is output by a language model built using the selected data source; calculating a probability that the text string is output by a generic language model; calculating link confidence scores for each concept based on a differential analysis of the probabilities; and creating a link from the text string to one of the concepts in the knowledge base, the creating based on a link confidence score of the concept being more than a threshold value away from a prescribed threshold.
2. The method of claim 1, wherein the differential analysis compares the probability that the text string is output by a language model built using a data source to the probability that the text string is output by a generic language model.
3. The method of claim 1, wherein the differential analysis compares the probability that the text string is output by a language model built using a data source to a probability that the text string is output by a language model built using a competing data source.
4. The method of claim 1, wherein the generic language model is derived from a generic data source not specific to any of the concepts in the knowledge base.
5. The method of claim 1, wherein the calculating link confidence scores includes comparing the probabilities to a probability that the text string is contained in a generic data source that is not associated with any of the concepts in the knowledge base.
6. The method of claim 1, wherein the text string is linked to a second one of the concepts in the knowledge base.
7. The method of claim 1, wherein the link applies to a subset of the text string and the subset is indicated in the link.
8. The method of claim 7, wherein words in the subset are not consecutive in the text string.
9. The method of claim 1, wherein the text string is one of a collection of words, a sentence, a paragraph, and a whole document.
10. The method of claim 1, wherein each of the selected data sources includes one or more collection of names for the corresponding concept, a description for the corresponding concept, sentences referring to the corresponding concept, and paragraphs referring to the corresponding concept.
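
The steps of claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the add-one smoothed unigram model, the word-overlap rule for selecting data sources, the log-ratio score, the one-sided threshold test, and every name (`UnigramModel`, `link_text`, `prescribed`, `margin`) are hypothetical choices made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class UnigramModel:
    """Add-one smoothed unigram model; a stand-in for any language model."""

    def __init__(self, corpus, vocab_size):
        self.counts = Counter(tokenize(corpus))
        self.total = sum(self.counts.values())
        self.vocab_size = vocab_size

    def log_prob(self, text):
        # Log-probability that this model outputs the text string.
        denom = self.total + self.vocab_size
        return sum(math.log((self.counts[t] + 1) / denom) for t in tokenize(text))


def link_text(text, concept_sources, generic_source, prescribed=0.0, margin=1.0):
    words = set(tokenize(text))
    # Assumed selection rule: keep data sources sharing a word with the text.
    candidates = {c: src for c, src in concept_sources.items()
                  if words & set(tokenize(src))}
    if not candidates:
        return None, {}

    # Shared vocabulary size across all sources, plus one slot for unseen words.
    vocab = set(tokenize(generic_source))
    for src in concept_sources.values():
        vocab |= set(tokenize(src))
    vocab_size = len(vocab) + 1

    generic_lp = UnigramModel(generic_source, vocab_size).log_prob(text)
    # Differential analysis (assumed form): concept log-prob minus generic log-prob.
    scores = {c: UnigramModel(src, vocab_size).log_prob(text) - generic_lp
              for c, src in candidates.items()}

    best = max(scores, key=scores.get)
    # Link only if the best score clears the prescribed threshold by the margin.
    linked = best if scores[best] - prescribed > margin else None
    return linked, scores


# Toy data, invented for illustration only.
sources = {
    "Jaguar (animal)": "the jaguar is a large cat that hunts prey in the rainforest jaguar cat",
    "Jaguar (car)": "the jaguar is a luxury car with a powerful engine jaguar car",
}
generic = "the weather is nice today and the market opened higher the team won the game"

concept, scores = link_text("jaguar hunts prey in the rainforest", sources, generic)
print(concept, scores)
```

The comparison against a competing data source described in claim 3 could be sketched the same way by subtracting the runner-up concept's log-probability instead of the generic model's.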

Assignees

IBM

Inventors

Classifications

G06N5/01
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
G06F40/134
Hyperlinking · CPC title
G06F16/313 (Primary)
Selection or weighting of terms for indexing · CPC title
G06F16/3334
Selection or weighting of terms from queries, including natural language queries · CPC title
G06F16/3346
using probabilistic model · CPC title

Patent family

Related publications grouped by family.

View patent family 55067747

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?: IBM