US2016196292A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016196292-A1 |
| Application number | US-201514971312-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 16, 2015 |
| Priority date | Jan 5, 2015 |
| Publication date | Jul 7, 2016 |
| Grant date | — |
Official abstract text for this publication.
A data relevance calculation program for: extracting topics from a group of individual data items and a group of target data items, each item including an index part and a content part, and at least a part of the target data items being related to any of the individual data items, based on words included in the individual data items and the target data items; setting an attribute of each topic based on a degree at which the topic is characterized by words included in the index part or included in the content part; and calculating relevance between any of the individual data items and each of the target data items based on the strength of a relationship between a topic included in an individual data item and a topic included in a target data item related to the individual data item, and on the attribute of each topic.
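The relevance step in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `relevance`, the topic weights, the product-and-sum scoring, and the `mismatch_factor` value are hypothetical. The abstract says only that relevance depends on the strength of the relationship between topics and on each topic's attribute; the claims add that the strength is set lower when the two attributes differ.

```python
from typing import Dict, Tuple

# Hypothetical attribute labels: a topic characterized mainly by index-part
# words or mainly by content-part words.
INDEX, CONTENT = "index", "content"


def relevance(
    individual_topics: Dict[str, float],      # topic id -> weight in the individual item
    target_topics: Dict[str, float],          # topic id -> weight in the target item
    strength: Dict[Tuple[str, str], float],   # (individual topic, target topic) -> strength
    attribute: Dict[str, str],                # topic id -> INDEX or CONTENT
    mismatch_factor: float = 0.5,             # assumed value; the source gives no number
) -> float:
    """Sum topic-pair strengths, lowering pairs whose attributes differ."""
    score = 0.0
    for ti, wi in individual_topics.items():
        for tj, wj in target_topics.items():
            s = strength.get((ti, tj), 0.0)
            if attribute[ti] != attribute[tj]:
                # Differing attributes: use a lower strength than for matching ones.
                s *= mismatch_factor
            score += wi * wj * s
    return score


attribute = {"a": INDEX, "b": INDEX, "c": CONTENT}
strength = {("a", "b"): 0.5, ("a", "c"): 0.5}
score = relevance({"a": 1.0}, {"b": 0.5, "c": 0.5}, strength, attribute)
print(score)  # 0.375
```

In this toy input the pair (a, b) shares an attribute and contributes 0.25, while the pair (a, c) has differing attributes and contributes only 0.125. If all three topics carried the same attribute, the score would be 0.5.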
Opening claim text (preview).
What is claimed is:

1. A non-transitory and computer-readable storage medium that stores a data relevance calculation program for causing a computer to execute processing comprising: extracting a plurality of topics from a group of individual data items, each of which includes an index part and a content part, and a group of target data items, each of which includes an index part and a content part, and at least a part of which is related to any of the individual data items, based on words that are included in the group of the individual data items and the group of the target data items; setting an attribute of each of the topics based on at least one of a degree at which each of the extracted topics is characterized by words that are included in the index part and a degree at which each of the extracted topics is characterized by words that are included in the content part; and calculating relevance between any of the individual data items that are included in the group of the individual data items and each of the target data items that are included in the group of the target data items based on the strength of a relationship between a topic that is included in an individual data item and a topic that is included in a target data item related to the individual data item and on the attribute of each of the topics.

2. The storage medium that stores a data relevance calculation program according to claim 1, wherein in a case where the attribute of the topic that is included in the individual data item differs from the attribute of the topic that is included in the target data item related to the individual data item in the calculating of the relevance, the strength of the relationship between the topics is set to be lower than the strength of the relationship between the topics in a case where the attributes of both the topics are the same.

3. The storage medium that stores a data relevance calculation program according to claim 1, wherein as the attribute of each of the topics, an attribute indicating that the topic is characterized by the words included in the index part is set if the number of the words that are included in the index part is larger than the number of the words that are included in the content part among the plurality of words that characterize each topic, and an attribute indicating that the topic is characterized by the words included in the content part is set if the number of the words that are included in the content part is larger than the number of the words that are included in the index part.

4. The storage medium that stores a data relevance calculation program according to claim 1, wherein the sum of probabilities at which the respective words that are included in the index part, from among a plurality of words that are extracted as words characterizing each topic, occur in the topic is a degree at which the topic is characterized by the words that are included in the index part, and the sum of probabilities at which the respective words that are included in the content part occur in the topic is a degree at which the topic is characterized by the words that are included in the content part.

5. The storage medium that stores a data relevance calculation program according to claim 1, wherein each of the individual data items and the target data items is a document data item that is described in a natural language, wherein the index part is a part in which words or word sequences in accordance with a type of content represented by the respective parts of the document data are described, and wherein the content part is a part other than the index part in the document data.

6. A data relevance calculation device comprising: an extraction unit configured to extract a plurality of topics from a group of individual data items, each of which includes an index part and a content part, and a group of target data items, each of which includes an index part and a content part, and at least a part of which is related to any of the individual data items, based on words that are included in the group of the individual data items and the group of the target data items; a setting unit configured to set an attribute of each of the topics based on at least one of a degree at which each of the topics that are extracted by the extraction unit is characterized by words that are included in the index part and a degree at which each of the topics that are extracted by the extraction unit is characterized by words that are included in the content part; and a calculation unit configured to calculate relevance between any of the individual data items that are included in the group of the individual data items and each of the target data items that are included in the group of the target data items based on the strength of a relationship between a topic that is included in an individual data item and a topic that is included in a target data item related to the individual data item and on the attribute of each of the topics set by the setting unit.

7. The data relevance calculation device according to claim 6, wherein in a case where the attribute of the topic that is included in the individual data item differs from the attribute of the topic that is included in the target data item related to the individual data item, the calculation unit sets the strength of the relationship between the topics to be lower than the strength of the relationship between the topics in a case where the attributes of both the topics are the same.

8. The data relevance calculation device according to claim 6, wherein the setting unit sets an attribute indicating that the topic is characterized by the words included in the index part if the number of the words that are included in the index part is larger than the number of the words that are included in the content part among the plurality of words that characterize each topic, and sets an attribute indicating that the topic is characterized by the words included in the content part if the number of the words that are included in the content part is larger than the number of the words that are included in the index part.

9. The data relevance calculation device according to claim 6, wherein the setting unit regards a sum of probabilities at which the respective words that are included in the index part, from among a plurality of words that are extracted as words characterizing each topic, occur in the topic as a degree at which the topic is characterized by the words that are included in the index part, and regards a sum of probabilities at which the respective words that are included in the content part occur in the topic as a degree at which the topic is characterized by the words that are included in the content part.

10. The data relevance calculation device according to claim 6, wherein each of the individual data items and the target data items is a document data item that is described in a natural language, wherein the index part is a part in which words or word sequences in accordance with a type of content represented by the respective parts of the document data are described, and wherein the content part is a part other than the index part in the document data.

11. A data relevance calculation method of causing a computer to execute processing comprising: extracting a plurality of topics from a group of individual data items, each of which includes an index part and a content part, and a group of target data items, each of which includes an index part and a content part, and at least a part of which is related to any of the individual data items, based on words that are included in the group of the indivi
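The two ways of measuring how strongly a topic is characterized by each part (word counts in claims 3 and 8, probability sums in claims 4 and 9) can be compared in a small sketch. The function names, the toy word lists, and the tie-breaking default are hypothetical. Comparing the two probability sums to pick an attribute is also an assumption: claims 4 and 9 define the degrees but do not state the comparison rule.

```python
from typing import Dict, Set, Tuple

INDEX, CONTENT = "index", "content"  # hypothetical attribute labels


def degrees_by_count(
    top_words: Dict[str, float], index_words: Set[str], content_words: Set[str]
) -> Tuple[int, int]:
    """Count-based degrees: how many characterizing words occur in each part."""
    n_index = sum(1 for w in top_words if w in index_words)
    n_content = sum(1 for w in top_words if w in content_words)
    return n_index, n_content


def degrees_by_probability(
    top_words: Dict[str, float], index_words: Set[str], content_words: Set[str]
) -> Tuple[float, float]:
    """Probability-based degrees: sum of in-topic word probabilities per part."""
    d_index = sum(p for w, p in top_words.items() if w in index_words)
    d_content = sum(p for w, p in top_words.items() if w in content_words)
    return d_index, d_content


def set_attribute(d_index: float, d_content: float, tie: str = CONTENT) -> str:
    """Pick the part with the larger degree; the tie default is an assumption."""
    if d_index > d_content:
        return INDEX
    if d_content > d_index:
        return CONTENT
    return tie


# Toy topic: characterizing words with their probabilities in the topic.
top_words = {"symptom": 0.5, "cause": 0.25, "reboot": 0.125, "cable": 0.125}
index_words = {"symptom", "cause"}            # words seen in index parts
content_words = {"cause", "reboot", "cable"}  # words seen in content parts

by_count = set_attribute(*degrees_by_count(top_words, index_words, content_words))
by_prob = set_attribute(*degrees_by_probability(top_words, index_words, content_words))
print(by_count, by_prob)  # content index
```

In this toy input the two measures disagree: three of the four words occur in content parts, but the index-part words carry more probability mass (0.75 against 0.5). A word that occurs in both parts, such as "cause" here, counts toward both degrees in this sketch.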
Physics · mapped topic
Clustering; Classification · CPC title
Creation of semantic tools, e.g. ontology or thesauri · CPC title