Rule-based deconfliction of overlapping data
US-2024185097-A1 · Jun 6, 2024 · US
US9251465B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9251465-B2 |
| Application number | US-201514662443-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 19, 2015 |
| Priority date | Sep 21, 2012 |
| Publication date | Feb 2, 2016 |
| Grant date | Feb 2, 2016 |
Official abstract text for this publication.
This disclosure provides a computer-program product, system, method and apparatus for accessing a representation of a category or item and accessing a set of multiple transactions. The transactions are processed to identify items found amongst the transactions, and the items are ordered based on an information-gain heuristic. A depth-first search for a group of best association rules is then conducted using a best-first heuristic and constraints that make the search efficient. The best rules found during the search can then be displayed to a user, along with accompanying statistics. The user can then select rules that appear to be most relevant, and further analytics can be applied to the selected rules to obtain further information about the information provided by these rules.
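The ordering step mentioned in the abstract can be illustrated with a small sketch. This is not the patent's implementation: it assumes each transaction is a set of items with a binary category label, and that "information gain" means the standard entropy-based measure. The names `entropy`, `information_gain`, and `order_items` are hypothetical.

```python
import math
from typing import List, Sequence, Set


def entropy(pos: int, total: int) -> float:
    """Binary entropy in bits of a group with `pos` positives out of `total`."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(transactions: Sequence[Set[str]],
                     labels: Sequence[bool],
                     item: str) -> float:
    """Reduction in label entropy after splitting on presence of `item`."""
    n = len(transactions)
    with_item = [lab for t, lab in zip(transactions, labels) if item in t]
    without = [lab for t, lab in zip(transactions, labels) if item not in t]
    conditional = (
        (len(with_item) / n) * entropy(sum(with_item), len(with_item))
        + (len(without) / n) * entropy(sum(without), len(without))
    )
    return entropy(sum(labels), n) - conditional


def order_items(transactions: Sequence[Set[str]],
                labels: Sequence[bool]) -> List[str]:
    """Items sorted by descending information gain (ties broken by name)."""
    items = set().union(*transactions)
    return sorted(
        items,
        key=lambda it: (-information_gain(transactions, labels, it), it),
    )
```

Items near the front of the returned list separate the category from the rest most cleanly, so a search that expands them first is more likely to reach high-quality rules early. Whether the patent uses this exact measure or tie-breaking is not stated in the abstract.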
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-readable storage medium having instructions stored thereon, the instructions executable to cause a data processing apparatus to perform operations including: accessing a representation of a document category; accessing a set of multiple documents, each of the documents in the set including a label indicating whether or not the document is included in the category; assembling a list of terms, wherein the terms include terms found in the documents of the set; and evaluating, using a graph search algorithm, association rules in a search space that includes the evaluated association rules and unevaluated association rules, wherein each of the evaluated association rules and each of the unevaluated association rules includes at least one of the terms in the list, wherein evaluating association rules includes performing the following computer operations with respect to each of the evaluated association rules: obtaining categorization results by using the evaluated association rule to individually categorize documents of the set; and estimating a precision of the evaluated association rule based on the categorization results; and selecting some of the evaluated association rules based on the precision estimated with respect to each of the evaluated association rules; displaying a tree graph on a computer display screen such that the tree graph includes a root node and additional nodes, wherein: the root node represents the document category; each of the additional nodes represents one of the selected association rules and the respective estimated precision; and edges of the tree graph connect nodes that represent selected association rules sharing terms in common.
2. The non-transitory computer-readable storage medium of claim 1, wherein the operations further include: providing a node selection feature in conjunction with the tree graph, wherein the node selection feature facilitates use of the computer display screen for selecting one or more of the additional nodes; receiving a selection of a first one of the additional nodes, wherein the selection is made using the node selection feature; and in response to receiving the selection, displaying statistical information regarding the selected association rule that is represented by the first one of the additional nodes.
3. The non-transitory computer-readable storage medium of claim 1, wherein the operations further include: providing a node selection feature in conjunction with the tree graph, wherein the node selection feature facilitates use of the computer display screen for selecting one or more of the additional nodes; receiving a selection of a first one of the additional nodes, wherein the selection is made using the node selection feature, and wherein the first one of the additional nodes represents a first one of the selected association rules; and displaying representations of documents of the set categorized as being included in the document category during evaluation of the first one of the selected association rules.
4. The non-transitory computer-readable storage medium of claim 3, wherein the operations further include: receiving an input that identifies at least one of the selected association rules as being applicable to categorizing documents with regard to the document category; storing the selected association rules identified as being applicable; accessing an additional set of documents; and using the stored association rules to categorize documents of the additional set with regard to the document category.
5. The non-transitory computer-readable storage medium of claim 1, wherein the operations further include: computing multiple measures of information gain, wherein the multiple measures of information gain include information gain of each of the terms in the list, and wherein each of the measures of information gain is computed with regard to the document category.
6. The non-transitory computer-readable storage medium of claim 5, wherein the tree graph: associates each of the edges with one of the measures of information gain; and is configured to: receive an input that represents a selected one of the edges; and display the measure of information gain associated with the selected one of the edges.
7. The non-transitory computer-readable storage medium of claim 1, wherein individually categorizing documents includes making a categorization decision with respect to each of the documents of the set, wherein each of the categorization decisions involves categorizing the respective document as: associated with the document category; or not associated with the document category.
8. The non-transitory computer-readable storage medium of claim 1, wherein each of the additional nodes of the tree graph is displayed in a manner that reflects the precision estimated with respect to the selected association rule that the additional node represents.
9. The non-transitory computer-readable storage medium of claim 1, wherein: the document category is defined with respect to a topic such that documents in which the topic appears are associated with the document category and documents in which the topic does not appear are not associated with the document category.
10. A computer-implemented method, comprising: accessing a representation of a document category; accessing a set of multiple documents, each of the documents in the set including a label indicating whether or not the document is included in the category; assembling a list of terms, wherein the terms include terms found in the documents of the set; and evaluating, using a graph search algorithm, association rules in a search space that includes the evaluated association rules and unevaluated association rules, wherein each of the evaluated association rules and each of the unevaluated association rules includes at least one of the terms in the list, wherein evaluating association rules includes performing the following computer operations with respect to each of the evaluated association rules: obtaining categorization results by using the evaluated association rule to individually categorize documents of the set; and estimating a precision of the evaluated association rule based on the categorization results; and selecting some of the evaluated association rules based on the precision estimated with respect to each of the evaluated association rules; displaying a tree graph on a computer display screen such that the tree graph includes a root node and additional nodes, wherein: the root node represents the document category; each of the additional nodes represents one of the selected association rules and the respective estimated precision; and edges of the tree graph connect nodes that represent selected association rules sharing terms in common.
11. The method of claim 10, further comprising: providing a node selection feature in conjunction with the tree graph, wherein the node selection feature facilitates use of the computer display screen for selecting one or more of the additional nodes; receiving a selection of a first one of the additional nodes, wherein the selection is made using the node selection feature; and in response to receiving the selection, displaying statistical information regarding the selected association rule that is represented by the first one of the additional nodes.
12. The method of claim 10, further comprising: providing a node selection feature in conjunction with the tree graph, wherein the node selection feature facilitates use of the computer display screen for selecting one or more of the addit
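The evaluation and selection operations recited in claims 1 and 10 (categorize each labeled document with a rule, estimate the rule's precision, keep some rules based on that precision) might look roughly like the sketch below. It rests on assumptions the claims do not state: a rule is modeled as a conjunction of terms that fires when a document contains all of them, precision is true positives over all firings, and selection uses a fixed threshold. `Rule`, `categorize`, `estimate_precision`, `select_rules`, and `min_precision` are hypothetical names.

```python
from typing import FrozenSet, Iterable, List, Optional, Sequence, Set, Tuple

# Assumption: a rule is a set of terms that must all appear in a document.
Rule = FrozenSet[str]


def categorize(rule: Rule, doc_terms: Set[str]) -> bool:
    """Return True if the rule places the document in the category."""
    return rule <= doc_terms


def estimate_precision(rule: Rule,
                       docs: Sequence[Set[str]],
                       labels: Sequence[bool]) -> Optional[float]:
    """Fraction of documents matched by the rule that are labeled in-category.

    Returns None when the rule matches no document, since precision is
    undefined in that case.
    """
    predicted = [categorize(rule, d) for d in docs]
    true_pos = sum(1 for p, y in zip(predicted, labels) if p and y)
    false_pos = sum(1 for p, y in zip(predicted, labels) if p and not y)
    if true_pos + false_pos == 0:
        return None
    return true_pos / (true_pos + false_pos)


def select_rules(rules: Iterable[Rule],
                 docs: Sequence[Set[str]],
                 labels: Sequence[bool],
                 min_precision: float = 0.8) -> List[Tuple[Rule, float]]:
    """Keep rules whose estimated precision meets the (assumed) threshold."""
    kept = []
    for rule in rules:
        precision = estimate_precision(rule, docs, labels)
        if precision is not None and precision >= min_precision:
            kept.append((rule, precision))
    # Highest precision first; ties broken by the rule's sorted terms.
    return sorted(kept, key=lambda rp: (-rp[1], sorted(rp[0])))
```

Each `(rule, precision)` pair returned by `select_rules` corresponds to the information the claims attach to an additional node of the tree graph. The graph search over candidate rules and the display itself are outside this sketch.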
Inference or reasoning models · CPC title
Information retrieval; Database structures therefor; File system structures therefor · CPC title
Frames · CPC title
Trees · CPC title
Clustering or classification · CPC title