Rule discovery system, method, apparatus, and program

US9767411B2 · US · B2

Patent metadata
Publication number: US-9767411-B2
Application number: US-201314401019-A
Country: US
Kind code: B2
Filing date: May 13, 2013
Priority date: May 14, 2012
Publication date: Sep 19, 2017
Grant date: Sep 19, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system includes a free itemset generation unit to generate a set of free itemsets, each having a frequency in the database greater than or equal to a predetermined threshold value set in advance, a valid rule candidate generation unit to generate rule candidates and store the generated rule candidates, and a rule minimality decision unit to check minimality of each of the generated rule candidates and output the generated rule candidate to an output apparatus when the generated rule candidate is determined to be minimal.
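The free itemset generation step can be illustrated with a small brute-force sketch. It assumes the usual data-mining definition of a free itemset (no proper subset has the same frequency), which this page does not spell out. The names `frequency`, `free_itemsets`, and the toy `table` are hypothetical and are not taken from the patent.

```python
from itertools import combinations

def frequency(itemset, rows):
    """Number of rows that contain every (attribute, value) item of itemset."""
    return sum(1 for row in rows if itemset <= row)

def free_itemsets(table, min_freq):
    """Brute-force enumeration of frequent free itemsets (illustrative only).

    Assumption: an itemset is 'free' when removing any single item strictly
    increases its frequency, i.e. no proper subset has the same frequency.
    """
    rows = [frozenset(r.items()) for r in table]
    items = sorted(set().union(*rows))
    max_size = max(len(r) for r in rows)
    result = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # An itemset may hold at most one value per attribute.
            if len({attr for attr, _ in candidate}) < size:
                continue
            freq = frequency(candidate, rows)
            if freq < min_freq:
                continue
            if all(frequency(candidate - {item}, rows) > freq for item in candidate):
                result.append(candidate)
    return result

# Hypothetical toy database: each row maps attribute -> value.
table = [
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "b": 1, "c": 2},
]
free = free_itemsets(table, min_freq=2)
for itemset in free:
    print(sorted(itemset))
```

A practical implementation would grow itemsets level by level and prune early; the exhaustive loop here is chosen only to keep the definition visible.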

First claim

Opening claim text (preview).

The invention claimed is:

1. A rule discovery system comprising: a storage apparatus configured to store a database; a data processing apparatus; and an output apparatus, wherein the data processing apparatus includes; a free itemset generation unit to generate a free itemset including a set of items of attribute-value pairs in the database, the free itemset having a frequency in the database greater than or equal to a predetermined threshold value set in advance; a valid rule candidate generation unit to generate a rule candidate including a condition part having the generated free itemset set thereto and a consequent part having an item that does not share an attribute with the free itemset set thereto, the rule candidate having a confidence greater than or equal to a confidence threshold value set in advance, and to store the generated rule candidate in a storage unit, when a ratio of a frequency of the itemset obtained by adding one item x to a free itemset α to a frequency of the free itemset α is greater than or equal to the confidence threshold value, the valid rule candidate generation unit adding to a list of a rule candidate group including a plurality of the rule candidates a rule including a condition part having a free itemset α set thereto and a consequent part having the one item x set thereto; and a rule minimality decision unit to check minimality of the rule candidate generated by the valid rule candidate generation unit and to output the rule candidate to the output apparatus when the rule candidate is determined to be minimal.

2. The rule discovery system according to claim 1, wherein the rule minimality decision unit sorts the list of the rule candidate group generated by the valid rule candidate generation unit in an ascending order of sizes of the rule candidates, the rule minimality decision unit sequentially extracts, from a head of the list, a minimal one or more of the rule candidates and outputs the extracted minimal one or more of the rule candidates to the output apparatus, and the rule minimality decision unit removes, from the list, one or more of the remaining rule candidates in the list that are redundant with respect to the output minimal one or more of the rule candidates.

3. The rule discovery system according to claim 1, comprising an input apparatus configured to receive a threshold value of the frequency and the threshold value of the confidence as setting parameters.

4. The rule discovery system according to claim 1, wherein the rule is a rule expressed as a CFD (Conditional Functional Dependency).

5. A rule discovery system comprising: a storage apparatus configured to store a database; a data processing apparatus; and an output apparatus, wherein the data processing apparatus includes; a free itemset generation unit to generate a free itemset including a set of items of attribute-value pairs in the database, the free itemset having a frequency in the database greater than or equal to a predetermined threshold value set in advance; a valid rule candidate generation unit to generate a rule candidate including a condition part having the generated free itemset set thereto and a consequent part having an item that does not share an attribute with the free itemset set thereto, the rule candidate having a confidence greater than or equal to a confidence threshold value set in advance, and to store the generated rule candidate in a storage unit; and a rule minimality decision unit to check minimality of the rule candidate generated by the valid rule candidate generation unit and to output the rule candidate to the output apparatus when the rule candidate is determined to be minimal, wherein the valid rule candidate generation unit calculates a frequency of an itemset obtained by adding one item x to each free itemset α, and when a ratio of a frequency of the itemset obtained by adding the one item x to the free itemset α to a frequency of the free itemset α is greater than or equal to the confidence threshold value, the valid rule candidate generation unit repeatedly performs a process of adding to a list of a rule candidate group including a plurality of the rule candidates a rule including a condition part having the free itemset α set thereto and a consequent part having the one item x set thereto, as a rule candidate, until the checking for all combinations of the free itemsets α generated by the free itemset generation unit and the item x is finished.

6. A method of discovering a rule from a database by a data processing apparatus, the method comprising: (a) reading the database, generating a free itemset that includes a set of items of attribute-value pairs in the database and that has a frequency in the database greater than or equal to a predetermined threshold value set in advance, and storing the generated free itemset in a storage unit; (b) generating a rule candidate that includes a condition part having the generated free itemset set thereto and a consequent part having an item that does not share an attribute with the free itemset set thereto, and that has a confidence greater than or equal to a confidence threshold value set in advance, and storing the generated rule candidate in a storage unit, when a ratio of a frequency of the itemset obtained by adding one item x to a free itemset α to a frequency of the free itemset α is greater than or equal to the confidence threshold value, adding to a list of a rule candidate group including a plurality of the rule candidates a rule including a condition part having a free itemset α set thereto and a consequent part having the one item x set thereto; and (c) checking minimality of the rule candidate generated and outputting the rule candidate to an output apparatus when the rule candidate is determined to be minimal.

7. The rule discovery method according to claim 6, comprising: sorting the list of the rule candidate group including the generated rule candidates in an ascending order of sizes of the rule candidates; sequentially extracting a minimal one or more of the rule candidates from a head of the list and outputting the extracted minimal one or more of the rule candidates to the output apparatus; and removing, from the list, one or more of the remaining rule candidates in the list that are redundant with respect to the output minimal one or more of the rule candidates.

8. A non-transitory computer readable recording medium storing a program to cause a computer to execute the processing including: (a) reading a database to generate a free itemset that includes a set of items of attribute-value pairs in the database, and that has a frequency in the database greater than or equal to a predetermined threshold value set in advance, and storing the generated free itemset in a storage unit; (b) generating a rule candidate that includes a condition part having the generated free itemset set thereto and a consequent part having an item that does not share an attribute with the free itemset set thereto and that has a confidence greater than or equal to a confidence threshold value set in advance, and storing the generated rule candidate in a storage unit, when a ratio of a frequency of the itemset obtained by adding one item x to a free itemset α to a frequency of the free itemset α is greater than or equal to the confidence threshold value, adding to a list of a rule candidate group including a plurality of the rule candidates a rule including a condition part having a free itemset α set thereto and a consequent part having the one item x set thereto; and (c) checking minimality of each of the generated rule candidates and outputting the rule candidate to an output apparatus when the rule candidate is determined to be minimal.
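The candidate generation and minimality steps recited in the claims can be sketched roughly as follows. The confidence test is the ratio described in the claims: the frequency of α plus one item x, divided by the frequency of α. The redundancy test is an assumption made for illustration: a remaining candidate is treated as redundant when it has the same consequent as an already output rule and its condition part is a proper superset of that rule's condition part. The claim text shown here does not define redundancy in detail. All function names and the toy data are hypothetical, and the free itemsets are taken as given input.

```python
def frequency(itemset, rows):
    """Number of rows that contain every (attribute, value) item of itemset."""
    return sum(1 for row in rows if itemset <= row)

def rule_candidates(rows, free_sets, min_conf):
    """Pair each free itemset alpha with each item x on a different attribute
    and keep (alpha, x) when freq(alpha + x) / freq(alpha) >= min_conf."""
    items = sorted(set().union(*rows))
    candidates = []
    for alpha in free_sets:
        alpha_freq = frequency(alpha, rows)
        if alpha_freq == 0:
            continue
        alpha_attrs = {attr for attr, _ in alpha}
        for x in items:
            if x[0] in alpha_attrs:  # consequent must not share an attribute
                continue
            if frequency(alpha | {x}, rows) / alpha_freq >= min_conf:
                candidates.append((alpha, x))
    return candidates

def minimal_rules(candidates):
    """Sort by condition size, output from the head of the list, and drop
    remaining candidates made redundant by each output rule (assumed test:
    same consequent, condition part a proper superset)."""
    pending = sorted(candidates, key=lambda rule: len(rule[0]))
    output = []
    while pending:
        cond, cons = pending.pop(0)
        output.append((cond, cons))
        pending = [(c, x) for c, x in pending if not (x == cons and cond < c)]
    return output

# Hypothetical toy database and free itemsets (given, not computed here).
rows = [frozenset(r.items()) for r in [
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "b": 1, "c": 2},
]]
free_sets = [
    frozenset({("a", 1)}),
    frozenset({("b", 1)}),
    frozenset({("c", 1)}),
    frozenset({("a", 1), ("b", 1)}),
    frozenset({("b", 1), ("c", 1)}),
]
candidates = rule_candidates(rows, free_sets, min_conf=1.0)
minimal = minimal_rules(candidates)
for cond, cons in minimal:
    print(sorted(cond), "->", cons)
```

On this toy data four candidates pass the confidence threshold, and the two with larger condition parts are removed because a rule with a smaller condition part and the same consequent has already been output.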

Assignees

Inventors

Classifications

  • based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS] · CPC title

  • G06N5/025 · Primary

    Extracting rules from data · CPC title

  • Physics · mapped topic

  • Artificial life, i.e. computing arrangements simulating life · CPC title

  • Marketing; Price estimation or determination; Fundraising · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9767411B2 cover?
A system includes a free itemset generation unit to generate a set of free itemsets, each having a frequency in the database greater than or equal to a predetermined threshold value set in advance, a valid rule candidate generation unit to generate rule candidates and store the generated rule candidates, and a rule minimality decision unit to check minimality of each of the generated rule candi…
Who is the assignee on this patent?
NEC Corp
What technology area does this patent fall under?
Primary CPC classification G06N5/025. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 19, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).