Selective association rule generation

Mining association rules is a popular and well researched method for discovering interesting relations between variables in large databases. A practical problem is that at medium to low support values often a large number of frequent itemsets and an even larger number of association rules are found in a database. A widely used approach is to gradually increase minimum support and minimum confidence or to filter the found rules using increasingly strict constraints on additional measures of interestingness until the set of rules found is reduced to a manageable size. In this paper we describe a different approach which is based on the idea to first define a set of “interesting” itemsets (e.g., by a mixture of mining and expert knowledge) and then, in a second step to selectively generate rules for only these itemsets. The main advantage of this approach over increasing thresholds or filtering rules is that the number of rules found is significantly reduced while at the same time it is not necessary to increase the support and confidence thresholds which might lead to missing important information in the database.

[1]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[2]  Nicolas Pasquier,et al.  Discovering Frequent Closed Itemsets for Association Rules , 1999, ICDT.

[3]  Kurt Hornik,et al.  Introduction to arules – A computational environment for mining association rules and frequent item sets , 2009 .

[4]  Tao Zhang,et al.  Association Rules , 2000, PAKDD.

[5]  Mohammed J. Zaki Mining Non-Redundant Association Rules , 2004, Data Min. Knowl. Discov..

[6]  Jaideep Srivastava,et al.  Selecting the right objective measure for association analysis , 2004, Inf. Syst..

[7]  M. Schwarz,et al.  Otto-von-Guericke-University of Magdeburg , 2007 .

[8]  Gregory Piatetsky-Shapiro,et al.  Discovery, Analysis, and Presentation of Strong Rules , 1991, Knowledge Discovery in Databases.

[9]  Jörg Rech,et al.  Knowledge Discovery in Databases , 2001, Künstliche Intell..

[10]  William Frawley,et al.  Knowledge Discovery in Databases , 1991 .

[11]  Ulrich Güntzer,et al.  Algorithms for association rule mining — a general survey and comparison , 2000, SKDD.

[12]  Chad Creighton,et al.  Mining gene expression databases for association rules , 2003, Bioinform..

[13]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[14]  Heikki Mannila,et al.  Finding interesting rules from large sets of discovered association rules , 1994, CIKM '94.

[15]  Marti A. Hearst Trends & Controversies: Support Vector Machines , 1998, IEEE Intell. Syst..

[16]  ZhengZijian,et al.  KDD-Cup 2000 organizers' report , 2000 .

[17]  Donald E. Knuth,et al.  The art of computer programming: sorting and searching (volume 3) , 1973 .

[18]  Bart Goethals,et al.  Advances in frequent itemset mining implementations: report on FIMI'03 , 2004, SKDD.

[19]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[20]  Ramakrishnan Srikant,et al.  Mining Association Rules with Item Constraints , 1997, KDD.

[21]  Christian Borgelt,et al.  Induction of Association Rules: Apriori Implementation , 2002, COMPSTAT.

[22]  Carla E. Brodley,et al.  KDD-Cup 2000 organizers' report: peeling the onion , 2000, SKDD.

[23]  Ron Kohavi,et al.  Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid , 1996, KDD.

[24]  Chun Zhang,et al.  Storing and querying ordered XML using a relational database system , 2002, SIGMOD '02.

[25]  Susan M. Bridges,et al.  Mining fuzzy association rules and fuzzy frequency episodes for intrusion detection , 2000, Int. J. Intell. Syst..

[26]  Jaideep Srivastava,et al.  Web usage mining: discovery and applications of usage patterns from Web data , 2000, SKDD.

[27]  Tomasz Imielinski,et al.  Association Rules... and What's Next? Towards Second Generation Data Mining Systems , 1998, ADBIS.

[28]  Dimitrios Gunopulos,et al.  Constraint-Based Rule Mining in Large, Dense Databases , 2004, Data Mining and Knowledge Discovery.

[29]  Donald E. Knuth,et al.  The Art of Computer Programming: Volume 3: Sorting and Searching , 1998 .

[30]  Christian Borgelt,et al.  EFFICIENT IMPLEMENTATIONS OF APRIORI AND ECLAT , 2003 .

[31]  Donald Ervin Knuth,et al.  The Art of Computer Programming , 1968 .

[32]  Kristin P. Bennett,et al.  Support vector machines: hype or hallelujah? , 2000, SKDD.

[33]  J ZakiMohammed,et al.  Advances in frequent itemset mining implementations , 2004 .

[34]  Bart Goethals,et al.  Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations (FIMI'03) , 2003 .