Advances in Knowledge Discovery and Data Mining

An automated data mining service offers an out-sourced, cost-effective analysis option for clients desiring to leverage their data resources for decision support and operational improvement. In the context of the service model, typically the client provides the service with data and other information likely to aid in the analysis process (e.g. domain knowledge, etc.). In return, the service provides analysis results to the client. We describe the required processes, issues, and challenges in automating the data mining and analysis process when the high-level goals are: (1) to provide the client with a high quality, pertinent analysis result; and (2) to automate the data mining service, minimizing the amount of human analyst effort required and the cost of delivering the service. We argue that by focusing on client problems within market sectors, both of these goals may be realized.

[1]  Jiong Yang,et al.  STING: A Statistical Information Grid Approach to Spatial Data Mining , 1997, VLDB.

[2]  Weiru Liu,et al.  Learning belief networks from data: an information theory based approach , 1997, CIKM '97.

[3]  I. Mcharg Design With Nature , 1969 .

[4]  Jian Pei,et al.  Can we push more constraints into frequent pattern mining? , 2000, KDD '00.

[5]  Ramakrishnan Srikant,et al.  Mining generalized association rules , 1995, Future Gener. Comput. Syst..

[6]  Geoffrey I. Webb,et al.  Lazy Learning of Bayesian Rules , 2000, Machine Learning.

[7]  Laks V. S. Lakshmanan,et al.  Mining frequent itemsets with convertible constraints , 2001, Proceedings 17th International Conference on Data Engineering.

[8]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.

[9]  Raymond T. Ng,et al.  Finding Boundary Shape Matching Relationships in Spatial Data , 1997, SSD.

[10]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[11]  V. Estivill-Castro,et al.  Polygonization of Point Clusters through Cluster Boundary Extraction for Geographical Data Mining , 2002 .

[12]  Eamonn J. Keogh,et al.  Learning augmented Bayesian classifiers: A comparison of distribution-based and classification-based approaches , 1999, AISTATS.

[13]  Jiawei Han,et al.  A progressive refinement approach to spatial data mining , 1999 .

[14]  Ke Wang,et al.  Mining Frequent Itemsets Using Support Constraints , 2000, VLDB.

[15]  Nir Friedman,et al.  Bayesian Network Classifiers , 1997, Machine Learning.

[16]  Sudipto Guha,et al.  CURE: an efficient clustering algorithm for large databases , 1998, SIGMOD '98.

[17]  Fionn Murtagh,et al.  Comments on 'Parallel Algorithms for Hierarchical Clustering and Cluster Validity' , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[18]  Igor Kononenko,et al.  Semi-Naive Bayesian Classifier , 1991, EWSL.

[19]  Philip S. Yu,et al.  Efficient mining of weighted association rules (WAR) , 2000, KDD '00.