Mining multi-dimensional concept-drifting data streams using Bayesian network classifiers

In recent years, a plethora of approaches have been proposed to deal with the increasingly challenging task of mining concept-drifting data streams. However, most of these approaches can only be applied to uni-dimensional classification problems where each input instance has to be assigned to a single output class variable. The problem of mining multi-dimensional data streams, which includes multiple output class variables, is largely unexplored and only few streaming multi-dimensional approaches have been recently introduced. In this paper, we propose a novel adaptive method, named Locally Adaptive-MB-MBC (LA-MB-MBC), for mining streaming multi-dimensional data. To this end, we make use of multi-dimensional Bayesian network classifiers (MBCs) as models. Basically, LA-MB-MBC monitors the concept drift over time using the average log-likelihood score and the Page-Hinkley test. Then, if a concept drift is detected, LA-MB-MBC adapts the current MBC network locally around each changed node. An experimental study carried out using synthetic multi-dimensional data streams shows the merits of the proposed method in terms of concept drift detection as well as classification performance.

[1]  Ron Kohavi,et al.  Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[2]  João Gama,et al.  A survey on concept drift adaptation , 2014, ACM Comput. Surv..

[3]  Geoff Hulten,et al.  Mining high-speed data streams , 2000, KDD '00.

[4]  Judea Pearl,et al.  Equivalence and Synthesis of Causal Models , 1990, UAI.

[5]  Max Henrion,et al.  Propagating uncertainty in bayesian networks by probabilistic logic sampling , 1986, UAI.

[6]  Saso Dzeroski,et al.  An extensive experimental comparison of methods for multi-label learning , 2012, Pattern Recognit..

[7]  João Gama,et al.  Learning with Local Drift Detection , 2006, ADMA.

[8]  Gerhard Widmer,et al.  Learning in the Presence of Concept Drift and Hidden Contexts , 1996, Machine Learning.

[9]  David Maxwell Chickering,et al.  Learning Equivalence Classes of Bayesian Network Structures , 1996, UAI.

[10]  Jesús S. Aguilar-Ruiz,et al.  Knowledge discovery from data streams , 2009, Intell. Data Anal..

[11]  Charu C. Aggarwal,et al.  Data Streams - Models and Algorithms , 2014, Advances in Database Systems.

[12]  Constantin F. Aliferis,et al.  Local Causal and Markov Blanket Induction for Causal Discovery and Feature Selection for Classification Part I: Algorithms and Empirical Evaluation , 2010, J. Mach. Learn. Res..

[13]  Sunita Sarawagi,et al.  Discriminative Methods for Multi-labeled Classification , 2004, PAKDD.

[14]  Linda C. van der Gaag,et al.  Multi-dimensional Bayesian Network Classifiers , 2006, Probabilistic Graphical Models.

[15]  Xin Yao,et al.  The Impact of Diversity on Online Ensemble Learning in the Presence of Concept Drift , 2010, IEEE Transactions on Knowledge and Data Engineering.

[16]  Philip S. Yu,et al.  An ensemble-based approach to fast classification of multi-label data streams , 2011, 7th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom).

[17]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[18]  John B. Shoven,et al.  I , Edinburgh Medical and Surgical Journal.

[19]  Zhong Wang,et al.  Multi-label Classification without the Multi-label Cost , 2010, SDM.

[20]  E. S. Page CONTINUOUS INSPECTION SCHEMES , 1954 .

[21]  Grigorios Tsoumakas,et al.  Multi-Label Classification: An Overview , 2007, Int. J. Data Warehous. Min..

[22]  Concha Bielza,et al.  Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: An application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39) , 2012, J. Biomed. Informatics.

[23]  Geoff Holmes,et al.  New ensemble methods for evolving data streams , 2009, KDD.

[24]  Vipin Kumar,et al.  Chapman & Hall/CRC Data Mining and Knowledge Discovery Series , 2008 .

[25]  Grigorios Tsoumakas,et al.  Dealing with Concept Drift and Class Imbalance in Multi-Label Stream Classification , 2011, IJCAI.

[26]  Geoff Holmes,et al.  Scalable and efficient multi-label classification for evolving data streams , 2012, Machine Learning.

[27]  Mohamed Medhat Gaber,et al.  Advances in data stream mining , 2012, WIREs Data Mining Knowl. Discov..

[28]  Min-Ling Zhang,et al.  A Review on Multi-Label Learning Algorithms , 2014, IEEE Transactions on Knowledge and Data Engineering.

[29]  João Gama,et al.  Issues in evaluation of stream learning algorithms , 2009, KDD.

[30]  Yang Zhang,et al.  Mining Multi-label Concept-Drifting Data Streams Using Dynamic Classifier Ensemble , 2009, ACML.

[31]  Geoff Holmes,et al.  Multi-label Classification Using Ensembles of Pruned Sets , 2008, 2008 Eighth IEEE International Conference on Data Mining.

[32]  Concha Bielza,et al.  Multi-dimensional classification with Bayesian networks , 2011, Int. J. Approx. Reason..

[33]  A. P. Dawid,et al.  Present position and potential developments: some personal views , 1984 .

[34]  Amanda Clare,et al.  Knowledge Discovery in Multi-label Phenotype Data , 2001, PKDD.

[35]  João Gama,et al.  On evaluating stream learning algorithms , 2013, Machine Learning.

[36]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[37]  Constantin F. Aliferis,et al.  Local Causal and Markov Blanket Induction for Causal Discovery and Feature Selection for Classification Part II: Analysis and Extensions , 2010, J. Mach. Learn. Res..

[38]  Geoff Holmes,et al.  Classifier chains for multi-label classification , 2009, Machine Learning.

[39]  D. Hinkley Inference about the change-point from cumulative sum tests , 1971 .

[40]  Philip S. Yu,et al.  Classifying Data Streams with Skewed Class Distributions and Concept Drifts , 2008, IEEE Internet Computing.

[41]  Constantin F. Aliferis,et al.  The max-min hill-climbing Bayesian network structure learning algorithm , 2006, Machine Learning.

[42]  João Gama,et al.  A Study on Change Detection Methods , 2009 .

[43]  Alexey Tsymbal,et al.  The problem of concept drift: definitions and related work , 2004 .

[44]  Linda C. van der Gaag,et al.  Probabilistic Graphical Models , 2014, Lecture Notes in Computer Science.

[45]  Huaiyu Zhu On Information and Sufficiency , 1997 .