Concept Drift Adaptation by Exploiting Historical Knowledge

Incremental learning under concept drift is often tackled with ensemble methods, in which models built in the past are retrained to obtain new models for the current data. Developing such a method raises two design questions: which historical (i.e., previously trained) models should be preserved, and how should they be used? This paper proposes a novel ensemble learning method, Diversity and Transfer-based Ensemble Learning (DTEL). Given newly arrived data, DTEL uses each preserved historical model as an initial model and further trains it on the new data via transfer learning. In addition, DTEL preserves a diverse set of historical models rather than a set selected on classification accuracy alone. Empirical studies on 15 synthetic and 5 real-world data streams, all with concept drift, show that DTEL handles concept drift more effectively than 4 other state-of-the-art methods.
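The two design choices described above (warm-starting each preserved model on the new data, and preserving models for diversity rather than accuracy alone) can be illustrated with a minimal sketch. It is not the DTEL implementation: the perceptron base learner, the use of continued training as a stand-in for transfer learning, Yule's Q statistic as the diversity measure, the majority vote, and every name here (Perceptron, fit_more, q_statistic, update_pool, vote, capacity) are assumptions made for illustration only.

```python
import copy
from itertools import combinations


class Perceptron:
    """Tiny linear classifier; a stand-in for any incrementally trainable model."""

    def __init__(self, n_features):
        self.w = [0.0] * (n_features + 1)  # last entry is the bias

    def predict(self, x):
        s = self.w[-1] + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if s >= 0 else 0

    def fit_more(self, X, y, epochs=5, lr=0.1):
        """Continue training from the current weights (warm start)."""
        for _ in range(epochs):
            for x, t in zip(X, y):
                err = t - self.predict(x)
                for i, xi in enumerate(x):
                    self.w[i] += lr * err * xi
                self.w[-1] += lr * err
        return self


def q_statistic(a, b, X, y):
    """Yule's Q for two classifiers: values near 1 mean they err on the same items."""
    n11 = n00 = n10 = n01 = 0
    for x, t in zip(X, y):
        ca, cb = a.predict(x) == t, b.predict(x) == t
        if ca and cb:
            n11 += 1
        elif not ca and not cb:
            n00 += 1
        elif ca:
            n10 += 1
        else:
            n01 += 1
    denom = n11 * n00 + n01 * n10
    return 0.0 if denom == 0 else (n11 * n00 - n01 * n10) / denom


def mean_q(models, X, y):
    """Average pairwise Q; lower means a more diverse set (0.0 if fewer than 2 models)."""
    pairs = list(combinations(models, 2))
    if not pairs:
        return 0.0
    return sum(q_statistic(a, b, X, y) for a, b in pairs) / len(pairs)


def update_pool(pool, X, y, capacity=3):
    """Process one data chunk; return (ensemble for this chunk, preserved pool)."""
    # Assumption: each preserved model is copied, then warm-started on the new chunk.
    adapted = [copy.deepcopy(m).fit_more(X, y) for m in pool]
    fresh = Perceptron(len(X[0])).fit_more(X, y)
    ensemble = adapted + [fresh]
    pool = pool + [fresh]
    # Assumption: when over capacity, drop the model whose removal
    # leaves the remaining pool with the lowest mean pairwise Q.
    while len(pool) > capacity:
        drop = min(
            range(len(pool)),
            key=lambda i: mean_q(pool[:i] + pool[i + 1:], X, y),
        )
        pool = pool[:drop] + pool[drop + 1:]
    return ensemble, pool


def vote(ensemble, x):
    """Unweighted majority vote (ties go to class 1)."""
    votes = sum(m.predict(x) for m in ensemble)
    return 1 if 2 * votes >= len(ensemble) else 0
```

Calling update_pool once per incoming chunk yields an ensemble adapted to that chunk, while the preserved pool stays within capacity. The sketch keeps the unadapted models in the pool; whether the preserved models should be the originals or their adapted copies is a design choice the abstract does not settle.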
