Layered Ensemble Architecture for Time Series Forecasting

Time series forecasting (TSF) is widely used in many application areas, such as science, engineering, and finance. The phenomena that generate time series are usually unknown, and the information available for forecasting is limited to the past values of the series. It is therefore necessary to use an appropriate number of past values, termed the lag, for forecasting. This paper proposes a layered ensemble architecture (LEA) for TSF problems. Our LEA consists of two layers, each of which uses an ensemble of multilayer perceptron (MLP) networks. The first ensemble layer finds an appropriate lag, and the second ensemble layer employs the obtained lag for forecasting. Unlike most previous work on TSF, the proposed architecture considers both the accuracy and the diversity of the individual networks in constructing an ensemble. LEA trains the different networks of an ensemble on different training sets in order to maintain diversity among them. At the same time, it uses the selected lag and combines only the best-trained networks to construct the ensemble, reflecting LEA's emphasis on accuracy. The proposed architecture has been tested extensively on time series data from the NN3 and NN5 neural network (NN) forecasting competitions, as well as on several standard benchmark time series. In terms of forecasting accuracy, our experimental results clearly show that LEA outperforms other ensemble and nonensemble methods.
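
The two-layer idea can be summarized in a short sketch. Below is a minimal Python illustration assuming scikit-learn's MLPRegressor; the function names (make_lagged, select_lag, forecast_ensemble), the candidate-lag grid, the bootstrap resampling used to diversify the networks, and the best-network selection rule are illustrative assumptions, not the authors' exact design. The first layer trains a small ensemble per candidate lag and keeps the lag with the lowest validation error; the second layer trains a larger ensemble at that lag and averages the forecasts of its best members.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def make_lagged(series, lag):
    """Build a supervised dataset: each row holds `lag` past values, target is the next value."""
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y

def select_lag(series, candidate_lags, n_nets=5, val_frac=0.2):
    """First ensemble layer (sketch): for each candidate lag, train a small MLP
    ensemble and keep the lag whose averaged forecast has the lowest validation error."""
    best_lag, best_err = None, np.inf
    for lag in candidate_lags:
        X, y = make_lagged(series, lag)
        split = int(len(X) * (1 - val_frac))
        X_tr, y_tr, X_val, y_val = X[:split], y[:split], X[split:], y[split:]
        preds = []
        for seed in range(n_nets):
            # Bootstrap resampling stands in for training each network on a different set.
            idx = np.random.RandomState(seed).choice(len(X_tr), len(X_tr), replace=True)
            net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=1000, random_state=seed)
            net.fit(X_tr[idx], y_tr[idx])
            preds.append(net.predict(X_val))
        err = mean_squared_error(y_val, np.mean(preds, axis=0))
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag

def forecast_ensemble(series, lag, n_nets=10, n_best=5):
    """Second ensemble layer (sketch): train n_nets MLPs on different bootstrap samples
    at the chosen lag, keep the n_best by training error, and average their forecasts."""
    X, y = make_lagged(series, lag)
    nets = []
    for seed in range(n_nets):
        idx = np.random.RandomState(seed).choice(len(X), len(X), replace=True)
        net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=1000, random_state=seed)
        net.fit(X[idx], y[idx])
        nets.append((mean_squared_error(y, net.predict(X)), net))
    best = [net for _, net in sorted(nets, key=lambda t: t[0])[:n_best]]
    last_window = series[-lag:].reshape(1, -1)
    return np.mean([net.predict(last_window) for net in best])

# Usage on a noisy synthetic series: pick the lag, then make a one-step-ahead forecast.
series = np.sin(np.linspace(0, 20, 300)) + 0.1 * np.random.default_rng(0).normal(size=300)
lag = select_lag(series, candidate_lags=[2, 4, 8, 12])
print(f"selected lag: {lag}, next-step forecast: {forecast_ensemble(series, lag):.3f}")
```

Bootstrap resampling is used here as one simple way to realize LEA's "different training sets," and training error as one simple criterion for picking the best-trained networks; the paper's actual diversity and selection mechanisms may differ.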
