Early Classification of Time Series by Simultaneously Optimizing the Accuracy and Earliness

The problem of early classification of time series arises naturally when temporal data are collected over time and early class predictions are desirable or even required. The objective is to classify an incoming sequence as early as possible while maintaining a suitable level of accuracy. Early classification therefore amounts to optimizing two objectives simultaneously: accuracy and earliness. We present an early classification method that combines a set of probabilistic classifiers with a stopping rule (SR). The SR acts as a trigger that decides whether to output a prediction or wait for more data; its main novelty is that it is built by explicitly optimizing a cost function based on accuracy and earliness. We evaluate our framework on a large set of benchmark data sets against four other state-of-the-art early classification methods, obtaining superior results in terms of both earliness and accuracy.
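To make the idea of a stopping rule tuned on an accuracy/earliness cost concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method of the paper: the abstract does not give the form of the SR, the cost function, or the optimizer, so the single-threshold rule, the weighted cost with a hypothetical weight `alpha`, and the grid search over candidate thresholds are all placeholders. Each series is assumed to be represented by the class probabilities a classifier outputs after each observed prefix.

```python
from typing import List, Sequence, Tuple

ProbStream = List[Sequence[float]]  # class probabilities after each prefix


def stopping_rule(probs: Sequence[float], threshold: float) -> bool:
    """Hypothetical SR: stop once the top class probability reaches the threshold."""
    return max(probs) >= threshold


def early_predict(stream: ProbStream, threshold: float) -> Tuple[int, float]:
    """Return (predicted class, fraction of the series observed before stopping)."""
    n = len(stream)
    for t, probs in enumerate(stream, start=1):
        # Force a prediction at the last time step if the rule never fires.
        if stopping_rule(probs, threshold) or t == n:
            label = max(range(len(probs)), key=probs.__getitem__)
            return label, t / n
    raise ValueError("empty probability stream")


def cost(streams: List[ProbStream], labels: List[int],
         threshold: float, alpha: float = 0.8) -> float:
    """Assumed cost: alpha * error rate + (1 - alpha) * mean fraction observed."""
    errors = 0
    observed = 0.0
    for stream, y in zip(streams, labels):
        pred, frac = early_predict(stream, threshold)
        errors += int(pred != y)
        observed += frac
    m = len(labels)
    return alpha * errors / m + (1 - alpha) * observed / m


def fit_threshold(streams: List[ProbStream], labels: List[int],
                  candidates: Sequence[float], alpha: float = 0.8) -> float:
    """Pick the candidate threshold with the lowest training cost (grid search)."""
    return min(candidates, key=lambda th: cost(streams, labels, th, alpha))
```

A low threshold makes the rule fire early at the risk of more errors, while a high threshold waits for more data; the weight `alpha` sets how the two objectives are traded off when the threshold is chosen.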
