Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study

Abstract: Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator. METHODOLOGY: Three new entropy estimators are introduced; two new LZ-based estimators, and the “renewal entropy estimator,” which is tailored to data generated by a binary renewal process. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters in practice. THEORY: We prove that, unlike their earlier versions, the two new LZ-based estimators are universally consistent, that is, they converge to the entropy rate for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state hidden Markov model (HMM) with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator. S

