Paper Information - On Ensemble Techniques for AIXI Approximation

On Ensemble Techniques for AIXI Approximation

One of the key challenges in AIXI approximation is model class approximation: how can Solomonoff induction be meaningfully approximated without requiring an infeasible amount of computation? This paper advocates a bottom-up approach to this problem by describing a number of principled ensemble techniques for approximate AIXI agents. Each technique works by efficiently combining a set of existing environment models into a single, more powerful model. These techniques have the potential to play an important role in future AIXI approximations.
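To make the idea of combining existing models into a single model concrete, here is a minimal sketch of one standard ensemble method, a Bayesian mixture over binary sequence predictors. It is an illustration only and is not claimed to be one of the techniques the paper describes. The class names (`BernoulliModel`, `KTModel`, `BayesMixture`) and the `predict`/`update` interface are assumptions introduced for this example, and the sketch ignores actions and rewards, which a full environment model for an AIXI-style agent would also need to handle.

```python
import math


class BernoulliModel:
    """Hypothetical fixed model: predicts bit 1 with constant probability p."""

    def __init__(self, p):
        # Keep probabilities strictly inside (0, 1) so that log() is always defined.
        assert 0.0 < p < 1.0
        self.p = p

    def predict(self, bit):
        return self.p if bit == 1 else 1.0 - self.p

    def update(self, bit):
        pass  # a fixed model has nothing to learn


class KTModel:
    """Krichevsky-Trofimov estimator: P(1) = (ones + 1/2) / (n + 1)."""

    def __init__(self):
        self.counts = [0, 0]  # counts[b] = number of times bit b was seen

    def predict(self, bit):
        return (self.counts[bit] + 0.5) / (sum(self.counts) + 1.0)

    def update(self, bit):
        self.counts[bit] += 1


class BayesMixture:
    """Combines several predictors into one by posterior-weighted averaging."""

    def __init__(self, models):
        self.models = list(models)
        # Uniform prior over the models, stored in log space for stability.
        self.log_w = [-math.log(len(self.models))] * len(self.models)

    def weights(self):
        return [math.exp(lw) for lw in self.log_w]

    def predict(self, bit):
        # Mixture probability: weighted average of the component predictions.
        return sum(w * m.predict(bit) for w, m in zip(self.weights(), self.models))

    def update(self, bit):
        # Posterior weight is proportional to prior weight times model likelihood.
        scored = [lw + math.log(m.predict(bit))
                  for lw, m in zip(self.log_w, self.models)]
        top = max(scored)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scored))
        self.log_w = [s - log_norm for s in scored]
        # Let each component model learn from the observed bit as well.
        for m in self.models:
            m.update(bit)
```

As a usage example, `BayesMixture([BernoulliModel(0.9), BernoulliModel(0.1), KTModel()])` can be fed observed bits one at a time through `update`, after which `predict(1)` returns the mixture's probability that the next bit is 1. The weight of each component shifts toward whichever model has assigned the highest probability to the data so far, which is the sense in which the combined model can be more powerful than any single component chosen in advance.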
