A Philosophical Treatise of Universal Induction

Understanding inductive reasoning is a problem that has engaged humankind for thousands of years. It is relevant to a wide range of fields and is integral to the philosophy of science. It has been tackled by many great minds, from philosophers to scientists to mathematicians and, more recently, computer scientists. In this article we argue the case for Solomonoff induction, a formal inductive framework that combines algorithmic information theory with the Bayesian framework. Although it achieves excellent theoretical results and rests on solid philosophical foundations, the technical knowledge required to understand it has caused it to remain largely unknown and unappreciated in the wider scientific community. The main contribution of this article is to present Solomonoff induction and its related concepts in a generally accessible form, with the aim of bridging this technical gap. In the process we examine the major historical contributions that led to the formulation of Solomonoff induction, as well as criticisms of Solomonoff induction and of induction in general. In particular, we examine how Solomonoff induction addresses many issues that have plagued other inductive systems, such as the black raven paradox and the confirmation problem, and we compare this approach with other recent approaches.
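The core idea, combining a simplicity prior with Bayesian updating, can be conveyed by a toy sketch. This is not Solomonoff's actual mixture, which ranges over all programs and is incomputable; here the hypothesis class is restricted, as an illustrative assumption, to binary patterns repeated forever, each weighted by a prior of 2 to the power of minus its length. Prediction is then the posterior-weighted vote of the hypotheses still consistent with the observed data.

```python
from fractions import Fraction

def pattern_hypotheses(max_len):
    """Enumerate toy 'programs': binary patterns repeated forever.
    The complexity of a pattern is its length in bits."""
    return [format(k, f"0{n}b")
            for n in range(1, max_len + 1)
            for k in range(2 ** n)]

def predicts(pattern, prefix):
    """True if repeating `pattern` reproduces the observed prefix."""
    return all(prefix[i] == pattern[i % len(pattern)]
               for i in range(len(prefix)))

def predict_next(prefix, max_len=4):
    """Posterior probability that the next bit is '1', under a
    2^-length simplicity prior over consistent patterns."""
    mass = {"0": Fraction(0), "1": Fraction(0)}
    for p in pattern_hypotheses(max_len):
        if predicts(p, prefix):
            prior = Fraction(1, 2 ** len(p))       # shorter pattern -> larger prior
            mass[p[len(prefix) % len(p)]] += prior  # that pattern's next bit
    total = mass["0"] + mass["1"]
    return mass["1"] / total  # normalised posterior predictive

# After observing 010, the short pattern "01" carries the most prior
# weight among the consistent hypotheses, so '1' is favoured next.
print(float(predict_next("010")))  # -> 0.625
```

The same mechanism drives Solomonoff induction proper: simpler explanations receive exponentially more prior weight, and hypotheses contradicted by the data are eliminated, so predictions converge on the shortest description of the observed sequence.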
