Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)

The simple but general formal theory of fun, intrinsic motivation, and creativity (1990–2010) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns that allow for improved prediction or data compression. It generalizes the traditional field of active learning and is related to older but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence, including autonomous development, science, art, music, and humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but non-optimal implementations (1991, 1995, and 1997–2002) are reviewed, as well as several recent variants by others (2005–2010). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
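The core idea of rewarding improved compression can be illustrated with a small sketch. It is only a toy under stated assumptions: the Laplace-smoothed symbol-frequency model, the function names `code_length` and `compression_progress`, and the two-symbol alphabet are all hypothetical stand-ins chosen for brevity, not the predictors or compressors used in the implementations the overview reviews. The intrinsic reward is taken to be the number of bits saved on the observed data when the model is updated with a new observation.

```python
import math

def code_length(data, counts):
    """Bits needed to encode `data` under a symbol model given by `counts`."""
    total = sum(counts.values())
    return sum(-math.log2(counts[s] / total) for s in data)

def compression_progress(history, new_symbol, alphabet):
    """Toy intrinsic reward: bits saved on history + new_symbol by one model update.

    Assumption: the "compressor" is a Laplace-smoothed frequency model,
    a deliberately simple stand-in for an adaptive predictor.
    """
    # Model before seeing the new symbol (every symbol starts with count 1).
    old_counts = {a: 1 for a in alphabet}
    for s in history:
        old_counts[s] += 1
    # Model after learning from the new symbol.
    new_counts = dict(old_counts)
    new_counts[new_symbol] += 1
    data = list(history) + [new_symbol]
    # Reward = code length under the old model minus code length under the new one.
    return code_length(data, old_counts) - code_length(data, new_counts)

# A regularity that is still new yields more reward than one already learned.
early = compression_progress(["a"] * 3, "a", "ab")
late = compression_progress(["a"] * 30, "a", "ab")
print(early > late > 0)
```

In this toy setting the reward shrinks as the same regularity keeps repeating, which mirrors the notion that patterns stop being interesting once they no longer allow further compression improvement.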
