Probabilistic Incremental Program Evolution

Probabilistic incremental program evolution (PIPE) is a novel technique for automatic program synthesis. We combine probability vector coding of program instructions, population-based incremental learning, and tree-coded programs like those used in some variants of genetic programming (GP). PIPE iteratively generates successive populations of functional programs according to an adaptive probability distribution over all possible programs. On each iteration, it uses the best program to refine the distribution. Thus, it stochastically generates better and better programs. Since distribution refinements depend only on the best program of the current population, PIPE can evaluate program populations efficiently when the goal is to discover a program with minimal runtime. We compare PIPE to GP on a function regression problem and the 6-bit parity problem. We also use PIPE to solve tasks in partially observable mazes, where the best programs have minimal runtime.
