New Millennium AI and the Convergence of History: Update of 2012

Artificial Intelligence (AI) has recently become a real formal science: the new millennium brought the first mathematically sound, asymptotically optimal, universal problem solvers, providing a new, rigorous foundation for the previously largely heuristic field of General AI and embedded agents. At the same time there has been rapid progress in practical methods for learning true sequence-processing programs, as opposed to traditional methods limited to stationary pattern association. Here we will briefly review some of the new results, and speculate about future developments, pointing out that the time intervals between the most notable events in over 40,000 years or 29 lifetimes of human history have sped up exponentially, apparently converging to zero within the next few decades. Or is this impression just a by-product of the way humans allocate memory space to past events?

[1]  Jürgen Schmidhuber,et al.  The New AI: General & Sound & Relevant for Physics , 2003, Artificial General Intelligence.

[2]  Mark Steijvers,et al.  A Recurrent Network that performs a Context-Sensitive Prediction Task , 1996 .

[3]  S. Hochreiter,et al.  REINFORCEMENT DRIVEN INFORMATION ACQUISITION IN NONDETERMINISTIC ENVIRONMENTS , 1995 .

[4]  Marcus Hutter The Fastest and Shortest Algorithm for all Well-Defined Problems , 2002, Int. J. Found. Comput. Sci..

[5]  Ronald J. Williams,et al.  Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .

[6]  Hervé Bourlard,et al.  Connectionist speech recognition , 1993 .

[7]  Paul E. Utgoff,et al.  Shift of bias for inductive concept learning , 1984 .

[8]  Roland Olsson,et al.  Inductive Functional Programming Using Incremental Program Transformation , 1995, Artif. Intell..

[9]  Jürgen Schmidhuber,et al.  Evolino: Hybrid Neuroevolution / Optimal Linear Search for Sequence Prediction , 2005, IJCAI 2005.

[10]  Allen Newell,et al.  GPS, a program that simulates human thought , 1995 .

[11]  Janet Wiles,et al.  Learning a context-free task with a recurrent neural network: An analysis of stability , 1999 .

[12]  Benjamin Kuipers,et al.  Bootstrap learning of foundational representations , 2006, Connect. Sci..

[13]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[14]  Philipp Slusallek,et al.  Introduction to real-time ray tracing , 2005, SIGGRAPH Courses.

[15]  King-Sun Fu,et al.  IEEE Transactions on Pattern Analysis and Machine Intelligence Publication Information , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Risto Miikkulainen,et al.  Solving Non-Markovian Control Tasks with Neuro-Evolution , 1999, IJCAI.

[17]  Risto Miikkulainen,et al.  Efficient Reinforcement Learning through Symbiotic Evolution , 1996, Machine Learning.

[18]  Ming Li,et al.  An Introduction to Kolmogorov Complexity and Its Applications , 1997, Texts in Computer Science.

[19]  P. Werbos,et al.  Beyond Regression : "New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .

[20]  Tony Plate,et al.  Holographic Recurrent Networks , 1992, NIPS.

[21]  Jürgen Schmidhuber,et al.  Artificial curiosity based on discovering novel algorithmic predictability through coevolution , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).

[22]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[23]  Lee A. Feldkamp,et al.  Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks , 1994, IEEE Trans. Neural Networks.

[24]  Risto Miikkulainen,et al.  Robust non-linear control through neuroevolution , 2003 .

[25]  C. E. SHANNON,et al.  A mathematical theory of communication , 1948, MOCO.

[26]  Jürgen Schmidhuber,et al.  Recurrent nets that time and count , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.

[27]  Stefano Nolfi,et al.  Evolving mobile robots in simulated and real environments , 1995 .

[28]  Jürgen Schmidhuber,et al.  Reinforcement Learning Soccer Teams with Incomplete World Models , 1999, Auton. Robots.

[29]  Risto Miikkulainen,et al.  Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.

[30]  J. Schmidhuber,et al.  Framewise phoneme classification with bidirectional LSTM networks , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[31]  Jürgen Schmidhuber,et al.  An on-line algorithm for dynamic reinforcement learning and planning in reactive environments , 1990, 1990 IJCNN International Joint Conference on Neural Networks.

[32]  Stephen Hart,et al.  Intrinsically motivated hierarchical manipulation , 2008, 2008 IEEE International Conference on Robotics and Automation.

[33]  Raymond L. Watrous,et al.  Induction of Finite-State Languages Using Second-Order Recurrent Networks , 1992, Neural Computation.

[34]  Jürgen Schmidhuber,et al.  Reinforcement Learning with Self-Modifying Policies , 1998, Learning to Learn.

[35]  Douglas B. Lenat,et al.  Theory Formation by Heuristic Search , 1983, Artificial Intelligence.

[36]  Anthony J. Robinson,et al.  An application of recurrent nets to phone probability estimation , 1994, IEEE Trans. Neural Networks.

[37]  Y. LeCun,et al.  Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[38]  Jürgen Schmidhuber,et al.  Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem , 1990 .

[39]  Jürgen Schmidhuber,et al.  Classifying Unprompted Speech by Retraining LSTM Nets , 2005, ICANN.

[40]  Jürgen Schmidhuber,et al.  Optimal Ordered Problem Solver , 2002, Machine Learning.

[41]  Padhraic Smyth,et al.  Discrete recurrent neural networks for grammatical inference , 1994, IEEE Trans. Neural Networks.

[42]  Karl Sims,et al.  Evolving virtual creatures , 1994, SIGGRAPH.

[43]  Risto Miikkulainen,et al.  Active Guidance for a Finless Rocket Using Neuroevolution , 2003, GECCO.

[44]  Alex Graves,et al.  Rapid Retraining on Speech Data with LSTM Recurrent Networks. , 2005 .

[45]  H. P. Schwefel,et al.  Numerische Optimierung von Computermodellen mittels der Evo-lutionsstrategie , 1977 .

[46]  Jing Peng,et al.  An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.

[47]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[48]  Nichael Lynn Cramer,et al.  A Representation for the Adaptive Generation of Simple Sequential Programs , 1985, ICGA.

[49]  Hervé Bourlard,et al.  Connectionist Speech Recognition: A Hybrid Approach , 1993 .

[50]  Guo-Zheng Sun,et al.  Time Warping Invariant Neural Networks , 1992, NIPS.

[51]  Jürgen Schmidhuber,et al.  Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.

[52]  William I. Gasarch,et al.  Book Review: An introduction to Kolmogorov Complexity and its Applications Second Edition, 1997 by Ming Li and Paul Vitanyi (Springer (Graduate Text Series)) , 1997, SIGACT News.

[53]  Rafal Salustowicz,et al.  Probabilistic Incremental Program Evolution , 1997, Evolutionary Computation.

[54]  Jürgen Schmidhuber,et al.  Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.

[55]  Sepp Hochreiter,et al.  Untersuchungen zu dynamischen neuronalen Netzen , 1991 .

[56]  R. Penrose A Generalized inverse for matrices , 1955 .

[57]  Jürgen Schmidhuber,et al.  Completely Self-referential Optimal Reinforcement Learners , 2005, ICANN.

[58]  Yoshua Bengio,et al.  Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .

[59]  Jürgen Schmidhuber 2006: Celebrating 75 Years of AI - History and Outlook: The Next 25 Years , 2006, 50 Years of Artificial Intelligence.

[60]  SolomonoffR. Complexity-based induction systems , 2006 .

[61]  A. Turing On Computable Numbers, with an Application to the Entscheidungsproblem. , 1937 .

[62]  Jürgen Schmidhuber,et al.  Hierarchies of Generalized Kolmogorov Complexities and Nonenumerable Universal Measures Computable in the Limit , 2002, Int. J. Found. Comput. Sci..

[63]  W. Vent,et al.  Rechenberg, Ingo, Evolutionsstrategie — Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. 170 S. mit 36 Abb. Frommann‐Holzboog‐Verlag. Stuttgart 1973. Broschiert , 1975 .

[64]  Vernor Vinge,et al.  ==================================================================== the Coming Technological Singularity: How to Survive in the Post-human Era , 2022 .

[65]  Janet Wiles,et al.  Context-free and context-sensitive dynamics in recurrent neural networks , 2000, Connect. Sci..

[66]  Ming Li,et al.  An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.

[67]  Xin Yao,et al.  A review of evolutionary artificial neural networks , 1993, Int. J. Intell. Syst..

[68]  Jürgen Schmidhuber,et al.  Co-evolving recurrent neurons learn deep memory POMDPs , 2005, GECCO '05.

[69]  Helko Lehmann,et al.  Computation in Recurrent Neural Networks: From Counters to Iterated Function Systems , 1998, Australian Joint Conference on Artificial Intelligence.

[70]  Stefano Nolfi,et al.  How to Evolve Autonomous Robots: Different Approaches in Evolutionary Robotics , 1994 .

[71]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[72]  Janet Wiles,et al.  Learning to count without a counter: A case study of dynamics and activation landscapes in recurrent networks , 1995 .

[73]  Klaus Obermayer,et al.  Sequence classificatio n for protein analysis , 2005 .

[74]  Jürgen Schmidhuber,et al.  Sequence Labelling in Structured Domains with Hierarchical Recurrent Neural Networks , 2007, IJCAI.

[75]  Jürgen Schmidhuber,et al.  Ultimate Cognition à la Gödel , 2009, Cognitive Computation.

[76]  Harald Haas,et al.  Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication , 2004, Science.

[77]  Jürgen Schmidhuber,et al.  Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets , 2003, Neural Networks.

[78]  Nikolaus Hansen,et al.  Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.

[79]  Dr. Marcus Hutter,et al.  Universal artificial intelligence , 2004 .

[80]  Jürgen Schmidhuber,et al.  The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions , 2002, COLT.

[81]  José Carlos Príncipe,et al.  A Theory for Neural Networks with Time Delays , 1990, NIPS.

[82]  Sven Behnke,et al.  Hierarchical Neural Networks for Image Interpretation , 2003, Lecture Notes in Computer Science.

[83]  Tom Schaul,et al.  Stochastic search using the natural gradient , 2009, ICML '09.

[84]  T. Munich,et al.  Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks , 2008, NIPS.

[85]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[86]  Tom Schaul,et al.  Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).

[87]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[88]  Ray J. Solomonoff,et al.  A Formal Theory of Inductive Inference. Part II , 1964, Inf. Control..

[89]  Jürgen Schmidhuber,et al.  Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks , 2007, NIPS.

[90]  Jürgen Schmidhuber,et al.  Discovering Neural Nets with Low Kolmogorov Complexity and High Generalization Capability , 1997, Neural Networks.

[91]  Vaclav Smil,et al.  Detonator of the population explosion , 1999, Nature.

[92]  Eduardo Sontag,et al.  Turing computability with neural nets , 1991 .

[93]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[94]  Ray J. Solomonoff,et al.  A Formal Theory of Inductive Inference. Part I , 1964, Inf. Control..

[95]  Randall D. Beer,et al.  Sequential Behavior and Learning in Evolved Dynamical Neural Networks , 1994, Adapt. Behav..

[96]  Raymond C. Kurzweil,et al.  The Singularity Is Near , 2018, The Infinite Desire for Growth.

[97]  Ray J. Solomonoff,et al.  Complexity-based induction systems: Comparisons and convergence theorems , 1978, IEEE Trans. Inf. Theory.

[98]  Jürgen Schmidhuber,et al.  A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .

[99]  Jürgen Schmidhuber,et al.  Learning Precise Timing with LSTM Recurrent Networks , 2003, J. Mach. Learn. Res..

[100]  Risto Miikkulainen,et al.  Incremental Evolution of Complex General Behavior , 1997, Adapt. Behav..

[101]  Jürgen Schmidhuber,et al.  Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.

[102]  Risto Miikkulainen,et al.  Efficient Non-linear Control Through Neuroevolution , 2006, ECML.

[103]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[104]  Mike Casey,et al.  The Dynamics of Discrete-Time Computation, with Application to Recurrent Neural Networks and Finite State Machine Extraction , 1996, Neural Computation.

[105]  Jürgen Schmidhuber,et al.  Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements , 2003, ArXiv.

[106]  Jürgen Schmidhuber,et al.  A Fixed Size Storage O(n3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks , 1992, Neural Computation.

[107]  Jürgen Schmidhuber,et al.  Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts , 2005 .

[108]  Jürgen Schmidhuber,et al.  Netzwerkarchitekturen, Zielfunktionen und Kettenregel , 1993 .

[109]  Jürgen Schmidhuber,et al.  A Technical Justification of Consciousness , 2005 .

[110]  Jordan B. Pollack,et al.  Analysis of Dynamical Recognizers , 1997, Neural Computation.

[111]  Jürgen Schmidhuber,et al.  Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.

[112]  Jürgen Schmidhuber,et al.  Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks , 1992, Neural Computation.

[113]  A. Kolmogorov Three approaches to the quantitative definition of information , 1968 .

[114]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[115]  J J Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.

[116]  Tom Schaul,et al.  Efficient natural evolution strategies , 2009, GECCO.

[117]  N N Schraudolph,et al.  Processing images by semi-linear predictability minimization. , 1997, Network.

[118]  Paul Rodríguez,et al.  A Recurrent Neural Network that Learns to Count , 1999, Connect. Sci..

[119]  James L. McClelland,et al.  Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .

[120]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[121]  Nuttapong Chentanez,et al.  Intrinsically Motivated Reinforcement Learning , 2004, NIPS.

[122]  Janet Wiles,et al.  Recurrent Neural Networks Can Learn to Implement Symbol-Sensitive Counting , 1997, NIPS.

[123]  PAUL J. WERBOS,et al.  Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.

[124]  Ingo Rechenberg,et al.  Evolutionsstrategie : Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , 1973 .

[125]  Jürgen Schmidhuber,et al.  Training Recurrent Networks by Evolino , 2007, Neural Computation.

[126]  Rich Caruana,et al.  Removing the Genetics from the Standard Genetic Algorithm , 1995, ICML.

[127]  J. Urgen Schmidhuber,et al.  Learning Factorial Codes by Predictability Minimization , 1992, Neural Computation.

[128]  Mark B. Ring Learning Sequential Tasks by Incrementally Adding Higher Orders , 1992, NIPS.

[129]  Corso Elvezia Probabilistic Incremental Program Evolution , 1997 .

[130]  Jürgen Schmidhuber,et al.  LSTM recurrent networks learn simple context-free and context-sensitive languages , 2001, IEEE Trans. Neural Networks.

[131]  Jürgen Schmidhuber,et al.  Evolving Modular Fast-Weight Networks for Control , 2005, ICANN.

[132]  Luís B. Almeida,et al.  A learning rule for asynchronous perceptrons with feedback in a combinatorial environment , 1990 .

[133]  Fernando J. Pineda,et al.  Recurrent Backpropagation and the Dynamical Approach to Adaptive Neural Computation , 1989, Neural Computation.

[134]  Jürgen Schmidhuber,et al.  Neural processing of complex continual input streams , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.

[135]  Christopher M. Bishop,et al.  Neural networks for pattern recognition , 1995 .

[136]  John Langford,et al.  Agnostic active learning , 2006, J. Comput. Syst. Sci..

[137]  Barak A. Pearlmutter Learning State Space Trajectories in Recurrent Neural Networks , 1989, Neural Computation.

[138]  Jürgen Schmidhuber,et al.  Gödel Machines: Fully Self-referential Optimal Universal Self-improvers , 2007, Artificial General Intelligence.

[139]  D. Signorini,et al.  Neural networks , 1995, The Lancet.

[140]  K. Gödel Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I , 1931 .

[141]  Jürgen Schmidhuber,et al.  Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[142]  Peter Tiño,et al.  Learning long-term dependencies in NARX recurrent neural networks , 1996, IEEE Trans. Neural Networks.

[143]  Jürgen Schmidhuber,et al.  Semilinear Predictability Minimization Produces Well-Known Feature Detectors , 1996, Neural Computation.

[144]  Barak A. Pearlmutter Gradient calculations for dynamic recurrent neural networks: a survey , 1995, IEEE Trans. Neural Networks.

[145]  Jürgen Schmidhuber,et al.  Learning Nonregular Languages: A Comparison of Simple Recurrent Networks and LSTM , 2002, Neural Computation.

[146]  Jürgen Schmidhuber,et al.  Evolino for recurrent support vector machines , 2005, ESANN.

[147]  C. Lee Giles,et al.  Experimental Comparison of the Effect of Order in Recurrent Neural Networks , 1993, Int. J. Pattern Recognit. Artif. Intell..

[148]  Peter M. Todd,et al.  Designing Neural Networks using Genetic Algorithms , 1989, ICGA.

[149]  Jürgen Schmidhuber,et al.  Modeling systems with internal state using evolino , 2005, GECCO '05.

[150]  Jürgen Schmidhuber,et al.  Learning Team Strategies: Soccer Case Studies , 1998, Machine Learning.

[151]  W. J. Studden,et al.  Theory Of Optimal Experiments , 1972 .

[152]  Michael C. Mozer,et al.  Induction of Multiscale Temporal Structure , 1991, NIPS.

[153]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..