[1] K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I , 1931 .
[3] Emil L. Post. Finite combinatory processes—formulation , 1936, Journal of Symbolic Logic.
[4] A. Church. An Unsolvable Problem of Elementary Number Theory , 1936 .
[5] A. Turing. On computable numbers, with an application to the Entscheidungsproblem , 1937, Proc. London Math. Soc..
[6] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[7] David A. Huffman,et al. A method for the construction of minimum-redundancy codes , 1952, Proceedings of the IRE.
[8] Henry J. Kelley,et al. Gradient Theory of Optimal Flight Paths , 1960 .
[9] S. Dreyfus. The numerical solution of variational problems , 1962 .
[10] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part I , 1964, Inf. Control..
[11] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part II , 1964, Inf. Control..
[12] Lawrence J. Fogel,et al. Artificial Intelligence through Simulated Evolution , 1966 .
[13] Gregory J. Chaitin,et al. On the Length of Programs for Computing Finite Binary Sequences , 1966, JACM.
[14] Alekseĭ Grigorʹevich Ivakhnenko,et al. CYBERNETIC PREDICTING DEVICES , 1966 .
[15] A. Kolmogorov. Three approaches to the quantitative definition of information , 1968 .
[16] H. Akaike. Statistical predictor identification , 1970 .
[17] A. G. Ivakhnenko,et al. Polynomial Theory of Complex Systems , 1971, IEEE Trans. Syst. Man Cybern..
[18] Ingo Rechenberg,et al. Evolutionsstrategie : Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , 1973 .
[19] H. Akaike. A new look at the statistical model identification , 1974 .
[20] John H. Holland,et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .
[21] B. Speelpenning. Compiling Fast Partial Derivatives of Functions Given by Algorithms , 1980 .
[22] Paul J. Werbos,et al. Applications of advances in nonlinear sensitivity analysis , 1982 .
[23] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[24] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[25] J. Rissanen. Stochastic Complexity and Modeling , 1986 .
[26] Rina Dechter,et al. Learning While Searching in Constraint-Satisfaction-Problems , 1986, AAAI.
[27] Paul J. Werbos,et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[28] Dana H. Ballard,et al. Modular Learning in Neural Networks , 1987, AAAI.
[29] Paul J. Werbos,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[30] David E. Goldberg,et al. Genetic Algorithms in Search Optimization and Machine Learning , 1988 .
[31] Michael C. Mozer,et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment , 1988, NIPS.
[32] Petre Stoica,et al. Decentralized Control , 2018, The Control Systems Handbook.
[33] Michael I. Jordan. Supervised learning and systems with excess degrees of freedom , 1988 .
[34] John E. Moody,et al. Fast Learning in Multi-Resolution Hierarchies , 1988, NIPS.
[35] Jordan B. Pollack,et al. Implications of Recursive Distributed Representations , 1988, NIPS.
[36] Stephen I. Gallant,et al. Connectionist expert systems , 1988, CACM.
[37] Lorien Y. Pratt,et al. Comparing Biases for Minimal Network Construction with Back-Propagation , 1988, NIPS.
[38] Paul J. Werbos,et al. Neural networks for control and system identification , 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[39] B. Widrow,et al. The truck backer-upper: an example of self-learning in neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[40] Frank Fallside,et al. Dynamic reinforcement driven error propagation networks with application to game playing , 1989 .
[41] H. B. Barlow,et al. Finding Minimum Entropy Codes , 1989, Neural Computation.
[42] Halbert White,et al. Learning in Artificial Neural Networks: A Statistical Perspective , 1989, Neural Computation.
[43] Jürgen Schmidhuber,et al. A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks , 1989 .
[44] P. J. Werbos,et al. Backpropagation and neurocontrol: a review and prospectus , 1989, International 1989 Joint Conference on Neural Networks.
[45] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[46] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[47] Peter M. Todd,et al. Designing Neural Networks using Genetic Algorithms , 1989, ICGA.
[48] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[49] Hiroaki Kitano,et al. Designing Neural Networks Using Genetic Algorithms with Graph Generation System , 1990, Complex Syst..
[50] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[51] T. Sejnowski,et al. Learning Algorithms for Networks with Internal and External Feedback , 1990 .
[52] Jürgen Schmidhuber,et al. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[53] David E. Rumelhart,et al. Generalization by Weight-Elimination with Application to Forecasting , 1990, NIPS.
[54] Isabelle Guyon,et al. Structural Risk Minimization for Character Recognition , 1991, NIPS.
[55] Jürgen Schmidhuber,et al. A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .
[56] Sepp Hochreiter,et al. Untersuchungen zu dynamischen neuronalen Netzen , 1991 .
[57] Jürgen Schmidhuber,et al. Learning to generate sub-goals for action sequences , 1991 .
[58] Anders Krogh,et al. A Simple Weight Decay Can Improve Generalization , 1991, NIPS.
[59] Anders Krogh,et al. Introduction to the theory of neural computation , 1994, The advanced book program.
[60] Eduardo Sontag,et al. Turing computability with neural nets , 1991 .
[61] John E. Moody,et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems , 1991, NIPS.
[62] Long Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[63] A. P. Wieland,et al. Evolving neural network controllers for unstable systems , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[64] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[65] Vladimir Vapnik,et al. Principles of Risk Minimization for Learning Theory , 1991, NIPS.
[66] D. Mackay,et al. A Practical Bayesian Framework for Backprop Networks , 1991 .
[67] Jürgen Schmidhuber,et al. Learning to Generate Artificial Fovea Trajectories for Target Detection , 1991, Int. J. Neural Syst..
[68] Osamu Watanabe,et al. Kolmogorov Complexity and Computational Complexity , 2012, EATCS Monographs on Theoretical Computer Science.
[69] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[70] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[71] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[72] Michael I. Jordan,et al. Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..
[73] E. Allender. Applications of Time-Bounded Kolmogorov Complexity in Complexity Theory , 1992 .
[74] Narendra Ahuja,et al. Cresceptron: a self-organizing neural network which grows adaptively , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[75] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[76] Fu-Chuang Chen,et al. Adaptive control of nonlinear systems using neural networks , 1992 .
[77] Jürgen Schmidhuber,et al. Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks , 1992, Neural Computation.
[78] Mark B. Ring. Learning Sequential Tasks by Incrementally Adding Higher Orders , 1992, NIPS.
[79] Michael C. Mozer,et al. A Connectionist Symbol Manipulator that Discovers the Structure of Context-Free Languages , 1992, NIPS.
[80] Long Lin,et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992 .
[81] John E. Moody,et al. Fast Pruning Using Principal Components , 1993, NIPS.
[82] Geoffrey E. Hinton,et al. Keeping Neural Networks Simple , 1993 .
[83] Andrzej Cichocki,et al. Neural networks for optimization and signal processing , 1993 .
[84] Inman Harvey,et al. Evolving Recurrent Dynamical Networks for Robot Control , 1993 .
[85] Jürgen Schmidhuber,et al. A ‘Self-Referential’ Weight Matrix , 1993 .
[86] Mitsuo Kawato,et al. Neural network control for a closed-loop System using Feedback-error-learning , 1993, Neural Networks.
[87] Shun-ichi Amari,et al. Statistical Theory of Learning Curves under Entropic Loss Criterion , 1993, Neural Computation.
[88] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[89] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.
[90] Jürgen Schmidhuber,et al. Netzwerkarchitekturen, Zielfunktionen und Kettenregel , 1993 .
[91] Xin Yao,et al. A review of evolutionary artificial neural networks , 1993, Int. J. Intell. Syst..
[92] Jonas Karlsson,et al. Learning via task decomposition , 1993 .
[93] Kumpati S. Narendra,et al. Control of nonlinear dynamical systems using neural networks: controllability and stabilization , 1993, IEEE Trans. Neural Networks.
[94] David H. Wolpert,et al. Bayesian Backpropagation Over I-O Functions Rather Than Weights , 1993, NIPS.
[95] Eduardo D. Sontag,et al. Neural Networks for Control , 1993 .
[96] Sean B. Holden,et al. On the theory of generalization and self-structuring in linearly weighted connectionist networks , 1993 .
[97] Vasant Honavar,et al. Generative learning structures and processes for generalized connectionist networks , 1993, Inf. Sci..
[98] Satinder P. Singh,et al. Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes , 1994, AAAI.
[99] Karl Sims,et al. Evolving virtual creatures , 1994, SIGGRAPH.
[100] Astro Teller,et al. The evolution of mental models , 1994 .
[101] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[102] Bernd Fritzke,et al. A Growing Neural Gas Network Learns Topologies , 1994, NIPS.
[103] Randall D. Beer,et al. Sequential Behavior and Learning in Evolved Dynamical Neural Networks , 1994, Adapt. Behav..
[104] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[105] Stefano Nolfi,et al. How to Evolve Autonomous Robots: Different Approaches in Evolutionary Robotics , 1994 .
[106] Neil Burgess,et al. A Constructive Algorithm that Converges for Real-Valued Input Patterns , 1994, Int. J. Neural Syst..
[107] Jürgen Schmidhuber,et al. On learning how to learn learning strategies , 1994 .
[108] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[109] Christian Jacob,et al. Genetic L-System Programming , 1994, PPSN.
[110] Gerhard Weiß,et al. Hierarchical Chunking in Classifier Systems , 1994, AAAI.
[111] Mark B. Ring. Continual learning in reinforcement environments , 1995, GMD-Bericht.
[112] Ronald J. Williams,et al. Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .
[113] John Moody,et al. Architecture Selection Strategies for Neural Networks: Application to Corporate Bond Rating Prediction , 1995, NIPS 1995.
[114] Jürgen Schmidhuber. Discovering Solutions with Low Kolmogorov Complexity and High Generalization Capability , 1995 .
[115] Roland Olsson,et al. Inductive Functional Programming Using Incremental Program Transformation , 1995, Artif. Intell..
[116] J. Stephen Judd,et al. Optimal stopping and effective machine complexity in learning , 1993, Proceedings of 1995 IEEE International Symposium on Information Theory.
[117] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[118] Stefano Nolfi,et al. Evolving Mobile Robots in Simulated and Real Environments , 1995, Artificial Life.
[119] Rajesh Parekh,et al. Constructive Neural Network Learning Algorithms for Multi-Category Pattern Classification , 1995 .
[120] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[121] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[122] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[123] K S Narendra,et al. Control of nonlinear dynamical systems using neural networks. II. Observability, identification, and control , 1996, IEEE Trans. Neural Networks.
[124] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[125] Larry D. Pyeatt,et al. A comparison between cellular encoding and direct encoding for genetic neural networks , 1996 .
[126] Andrew McCallum,et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks , 1996 .
[127] Jürgen Schmidhuber,et al. Sequential neural text compression , 1996, IEEE Trans. Neural Networks.
[128] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.
[129] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[130] Jürgen Schmidhuber,et al. Solving POMDPs with Levin Search and EIRA , 1996, ICML.
[131] Maja J. Matarić,et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks , 1996 .
[132] Richard D. Braatz,et al. On the "Identification and control of dynamical systems using neural networks" , 1997, IEEE Trans. Neural Networks.
[133] Jürgen Schmidhuber,et al. Discovering Neural Nets with Low Kolmogorov Complexity and High Generalization Capability , 1997, Neural Networks.
[134] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[135] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[136] David E. Moriarty,et al. Symbiotic Evolution of Neural Networks in Sequential Decision Tasks , 1997 .
[137] Ashwin Ram,et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces , 1997, Adapt. Behav..
[138] Doina Precup,et al. Multi-time Models for Temporally Abstract Planning , 1997, NIPS.
[139] Shigenobu Kobayashi,et al. Reinforcement Learning in POMDPs with Function Approximation , 1997, ICML.
[140] Huaiyu Zhu. On Information and Sufficiency , 1997 .
[141] Risto Miikkulainen,et al. Incremental Evolution of Complex General Behavior , 1997, Adapt. Behav..
[142] Jürgen Schmidhuber,et al. Flat Minima , 1997, Neural Computation.
[143] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[144] Jürgen Schmidhuber,et al. Reinforcement Learning with Self-Modifying Policies , 1998, Learning to Learn.
[145] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[146] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[147] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[148] F. Pasemann,et al. Evolving structure and function of neurocontrollers , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).
[149] B. Schölkopf,et al. Advances in kernel methods: support vector learning , 1999 .
[150] Jürgen Schmidhuber,et al. Feature Extraction Through LOCOCODE , 1999, Neural Computation.
[151] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[152] Kee-Eung Kim,et al. Learning Finite-State Controllers for Partially Observable Environments , 1999, UAI.
[153] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[154] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[155] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[156] Vivek S. Borkar,et al. Learning Algorithms for Markov Decision Processes with Average Cost , 2001, SIAM J. Control. Optim..
[157] Sepp Hochreiter,et al. Learning to Learn Using Gradient Descent , 2001, ICANN.
[158] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[159] John F. Kolen,et al. Field Guide to Dynamical Recurrent Networks , 2001 .
[160] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[161] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[162] Tao Zhang,et al. Stable Adaptive Neural Network Control , 2001, The Springer International Series on Asian Studies in Computer and Information Science.
[163] Jürgen Schmidhuber,et al. Learning Precise Timing with LSTM Recurrent Networks , 2003, J. Mach. Learn. Res..
[164] Mitsuo Kawato,et al. Multiple Model-Based Reinforcement Learning , 2002, Neural Computation.
[165] Paul E. Utgoff,et al. Many-Layered Learning , 2002, Neural Computation.
[166] Terrence J. Sejnowski,et al. Slow Feature Analysis: Unsupervised Learning of Invariances , 2002, Neural Computation.
[167] Jürgen Schmidhuber. The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions , 2002 .
[168] Jürgen Schmidhuber,et al. Hierarchies of Generalized Kolmogorov Complexities and Nonenumerable Universal Measures Computable in the Limit , 2002, Int. J. Found. Comput. Sci..
[169] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[170] G. Rizzolatti,et al. Hearing Sounds, Understanding Actions: Action Representation in Mirror Neurons , 2002, Science.
[171] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[172] Christian Igel,et al. Neuroevolution for reinforcement learning using evolution strategies , 2003, The 2003 Congress on Evolutionary Computation, 2003. CEC '03..
[173] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[174] Risto Miikkulainen,et al. Active Guidance for a Finless Rocket Using Neuroevolution , 2003, GECCO.
[175] Petros Koumoutsakos,et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) , 2003, Evolutionary Computation.
[176] Sven Behnke,et al. Hierarchical Neural Networks for Image Interpretation , 2003, Lecture Notes in Computer Science.
[177] Jürgen Schmidhuber,et al. A robot that reinforcement-learns to identify and memorize important previous observations , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[178] Risto Miikkulainen,et al. Evolving Keepaway Soccer Players through Task Decomposition , 2003, GECCO.
[179] Sridhar Mahadevan,et al. Hierarchical Policy Gradient Algorithms , 2003, ICML.
[180] Risto Miikkulainen,et al. Robust non-linear control through neuroevolution , 2003 .
[181] Mitsuo Kawato,et al. Inter-module credit assignment in modular reinforcement learning , 2003, Neural Networks.
[183] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[184] Jürgen Schmidhuber,et al. Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets , 2003, Neural Networks.
[185] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[186] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[187] Harald Haas,et al. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication , 2004, Science.
[188] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[189] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[190] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[191] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[192] Jürgen Schmidhuber,et al. Optimal Ordered Problem Solver , 2002, Machine Learning.
[193] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
[194] Jürgen Schmidhuber,et al. Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement , 1997, Machine Learning.
[195] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[196] Keechul Jung,et al. GPU implementation of neural networks , 2004, Pattern Recognit..
[197] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .
[198] Sridhar Mahadevan,et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results , 2004, Machine Learning.
[199] Andrew W. Moore,et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 2004, Machine Learning.
[200] Jürgen Schmidhuber,et al. Fast Online Q(λ) , 1998, Machine Learning.
[201] Chia-Feng Juang,et al. A hybrid of genetic algorithm and particle swarm optimization for recurrent network design , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[202] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[203] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[204] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[205] Marcus Hutter. Simulation Algorithms for Computational Systems Biology , 2017, Texts in Theoretical Computer Science. An EATCS Series.
[206] Mark A. Pitt,et al. Advances in Minimum Description Length: Theory and Applications , 2005 .
[207] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[208] Jürgen Schmidhuber,et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.
[209] R. Schapire. The Strength of Weak Learnability , 1990, Machine Learning.
[210] Jun Morimoto,et al. Robust Reinforcement Learning , 2005, Neural Computation.
[211] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[212] Bram Bakker,et al. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization , 2003 .
[213] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[214] Patrice Y. Simard,et al. High Performance Convolutional Neural Networks for Document Processing , 2006 .
[215] Jürgen Schmidhuber,et al. Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts , 2006, Connect. Sci..
[216] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[217] Jürgen Schmidhuber,et al. A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[218] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[219] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[220] Jürgen Schmidhuber,et al. Training Recurrent Networks by Evolino , 2007, Neural Computation.
[221] Justus H. Piater,et al. Closed-Loop Learning of Visual Control Policies , 2011, J. Artif. Intell. Res..
[222] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[223] Andrew G. Barto,et al. Skill Characterization Based on Betweenness , 2008, NIPS.
[224] P. Vitányi,et al. An Introduction to Kolmogorov Complexity and Its Applications, Third Edition , 1997, Texts in Computer Science.
[225] R. Sutton,et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation , 2008, NIPS 2008.
[226] Steffen Udluft,et al. Learning long-term dependencies with recurrent neural networks , 2008, Neurocomputing.
[227] Jürgen Schmidhuber,et al. State-Dependent Exploration for Policy Gradient Methods , 2008, ECML/PKDD.
[228] Risto Miikkulainen,et al. Accelerated Neural Evolution through Cooperatively Coevolved Synapses , 2008, J. Mach. Learn. Res..
[229] M. Graziano. The Intelligent Movement Machine: An Ethological Perspective on the Primate Motor System , 2008 .
[230] Alex Graves,et al. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks , 2008, NIPS.
[231] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[232] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[233] Tom Schaul,et al. Efficient natural evolution strategies , 2009, GECCO.
[234] Victor Uc Cetina,et al. Reinforcement learning in continuous state and action spaces , 2009 .
[235] J. Schmidhuber,et al. A Novel Connectionist System for Unconstrained Handwriting Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[236] Jürgen Schmidhuber,et al. Simple algorithmic theory of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes (Special Feature: Learning and Emergence of Higher-Order Functions: New Developments in Brain, Robot, and Human Research) , 2009 .
[237] Kenneth O. Stanley,et al. A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks , 2009, Artificial Life.
[238] Julian Togelius,et al. Hierarchical controller learning in a First-Person Shooter , 2009, 2009 IEEE Symposium on Computational Intelligence and Games.
[239] Verena Heidrich-Meisner,et al. Neuroevolution strategies for episodic reinforcement learning , 2009, J. Algorithms.
[240] Rajat Raina,et al. Large-scale deep unsupervised learning using graphics processors , 2009, ICML '09.
[241] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[242] Martin A. Riedmiller,et al. Deep auto-encoder neural networks in reinforcement learning , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[243] Tom Schaul,et al. Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients , 2010, ICANN.
[244] Tom Schaul,et al. Exponential natural evolution strategies , 2010, GECCO '10.
[245] Sven Behnke,et al. Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition , 2010, ICANN.
[246] Junichiro Yoshimoto,et al. Free-energy-based reinforcement learning in a partially observable environment , 2010, ESANN.
[247] Richard S. Sutton,et al. GQ(lambda): A general gradient algorithm for temporal-difference prediction learning with eligibility traces , 2010, Artificial General Intelligence.
[248] Robert A. Legenstein,et al. Reinforcement Learning on Slow Features of High-Dimensional Input Streams , 2010, PLoS Comput. Biol..
[249] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[250] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.
[251] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[252] Jan Peters,et al. Policy Gradient Methods , 2010, Encyclopedia of Machine Learning.
[254] Jürgen Schmidhuber,et al. On Fast Deep Nets for AGI Vision , 2011, AGI.
[255] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[256] Luca Maria Gambardella,et al. Convolutional Neural Network Committees for Handwritten Character Classification , 2011, 2011 International Conference on Document Analysis and Recognition.
[257] Luca Maria Gambardella,et al. Flexible, High Performance Convolutional Neural Networks for Image Classification , 2011, IJCAI.
[258] P. Schrimpf,et al. Dynamic Programming , 2011 .
[259] Luca Maria Gambardella,et al. Better Digit Recognition with a Committee of Simple Neural Nets , 2011, 2011 International Conference on Document Analysis and Recognition.
[260] Faustino J. Gomez,et al. Intrinsically Motivated Evolutionary Search for Vision-Based Reinforcement Learning , 2011 .
[261] Jürgen Schmidhuber,et al. A committee of neural networks for traffic sign classification , 2011, The 2011 International Joint Conference on Neural Networks.
[262] Jürgen Schmidhuber,et al. Sequential Constant Size Compressors for Reinforcement Learning , 2011, AGI.
[263] Tom Schaul,et al. The two-dimensional organization of behavior , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).
[264] Luca Maria Gambardella,et al. Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images , 2012, NIPS.
[265] Jürgen Schmidhuber,et al. Self-Delimiting Neural Networks , 2012, ArXiv.
[266] Jürgen Schmidhuber,et al. Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[267] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[268] Martin A. Riedmiller,et al. Autonomous reinforcement learning on raw visual input data in a real world application , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[269] Jürgen Schmidhuber,et al. Transfer learning for Latin and Chinese characters with Deep Neural Networks , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[270] Biao Huang,et al. System Identification , 2000, Control Theory for Physicists.
[271] Shimon Whiteson,et al. Evolutionary Computation for Reinforcement Learning , 2012, Reinforcement Learning.
[272] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[273] Yee Whye Teh,et al. Actor-Critic Reinforcement Learning with Energy-Based Policies , 2012, EWRL.
[274] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[275] Jürgen Schmidhuber,et al. Incremental Slow Feature Analysis: Adaptive Low-Complexity Slow Feature Updating from High-Dimensional Input Streams , 2012, Neural Computation.
[276] Joos Vandewalle,et al. Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications , 2012 .
[277] Hans-Georg Zimmermann,et al. Forecasting with Recurrent Neural Networks: 12 Tricks , 2012, Neural Networks: Tricks of the Trade.
[278] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[279] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[280] Jürgen Schmidhuber,et al. An intrinsic value system for developing multiple invariant representations with incremental slowness learning , 2013, Front. Neurorobot..
[281] Luca Maria Gambardella,et al. Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks , 2013, MICCAI.
[282] Jürgen Schmidhuber,et al. Evolving large-scale neural networks for vision-based reinforcement learning , 2013, GECCO '13.
[283] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[284] Jürgen Schmidhuber,et al. First Experiments with PowerPlay , 2012, Neural networks : the official journal of the International Neural Network Society.
[285] Pierre-Yves Oudeyer,et al. Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[286] Jürgen Schmidhuber,et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol..
[287] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[288] Tom Schaul,et al. A linear time natural evolution strategy for non-separable functions , 2011, GECCO.
[289] Jürgen Schmidhuber,et al. Compete to Compute , 2013, NIPS.
[290] Andrew W. Senior,et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.
[291] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[292] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[293] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[294] Jürgen Schmidhuber,et al. A Clockwork RNN , 2014, ICML.
[295] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[296] Bhuvana Ramabhadran,et al. Prosody contour prediction with long short-term memory, bi-directional, deep recurrent neural networks , 2014, INTERSPEECH.
[297] Yaroslav Bulatov,et al. Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks , 2013, ICLR.
[298] Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
[299] Frank K. Soong,et al. TTS synthesis with bidirectional LSTM based recurrent neural networks , 2014, INTERSPEECH.
[300] Alex Graves,et al. Neural Turing Machines , 2014, ArXiv.
[301] Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014 , 2014, ICML.
[302] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[303] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[304] Jason Weston,et al. Memory Networks , 2014, ICLR.
[305] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[306] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[307] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[308] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[309] Peter Stone,et al. Reinforcement learning , 2019, Scholarpedia.