Learning to Learn Using Gradient Descent

This paper introduces the application of gradient descent methods to meta-learning. The concept of "meta-learning", i.e., of a system that improves or discovers a learning algorithm, has been of interest in machine learning for decades because of its appealing applications. Previous meta-learning approaches have been based on evolutionary methods and have therefore been restricted to small models with few free parameters. We make meta-learning in large systems feasible by using recurrent neural networks with their attendant learning routines as meta-learning systems. Our system derived complex, well-performing learning algorithms from scratch. We also show that our approach performs non-stationary time series prediction.
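To make the idea of a recurrent network acting as a learning system more concrete, the following is a minimal sketch of one possible input encoding. It assumes, as an illustration not stated in the abstract, that at each step the network sees the current input together with the target of the previous step, so that it can adapt through its internal state while its weights are trained by gradient descent. The function name `build_meta_inputs` and the parameter `initial_target` are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_meta_inputs(
    xs: Sequence[Sequence[float]],
    ys: Sequence[float],
    initial_target: float = 0.0,
) -> List[Tuple[float, ...]]:
    """Pair each input x_t with the previous target y_{t-1}.

    Assumption (illustrative only): the recurrent network receives the
    target one step late, so it must predict y_t before seeing it.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    # Shift targets by one step; the first step gets a placeholder target.
    previous_targets = [initial_target] + list(ys[:-1])
    return [tuple(x) + (y_prev,) for x, y_prev in zip(xs, previous_targets)]


# Hypothetical toy sequence: three 2-dimensional inputs with scalar targets.
xs = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ys = [1.0, 1.0, 0.0]
meta_inputs = build_meta_inputs(xs, ys)
```

Each element of `meta_inputs` would then be fed to the recurrent network in order, with the prediction error on `ys` supplying the gradient signal for the network weights.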
