Advances in optimizing recurrent networks
Razvan Pascanu | Nicolas Boulanger-Lewandowski | Yoshua Bengio
[1] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2), 1983.
[2] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[3] Sepp Hochreiter, et al. Untersuchungen zu dynamischen neuronalen Netzen, 1991.
[4] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.
[5] Peter Tiňo, et al. Learning long-term dependencies is not as difficult with NARX recurrent neural networks, 1995.
[6] Yoshua Bengio, et al. Hierarchical Recurrent Neural Networks for Long-Term Dependencies, 1995, NIPS.
[7] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[8] Yoshua Bengio, et al. Greedy Layer-Wise Training of Deep Networks, 2006, NIPS.
[9] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.
[10] Marc'Aurelio Ranzato, et al. Efficient Learning of Sparse Representations with an Energy-Based Model, 2006, NIPS.
[11] Herbert Jaeger, et al. Optimization and applications of echo state networks with leaky-integrator neurons, 2007, Neural Networks.
[12] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.
[13] Geoffrey E. Hinton, et al. The Recurrent Temporal Restricted Boltzmann Machine, 2008, NIPS.
[14] Yoshua Bengio, et al. Why Does Unsupervised Pre-training Help Deep Learning?, 2010, AISTATS.
[15] James Martens. Deep learning via Hessian-free optimization, 2010, ICML.
[16] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[17] Geoffrey E. Hinton, et al. Temporal-Kernel Recurrent Neural Networks, 2010, Neural Networks.
[18] Lukáš Burget, et al. Extensions of recurrent neural network language model, 2011, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Ilya Sutskever, et al. Learning Recurrent Neural Networks with Hessian-Free Optimization, 2011, ICML.
[20] Hugo Larochelle, et al. The Neural Autoregressive Distribution Estimator, 2011, AISTATS.
[21] Mohamed Chtourou, et al. On the training of recurrent neural networks, 2011, Eighth International Multi-Conference on Systems, Signals & Devices.
[22] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[23] Yoshua Bengio, et al. Random Search for Hyper-Parameter Optimization, 2012, J. Mach. Learn. Res.
[24] Yoshua Bengio, et al. Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription, 2012, ICML.
[25] Tomáš Mikolov. Statistical Language Models Based on Neural Networks, 2012.
[26] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[27] Razvan Pascanu, et al. Understanding the exploding gradient problem, 2012, ArXiv.
[28] Geoffrey Zweig, et al. Context dependent recurrent neural network language model, 2012, IEEE Spoken Language Technology Workshop (SLT).
[29] Pascal Vincent, et al. Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives, 2012, ArXiv.
[30] Pascal Vincent, et al. Representation Learning: A Review and New Perspectives, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] Marc'Aurelio Ranzato, et al. Building high-level features using large scale unsupervised learning, 2011, IEEE International Conference on Acoustics, Speech and Signal Processing (2013).
[32] Stefan C. Kremer, et al. Recurrent Neural Networks, 2013, Handbook on Neural Information Processing.
[33] Geoffrey E. Hinton, et al. Training Recurrent Neural Networks, 2013.