The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions
[1] Fernando J. Pineda,et al. Dynamics and architecture for neural computation , 1988, J. Complex..
[2] M. V. Rossum,et al. In Neural Computation , 2022 .
[3] Barak A. Pearlmutter. Learning State Space Trajectories in Recurrent Neural Networks , 1988, Neural Computation.
[4] A. W. Smith,et al. Encoding sequential structure: experience with the real-time recurrent learning algorithm , 1989, International 1989 Joint Conference on Neural Networks.
[5] James L. McClelland,et al. Finite State Automata and Simple Recurrent Networks , 1989, Neural Computation.
[6] José Carlos Príncipe,et al. A Theory for Neural Networks with Time Delays , 1990, NIPS.
[7] Geoffrey E. Hinton,et al. A time-delay neural network architecture for isolated word recognition , 1990, Neural Networks.
[8] Sepp Hochreiter,et al. Untersuchungen zu dynamischen neuronalen Netzen , 1991 .
[9] Michael C. Mozer,et al. Induction of Multiscale Temporal Structure , 1991, NIPS.
[10] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[11] Jürgen Schmidhuber,et al. A Fixed Size Storage O(n^3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks , 1992, Neural Computation.
[12] Raymond L. Watrous,et al. Induction of Finite-State Languages Using Second-Order Recurrent Networks , 1992, Neural Computation.
[13] Guo-Zheng Sun,et al. Time Warping Invariant Neural Networks , 1992, NIPS.
[14] Tony Plate,et al. Holographic Recurrent Networks , 1992, NIPS.
[15] Yoshua Bengio,et al. Credit Assignment through Time: Alternatives to Backpropagation , 1993, NIPS.
[16] C. Lee Giles,et al. Experimental Comparison of the Effect of Order in Recurrent Neural Networks , 1993, Int. J. Pattern Recognit. Artif. Intell..
[17] Lee A. Feldkamp,et al. Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks , 1994, IEEE Trans. Neural Networks.
[18] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[19] Ronald J. Williams,et al. Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .
[20] Barak A. Pearlmutter. Gradient calculations for dynamic recurrent neural networks: a survey , 1995, IEEE Trans. Neural Networks.
[21] Gustavo Deco,et al. Nonlinear higher-order statistical decorrelation by volume-conserving neural architectures , 1995, Neural Networks.
[22] Corso Elvezia. Bridging Long Time Lags by Weight Guessing and "Long Short Term Memory" , 1996 .
[23] Sepp Hochreiter,et al. Guessing can Outperform Many Long Time Lag Algorithms , 1996 .
[24] Jürgen Schmidhuber,et al. LSTM can Solve Hard Long Time Lag Problems , 1996, NIPS.
[25] Peter Tiño,et al. Learning long-term dependencies in NARX recurrent neural networks , 1996, IEEE Trans. Neural Networks.
[26] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.