Application of reinforcement learning to balancing of Acrobot
[1] Gerald Tesauro. Practical Issues in Temporal Difference Learning, 1992.
[2] R. Murray, et al. Nonlinear controllers for non-integrable systems: the Acrobot example, 1990, 1990 American Control Conference.
[3] Shin Ishii, et al. On-line EM Algorithm for the Normalized Gaussian Network, 2000, Neural Computation.
[4] Geoffrey E. Hinton, et al. An Alternative Model for Mixtures of Experts, 1994, NIPS.
[5] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[6] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[7] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.
[8] Gerald Tesauro, et al. Practical Issues in Temporal Difference Learning, 1992, Mach. Learn.
[9] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[10] Shin Ishii, et al. Reinforcement Learning Based on On-Line EM Algorithm, 1998, NIPS.
[11] John Moody, et al. Fast Learning in Networks of Locally-Tuned Processing Units, 1989, Neural Computation.