### Policy Gradient in Continuous Time
