Covariant Policy Search
Jeff G. Schneider | J. Andrew Bagnell | Jochen Renz | W. Liu | Sanjiang Li
[1] M. DeGroot. Optimal Statistical Decisions, 1970.
[2] N. N. Chent︠s︡ov. Statistical decision rules and optimal inference, 1982.
[3] James F. Allen. Maintaining knowledge about temporal intervals, 1983, CACM.
[4] Anthony G. Cohn, et al. A Spatial Logic based on Regions and Connection, 1992, KR.
[5] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[6] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[7] Luis Fariñas del Cerro, et al. A New Tractable Subclass of the Rectangle Algebra, 1999, IJCAI.
[8] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[9] P. Bartlett, et al. Direct Gradient-Based Reinforcement Learning: I. Gradient Estimation Algorithms, 1999.
[10] Shun-ichi Amari, et al. Methods of information geometry, 2000.
[11] N. Čencov. Statistical Decision Rules and Optimal Inference, 2000.
[12] Max J. Egenhofer, et al. Similarity of Cardinal Directions, 2001, SSTD.
[13] Jeff G. Schneider, et al. Autonomous helicopter control using reinforcement learning policy search methods, 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[14] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[15] Alfonso Gerevini, et al. Combining topological and size information for spatial reasoning, 2002, Artif. Intell.
[16] Stefan Schaal, et al. Policy Gradient Methods for Robot Control, 2003.
[17] Vijay R. Konda, et al. On Actor-Critic Algorithms, 2003, SIAM J. Control. Optim.
[18] Spiros Skiadopoulos, et al. On the consistency of cardinal direction constraints, 2005, Artif. Intell.