Learning Road Traffic Control: Towards Practical Traffic Control Using Policy Gradients
[1] M. Peifer,et al. Traffic Control , 1966, Nature.
[2] D I Robertson,et al. "TRANSYT" METHOD FOR AREA TRAFFIC CONTROL , 1969 .
[3] R. Rescorla,et al. A theory of Pavlovian conditioning : Variations in the effectiveness of reinforcement and nonreinforcement , 1972 .
[4] A.G. Sims,et al. The Sydney coordinated adaptive traffic (SCAT) system philosophy and benefits , 1980, IEEE Transactions on Vehicular Technology.
[5] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..
[6] R. D. Bretherton,et al. Optimizing networks of traffic signals in real time - the SCOOT method , 1991 .
[7] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[8] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[9] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[10] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[11] Thomas L. Thorpe,et al. Traffic Light Control Using Sarsa with Three State Representations , 1996 .
[12] John Loch,et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes , 1998, ICML.
[13] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[14] Alex M. Andrew,et al. Reinforcement Learning: An Introduction , 1998 .
[15] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[16] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[17] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[18] Jussi Rintanen,et al. Complexity of Probabilistic Planning under Average Rewards , 2001, IJCAI.
[19] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[20] Nathan H. Gartner,et al. Traffic Flow Theory - A State-of-the-Art Report: Revised Monograph on Traffic Flow Theory , 2002 .
[21] Yann LeCun,et al. Large Scale Online Learning , 2003, NIPS.
[22] Bernhard Friedrich,et al. Data Fusion Techniques for Adaptive Traffic Signal Control , 2003 .
[23] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst..
[24] Wang,et al. Review of road traffic control strategies , 2003, Proceedings of the IEEE.
[25] Anne Condon,et al. On the undecidability of probabilistic planning and related stochastic optimization problems , 2003, Artif. Intell..
[26] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[27] A. Koopman,et al. Simulation and optimization of traffic in a city , 2004, IEEE Intelligent Vehicles Symposium, 2004.
[28] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[29] Andrew Y. Ng,et al. On Local Rewards and Scaling Distributed Reinforcement Learning , 2005, NIPS.
[30] E.H.J. Nijhuis,et al. Cooperative multi-agent reinforcement learning of traffic lights , 2005 .
[31] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[32] Carlos Guestrin,et al. A robust architecture for distributed inference in sensor networks , 2005, IPSN 2005. Fourth International Symposium on Information Processing in Sensor Networks, 2005..
[33] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.