Paper Information - Learn to Swing Up and Balance a Real Pole Based on Raw Visual Input Data

For the challenging pole balancing task, we propose a system that uses raw visual input data for reinforcement learning to evolve a control strategy. To this end, we use a neural network (a deep autoencoder) to encode the camera images, and thus the system states, in a low-dimensional feature space. The system is compared to controllers that work directly on the motor sensor data. We show that the performance of both systems is of the same order of magnitude.
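To make the encoding step concrete, the following is a minimal sketch of an autoencoder that compresses flattened image frames into a low-dimensional feature vector. It is not the architecture from the paper: it assumes a single tanh hidden layer instead of a deep network, synthetic one-dimensional "frames" instead of camera images, and plain full-batch gradient descent. All names (`make_toy_frames`, `init_autoencoder`, `encode`, `train_step`) and all sizes are hypothetical.

```python
import numpy as np


def make_toy_frames(n_frames, n_pixels=16):
    """Synthetic 1-D 'camera frames': a bright blob whose position follows a pole angle."""
    angles = np.linspace(-np.pi, np.pi, n_frames)
    centers = (n_pixels - 1) / 2 + (n_pixels / 2 - 2) * np.sin(angles)
    pixel_idx = np.arange(n_pixels)
    return np.exp(-0.5 * (pixel_idx[None, :] - centers[:, None]) ** 2)


def init_autoencoder(n_pixels, n_features, seed=0):
    """Small random weights for a one-hidden-layer autoencoder (assumed sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(0.0, 0.1, (n_pixels, n_features)),
        "b_enc": np.zeros(n_features),
        "W_dec": rng.normal(0.0, 0.1, (n_features, n_pixels)),
        "b_dec": np.zeros(n_pixels),
    }


def encode(params, images):
    """Map frames of shape (n, n_pixels) to features of shape (n, n_features)."""
    return np.tanh(images @ params["W_enc"] + params["b_enc"])


def decode(params, features):
    """Map features back to reconstructed frames (linear output layer)."""
    return features @ params["W_dec"] + params["b_dec"]


def reconstruction_loss(params, images):
    """Mean squared error between frames and their reconstructions."""
    recon = decode(params, encode(params, images))
    return float(np.mean((recon - images) ** 2))


def train_step(params, images, lr=0.5):
    """One full-batch gradient-descent step on the reconstruction loss."""
    z = encode(params, images)
    recon = decode(params, z)
    d_recon = 2.0 * (recon - images) / images.size  # dLoss/d_recon
    d_pre = (d_recon @ params["W_dec"].T) * (1.0 - z ** 2)  # back through tanh
    grads = {
        "W_dec": z.T @ d_recon,
        "b_dec": d_recon.sum(axis=0),
        "W_enc": images.T @ d_pre,
        "b_enc": d_pre.sum(axis=0),
    }
    for name, grad in grads.items():
        params[name] -= lr * grad


frames = make_toy_frames(64)
params = init_autoencoder(n_pixels=16, n_features=2)
for _ in range(300):
    train_step(params, frames)
features = encode(params, frames)  # low-dimensional state estimate per frame
```

In a setup like the one the abstract describes, a reinforcement learning controller would then take `features` as its state input in place of the motor sensor readings.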

Martin A. Riedmiller | Sascha Lange | Jan Mattner

[1] Lothar Wenzel, et al. Computer vision based inverted pendulum, 2000, Proceedings of the 17th IEEE Instrumentation and Measurement Technology Conference [Cat. No. 00CH37066].

[2] Liming Xiang, et al. Kernel-Based Reinforcement Learning, 2006, ICIC.

[3] Tao Xiong, et al. A combined SVM and LDA approach for classification, 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005.

[4] Haoping Wang, et al. Hybrid control for vision based Cart-Inverted Pendulum system, 2008, 2008 American Control Conference.

[5] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.

[6] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.

[7] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.

[8] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[9] Martin A. Riedmiller, et al. Autonomous reinforcement learning on raw visual input data in a real world application, 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[10] Martin A. Riedmiller. Neural reinforcement learning to swing-up and balance a real pole, 2005, 2005 IEEE International Conference on Systems, Man and Cybernetics.

[11] Konstantinos V. Katsikopoulos, et al. Markov decision processes with delays and asynchronous cost collection, 2003, IEEE Trans. Autom. Control.

[12] Luca Maria Gambardella, et al. Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition, 2010, ArXiv.

[13] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[14] B. Widrow, et al. An adaptive 'broom balancer' with visual inputs, 1988, IEEE 1988 International Conference on Neural Networks.

[15] Luca Maria Gambardella, et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition, 2010, Neural Computation.