C-Trace: A new algorithm for reinforcement learning of robotic control

There has been much recent interest in the potential of reinforcement learning techniques for control in autonomous robotic agents. Implementing effective reinforcement learning in a real-world robotic environment still raises many open questions. Are standard reinforcement learning algorithms like Watkins' Q-learning appropriate, or are other approaches more suitable? Some specific issues to be considered are noise/disturbance and the possibly non-Markovian aspects of the control problem. These are the particular issues we focus upon in this paper. The test-bed for the experiments described in this paper is a real six-legged insectoid walking robot; the task set is to learn an effectively coordinated walking gait. The performance of a new algorithm we call C-Trace is compared to that of Watkins' well-known 1-step Q-learning algorithm. We discuss the markedly superior performance of this new algorithm in the context of both theoretical and existing empirical results regarding learning in noisy and non-Markovian domains.