Learning visual representations for perception-action systems

We discuss vision as a sensory modality for systems that interact flexibly with uncontrolled environments. Instead of trying to build a generic vision system that produces task-independent representations, we argue in favor of task-specific, learn-able representations. This concept is illustrated by two examples of our own work. First, our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension, RLJC, additionally handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a non-parametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.

[1]  Sean R Eddy,et al.  What is dynamic programming? , 2004, Nature Biotechnology.

[2]  Karun B. Shimoga,et al.  Robot Grasp Synthesis Algorithms: A Survey , 1996, Int. J. Robotics Res..

[3]  E. Reed The Ecological Approach to Visual Perception , 1989 .

[4]  Jing Peng,et al.  Efficient Learning and Planning Within the Dyna Framework , 1993, Adapt. Behav..

[5]  Andrew Y. Ng,et al.  Policy search via the signed derivative , 2009, Robotics: Science and Systems.

[6]  Stefan Schaal,et al.  2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .

[7]  Danica Kragic,et al.  Selection of robot pre-grasps using box-based shape approximation , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[8]  Justus H. Piater,et al.  Task-Driven Discretization of the Joint Space of Visual Percepts and Continuous Actions , 2006, ECML.

[9]  John N. Tsitsiklis,et al.  Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[10]  Martin A. Riedmiller,et al.  Reinforcement learning for robot soccer , 2009, Auton. Robots.

[11]  Elie Bienenstock,et al.  Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.

[12]  Randal E. Bryant,et al.  Symbolic Boolean manipulation with ordered binary-decision diagrams , 1992, CSUR.

[13]  Danica Kragic,et al.  A strategy for grasping unknown objects based on co-planarity and colour information , 2010, Robotics Auton. Syst..

[14]  Justus H. Piater,et al.  Refining grasp affordance models by experience , 2010, 2010 IEEE International Conference on Robotics and Automation.

[15]  Florentin Wörgötter,et al.  Accumulated Visual Representation for Cognitive Vision , 2008, BMVC.

[16]  Florentin Wörgötter,et al.  Visual Primitives: Local, Condensed, Semantically Rich Visual Descriptors and their Applications in Robotics , 2010, Int. J. Humanoid Robotics.

[17]  Sameer A. Nene,et al.  Columbia Object Image Library (COIL100) , 1996 .

[18]  N. Kruger,et al.  Learning object-specific grasp affordance densities , 2009, 2009 IEEE 8th International Conference on Development and Learning.

[19]  Nicolas Pugeault,et al.  Early cognitive vision: feedback mechanisms for the disambiguation of early visual representation , 2008 .

[20]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[21]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[22]  Andrew W. Moore,et al.  The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 2004, Machine Learning.

[23]  Hugh F. Durrant-Whyte,et al.  Simultaneous localization and mapping: part I , 2006, IEEE Robotics & Automation Magazine.

[24]  Ashutosh Saxena,et al.  Robotic Grasping of Novel Objects using Vision , 2008, Int. J. Robotics Res..

[25]  Hugh Durrant-Whyte,et al.  Simultaneous Localisation and Mapping ( SLAM ) : Part I The Essential Algorithms , 2006 .

[26]  Michael Isard,et al.  Nonparametric belief propagation , 2010, Commun. ACM.

[27]  Manuel Lopes,et al.  Learning Object Affordances: From Sensory--Motor Coordination to Imitation , 2008, IEEE Transactions on Robotics.

[28]  S. Ullman Visual routines , 1984, Cognition.

[29]  Nozha Boujemaa,et al.  Object-based queries using color points of interest , 2001, Proceedings IEEE Workshop on Content-Based Access of Image and Video Libraries (CBAIVL 2001).

[30]  Justus H. Piater,et al.  Learning Objects and Grasp Affordances through Autonomous Exploration , 2009, ICVS.

[31]  Jeff G. Schneider,et al.  Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).

[32]  Leslie Pack Kaelbling,et al.  Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..

[33]  V. Braitenberg Vehicles, Experiments in Synthetic Psychology , 1984 .

[34]  Chris Watkins,et al.  Learning from delayed rewards , 1989 .

[35]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[36]  Justus H. Piater,et al.  Interactive learning of mappings from visual percepts to actions , 2005, ICML.

[37]  Justus H. Piater,et al.  Closed-Loop Learning of Visual Control Policies , 2011, J. Artif. Intell. Res..

[38]  Sébastien Jodogne,et al.  Learning, then Compacting Visual Policies , 2005 .

[39]  Pierre Geurts,et al.  Iteratively Extending Time Horizon Reinforcement Learning , 2003, ECML.

[40]  Markus Lappe,et al.  Biologically Motivated Multi-modal Processing of Visual Primitives , 2003 .

[41]  Henrik I. Christensen,et al.  Automatic grasp planning using shape primitives , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).

[42]  Hugh Durrant-Whyte,et al.  Simultaneous localization and mapping (SLAM): part II , 2006 .

[43]  M. Goodale,et al.  The visual brain in action , 1995 .

[44]  J. Peng,et al.  Efficient Learning and Planning Within the Dyna Framework , 1993, IEEE International Conference on Neural Networks.

[45]  R. A. Brooks,et al.  Intelligence without Representation , 1991, Artif. Intell..

[46]  M. Arterberry,et al.  The Cradle of Knowledge: Development of Perception in Infancy , 1998 .

[47]  Justus H. Piater,et al.  Task-Driven Learning of Spatial Combinations of Visual Features , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[48]  Justus H. Piater,et al.  A Probabilistic Framework for 3D Visual Object Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[49]  Wolfram Burgard,et al.  Probabilistic Robotics (Intelligent Robotics and Autonomous Agents) , 2005 .

[50]  Danica Kragic,et al.  Birth of the Object: Detection of Objectness and Extraction of Object Shape through Object-Action complexes , 2008, Int. J. Humanoid Robotics.

[51]  M. Paradiso,et al.  Neuroscience: Exploring the Brain , 1996 .