Learning Visual Representations for Interactive Systems

We describe two quite different methods for associating action parameters with visual percepts. Our first method, the RLVC algorithm, performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that splits perceptual states so as to reduce perceptual aliasing. The result is an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension, RLJC, also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. With these models, the method associates grasp experiences that it learns autonomously by trial and error. These experiences form a nonparametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, detecting an object in a novel scene simultaneously produces suitable grasping options.
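
To make the idea of a nonparametric grasp density concrete, the following is a minimal sketch, not the implementation described here. It assumes, for illustration only, that a gripper pose is reduced to a 3D position and that the density is a Gaussian kernel estimate over the positions of successful grasps. The names `make_grasp_density` and `bandwidth`, and all sample values, are hypothetical.

```python
import math


def make_grasp_density(successful_positions, bandwidth=0.02):
    """Build a kernel density estimate from successful grasp positions.

    Assumption of this sketch: a pose is only a 3D position (metres) and
    the kernel is an isotropic Gaussian with standard deviation `bandwidth`.
    """
    samples = [tuple(float(c) for c in p) for p in successful_positions]
    if not samples:
        raise ValueError("at least one successful grasp is required")
    dim = len(samples[0])
    # Normalisation so that the estimate integrates to one.
    norm = len(samples) * (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)

    def density(query):
        # Sum one Gaussian kernel per remembered successful grasp.
        total = 0.0
        for sample in samples:
            sq_dist = sum((q - s) ** 2 for q, s in zip(query, sample))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        return total / norm

    return density


# Hypothetical grasp experiences, expressed in the object's reference frame.
successes = [(0.10, 0.00, 0.05), (0.11, 0.01, 0.05), (0.09, -0.01, 0.06)]
density = make_grasp_density(successes)

# Rank candidate gripper positions by estimated grasp success density.
candidates = [(0.10, 0.00, 0.05), (0.30, 0.20, 0.05)]
best = max(candidates, key=density)
```

Because the samples are stored relative to the object, an estimated object pose from the detection step would let such a density be transformed into the scene, which is one way to read the claim that detection yields grasping options at the same time.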
