Interactive Perception: Leveraging Action in Perception and Perception in Action

Recent approaches in robot perception follow the insight that perception is facilitated by interaction with the environment. These approaches are subsumed under the term Interactive Perception (IP). This view of perception offers two benefits. First, interaction with the environment creates a rich sensory signal that would otherwise not be present. Second, knowledge of the regularity in the combined space of sensory data and action parameters facilitates the prediction and interpretation of that signal. In this survey, we postulate this as a principle for robot perception and collect evidence in its support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of IP and close by discussing the remaining open questions. With this survey, we hope to help define the field of Interactive Perception and to provide a valuable resource for future research.
