From Sensory Signals to Modality-Independent Conceptual Representations: A Probabilistic Language of Thought Approach

People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models—that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model’s percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects’ ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception.
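The three hypothesized components can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a shape is a short sequence of part sizes drawn from a toy probabilistic grammar, the "visual" and "haptic" forward models are hand-written feature functions standing in for the graphics toolkit and the hand simulator, and the posterior is computed by exhaustive enumeration over a capped hypothesis space, swapped in for whatever inference algorithm the model actually uses. The names `PART_SIZES`, `P_STOP`, `MAX_PARTS`, `visual_forward`, `haptic_forward`, and `map_shape` are hypothetical.

```python
import itertools
import math

PART_SIZES = (1, 2, 3)  # hypothetical part vocabulary
P_STOP = 0.5            # hypothetical probability of ending the part sequence
MAX_PARTS = 3           # cap so the hypothesis space can be enumerated


def log_prior(shape):
    """Log probability of a part sequence under the toy grammar
    S -> Part S (prob 1 - P_STOP) | Part (prob P_STOP),
    Part -> uniform over PART_SIZES.
    The cap at MAX_PARTS truncates the grammar; posterior() renormalizes."""
    n = len(shape)
    return ((n - 1) * math.log(1 - P_STOP) + math.log(P_STOP)
            - n * math.log(len(PART_SIZES)))


def visual_forward(shape):
    """Stand-in for a renderer: per-part sizes, zero-padded to MAX_PARTS."""
    return [float(s) for s in shape] + [0.0] * (MAX_PARTS - len(shape))


def haptic_forward(shape):
    """Stand-in for a hand simulator: cumulative extents along the object,
    padded by repeating the total extent."""
    totals = list(itertools.accumulate(shape))
    totals += [totals[-1]] * (MAX_PARTS - len(shape))
    return [float(t) for t in totals]


FORWARD_MODELS = {"visual": visual_forward, "haptic": haptic_forward}


def log_likelihood(signal, predicted, sigma=0.5):
    """Unnormalized isotropic Gaussian noise model on sensory features."""
    sq_err = sum((a - b) ** 2 for a, b in zip(signal, predicted))
    return -sq_err / (2 * sigma ** 2)


def hypotheses():
    """Enumerate every part sequence of length 1..MAX_PARTS."""
    for n in range(1, MAX_PARTS + 1):
        yield from itertools.product(PART_SIZES, repeat=n)


def posterior(signals):
    """Invert the forward models by enumeration.
    signals maps a modality name to its observed feature vector."""
    scores = {}
    for h in hypotheses():
        score = log_prior(h)
        for modality, signal in signals.items():
            score += log_likelihood(signal, FORWARD_MODELS[modality](h))
        scores[h] = score
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return {h: math.exp(s - top) / z for h, s in scores.items()}


def map_shape(signals):
    """Most probable modality-independent representation given the signals."""
    post = posterior(signals)
    return max(post, key=post.get)


if __name__ == "__main__":
    true_shape = (3, 1, 2)
    seen = {"visual": visual_forward(true_shape)}
    felt = {"haptic": haptic_forward(true_shape)}
    print(map_shape(seen), map_shape(felt), map_shape({**seen, **felt}))
```

In this toy setting the two modalities deliver different feature vectors for the same object, yet inverting either forward model (or both jointly) recovers the same part sequence, which is the sense of modality invariance described in the abstract. The real model's grammar, features, and inference procedure are far richer than this sketch.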
