A bag-of-semantics model for image clustering

This paper presents a novel method for organizing a collection of images into a hierarchy of clusters based on image semantics. Given a set of raw images with no metadata as input, our method describes the semantics of each image with a bag-of-semantics model, i.e., a set of meaningful descriptors. The model is derived from the image’s Object Relation Network (Chen et al. in Proceedings of the 21st International Conference on World Wide Web, 2012), an expressive graph model that represents rich semantics for image objects and their relations. We adopt the class hierarchies in a guide ontology as lenses of different levels through which to view the bag-of-semantics models. Image clusters are extracted automatically by grouping images whose bag-of-semantics models are identical when viewed through a given lens. With a series of coarse-to-fine lenses, images are clustered in a top-down hierarchical manner. In addition, because users can have different perspectives on how images should be clustered, our method lets each user control the clustering process while browsing and dynamically adjusts the clustering result to the user’s preferences.
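
The lens-based grouping described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each bag-of-semantics is reduced to a plain set of object class names (the relations carried by the Object Relation Network are ignored), and the toy ontology `PARENT`, the helper names, and the sample images are all hypothetical. A lens at depth `d` replaces each descriptor with its ancestor at depth `d` in the class hierarchy, and images whose lensed bags are identical fall into the same cluster. Applying progressively deeper lenses inside each cluster yields the top-down hierarchy.

```python
from collections import defaultdict

# Hypothetical guide ontology: each class maps to its parent (None marks the root).
PARENT = {
    "thing": None,
    "person": "thing",
    "ball": "thing",
    "athlete": "person",
    "soccer player": "athlete",
    "basketball player": "athlete",
    "soccer ball": "ball",
    "basketball": "ball",
}


def path_from_root(cls):
    """Return the classes on the path from the ontology root down to cls."""
    path = []
    while cls is not None:
        path.append(cls)
        cls = PARENT[cls]
    return path[::-1]


def view_through_lens(bag, depth):
    """Generalize every descriptor to its ancestor at the given depth.

    A descriptor that sits above that depth is kept unchanged.
    """
    lensed = set()
    for cls in bag:
        path = path_from_root(cls)
        lensed.add(path[min(depth, len(path) - 1)])
    return frozenset(lensed)


def cluster(images, depth, max_depth):
    """Group images whose bags are identical through the lens at this depth,
    then refine each multi-image group with the next, finer lens."""
    groups = defaultdict(list)
    for name, bag in images.items():
        groups[view_through_lens(bag, depth)].append(name)

    tree = {}
    for key, names in groups.items():
        if depth < max_depth and len(names) > 1:
            subset = {n: images[n] for n in names}
            tree[key] = cluster(subset, depth + 1, max_depth)
        else:
            tree[key] = sorted(names)
    return tree


# Hypothetical input: image name -> set of fine-grained descriptors.
images = {
    "img1": {"soccer player", "soccer ball"},
    "img2": {"basketball player", "basketball"},
    "img3": {"soccer player", "soccer ball"},
    "img4": {"soccer ball"},
}
hierarchy = cluster(images, depth=1, max_depth=3)
```

In this sketch the coarsest lens separates images that show a person with a ball from the image that shows only a ball. The next lens then splits the first group into soccer and basketball scenes. User control over the clustering could be approximated by letting the user decide which group to refine and how deep a lens to apply, though the sketch always refines every group.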
