A novel image classifier based on Gaussian mixture language model

In this paper, we propose a novel Gaussian Mixture Language Model to address the shortcomings of the traditional bag of visual words (BoVW) model. We first exploit image semantic information to learn a new distance metric that minimizes the loss of image information, and then train Gaussian Mixture Models (GMM) under this distance metric to generate a codebook. Given a test image, a visual document is first constructed using this codebook, and the image is then assigned to the category whose language model gives the document the maximum probability. Experiments show that the codebook generated by our method effectively reflects image semantic information and is well suited to the language model, and confirm that the proposed method is competitive with the traditional BoVW-based method as well as other state-of-the-art methods.
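
The classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy codebook of diagonal-covariance Gaussians given by hand (the learned distance metric and GMM training are omitted), a unigram language model with add-alpha smoothing, and hypothetical helper names (`to_visual_document`, `train_unigram`, `classify`) and category labels chosen only for illustration.

```python
import math
from collections import Counter


def diag_gauss_logpdf(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def to_visual_document(descriptors, codebook):
    """Map each local descriptor to the index of its most likely component.

    codebook is a list of (weight, mean, variance) tuples; this hard
    assignment is an assumption made for the sketch.
    """
    doc = []
    for x in descriptors:
        scores = [math.log(w) + diag_gauss_logpdf(x, mean, var)
                  for w, mean, var in codebook]
        doc.append(max(range(len(scores)), key=scores.__getitem__))
    return doc


def train_unigram(docs_by_class, vocab_size, alpha=1.0):
    """Estimate smoothed per-category log-probabilities of each visual word."""
    models = {}
    for label, docs in docs_by_class.items():
        counts = Counter(w for doc in docs for w in doc)
        total = sum(counts.values()) + alpha * vocab_size
        models[label] = [math.log((counts[w] + alpha) / total)
                         for w in range(vocab_size)]
    return models


def classify(doc, models):
    """Return the category whose language model scores the document highest."""
    return max(models, key=lambda c: sum(models[c][w] for w in doc))


# Toy data: a two-word codebook and one training document per category.
codebook = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])]
models = train_unigram({"sky": [[0, 0, 0, 1]], "grass": [[1, 1, 1, 0]]},
                       vocab_size=len(codebook))
test_doc = to_visual_document([[0.1, -0.2], [0.3, 0.1], [4.8, 5.2]], codebook)
predicted = classify(test_doc, models)
```

Working in log-probabilities avoids numerical underflow when a visual document contains many words. A higher-order (for example bigram) language model over neighbouring visual words would follow the same pattern with conditional counts in place of unigram counts.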
