Image Representations and Feature Selection for Multimedia Database Search

The success of a multimedia information system depends heavily on how its data is represented. Although there are "natural" ways to represent numerical data, it is not clear what constitutes a good representation for multimedia data such as images, video, or sound. We investigate various image representations, judging the quality of each by how well a system for searching an image database performs with it, although the same techniques and representations can be used for other object detection tasks or multimedia data analysis problems. The system is based on a machine learning method that develops object detection models from example images; these models can subsequently be used, for example, to detect (that is, search for) images of a particular object in an image database. As the base classifier for the detection task, we use support vector machines (SVMs), a kernel-based learning method. Within the framework of kernel classifiers, we investigate new image representations (kernels) derived from probabilistic models of the class of images considered, and we present a new feature selection method that can reduce the dimensionality of the image representation without significant loss in the performance of the detection (search) system.
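As a rough illustration of the general idea (train an SVM detector, then reduce the dimensionality of the representation without losing detection performance), the following minimal sketch may help. It is not the method of this work: it swaps in a bias-free linear SVM trained by subgradient descent on the hinge loss, and it ranks features by weight magnitude rather than using the feature selection method or the probabilistic kernels described above. The toy data and every function name (`train_linear_svm`, `select_features`, `make_toy_data`, and so on) are hypothetical and chosen only for this example.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss.
    Assumption: no bias term, because the toy data is symmetric about 0."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]           # gradient of the regulariser
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:                      # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

def accuracy(w, X, y):
    return sum(predict(w, x) == yi for x, yi in zip(X, y)) / len(X)

def select_features(w, k):
    """Keep the k features with the largest absolute weight
    (a simple stand-in heuristic, not the paper's selection method)."""
    ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return sorted(ranked[:k])

def project(X, keep):
    """Restrict every sample to the selected feature indices."""
    return [[x[j] for j in keep] for x in X]

def make_toy_data(n_per_class=20, seed=0):
    """Hypothetical 4-dimensional 'image features': dimensions 0 and 1
    carry the class signal, dimensions 2 and 3 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            informative = [label * (2 + rng.random()) for _ in range(2)]
            noise = [rng.uniform(-1, 1) for _ in range(2)]
            X.append(informative + noise)
            y.append(label)
    return X, y

X, y = make_toy_data()
w_full = train_linear_svm(X, y)                 # detector on all features
keep = select_features(w_full, k=2)             # reduced representation
w_small = train_linear_svm(project(X, keep), y)
acc_full = accuracy(w_full, X, y)
acc_small = accuracy(w_small, project(X, keep), y)
print(keep, acc_full, acc_small)
```

On this toy data the two noise dimensions are discarded and the detector retrained on the reduced representation classifies the training examples as well as the full one. A real system would use image-derived features, a nonlinear kernel, and held-out data for evaluation.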
