Node label matching improves classification performance in Deep Belief Networks

When the output signals of an artificial neural network classifier are interpreted node by node as class-label predictors, the partial knowledge the network encodes during learning can be exploited to reassign which output node represents each class label, improving both learning speed and final classification accuracy. Our method computes these reassignments by maximizing the average correlation between actual node outputs and target labels over a small labeled validation set. Node Label Matching is an ancillary method for both supervised and unsupervised learning in artificial neural networks; we demonstrate its integration with Contrastive Divergence pre-training in Restricted Boltzmann Machines and backpropagation fine-tuning in Deep Belief Networks. We introduce the Segmented Density Random Binary dataset and present empirical results of Node Label Matching on both this synthetic data and a subset of the MNIST benchmark.
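The abstract describes the matching criterion only at a high level, so the following is a minimal sketch of how such a reassignment could be computed, assuming one-hot target labels, a correlation matrix between every output node and every class column, and a one-to-one assignment solved with the Hungarian algorithm; the function and variable names are illustrative and not taken from the paper.

```python
# Hypothetical sketch of Node Label Matching (not the authors' reference code).
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_node_labels(outputs, targets):
    """Reassign output nodes to class labels by maximum average correlation.

    outputs: (n_samples, n_nodes) actual output-node activations on a small
             labeled validation set.
    targets: (n_samples, n_classes) one-hot target labels.
    Returns assignment, where assignment[c] is the output node chosen to
    represent class label c.
    """
    n_nodes = outputs.shape[1]
    n_classes = targets.shape[1]

    # Correlation between every class-label column and every output node.
    corr = np.empty((n_classes, n_nodes))
    for c in range(n_classes):
        for j in range(n_nodes):
            corr[c, j] = np.corrcoef(targets[:, c], outputs[:, j])[0, 1]

    # Pick the one-to-one assignment that maximizes total (hence average)
    # correlation; linear_sum_assignment minimizes cost, so negate.
    _, assignment = linear_sum_assignment(-corr)
    return assignment
```

Under these assumptions the returned permutation can then be applied to the output layer before continuing pre-training or fine-tuning, so that each node's subsequent updates target the label it already predicts best.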
