论文信息 - Transforming Neural-Net Output Levels to Probability Distributions

Transforming Neural-Net Output Levels to Probability Distributions

(1) The outputs of a typical multi-output classification network do not satisfy the axioms of probability; probabilities should be positive and sum to one. This problem can be solved by treating the trained network as a preprocessor that produces a feature vector that can be further processed, for instance by classical statistical estimation techniques. (2) We present a method for computing the first two moments of the probability distribution indicating the range of outputs that are consistent with the input and the training data. It is particularly useful to combine these two ideas: we implement the ideas of section 1 using Parzen windows, where the shape and relative size of each window is computed using the ideas of section 2. This allows us to make contact between important theoretical ideas (e.g. the ensemble formalism) and practical techniques (e.g. back-prop). Our results also shed new light on and generalize the well-known "softmax" scheme.

Yann LeCun | John S. Denker | Yann LeCun | J. Denker

[1] Richard O. Duda,et al. Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[2] Peter E. Hart,et al. Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[3] Lawrence D. Jackel,et al. Large Automatic Learning, Rule Extraction, and Generalization , 1987, Complex Syst..

[4] Yann LeCun,et al. Improving the convergence of back-propagation learning with second-order methods , 1989 .

[5] John S. Bridle,et al. Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters , 1989, NIPS.

[6] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.

[7] Naftali Tishby,et al. Consistent inference of probabilities in layered networks: predictions and generalizations , 1989, International 1989 Joint Conference on Neural Networks.

[8] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.