Noise sensitivity signatures for model selection

We present a method for calculating the "noise sensitivity signature" of a learning algorithm, based on scrambling the output classes of various fractions of the training data. This signature indicates how well the complexity of the classifier matches the complexity of the data, and can therefore be used to improve the predictive accuracy of a classification algorithm. Noise sensitivity signatures differ from other schemes for avoiding overtraining, such as cross-validation, which trains on only part of the training data, and penalty functions, which are not data-adaptive. Noise sensitivity signature methods use all of the training data and are data-adaptive and nonparametric, which makes them well suited to situations with limited training data.
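The scrambling procedure can be illustrated with a minimal sketch. The abstract does not define the signature precisely, so everything below is an assumption made for illustration: the signature is taken to be the mean training error on the scrambled labels as a function of the scrambled fraction, scrambling is done by reassigning a random class to each selected point, and the names scramble_labels, fit_stump, fit_1nn, and noise_sensitivity_signature are hypothetical. The two toy classifiers (a one-threshold stump and a 1-nearest-neighbour rule on one-dimensional inputs) stand in for a low-complexity and a high-complexity learner.

```python
import random


def scramble_labels(labels, fraction, rng):
    """Copy `labels`, giving a random class to `fraction` of the entries (assumed scheme)."""
    noisy = list(labels)
    classes = sorted(set(labels))
    n_scramble = round(fraction * len(labels))
    for i in rng.sample(range(len(labels)), n_scramble):
        noisy[i] = rng.choice(classes)
    return noisy


def fit_stump(xs, ys):
    """Low-complexity learner: the single threshold with the lowest training error."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            err = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def fit_1nn(xs, ys):
    """High-complexity learner: 1-nearest-neighbour, which memorizes the training set."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def noise_sensitivity_signature(fit, xs, ys, fractions, trials=20, seed=0):
    """Mean training error on scrambled labels, one value per noise fraction.

    This choice of measured quantity is an assumption of this sketch,
    not necessarily the definition used in the paper.
    """
    rng = random.Random(seed)
    signature = []
    for fraction in fractions:
        total = 0.0
        for _ in range(trials):
            noisy = scramble_labels(ys, fraction, rng)
            predict = fit(xs, noisy)
            total += sum(predict(x) != y for x, y in zip(xs, noisy)) / len(xs)
        signature.append(total / trials)
    return signature


# Toy data: 40 distinct points on [0, 1) with a clean class boundary at 0.5.
xs = [i / 40 for i in range(40)]
ys = [0 if x < 0.5 else 1 for x in xs]
fractions = [0.0, 0.25, 0.5, 1.0]

sig_stump = noise_sensitivity_signature(fit_stump, xs, ys, fractions)
sig_1nn = noise_sensitivity_signature(fit_1nn, xs, ys, fractions)
print("stump:", sig_stump)
print("1-NN: ", sig_1nn)
```

In this sketch the 1-nearest-neighbour rule reproduces every scrambled label, so its signature stays flat at zero, while the stump cannot fit scrambled labels and its error rises with the noise fraction. A flat signature here suggests a classifier complex enough to memorize noise. Note that all training points are used at every noise level, with no held-out subset.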