Active Learning with Statistical Models

For many types of learners one can compute the statistically "optimal" way to select training data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, those for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
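To make the idea concrete, the sketch below illustrates variance-driven active learning for locally weighted regression in a simplified form: instead of the paper's criterion of minimizing expected integrated variance, it greedily queries the pool point where a residual-based variance proxy is largest. The function `f`, the kernel bandwidth `h`, and the `lwr_predict` helper are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # hypothetical target function to be learned (not from the paper)
    return np.sin(3 * x)

def lwr_predict(xq, X, y, h=0.2):
    """Kernel-weighted (Nadaraya-Watson) prediction at query xq,
    plus a crude variance proxy. A simplified illustration only."""
    w = np.exp(-0.5 * ((X - xq) / h) ** 2)   # Gaussian kernel weights
    w = w / w.sum()
    mu = np.sum(w * y)                        # locally weighted estimate
    resid_var = np.sum(w * (y - mu) ** 2)     # weighted residual variance
    n_eff = 1.0 / np.sum(w ** 2)              # effective number of neighbors
    return mu, resid_var / n_eff              # prediction, variance proxy

# start with a few random labeled points
X = rng.uniform(-1, 1, size=5)
y = f(X) + 0.05 * rng.normal(size=5)
pool = np.linspace(-1, 1, 101)                # candidate query locations

# greedily query where the estimated predictive variance is largest
for _ in range(10):
    var = [lwr_predict(x, X, y)[1] for x in pool]
    x_new = pool[int(np.argmax(var))]
    X = np.append(X, x_new)
    y = np.append(y, f(x_new) + 0.05 * rng.normal())

mse = float(np.mean([(lwr_predict(x, X, y)[0] - f(x)) ** 2 for x in pool]))
```

Because both the prediction and the variance proxy are closed-form sums over the training set, each candidate query can be scored exactly and cheaply, which is the efficiency advantage the abstract claims for locally weighted regression over neural networks.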

[1]  D. Naidu. Optimal Control Systems. 2018.

[2]  A. G. Butkovskiy. Optimal Control of Systems. 1966.

[3]  V. V. Fedorov. Theory of Optimal Experiments. Translated by W. J. Studden and E. M. Klimko. Academic Press, 1972.

[4]  A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 1977.

[5]  T. Hassard et al. Applied Linear Regression. 2005.

[6]  A. F. M. Smith et al. Statistical Analysis of Finite Mixture Distributions. 1986.

[7]  G. E. P. Box and N. R. Draper. Empirical Model-Building and Response Surfaces. 1988.

[8]  J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

[9]  W. S. Cleveland et al. Regression by local fitting: methods, properties, and computational algorithms. 1988.

[10]  M. Self et al. Bayesian classification. In AAAI, 1988.

[11]  D. A. Cohn et al. Training connectionist networks with queries and selective sampling. In NIPS, 1989.

[12]  D. F. Specht. A general regression neural network. IEEE Transactions on Neural Networks, 1991.

[13]  S. J. Nowlan. Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures. 1991.

[14]  S. Thrun et al. Active exploration in dynamic environments. In NIPS, 1991.

[15]  E. B. Baum. Neural net algorithms that learn in polynomial time from examples and queries. IEEE Transactions on Neural Networks, 1991.

[16]  D. J. C. MacKay. Information-based objective functions for active data selection. Neural Computation, 1992.

[17]  S. Geman, E. Bienenstock, and R. Doursat. Neural networks and the bias/variance dilemma. Neural Computation, 1992.

[18]  D. A. Cohn. Neural network exploration using optimal experiment design. In NIPS, 1993.

[19]  M. Plutowski and H. White. Selecting concise training sets from clean data. IEEE Transactions on Neural Networks, 1993.

[20]  Z. Ghahramani and M. I. Jordan. Supervised learning from incomplete data via an EM approach. In NIPS, 1993.

[21]  S. Schaal and C. G. Atkeson. Robot juggling: implementation of memory-based learning. IEEE Control Systems, 1994.

[22]  G. Paass et al. Bayesian query construction for neural network models. In NIPS, 1994.

[23]  S. Hochreiter et al. Reinforcement driven information acquisition in nondeterministic environments. 1995.

[24]  D. A. Cohn. Minimizing statistical bias with queries. In NIPS, 1996.