Estimation of Network Parameters in Semiparametric Stochastic Perceptron

It was reported (Kabashima and Shinomoto 1992) that estimators of a binary decision boundary behave strangely in the asymptotic regime when the probability model is ill-posed or semiparametric. We give a rigorous analysis of this phenomenon in a stochastic perceptron by using the estimating function method. A stochastic perceptron consists of a neuron whose excitation probability depends on the weighted sum of its inputs; here the functional form of that probability is unknown. It is shown that there exists no √n-consistent estimator of the threshold value h, that is, no estimator ĥ that converges to h in the order of 1/√n as the number n of observations increases. Therefore, the accuracy of estimation is much worse in this semiparametric case, with an unspecified probability function, than in the ordinary case. On the other hand, it is shown that there is a √n-consistent estimator of the synaptic weight vector. These results elucidate the strange behaviors of learning curves in a semiparametric statistical model.
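
The data-generating model described above can be illustrated with a minimal sketch. It assumes the neuron fires with probability k(w·x − h), where w is the synaptic weight vector and h the threshold; this exact form, the Gaussian inputs, the logistic choice of k, and all function names are assumptions made for illustration only. In the semiparametric setting of the abstract, k is unknown to the estimator.

```python
import math
import random


def logistic(u):
    """Illustrative firing-probability function k(u); the abstract leaves k unspecified."""
    return 1.0 / (1.0 + math.exp(-u))


def sample_stochastic_perceptron(n, w, h, k=logistic, seed=0):
    """Draw n (input, output) pairs from a hypothetical stochastic perceptron.

    Inputs are standard Gaussian (an assumption); the output y is 1 with
    probability k(w.x - h) and 0 otherwise.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w]
        u = sum(wi * xi for wi, xi in zip(w, x)) - h  # weighted sum minus threshold
        y = 1 if rng.random() < k(u) else 0
        data.append((x, y))
    return data


data = sample_stochastic_perceptron(1000, w=[1.0, -0.5], h=0.3)
firing_rate = sum(y for _, y in data) / len(data)
```

An estimator in this setting sees only `data` and must recover `w` and `h` without knowing `k`, which is what makes the threshold hard to estimate.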

[1]  V. P. Godambe An Optimum Property of Regular Maximum Likelihood Estimation , 1960 .

[2]  E. Parzen On Estimation of a Probability Density Function and Mode , 1962 .

[3]  Cal Lib A Projection Pursuit Algorithm for Exploratory Data Analysis , 1974 .

[4]  John W. Tukey,et al.  A Projection Pursuit Algorithm for Exploratory Data Analysis , 1974, IEEE Transactions on Computers.

[5]  C. Manski Maximum Score Estimation of the Stochastic Utility Model of Choice , 1975 .

[6]  P. Ruud Sufficient Conditions for the Consistency of Maximum Likelihood Estimation Despite Misspecifications of Distribution in Multinomial Discrete Choice Models , 1983 .

[7]  W. J. Hall,et al.  Information and Asymptotic Efficiency in Parametric-Nonparametric Models , 1983 .

[8]  Robin Sibson,et al.  What is projection pursuit , 1987 .

[9]  S. Amari,et al.  Estimation in the Presence of Infinitely many Nuisance Parameters--Geometry of Estimating Functions , 1988 .

[10]  David Haussler,et al.  Predicting {0,1}-functions on randomly drawn points , 1988, [Proceedings 1988] 29th Annual Symposium on Foundations of Computer Science.

[11]  David Haussler,et al.  Predicting {0,1}-functions on randomly drawn points , 1988, COLT '88.

[12]  Ker-Chau Li,et al.  Regression Analysis Under Link Violation , 1989 .

[13]  K. Nawata Semiparametric estimation and efficiency bounds of binary choice models when the models contain one continuous variable , 1989 .

[14]  D. Pollard,et al.  Cube Root Asymptotics , 1990 .

[15]  Ker-Chau Li,et al.  Slicing Regression: A Link-Free Regression Method , 1991 .

[16]  Elie Bienenstock,et al.  Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.

[17]  Yoshiyuki Kabashima,et al.  Learning Curves for Error Minimum and Maximum Likelihood Algorithms , 1992, Neural Computation.

[18]  P. Bickel Efficient and Adaptive Estimation for Semiparametric Models , 1993 .

[19]  S. Amari,et al.  Differential Geometry of Estimating Functions in Semiparametric Statistical Models , 1993 .

[20]  K. Do,et al.  Efficient and Adaptive Estimation for Semiparametric Models. , 1994 .

[21]  Yoshiyuki Kabashima,et al.  Learning a Decision Boundary from Stochastic Examples: Incremental Algorithms with and without Queries , 1995, Neural Computation.

[22]  Shun-ichi Amari,et al.  Information geometry of the EM and em algorithms for neural networks , 1995, Neural Networks.