Information geometry of the EM and em algorithms for neural networks

[1]  S. Amari,et al.  Information geometry of estimating functions in semi-parametric statistical models , 1997 .

[2]  Michael I. Jordan,et al.  Convergence results for the EM approach to mixtures of experts architectures , 1995, Neural Networks.

[3]  S. Geman,et al.  Hidden Markov Random Fields , 1995 .

[4]  S. Amari,et al.  Gradient systems in view of information geometry , 1995 .

[5]  Shun-ichi Amari,et al.  The EM Algorithm and Information Geometry in Neural Network Learning , 1995, Neural Computation.

[6]  S. Amari,et al.  The EM algorithm and information geometry in neural networks , 1995 .

[7]  Motoaki Kawanabe,et al.  Estimation of Network Parameters in Semiparametric Stochastic Perceptron , 1994, Neural Computation.

[8]  Roy L. Streit,et al.  Maximum likelihood training of probabilistic neural networks , 1994, IEEE Trans. Neural Networks.

[9]  Alan F. Murray,et al.  Enhanced MLP performance and fault tolerance resulting from synaptic weight noise during training , 1994, IEEE Trans. Neural Networks.

[10]  Alan L. Yuille,et al.  Statistical Physics, Mixtures of Distributions, and the EM Algorithm , 1994, Neural Computation.

[11]  Pierre Baldi,et al.  Smooth On-Line Learning Algorithms for Hidden Markov Models , 1994, Neural Computation.

[12]  John G. Taylor,et al.  Noisy reinforcement training for pRAM nets , 1994, Neural Networks.

[13]  S. Amari [Neural Networks: A Review from Statistical Perspective]: Comment , 1994 .

[14]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[15]  Shun-ichi Amari,et al.  Differential geometric structures of stable state feedback systems with dual connections , 1992, Kybernetika.

[16]  Hidetoshi Shimodaira A new criterion for selecting models from partially observed data , 1994 .

[17]  J. Besag,et al.  Spatial Statistics and Bayesian Computation , 1993 .

[18]  Radford M. Neal A new view of the EM algorithm that justifies incremental and other variants , 1993 .

[19]  A. Fujiwara Dualistic Dynamical Systems in the Framework of Information Geometry , 1993 .

[20]  M. Murray,et al.  Differential Geometry and Statistics , 1993 .

[21]  William J. Byrne,et al.  Alternating minimization and Boltzmann machine learning , 1992, IEEE Trans. Neural Networks.

[22]  Shun-ichi Amari,et al.  Information geometry of Boltzmann machines , 1992, IEEE Trans. Neural Networks.

[23]  Shun-ichi Amari,et al.  Identifiability of hidden Markov information sources and their minimum degrees of freedom , 1992, IEEE Trans. Inf. Theory.

[24]  C. R. Rao,et al.  Information and the Accuracy Attainable in the Estimation of Statistical Parameters , 1992 .

[25]  Shun-ichi Amari,et al.  Dualistic geometry of the manifold of higher-order neurons , 1991, Neural Networks.

[26]  S. Amari,et al.  Asymptotic theory of sequential estimation : Differential geometrical approach , 1991 .

[27]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[28]  L. Saulis,et al.  Limit theorems for large deviations , 1991 .

[29]  Shun-ichi Amari,et al.  Mathematical foundations of neurocomputing , 1990, Proc. IEEE.

[30]  Takashi Kurose Dual connections and affine geometry , 1990 .

[31]  Halbert White,et al.  Learning in Artificial Neural Networks: A Statistical Perspective , 1989, Neural Computation.

[32]  S. Amari Fisher information under restriction of Shannon information in multi-terminal situations , 1989 .

[33]  R. Kass The Geometry of Asymptotic Inference , 1989 .

[34]  O. Barndorff-Nielsen,et al.  Approximating exponential models , 1989 .

[35]  Shun-ichi Amari,et al.  Statistical inference under multiterminal rate restrictions: A differential geometric approach , 1989, IEEE Trans. Inf. Theory.

[36]  D. Cox,et al.  The role of differential geometry in statistical theory , 1986 .

[37]  Shun-ichi Amari,et al.  Differential-geometrical methods in statistics , 1985 .

[38]  Geoffrey E. Hinton,et al.  A Learning Algorithm for Boltzmann Machines , 1985, Cogn. Sci.

[39]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Shun-ichi Amari,et al.  Differential geometry of statistical inference , 1983 .

[41]  S. Amari Differential Geometry of Curved Exponential Families-Curvatures and Information Loss , 1982 .

[42]  N. N. Chent︠s︡ov Statistical decision rules and optimal inference , 1982 .

[43]  O. Barndorff-Nielsen Information and Exponential Families in Statistical Theory , 1980 .

[44]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[45]  I. Csiszár $I$-Divergence Geometry of Probability Distributions and Minimization Problems , 1975 .

[46]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[47]  N. L. Johnson Linear Statistical Inference and Its Applications , 1966 .

[48]  Calyampudi R. Rao,et al.  Linear Statistical Inference and Its Applications. , 1975 .

[49]  M. Kendall Theoretical Statistics , 1956, Nature.