DIVERGENCE FUNCTION, INFORMATION MONOTONICITY AND INFORMATION GEOMETRY
[1] S. M. Ali, et al. A General Class of Coefficients of Divergence of One Distribution from Another, 1966.
[2] Shun-ichi Amari, et al. Dynamics of Learning in Multilayer Perceptrons Near Singularities, 2008, IEEE Transactions on Neural Networks.
[3] S. Amari. Integration of Stochastic Models by Minimizing $\alpha$-Divergence, 2007, Neural Computation.
[4] L. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, 1967.
[5] N. Čencov. Statistical Decision Rules and Optimal Inference, 2000.
[6] Shun-ichi Amari, et al. $\alpha$-Divergence Is Unique, Belonging to Both $f$-Divergence and Bregman Divergence Classes, 2009, IEEE Transactions on Information Theory.
[7] Inderjit S. Dhillon, et al. Matrix Nearness Problems with Bregman Divergences, 2007, SIAM J. Matrix Anal. Appl.
[8] Shun-ichi Amari, et al. Dynamics of Learning Near Singularities in Layered Networks, 2008, Neural Computation.
[9] Shun-ichi Amari, et al. Information Geometry and Its Applications: Convex Function and Dually Flat Manifold, 2009, ETVC.
[10] D. Petz. Monotone metrics on matrix spaces, 1996.
[11] S. Amari, et al. Singularities Affect Dynamics of Learning in Neuromanifolds, 2006, Neural Computation.
[12] Imre Csiszár, et al. Axiomatic Characterizations of Information Measures, 2008, Entropy.
[13] Jan Havrda, et al. Quantification method of classification processes. Concept of structural a-entropy, 1967, Kybernetika.
[14] H. Chernoff. A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations, 1952.
[15] Shun-ichi Amari, et al. Dynamics of learning near singularities in radial basis function networks, 2008, Neural Networks.
[16] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[17] Inder Jeet Taneja, et al. Relative information of type s, Csiszár's f-divergence, and information inequalities, 2004, Inf. Sci.
[18] Kenji Fukumizu, et al. Local minima and plateaus in hierarchical structures of multilayer perceptrons, 2000, Neural Networks.
[19] Yasuo Matsuyama, et al. The alpha-EM algorithm: surrogate likelihood maximization using alpha-logarithmic information measures, 2003, IEEE Trans. Inf. Theory.
[20] J. Milnor. On the concept of attractor, 1985.
[21] A. Rényi. On Measures of Entropy and Information, 1961.
[22] Shun-ichi Amari, et al. Methods of information geometry, 2000.
[23] Inderjit S. Dhillon, et al. Clustering with Bregman Divergences, 2005, J. Mach. Learn. Res.
[24] I. Csiszár. Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems, 1991.