Paper information - A Guide to the Literature on Learning Probabilistic Networks from Data

This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced, and methods are then reviewed. Methods are discussed for learning parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The article avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples.

Wray L. Buntine
