Optimizing to Arbitrary NLP Metrics using Ensemble Selection

While there have been many successful applications of machine learning methods to tasks in NLP, learning algorithms are not typically designed to optimize NLP performance metrics. This paper evaluates an ensemble selection framework designed to optimize arbitrary metrics and automate the process of algorithm selection and parameter tuning. We report the results of experiments that instantiate the framework for three NLP tasks, using six learning algorithms, a wide variety of parameterizations, and 15 performance metrics. Based on our results, we make recommendations for subsequent machine-learning-based research for natural language learning.
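As a rough illustration of what optimizing an arbitrary metric through ensemble selection can look like, the sketch below uses greedy forward selection with replacement over a library of models' validation-set predictions. It is a simplified variant written for this summary, not the implementation evaluated in the paper. The names `ensemble_selection` and `f1_at_half`, the 0.5 decision threshold, and the stop-on-no-improvement rule are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence

# A metric maps (gold labels, averaged positive-class probabilities) to a score
# where higher is better. Any task-specific NLP metric could be plugged in here.
Metric = Callable[[Sequence[int], Sequence[float]], float]


def f1_at_half(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """F1 for the positive class, thresholding probabilities at 0.5 (assumed)."""
    pred = [1 if p >= 0.5 else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ensemble_selection(
    library: Dict[str, List[float]],
    y_true: Sequence[int],
    metric: Metric,
    max_rounds: int = 20,
) -> List[str]:
    """Greedily add models (repeats allowed) while the validation metric improves.

    `library` maps a hypothetical model name to its validation-set predictions.
    The ensemble prediction is the plain average of the chosen models' outputs.
    """
    chosen: List[str] = []
    running_sum = [0.0] * len(y_true)
    best_score = float("-inf")

    for _ in range(max_rounds):
        best_name = None
        round_best = best_score
        # Iterate in sorted order so ties are broken deterministically.
        for name, preds in sorted(library.items()):
            k = len(chosen) + 1
            averaged = [(s + p) / k for s, p in zip(running_sum, preds)]
            score = metric(y_true, averaged)
            if score > round_best:
                round_best, best_name = score, name
        if best_name is None:
            break  # no candidate strictly improves the metric
        chosen.append(best_name)
        running_sum = [s + p for s, p in zip(running_sum, library[best_name])]
        best_score = round_best

    return chosen
```

Because the metric is passed in as a function, the same loop could be pointed at accuracy, precision, or a task-specific score without retraining the underlying models. In practice the selection data would be held out from the data used to train the library.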
