Approaching Sentiment Analysis by using semi-supervised learning of multi-dimensional classifiers

Sentiment Analysis is defined as the computational study of opinions, sentiments and emotions expressed in text. Within this broad field, most of the work has been focused on either Sentiment Polarity classification, where a text is classified as having positive or negative sentiment, or Subjectivity classification, in which a text is classified as being subjective or objective. However, in this paper, we consider instead a real-world problem in which the attitude of the author is characterised by three different (but related) target variables: Subjectivity, Sentiment Polarity, Will to Influence, unlike the two previously stated problems, where there is only a single variable to be predicted. For that reason, the (uni-dimensional) common approaches used in this area yield to suboptimal solutions to this problem. Somewhat similar happens with multi-label learning techniques which cannot directly tackle this problem. In order to bridge this gap, we propose, for the first time, the use of the novel multi-dimensional classification paradigm in the Sentiment Analysis domain. This methodology is able to join the different target variables in the same classification task so as to take advantage of the potential statistical relations between them. In addition, and in order to take advantage of the huge amount of unlabelled information available nowadays in this context, we propose the extension of the multi-dimensional classification framework to the semi-supervised domain. Experimental results for this problem show that our semi-supervised multi-dimensional approach outperforms the most common Sentiment Analysis approaches, concluding that our approach is beneficial to improve the recognition rates for this problem, and in extension, could be considered to solve future Sentiment Analysis problems.

[1]  Gregory F. Cooper,et al.  A Bayesian Method for the Induction of Probabilistic Networks from Data , 1992 .

[2]  Ellen Riloff,et al.  Learning subjective nouns using extraction pattern bootstrapping , 2003, CoNLL.

[3]  Xavier Carreras,et al.  FreeLing: An Open-Source Suite of Language Analyzers , 2004, LREC.

[4]  Marie-Francine Moens,et al.  A machine learning approach to sentiment analysis in multilingual Web texts , 2009, Information Retrieval.

[5]  Glenn Fung,et al.  Automated Heart Wall Motion Abnormality Detection from Ultrasound Images Using Bayesian Networks , 2007, IJCAI.

[6]  Ellen Riloff,et al.  Learning Extraction Patterns for Subjective Expressions , 2003, EMNLP.

[7]  Carolyn Penstein Rosé,et al.  Analyzing collaborative learning processes automatically: Exploiting the advances of computational linguistics in computer-supported collaborative learning , 2008, Int. J. Comput. Support. Collab. Learn..

[8]  Fabio Gagliardi Cozman,et al.  The effect of unlabeled data on generative classifiers, with application to model selection , 2002 .

[9]  Xiaojin Zhu,et al.  Semi-Supervised Learning Literature Survey , 2005 .

[10]  R. Okafor Maximum likelihood estimation from incomplete data , 1987 .

[11]  Ari Rappoport,et al.  ICWSM - A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews , 2010, ICWSM.

[12]  José Antonio Lozano Alonso,et al.  Learning Bayesian network classifiers for multidimensional supervised classification problems by means of a multiobjective approach , 2010 .

[13]  Bing Liu,et al.  Sentiment Analysis and Subjectivity , 2010, Handbook of Natural Language Processing.

[14]  José Antonio Lozano,et al.  Multi-Objective Learning of Multi-Dimensional Bayesian Classifiers , 2008, 2008 Eighth International Conference on Hybrid Intelligent Systems.

[15]  George Forman,et al.  An Extensive Empirical Study of Feature Selection Metrics for Text Classification , 2003, J. Mach. Learn. Res..

[16]  Daniel Marcu,et al.  Learning as search optimization: approximate large margin methods for structured prediction , 2005, ICML.

[17]  Janyce Wiebe,et al.  Learning Subjective Language , 2004, CL.

[18]  Nicu Sebe,et al.  Semisupervised learning of classifiers: theory, algorithms, and their application to human-computer interaction , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Christopher M. Bishop,et al.  Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .

[20]  Ron Kohavi,et al.  Irrelevant Features and the Subset Selection Problem , 1994, ICML.

[21]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[22]  J. Kruskal On the shortest spanning subtree of a graph and the traveling salesman problem , 1956 .

[23]  Pedro Larrañaga,et al.  Machine Learning : Editorial , 2005 .

[24]  Lillian Lee,et al.  Opinion Mining and Sentiment Analysis , 2008, Found. Trends Inf. Retr..

[25]  Rich Caruana,et al.  Multitask Learning , 1997, Machine-mediated learning.

[26]  Gökhan BakIr,et al.  Predicting Structured Data , 2008 .

[27]  Grigorios Tsoumakas,et al.  Multi-Label Classification: An Overview , 2007, Int. J. Data Warehous. Min..

[28]  Thomas S. Huang,et al.  Semisupervised Learning of Classifiers With Application to Human -Computer Interaction , 2003 .

[29]  Ron Kohavi,et al.  Wrappers for performance enhancement and oblivious decision graphs , 1995 .

[30]  Rosa Blanco Gómez Learning Bayesian networks form data with factorisation and classification purposes. Applications in biomedicine , 2005 .

[31]  Linda C. van der Gaag,et al.  Inference and Learning in Multi-dimensional Bayesian Network Classifiers , 2007, ECSQARU.

[32]  Concha Bielza,et al.  Multi-dimensional classification with Bayesian networks , 2011, Int. J. Approx. Reason..

[33]  S. Kullback,et al.  Information Theory and Statistics , 1959 .

[34]  Hsinchun Chen,et al.  Affect Analysis of Web Forums and Blogs Using Correlation Ensembles , 2008, IEEE Transactions on Knowledge and Data Engineering.

[35]  อนิรุธ สืบสิงห์,et al.  Data Mining Practical Machine Learning Tools and Techniques , 2014 .

[36]  G. McLachlan,et al.  The EM algorithm and extensions , 1996 .

[37]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[38]  Regina Barzilay,et al.  Multiple Aspect Ranking Using the Good Grief Algorithm , 2007, NAACL.

[39]  Thomas Hofmann,et al.  Predicting Structured Data (Neural Information Processing) , 2007 .

[40]  C. N. Liu,et al.  Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.

[41]  Mehran Sahami,et al.  Learning Limited Dependence Bayesian Classifiers , 1996, KDD.

[42]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[43]  Pat Langley,et al.  An Analysis of Bayesian Classifiers , 1992, AAAI.

[44]  José Antonio Lozano,et al.  Sensitivity Analysis of k-Fold Cross Validation in Prediction Error Estimation , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Hong Yu,et al.  Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences , 2003, EMNLP.

[46]  Nir Friedman,et al.  Bayesian Network Classifiers , 1997, Machine Learning.

[47]  Robert P. W. Duin,et al.  Using two-class classifiers for multiclass classification , 2002, Object recognition supported by user interaction for service robots.

[48]  Janyce Wiebe,et al.  Effects of Adjective Orientation and Gradability on Sentence Subjectivity , 2000, COLING.

[49]  Shlomo Argamon,et al.  Stylistic text classification using functional lexical features: Research Articles , 2007 .

[50]  Susan T. Dumais,et al.  Hierarchical classification of Web content , 2000, SIGIR '00.

[51]  Sebastian Thrun,et al.  Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.

[52]  Bernhard Schölkopf,et al.  Introduction to Semi-Supervised Learning , 2006, Semi-Supervised Learning.

[53]  Ian Witten,et al.  Data Mining , 2000 .

[54]  Nir Friedman,et al.  The Bayesian Structural EM Algorithm , 1998, UAI.

[55]  M. Stone Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .

[56]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[57]  Shlomo Argamon,et al.  Stylistic text classification using functional lexical features , 2007, J. Assoc. Inf. Sci. Technol..

[58]  Linda C. van der Gaag,et al.  Multi-dimensional Bayesian Network Classifiers , 2006, Probabilistic Graphical Models.

[59]  Vincent Ng,et al.  Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews , 2006, ACL.

[60]  Xiaojin Zhu,et al.  Introduction to Semi-Supervised Learning , 2009, Synthesis Lectures on Artificial Intelligence and Machine Learning.

[61]  Lluís Padró,et al.  FreeLing 1.3: Syntactic and semantic services in an open-source NLP library , 2006, LREC.