Context-sensitive learning methods for text categorization

Two recently implemented machine-learning algorithms, RIPPER and sleeping-experts for phrases, are evaluated on a number of large text categorization problems. These algorithms both construct classifiers that allow the “context” of a word w to affect how (or even whether) the presence or absence of w will contribute to a classification. However, RIPPER and sleeping-experts differ radically in many other respects: differences include different notions as to what constitutes a context, different ways of combining contexts to construct a classifier, different methods to search for a combination of contexts, and different criteria as to what contexts should be included in such a combination. In spite of these differences, both RIPPER and sleeping-experts perform extremely well across a wide variety of categorization problems, generally outperforming previously applied learning methods. We view this result as a confirmation of the usefulness of classifiers that represent contextual information.
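The idea that the "context" of a word w can decide whether w contributes to a classification can be illustrated with a small sketch. This is not RIPPER or sleeping-experts, and nothing here is learned: it assumes, purely for illustration, that a context is a set of words that must co-occur in the document, and the rules, example words, and function names (`RULES`, `context_free`, `context_sensitive`) are hypothetical.

```python
from typing import FrozenSet, List

# Hypothetical hand-written rules (not learned, not from the paper).
# Each rule is a set of words that must ALL appear in a document to fire.
Rule = FrozenSet[str]

RULES: List[Rule] = [
    frozenset({"wheat", "tonnes"}),
    frozenset({"wheat", "farm"}),
]


def tokenize(text: str) -> FrozenSet[str]:
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return frozenset(text.lower().split())


def context_free(text: str, word: str = "wheat") -> bool:
    """Baseline: the presence of a single word decides the label."""
    return word in tokenize(text)


def context_sensitive(text: str, rules: List[Rule] = RULES) -> bool:
    """Label is positive only if some rule's words all occur together."""
    words = tokenize(text)
    return any(rule <= words for rule in rules)


if __name__ == "__main__":
    for doc in ("wheat exports rose to 2000 tonnes",
                "whole wheat bread recipe"):
        print(doc, "|", context_free(doc), context_sensitive(doc))
```

Both documents contain "wheat", so the context-free test accepts both. The context-sensitive test accepts only the first, because only there does "wheat" appear together with one of the assumed context words.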

Yoram Singer | William W. Cohen
