Paper information - A neural network approach to topic spotting

A neural network approach to topic spotting

This paper presents an application of nonlinear neural networks to topic spotting. Neural networks allow us to model higher-order interaction between document terms and to simultaneously predict multiple topics using shared hidden features. In the context of this model, we compare two approaches to dimensionality reduction in representation: one based on term selection and another based on Latent Semantic Indexing (LSI). Two different methods are proposed for improving LSI representations for the topic spotting task. We find that term selection and our modified LSI representations lead to similar topic spotting performance, and that this performance is equal to or better than other published results on the same corpus.
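
The idea of predicting several topics at once from shared hidden features can be illustrated with a minimal sketch. This is not the authors' architecture: the layer sizes, the tanh hidden activation, the sigmoid outputs, the random untrained weights, and the names `make_network` and `forward` are all assumptions chosen for illustration. The input vector stands in for a reduced document representation, such as weights for a few selected terms or LSI coordinates.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_network(n_inputs, n_hidden, n_topics, seed=0):
    """Build a randomly initialised one-hidden-layer network (hypothetical helper)."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_hidden": layer(n_hidden, n_inputs),  # input -> shared hidden features
        "b_hidden": [0.0] * n_hidden,
        "w_out": layer(n_topics, n_hidden),     # shared hidden features -> topics
        "b_out": [0.0] * n_topics,
    }


def forward(net, x):
    """Return (hidden features, per-topic scores) for one document vector x."""
    # Every topic output reads the same hidden layer, so the hidden units
    # act as features shared across all topics.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    # One independent sigmoid per topic: a document may match several topics.
    scores = [
        sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(net["w_out"], net["b_out"])
    ]
    return hidden, scores


if __name__ == "__main__":
    net = make_network(n_inputs=5, n_hidden=4, n_topics=3)
    document = [0.0, 1.2, 0.0, 0.7, 0.3]  # made-up reduced representation
    hidden, scores = forward(net, document)
    print("shared hidden features:", hidden)
    print("topic scores:", scores)
```

Because the weights here are untrained, the scores carry no meaning; the sketch only shows how a single hidden layer can feed several topic outputs at once.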

Andreas S. Weigend | Jan O. Pedersen | Erik D. Wiener

[1] Gerard Salton, et al. Term-Weighting Approaches in Automatic Text Retrieval, 1988, Inf. Process. Manag.

[2] V. Rich. Personal communication, 1989, Nature.

[3] Philip J. Hayes, et al. CONSTRUE/TIS: A System for Content-Based Indexing of a Database of News Stories, 1990, IAAI.

[4] Lisa F. Rau, et al. SCISOR: extracting information from on-line news, 1990, CACM.

[5] David D. Lewis, et al. Representation and Learning in Information Retrieval, 1991.

[6] Susan T. Dumais, et al. Improving the retrieval of information from external sources, 1991.

[7] David L. Waltz, et al. Classifying news stories using memory based reasoning, 1992, SIGIR '92.

[8] Sholom M. Weiss, et al. Towards language independent automated learning of text categorization models, 1994, SIGIR '94.

[9] David A. Hull. Improving text retrieval for the routing problem using latent semantic indexing, 1994, SIGIR '94.

[10] David D. Lewis, et al. A comparison of two learning algorithms for text categorization, 1994.