Paper information - Clustering and identifying temporal trends in document databases

We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it exploits the underlying regularity often found in hyper-linked document databases. Because of this scalability, we can use our method to study the temporal trends of individual clusters in a statistically meaningful manner. As an example of our approach, we give a summary of the temporal trends found in a scientific literature database with thousands of documents.
