Self-organization of the web and identification of communities

We show that the web, despite its decentralized and unorganized nature, self-organizes such that communities of highly related pages can be efficiently identified based purely on connectivity. This discovery allows communities to be identified independent of, and unbiased by, the specific words used by authors. Applications include improved search engines, content filtering, and objective analysis of relationships within and between communities on the web.
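To make "identified based purely on connectivity" concrete, the sketch below checks whether a candidate set of pages forms a community using link structure alone. It rests on an assumption not stated in the abstract: that a community is a set of pages in which every member has more links (in either direction) to other members than to non-members. The function names `build_undirected` and `is_community` and the toy link list are hypothetical, and this is an illustrative membership test, not the identification procedure itself.

```python
from collections import defaultdict


def build_undirected(links):
    """Turn (source, target) hyperlink pairs into an undirected adjacency map."""
    neighbors = defaultdict(set)
    for src, dst in links:
        if src != dst:  # ignore self-links
            neighbors[src].add(dst)
            neighbors[dst].add(src)
    return dict(neighbors)


def is_community(candidate, neighbors):
    """Assumed criterion: each member has more neighbors inside than outside."""
    candidate = set(candidate)
    for page in candidate:
        linked = neighbors.get(page, set())
        inside = len(linked & candidate)
        outside = len(linked - candidate)
        if inside <= outside:
            return False
    return True


# Toy web graph (hypothetical): two triangles joined by a single link c-d.
links = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("c", "d"),
    ("d", "e"), ("e", "f"), ("f", "d"),
]
neighbors = build_undirected(links)

result_abc = is_community({"a", "b", "c"}, neighbors)
result_def = is_community({"d", "e", "f"}, neighbors)
result_cd = is_community({"c", "d"}, neighbors)
```

No page text is consulted anywhere in the check, which is the sense in which the result is unbiased by the words authors use.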
