Incorporating User Feedback into Name Disambiguation of Scientific Cooperation Network

In scientific cooperation networks, author names become ambiguous when multiple authors share the same name. Users of these networks usually want to know exactly who wrote a paper, but no unique identifier is available to tell such authors apart. In this paper, we address this problem and propose a new method that incorporates user feedback into the model for name disambiguation in scientific cooperation networks. A perceptron is used as the classifier. Two features and a constraint drawn from user feedback are incorporated into the perceptron to improve disambiguation performance. Specifically, we treat user feedback as a training stream and refine the perceptron continuously. Experimental results show that the proposed algorithm can learn continuously and significantly outperforms previous methods that do not incorporate user interaction.
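
The idea of refining a perceptron from a stream of user feedback can be illustrated with a minimal sketch. It is not the paper's implementation: the pairwise framing (does a pair of papers share an author?), the two feature names, the learning rate, and the sample feedback values are all assumptions made for illustration, and the feedback-derived constraint mentioned in the abstract is not modeled here.

```python
from typing import Iterable, List, Tuple

# Hypothetical feature vector for a pair of papers, e.g.
# [coauthor_overlap, venue_similarity]. These names are illustrative
# assumptions, not the features used in the paper.
Features = List[float]
# Label: +1 means "same author", -1 means "different authors".
Feedback = Tuple[Features, int]


class OnlinePerceptron:
    """A plain perceptron that is updated one example at a time."""

    def __init__(self, n_features: int, lr: float = 1.0) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def score(self, x: Features) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def predict(self, x: Features) -> int:
        return 1 if self.score(x) > 0 else -1

    def update(self, x: Features, y: int) -> bool:
        """Apply the perceptron rule; return True if the weights changed."""
        if y * self.score(x) <= 0:  # misclassified or on the boundary
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y
            return True
        return False


def refine_from_feedback(model: OnlinePerceptron,
                         stream: Iterable[Feedback]) -> int:
    """Consume user feedback as a training stream; return the update count."""
    updates = 0
    for x, y in stream:
        if model.update(x, y):
            updates += 1
    return updates


# Made-up feedback items: each is a user confirming or rejecting that
# two papers belong to the same author.
feedback_stream = [
    ([1.0, 0.8], +1),
    ([0.0, 0.1], -1),
    ([0.9, 0.2], +1),
    ([0.1, 0.0], -1),
]

model = OnlinePerceptron(n_features=2)
n_updates = refine_from_feedback(model, feedback_stream)
print("updates:", n_updates, "weights:", model.w, "bias:", model.b)
```

Because each feedback item triggers at most one cheap weight update, the model can keep learning as new feedback arrives without being retrained from scratch, which is the property the abstract describes as continuous learning.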
