CHEN Yubo1, HE Shizhu1, LIU Kang1, ZHAO Jun1, LV Xueqiang2
1. NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; 2. ICDDR, Beijing Information Science and Technology University, Beijing 100101, China
Abstract:Entity linking is an important method of entity disambiguation, which aims to map an entity to an entry stored in the existing knowledge base. Several methods have been proposed to tackle this problem, most of which are based on the co-occurrence statistics without capture various semantic relations. In this paper, we make use of multiple features and propose a learning to rank algorithm for entity linking. It effectively utilizes the relationship information among the candidates and save a lot of time and effort. The experiment results on the TAC KBP 2009 dataset demonstrate the effectiveness of our proposed features and framework by an accuracy of 84.38%, exceeding the best result of the TAC KBP 2009 by 2.21%.
[1] 赵军,刘康,周光有等.开放式文本信息抽取[J].中文信息学报,2011,25(6): 98-110. [2] Fabian M Suchanek, Gjergji Kasneci, Gerhard Weikum.Yago:A large ontology from wikipedia and wordnet. Web Semantics: Science, Services and Agents on the World Wide Web, 2008,6(3):203-217. [3] Fei Wu, Daniel S Weld. Automatically refining the wikipedia infobox ontology[C]//Proceedings of the 17th international conference on World Wide Web,2008: 635-644. [4] Sren Auer, Christian Bizer, Georgi Kobilarov, et al.Dbpedia: A nucleus for a web of open data. In The semantic web, 2007: 722-735. [5] Amit Bagga, Breck Baldwin. Entity-based cross-document coreferencing using the vector space model[C]//Proceedings of HLT/ACL, 1998: 79-85. [6] Michael B Fleischman, Eduard Hovy. x Multi-document person name resolution[C]//Proceedings of ACL, Reference Resolution Work shop, 1998: 66-82. [7] Bradley Malin, Edoardo Airoldi, Kathleen, et al. A network analysis model for disambiguation of names in lists[J]. Computational & Mathematical Organization Theory, 2005,11(2):119-139. [8] Han X, Zhao J. Named entity disambiguation by leveraging Wikipedia semantic knowledge[C]//Proceeding of the 18th ACM conference on Information and knowledge management, 2009: 215-224. [9] Silviu Cucerzan. Large-scale named entity disambiguation based on wikipedia data[C]//Proceedings of EMNLP CoNLL, 2007,7: 708-716. [10] David Milne, Ian H Witten. Learning to link with wikipedia[C]//Proceedings of the 17th ACM conference on Information and Knowledge Management,2008: 509-518. [11] Tao Zhang, Kang Liu, Jun Zhao. The nlprir entity linking system at tac 2012. [12] Thorsten Joachims. Optimizing search engines using clickthrough data[C]//Proceedings of the eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002: 133-142. [13] Xianpei Han, Le Sun. A generative entity mention model for linking entities with knowledge base[C]//Proceedings HLT/ACL, 2011: 945-954. [14] David Milne, Ian H Witten. An open-source toolkit for mining wikipedia[J]. Artificial Intelligence,2013,194:222-239. [15] Mark Dredze, Paul McNamee, Delip Rao, et al. Entity disambiguation for knowledge base population[C]//Proceedings of CL.2010: 277-285. [16] Wei Shen, Jianyong Wang, Ping Luo, et al. Linden: linking named entities with knowledge base via semantic knowledge[C]//Proceedings of WWW, 2012: 449-458. [17] Paul McNamee, Hoa Trang Dang. Overview of the tac 2009 knowledge base population track. In Text Analysis Conference (TAC), 2009,17: 111-113. [18] Zhe Cao, Tao Qin, Tie-Yan Liu, et al. Learning to rank: from pairwise approach to listwise approach[C]//Proceedings of ICML, 2007: 129-136.