PAN Xiao, YU Zhengtao, GUO Jianyi, MAO Cunli, YANG Xiuzhen. A Chinese Expert Disambiguation Method Based on Feature Mapping[J]. Journal of Chinese Information Processing (中文信息学报), 2016, 30(2): 26-31.
A Chinese Expert Disambiguation Method Based on Feature Mapping
PAN Xiao (1,2), YU Zhengtao (1,2), GUO Jianyi (1,2), MAO Cunli (1,2), YANG Xiuzhen (1)
1. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; 2. Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Abstract: This paper proposes a feature-mapping method for disambiguating Chinese expert pages, designed around the characteristics of such pages. First, a conditional random field (CRF) model extracts 12 predefined personal attributes from the standard page and from each candidate page, and a maximum entropy (ME) classifier determines the weight of each attribute. Then, the similarity between the two pages is calculated to decide whether the candidate page's attributes should be appended. Experiments on pages of NLP and ML experts show that the proposed method is effective for disambiguation.
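The similarity step described in the abstract can be illustrated with a minimal sketch. It assumes the attributes have already been extracted and weighted; the attribute names, the character-overlap measure, the weights, and the 0.5 threshold are all hypothetical placeholders, not the paper's actual features or formula.

```python
from typing import Dict


def attribute_similarity(a: str, b: str) -> float:
    """Jaccard overlap of character sets (a stand-in measure, not the paper's)."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def page_similarity(
    standard: Dict[str, str],
    candidate: Dict[str, str],
    weights: Dict[str, float],
) -> float:
    """Weighted average of attribute similarities over attributes on both pages."""
    shared = [k for k in weights if k in standard and k in candidate]
    total_weight = sum(weights[k] for k in shared)
    if total_weight == 0:
        return 0.0
    score = sum(
        weights[k] * attribute_similarity(standard[k], candidate[k]) for k in shared
    )
    return score / total_weight


def same_expert(
    standard: Dict[str, str],
    candidate: Dict[str, str],
    weights: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cut-off
) -> bool:
    """Decide whether the candidate page refers to the same expert."""
    return page_similarity(standard, candidate, weights) >= threshold


# Hypothetical attribute weights and pages, for illustration only.
weights = {"affiliation": 0.6, "field": 0.4}
standard_page = {"affiliation": "Kunming University", "field": "NLP"}
matching_page = {"affiliation": "Kunming University", "field": "NLP"}
other_page = {"affiliation": "abc", "field": "xyz"}
```

In this sketch, a candidate page whose weighted similarity reaches the threshold would be treated as describing the same expert, so its attributes could be merged into the standard page; otherwise it would be treated as a different person with the same name.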