Abstract
Word sense disambiguation (WSD) has long been an active and difficult problem in natural language processing. Ensemble methods are regarded as one of the four major directions in machine learning research. After a systematic study of how existing ensemble learning methods have been applied to Chinese word sense disambiguation, and drawing on the idea of classifier ensembles from pattern recognition, we propose a multi-classifier ensemble method based on dynamic, adaptive weighted voting to build a fused classifier. Experimental results show that the proposed fused classifier substantially improves the accuracy of automatic disambiguation on Chinese text.
Key words: word sense disambiguation; classifier; ensemble classifier; context features
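Funding
National Natural Science Foundation of China (60873013, 61070119); Open Project Fund of the Key Laboratory of Computational Linguistics (Peking University), Ministry of Education (KLCL-1005); Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (PHR201007131)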