Supervised WSD Model Optimization Based on Language Model

YANG Zhizhuo, HUANG Heyan

Journal of Chinese Information Processing, 2014, Vol. 28, Issue (1): 19-25.
Survey and Outlook

Abstract

Word sense disambiguation (WSD) is one of the key research topics in natural language processing. Supervised methods are currently an effective way to address it, but without large-scale training corpora they still fall short of satisfactory results. This paper presents a WSD optimization model based on a statistical language model: the language model is used to optimize a traditional supervised WSD model, so that the knowledge in the training data and in the language model jointly determines the sense of an ambiguous word. The model can effectively improve WSD performance when training data is insufficient. Experiments on real data show that the optimized model outperforms the best supervised system that participated in the SemEval-2007 task #5 evaluation.
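The abstract does not spell out how the supervised model and the language model are combined. As a rough illustration only, the sketch below assumes each model produces a probability per candidate sense and merges them by linear interpolation; the function names, the `weight` parameter, and the example senses are hypothetical and are not taken from the paper.

```python
from typing import Dict


def combine_sense_scores(
    supervised: Dict[str, float],
    language_model: Dict[str, float],
    weight: float = 0.5,
) -> Dict[str, float]:
    """Linearly interpolate two per-sense probability estimates.

    Assumption: both inputs map sense labels to probabilities.
    `weight` is the share given to the supervised model. This is an
    illustrative combination rule, not the paper's formulation.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    senses = set(supervised) | set(language_model)
    combined = {
        s: weight * supervised.get(s, 0.0)
        + (1.0 - weight) * language_model.get(s, 0.0)
        for s in senses
    }
    total = sum(combined.values())
    if total == 0.0:
        raise ValueError("no probability mass for any sense")
    # Renormalize so the combined scores sum to 1.
    return {s: v / total for s, v in combined.items()}


def pick_sense(combined: Dict[str, float]) -> str:
    """Return the sense label with the highest combined score."""
    return max(combined, key=combined.get)


# Hypothetical scores for one ambiguous word with two senses.
supervised_scores = {"sense_a": 0.55, "sense_b": 0.45}
lm_scores = {"sense_a": 0.20, "sense_b": 0.80}
combined_scores = combine_sense_scores(supervised_scores, lm_scores, weight=0.5)
chosen = pick_sense(combined_scores)
```

With sparse training data the supervised estimate is often close to uniform, so in this sketch a confident language-model score can tip the decision; a smaller `weight` would rely on the language model more heavily.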

Key words

data sparseness / model optimization / supervised model / language model / parameter estimation

Cite this article

YANG Zhizhuo, HUANG Heyan. Supervised WSD Model Optimization Based on Language Model. Journal of Chinese Information Processing. 2014, 28(1): 19-25

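Funding

National Natural Science Foundation of China (61132009); Beijing Institute of Technology Science and Technology Innovation Program, Special Fund for Cultivating Major Projects; National Defense Basic Research Fund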