Abstract
All-words word sense disambiguation (WSD) can be regarded as a sequence labeling problem. This paper proposes two all-words WSD methods based on sequence labeling, one built on the Hidden Markov Model (HMM) and the other on the Maximum Entropy Markov Model (MEMM). First, we model all-words WSD with an HMM. Then, because an HMM can exploit only word-form observations, we generalize it to an MEMM, which incorporates a large number of non-independent contextual features. All-words WSD is a typical very large state problem, so both models suffer from data sparsity and high time complexity; we address these problems with a beam-search Viterbi algorithm and a smoothing strategy. Finally, we evaluate our methods on the all-words WSD datasets of Senseval-2 and Senseval-3. The MEMM method achieves an F1 value of 0.654, outperforming the other sequence-labeling-based methods in this evaluation.
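To make the decoding idea concrete, the following is a minimal sketch of beam-search Viterbi over an HMM whose hidden states are word senses. It is not the paper's implementation: the abstract does not specify the smoothing strategy, so additive (add-alpha) smoothing is assumed here, and the function names (`train_hmm`, `smoothed_logp`, `beam_viterbi`), the sense labels, and the toy corpus are all hypothetical. The MEMM variant is not covered.

```python
import math
from collections import defaultdict

def train_hmm(tagged_sentences):
    """Collect transition and emission counts from (word, sense) sequences."""
    trans = defaultdict(lambda: defaultdict(float))  # prev sense -> sense -> count
    emit = defaultdict(lambda: defaultdict(float))   # sense -> word -> count
    senses_of = defaultdict(set)                     # word -> senses seen in training
    for sent in tagged_sentences:
        prev = "<s>"
        for word, sense in sent:
            trans[prev][sense] += 1
            emit[sense][word] += 1
            senses_of[word].add(sense)
            prev = sense
    return trans, emit, senses_of

def smoothed_logp(counts, key, n_outcomes, alpha):
    """Add-alpha smoothed log probability (an assumed smoothing scheme)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0.0) + alpha) / (total + alpha * n_outcomes))

def beam_viterbi(words, model, beam_width=3, alpha=0.1):
    """Viterbi decoding that keeps only the best `beam_width` states per position."""
    trans, emit, senses_of = model
    n_senses, n_words = len(emit), len(senses_of)
    beam = {"<s>": (0.0, [])}  # state -> (best log score, best path to that state)
    for word in words:
        # Restrict the state space to senses seen with this word form.
        candidates = senses_of.get(word) or {"<unk>"}
        scored = {}
        for sense in candidates:
            e = smoothed_logp(emit.get(sense, {}), word, n_words + 1, alpha)
            # Standard Viterbi step, but only over predecessors left in the beam.
            score, path = max(
                (s + smoothed_logp(trans.get(prev, {}), sense, n_senses + 1, alpha), p)
                for prev, (s, p) in beam.items()
            )
            scored[sense] = (score + e, path + [sense])
        # Pruning: drop all but the highest-scoring states.
        kept = sorted(scored.items(), key=lambda kv: kv[1][0], reverse=True)
        beam = dict(kept[:beam_width])
    return max(beam.values(), key=lambda v: v[0])[1]

# Hypothetical toy corpus; the sense labels are illustrative only.
corpus = [
    [("the", "the%1"), ("bank", "bank%finance"), ("lends", "lend%1"), ("money", "money%1")],
    [("the", "the%1"), ("river", "river%1"), ("bank", "bank%shore"), ("floods", "flood%1")],
    [("river", "river%1"), ("bank", "bank%shore"), ("erodes", "erode%1")],
]
model = train_hmm(corpus)
print(beam_viterbi(["the", "river", "bank"], model))
print(beam_viterbi(["the", "bank", "lends"], model))
```

Pruning bounds the work at each position by the beam width rather than by the full sense inventory, which is what makes decoding tractable when the state space is very large. The price is that the search is no longer guaranteed to return the globally best sense sequence.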
Key words: all-words word sense disambiguation; hidden Markov model; maximum entropy Markov model; very large state problem
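Funding
National High Technology Research and Development Program of China (863 Program) (2010AA012505); Key Program of the National Natural Science Foundation of China (60933005); National Natural Science Foundation of China (60873097)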