Abstract
Selectional preference describes the semantic preference of a predicate for its arguments. It is an important kind of lexical semantic knowledge that plays an important role in the syntactic and semantic analysis of natural language. This paper studies the automatic acquisition of Chinese selectional preferences and proposes two acquisition methods, one based on HowNet and one based on LDA (Latent Dirichlet Allocation), which are compared and analyzed experimentally. The experiments show that the former acquires knowledge that is easier to interpret, while the latter acquires knowledge that performs better in application. The two methods are highly complementary, and we propose a scheme for combining them.
Key words
selectional preference /
knowledge acquisition /
HowNet /
LDA (Latent Dirichlet Allocation)
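Funding
China Postdoctoral Science Foundation (2011M501184); Henan Province Postdoctoral Research Funding (2010027); Open Project of the Key Laboratory of Computational Linguistics (Peking University), Ministry of Education (201301); National Natural Science Foundation of China (60970083, 61170163, 61272221, 61402419); National Social Science Foundation of China (14BYY096); National High Technology Research and Development Program of China (863 Program) (2012AA011101); Science and Technology Research Program of the Henan Provincial Department of Science and Technology (132102210407).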