Semantic Role Labeling of Chinese FrameNet Based on Conditional Random Fields

SONG Yijun, WANG Ruibo, LI Jihong, LI Guochen

Journal of Chinese Information Processing, 2014, Vol. 28, Issue 3: 36-47.
Language Analysis and Generation

Abstract

Given a target word and the frame it belongs to, semantic role labeling for Chinese FrameNet can be divided into two steps: identifying the boundaries of semantic roles and classifying the roles. This paper formalizes the task as a word-sequence labeling problem through the IOB2 tagging strategy, takes the word as the basic labeling unit, and runs automatic labeling experiments with a conditional random field (CRF) model. The corpus is first analyzed with the base chunk parser of Tsinghua University; 15 new chunk-level features are extracted from its output and projected onto the word sequence. The 12 word-level features of [20] and the 15 chunk-level features together form the candidate feature set, and an orthogonal array method is used to select the best feature template for the model. On the same corpus as [20], and under the same three repetitions of 2-fold cross-validation, the overall F1 score of semantic role labeling is nearly 1% higher than that of [20], and the improvement is significant under a t-test at the 0.05 significance level. The experimental results show that: (1) in the word-sequence model, the 15 newly added chunk-level features significantly improve labeling performance, but they mainly help role classification and have no significant effect on role identification; (2) the word-sequence labeling model is significantly better than models that use base chunks or syntactic constituents as the labeling unit.
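As an illustration of the IOB2 formalization mentioned in the abstract, the following minimal Python sketch shows how role spans over a word sequence could be converted to per-word tags and back. The function names, the `(start, end, label)` span format, and the role labels `Agent` and `Theme` are assumptions made for this example; they are not taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index inclusive, role label); hypothetical format


def spans_to_iob2(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping role spans as one IOB2 tag per word."""
    tags = ["O"] * n_words
    for start, end, label in spans:
        if not (0 <= start <= end < n_words):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end + 1]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first word of the role
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"          # remaining words of the role
    return tags


def iob2_to_spans(tags: List[str]) -> List[Span]:
    """Decode IOB2 tags back into (start, end, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i - 1, label))   # close the open span
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]             # open a new span
    if start is not None:
        spans.append((start, len(tags) - 1, label))
    return spans


# Hypothetical five-word sentence with two roles.
example_spans = [(0, 1, "Agent"), (3, 4, "Theme")]
example_tags = spans_to_iob2(5, example_spans)
print(example_tags)  # ['B-Agent', 'I-Agent', 'O', 'B-Theme', 'I-Theme']
```

With this encoding, role identification amounts to predicting the B/I/O boundaries and role classification to predicting the label attached to each span, so a single sequence labeler such as a CRF can in principle handle both steps.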

Key words

Chinese FrameNet / semantic role labeling / conditional random fields / base chunk

Cite this article

SONG Yijun, WANG Ruibo, LI Jihong, LI Guochen. Semantic Role Labeling of Chinese FrameNet Based on Conditional Random Fields. Journal of Chinese Information Processing. 2014, 28(3): 36-47

References

[1] You L, Liu K. Building Chinese FrameNet Database[A]. Proceedings of IEEE NLP-KE’05[C]. Wuhan: IEEE, 2005: 301-306.
[2] Fillmore C J. Frame semantics and the nature of language[A]. Annals of the New York Academy of Sciences: Conference on the Origin and Development of Language and Speech[C]. 1976, 280: 20-32.
[3] Che WX, Li ZH, Li YQ, et al. Multilingual dependency-based syntactic and semantic parsing[A]. Proceedings of the CoNLL-2009[C]. Boulder: ACL Press, 2009: 49-54.
[4] Zhao H, Chen WL, Kit C, Zhou GD. Multilingual dependency learning: A huge feature engineering method to semantic dependency parsing[A]. Proceedings of the CoNLL-2009[C]. Boulder: ACL Press, 2009: 55-60.
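[5] Liu Ting, Che Wanxiang, Li Sheng. Semantic role labeling based on a maximum entropy classifier[J]. Journal of Software. 2007, 18(3): 565-573. (in Chinese)
[6] Dong Jing, Sun Le, Lü Yuanhua, Feng Yuanyong. Semantic role labeling based on a linear-chain conditional random field model[A]. 25th Anniversary Academic Conference of the Chinese Information Processing Society of China[C]. 2006. (in Chinese)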
[7] Yu JD, Fan X, Pang W, Yu Z. Semantic role labeling based on conditional random fields[J]. Journal of Southeast University (English Edition). 2007, 23(3): 361-364.
[8] Sun HL, Jurafsky D. Shallow Semantic Parsing of Chinese[A]. Proceedings of NAACL-HLT 2004[C]. 2004.
[9] Xue Nianwen. Labeling Chinese predicates with semantic roles[J]. Computational Linguistics, 2008, 34(2): 225-255.
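[10] Ding Weiwei, Chang Baobao. Chinese semantic role classification based on the maximum entropy principle[J]. Journal of Chinese Information Processing. 2008, 22(6): 20-27. (in Chinese)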
[11] Weiwei Ding, Baobao Chang. Fast Semantic Role Labeling for Chinese Based on Semantic Chunking[A]. Proceedings of 22nd International Conference on the Computer Processing of Oriental Languages (ICCPOL 2009)[C]. Hong Kong, China. 2009.
[12] Weiwei Ding, Baobao Chang. Word Based Chinese Semantic Role Labeling with Semantic Chunking[J]. International Journal of Computational Processing Oriental Language. 2009, 22(2-3): 133-154.
[13] WeiWei Sun. Semantics-driven shallow parsing for Chinese semantic role labeling[A]. Proceedings of the ACL 2010[C]. 2010.
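[14] WeiWei Sun. Improving Chinese semantic role labeling with rich syntactic features[A]. Proceedings of the ACL 2010 Conference[C]. Uppsala, Sweden. 2010: 168-172.
[15] Li Junhui, Zhou Guodong, Zhu Qiaoming, Qian Peide. Semantic role labeling for Chinese nominal predicates[J]. Journal of Software, 2011, 22(8): 1725-1737. (in Chinese)
[16] Li Shiqi, Zhao Tiejun, Li Hanjing, Liu Pengyuan, Liu Shui. Chinese semantic role labeling based on feature combination[J]. Journal of Software, 2011, 22(2): 222-232. (in Chinese)
[17] Gildea D, Jurafsky D. Automatic labeling of semantic roles[J]. Computational Linguistics, 2002, 28(3): 245-288.
[18] Liu Mingyang, You Liping. A preliminary study of semantic role labeling rules for Chinese perception words[A]. Frontiers of Content Computing: Research and Applications (CNCCL-2007)[C]. Beijing: Tsinghua University Press. 2007: 320-325. (in Chinese)
[19] Liu Kaiying. The current state of construction of the Chinese FrameNet (CFN)[A]. Proceedings of the 4th National Student Workshop on Computational Linguistics[C]. 2008. (in Chinese)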
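[20] Li Jihong, Wang Ruibo, Wang Weilin, Li Guochen. Automatic labeling of semantic roles on Chinese FrameNet[J]. Journal of Software, 2010, 21(4): 597-611. (in Chinese)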
[21] Surdeanu M, Màrquez L, Carreras X, Comas PR. Combination strategies for semantic role labeling[J]. Journal of Artificial Intelligence Research, 2007, 29: 105-151.
[22] Weiwei Sun, Zhifang Sui, Meng Wang, Xin Wang. Chinese semantic role labeling with shallow parsing[A]. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing[C]. Singapore: Association for Computational Linguistics. 2009: 1475-1483.
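[23] Zhou Qiang. A rule-based automatic parser for Chinese base chunks[A]. Proceedings of the 7th International Conference on Chinese Information Processing[C]. Beijing: Publishing House of Electronics Industry. 2007: 137-142. (in Chinese)
[24] Zhou Qiang. A description scheme for Chinese base chunks[J]. Journal of Chinese Information Processing, 2007, 21(3): 21-27. (in Chinese)
[25] Li Jihong. Research on automatic semantic role labeling techniques for Chinese FrameNet[D]. PhD dissertation, Shanxi University, 2010. (in Chinese)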
[26] Ethem Alpaydin. Combined 5 x 2 cv F test for comparing supervised classification learning algorithms[J]. Neural Computation. 1999, 11(8):1885-1892.
[27] Markatou M, Tian H, Biswas S, et al. Analysis of variance of cross-validation estimators of the generalization error[J]. Journal of Machine Learning Research, 2005, 6: 1127-1168.
[28] Sylvain Arlot. A survey of cross-validation procedures for model selection[J]. Statistics Surveys. 2010, 4:40-79. DOI: 10.1214/09-SS054.
[29] YuHong Yang. Comparing Learning Methods for Classification[J]. Statistica Sinica. 2006, 16(2): 635-657.
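[30] Mao Shisong. Handbook of Statistics[M]. Beijing: Science Press, 2003. (in Chinese)
[31] Taku Kudo. CRF++ Tools Package: http://crfpp.sourceforge.net/, version: 5.0, 2007.
[32] Wang Ruibo. Research on automatic semantic role labeling of Chinese FrameNet based on conditional random fields[D]. Master's thesis, Shanxi University, 2009. (in Chinese)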

Funding

National Natural Science Foundation of China (60873128)