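The ba-sentence (“把”字句) is an important special sentence pattern in Modern Chinese. This paper attempts to label the semantic roles of ba-sentences automatically with a knowledge-base-driven, rule-based method. First, we collect ba-sentence examples from the People's Daily (《人民日报》) semantic role labeling corpus to build an example bank with broad coverage. Next, we manually annotate the syntactic and semantic composition of the ba-sentences in the example bank; the annotations include the valence type of the predicate verb, the predicate structure type of the ba-sentence, and its semantic sentence pattern (句模) type. On the basis of these annotations, we analyze how the semantic sentence patterns of ba-sentences are composed and derive a number of semantic role labeling rules. Finally, we validate the rules on test data: the final labeling accuracy is 98.61%, which indicates that the proposed rules are effective for semantic role labeling of ba-sentences.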
Abstract
The ba-sentence is a typical Chinese sentence pattern. This paper proposes a rule-based method for automatic semantic role labeling, with a special focus on ba-sentences. First, we collect a set of ba-sentences from our annotated semantic corpus, which includes texts from People's Daily, to form an example bank of ba-sentences. Then, we manually annotate the valence type of each predicate, and the syntactic structure type and semantic structure type of each ba-sentence. Based on this annotated corpus, we analyze how the semantic structures are formed and derive several semantic role labeling rules. Finally, we evaluate these rules on a test set, obtaining an overall accuracy of 98.61%.
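To make the idea of a labeling rule conditioned on predicate valence concrete, the following is a minimal sketch and not the rule set derived in the paper. The `Token` class, the coarse POS tags, the role names (`Agent`, `Patient`, `Predicate`), the integer valence code, and the single rule itself are all illustrative assumptions; the paper's actual rules, role inventory, and annotation scheme are not given in this abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    text: str
    pos: str  # hypothetical coarse tags: "n" noun, "r" pronoun, "v" verb, "p" preposition, "u" particle


NOMINAL_POS = {"n", "r"}


def label_ba_sentence(tokens: List[Token], valence: int) -> List[Optional[str]]:
    """Apply one toy rule (an assumption, not the paper's rule):
    the nominal before 把 is the Agent, and for predicates with valence >= 2
    the nominal closest to the predicate between 把 and the predicate is the Patient."""
    roles: List[Optional[str]] = [None] * len(tokens)

    # Locate the marker 把 used as a preposition.
    ba = next((i for i, t in enumerate(tokens) if t.text == "把" and t.pos == "p"), None)
    if ba is None:
        return roles  # not a ba-sentence; the rule does not apply

    # Take the first verb after 把 as the predicate.
    pred = next((i for i in range(ba + 1, len(tokens)) if tokens[i].pos == "v"), None)
    if pred is None:
        return roles
    roles[pred] = "Predicate"

    # Nearest nominal to the left of 把 is labeled Agent.
    for i in range(ba - 1, -1, -1):
        if tokens[i].pos in NOMINAL_POS:
            roles[i] = "Agent"
            break

    # For bivalent or trivalent predicates, the head of the 把-object is labeled Patient.
    if valence >= 2:
        for i in range(pred - 1, ba, -1):
            if tokens[i].pos in NOMINAL_POS:
                roles[i] = "Patient"
                break
    return roles


def accuracy(predicted: List[Optional[str]], gold: List[Optional[str]]) -> float:
    """Fraction of token positions whose label matches the gold label
    (unlabeled positions, i.e. None, are counted as well)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# 他 把 书 看 完 了 ("He finished reading the book"), with a bivalent predicate 看.
sentence = [Token("他", "r"), Token("把", "p"), Token("书", "n"),
            Token("看", "v"), Token("完", "v"), Token("了", "u")]
roles = label_ba_sentence(sentence, valence=2)
gold = ["Agent", None, "Patient", "Predicate", None, None]
print(roles)
print(accuracy(roles, gold))
```

A real rule base of the kind the abstract describes would presumably key on the annotated predicate structure type and semantic sentence pattern type as well as valence; this sketch only shows the general shape of such a rule and of a token-level accuracy check.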
Keywords
ba-sentence (把字句) /
semantic role labeling (语义角色标注) /
semantic sentence pattern (句模)
Key words
ba-sentence /
semantic role labeling /
semantic sentence pattern
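Funding
National Social Science Foundation of China (12&ZD227); National Natural Science Foundation of China (61572245, 61103089); Ludong University Humanities and Social Sciences Research Project (WY2013003)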