Abstract
Discourse coherence is an important topic in discourse analysis. Based on Chinese FrameNet (CFN), this paper presents a coherence description scheme for Chinese discourse. The scheme establishes the relationship between frame semantics and discourse units and examines how a discourse achieves coherence through the semantic relations between frames, providing a suitable descriptive mechanism and computational basis for discourse coherence. In an annotation exercise on 160 articles selected from the People's Daily, kappa values above 0.8 were obtained for both discourse structure annotation and discourse relation annotation. This indicates that the scheme supports highly consistent manual annotation and can serve as a basis for building larger-scale annotated discourse corpora.
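As an illustration of the agreement measure reported above, the following is a minimal sketch of a kappa computation. The abstract does not state which kappa variant was used, so the sketch assumes Cohen's kappa for two annotators; the function name `cohen_kappa` and the relation labels are hypothetical and are not taken from the paper's annotation scheme.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and nominal labels. The paper may have
    used a different (e.g. multi-rater) kappa variant.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical relation labels for ten pairs of discourse units.
annotator_1 = ["cause", "cause", "contrast", "elaboration", "cause",
               "contrast", "elaboration", "elaboration", "cause", "contrast"]
annotator_2 = ["cause", "cause", "contrast", "cause", "cause",
               "contrast", "elaboration", "elaboration", "cause", "contrast"]

print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

In this toy example the annotators disagree on one item out of ten. Kappa corrects the raw agreement of 0.9 for the agreement expected by chance given each annotator's label frequencies.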
Key words
frame /
discourse unit /
discourse structure /
discourse relation /
kappa value
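Funding
National 863 Program of China (2015AA015407); National Natural Science Foundation of China (61373082); Shanxi Province Research Funding Project for Returned Overseas Scholars (2013-015); Shanxi Province Science and Technology Infrastructure Platform Construction Project (2014091004-0103); Open Project Fund of the Information Security Evaluation Center, Civil Aviation University of China (CAAC-ISECCA-201402)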