A Chinese Discourse Annotation Scheme Combining Content Tags and Relation Tags

WANG Xun, LI Sujian, WANG Yuxin

Journal of Chinese Information Processing, 2015, Vol. 29, Issue 3: 65-70.
Discourse Annotation and Reasoning

Exploration on Chinese Discourse Tagging Scheme

  • WANG Xun, LI Sujian, WANG Yuxin

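Abstract (Chinese version)

Discourse annotation is an important task in natural language processing. Many other tasks, such as automatic summarization and machine question answering, can use discourse annotation to gain an understanding of a text's content and semantics and thereby achieve better results. At the same time, theories of discourse understanding such as Rhetorical Structure Theory (RST) and Centering Theory (CT) are not closely tied to practical problems and are hard to apply. Drawing on existing linguistic theories and discourse-annotated corpora (such as RST-DT and PDTB), and taking into account the characteristics of natural language processing tasks, we propose a Chinese annotation scheme for discourse annotation. The scheme describes the content and logical relations of a discourse fairly accurately and comprehensively, and serves the needs of practical tasks well.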

Abstract

Discourse tagging is a fundamental task in natural language processing and supports a deeper understanding of texts. Many applications, such as automatic summarization and question answering, benefit from a thorough understanding of the text. Building on existing discourse theories such as Rhetorical Structure Theory and Centering Theory, this paper designs a new discourse tagging scheme that covers both logical relations and text content and is geared to the practical needs of real natural language processing tasks.

Keywords (Chinese version)

discourse semantic annotation / Rhetorical Structure Theory / relation tag / content tag

Key words

discourse tagging / rhetorical structure theory / relation tag / content tag

Cite this article

WANG Xun, LI Sujian, WANG Yuxin. Exploration on Chinese Discourse Tagging Scheme. Journal of Chinese Information Processing. 2015, 29(3): 65-70

References

[1] Mann William C, Sandra A Thompson. Rhetorical Structure Theory: Description and Construction of Text Structures[C]//Proceedings of University of Southern California, Information Sciences Institute, 1986.
[2] Walker M A. Centering Theory in Discourse[M]. Oxford:Clarendon Press, 1998.
[3] Carlson Lynn, Daniel Marcu, Mary Ellen Okurowski. Building a discourse-tagged corpus in the framework of rhetorical structure theory[C]//Proceedings of the Second SIGdial Workshop on Discourse and Dialogue-Volume 16. Association for Computational Linguistics, 2001.
[4] The Penn Discourse TreeBank 1.0 Annotation Manual[R]. The PDTB Research Group. March 29, 2006.
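[5] Prasad Rashmi, Dinesh Nikhil, Lee Alan, et al. The Penn Discourse TreeBank 2.0[C]//Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008). 2008.
[6] YUE Ming. Research on Rhetorical Structure Annotation of Chinese Financial Commentary[C]. The 9th China National Conference on Computational Linguistics, 2007. (in Chinese)
[7] LOU Kaiyang. A Study of the Structure of Modern Chinese News Discourse[M]. Beijing: World Publishing Corporation, 2008. (in Chinese)
[8] LI Yi, KANG Shiyong, SUN Maosong, SUN Daogong. A Semantic Component Annotation Specification Based on an Olympic Games Corpus[C]. The 8th Joint Symposium on Computational Linguistics, Nanjing, 2005. (in Chinese)
[9] Baker Collin F, Charles J Fillmore, John B Lowe. The Berkeley FrameNet Project[C]//Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 1998.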
[10] Fillmore Charles J. Frame Semantics and the Nature of Language[J]. Annals of the New York Academy of Sciences, 1976,280(1): 20-32.

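Funding

National Natural Science Foundation of China (61273278); National Social Science Foundation project (12&ZD227); sub-project of the National Key Technology R&D Program (2011BAH10B04-03); National 863 Program (2012AA011101).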