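A Comparative Analysis of the Abstract Meaning Representation Graph Structures of the English and Chinese Little Prince

LI Bin; WEN Yuan; BU Lijun; QU Weiguang; XUE Nianwen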

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 1: 50-57.
Natural Language Processing Applications

A Comparative Analysis of the AMR Graphs

  • LI Bin1, WEN Yuan1, BU Lijun1, QU Weiguang2, XUE Nianwen3

Abstract

Abstract Meaning Representation (AMR) is a recent method for representing the meaning of a sentence, with expressive power close to that of an interlingua. Its developers have released English AMR corpora, including one for The Little Prince. AMR differs from earlier syntactic and semantic representations in two main ways. First, it represents the meaning of a sentence as a graph. Second, it allows concept nodes that do not appear in the original sentence to be added in order to express implicit meaning. In this paper, we design a Chinese AMR annotation specification that takes the characteristics of Chinese into account and use it to annotate a Chinese AMR corpus of The Little Prince, with an inter-annotator agreement of 0.83 as measured by Smatch. The statistics show that English and Chinese sentences containing graph structures are highly correlated and that about 40% of sentences contain a graph structure, whereas the added concept nodes differ considerably between the two languages. Finally, we discuss the advantages of AMR for representing the meaning of Chinese sentences and for cross-lingual comparison.

Key words

abstract semantic representation / semantic graph / English-Chinese comparison / natural language processing

Cite this article

LI Bin; WEN Yuan; BU Lijun; QU Weiguang; XUE Nianwen. A Comparative Analysis of the AMR Graphs. Journal of Chinese Information Processing. 2017, 31(1): 50-57

Author information

LI Bin (born 1981), Ph.D., associate professor; main research area: computational linguistics.
E-mail: libin.njnu@gmail.com
WEN Yuan (born 1992), master's student; main research area: computational linguistics.
E-mail: wenyuan.njnu@gmail.com
BU Lijun (born 1990), master's student; main research area: computational linguistics.
E-mail: blj_njnu@163.com

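Funding

Philosophy and Social Science Research Project of Jiangsu Higher Education Institutions (2016SJB740004); National Science and Technology Support Program of China (2014BAK04B02); National Natural Science Foundation of China (61272221)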