Chinese Interrogative Sentences Annotation and Analysis Based on the Abstract Meaning Representation

YAN Peiyi, LI Bin, HUANG Tong, HUO Kairui, CHEN Jin, QU Weiguang

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 7: 33-41.
Section: Language Analysis and Computation


Abstract

Computational linguistics usually handles interrogative sentences by combining question classification with syntactic parsing, but the accuracy and efficiency of this approach remain unsatisfactory. Linguistic research on interrogative sentences is rich, covering their structural types, interrogative focus, and more, yet it lacks a systematic formal representation. This paper uses Chinese Abstract Meaning Representation (Chinese AMR), a graph-based representation of whole-sentence semantics, to annotate the semantic structure of interrogative sentences, so that the interrogative focus and the meaning of the whole sentence are represented in a single graph. A total of 2,071 interrogative sentences were selected for annotation from a 20,000-sentence corpus drawn from the Penn Chinese Treebank, elementary school Chinese textbooks, and other sources. The statistics show that the interrogative focus can be represented by combining the interrogative concept amr-unknown with a semantic relation. In addition, the distribution of interrogative focus was computed from the semantic relations associated with interrogative pronouns: cause, modifier, and patient (arg1) rank highest, accounting for 26.45%, 16.74%, and 16.45%, respectively. The AMR-based annotation and analysis of interrogative sentences provide a theoretical basis and resources for research on Chinese interrogative sentences.
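The idea that the interrogative focus is the semantic relation pointing at an amr-unknown node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three toy graphs, the triple encoding, and the helper names `focus_relations` and `focus_distribution` are hypothetical and do not come from the paper or its corpus, and the relation labels (`:ARG1`, `:cause`) are used here merely as examples of how such a tally might work.

```python
from collections import Counter

# Toy AMR graphs encoded as (source, relation, target) triples.
# ":instance" triples bind a variable to its concept.
# These graphs are illustrative assumptions, not corpus data.
GRAPHS = [
    # "What did the girl find?"
    [("f", ":instance", "find-01"),
     ("g", ":instance", "girl"),
     ("a", ":instance", "amr-unknown"),
     ("f", ":ARG0", "g"),
     ("f", ":ARG1", "a")],
    # "Why did the boy leave?" (simplified toy encoding)
    [("l", ":instance", "leave-01"),
     ("b", ":instance", "boy"),
     ("a", ":instance", "amr-unknown"),
     ("l", ":ARG0", "b"),
     ("l", ":cause", "a")],
    # "Why did she laugh?" (simplified toy encoding)
    [("l", ":instance", "laugh-01"),
     ("s", ":instance", "she"),
     ("a", ":instance", "amr-unknown"),
     ("l", ":ARG0", "s"),
     ("l", ":cause", "a")],
]


def focus_relations(triples):
    """Return the relations whose target is an amr-unknown node."""
    unknown_vars = {
        src for src, rel, tgt in triples
        if rel == ":instance" and tgt == "amr-unknown"
    }
    return [
        rel for src, rel, tgt in triples
        if rel != ":instance" and tgt in unknown_vars
    ]


def focus_distribution(graphs):
    """Share of each focus relation across all graphs (empty dict if none)."""
    counts = Counter(rel for g in graphs for rel in focus_relations(g))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rel: n / total for rel, n in counts.items()}
```

Calling `focus_distribution(GRAPHS)` on the toy data gives the share of each relation attached to amr-unknown, which is the same kind of tally as the percentages reported in the abstract, though computed here on invented examples.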

Key words

interrogative sentences / abstract meaning representation / semantic relation / semantic computation

Cite this article

YAN Peiyi, LI Bin, HUANG Tong, HUO Kairui, CHEN Jin, QU Weiguang. Chinese Interrogative Sentences Annotation and Analysis Based on the Abstract Meaning Representation. Journal of Chinese Information Processing. 2022, 36(7): 33-41


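Funding

National Social Science Fund of China (18BYY127); National Natural Science Foundation of China (61772278); Jiangsu Provincial Social Science Fund (20YYB007)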