闻媛,宋丽,吴泰中,李斌,周俊生,曲维光. 基于中文AMR语料库的非投影结构研究[J]. 中文信息学报, 2018, 32(12): 31-40.
WEN Yuan, SONG Li, WU Taizhong, LI Bin, ZHOU Junsheng, QU Weiguang. Research on Non-projective Structure Based on the Chinese Abstract Meaning Representation Corpus. Journal of Chinese Information Processing, 2018, 32(12): 31-40.
Research on Non-projective Structure Based on the Chinese Abstract Meaning Representation Corpus
WEN Yuan1, SONG Li1, WU Taizhong2, LI Bin1, ZHOU Junsheng2, QU Weiguang2,3
1. School of Chinese Language and Literature, Nanjing Normal University, Nanjing, Jiangsu 210097, China; 2. School of Computer Science and Technology, Nanjing Normal University, Nanjing, Jiangsu 210023, China; 3. Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University, Fuzhou, Fujian 350121, China
Abstract: A non-projective structure is one in which the arrangement of word nodes in the dependency tree does not match the word order of the original sentence. The phenomenon has not been discussed for Chinese, because Chinese dependency corpora have been constructed following only the projectivity principle. In this paper, we construct a Chinese Abstract Meaning Representation (AMR) corpus of 10,149 sentences, of which 31.62% contain non-projective structures. We then distinguish three main types of non-projective structure: modal words, topicalization, and component separation. Finally, we propose solutions for handling these structures in AMR parsing.
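The definition of non-projectivity above can be illustrated with a minimal sketch. It assumes the common operationalization in which a tree is non-projective when two dependency arcs cross once the tokens are laid out in sentence order, with an artificial root placed at position 0. The function names and the head-list encoding are hypothetical and are not taken from the paper or its corpus format.

```python
from typing import List


def has_crossing_arcs(heads: List[int]) -> bool:
    """Return True if any two dependency arcs cross.

    Assumed encoding (hypothetical): heads[i] is the 1-based position of the
    head of token i + 1, and 0 denotes an artificial root placed before the
    first token.
    """
    # Represent each arc by its left and right endpoints in sentence order.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for i, (a, b) in enumerate(arcs):
        for c, d in arcs[i + 1:]:
            # Two arcs cross when exactly one endpoint of one arc lies
            # strictly inside the span of the other.
            if a < c < b < d or c < a < d < b:
                return True
    return False


def non_projective_rate(corpus: List[List[int]]) -> float:
    """Fraction of sentences in the corpus with at least one crossing."""
    if not corpus:
        return 0.0
    return sum(has_crossing_arcs(heads) for heads in corpus) / len(corpus)


if __name__ == "__main__":
    projective = [2, 0, 2]         # token 2 is the root; no arcs cross
    non_projective = [3, 4, 0, 3]  # arc 1-3 crosses arc 2-4
    print(has_crossing_arcs(projective))
    print(has_crossing_arcs(non_projective))
    print(non_projective_rate([projective, non_projective]))
```

Applied to a whole corpus, a rate computed this way corresponds to the kind of sentence-level proportion reported in the abstract; 31.62% of 10,149 sentences is roughly 3,209 sentences. The pairwise check is quadratic in sentence length, which is usually acceptable for sentence-sized inputs.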