Abstract
Abstract Meaning Representation (AMR) is a deep, sentence-level semantic representation that abstracts the meaning of a sentence into a directed acyclic graph of concept nodes and relations. Compared with shallower semantic representations such as semantic role labeling and semantic dependency parsing, AMR captures deeper semantic information and is therefore widely used in downstream tasks such as information extraction, question answering, and dialogue systems. AMR parsing converts natural language into an AMR graph. Although most concept nodes and relations in an AMR graph align fairly transparently with words in the sentence, the original English AMR corpus provides no explicit alignment information. To overcome the obstacle that missing alignments pose to AMR parsing and to AMR's application in downstream tasks, Li et al. [14] proposed and annotated a Chinese AMR corpus with concept and relation alignments. Existing AMR parsing methods, however, cannot effectively exploit or produce alignment information during parsing. This paper therefore presents the first AMR parsing method that both leverages and generates alignment information, comprising two stages: concept prediction and relation prediction. The proposed method is highly flexible and extensible. Experiments show that it achieves Align Smatch scores of 77.6 (+10.6) on the public CAMR 2.0 dataset and 70.7 (+8.5) on the CAMRP 2022 blind test set, surpassing previous sequence-to-sequence methods. We further provide a detailed analysis of parsing performance and fine-grained metrics, and discuss directions for future improvement. The code and model parameters are open-sourced at https://github.com/pkunlp-icler/Two-Stage-CAMRP for reproduction and reference.
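To make the two-stage idea concrete, the following is an illustrative toy sketch, not the paper's actual model: hand-written lookup rules stand in for the neural concept and relation predictors, and all labels (e.g. `want-01`, `:arg0`) are hypothetical examples. It shows how keeping the token index during concept prediction yields the concept-to-word alignment that the paper's setting requires.

```python
# Toy two-stage pipeline in the spirit of "concept prediction then
# relation prediction". Rules here are hand-written stand-ins for the
# paper's neural taggers; all labels are hypothetical.

def predict_concepts(tokens, concept_rules):
    """Stage 1: map each token to an AMR concept (or skip it).
    Keeping the token index gives the concept-to-word alignment."""
    concepts = []
    for idx, tok in enumerate(tokens):
        label = concept_rules.get(tok)
        if label is not None:
            concepts.append({"concept": label, "align": idx})
    return concepts

def predict_relations(concepts, relation_rules):
    """Stage 2: for each ordered pair of predicted concepts,
    look up a relation label (no entry means no edge)."""
    edges = []
    for i, head in enumerate(concepts):
        for j, dep in enumerate(concepts):
            if i == j:
                continue
            rel = relation_rules.get((head["concept"], dep["concept"]))
            if rel is not None:
                edges.append((head["concept"], rel, dep["concept"]))
    return edges

# Toy example: "the boy wants to go"
tokens = ["the", "boy", "wants", "to", "go"]
concept_rules = {"boy": "boy", "wants": "want-01", "go": "go-02"}
relation_rules = {("want-01", "boy"): ":arg0",
                  ("want-01", "go-02"): ":arg1",
                  ("go-02", "boy"): ":arg0"}

concepts = predict_concepts(tokens, concept_rules)
edges = predict_relations(concepts, relation_rules)
# concepts carries the alignment, e.g. {"concept": "boy", "align": 1};
# edges assembles the graph, e.g. ("want-01", ":arg0", "boy").
```

In the actual paper the two stages are learned models rather than lookup tables, but the interface is the same: stage 1 outputs aligned concepts, and stage 2 fills in the edges between them.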
Keywords
semantic parsing / abstract meaning representation / Chinese natural language processing
References
[1] LIAO K, LEBANOFF L, LIU F. Abstract meaning representation for multi-document summarization[C]//Proceedings of the 27th International Conference on Computational Linguistics, 2018:1178-1190.
[2] HARDY, VLACHOS A. Guided neural language generation for abstractive summarization using abstract meaning representation[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2018: 768-773.
[3] MITRA A, BARAL C. Addressing a question answering challenge by combining statistical methods with inductive rule learning and reasoning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2016: 2779-2785.
[4] SACHAN M, XING E. Machine comprehension using rich semantic representations[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016:486-492.
[5] BONIAL C, DONATELLI L, ABRAMS M, et al. Dialogue-AMR: Abstract meaning representation for dialogue[C]//Proceedings of the LREC, 2020: 684-695.
[6] BAI X, CHEN Y, SONG L, et al. Semantic representation for dialogue modeling[J]. arXiv preprint arXiv: 2105.10188, 2021.
[7] RAO S, MARCU D, KNIGHT K, et al. Biomedical event extraction using abstract meaning representation[C]//Proceedings of the BioNLP, 2017:126-135.
[8] WANG Y, LIU S, RASTEGAR M, et al. Dependency and AMR embeddings for drug-drug interaction extraction from biomedical literature[C]//Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, 2017:36-43.
[9] ZHANG Z, JI H. Abstract meaning representation guided graph encoding and decoding for joint information extraction[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021:39-49.
[10] XU R, WANG P, LIU T, et al. A two-stream AMR-enhanced model for document-level event argument extraction[C]//Proceedings of the North American Chapter of the Association for Computational Linguistics, 2022: 5025-5036.
[11] FLANIGAN J, DYER C, SMITH N A, et al. CMU at SemEval-2016 Task 8: Graph-based AMR parsing with infinite ramp loss[C]//Proceedings of the 10th International Workshop on Semantic Evaluation, Association for Computational Linguistics, 2016: 1202-1206.
[12] BAI X, CHEN Y, SONG L, et al. Semantic representation for dialogue modeling[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2021: 4430-4445.
[13] XU R, WANG P, LIU T, et al. A two-stream AMR-enhanced model for document-level event argument extraction[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, 2022: 5025-5036.
[14] LI B, WEN Y, QU W, et al. Annotating the little prince with Chinese AMRs[C]//Proceedings of the 10th Linguistic Annotation Workshop Held in Conjunction with ACL, Association for Computational Linguistics, 2016: 7-15.
[15] LI B, WEN Y, SONG L, et al. Building a Chinese AMR bank with concept and relation alignments[C]//Proceedings of the Linguistic Issues in Language Technology, 2019:1-41.
[16] SHAO Y, GENG Z, LIU Y, et al. CPT: A pre-trained unbalanced transformer for both Chinese language understanding and generation[J]. arXiv preprint arXiv: 2109.05729, 2021.
[17] CUI Y, CHE W, LIU T, et al. Pre-training with whole word masking for Chinese BERT[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 3504-3514.
[18] FLANIGAN J, THOMSON S, CARBONELL J, et al. A discriminative graph-based parser for the abstract meaning representation[C]//Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014: 1426-1436.
[19] LYU C, TITOV I. AMR parsing as graph prediction with latent alignment[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018: 397-407.
[20] ZHANG S, MA X, DUH K, et al. AMR parsing as sequence-to-graph transduction[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019: 80-94.
[21] ZHOU Q, ZHANG Y, JI D, et al. AMR parsing with latent structural information[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020: 4306-4319.
[22] CAI D, LAM W. AMR parsing via graph-sequence iterative inference[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020:1290-1301.
[23] ZHANG S, MA X, DUH K, et al. Broad-coverage semantic parsing as transduction[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019: 3786-3798.
[24] FERNANDEZ ASTUDILLO R, BALLESTEROS M, NASEEM T, et al. Transition-based parsing with stack-transformers[C]//Findings of the Association for Computational Linguistics: EMNLP, 2020: 1001-1007.
[25] NASEEM T, SHAH A, WAN H, et al. Rewarding smatch: Transition-based AMR parsing with reinforcement learning[J]. arXiv preprint arXiv: 1905.13370, 2019.
[26] LEE Y, ASTUDILLO R F, et al. Pushing the limits of AMR parsing with self-learning[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing: Findings, 2020: 3208-3214.
[27] ZHOU J, NASEEM T, FERNANDEZ ASTUDILLO R, et al. AMR parsing with action-pointer transformer[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021: 5585-5598.
[28] BAI X, CHEN Y, ZHANG Y. Graph pre-training for AMR parsing and generation[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022: 6001-6015.
[29] GE D, LI J, ZHU M, et al. Modeling source syntax and semantics for neural AMR parsing[C]//Proceedings of the IJCAI, 2019: 4975-4981.
[30] XU B, ZHANG L, MAO Z, et al. Curriculum learning for natural language understanding[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020: 6095-6104.
[31] BEVILACQUA M, BLLOSHMI R, NAVIGLI R. One SPRING to rule them both: Symmetric AMR semantic parsing and generation without a complex pipeline[C]//Proceedings of the 35th AAAI Conference on Artificial Intelligence, 2021: 12564-12573.
[32] XIE B, SU J, GE Y, et al. Improving tree-structured decoder training for code generation via mutual learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2021: 14121-14128.
[33] CHEN L, WANG P, XU R, et al. ATP: AMRize then parse! Enhancing AMR parsing with pseudo AMRs[C]//Findings of the Association for Computational Linguistics: NAACL, 2022: 2482-2496.
[34] YU C, GILDEA D. Sequence-to-sequence AMR parsing with ancestor information[C]//Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2022.
[35] CHENG Z, LI Z, ZHAO H. AMR parsing and generation with bidirectional Bayesian learning[C]//Proceedings of the International Conference on Computational Linguistics, 2022.
[36] CHEN L, GAO B, CHANG B. A two-stage method for Chinese AMR parsing[J]. arXiv preprint arXiv: 2209.14512, 2022.
[37] WANG P, CHEN L, LIU T, et al. Hierarchical curriculum learning for AMR parsing[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022: 333-339.
[38] RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer[J]. arXiv preprint arXiv: 1910.10683, 2019.
[39] LEWIS M, LIU Y, GOYAL N, et al. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2020: 7871-7880.
[40] WU T, GU M, ZHOU J, et al. Chinese AMR parsing based on transition neural networks[J]. Journal of Chinese Information Processing, 2019, 33(4): 1-11. (in Chinese)
[41] HUANG Z, LI J, GONG Z. Chinese AMR parsing based on sequence-to-sequence modeling[C]//Proceedings of the 20th Chinese National Conference on Computational Linguistics, 2021: 374-385.
[42] XIAO L, LI B, XU Z, et al. An evaluation method for Chinese abstract meaning representation parsing based on concept and relation alignment[J]. Journal of Chinese Information Processing, 2022, 36(1): 21-30. (in Chinese)
[43] CAI S, KNIGHT K. Smatch: An evaluation metric for semantic feature structures[C]//Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013: 748-752.
Funding
National Natural Science Foundation of China (61936012)