TransformerG: Multi-hop Reading Comprehension of Legal Texts Based on Hierarchical Graph Structure and Attention Mechanism

ZHU Siqi, GUO Yi, WANG Yexiang, YU Jun, TANG Qifeng, SHAO Zhiqing

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 11: 148-155, 168.
Natural Language Understanding and Generation


Abstract

Focusing on the Cail2020 multi-hop machine reading comprehension dataset for legal texts, this paper proposes TransformerG, a multi-hop reading comprehension model that fuses entity graph structures at different levels with textual information through an attention mechanism. The model combines the features of the question node, the question's entity nodes, the sentence nodes, and the in-sentence entity nodes with the features of the text to predict the answer span. In addition, a sentence-level sliding window method is proposed, which effectively avoids the truncation caused by overlong input texts in pre-trained language models. Using TransformerG, the authors took second place in the machine reading comprehension track of the "中国法研杯" (CAIL) judicial artificial intelligence challenge, organized by the Technical Committee on Computational Linguistics of the Chinese Information Processing Society of China (CIPS-CL) and the Information Center of the Supreme People's Court.
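The hierarchical graph itself is only described above, not shown. As a rough illustration of the node types involved, the sketch below builds a graph with a question node, question entity nodes, sentence nodes, and in-sentence entity nodes, and links entities that appear in both the question and a sentence. It is a minimal sketch under assumed conventions (the networkx library, a toy extract_entities placeholder), not the authors' implementation.

```python
# Illustrative sketch only: assembles a hierarchical graph with question,
# question-entity, sentence, and sentence-entity nodes as named in the
# abstract. Node/edge conventions here are assumptions, not the paper's code.
import networkx as nx

def extract_entities(text):
    # Placeholder entity extractor; a real system would use an NER model.
    return [tok for tok in text.split() if tok.istitle()]

def build_hierarchical_graph(question, sentences):
    g = nx.Graph()
    g.add_node("Q", kind="question", text=question)
    for ent in extract_entities(question):
        g.add_node(f"QE:{ent}", kind="question_entity", text=ent)
        g.add_edge("Q", f"QE:{ent}")
    for i, sent in enumerate(sentences):
        s_id = f"S{i}"
        g.add_node(s_id, kind="sentence", text=sent)
        g.add_edge("Q", s_id)
        for ent in extract_entities(sent):
            e_id = f"{s_id}:E:{ent}"
            g.add_node(e_id, kind="sentence_entity", text=ent)
            g.add_edge(s_id, e_id)
            # Link a sentence entity to a matching question entity so that
            # multi-hop paths (question -> entity -> sentence) exist.
            if f"QE:{ent}" in g:
                g.add_edge(e_id, f"QE:{ent}")
    return g
```

A graph attention network over such nodes, combined with token-level text representations, could then score candidate answer spans; the specific fusion used in TransformerG is beyond what the abstract states.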
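The sentence-level sliding window is likewise described only at a high level. The sketch below shows one plausible reading of the idea, assuming windows are packed sentence by sentence up to the encoder's token budget and consecutive windows overlap by a few sentences, so that no sentence is silently lost to hard truncation. The tokenizer call follows the Hugging Face transformers convention, and all parameter names and values (max_tokens, overlap_sents) are illustrative rather than the paper's settings.

```python
# Illustrative sketch of a sentence-level sliding window: instead of cutting
# a long document at a fixed token position, pack whole sentences into windows
# that fit the encoder, overlapping consecutive windows by a few sentences.
from transformers import AutoTokenizer

def sentence_windows(sentences, tokenizer, max_tokens=512, overlap_sents=2):
    lengths = [len(tokenizer.tokenize(s)) for s in sentences]
    windows, start = [], 0
    while start < len(sentences):
        end, used = start, 0
        # Grow the window sentence by sentence while it still fits the budget.
        while end < len(sentences) and used + lengths[end] <= max_tokens:
            used += lengths[end]
            end += 1
        if end == start:          # a single over-long sentence: take it anyway
            end = start + 1
        windows.append(sentences[start:end])
        if end >= len(sentences):
            break
        # The next window re-reads the last few sentences to preserve context.
        start = max(end - overlap_sents, start + 1)
    return windows

# Toy usage (model name and values are placeholders for illustration).
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
doc = ["第一句。", "第二句。", "第三句。"]
print(sentence_windows(doc, tokenizer, max_tokens=16, overlap_sents=1))
```

Each window would then be encoded together with the question and the per-window predictions merged, which is one common way such windows are used; the abstract does not detail the merging step.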

Key words

hierarchical graph structure / multi-hop machine reading comprehension / Cail2020

Cite this article

ZHU Siqi, GUO Yi, WANG Yexiang, YU Jun, TANG Qifeng, SHAO Zhiqing. TransformerG: Multi-hop Reading Comprehension of Legal Texts Based on Hierarchical Graph Structure and Attention Mechanism. Journal of Chinese Information Processing, 2022, 36(11): 148-155, 168.

Funding

Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (22DZ1204903, 22511104800)