A Reading Comprehension Model for Judicial Texts Based on Multi-Task Joint Training

LI Fangfang1,2, REN Xingkai1, MAO Xingliang3, LIN Zhongyao1, LIU Xiyao1

Journal of Chinese Information Processing ›› 2021, Vol. 35 ›› Issue (7): 109-117, 125.


Abstract

With the continuous accumulation of judicial big data such as adjudication documents, how to combine artificial intelligence with law has become a hot topic in legal intelligence research. Targeting the machine reading comprehension task of the China AI Law Challenge 2020 (CAIL2020), this paper proposes a machine reading comprehension model based on multi-task joint training. The model divides the reading comprehension task into four sub-modules: a text encoding module, an answer extraction module, an answer classification module, and a supporting sentence discrimination module. In addition, this paper proposes a data augmentation method based on TF-IDF "question-context sentence" similarity matching, which re-labels the CAIL2019 training set to enlarge the training data. With these methods, the final ensemble model achieves a joint F1 score of 74.49 on the CAIL2020 machine reading comprehension task, ranking first nationwide.
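
To make the multi-task setup concrete, the following is a minimal sketch (not the authors' released code) of a shared encoder with answer extraction, answer classification, and supporting sentence discrimination heads. The encoder checkpoint name, the number of answer types, the sentence indexing scheme, and identifiers such as MultiTaskMRC are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
from transformers import AutoModel  # assumes a BERT/RoBERTa-style Chinese checkpoint


class MultiTaskMRC(nn.Module):
    """Illustrative sketch: one shared encoder, three task-specific heads trained jointly."""

    def __init__(self, encoder_name="hfl/chinese-roberta-wwm-ext", num_answer_types=4):
        super().__init__()
        # Text encoding module: pre-trained contextual encoder shared by all tasks.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Answer extraction module: start/end logits for every token.
        self.span_head = nn.Linear(hidden, 2)
        # Answer classification module: e.g. span / yes / no / unknown (assumed 4 types).
        self.type_head = nn.Linear(hidden, num_answer_types)
        # Supporting sentence discrimination module: binary label per candidate sentence.
        self.support_head = nn.Linear(hidden, 2)

    def forward(self, input_ids, attention_mask, sentence_token_index):
        seq = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.span_head(seq).split(1, dim=-1)
        type_logits = self.type_head(seq[:, 0])  # [CLS] representation
        # One representative token per candidate sentence (e.g. its leading marker token).
        sent_repr = torch.gather(
            seq, 1,
            sentence_token_index.unsqueeze(-1).expand(-1, -1, seq.size(-1)))
        support_logits = self.support_head(sent_repr)
        return start_logits.squeeze(-1), end_logits.squeeze(-1), type_logits, support_logits
```

Under this kind of setup, joint training would combine the modules into a single objective, for example a weighted sum of the cross-entropy losses over span, answer-type, and supporting-sentence predictions; the exact weighting used by the authors is not stated in the abstract.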

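The data augmentation step can be sketched in a similar spirit: score each context sentence against the question with TF-IDF cosine similarity and keep the high-scoring sentences as pseudo supporting-sentence labels when re-annotating CAIL2019. The use of jieba for word segmentation, the 0.3 threshold, and the function name below are assumptions for illustration only.

```python
import jieba
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def label_supporting_sentences(question, sentences, threshold=0.3):
    """Return indices of context sentences whose TF-IDF similarity to the question exceeds a threshold."""
    segment = lambda text: " ".join(jieba.cut(text))             # whitespace-joined word segmentation
    tfidf = TfidfVectorizer().fit_transform(
        [segment(question)] + [segment(s) for s in sentences])   # row 0: question, rows 1..n: sentences
    scores = cosine_similarity(tfidf[0], tfidf[1:]).ravel()      # similarity of the question to each sentence
    return [i for i, score in enumerate(scores) if score >= threshold]
```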

Key words

China AI Law Challenge / machine reading comprehension / multi-task joint training

Cite this article

LI Fangfang, REN Xingkai, MAO Xingliang, LIN Zhongyao, LIU Xiyao. A Reading Comprehension Model for Judicial Texts Based on Multi-Task Joint Training. Journal of Chinese Information Processing. 2021, 35(7): 109-117, 125.

Funding

National Key Research and Development Program of China (2020YFC0832700); National Natural Science Foundation of China (71790615); National Defense Science and Technology Key Laboratory Fund (6142101190302); Natural Science Foundation of Hunan Province (2020JJ4746); Natural Science Foundation of Changsha, Hunan Province (Kq2014134)