A Reading Comprehension Model for Judicial Texts Based on Multi-Task Joint Training

LI Fangfang1,2, REN Xingkai1, MAO Xingliang3, LIN Zhongyao1, LIU Xiyao1

Journal of Chinese Information Processing ›› 2021, Vol. 35 ›› Issue (7): 109-117, 125.


Abstract

With the continuous accumulation of judicial big data such as adjudication documents, how to combine artificial intelligence with law has become a hot topic in legal intelligence research. Targeting the machine reading comprehension task of the China AI Law Challenge 2020 (CAIL2020), this paper proposes a machine reading comprehension model based on multi-task joint training. The model divides the reading comprehension task into four sub-modules: a text encoding module, an answer extraction module, an answer classification module, and a supporting sentence discrimination module. In addition, this paper proposes a data augmentation method based on TF-IDF "question-context sentence" similarity matching, which re-labels the CAIL2019 training set to enlarge the training data. With these methods, the final ensemble model achieves a joint F1 score of 74.49 on the CAIL2020 machine reading comprehension task, ranking first nationwide.
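
To make the multi-task setup concrete, the following is a minimal sketch (not the authors' released code) of a shared encoder with answer extraction, answer classification, and supporting sentence discrimination heads. The encoder checkpoint name, the number of answer types, the sentence indexing scheme, and identifiers such as MultiTaskMRC are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
from transformers import AutoModel  # assumes a BERT/RoBERTa-style Chinese checkpoint


class MultiTaskMRC(nn.Module):
    """Illustrative sketch: one shared encoder, three task-specific heads trained jointly."""

    def __init__(self, encoder_name="hfl/chinese-roberta-wwm-ext", num_answer_types=4):
        super().__init__()
        # Text encoding module: pre-trained contextual encoder shared by all tasks.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Answer extraction module: start/end logits for every token.
        self.span_head = nn.Linear(hidden, 2)
        # Answer classification module: e.g. span / yes / no / unknown (assumed 4 types).
        self.type_head = nn.Linear(hidden, num_answer_types)
        # Supporting sentence discrimination module: binary label per candidate sentence.
        self.support_head = nn.Linear(hidden, 2)

    def forward(self, input_ids, attention_mask, sentence_token_index):
        seq = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.span_head(seq).split(1, dim=-1)
        type_logits = self.type_head(seq[:, 0])  # [CLS] representation
        # One representative token per candidate sentence (e.g. its leading marker token).
        sent_repr = torch.gather(
            seq, 1,
            sentence_token_index.unsqueeze(-1).expand(-1, -1, seq.size(-1)))
        support_logits = self.support_head(sent_repr)
        return start_logits.squeeze(-1), end_logits.squeeze(-1), type_logits, support_logits
```

Under this kind of setup, joint training would combine the modules into a single objective, for example a weighted sum of the cross-entropy losses over span, answer-type, and supporting-sentence predictions; the exact weighting used by the authors is not stated in the abstract.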

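The data augmentation step can be sketched in a similar spirit: score each context sentence against the question with TF-IDF cosine similarity and keep the high-scoring sentences as pseudo supporting-sentence labels when re-annotating CAIL2019. The use of jieba for word segmentation, the 0.3 threshold, and the function name below are assumptions for illustration only.

```python
import jieba
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def label_supporting_sentences(question, sentences, threshold=0.3):
    """Return indices of context sentences whose TF-IDF similarity to the question exceeds a threshold."""
    segment = lambda text: " ".join(jieba.cut(text))             # whitespace-joined word segmentation
    tfidf = TfidfVectorizer().fit_transform(
        [segment(question)] + [segment(s) for s in sentences])   # row 0: question, rows 1..n: sentences
    scores = cosine_similarity(tfidf[0], tfidf[1:]).ravel()      # similarity of the question to each sentence
    return [i for i, score in enumerate(scores) if score >= threshold]
```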

Key words

China AI Law Challenge / machine reading comprehension / multi-task joint training

Cite this article

LI Fangfang, REN Xingkai, MAO Xingliang, LIN Zhongyao, LIU Xiyao. A Reading Comprehension Model for Judicial Texts Based on Multi-Task Joint Training. Journal of Chinese Information Processing. 2021, 35(7): 109-117, 125.

Funding

National Key Research and Development Program of China (2020YFC0832700); National Natural Science Foundation of China (71790615); National Defense Science and Technology Key Laboratory Fund (6142101190302); Natural Science Foundation of Hunan Province (2020JJ4746); Natural Science Foundation of Changsha, Hunan Province (Kq2014134)