A Machine Reading Comprehension Method Guided by Long-Short Answers Classification

YANG Jianxi, XIANG Fangyue, LI Ren, LI Dong, JIANG Shixin, ZHANG Luyi, XIAO Qiao

Journal of Chinese Information Processing, 2023, Vol. 37, Issue 5: 112-121.
Question Answering and Dialogue

Abstract

Existing machine reading comprehension models produce incomplete long answers and redundant short answers; that is, their ability to capture answer boundary information needs improvement. To address this, this paper adopts a pipeline-style strategy of "question classification + answer prediction" joint learning and proposes a neural network model that guides machine reading comprehension through classification of answers by their length. The method uses the RoBERTa_wwm_ext pre-trained language model to build semantic representations of the question and the passage, classifies each question by the long/short type of its expected answer, and feeds the classification result into the answer prediction module; the start and end positions of all answers are finally obtained through multi-task learning. Experimental results show that the model achieves an average EM of 67.4% and an average F1 of 87.6% on the CMRC2018 dataset, 0.9% and 1.1% higher than the baseline model, respectively. On the self-built Chinese bridge inspection question answering dataset, it achieves an average EM of 89.4% and an average F1 of 94.7%, 1.2% and 0.5% higher than the baseline, respectively. The method also outperforms the baseline on CMRC2018 with smaller training sets and on the traditional Chinese dataset DRCD.
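The abstract describes the architecture but not its implementation details. Below is a minimal sketch of how such a model could be wired together, assuming a shared RoBERTa_wwm_ext encoder (the hfl/chinese-roberta-wwm-ext checkpoint on Hugging Face), a two-way long/short classifier over the [CLS] vector, and span start/end heads that consume the classifier's output. The class name AnswerLengthGuidedMRC, the concatenation-based guidance, and the unweighted loss sum are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class AnswerLengthGuidedMRC(nn.Module):
    """Sketch: shared encoder + answer-length classifier guiding span heads."""

    def __init__(self, model_name: str = "hfl/chinese-roberta-wwm-ext"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Head 1: classify each question by the expected length of its answer
        # (0 = short, 1 = long), read off the [CLS] representation.
        self.length_classifier = nn.Linear(hidden, 2)
        # Head 2: span prediction. The 2-way classification distribution is
        # broadcast to every token and concatenated with its hidden state --
        # one plausible way to let the classifier "guide" span extraction.
        self.span_head = nn.Linear(hidden + 2, 2)  # start/end logits

    def forward(self, input_ids, attention_mask,
                start_positions=None, end_positions=None, length_labels=None):
        seq = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        length_logits = self.length_classifier(seq[:, 0])        # (B, 2)

        guide = length_logits.softmax(-1)                        # (B, 2)
        guide = guide.unsqueeze(1).expand(-1, seq.size(1), -1)   # (B, T, 2)
        start_logits, end_logits = self.span_head(
            torch.cat([seq, guide], dim=-1)).unbind(dim=-1)      # (B, T) each

        loss = None
        if start_positions is not None:
            # Multi-task objective: span loss + question classification loss.
            ce = nn.CrossEntropyLoss()
            loss = (ce(start_logits, start_positions)
                    + ce(end_logits, end_positions)
                    + ce(length_logits, length_labels))
        return loss, start_logits, end_logits, length_logits
```

Summing the two span cross-entropy terms with the classification term is the simplest form of the multi-task learning the abstract describes; the paper may weight or schedule the tasks differently.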

Keywords

machine reading comprehension / RoBERTa_wwm_ext / text classification / multi-task learning

Cite this article

YANG Jianxi, XIANG Fangyue, LI Ren, LI Dong, JIANG Shixin, ZHANG Luyi, XIAO Qiao. A Machine Reading Comprehension Method Guided by Long-Short Answers Classification. Journal of Chinese Information Processing, 2023, 37(5): 112-121.

Funding

National Natural Science Foundation of China (62003063); Natural Science Foundation of Chongqing (cstc2020jcyj-msxmX0047); Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M202000702, KJQN202000726); Graduate Scientific Research and Innovation Project of Chongqing Jiaotong University (2021yjkc002)