Evidence Sentence Extraction for Reading Comprehension Based on Multi-scale Convolution

LI Moqian1, YANG Zhizhuo1,2, LI Ru1,2, WANG Xiaoyue1, JI Yu1

Journal of Chinese Information Processing ›› 2024, Vol. 38 ›› Issue (8): 128-139, 157.

Abstract

Machine reading comprehension, a key test of whether machines can understand human natural language, has attracted increasing attention. Addressing the incomplete feature extraction and insufficient interaction in multiple-choice reading comprehension, this paper proposes an evidence sentence extraction model based on multi-scale convolution. First, a pre-trained model encodes the semantic information of each sentence, with several auxiliary features added to improve performance. Second, to exploit the text more fully, multi-scale convolution captures text features at different granularities. Third, Focal Loss is applied to counter the imbalance between positive and negative samples in reading comprehension. Finally, the top-20 sentences are selected as evidence sentences. Evaluated on two multiple-choice reading comprehension datasets, the proposed model outperforms the baselines, improving F1 over the best baseline by 3.66% and 4.82%, respectively, which verifies the effectiveness of the method.
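The abstract names two concrete mechanisms: multi-scale convolution over sentence encodings, and Focal Loss for the positive/negative sentence imbalance. The PyTorch sketch below illustrates how such a pipeline could look; it is not the authors' released code, and the kernel sizes, filter count, alpha/gamma values, scoring head, and tensor dimensions are illustrative assumptions rather than the paper's published configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiScaleConvEncoder(nn.Module):
        """Conv1d filters of several kernel sizes (the 'scales') over a sequence
        of token embeddings; each scale is max-pooled into a fixed-size vector."""
        def __init__(self, emb_dim=768, num_filters=128, kernel_sizes=(2, 3, 4, 5)):
            super().__init__()
            self.convs = nn.ModuleList(
                nn.Conv1d(emb_dim, num_filters, k, padding=k // 2) for k in kernel_sizes
            )

        def forward(self, x):                      # x: (batch, seq_len, emb_dim)
            x = x.transpose(1, 2)                  # Conv1d expects (batch, channels, seq_len)
            pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
            return torch.cat(pooled, dim=1)        # (batch, num_filters * n_scales)

    class BinaryFocalLoss(nn.Module):
        """Focal Loss (Lin et al., 2017): FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t);
        it down-weights easy negatives so the few evidence sentences drive training."""
        def __init__(self, alpha=0.25, gamma=2.0):
            super().__init__()
            self.alpha, self.gamma = alpha, gamma

        def forward(self, logits, targets):        # both shaped (batch,)
            bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
            p_t = torch.exp(-bce)                  # probability assigned to the true class
            alpha_t = self.alpha * targets + (1 - self.alpha) * (1 - targets)
            return (alpha_t * (1 - p_t) ** self.gamma * bce).mean()

    # Toy usage: score 100 candidate sentences against sparse labels,
    # then keep the top-20 highest-scoring sentences as evidence candidates.
    encoder = MultiScaleConvEncoder()
    head = nn.Linear(128 * 4, 1)                   # 4 kernel sizes * 128 filters each
    loss_fn = BinaryFocalLoss()

    embeddings = torch.randn(100, 64, 768)         # e.g. token embeddings from a pre-trained model
    scores = head(encoder(embeddings)).squeeze(-1)

    labels = torch.zeros(100)
    labels[:5] = 1.0                               # heavily imbalanced, as in this task
    loss = loss_fn(scores, labels)

    top20 = torch.topk(scores, k=20).indices       # indices of the selected sentences

The final top-k step mirrors the abstract's last stage, where the 20 highest-scoring sentences are kept as evidence candidates; in practice the encoder input would be contextual embeddings from the pre-trained model, augmented with the auxiliary features the paper mentions.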

Key words

machine reading comprehension / evidence sentence extraction / multi-scale convolution

Cite this article

LI Moqian, YANG Zhizhuo, LI Ru, WANG Xiaoyue, JI Yu. Evidence Sentence Extraction for Reading Comprehension Based on Multi-scale Convolution. Journal of Chinese Information Processing, 2024, 38(8): 128-139, 157.


Funding

National Key R&D Program of China (2018YFB1005103); Fundamental Research Program of Shanxi Province (General Program) (20210302123469); National Natural Science Foundation of China (61936012)