MCA-Reader: Multi-connected Attention Model for Machine Reading Comprehension

ZHANG Yuyao, JIANG Yuru, MAO Teng, ZHANG Yangsen

Journal of Chinese Information Processing ›› 2019, Vol. 33 ›› Issue (10): 73-80.
Reading Comprehension and Text Generation

Abstract

Machine reading comprehension (MRC) is currently a popular task in natural language processing: given a passage and a question about it, the machine must locate and return the answer to the question within the passage. Span-extraction reading comprehension is a typical direction in current MRC research, in which the model locates the answer by predicting its start and end positions in the passage, and the attention mechanism plays an indispensable role in this process. To better address the span-extraction MRC task, this paper proposes an attention-based reading comprehension model built on a multi-connected mechanism. Through multiple connections, the model exploits the attention mechanism more effectively for span-extraction machine reading comprehension. On the final test set of the Second Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC 2018), the model achieves an EM score of 71.175 and an F1 score of 88.090, ranking second.
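The abstract's notion of span extraction can be made concrete: the model scores every passage position as a possible answer start and end, then returns the highest-scoring valid span. The sketch below is a minimal Python illustration of this decoding step, not the MCA-Reader architecture itself; the scoring vectors w_start and w_end are hypothetical stand-ins for the model's output layers.

    import numpy as np

    def softmax(x):
        """Numerically stable softmax over the last axis."""
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def predict_span(passage_repr, w_start, w_end, max_answer_len=30):
        """Return the span (i, j), j >= i, maximizing P(start=i) * P(end=j).

        passage_repr: (seq_len, hidden) attention-fused passage encoding.
        w_start, w_end: (hidden,) scoring vectors; hypothetical stand-ins,
        not taken from the paper.
        """
        p_start = softmax(passage_repr @ w_start)  # (seq_len,)
        p_end = softmax(passage_repr @ w_end)      # (seq_len,)
        best, best_span = -1.0, (0, 0)
        for i in range(len(p_start)):
            for j in range(i, min(i + max_answer_len, len(p_end))):
                score = p_start[i] * p_end[j]
                if score > best:
                    best, best_span = score, (i, j)
        return best_span

    # Toy usage: a random 20-token "passage" with hidden size 8.
    rng = np.random.default_rng(0)
    H = rng.normal(size=(20, 8))
    print(predict_span(H, rng.normal(size=8), rng.normal(size=8)))

The max_answer_len bound mirrors the common practice of capping answer length at decoding time; the exhaustive search over i <= j is what pointer-style readers implement, usually in vectorized form.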

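The EM and F1 figures reported in the abstract are the two standard span-extraction metrics: EM checks whether the predicted span matches a gold answer exactly, and F1 measures overlap between prediction and gold, computed over characters for Chinese evaluations such as CMRC 2018. A minimal sketch of the idea, assuming character-level matching and ignoring the answer normalization and multiple gold answers handled by the official CMRC 2018 evaluation script:

    from collections import Counter

    def exact_match(pred, gold):
        """EM: 1.0 iff the predicted answer string equals the gold answer."""
        return float(pred == gold)

    def char_f1(pred, gold):
        """Character-overlap F1, a common choice for Chinese MRC."""
        common = Counter(pred) & Counter(gold)
        overlap = sum(common.values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(pred)
        recall = overlap / len(gold)
        return 2 * precision * recall / (precision + recall)

    print(exact_match("北京", "北京大学"), round(char_f1("北京", "北京大学"), 3))  # 0.0 0.667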

Key words

machine reading comprehension / attention mechanism / multi-connected

Cite this article

ZHANG Yuyao, JIANG Yuru, MAO Teng, ZHANG Yangsen. MCA-Reader: Multi-connected Attention Model for Machine Reading Comprehension. Journal of Chinese Information Processing, 2019, 33(10): 73-80.

Funding

National Natural Science Foundation of China (61602044, 61772081); Promoting the Connotative Development of Universities: Postgraduate Science and Technology Innovation Project (5121911044)