LIANG Xiaobo, REN Feiliang, LIU Yongkang, PAN Lingfeng, HOU Yining, ZHANG Yi, LI Yan. N-Reader: Machine Reading Comprehension Model Based on Double Layers of Self-attention. Journal of Chinese Information Processing, 2018, 32(10): 130-137.
N-Reader: Machine Reading Comprehension Model Based on Double Layers of Self-attention
LIANG Xiaobo, REN Feiliang, LIU Yongkang, PAN Lingfeng, HOU Yining, ZHANG Yi, LI Yan
School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110169, China
Abstract: Machine reading comprehension (MRC) is an important task in natural language processing and artificial intelligence. To improve Chinese multi-document MRC, this paper proposes N-Reader, an end-to-end neural-network-based MRC model. It encodes the input documents with a two-layer self-attention mechanism, exploiting both the information within a single document and the similarity information across multiple documents. In addition, the paper proposes a multi-paragraph completion algorithm to preprocess the input documents; this preprocessing recognizes semantically related paragraphs among the input documents and contributes to producing a better answer sequence. In the 2018 NLP Challenge on Machine Reading Comprehension, jointly organized by the Chinese Information Processing Society of China (CIPS), the China Computer Federation (CCF), and Baidu Inc., the model ranked third in a highly competitive field.
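The double-layer encoding described in the abstract can be sketched loosely as self-attention applied twice: once within each document's token sequence, then once across the concatenation of all documents. The sketch below is an illustrative simplification under assumed details (plain scaled dot-product attention, invented dimensions, no learned projections); it is not the paper's actual architecture.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over token vectors X of shape (n, d)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                      # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ X                                  # similarity-weighted mixture

# Layer 1: attend within each single document (intra-document information).
docs = [np.random.rand(6, 8) for _ in range(3)]        # 3 docs, 6 tokens, 8-dim vectors
intra = [self_attention(d) for d in docs]

# Layer 2: attend across the stacked documents, so each token also picks up
# similarity signals from the other documents (inter-document information).
inter = self_attention(np.vstack(intra))               # shape (18, 8)
```

Here the second layer simply reuses the same attention over the stacked outputs of the first; in a real model each layer would have its own learned query/key/value projections.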