HUO Huan, WANG Zhongmeng. Machine Comprehension Model on Deep Hierarchical Features[J]. Journal of Chinese Information Processing (中文信息学报), 2018, 32(12): 132-142.
Machine Comprehension Model on Deep Hierarchical Features
HUO Huan¹,², WANG Zhongmeng¹
1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203, China
Abstract: For Chinese machine reading comprehension tasks in which the answer is a continuous span drawn from one of several candidate passages, this paper proposes a model based on deep hierarchical features that extracts features at three levels: details, snippets, and full texts. In this model, word-vector representations are encoded by a recurrent layer to obtain the detail-level features; the snippet-level features are built by stacking several convolution layers and highway layers; and the full-text features are extracted from each candidate passage to support an overall inspection. From these features, the model determines the passage in which the answer is located and then the answer span within that passage. On the 2018 NLP Challenge on Machine Reading Comprehension, the proposed model achieves a Rouge-L score of 57.55 and a Bleu-4 score of 50.87.
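The abstract only names the three feature levels, so the following is a minimal PyTorch sketch of how such a hierarchy could be wired together. All specifics here are illustrative assumptions rather than the authors' exact architecture: the layer sizes, the max-pooling used to form full-text features, the Highway block, and the names HierarchicalReader, passage_scorer, and span_scorer are invented for this sketch, and question encoding and question-passage attention are omitted entirely.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Highway(nn.Module):
    """Highway layer: a learned gate mixes a transformed input with the identity."""
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))
        h = F.relu(self.transform(x))
        return t * h + (1.0 - t) * x

class HierarchicalReader(nn.Module):
    """Illustrative three-level feature hierarchy: detail -> snippet -> full text."""
    def __init__(self, vocab_size, emb_dim=300, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Detail level: recurrent encoding of the word vectors.
        self.detail_rnn = nn.LSTM(emb_dim, hidden, batch_first=True,
                                  bidirectional=True)
        # Snippet level: convolution followed by a highway layer.
        self.conv = nn.Conv1d(2 * hidden, 2 * hidden, kernel_size=3, padding=1)
        self.highway = Highway(2 * hidden)
        # Full-text level: one score per candidate passage (overall inspection).
        self.passage_scorer = nn.Linear(2 * hidden, 1)
        # Span prediction: start/end scores over every passage position.
        self.span_scorer = nn.Linear(2 * hidden, 2)

    def forward(self, passage_ids):
        # passage_ids: (num_passages, seq_len) token ids for one question.
        detail, _ = self.detail_rnn(self.embed(passage_ids))          # (P, L, 2H)
        snippet = F.relu(self.conv(detail.transpose(1, 2))).transpose(1, 2)
        snippet = self.highway(snippet)                               # (P, L, 2H)
        full_text = snippet.max(dim=1).values                         # (P, 2H)
        passage_logits = self.passage_scorer(full_text).squeeze(-1)   # (P,)
        start_logits, end_logits = self.span_scorer(snippet).unbind(-1)
        return passage_logits, start_logits, end_logits

# Example: 5 candidate passages of 200 tokens each.
model = HierarchicalReader(vocab_size=50000)
p_logits, s_logits, e_logits = model(torch.randint(0, 50000, (5, 200)))

At inference, the answer passage would be chosen by argmax over passage_logits, and the answer span by the highest-scoring (start, end) pair with start ≤ end inside that passage, matching the two-step selection the abstract describes.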