Machine Comprehension Model on Deep Hierarchical Features
(基于深度层次特征的阅读理解模型)

HUO Huan 1,2, WANG Zhongmeng 1

Journal of Chinese Information Processing ›› 2018, Vol. 32 ›› Issue (12): 132-142.
Machine Reading Comprehension

Abstract

For machine reading comprehension in real-world Chinese scenarios, understanding the complex information presented in the text is essential. For the multi-passage Chinese machine reading task with continuous answer spans, this paper proposes a model based on deep hierarchical features, which extracts deep features at three levels, in details, in snippets, and in full texts, so as to capture the information contained in the passages from multiple perspectives. In this model, words are represented by word vectors and encoded by a recurrent layer to obtain the detail features; the snippet features are constructed through several convolution layers and highway layers; and the full-text features are extracted from the candidate passages for an overall inspection. Finally, these features determine the passage in which the answer is located and the position of the answer span within that passage. In the 2018 NLP Challenge on Machine Reading Comprehension, the single model achieves a Rouge-L score of 57.55 and a Bleu-4 score of 50.87.
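
The abstract names the layer types but gives no implementation, so the following PyTorch sketch is a minimal, hypothetical illustration of how the three feature levels could be wired together. The hidden sizes, the choice of a bidirectional GRU as the recurrent layer, the convolution kernel width, max-pooling for the full-text feature, and the pointer-style start/end scoring are all assumptions made for illustration, not details taken from the paper; the question encoder and question-passage matching that a complete reader needs are omitted for brevity.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Highway(nn.Module):
        # Highway layer (Srivastava et al., 2015): y = g * H(x) + (1 - g) * x,
        # where the gate g decides how much of the input passes through unchanged.
        def __init__(self, dim):
            super().__init__()
            self.transform = nn.Linear(dim, dim)
            self.gate = nn.Linear(dim, dim)

        def forward(self, x):
            g = torch.sigmoid(self.gate(x))
            return g * F.relu(self.transform(x)) + (1 - g) * x

    class HierarchicalReader(nn.Module):
        # Hypothetical reconstruction: detail features from a recurrent layer,
        # snippet features from convolution + highway layers, and a pooled
        # full-text feature per candidate passage.
        def __init__(self, vocab_size, emb_dim=300, hidden=128, kernel=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.detail_rnn = nn.GRU(emb_dim, hidden, batch_first=True,
                                     bidirectional=True)
            self.snippet_conv = nn.Conv1d(2 * hidden, 2 * hidden, kernel,
                                          padding=kernel // 2)
            self.snippet_highway = Highway(2 * hidden)
            self.passage_score = nn.Linear(2 * hidden, 1)   # passage selection
            self.start = nn.Linear(4 * hidden, 1)           # span start scores
            self.end = nn.Linear(4 * hidden, 1)             # span end scores

        def forward(self, passage_ids):
            x = self.embed(passage_ids)                     # (B, T, E)
            detail, _ = self.detail_rnn(x)                  # (B, T, 2H) detail level
            snippet = self.snippet_highway(                 # (B, T, 2H) snippet level
                self.snippet_conv(detail.transpose(1, 2)).transpose(1, 2))
            fulltext = snippet.max(dim=1).values            # (B, 2H) full-text level
            passage_logit = self.passage_score(fulltext).squeeze(-1)
            # Condition per-token span scores on the passage-level feature.
            ctx = fulltext.unsqueeze(1).expand_as(snippet)
            token = torch.cat([snippet, ctx], dim=-1)       # (B, T, 4H)
            return (passage_logit,
                    self.start(token).squeeze(-1),          # (B, T)
                    self.end(token).squeeze(-1))            # (B, T)

At answer time, the passage with the highest passage score would be chosen first, and the highest-scoring start/end positions inside it would give the continuous answer span, mirroring the two-step decision described in the abstract.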

Key words

machine comprehension / hierarchical feature / convolution

Cite this article

HUO Huan, WANG Zhongmeng. Machine Comprehension Model on Deep Hierarchical Features. Journal of Chinese Information Processing. 2018, 32(12): 132-142

Funding

National Natural Science Foundation of China (61003031); Shanghai Key Science and Technology Project (14511107902); Shanghai Engineering Research Center Construction Project (GCZX14014); Shanghai First-Class Discipline Construction Project (XTKX2012); Open Project of the Shanghai Key Laboratory of Data Science (201609060003); Hujiang Foundation Research Base Special Project (C14001)