Abstract
This paper proposes an automatic sentence completion method that combines dependency parsing with deep neural networks. First, a sequence modeling scheme based on unfolding dependency syntactic information is proposed, which preserves efficiency while incorporating syntactic information; on this basis, the idea of learning to rank is used to train a candidate answer ranking model. Second, to address the inaccurate modeling of fine-grained detail in whole-sequence modeling, an automatic sentence completion model based on fusing multiple language model states is proposed. Finally, a multi-source information fusion model is designed that combines sequence representations, dependency syntax, and multi-state information. The paper also constructs an English sentence completion dataset and evaluates the proposed models on it. The experimental results show that the dependency syntax unfolding model achieves an absolute accuracy improvement of 11% over common sequence modeling schemes; the language model state ranking model achieves an absolute improvement of 9.3% over the baseline model; and the final multi-source information fusion model achieves the best accuracy of 76.9% on the test set.
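As an illustration of the dependency-unfolding idea, the following is a minimal sketch of one plausible way to linearize a dependency parse into a flat token sequence that a standard sequence model (e.g., an LSTM) can consume without tree-shaped computation. spaCy, the (relation, head, word) triple format, and the traversal order are illustrative assumptions, not the paper's specification.

```python
# Minimal sketch: flatten a dependency parse into a token sequence so that
# syntactic structure can be fed to an ordinary sequence model.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def unfold_dependencies(sentence: str) -> list[str]:
    """Unfold the dependency tree into a flat symbol sequence.

    Each word is emitted as a <relation>, head, word triple in surface
    order, so the output remains a plain sequence of tokens.
    """
    doc = nlp(sentence)
    seq = []
    for tok in doc:
        seq.extend([f"<{tok.dep_}>", tok.head.text, tok.text])
    return seq

print(unfold_dependencies("The cat sat on the mat"))
# e.g. ['<det>', 'cat', 'The', '<nsubj>', 'sat', 'cat', '<ROOT>', 'sat', 'sat', ...]
```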
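The candidate answer ranking step follows the learning-to-rank paradigm. Below is a minimal PyTorch sketch of a pairwise (RankNet-style) ranking loss that pushes the sentence containing the correct completion to outscore each distractor; the `CandidateScorer` network is a hypothetical placeholder, not the paper's architecture.

```python
# Minimal sketch of pairwise learning to rank for candidate answers.
import torch
import torch.nn as nn

class CandidateScorer(nn.Module):
    """Placeholder scorer: maps a sentence encoding to a scalar score."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ff(x).squeeze(-1)

def pairwise_rank_loss(pos_score: torch.Tensor, neg_score: torch.Tensor) -> torch.Tensor:
    # RankNet-style logistic loss: log(1 + exp(s_neg - s_pos)),
    # encouraging the correct completion to outscore each distractor.
    return torch.log1p(torch.exp(neg_score - pos_score)).mean()

scorer = CandidateScorer()
pos = scorer(torch.randn(4, 128))  # encodings of sentences with the correct answer
neg = scorer(torch.randn(4, 128))  # encodings with a distractor filled in
loss = pairwise_rank_loss(pos, neg)
loss.backward()
```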
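For the multi-state fusion idea, a minimal sketch follows of pooling all per-step recurrent states (here via attention) instead of scoring a candidate with only the final language model state; attention pooling is an assumption about how the fusion might be realized, not the paper's exact mechanism.

```python
# Minimal sketch: fuse every per-step LSTM state into one representation
# rather than relying on the final state alone.
import torch
import torch.nn as nn

class MultiStateFusion(nn.Module):
    def __init__(self, vocab: int, dim: int = 128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.attn = nn.Linear(dim, 1)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        states, _ = self.lstm(self.emb(ids))               # (B, T, dim): all per-step states
        weights = torch.softmax(self.attn(states), dim=1)  # (B, T, 1): attention over steps
        return (weights * states).sum(dim=1)               # (B, dim): fused representation

fused = MultiStateFusion(vocab=10000)(torch.randint(0, 10000, (2, 12)))
print(fused.shape)  # torch.Size([2, 128])
```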
Key words
sentence completion / syntactic analysis / sequence modeling / deep learning
Funding
National Key Research and Development Program of China (2018YFB1005100)