Automatic Sentence Completion Based on Deep Learning
CHEN Zhigang¹,², HUA Lei¹, LIU Quan¹,²,³, YIN Kun¹, WEI Si¹,², HU Guoping¹,²
1. iFLYTEK Research, IFLYTEK CO., LTD., Hefei, Anhui 230088, China; 2. State Key Laboratory of Cognitive Intelligence, Hefei, Anhui 230088, China; 3. School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, China
Abstract: This paper proposes an automatic sentence completion method that combines dependency parsing with deep neural networks. First, a sequence modeling method based on syntactic information expansion is proposed, which exploits syntactic information while preserving the efficiency of sequence models. On this basis, a candidate-answer ranking model is trained using the learning-to-rank paradigm. Second, to compensate for the loss of fine-grained detail in whole-sequence modeling, an automatic sentence completion model based on the fusion of multiple language-model states is proposed. Finally, a multi-source information fusion model is designed that combines sentence representations, dependency syntax, and multi-state information. This paper also constructs an English sentence completion dataset. Experimental results on this dataset show that the dependency syntax expansion model achieves an absolute improvement of 11% over the baseline sequence modeling methods, the language-model-based state ranking technique achieves an absolute improvement of 9.3% over the baseline model, and the final multi-source information fusion model achieves the best accuracy of 76.9% on the test set.
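To make the three ideas summarized above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: token embeddings are expanded with dependency-relation embeddings (syntactic expansion), all LSTM hidden states are fused by attention rather than taking only the final state (multi-state fusion), and the resulting sentence scores are trained with a pairwise margin loss, one common learning-to-rank objective. All class names, dimensions, and the margin value are illustrative assumptions.

```python
# A minimal, hypothetical sketch of the model family described in the
# abstract; hyperparameters and names are assumptions, not the paper's.
import torch
import torch.nn as nn

class SyntaxExpandedRanker(nn.Module):
    def __init__(self, vocab_size, num_dep_relations,
                 word_dim=300, dep_dim=50, hidden_dim=512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Dependency-relation label of each token (syntactic expansion).
        self.dep_emb = nn.Embedding(num_dep_relations, dep_dim)
        self.lstm = nn.LSTM(word_dim + dep_dim, hidden_dim, batch_first=True)
        # Attention weights for fusing all hidden states (multi-state fusion).
        self.att = nn.Linear(hidden_dim, 1)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, words, deps):
        # words, deps: (batch, seq_len) integer ids
        x = torch.cat([self.word_emb(words), self.dep_emb(deps)], dim=-1)
        states, _ = self.lstm(x)                    # (batch, seq, hidden)
        alpha = torch.softmax(self.att(states), 1)  # attention over steps
        fused = (alpha * states).sum(dim=1)         # (batch, hidden)
        return self.score(fused).squeeze(-1)        # one score per sentence

def pairwise_loss(model, pos_batch, neg_batch, margin=1.0):
    # Learning to rank: the sentence completed with the correct candidate
    # should outscore the same sentence completed with a distractor.
    s_pos = model(*pos_batch)
    s_neg = model(*neg_batch)
    return torch.relu(margin - (s_pos - s_neg)).mean()
```

In such a setup, each multiple-choice question yields one positive sequence and several negative ones, and at test time the candidate whose completed sentence receives the highest score is selected as the answer.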