Abstract: Paraphrase identification is an important sentence-level semantic understanding task. In this paper, we present a lightweight, memory-based, single-layer recurrent neural network that incorporates semantic role labeling knowledge for English paraphrase identification. Using a single recurrent layer mitigates the vanishing- and exploding-gradient problems that are aggravated by stacking many layers, which makes the model easier to train. An external memory unit and semantic role features help store the semantic relationships that hold between the two sentences at different levels. On the test set of the Microsoft Research Paraphrase Corpus, the model achieves an F1 score of 84.3%. Experiments show that semantic role labeling knowledge does help paraphrase identification, and that the lightweight model performs comparably to multilayer neural network models of the same kind.