Abstract: Paraphrase recognition can be regarded as a sub-problem of textual entailment recognition. The problem is difficult because relying on term frequency or syntactic information alone is error-prone: the same set of words can form sentences with entirely different meanings, and sentences with similar parse trees can also differ in meaning. In this paper we present a new approach to paraphrase identification based on Semantic Role Labeling (SRL). We first label sentences with semantic roles and then extract features that partly represent the meaning of each sentence. In doing so, we also take the particular characteristics of news sentences into account. Our experiments demonstrate the effectiveness of the approach.
Key words: natural language processing; semantic role labeling; paraphrase recognition