Sentence Fusion for Complex Problems in Reading Comprehension
TAN Hongye1,2, ZHAO Honghong1, LI Ru1,2
1. School of Computer and Information Technology of Shanxi University, Taiyuan, Shanxi 030006, China; 2. Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing of Shanxi University, Taiyuan, Shanxi 030006, China
Abstract: Reading comprehension systems are a research focus in natural language processing. In these systems, both answer extraction and sentence fusion are necessary for answering complex problems. This paper focuses on sentence fusion techniques for complex problems and presents a method that considers sentence importance, relevance to the query, and sentence readability. The method first chooses the parts to be fused based on sentence division and word salience. Then, repeated content is merged by word alignment. Finally, sentences are generated by integer linear optimization, which uses dependency relations, a language model, and word salience. Experiments on reading comprehension datasets from college entrance examinations achieve an F-measure of 82.62%.
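The three steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the salience formula (clause frequency plus a query bonus), the exact-match merging that stands in for word alignment, and the brute-force search that stands in for integer linear optimization are all assumptions made for illustration, and dependency relations and the language model are left out entirely.

```python
import re
from collections import Counter
from itertools import combinations


def split_clauses(sentence):
    """Sentence division: split at commas, semicolons and full stops."""
    parts = re.split(r"[,;.]", sentence)
    return [p.strip() for p in parts if p.strip()]


def merge_repeats(clauses):
    """Crude stand-in for word alignment: drop clauses whose lowercased
    token sequence has already been seen."""
    seen, merged = set(), []
    for clause in clauses:
        key = tuple(clause.lower().split())
        if key not in seen:
            seen.add(key)
            merged.append(clause)
    return merged


def word_salience(clauses, query):
    """Assumed salience: word frequency across clauses, plus 1 if the
    word also occurs in the query."""
    counts = Counter(w for c in clauses for w in c.lower().split())
    query_words = set(query.lower().split())
    return {w: n + (1 if w in query_words else 0) for w, n in counts.items()}


def select_clauses(clauses, salience, max_words):
    """Brute-force stand-in for integer linear optimization: pick the
    clause subset that maximizes the summed salience of its distinct
    words while staying within a word budget."""
    best, best_score = (), -1
    for r in range(1, len(clauses) + 1):
        for subset in combinations(range(len(clauses)), r):
            words = [w for i in subset for w in clauses[i].lower().split()]
            if len(words) > max_words:
                continue
            score = sum(salience[w] for w in set(words))
            if score > best_score:
                best, best_score = subset, score
    return [clauses[i] for i in best]


# Hypothetical candidate answer sentences and query.
sentences = [
    "The river floods in spring, the village moves uphill",
    "The river floods in spring; farmers lose crops",
]
query = "why does the village move"

clauses = merge_repeats([c for s in sentences for c in split_clauses(s)])
salience = word_salience(clauses, query)
fused = select_clauses(clauses, salience, max_words=9)
print(", ".join(fused))
```

In this toy run the repeated clause is kept once, and the word budget forces a choice among the remaining clauses. A real system would replace the exhaustive search with an ILP solver and add grammaticality constraints.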