Abstract: Sentence ordering is an important task in multi-document summarization. We first use a neural network model to combine five proposed criteria for sentence connection: chronology, probabilistic, topical-closeness, precedence, and succession. We then propose a sentence ordering method based on a Markov random walk model, which determines the final order of the sentences from the strength of the connections between them. Evaluated with a semi-automatic measure and a subjective measure, the proposed method produces clearly better sentence orderings than the baseline algorithms in our experiments.
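To make the random-walk idea concrete, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not spell out. It assumes a hypothetical matrix `strength`, where `strength[i][j]` is a non-negative score (for example, a neural network output) that sentence i should precede sentence j. The damped, PageRank-style walk and the parameters `damping` and `iters` are likewise assumptions made for illustration.

```python
def random_walk_order(strength, damping=0.85, iters=100):
    """Order sentences with a damped random walk over pairwise strengths.

    strength[i][j] is an assumed non-negative score for
    "sentence i should precede sentence j".
    """
    n = len(strength)
    if n == 0:
        return []
    # The walker steps from j back to i with probability proportional to
    # strength[i][j], so mass flows toward sentences that tend to come first.
    col_sums = [sum(strength[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for j in range(n):
            if col_sums[j] == 0:
                # No predecessor evidence for j: spread its mass uniformly.
                for i in range(n):
                    new[i] += damping * scores[j] / n
            else:
                for i in range(n):
                    new[i] += damping * scores[j] * strength[i][j] / col_sums[j]
        scores = new
    # Higher score means earlier position; ties fall back to the input index.
    return sorted(range(n), key=lambda i: (-scores[i], i))


# Toy input: sentence 2 precedes 0 and 1, and sentence 0 precedes 1.
toy = [[0, 1, 0],
       [0, 0, 0],
       [1, 1, 0]]
print(random_walk_order(toy))  # [2, 0, 1]
```

Sorting by the stationary score is only one way to turn walk probabilities into an order; a greedy chain that repeatedly picks the most strongly connected next sentence would be an equally plausible reading of the abstract.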