A Sentence Ordering Method Based on a Neural Network Model

Journal of Chinese Information Processing ›› 2016, Vol. 30 ›› Issue (5): 195-202.

Review

康世泽,马宏,黄瑞阳

A Neural Network Model Based Sentence Ordering Method for Multi-document Summarization

KANG Shize, MA Hong, HUANG Ruiyang

Abstract

Sentence ordering is an important problem in multi-document summarization: a sensible ordering is essential to the readability and coherence of a summary. We first use a neural network model to fuse five previously proposed criteria into a connection strength for any pair of sentences: chronology, probability, topical closeness, precedence, and succession. We then propose a sentence ordering method based on a Markov random walk model, in which the connection strengths among all sentences jointly determine the final ordering. Ordering quality is evaluated with both a manual (subjective) measure and a semi-automatic measure, and the experiments show that the proposed method produces clearly better sentence orderings than the baseline algorithms.
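To make the second step more concrete, the sketch below shows one way a random walk over pairwise connection strengths could yield an ordering. It is not the authors' algorithm, which the abstract does not specify. It assumes a hypothetical matrix `strength[i][j]` (for example, the output of the neural network step) that scores how strongly sentence `i` should precede sentence `j`. The function name `random_walk_order`, the `smoothing` term, and the lazy-walk construction are all illustrative assumptions.

```python
from typing import List


def random_walk_order(strength: List[List[float]],
                      smoothing: float = 1e-3,
                      iterations: int = 500) -> List[int]:
    """Order sentences with a lazy random walk over pairwise strengths.

    Assumption: strength[i][j] >= 0 scores "sentence i precedes sentence j".
    The walker steps from j to i at a rate proportional to strength[i][j],
    so probability mass accumulates on sentences that tend to come first.
    """
    n = len(strength)
    if n == 0:
        return []
    # Small smoothing on every off-diagonal entry keeps the chain irreducible.
    w = [[(strength[i][j] + smoothing) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # out_rate[j]: total weight with which other sentences claim to precede j,
    # i.e. the rate at which the walker leaves j.
    out_rate = [sum(w[i][j] for i in range(n)) for j in range(n)]
    # Scaling by twice the largest rate leaves a self-loop of at least 1/2.
    scale = 2.0 * max(out_rate) or 1.0
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [0.0] * n
        for j in range(n):
            # Mass that stays on j via the self-loop.
            new_scores[j] += scores[j] * (1.0 - out_rate[j] / scale)
            # Mass that moves from j to each candidate predecessor i.
            for i in range(n):
                if i != j:
                    new_scores[i] += scores[j] * w[i][j] / scale
        scores = new_scores
    # A higher score means the sentence is placed earlier.
    return sorted(range(n), key=lambda i: -scores[i])


# Hypothetical strengths for three sentences whose intended order is 2, 0, 1.
example = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.2],
    [0.9, 0.8, 0.0],
]
print(random_walk_order(example))  # [2, 0, 1]
```

The walk uses unnormalized rates with a self-loop rather than normalizing each sentence's outgoing weights to sum to one. Normalizing would discard the magnitude of the strengths, and on noisy inputs such as the example above it can rank a weakly supported sentence ahead of a strongly supported one.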

Cite this article

KANG Shize, MA Hong, HUANG Ruiyang. A Neural Network Model Based Sentence Ordering Method for Multi-document Summarization. Journal of Chinese Information Processing. 2016, 30(5): 195-202.


Received: 2015-10-15
Published: 2016-10-15
Issue date: 2016-10-15
