Chinese Multi-Document Opinion Summarization via PageRank

LIN Liyuan, WANG Zhongqing, LI Shoushan, ZHOU Guodong

Journal of Chinese Information Processing, 2014, Vol. 28, Issue (2): 85-90.
Information Extraction and Text Mining

Abstract

Opinion summarization condenses and refines opinionated text to produce a summary of the opinions it expresses. This study addresses multi-document opinion summarization, focusing on generating a summary from multiple online reviews of the same product. We first collect and annotate a Chinese multi-document opinion summarization corpus built from product reviews. We then propose a PageRank-based framework that incorporates sentiment information, so that the ranking accounts for both the topic relation and the opinion relation among reviews. Experiments on the corpus show that the proposed method achieves significantly higher ROUGE scores than existing approaches.
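The abstract does not give the ranking formula, so the following is only a minimal sketch of one way a PageRank over review sentences could combine topic and sentiment information. It assumes that topic relatedness is the cosine similarity between bag-of-words vectors and that sentiment enters as a teleport (personalization) distribution. The function names `cosine` and `sentiment_pagerank`, the `sentiment_scores` input, and the damping value 0.85 are all hypothetical choices for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentiment_pagerank(sentences, sentiment_scores, damping=0.85,
                       iters=100, tol=1e-8):
    """Rank tokenized sentences with a sentiment-biased PageRank.

    sentences: list of token lists (already segmented).
    sentiment_scores: one non-negative sentiment strength per sentence
        (assumed to come from some external lexicon or classifier).
    Returns one score per sentence; the scores sum to 1.
    """
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokens) for tokens in sentences]

    # Assumed topic relation: edge weight is cosine similarity, no self-loops.
    w = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]

    # Assumed opinion relation: the random jump prefers opinionated sentences.
    total = sum(sentiment_scores)
    if total > 0:
        prior = [s / total for s in sentiment_scores]
    else:
        prior = [1.0 / n] * n

    rank = [1.0 / n] * n
    for _ in range(iters):
        # Mass held by sentences with no outgoing edges is redistributed
        # according to the sentiment prior.
        dangling = sum(rank[i] for i in range(n) if out[i] == 0)
        new_rank = []
        for j in range(n):
            walk = sum(rank[i] * w[i][j] / out[i]
                       for i in range(n) if out[i] > 0)
            new_rank.append((1 - damping) * prior[j]
                            + damping * (walk + dangling * prior[j]))
        delta = sum(abs(x - y) for x, y in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Under these assumptions, an extractive summary would take the highest-scoring sentences up to a length limit. Putting the sentiment signal into the edge weights instead of the teleport vector is an equally plausible design.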

Key words

summarization / sentiment / multi-document

Cite this article

LIN Liyuan, WANG Zhongqing, LI Shoushan, ZHOU Guodong. Chinese Multi-Document Opinion Summarization via PageRank. Journal of Chinese Information Processing. 2014, 28(2): 85-90

Funding

National Natural Science Foundation of China (61003155, 60873150); Open Project Fund of the National Laboratory of Pattern Recognition