该文研究有监督学习方法在多文档文本情感摘要中的应用。利用从亚马逊中文网和亚马逊英文网上收集的产品评论语料,抽取文本内特征、PageRank特征、情感特征和评论质量特征,基于有监督方法进行多文档文本情感摘要抽取。实验结果表明有监督学习方法比无监督学习方法在ROUGE值上有显著的提高,情感特征和评论质量特征均有助于文本情感摘要。
Abstract
This paper investigates the application of supervised learning methods in multi-document opinion summarization. We use the corpus collected from Amazon, extract text features, PageRank feature, opinion features and reviews quality features, and, finally, generate the multi-document opinion summarization based on supervised learning method. Experimental results show that the ROUGE values are significantly improve by using supervised learning method than that unsupervised learning method. The opinion features and reviews quality features are helpful for summarization.
关键词
情感摘要 /
评论质量 /
情感特征 /
有监督学习 /
最大熵分类器
{{custom_keyword}} /
Key words
opinion summarization /
comments quality /
emotional features /
supervised learning /
maximum entropy classifier
{{custom_keyword}} /
{{custom_sec.title}}
{{custom_sec.title}}
{{custom_sec.content}}
参考文献
[1] Hu M, Liu B. Mining and summarizing customer reviews[C]//Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004: 168-177.
[2] Titov I, McDonald R. A joint model of text and aspect ratings for sentiment summarization [J]. Urbana, 2008, 51: 61801.
[3] Carenini G, Cheung J C K, Pauls A. Multi-document summarization of evaluative text [J]. Computational Intelligence, 2013, 29(4): 545-576.
[4] Carenini G, Cheung J C K. Extractive vs. NLG-based abstractive summarization of evaluative text: The effect of corpus controversiality[C]//Proceedings of the Fifth International Natural Language Generation Conference. Association for Computational Linguistics, 2008: 33-41.
[5] Lerman K, Blair-Goldensohn S, McDonald R. Sentiment summarization: evaluating and learning user preferences[C]//Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2009: 514-522.
[6] Lerman K, McDonald R. Contrastive summarization: an experiment with consumer reviews[C]//Proceedings of human language technologies: The 2009 annual conference of the North American chapter of the association for computational linguistics, companion volume: Short papers. Association for Computational Linguistics, 2009: 113-116.
[7] Nishikawa H, Hasegawa T, Matsuo Y, et al. Opinion summarization with integer linear programming formulation for sentence extraction and ordering[C]//Proceedings of the 23rd International Conference on Computational Linguistics: Posters. Association for Computational Linguistics, 2010: 910-918.
[8] Wang D, Liu Y. A pilot study of opinion summarization in conversations[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 2011: 331-339.
[9] 林莉媛,王中卿,李寿山等. 基于PageRank的中文多文档文本情感摘要[J]. 中文信息学报.2014, 28(2): 85-90.
[10] Liu F, Liu F, Liu Y. Automatic keyword extraction for the meeting corpus using supervised approach and bigram expansion[C]//Proceedings of spoken Language Technology Workshop, 2008. SLT 2008. IEEE. IEEE, 2008: 181-184.
[11] Wong K F, Wu M, Li W. Extractive summarization using supervised and semi-supervised learning[C]//Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 2008: 985-992.
[12] Li C, Qian X, Liu Y. Using supervised bigram-based ILP for extractive summarization[C]//Proceedings of ACL.2013: 1004-1013.
[13] Shen D, J Sun, H Li, et al. Document Summarization using Conditional Random Fields[C]//Proceeding of the IJCAI-07.
[14] Hong Y, Lu J, Yao J, et al. What reviews are satisfactory: novel features for automatic helpfulness voting[C]//Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval. ACM, 2012: 495-504.
{{custom_fnGroup.title_cn}}
脚注
{{custom_fn.content}}
基金
国家863计划前沿技术研究类项目(2012AA011102);NSFC面上项目(61273320)
{{custom_fund}}