Abstract
In recent years, with the rapid development of e-commerce platforms, more and more people shop online and post reviews of the products they buy. For longer reviews, a summary lets users quickly grasp a product's advantages and disadvantages. Most mainstream abstractive summarization models consider only the sequential information of the text, yet for a product review the attribute and sentiment information it carries is crucial. To let the model learn the attribute and sentiment information in reviews, this paper proposes an abstractive summarization method that incorporates both. The method integrates this information effectively by embedding the different kinds of sentiment and attribute signals into the encoding stage of the generation model. Experiments show that the method generates higher-quality summaries, with substantial improvements on the ROUGE evaluation metrics.
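The abstract does not give implementation details, so the following is only a minimal sketch, in PyTorch, of one common way to realize the idea described above: per-token attribute and sentiment labels are embedded and summed with the token embeddings before a BiLSTM encoder. All names, dimensions, and label sets here are hypothetical assumptions, not the paper's actual configuration.

import torch
import torch.nn as nn

class AttrSentimentEncoder(nn.Module):
    # Hypothetical sketch: attribute and sentiment embeddings are added
    # to token embeddings so the encoder sees these signals at the
    # encoding stage, in the spirit of the method described above.
    def __init__(self, vocab_size, num_attrs, num_sents, d_model=256, hidden=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.attr_emb = nn.Embedding(num_attrs, d_model)  # product aspect tags (assumed)
        self.sent_emb = nn.Embedding(num_sents, d_model)  # e.g. positive/negative/neutral (assumed)
        self.encoder = nn.LSTM(d_model, hidden, batch_first=True, bidirectional=True)

    def forward(self, token_ids, attr_ids, sent_ids):
        # All three id tensors share shape (batch, seq_len); attribute and
        # sentiment labels would come from an upstream tagging step.
        x = self.tok_emb(token_ids) + self.attr_emb(attr_ids) + self.sent_emb(sent_ids)
        outputs, state = self.encoder(x)
        return outputs, state  # outputs would feed an attention-based decoder

# Toy usage with random ids: 2 reviews of 30 tokens each.
enc = AttrSentimentEncoder(vocab_size=50000, num_attrs=20, num_sents=3)
out, _ = enc(torch.randint(0, 50000, (2, 30)),
             torch.randint(0, 20, (2, 30)),
             torch.randint(0, 3, (2, 30)))
print(out.shape)  # torch.Size([2, 30, 512]); 2 * hidden from the bidirectional LSTM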
Keywords
abstractive summarization /
sentiment and attribute information /
neural network
Funding
National Natural Science Foundation of China (61806137, 61702149)