LI Yuqiang, HUANG Yu, SUN Nian, LI Lin, LIU Aihua. An Improved Topic Sentiment Model Based on User Character[J]. Journal of Chinese Information Processing, 2020, 34(7): 96-104.
An Improved Topic Sentiment Model Based on User Character
LI Yuqiang¹, HUANG Yu¹, SUN Nian¹, LI Lin¹, LIU Aihua²
1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430063, China; 2. School of Energy and Power Engineering, Wuhan University of Technology, Wuhan, Hubei 430063, China
Abstract: In recent years, social media platforms such as microblogs have attracted considerable attention in sentiment analysis. However, most existing topic sentiment models do not fully consider users' personality characteristics, which leads to unsatisfactory sentiment analysis results. Building on the JST model, this paper proposes a time-based personality modeling method that incorporates users' personality features into the topic sentiment model. Because microblog data contains a large amount of platform-specific information such as emoticons, the paper also integrates emoticons into the JST model to improve sentiment recognition on microblogs. The result is an improved joint topic sentiment model named UC-JST (Joint Sentiment/Topic Model Based on User Character). Experiments on a real Sina Weibo dataset show that UC-JST achieves higher sentiment classification accuracy than four typical unsupervised sentiment classification methods: JST, TUS-LDA, JUST, and TSMMF.