Abstract
This paper explores a method for building a social emotion lexicon from microblogs and applies it to the analysis of social emotions in public events. First, a small-scale baseline emotion lexicon is compiled by hand. Then Word2vec, a deep learning tool, is applied with incremental learning to microblog corpora on trending social events to expand the baseline lexicon; the candidates are matched against the HowNet lexicon and manually screened to produce the final emotion lexicon. Next, lexicon-based and SVM-based emotion analysis are each applied to an annotated experimental corpus. The comparison shows that the lexicon-based method outperforms the SVM-based one, with average precision and recall higher by 13.9% and 1.5%, respectively. Finally, the constructed lexicon is used to analyze emotions in trending public events, and the experimental results indicate that the method is effective.
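The expansion and lexicon-based classification steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors stand in for Word2vec embeddings, and the emotion labels, the similarity threshold, the number of rounds, and the function names `expand_lexicon` and `classify` are all assumptions made for illustration. The HowNet matching and manual screening stages are not shown.

```python
import math

# Hypothetical 2-D vectors standing in for Word2vec embeddings.
VECTORS = {
    "happy": (0.9, 0.1),
    "glad": (0.85, 0.2),
    "delighted": (0.8, 0.15),
    "angry": (0.1, 0.9),
    "furious": (0.15, 0.85),
    "outraged": (0.2, 0.8),
    "table": (0.6, 0.6),
}

# Hypothetical hand-built baseline lexicon: emotion label -> seed words.
SEED = {"joy": {"happy"}, "anger": {"angry"}}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_lexicon(seed, vectors, threshold=0.95, max_rounds=3):
    """Grow each emotion class round by round (assumed threshold and rounds).

    A candidate joins a class when it is close enough to any word already
    in that class; words added in one round can attract more in the next.
    """
    lexicon = {emotion: set(words) for emotion, words in seed.items()}
    for _ in range(max_rounds):
        grew = False
        for emotion, words in lexicon.items():
            known = set().union(*lexicon.values())
            for candidate, vec in vectors.items():
                if candidate in known:
                    continue
                if any(cosine(vec, vectors[w]) >= threshold
                       for w in words if w in vectors):
                    words.add(candidate)
                    known.add(candidate)
                    grew = True
        if not grew:  # stop early once no class gains a word
            break
    return lexicon


def classify(tokens, lexicon):
    """Return the emotion with the most lexicon hits, or None if no hits."""
    counts = {emotion: sum(t in words for t in tokens)
              for emotion, words in lexicon.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


expanded = expand_lexicon(SEED, VECTORS)
print(classify(["i", "am", "furious"], expanded))
```

In this toy setup the neutral word "table" stays out of both classes because its similarity to every class member falls below the assumed threshold. With real embeddings the threshold would need tuning, which is one reason a dictionary check and manual screening follow the expansion step.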
Key words: microblogging; social emotions; lexicon; emotional analysis
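Funding
National Natural Science Foundation of China (61572145); Science and Technology Planning Project of Guangdong Province (2014A040401083); Humanities and Social Sciences Youth Project of the Ministry of Education (14YJC870021); Guangdong Province Philosophy and Social Sciences 12th Five-Year Plan Project (GD14YXW02)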