Abstract
Public opinion analysis of social networks is an emerging research direction, and identifying the sentiment orientation of micro-blog topics is one of its most active problems. Based on the content features of Chinese micro-blog posts and the forwarding and comment relations between them, we construct a sentiment lexicon, an internet-slang dictionary, and an emoticon library. We then design two algorithms for identifying the sentiment orientation of micro-blog topics: one based on phrase paths and one based on multiple features. We further optimize the multi-feature algorithm using the forwarding and comment relations between micro-blog posts. In our experiments, the optimized algorithm reaches a micro-averaged precision of 85.3% and an F-measure of 79.4%.
Key words
Micro-blog post /
Micro-blog topic /
Sentiment analysis /
Opinion analysis /
Sentiment orientation
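Funding
National Basic Research Program of China (973 Program) (2013CB329605); National Natural Science Foundation of China (61132009); National Key Technology R&D Program of China (2012BAH14F06)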