Abstract
Microblogs have increasingly become a large and complex platform for public opinion on the Internet. Analyzing sentiment trends for specific topics on microblogs is particularly important for understanding online public opinion and for analyzing product sales trends. In this paper, we use microblogs to predict public sentiment trends for real-world events. First, because microblog posts have sparse features and lack context, we use hyponymy relations between words to semantically expand each post. Second, we construct a double-layer classification method from semantic features and affective commonsense knowledge to determine the sentiment of each post. Finally, we apply time-series sentiment analysis to the microblog posts about a specific event over consecutive time periods to predict its public sentiment trend. Experimental results show that the accuracy of the proposed sentiment analysis method is clearly higher than that of traditional classification methods, and that the sentiment trends predicted on this basis are largely consistent with how the events actually developed.
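The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the lexicons, the features, or how the two classifier layers divide the work, so the toy dictionaries (`HYPERNYMS`, `COMMONSENSE_POLARITY`, `POLAR_WORDS`), the function names, and the rule "commonsense lookup first, polar-word vote as fallback" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy resources; the paper's actual lexicons are not given here.
HYPERNYMS = {"iphone": "phone", "earthquake": "disaster"}
COMMONSENSE_POLARITY = {"disaster": -1, "rescue": 1}
POLAR_WORDS = {"good": 1, "great": 1, "bad": -1, "terrible": -1}


def expand(tokens):
    """Semantic expansion: append the hypernym of each token that has one."""
    return tokens + [HYPERNYMS[t] for t in tokens if t in HYPERNYMS]


def classify(tokens):
    """Two-layer stand-in: commonsense lookup first, then a polar-word vote."""
    expanded = expand(tokens)
    score = sum(COMMONSENSE_POLARITY.get(t, 0) for t in expanded)
    if score == 0:  # layer 1 is undecided, so fall back to layer 2
        score = sum(POLAR_WORDS.get(t, 0) for t in expanded)
    return (score > 0) - (score < 0)  # 1 positive, -1 negative, 0 neutral


def daily_trend(posts):
    """Mean polarity per period, in period order; posts are (period, tokens)."""
    by_day = defaultdict(list)
    for day, tokens in posts:
        by_day[day].append(classify(tokens))
    return [(day, sum(v) / len(v)) for day, v in sorted(by_day.items())]


# Made-up posts for two consecutive periods of one event.
posts = [
    ("d1", ["earthquake", "terrible"]),
    ("d1", ["rescue", "great"]),
    ("d2", ["rescue", "good"]),
]
trend = daily_trend(posts)
```

In this sketch, expansion lets the first post match the commonsense entry for "disaster" even though the post itself only contains "earthquake", which is the kind of help semantic expansion is meant to give sparse, context-poor posts.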
Key words
microblog /
sentiment analysis /
semantic expansion /
affective commonsense knowledge /
public sentiment trend
Funding
National Natural Science Foundation of China (61632011, 61562080); Natural Science Foundation of Liaoning Province (201202031, 201402003)