Abstract: Micro-blog sentiment analysis is a key technique in public opinion research on social networks. Micro-blog emoticons and sentiment words are intuitive and are called explicit emotion features, whereas the semantics of the content are called implicit features, which are sometimes very important for discriminating micro-blog emotion. This paper therefore proposes a new systematic methodology for sentiment analysis that uses both explicit and implicit emotion features. First, the sentiment analysis dictionary, the glossary of social networking terms, and the emoticon library are initialized. Then, the frequent word sets of the micro-blog text are defined. Based on the word feature set, the initial micro-blog clusters are generated directly from the maximum frequent itemsets. Furthermore, to solve the problem of micro-blogs overlapping between multiple initial clusters, an efficient elimination method is proposed that employs an extended membership degree based on short-message semantics. Finally, a semantic similarity matrix is defined for each separated cluster, and hierarchical sentiment clustering of the micro-blogs is performed on that basis. Taking the well-known NLPCC2013 contest in China as an example, the efficiency of the proposed method is demonstrated by comparative experiments. A real-world case study also shows the change in emotion in Chinese micro-blogs about the Malaysia Airlines disappearance incident from March 8 to April 8, 2014.
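The clustering steps summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the brute-force itemset search, and the use of mean Jaccard similarity as a stand-in for the extended membership degree are all assumptions made for illustration, and the final hierarchical clustering step is omitted.

```python
from itertools import combinations


def frequent_word_sets(docs, min_support, max_size=2):
    """Map each frequent word set to the indices of the posts containing it.

    Brute-force enumeration, suitable only for tiny illustrative inputs.
    """
    vocab = sorted(set().union(*docs))
    frequent = {}
    for size in range(1, max_size + 1):
        for cand in combinations(vocab, size):
            cover = frozenset(i for i, d in enumerate(docs) if set(cand) <= d)
            if len(cover) >= min_support:
                frequent[frozenset(cand)] = cover
    return frequent


def maximal_sets(frequent):
    """Keep only frequent word sets that have no frequent proper superset."""
    return {s: c for s, c in frequent.items()
            if not any(s < t for t in frequent)}


def jaccard(a, b):
    """Jaccard similarity of two word sets."""
    return len(a & b) / len(a | b)


def membership(i, cover, docs):
    """Hypothetical membership score: mean Jaccard similarity between
    post i and the other posts covered by the same initial cluster."""
    others = [j for j in cover if j != i]
    if not others:
        return 0.0
    return sum(jaccard(docs[i], docs[j]) for j in others) / len(others)


def resolve_overlap(docs, clusters):
    """Assign every post to exactly one of the initial clusters it falls in."""
    assigned = {label: set() for label in clusters}
    for i in range(len(docs)):
        candidates = [label for label, cover in clusters.items() if i in cover]
        if not candidates:
            continue
        # Highest membership wins; ties are broken by label for determinism.
        best = max(candidates,
                   key=lambda label: (membership(i, clusters[label], docs),
                                      sorted(label)))
        assigned[best].add(i)
    return assigned


# Toy posts, already segmented into word sets (made-up data).
docs = [
    {"flight", "missing"},
    {"flight", "missing", "pray"},
    {"pray", "hope"},
    {"pray", "hope", "flight"},
]
initial = maximal_sets(frequent_word_sets(docs, min_support=2))
final = resolve_overlap(docs, initial)
```

In this toy data, posts 1 and 3 each fall into two initial clusters, and the overlap step leaves each of them in the cluster whose other members they resemble most. A semantic similarity measure, such as one derived from a sentiment lexicon, would replace `jaccard` in a fuller version.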