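As an emerging social networking platform, the microblog has gradually become a new venue where the public posts personal information, obtains real-time information, and expresses personal opinions. To address the problem of judging the sentiment orientation of microblog posts, this paper proposes a method for Chinese microblog sentiment orientation analysis based on sense group partition (STDSG). The concept of the sense group is introduced and a microblog sense group partition algorithm is proposed. Based on the relationships between sense groups, and taking into account the influence of negation words, degree words, and punctuation on sentiment orientation analysis, a method for computing the sentiment orientation of microblog sense groups is proposed. On the given dataset, the experimental accuracy reaches 80.1%, and overall performance is better than that of the sentiment-lexicon-based method and the support-vector-machine-based method.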
Abstract
Microblogs, a new kind of social networking platform, are rich in people's opinions. To identify the sentiment orientation of microblog posts, this paper proposes an algorithm based on sense group partition. After introducing the concept of the sense group, we propose an algorithm for sense group partition. Then, taking into account negation words, degree words, and punctuation, we establish a formula for sentiment identification based on the relationships between sense groups. Experiments show an accuracy of 80.1%, outperforming both the sentiment-lexicon-based approach and the SVM-based method.
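The abstract does not give the partition rules or the scoring formula, so the following is only a minimal sketch of the general idea, not the STDSG method itself. It assumes pre-segmented text (tokens separated by spaces), approximates sense groups by splitting at clause punctuation, and uses tiny hypothetical lexicons and weights (`SENTIMENT`, `NEGATIONS`, `DEGREES`, `EXCLAMATION_BOOST`) chosen purely for illustration.

```python
import re

# Hypothetical toy lexicons and weights; a real system would load full dictionaries.
SENTIMENT = {"喜欢": 1.0, "好": 1.0, "失望": -1.0, "差": -1.0}
NEGATIONS = {"不", "没有"}
DEGREES = {"非常": 2.0, "很": 1.5, "有点": 0.5}
EXCLAMATION_BOOST = 1.5  # assumed amplification for a group ending in "!" or "!"

PUNCT = ",,。.!!??;;"
GROUP_RE = re.compile(f"[^{PUNCT}]+[{PUNCT}]?")


def split_groups(text):
    """Approximate sense groups: clause-like chunks, each keeping its closing punctuation."""
    return [p.strip() for p in GROUP_RE.findall(text) if p.strip()]


def score_group(group):
    """Score one group: degree words scale and negations flip the next sentiment word."""
    end = group[-1] if group[-1] in PUNCT else ""
    tokens = group.rstrip(PUNCT).split()
    score, weight, sign = 0.0, 1.0, 1.0
    for tok in tokens:
        if tok in NEGATIONS:
            sign = -sign
        elif tok in DEGREES:
            weight *= DEGREES[tok]
        elif tok in SENTIMENT:
            score += sign * weight * SENTIMENT[tok]
            weight, sign = 1.0, 1.0  # modifiers apply only to the next sentiment word
    if end in ("!", "!"):
        score *= EXCLAMATION_BOOST
    return score


def classify(text):
    """Sum the group scores and map the total to a polarity label."""
    total = sum(score_group(g) for g in split_groups(text))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("我 非常 喜欢 这个 手机 !"))
    print(classify("服务 不 好 。"))
```

Summing group scores with equal weight is the simplest possible combination. The paper instead derives the result from the relationships between sense groups, which this sketch does not model.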
Keywords
Microblog /
sense group /
sentiment orientation
Key words
Micro-blog /
sense group /
sentiment orientation
Funding
Supported by the National Natural Science Foundation of China (61203242)