Research on Automatic Emotional Word Detection and Polarity Weighting Algorithm

ZHANG Huaping; LI Hengxun; LI Qingmin

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 3: 48-54.
Section: Language Analysis and Computation

Abstract

The rapid development of e-commerce and social networking applications on the Internet has produced a large volume of user reviews. Sentiment polarity analysis emerged to meet the need to process these reviews quickly. A sentiment lexicon underlies all kinds of sentiment polarity recognition algorithms, and a comprehensive lexicon with reasonable weights can often solve sentiment analysis problems simply, quickly, and effectively. However, existing lexicons are limited in size, new sentiment words keep appearing online, language use is informal, and manual compilation is time-consuming and laborious. Existing methods for collecting sentiment words are relatively complex and strongly domain-dependent, so the words they collect extend poorly to other domains. This paper proposes an automatic emotion weight (AEW) algorithm that mines potential sentiment words and computes their polarity weights; the algorithm is independent of the application domain and scales well. The method exploits co-occurrence and, based on the naive Bayes formula, detects unknown sentiment words and judges their polarity from the value of their emotion weight. It can therefore extend a sentiment lexicon effectively and further quantify an existing one. Building on the theoretical work, we ran experiments on three review corpora from JD.com, douban.com, and dianping.com. Precision was generally above 90% on all three, which verifies the effectiveness and practicality of the method and provides a knowledge base for sentiment polarity analysis.
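The abstract names the ingredients (co-occurrence with known sentiment words, a naive Bayes formula, a signed emotion weight) but not the exact formula. The following is a minimal sketch of one plausible reading, not the authors' AEW algorithm. It assumes that a review containing only positive seed words counts as positive evidence (and likewise for negative), that Laplace smoothing is used, and that the weight is the posterior difference P(pos | w) - P(neg | w). The function name `emotion_weights`, the parameter `alpha`, and the toy data are all hypothetical.

```python
from collections import Counter


def emotion_weights(reviews, pos_seeds, neg_seeds, alpha=1.0):
    """Estimate a signed weight in [-1, 1] for each non-seed word.

    reviews   : list of token lists (already word-segmented)
    pos_seeds : set of known positive words (assumed seed lexicon)
    neg_seeds : set of known negative words (assumed seed lexicon)
    alpha     : Laplace smoothing constant (an assumption of this sketch)
    """
    pos_docs = neg_docs = 0
    pos_counts, neg_counts = Counter(), Counter()

    for tokens in reviews:
        words = set(tokens)
        has_pos = bool(words & pos_seeds)
        has_neg = bool(words & neg_seeds)
        if has_pos == has_neg:
            continue  # no seed word, or mixed seeds: ambiguous, skip
        candidates = words - pos_seeds - neg_seeds
        if has_pos:
            pos_docs += 1
            pos_counts.update(candidates)
        else:
            neg_docs += 1
            neg_counts.update(candidates)

    total = pos_docs + neg_docs
    weights = {}
    if total == 0:
        return weights

    prior_pos = pos_docs / total
    prior_neg = neg_docs / total
    for word in set(pos_counts) | set(neg_counts):
        # Smoothed likelihood of seeing the word in a positive/negative review.
        like_pos = (pos_counts[word] + alpha) / (pos_docs + 2 * alpha)
        like_neg = (neg_counts[word] + alpha) / (neg_docs + 2 * alpha)
        # Bayes' rule: P(pos | word) is proportional to P(word | pos) * P(pos).
        joint_pos = like_pos * prior_pos
        joint_neg = like_neg * prior_neg
        post_pos = joint_pos / (joint_pos + joint_neg)
        # Signed weight: P(pos | word) - P(neg | word).
        weights[word] = 2 * post_pos - 1
    return weights


# Toy usage with made-up reviews; the seed words are illustrative only.
demo_reviews = [
    ["好", "给力"],
    ["好", "给力", "快递"],
    ["差", "坑爹"],
    ["差", "坑爹", "快递"],
]
demo_weights = emotion_weights(demo_reviews, {"好"}, {"差"})
```

In this toy corpus, a word that appears only alongside the positive seed gets a positive weight, one that appears only alongside the negative seed gets a negative weight, and a word spread evenly across both gets a weight of zero. The sign gives the polarity and the magnitude gives the strength, which is how a lexicon could be both extended and quantified.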

Key words

sentiment lexicon / polarity weight / emotional orientation degree / emotion dictionary

Cite this article

ZHANG Huaping; LI Hengxun; LI Qingmin. Research on Automatic Emotional Word Detection and Polarity Weighting Algorithm. Journal of Chinese Information Processing. 2017, 31(3): 48-54

Funding

National Basic Research Program of China (973 Program) (2013CB329601)