Large-scale Sentiment Lexicon Collection and Its Application in Sentiment Classification

ZHAO Yanyan, QIN Bing, SHI Qiuhui, LIU Ting

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 2: 187-193.
Sentiment Analysis and Social Computing


Abstract

The rapid growth of social media, typified by microblogs, provides abundant data for sentiment analysis while placing greater demands on the performance of sentiment analysis algorithms. One important factor limiting that performance is the small size of existing sentiment lexicons, Chinese ones in particular. To address this, the paper applies a simple text-statistics method to a massive microblog corpus and builds a large-scale sentiment lexicon of 100,000 words and phrases. Taking sentiment classification, a basic task in sentiment analysis, as an example, the authors use the lexicon as features for Chinese microblog sentiment classification; the experimental results show that the large-scale lexicon helps improve classification performance.
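The abstract does not say which statistic the authors use, so the following is only a minimal sketch of one common approach, not the paper's method. It assumes posts are already tokenized and carry a noisy polarity label (for example, derived from emoticons), scores each token by the difference of its pointwise mutual information with the positive and negative classes, and then turns the resulting lexicon into a few simple features for a classifier. The function names, the `min_count` threshold, and the feature set are all hypothetical.

```python
import math
from collections import Counter


def build_lexicon(labeled_posts, min_count=2):
    """Score tokens by PMI(token, pos) - PMI(token, neg).

    labeled_posts: iterable of (tokens, label) pairs, label in {"pos", "neg"}.
    The labels are assumed to come from a noisy signal such as emoticons.
    """
    counts = {"pos": Counter(), "neg": Counter()}
    totals = Counter()
    for tokens, label in labeled_posts:
        counts[label].update(tokens)
        totals[label] += len(tokens)

    vocab = set(counts["pos"]) | set(counts["neg"])
    v = len(vocab)
    lexicon = {}
    for tok in vocab:
        c_pos, c_neg = counts["pos"][tok], counts["neg"][tok]
        if c_pos + c_neg < min_count:
            continue  # drop tokens too rare to score reliably
        # The PMI difference reduces to a log ratio of class-conditional
        # token frequencies; add-one smoothing avoids log(0).
        p_pos = (c_pos + 1) / (totals["pos"] + v)
        p_neg = (c_neg + 1) / (totals["neg"] + v)
        lexicon[tok] = math.log2(p_pos / p_neg)
    return lexicon


def lexicon_features(tokens, lexicon):
    """Summarize a post's lexicon hits as a small feature dictionary."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return {
        "n_pos": sum(1 for s in scores if s > 0),
        "n_neg": sum(1 for s in scores if s < 0),
        "sum_score": sum(scores),
        "max_score": max(scores, default=0.0),
        "min_score": min(scores, default=0.0),
    }


# Toy, hand-made data for illustration only (not from the paper).
posts = [
    (["today", "happy", "love"], "pos"),
    (["happy", "great"], "pos"),
    (["today", "sad", "hate"], "neg"),
    (["hate", "awful"], "neg"),
]
lexicon = build_lexicon(posts)
features = lexicon_features(["today", "happy", "hate", "happy"], lexicon)
```

The feature dictionary could be fed to any standard classifier alongside bag-of-words features; which classifier and feature combination the paper actually uses is not stated in the abstract.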

Key words

sentiment lexicon / sentiment analysis / sentiment classification / Chinese microblog

Cite this article

ZHAO Yanyan, QIN Bing, SHI Qiuhui, LIU Ting. Large-scale Sentiment Lexicon Collection and Its Application in Sentiment Classification. Journal of Chinese Information Processing. 2017, 31(2): 187-193


Funding

China Postdoctoral Science Foundation (2012M520740, 2013T60373, 2012M520142)