Abstract
Extracting sentiment features is a crucial step in text sentiment analysis and a major factor in the quality of its results. This paper proposes a new feature extraction method for sentiment analysis of news comments. First, candidate sentiment features are obtained by comparing the content of the comments with that of the corresponding news articles. These candidates are then extended and validated to yield general sentiment features, which are used for sentiment analysis of news comments. For finer-grained analysis, the news is first divided into topics: topic-specific feature comparison and validation procedures are designed from the topic information, topic-oriented sentiment features are selected, and these features are then used to analyze sentiment within the corresponding topic. Experiments show that this method considerably improves sentiment analysis on news comments, which are short and whose targets are relatively scattered.
Key words: computer application; Chinese information processing; sentiment analysis; feature selection; feature extension
TAO Fumin, GAO Jun, WANG Tengjiao, ZHOU Kai.
Topic Oriented Sentimental Feature Selection Method for News Comments. Journal of Chinese Information Processing. 2010, 24(3): 37-44
Funding
Supported by the National Natural Science Foundation of China (60873062)