Texture Based Sentiment Orientation Identification for Chinese Texts

XU Xinyi, LIU Gongshen

Journal of Chinese Information Processing, 2015, Vol. 29, Issue (3): 106-112.
Sentiment Analysis and Social Computing

Abstract

With the growth of the Internet, social networks and e-commerce have become a focus of public attention, and sentiment orientation analysis and mining of social network text is increasingly important. This paper proposes a sentiment orientation classification method for Chinese web text based on text texture features. The influence of several kinds of text texture features on sentiment orientation is tested, and these features are then incorporated into sentiment classification. Feature dimensionality is reduced by computing the relevance, measured as mutual information, between each kind of feature and the sentiment orientation of the text. Compared with sentiment orientation classification based on word frequency, the proposed method improves precision by about 10% on average.
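The mutual-information feature reduction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names (`perfect`, `partial`, `independent`), the toy data, and the helper functions `mutual_information` and `select_top_k` are all hypothetical, and the abstract does not specify which texture features or discretization the paper uses.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, l), c in joint.items():
        p_joint = c / n
        # p(f, l) / (p(f) * p(l)), written with raw counts
        ratio = p_joint * n * n / (feature_counts[f] * label_counts[l])
        mi += p_joint * math.log2(ratio)
    return mi


def select_top_k(feature_table, labels, k):
    """Keep the k feature names with the highest mutual information."""
    scores = {name: mutual_information(values, labels)
              for name, values in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 1 = positive text, 0 = negative text.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
feature_table = {
    "perfect":     [1, 1, 1, 1, 0, 0, 0, 0],  # fully determines the label
    "partial":     [1, 1, 1, 0, 1, 0, 0, 0],  # agrees with the label 6 of 8 times
    "independent": [1, 1, 0, 0, 1, 1, 0, 0],  # carries no label information
}

selected = select_top_k(feature_table, labels, k=2)
```

In a pipeline of this kind, the retained features would then be passed to a classifier; SVM appears among the paper's keywords, but the details of that step are not given in the abstract.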

Key words

Chinese text categorization / sentiment orientation / textures of text / SVM

Cite this article

XU Xinyi, LIU Gongshen. Texture Based Sentiment Orientation Identification for Chinese Texts. Journal of Chinese Information Processing. 2015, 29(3): 106-112


Funding

National Natural Science Foundation of China (61272441, 61171173)