Abstract: This paper proposes a weakly supervised approach to sentiment analysis. A small set of seed words forms the initial sentiment lexicon, and these seeds are used to mine candidate sentiment words from the target text. During mining, linguistic features at multiple levels are explored and the role of context is examined. The lexicon is expanded iteratively, and the final version is used to classify the sentiment of a target document. Compared with previous results on the same data, this approach achieves the best F-score even though the constructed sentiment lexicon is rather small. The experimental results also show that the approach is robust when applied to texts from different domains.
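The iterative expansion summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which linguistic features or context cues are used, so the sketch assumes a much simpler criterion, namely sentence-level co-occurrence with words already in the lexicon. The function names `expand_lexicon` and `classify`, the parameters `min_count` and `max_iters`, and the toy sentences are all hypothetical.

```python
from collections import Counter


def expand_lexicon(sentences, seeds, min_count=2, max_iters=5):
    """Iteratively grow a polarity lexicon from a few seed words.

    sentences: list of token lists (assumed to be segmented already).
    seeds: dict mapping word -> +1 (positive) or -1 (negative).
    Assumed criterion: a candidate is adopted when it appears in at least
    `min_count` polar sentences and all of them share the same polarity.
    """
    lexicon = dict(seeds)
    for _ in range(max_iters):
        votes = Counter()   # summed polarity of the sentences containing a word
        counts = Counter()  # number of polar sentences containing a word
        for tokens in sentences:
            score = sum(lexicon.get(t, 0) for t in tokens)
            if score == 0:
                continue  # no polarity evidence in this sentence yet
            polarity = 1 if score > 0 else -1
            for t in set(tokens):
                if t not in lexicon:
                    votes[t] += polarity
                    counts[t] += 1
        # Keep only candidates with enough, fully consistent evidence.
        new_words = {
            w: (1 if votes[w] > 0 else -1)
            for w in counts
            if counts[w] >= min_count and abs(votes[w]) == counts[w]
        }
        if not new_words:
            break  # the lexicon has stopped growing
        lexicon.update(new_words)
    return lexicon


def classify(tokens, lexicon):
    """Label a document by the sign of its summed word polarities."""
    score = sum(lexicon.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy data, invented for illustration only.
toy_sentences = [
    ["good", "tasty", "food"],
    ["tasty", "and", "good"],
    ["bad", "bland", "food"],
    ["bland", "and", "bad"],
    ["tasty", "fresh"],
    ["fresh", "dessert", "tasty"],
]
toy_lexicon = expand_lexicon(toy_sentences, {"good": 1, "bad": -1})
print(sorted(toy_lexicon.items()))
print(classify(["fresh", "food"], toy_lexicon))
```

Requiring unanimous polarity keeps words that occur in both positive and negative sentences out of the lexicon, which is one simple way to keep the expanded lexicon small. A real system would replace this co-occurrence test with the richer features the paper explores.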