Deep Learning for Chinese Micro-blog Sentiment Analysis
LIANG Jun1, CHAI Yumei1, YUAN Huibin2, ZAN Hongying1, LIU Ming1
1. School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China;
2. China Institute of Nuclear Information & Economics, Beijing 100048, China
Abstract: Chinese micro-blog sentiment analysis aims to discover users' attitudes towards hot events. Most current studies analyze micro-blog sentiment with traditional algorithms such as SVM and CRF, which rely on hand-engineered features. This paper explores the feasibility of performing Chinese micro-blog sentiment analysis with deep learning. We try to avoid task-specific features and instead use recursive neural networks to discover features relevant to the task. We also propose a novel sentiment polarity transition model, based on the relationship between neighboring words in a sentence, to strengthen the association within the text. The proposed method achieves performance close to that of state-of-the-art methods based on hand-engineered features, while saving a large amount of manual annotation work.