Abstract
In sentiment analysis, classification accuracy often drops when a model is applied to a domain whose knowledge differs from that of the training domain. To address this problem, this paper proposes a knowledge transfer approach that combines feature-based and instance-based transfer. First, the approach uses a tripartite graph to relate the domain-dependent feature words of the source and target domains, yielding a common semantic space in which the original vector space model is rebuilt. Then it uses a biased Markov model to relate the instances of the source and target domains, so that sentiment knowledge learned in the source domain is transferred effectively to the target domain. Experimental results that exceed those of other methods confirm the effectiveness of the approach.
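The instance-level step can be illustrated with a small sketch. The abstract does not give the model's equations, so everything here is an assumption made for illustration: the biased Markov model is approximated by a PageRank-style random walk with a bias vector, instances are assumed to be already projected into the common semantic space, and the names `cosine`, `biased_walk`, `alpha`, and the toy vectors are hypothetical.

```python
def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def biased_walk(vectors, bias, alpha=0.85, iters=100):
    """Propagate sentiment scores over an instance similarity graph.

    Assumed update rule (not taken from the paper):
        score = alpha * P @ score + (1 - alpha) * bias
    where P is the row-normalised similarity matrix.
    """
    n = len(vectors)
    # Similarity graph over all instances, without self-loops.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise so that each row is a transition distribution.
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total if total else 0.0 for w in row])

    scores = list(bias)
    for _ in range(iters):
        scores = [
            alpha * sum(transition[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * bias[i]
            for i in range(n)
        ]
    return scores


# Toy data in an assumed 2-D common semantic space.
vectors = [
    [1.0, 0.1],  # source instance, labelled positive
    [0.1, 1.0],  # source instance, labelled negative
    [0.9, 0.2],  # target instance, unlabelled
    [0.2, 0.9],  # target instance, unlabelled
]
# Bias: +1 / -1 for labelled source instances, 0 for target instances.
bias = [1.0, -1.0, 0.0, 0.0]

scores = biased_walk(vectors, bias)
predicted = ["positive" if s > 0 else "negative" for s in scores[2:]]
```

In this sketch the sign of each target instance's converged score is read as its predicted polarity. The bias term keeps the labelled source instances anchored to their known sentiment while the walk spreads that information to similar target instances.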
Key words
cross-domain sentiment analysis /
transfer learning /
biased Markov model
Funding
National Natural Science Foundation of China (61202254); University Independent Research Fund (DC201502030202, DC201502030405)