Pretrain Deep Models by Distant Supervision for Weibo Sentiment Analysis

WAN Shengxian; LAN Yanyan; GUO Jiafeng; CHENG Xueqi

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 3: 191-197.
Sentiment Analysis and Social Computing


Pretrain Deep Models by Distant Supervision for Weibo Sentiment Analysis

  • WAN Shengxian 1,2; LAN Yanyan 1,2; GUO Jiafeng 1,2; CHENG Xueqi 1,2

Abstract

Sentiment analysis of Weibo posts is important for applications such as commercial business and political elections. Conventional methods rely mainly on shallow machine learning models and depend heavily on hand-engineered features, yet sentiment features in Weibo text are difficult to extract manually. Deep learning can automatically learn hierarchical representations from raw data and has been applied to sentiment analysis. Recently proposed deep learning techniques have shown that deep models can be trained successfully given enough supervised data. In Weibo sentiment analysis, however, supervised data are usually scarce, whereas large-scale distant supervision (weakly labeled) data are widely available on Weibo. This paper proposes a distant pretrain-finetune training framework: deep models are first pre-trained on distant supervision data and then fine-tuned on supervised data. The approach uses distant supervision to learn a good initial model, then uses supervised data to further improve the model and to correct the errors introduced by distant supervision. Experiments on a Sina Weibo dataset show that this approach makes it possible to train deep models with only a small amount of supervised data, and that the resulting models outperform shallow models.
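The two-stage idea of pre-training on distant supervision and then fine-tuning on labeled data can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: a bag-of-words logistic regression stands in for the deep model, emoticons are assumed to be the source of the distant labels (the abstract does not say how they were obtained), and every name, lexicon entry, and data point below is hypothetical.

```python
import math

# Hypothetical emoticon lexicons, used only to derive noisy (distant) labels.
POSITIVE_MARKS = {":)", ":D"}
NEGATIVE_MARKS = {":(", ":'("}
ALL_MARKS = POSITIVE_MARKS | NEGATIVE_MARKS


def distant_label(tokens):
    """Return (label, words without emoticons), or None if the post has
    no emoticon or has conflicting emoticons."""
    has_pos = any(t in POSITIVE_MARKS for t in tokens)
    has_neg = any(t in NEGATIVE_MARKS for t in tokens)
    if has_pos == has_neg:
        return None
    words = [t for t in tokens if t not in ALL_MARKS]
    return (1 if has_pos else 0), words


def predict(weights, words):
    """Probability that a post is positive under a bag-of-words logistic model."""
    score = sum(weights.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-score))


def train(weights, examples, epochs, lr):
    """Plain SGD on log loss; updates `weights` in place and returns it."""
    for _ in range(epochs):
        for label, words in examples:
            error = label - predict(weights, words)
            for w in words:
                weights[w] = weights.get(w, 0.0) + lr * error
    return weights


# Stage 1: pre-train on plentiful but noisy distantly labeled posts.
raw_posts = [
    "love great phone :)",
    "great movie :)",
    "terrible service :(",
    "awful delay :(",
    "unbelievable ending :(",  # noisy: the emoticon marks "unbelievable" as negative
    "no marker here",          # skipped: no emoticon, so no distant label
]
distant_data = [
    ex for ex in (distant_label(p.split()) for p in raw_posts) if ex is not None
]
pretrained = train({}, distant_data, epochs=20, lr=0.5)

# Stage 2: fine-tune a copy of the pre-trained weights on a few clean labels.
supervised_data = [
    (1, ["unbelievable", "performance"]),
    (1, ["unbelievable", "phone"]),
    (0, ["awful", "performance"]),
]
finetuned = train(dict(pretrained), supervised_data, epochs=20, lr=0.5)
```

In this toy data, `predict(pretrained, ["unbelievable"])` is below 0.5 because the word only co-occurs with a negative emoticon, while `predict(finetuned, ["unbelievable"])` is above 0.5 after the clean labels are applied. Fine-tuning starts from the pre-trained weights rather than from scratch, so words seen only in the distant data keep what was learned there.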

Key words

sentiment analysis / deep learning / distant supervision / pretrain-finetune

Cite this article

WAN Shengxian; LAN Yanyan; GUO Jiafeng; CHENG Xueqi. Pretrain Deep Models by Distant Supervision for Weibo Sentiment Analysis. Journal of Chinese Information Processing. 2017, 31(3): 191-197

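Funding

National Basic Research Program of China (973 Program) (2014CB34040401, 2012CB316303); National Natural Science Foundation of China (61232010, 61472401, 61433014, 61425016, 61203298); Youth Innovation Promotion Association of the Chinese Academy of Sciences (20144310, 2016102)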