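Unsupervised Microblog Comment Analysis

XU Shuaishuai, DAI Xinyu, HUANG Shujian, CHEN Jiajun

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 2: 179-186.
Section: Sentiment Analysis and Social Computing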


Abstract

Finding valuable microblog comments efficiently helps readers get through comments faster and supports tasks such as public opinion analysis and text mining. Microblog posts and their comments are short, and their content tends to drift away from the topic. To address this, the paper proposes an unsupervised method for analyzing microblog comments. First, the method expands the microblog text using an internet search engine. Second, it uses a relevance measure to automatically construct positive and negative training examples, taking the most valuable and the least valuable comments respectively. Finally, it generates a comment classification model specific to that microblog post and uses the model to assess the value of each comment. Experimental results show that the method identifies valuable comments reasonably well.
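The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the search-engine expansion is replaced by snippets that the caller passes in, relevance is plain term-frequency cosine similarity, and the classifier is a small naive Bayes model (the abstract does not name the classifier used). The function names, the parameter `k`, and the sample texts are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization for the demo; Chinese text would need a word segmenter.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_training_sets(post, snippets, comments, k):
    # Step 1 (assumed stand-in): expand the post with caller-supplied snippets
    # instead of querying a real search engine.
    expanded = Counter(tokenize(" ".join([post] + snippets)))
    # Step 2: rank comments by relevance to the expanded text; the top k become
    # pseudo-positive examples and the bottom k pseudo-negative examples.
    ranked = sorted(
        comments,
        key=lambda c: cosine(Counter(tokenize(c)), expanded),
        reverse=True,
    )
    return ranked[:k], ranked[-k:]


def train_scorer(positives, negatives):
    # Step 3 (assumed classifier): multinomial naive Bayes with add-one
    # smoothing. Returns a function giving the log-odds that a comment is
    # valuable; terms unseen in training are ignored.
    counts = {1: Counter(), 0: Counter()}
    for label, docs in ((1, positives), (0, negatives)):
        for doc in docs:
            counts[label].update(tokenize(doc))
    vocab = set(counts[1]) | set(counts[0])
    totals = {label: sum(c.values()) for label, c in counts.items()}

    def log_odds(comment):
        value = 0.0  # equal class priors: both training sets have k examples
        for term in tokenize(comment):
            if term in vocab:
                p_pos = (counts[1][term] + 1) / (totals[1] + len(vocab))
                p_neg = (counts[0][term] + 1) / (totals[0] + len(vocab))
                value += math.log(p_pos / p_neg)
        return value

    return log_odds


# Hypothetical sample data for one microblog post.
post = "new subway line opens downtown next month"
snippets = [
    "the subway line will open next month with ten stations",
    "city transit authority announces downtown subway opening",
]
comments = [
    "the new subway line will cut my commute downtown",
    "which stations does the subway line open first",
    "follow me for free gifts",
    "lol first",
    "hope the downtown stations open on time",
]

positives, negatives = build_training_sets(post, snippets, comments, k=2)
score = train_scorer(positives, negatives)
for comment in comments:
    print(f"{score(comment):+.2f}  {comment}")
```

Because the model is trained only on comments under a single post, it has to be rebuilt for each post, which matches the abstract's description of a model specific to one microblog post. A positive score suggests a valuable comment and a negative score suggests the opposite.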

Key words

microblog comment / value / unsupervised learning / comment filtering

Cite this article

XU Shuaishuai, DAI Xinyu, HUANG Shujian, CHEN Jiajun. Unsupervised Microblog Comment Analysis. Journal of Chinese Information Processing. 2017, 31(2): 179-186


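Funding

National Natural Science Foundation of China (61170181); Natural Science Foundation of Jiangsu Province (BK2011192); National Social Science Fund of China (11AZD121)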