Hot Event Analysis of Japan Earthquake on Microblog

WANG Hao, YANG Liang, LIN Hongfei

Journal of Chinese Information Processing, 2012, Vol. 26, Issue 5: 7-14.
Review


Abstract

At 14:46 on March 11, 2011, a strong earthquake struck Japan and triggered a heated discussion on Sina Micro-blog. This paper applies an emotion-based HITS algorithm to analyze events in a corpus of Sina micro-blog posts crawled during the week after the earthquake. First, a bipartite graph is constructed from candidate topic words and emotion categories. Each candidate topic word is then scored from its HITS score and its frequency, which yields the topic words for each day. Next, mutual-information-based feature selection is used to track how particular topic words change over the seven days, and thus how the topics discussed around the earthquake shift. At the same time, rule-based emotion classification is applied to analyze the emotions expressed in micro-blogs that contain particular topic words. The experiments demonstrate the feasibility of the emotion-based HITS algorithm and show that the topics discussed in the corpus change in units of two days. They also show that micro-blog users discussing the earthquake tend to express praise and criticism, emotions that convey opinion and debate, rather than happiness or sadness, emotions that convey feeling.
Key words: hot event detection; opinion analysis; emotion based HITS; Japan earthquake
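
The scoring step described in the abstract (a bipartite graph of candidate topic words and emotion categories, ranked by a HITS score together with word frequency) can be illustrated with a minimal sketch. This is not the authors' implementation: the edge weights, the toy words, the function names `emotion_hits` and `rank_topic_words`, and in particular the rule for combining the HITS score with frequency (score times log of one plus frequency) are all assumptions made for illustration, since the abstract does not give those details.

```python
import math


def emotion_hits(edges, iterations=100, tol=1e-10):
    """Run HITS-style mutual reinforcement on a word/emotion bipartite graph.

    edges maps (word, emotion) -> non-negative co-occurrence weight.
    Returns (word_scores, emotion_scores), each L2-normalised.
    """
    words = sorted({w for w, _ in edges})
    emotions = sorted({e for _, e in edges})
    word_score = {w: 1.0 for w in words}
    emo_score = {e: 1.0 for e in emotions}

    def normalise(scores):
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {k: v / norm for k, v in scores.items()}

    for _ in range(iterations):
        # An emotion category is strong if strongly scored words attach to it.
        new_emo = {e: 0.0 for e in emotions}
        for (w, e), weight in edges.items():
            new_emo[e] += weight * word_score[w]
        new_emo = normalise(new_emo)

        # A word is strong if it attaches to strongly scored emotion categories.
        new_word = {w: 0.0 for w in words}
        for (w, e), weight in edges.items():
            new_word[w] += weight * new_emo[e]
        new_word = normalise(new_word)

        delta = max(abs(new_word[w] - word_score[w]) for w in words)
        word_score, emo_score = new_word, new_emo
        if delta < tol:
            break
    return word_score, emo_score


def rank_topic_words(edges, freq, top_k=3):
    """Rank candidate topic words by HITS score combined with frequency.

    ASSUMPTION: the combination rule (score * log(1 + frequency)) is a
    placeholder; the source only says both quantities are used.
    """
    word_score, _ = emotion_hits(edges)
    combined = {w: s * math.log1p(freq.get(w, 0)) for w, s in word_score.items()}
    return sorted(combined, key=lambda w: (-combined[w], w))[:top_k]


# Hypothetical toy data for one day of posts (not taken from the paper).
toy_edges = {
    ("nuclear", "criticism"): 5,
    ("nuclear", "sadness"): 3,
    ("rescue", "praise"): 4,
    ("rescue", "sadness"): 1,
    ("weather", "praise"): 1,
}
toy_freq = {"nuclear": 40, "rescue": 25, "weather": 30}

print(rank_topic_words(toy_edges, toy_freq, top_k=2))
```

In this sketch a word that co-occurs heavily with emotion categories outranks a word that is merely frequent, which is the intuition behind using emotion as the second side of the graph. Running the procedure once per day would give the daily topic words the abstract mentions.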


Cite this article

WANG Hao, YANG Liang, LIN Hongfei. Hot Event Analysis of Japan Earthquake on Microblog. Journal of Chinese Information Processing. 2012, 26(5): 7-14


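Funding

National Natural Science Foundation of China (60673039, 60973068); National High Technology Research and Development Program of China (863 Program) (2006AA01Z151); Scientific Research Foundation for Returned Overseas Scholars of the Ministry of Education and Specialized Research Fund for the Doctoral Program of Higher Education (20090041110002)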