Abstract: Microblogging is a new form of social networking that emerged in the Web 2.0 era. It lets users post short messages quickly and easily, anytime and anywhere, and interact with one another. These features have made microblogging one of the fastest-growing services on the Internet since 2006, when Obvious, a company in the United States, launched Twitter, the world's first microblog service. This paper first reviews the state-of-the-art research on Twitter, covering 1) analysis of the features of the microblog social network, e.g., the structure of the microblog user network, the analysis of user influence, and the mechanics of information diffusion in the network; 2) semantic analysis, i.e., sentiment analysis of microblogs; and 3) related microblog applications, e.g., event monitoring and early warning, security, privacy, and real-time search. We then summarize the research on Chinese microblogs, including their features, knowledge discovery from them, and the differences between English and Chinese microblogs. Finally, we discuss open problems for future research on Chinese microblogs.

Key words: Twitter; Chinese microblog; information processing