Abstract
Social networks have become an important channel for online communication and information exchange. On Renren.com, a Chinese social networking site, large numbers of young people, especially students, discuss topics of shared interest. People are linked by many factors, such as studying together, working together, or sharing interests. In a network made up mainly of university students, users are more likely to be linked because they belong to the same school, department, or institute, so the network should exhibit community structure. This paper tests that hypothesis by applying the CNM algorithm to real data collected from Renren.com. It also uses structural knowledge of social networks to improve the CNM algorithm, which raises the accuracy of community detection. The detected communities further show that the communities formed by different departments and disciplines each have their own characteristics.
Key words
social network /
community structure /
community mining /
Renren
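The community-detection step named in the abstract can be illustrated with a minimal sketch of greedy modularity agglomeration, the idea behind the CNM algorithm. This is not the paper's improved method, and it omits the heap and sparse-matrix structures that make CNM fast on large networks; it simply rescans all connected community pairs on each pass. The function name `greedy_modularity` and the toy graph are hypothetical, and the input is assumed to be an undirected edge list without self-loops.

```python
from collections import defaultdict


def greedy_modularity(edges):
    """Greedily merge communities while modularity Q keeps increasing.

    edges: list of (u, v) pairs of an undirected graph, no self-loops (assumed).
    Returns (list of node sets, final modularity Q).
    """
    m = len(edges)
    if m == 0:
        return [], 0.0
    half = 1.0 / (2 * m)
    # e[i][j]: fraction of edge ends that link community i to community j
    e = defaultdict(lambda: defaultdict(float))
    a = defaultdict(float)  # a[i]: fraction of edge ends attached to community i
    members = {}            # community label -> set of nodes
    for u, v in edges:
        members.setdefault(u, {u})
        members.setdefault(v, {v})
        e[u][v] += half
        e[v][u] += half
        a[u] += half
        a[v] += half
    # Every node starts alone, so Q = sum_i (0 - a_i^2).
    q = -sum(x * x for x in a.values())
    while True:
        best, best_dq = None, 1e-12
        # Only connected pairs can raise Q; merging i and j changes Q by
        # dQ = 2 * (e_ij - a_i * a_j).
        for i in e:
            for j, eij in e[i].items():
                dq = 2 * (eij - a[i] * a[j])
                if dq > best_dq:
                    best, best_dq = (i, j), dq
        if best is None:
            break  # no merge improves modularity any further
        i, j = best
        # Fold community j into community i and rewire its links.
        for k, ejk in e.pop(j).items():
            del e[k][j]
            if k != i:
                e[i][k] += ejk
                e[k][i] += ejk
        a[i] += a.pop(j)
        members[i] |= members.pop(j)
        q += best_dq
    return list(members.values()), q


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
communities, q = greedy_modularity(toy_edges)
print(communities, round(q, 4))
```

On this toy graph the sketch separates the two triangles, with modularity 5/14. Applied to a friendship graph such as the one studied here, each returned node set would be a candidate community to compare against users' departments.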
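Funding
National High Technology Research and Development Program of China (863 Program) (2012AA011101); National Natural Science Foundation of China (91024009); National Social Science Fund of China (12&ZD227)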