Abstract
Most current Weibo rumor detection methods rely on the original Weibo text, the comment content, and the relationships between them, and ignore important factors such as emotional, grammatical, and linguistic features. This paper proposes a new method for Weibo rumor detection based on an event-word-feature heterogeneous graph. First, building on traditional methods, knowledge of emotion, grammar, and psychology is introduced, and the concept of text features is proposed to mine the emotional, grammatical, and linguistic features contained in Weibo events. Then, an event-word-feature heterogeneous graph for rumor detection is constructed that combines the effects of Weibo comments, text words, and text features on the detection result. Finally, a new node information aggregation method is proposed that draws on the strengths of GraphSAGE and the heterogeneous graph attention network in node representation, distinguishing the importance of node types while reducing the impact of node set size. Experimental results show that the method effectively improves the accuracy of Weibo event representation, and that its rumor detection accuracy has a clear advantage over traditional machine learning methods and typical deep learning methods.
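The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product attention score, the simple averaging of self and neighbor vectors, the function names, and the toy graph and embeddings are all assumptions made for illustration. The sketch only shows the two ideas named above: sampling a bounded number of neighbors per node type (as in GraphSAGE) and weighting node types by attention.

```python
import math
import random


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(node, graph, node_type, emb, sample_size, rng):
    """One aggregation step for `node` (hypothetical helper).

    1. Group the node's neighbors by type.
    2. Sample at most `sample_size` neighbors per type and average them,
       so large neighbor sets do not dominate (GraphSAGE-style sampling).
    3. Score each type summary against the node's own vector with a dot
       product (an assumption; a trained model would use learned weights)
       and normalize the scores with softmax to get type-level attention.
    4. Average the node's own vector with the attention-weighted summary.
    """
    by_type = {}
    for nb in graph[node]:
        by_type.setdefault(node_type[nb], []).append(nb)

    h = emb[node]
    if not by_type:  # isolated node: nothing to aggregate
        return list(h), {}

    types = sorted(by_type)
    summaries = []
    for t in types:
        nbs = by_type[t]
        if len(nbs) > sample_size:
            nbs = rng.sample(nbs, sample_size)
        summaries.append(mean_vec([emb[nb] for nb in nbs]))

    scores = [sum(a * b for a, b in zip(h, s)) for s in summaries]
    weights = softmax(scores)
    neigh = [sum(w * s[i] for w, s in zip(weights, summaries))
             for i in range(len(h))]
    new_h = [(a + b) / 2 for a, b in zip(h, neigh)]
    return new_h, dict(zip(types, weights))


# Toy event-word-feature graph; all names and vectors are made up.
node_type = {
    "e1": "event",
    "w_rumor": "word", "w_official": "word", "w_shock": "word",
    "f_negative": "feature", "f_exclaim": "feature",
}
emb = {
    "e1": [1.0, 0.0],
    "w_rumor": [1.0, 0.0], "w_official": [1.0, 0.0], "w_shock": [1.0, 0.0],
    "f_negative": [0.0, 1.0], "f_exclaim": [0.0, 1.0],
}
graph = {name: [] for name in node_type}
for other in node_type:
    if other != "e1":
        graph["e1"].append(other)
        graph[other].append("e1")

new_h, weights = aggregate("e1", graph, node_type, emb,
                           sample_size=2, rng=random.Random(0))
print(weights)
print(new_h)
```

In this toy setup the event vector is aligned with the word vectors, so the word type receives the larger attention weight. In a trained model the scoring function and the embeddings would be learned rather than fixed.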
Key words
rumor detection /
text features /
heterogeneous graph attention network
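Funding
Humanities and Social Sciences Project of the Ministry of Education (19YJCZH178); National Natural Science Foundation of China (61906220); National Social Science Fund of China (18CTJ008); Emerging Interdisciplinary Construction Project of the Central University of Finance and Economics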