Abstract
Most existing work on multi-modal fake news detection assumes a one-to-one relationship between text and image and fuses textual and image features by simple combination. This ignores the useful features of the multiple images within a post and models the semantic relationships between posts inadequately. To overcome these limitations, this paper proposes OMMFN, a multi-modal fake news detection model based on the one-to-many relationship between text and images. A cross-modal attention network (CMA) selects the effective features from multiple images, and a multi-modal contrastive learning network (MCL) dynamically adjusts the high-level semantic feature relationships between posts to strengthen the joint representation of the fused text and image features. Experiments on a Sina Weibo dataset show that the model makes full use of both the one-to-many text-image information and the semantic feature relationships between posts, improving accuracy by 3.15% over the baseline models.
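To make the one-to-many idea concrete, here is a minimal sketch of text-guided attention over several image feature vectors. It is not the paper's CMA network: it assumes the text and image features are already projected to the same dimension, it uses plain scaled dot-product scoring, and the names `softmax` and `attend_images` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_images(text_vec, image_vecs):
    """Pool several image vectors into one, weighted by similarity to the text.

    Assumption: every vector has the same length as text_vec.
    Returns (weights, pooled), where weights sum to 1.
    """
    if not image_vecs:
        raise ValueError("at least one image vector is required")
    dim = len(text_vec)
    scale = math.sqrt(dim)
    # Scaled dot product between the text query and each image key.
    scores = [
        sum(t * v for t, v in zip(text_vec, img)) / scale
        for img in image_vecs
    ]
    weights = softmax(scores)
    # Weighted sum of the image vectors (the images also act as values).
    pooled = [
        sum(w * img[i] for w, img in zip(weights, image_vecs))
        for i in range(dim)
    ]
    return weights, pooled


# One post with one text vector and three attached images.
text = [1.0, 0.0]
images = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, pooled = attend_images(text, images)
```

In this toy example the image most aligned with the text receives the largest weight, so images unrelated to the text contribute less to the pooled feature.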
Key words
fake news detection /
cross-modal attention /
multi-modal contrastive learning
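Funding
National Natural Science Foundation of China (N061402220); Key Scientific Research Project of the Hunan Provincial Department of Education (19A49); Natural Science Foundation of Hunan Province (2020JJ4525, 2022JJ30495)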