Multi-modal Entity Alignment Based on Multi-level Feature Fusion and Reinforcement Learning

LI Huayu, WANG Cuicui, ZHANG Zhikang, LI Haiyang

Journal of Chinese Information Processing, 2024, Vol. 38, Issue 9: 36-47
Section: Knowledge Representation and Knowledge Acquisition

Abstract

Traditional entity alignment methods fail to fully exploit multimodal information and neglect the potential interactions between modalities during feature fusion. To address these problems, this paper proposes a multimodal entity alignment method that leverages the different modal features of entities to identify equivalent entities across multimodal knowledge graphs. First, separate feature encoders produce embeddings for attributes, relations, images, and graph structure, and a numerical modality is introduced to enrich entity semantics. Second, in the feature fusion stage, cross-modal complementarity and relevance are modeled simultaneously on the basis of contrastive learning, and reinforcement learning is employed to optimize the model output, reducing the heterogeneous gap between the learned joint embeddings and the true modal embeddings. Finally, the cosine similarity between entity pairs is computed to select candidate aligned pairs, which are iteratively added to the alignment seeds to guide further alignment. Experimental results demonstrate the effectiveness of the proposed method on the multimodal entity alignment task.
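The final step described in the abstract, computing cosine similarity between joint embeddings, selecting candidate pairs, and iteratively adding them to the alignment seeds, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the threshold, the mutual-nearest-neighbor filter, and all function names are assumptions, and the joint embeddings are taken as given (in the full method they would be re-learned as the seed set grows).

```python
import numpy as np

def cosine_sim_matrix(A, B):
    # Row-normalize both embedding matrices; the matrix product then
    # gives pairwise cosine similarities between entities of the two KGs.
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def select_candidates(sim, threshold=0.8):
    # Keep mutually nearest pairs whose similarity exceeds the threshold
    # (a common heuristic for high-confidence candidate alignments).
    best_for_a = sim.argmax(axis=1)
    best_for_b = sim.argmax(axis=0)
    pairs = []
    for i, j in enumerate(best_for_a):
        if best_for_b[j] == i and sim[i, j] >= threshold:
            pairs.append((i, j))
    return pairs

def iterative_alignment(emb_a, emb_b, seeds, rounds=3, threshold=0.8):
    # Repeatedly add high-confidence candidate pairs to the seed set.
    aligned = set(seeds)
    for _ in range(rounds):
        sim = cosine_sim_matrix(emb_a, emb_b)
        new = [p for p in select_candidates(sim, threshold) if p not in aligned]
        if not new:
            break
        aligned.update(new)
        # In the full method, the enlarged seed set would guide
        # re-training of the encoders before the next round.
    return sorted(aligned)
```

With fixed embeddings the loop converges after one round; the iterative behavior the paper describes comes from re-training between rounds, which is omitted here.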


Key words

multimodal knowledge graph / representation learning / entity alignment / feature fusion

Cite This Article

LI Huayu, WANG Cuicui, ZHANG Zhikang, LI Haiyang. Multi-modal Entity Alignment Based on Multi-level Feature Fusion and Reinforcement Learning. Journal of Chinese Information Processing. 2024, 38(9): 36-47


Funding

Natural Science Foundation of Shandong Province (ZR2020MF140); Graduate Innovation Fund of China University of Petroleum (East China) (22CX04035A)