Uncertain Knowledge Graph Embedding by Beta Distribution and Semi-supervised Learning

XU Yao, HE Shizhu, LIU Kang, ZHANG Chi, JIAO Fei, ZHAO Jun

Journal of Chinese Information Processing ›› 2022, Vol. 36 ›› Issue (10): 54-62.
Knowledge Representation and Knowledge Acquisition



Abstract

In recent years, embedding models for deterministic knowledge graphs have made great progress in tasks such as knowledge graph completion. However, how to design and train embedding models for uncertain knowledge graphs remains an important challenge. Unlike a deterministic knowledge graph, an uncertain knowledge graph attaches a confidence score to each fact triple, so an uncertain knowledge graph embedding model must accurately estimate the confidence of each triple. Existing uncertain knowledge graph embedding models have relatively simple structures: they can only model symmetric relations, and they cannot handle the false-negative problem well. To address these problems, we first propose a unified framework for training uncertain knowledge graph embedding models, which uses a multi-model semi-supervised learning method. To reduce the excessive noise in semi-supervised samples, we use Monte Carlo Dropout to estimate the model's uncertainty about its outputs and filter noisy pseudo-labelled samples according to this uncertainty. In addition, to better represent the uncertainty of entities and relations in an uncertain knowledge graph and to handle more complex relations, we propose UBetaE, an uncertain knowledge graph embedding model based on the Beta distribution, which represents both entities and relations as sets of mutually independent Beta distributions. Experimental results on public datasets show that the combination of the proposed semi-supervised learning method and the UBetaE model not only greatly alleviates the false-negative problem but also significantly outperforms state-of-the-art uncertain knowledge graph embedding models such as UKGE on multiple tasks.
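The two core ideas of the abstract — representing embeddings as sets of independent Beta distributions and filtering pseudo-labelled triples by Monte Carlo Dropout uncertainty — can be illustrated with a toy NumPy sketch. Everything here is an assumption for illustration only: the composition rule, the dimension `K`, the dropout rate, and the threshold `TAU` are made up and do not reproduce the paper's actual UBetaE scoring function or training framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "UBetaE-style" embeddings: each entity/relation is K independent
# Beta distributions, stored as positive (alpha, beta) parameter pairs.
K = 4
emb = {name: rng.uniform(0.5, 3.0, size=(K, 2)) for name in ["h", "r", "t"]}

def beta_mean(params):
    a, b = params[:, 0], params[:, 1]
    return a / (a + b)                      # mean of each Beta(a, b)

def triple_confidence(h, r, t, drop_p=0.0):
    # Illustrative composition rule (NOT the paper's scoring function):
    # compare the head-relation composition against the tail via Beta means.
    score = 1.0 - np.abs(beta_mean(h) * beta_mean(r) - beta_mean(t))
    if drop_p > 0.0:                        # dropout stays ON at inference
        mask = rng.random(K) >= drop_p
        score = score * mask / (1.0 - drop_p)
    return float(np.clip(score.mean(), 0.0, 1.0))

# Monte Carlo Dropout: T stochastic forward passes; the std of the
# predicted confidences estimates the model's uncertainty.
T = 100
samples = [triple_confidence(emb["h"], emb["r"], emb["t"], drop_p=0.3)
           for _ in range(T)]
conf, uncertainty = float(np.mean(samples)), float(np.std(samples))

# Semi-supervised filtering: keep a pseudo-labelled triple only if the
# model is sufficiently certain about it (TAU is an arbitrary guess).
TAU = 0.2
keep = uncertainty < TAU
print(conf, uncertainty, keep)
```

In the paper's framework the uncertainty estimate plays the role of a noise filter over pseudo-labelled triples; here the filter is just a hard threshold on the standard deviation of the sampled confidences.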


Key words

knowledge graph / uncertain knowledge graph embedding / semi-supervised learning / Beta distribution

Cite This Article

XU Yao, HE Shizhu, LIU Kang, ZHANG Chi, JIAO Fei, ZHAO Jun. Uncertain Knowledge Graph Embedding by Beta Distribution and Semi-supervised Learning. Journal of Chinese Information Processing. 2022, 36(10): 54-62

References

[1] JI S, PAN S, CAMBRIA E, et al. A survey on knowledge graphs: Representation, acquisition, and applications[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(2): 494-514.
[2] NICKEL M, MURPHY K, TRESP V, et al. A review of relational machine learning for knowledge graphs[J]. Proceedings of the IEEE, 2015, 104(1): 11-33.
[3] SPEER R, CHIN J, HAVASI C. ConceptNet 5.5: An open multilingual graph of general knowledge[C]//Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017: 4444-4451.
[4] CHEN X, CHEN M, SHI W, et al. Embedding uncertain knowledge graphs[C]//Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019: 3363-3370.
[5] BORDES A, USUNIER N, GARCIA-DURAN A, et al. Translating embeddings for modeling multi-relational data[C]//Proceedings of the 27th Annual Conference on Neural Information Processing Systems, 2013: 2787-2795.
[6] YANG B, YIH W, HE X, et al. Embedding entities and relations for learning and inference in knowledge bases[J]. arXiv preprint arXiv:1412.6575, 2014.
[7] GALÁRRAGA L A, TEFLIOUDI C, HOSE K, et al. AMIE: Association rule mining under incomplete evidence in ontological knowledge bases[C]//Proceedings of the 22nd International Conference on World Wide Web, 2013: 413-422.
[8] GAL Y, GHAHRAMANI Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning[C]//Proceedings of the 33rd International Conference on Machine Learning, 2016: 1050-1059.
[9] WANG Q, MAO Z, WANG B, et al. Knowledge graph embedding: A survey of approaches and applications[J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(12): 2724-2743.
[10] WANG Z, ZHANG J, FENG J, et al. Knowledge graph embedding by translating on hyperplanes[C]//Proceedings of the 28th AAAI Conference on Artificial Intelligence, 2014: 1112-1119.
[11] LIN Y, LIU Z, SUN M, et al. Learning entity and relation embeddings for knowledge graph completion[C]//Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2015: 2181-2187.
[12] JI G, HE S, XU L, et al. Knowledge graph embedding via dynamic mapping matrix[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2015: 687-696.
[13] NICKEL M, TRESP V, KRIEGEL H. A three-way model for collective learning on multi-relational data[C]//Proceedings of the 28th International Conference on Machine Learning, 2011: 809-816.
[14] SUN Z, DENG Z, NIE J, et al. RotatE: Knowledge graph embedding by relational rotation in complex space[C]//Proceedings of the 7th International Conference on Learning Representations, 2019: 1-18.
[15] TROUILLON T, WELBL J, RIEDEL S, et al. Complex embeddings for simple link prediction[C]//Proceedings of the 33rd International Conference on Machine Learning, 2016: 2071-2080.
[16] KIMMIG A, BACH S, BROECHELER M, et al. A short introduction to probabilistic soft logic[C]//Proceedings of the NIPS Workshop on Probabilistic Programming: Foundations and Applications, 2012: 1-4.
[17] GUO C, PLEISS G, SUN Y, et al. On calibration of modern neural networks[C]//Proceedings of the 34th International Conference on Machine Learning, 2017: 1321-1330.
[18] HAN X, CAO S, LV X, et al. OpenKE: An open toolkit for knowledge embedding[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2018: 139-144.
[19] LEE D. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks[C]//Proceedings of the ICML Workshop on Challenges in Representation Learning, 2013.
[20] XIE Q, MA X, DAI Z, et al. An interpretable knowledge transfer model for knowledge base completion[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017: 950-962.
[21] TARVAINEN A, VALPOLA H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results[C]//Proceedings of the 31st Conference on Neural Information Processing Systems, 2017: 1195-1204.
[22] REN H, LESKOVEC J. Beta embeddings for multi-hop logical reasoning in knowledge graphs[C]//Proceedings of the 34th Conference on Neural Information Processing Systems, 2020: 19716-19726.
[23] KINGMA D P, BA J. Adam: A method for stochastic optimization[C]//Proceedings of the 3rd International Conference on Learning Representations, 2015.

Funding

Science and Technology Project of the Headquarters of State Grid Corporation of China (5700-202012488A-0-0-00)
