LI Wenhao, LIU Wenchang, SUN Maosong, YI Xiaoyuan. Wenmai: A Probabilistic-Like Association Reliable Chinese Knowledge Graph[J]. Journal of Chinese Information Processing, 2022, 36(12): 67-73.
Wenmai: A Probabilistic-Like Association Reliable Chinese Knowledge Graph
LI Wenhao (1,2,3), LIU Wenchang (2,3,4), SUN Maosong (1,2,3,5), YI Xiaoyuan (6)
1. The Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 2. Institute for Artificial Intelligence, Tsinghua University, Beijing 100084, China; 3. Beijing National Center for Information Science and Technology, Beijing 100084, China; 4. The Department of Computer Science, University of California, Davis, Davis, CA 95616, USA; 5. Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University, Xuzhou, Jiangsu 221009, China; 6. Microsoft Research Asia, Beijing 100080, China
Abstract: Existing Chinese knowledge graphs are derived from Wikipedia and Baidu Baike by leveraging entity infoboxes and category systems. In contrast, this article proposes a Chinese knowledge graph with probabilistic links: hyperlinks in these resources are treated as entity relations, each weighted by the TF-IDF value of the mention frequency of the target entity in the entry article of the source entity. A reliable-link screening algorithm is further designed to remove occasional links and make the knowledge graph more reliable. Based on these methods, the article constructs a probabilistic-like association reliable Chinese knowledge graph named "Wenmai", which is publicly available on GitHub to support knowledge-guided natural language processing.
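The weighting idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything here is an assumption for illustration: the function names `tfidf_link_weights` and `screen_links` are hypothetical, the TF-IDF variant (relative mention frequency times log inverse document frequency) is a common textbook form, and the fixed threshold is only a stand-in for the reliable-link screening algorithm of the paper, whose details are not stated here.

```python
import math
from collections import Counter


def tfidf_link_weights(articles, links):
    """Weight each hyperlink (source -> target) by a TF-IDF style score.

    articles: dict mapping a source entity to the text of its entry article.
    links:    dict mapping a source entity to the set of entities it links to.
    Returns a dict mapping (source, target) to a non-negative weight.
    Assumption: TF is the target's share of all link-target mentions in the
    source article; IDF is log(N / number of articles linking to the target).
    """
    n_docs = len(articles)

    # Document frequency: in how many entry articles each target is linked.
    doc_freq = Counter()
    for targets in links.values():
        for target in set(targets):
            doc_freq[target] += 1

    weights = {}
    for source, targets in links.items():
        text = articles[source]
        # Naive substring counting; a real system would need proper
        # mention detection and disambiguation.
        counts = {target: text.count(target) for target in targets}
        total = sum(counts.values())
        if total == 0:
            continue
        for target, count in counts.items():
            if count == 0:
                continue
            tf = count / total
            idf = math.log(n_docs / doc_freq[target])
            weights[(source, target)] = tf * idf
    return weights


def screen_links(weights, threshold):
    """Keep only links whose weight reaches the threshold.

    This simple cut-off is a placeholder for the screening step, not the
    algorithm described in the paper.
    """
    return {edge: w for edge, w in weights.items() if w >= threshold}


# Toy data with single-letter entity names (purely illustrative).
toy_articles = {"A": "B B B C", "B": "C A", "C": "A"}
toy_links = {"A": {"B", "C"}, "B": {"C", "A"}, "C": {"A"}}
toy_weights = tfidf_link_weights(toy_articles, toy_links)
toy_reliable = screen_links(toy_weights, threshold=0.15)
```

In the toy data, entity A mentions B three times and C once, and B is linked from fewer articles than C, so the link from A to B receives a larger weight than the link from A to C and survives the threshold while the weaker link is dropped.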