Wenmai: A Probabilistic-Like Association Reliable Chinese Knowledge Graph
LI Wenhao 1,2,3, LIU Wenchang 2,3,4, SUN Maosong 1,2,3,5, YI Xiaoyuan 6
1. The Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 2. Institute for Artificial Intelligence, Tsinghua University, Beijing 100084, China; 3. Beijing National Center for Information Science and Technology, Beijing 100084, China; 4. The Department of Computer Science, University of California, Davis, Davis, CA 95616, USA; 5. Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University, Xuzhou, Jiangsu 221009, China; 6. Microsoft Research Asia, Beijing 100080, China
Abstract Existing Chinese knowledge graphs are derived from Wikipedia and Baidu Baike by leveraging entity infoboxes and category systems. In contrast, this article proposes a Chinese knowledge graph with probabilistic links: hyperlinks in these resources are treated as entity relations, each weighted by the TF-IDF value of the mention frequency of the target entity in the entry article of the source entity. A reliable-link screening algorithm is further designed to remove incidental links and make the knowledge graph more reliable. Based on these methods, the article constructs a probabilistic-like, association-reliable Chinese knowledge graph named "Wenmai", which is publicly available on GitHub to support knowledge-guided natural language processing.
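
The link weighting described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the exact TF-IDF formula or the screening algorithm, so the sketch assumes a common TF-IDF variant (relative mention frequency times log inverse document frequency) and uses a plain threshold as a stand-in for the screening step. The function names `tfidf_link_weights` and `screen_links`, the threshold value, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def tfidf_link_weights(mentions):
    """Weight each (source, target) link by a TF-IDF score.

    mentions maps a source entity to the list of target entities
    mentioned in its entry article, with repeats.
    The formula is an assumed, common TF-IDF variant.
    """
    n_articles = len(mentions)

    # Document frequency: number of source articles mentioning each target.
    df = Counter()
    for targets in mentions.values():
        df.update(set(targets))

    weights = {}
    for source, targets in mentions.items():
        counts = Counter(targets)
        total = sum(counts.values())
        for target, count in counts.items():
            tf = count / total                        # relative mention frequency
            idf = math.log(n_articles / df[target])   # rarer targets score higher
            weights[(source, target)] = tf * idf
    return weights


def screen_links(weights, threshold):
    """Hypothetical stand-in for link screening: keep links at or above a threshold."""
    return {edge: w for edge, w in weights.items() if w >= threshold}


# Toy data, invented for illustration only.
toy_mentions = {
    "Tsinghua University": ["Beijing", "Beijing", "China"],
    "Peking University": ["Beijing", "China"],
    "Great Wall": ["China", "Ming dynasty", "Ming dynasty"],
}

weights = tfidf_link_weights(toy_mentions)
reliable = screen_links(weights, threshold=0.1)
for (source, target), w in sorted(reliable.items()):
    print(f"{source} -> {target}: {w:.3f}")
```

In this toy setting, "China" is mentioned in every article, so its inverse document frequency is zero and all links to it are dropped by the threshold, while the rarer, repeated mention "Ming dynasty" receives the highest weight.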
Received: 19 October 2021