Preliminary Study on the Construction of Chinese Medical Knowledge Graph
BYAMBASUREN Odmaa1,2, YANG Yunfei1,2, SUI Zhifang1,2, DAI Damai1,2, CHANG Baobao1,2, LI Sujian1,2, ZAN Hongying2,3
1. Key Laboratory of Computational Linguistics, Ministry of Education, Peking University, Beijing 100871, China; 2. Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China; 3. School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China
Abstract The medical knowledge graph is the cornerstone of intelligent medical applications. Existing medical knowledge graphs fall short in scale, standardization, taxonomy, formalization, and precision of knowledge description, and therefore cannot meet the needs of these applications. Using natural language processing and text mining techniques in a semi-automated approach, we develop the Chinese Medical Knowledge Graph (CMeKG 1.0). Its construction draws on international medical coding systems such as ICD-10, ATC, and MeSH, as well as large-scale, multi-source, heterogeneous clinical guidelines, medical standards, diagnostic protocols, and medical encyclopedia resources. CMeKG covers entity types such as diseases, drugs, and diagnosis and treatment technologies, and contains more than 1 million medical concept relationships. This paper presents the description system, key technologies, construction process, and medical knowledge description of CMeKG 1.0, and is intended as a reference for the construction and application of knowledge graphs in the medical field.
Received: 15 April 2019