Language Resources Construction
ZAN Hongying, LIU Tao, NIU Changyong, ZHAO Yueshu, ZHANG Kunli, SUI Zhifang
2020, 34(5): 19-26.
Existing medical corpora use classification systems for entities and entity relations that fall short of the requirements of precision medicine. This paper focuses on pediatric diseases. Under the guidance of medical experts, it develops an annotation system and detailed annotation guidelines for named entities and entity relations. Drawing on relevant medical standards, annotation tools are used for machine pre-annotation, manual annotation, and manual proofreading of entities and entity relations in more than 2.98 million words of pediatric medical text, yielding a corpus of medical entities and entity relations covering 504 common pediatric diseases. The corpus contains 23,603 annotated named entities and 36,513 annotated entity relations, with multi-round annotation consistency of 0.85 and 0.82, respectively. Based on the annotated corpus, the paper also constructs a pediatric medical knowledge graph and develops a pediatric medical knowledge question answering (QA) system.