A BERT Based Reordering Method for Clinical Operation Term Normalization

CHEN Mosha, QIU Wei, TAN Chuanqi

Journal of Chinese Information Processing, 2021, Vol. 35, Issue (3): 88-93.
Information Extraction and Text Mining

Abstract

Clinical term normalization is an indispensable task in information extraction from medical text. In clinical practice, the same diagnosis, operation, drug, examination, laboratory test, or symptom is often written in many different ways; term normalization maps each of these variants to its corresponding standard name. Building on candidate answers generated by a retrieval step, this paper proposes a method that reorders the candidates with BERT (Bidirectional Encoder Representations from Transformers). Experiments show that on the CHIP2019 clinical operation term normalization dataset the method reaches an accuracy of 89.1% with a single model and 92.8% with a fusion model, which basically meets the standard for practical application. The method also generalizes well and can be applied to normalization tasks for other types of medical terms.
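
The two-stage idea described in the abstract (retrieve candidate standard names, then reorder them with a learned matcher) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the vocabulary STANDARD_TERMS, the character n-gram retrieval (a stand-in for Lucene), and toy_scorer (a stand-in for a fine-tuned BERT sentence-pair classifier) are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical standard vocabulary; a real system would load the full term list.
STANDARD_TERMS = [
    "appendectomy",
    "laparoscopic appendectomy",
    "laparoscopic cholecystectomy",
    "total hip replacement",
]


def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the character n-gram sets of two strings."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    return len(ga & gb) / len(ga | gb)


def retrieve(mention: str, terms: List[str], top_k: int = 3) -> List[str]:
    """Stage 1: cheap candidate generation (assumed stand-in for Lucene)."""
    ranked = sorted(terms, key=lambda t: jaccard(mention, t), reverse=True)
    return ranked[:top_k]


def build_pair(mention: str, candidate: str) -> str:
    """Show how a mention/candidate pair is usually laid out for BERT."""
    return f"[CLS] {mention} [SEP] {candidate} [SEP]"


def toy_scorer(mention: str, candidate: str) -> float:
    """Placeholder for a BERT match score: prefix-aware token coverage."""
    m_tokens = mention.lower().split()
    c_tokens = candidate.lower().split()
    matched_m = sum(any(c.startswith(m) for c in c_tokens) for m in m_tokens)
    matched_c = sum(any(c.startswith(m) for m in m_tokens) for c in c_tokens)
    return (matched_m / len(m_tokens)) * (matched_c / len(c_tokens))


def rerank(
    mention: str,
    candidates: List[str],
    scorer: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Stage 2: reorder the retrieved candidates by the scorer's output."""
    scored = [(c, scorer(mention, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


mention = "lap appendectomy"
candidates = retrieve(mention, STANDARD_TERMS)
best, score = rerank(mention, candidates, toy_scorer)[0]
print(candidates)
print(best, score)
```

In this toy run the retrieval stage ranks "appendectomy" first, while the reordering stage moves "laparoscopic appendectomy" to the top, which is the kind of correction a stronger second-stage model is meant to provide. Swapping toy_scorer for a real model only requires a function with the same (mention, candidate) signature.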

Key words

clinical operation term normalization / Lucene information retrieval / BERT

Cite this article

CHEN Mosha, QIU Wei, TAN Chuanqi. A BERT Based Reordering Method for Clinical Operation Term Normalization. Journal of Chinese Information Processing. 2021, 35(3): 88-93
