Abstract
Research on electronic medical record (EMR) text has attracted growing attention in recent years, yet existing disease prediction models rarely account for two properties of such text: its semi-structured form, in which records are distributed independently, and its complex semantic relations. This article therefore proposes an auxiliary diagnosis method based on a weighted hierarchical attention mechanism. A weighted accumulation method converts ordinary sentence vectors into structurally weakly related sentence vectors, and attention is applied hierarchically at the word, sentence, and document levels to improve the model's ability to learn structure. In addition, a supervision layer is designed to alleviate the learning bias caused by the complex semantic relations and thereby assist model training. Experiments on a real-world data set show that the proposed model outperforms current mainstream deep learning models.
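The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes dot-product attention pooling against a context vector and a simple decayed running sum as the "weighted accumulation" step; the names `attention_pool`, `weighted_accumulate`, `encode_document`, and the `decay` parameter are all hypothetical, and the supervision layer is not modelled.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, context):
    """Score each vector against a context vector and return the
    attention-weighted sum together with the attention weights."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights


def weighted_accumulate(sentence_vecs, decay=0.5):
    """ASSUMPTION: one possible reading of 'weighted accumulation'.
    Each output vector is the current sentence vector plus a decayed
    sum of the preceding ones, so neighbouring sentences are weakly linked."""
    out = []
    running = [0.0] * len(sentence_vecs[0])
    for s in sentence_vecs:
        running = [x + decay * r for x, r in zip(s, running)]
        out.append(running)
    return out


def encode_document(doc, word_ctx, sent_ctx, decay=0.5):
    """doc is a list of sentences; each sentence is a list of word vectors.
    Words are pooled into sentence vectors, sentence vectors are linked by
    weighted accumulation, then pooled into one document vector."""
    sent_vecs = [attention_pool(words, word_ctx)[0] for words in doc]
    linked = weighted_accumulate(sent_vecs, decay)
    return attention_pool(linked, sent_ctx)


# Toy usage with random vectors standing in for learned embeddings.
random.seed(0)
DIM = 4


def rand_vec():
    return [random.gauss(0, 1) for _ in range(DIM)]


doc = [[rand_vec() for _ in range(n)] for n in (5, 3, 7)]
doc_vec, sent_weights = encode_document(doc, rand_vec(), rand_vec())
```

In a trained model the context vectors would be learned parameters and the document vector would feed a classifier over diagnoses; here they are random placeholders used only to show the data flow.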
Key words
accumulative method /
attention mechanism /
hierarchical structure /
auxiliary diagnosis
Funding
National Natural Science Foundation of China (81860318, 81560296)