Document-level Chemical-induced Disease Relation Extraction via Cross Self-attention

LI Zhengguang, LIN Hongfei, SHEN Chen, XU Bo, ZHENG Wei

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 7: 98-105.
Information Extraction and Text Mining


Abstract

Extracting chemical-induced disease (CID) relations from biomedical literature is important for disease treatment and drug development. Existing CID relation extraction methods, however, ignore the semantic information of entities that appear in different sentences of a document, so they do not capture complete document-level semantics and extraction performance suffers. This paper introduces a cross self-attention mechanism over the title, the abstract, and the shortest dependency paths (SDPs), and proposes a CID relation extraction method based on interactive learning of semantic information. The method enhances the semantic representation of the document and obtains its complete semantics through the exchange of semantic information. Experimental results on the CDR corpus show that the interactive semantic information learned by cross self-attention improves the extraction of document-level CID relations.
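
The abstract does not give the equations behind the cross self-attention, so the following is only a minimal sketch of the general idea, assuming standard scaled dot-product attention in which one text (for example an SDP) attends to another (the title or the abstract). The function names, the toy 2-dimensional embeddings, and the absence of learned projection matrices are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention where one sequence attends to another.

    queries:     list of d-dimensional vectors (e.g. SDP token embeddings)
    keys_values: list of d-dimensional vectors (e.g. abstract token embeddings)
    Returns one context vector per query, each a weighted mix of keys_values.
    No learned projections are applied; that is a simplification (assumption).
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    contexts = []
    for q in queries:
        # Similarity of this query to every vector in the other sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the other sequence's vectors.
        context = [sum(w * v[j] for w, v in zip(weights, keys_values))
                   for j in range(dim)]
        contexts.append(context)
    return contexts


# Toy embeddings, invented for illustration only.
title = [[1.0, 0.0], [0.0, 1.0]]
abstract = [[1.0, 1.0], [0.5, 0.0], [0.0, 2.0]]
sdp = [[1.0, 0.5]]

# The SDP representation gathers information from the title and the abstract.
sdp_from_title = cross_attention(sdp, title)
sdp_from_abstract = cross_attention(sdp, abstract)
```

In a full model each of the three inputs would presumably attend to the others in the same way, and the resulting context vectors would be combined before classification; how the paper combines them is not stated in the abstract.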

Keywords

biomedical document / relation extraction / cross self-attention / semantic information

Cite this article

LI Zhengguang, LIN Hongfei, SHEN Chen, XU Bo, ZHENG Wei. Document-level Chemical-induced Disease Relation Extraction via Cross Self-attention. Journal of Chinese Information Processing. 2022, 36(7): 98-105


Funding

National Natural Science Foundation of China (62006034, 62076046)