Abstract
Joint entity and relation extraction aims to identify named entities and, at the same time, extract the semantic relations between them. This paper proposes a joint entity and relation extraction method for the Tibetan medicine domain based on multi-feature fusion and a reward-and-punishment mechanism. It targets two problems of sequence-labeling-based joint extraction: the limitations of the tagging strategy, and the limited learning capacity of models that rely on a single type of feature. Two solutions are proposed: (1) a nested entity annotation strategy that overcomes the limitations of existing tagging schemes; (2) static fusion of category features and dynamic fusion of multiple features for feature enhancement, together with a reward-and-punishment mechanism for model optimization. Experimental results show that the method improves joint extraction in the Tibetan medicine domain, with a final F1 score of 79.23%. To verify robustness and effectiveness, experiments were also conducted on the SKE and NYT datasets; the results confirm the model's effectiveness and show that it outperforms the baseline methods.
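As background on how an F1 value for joint extraction is commonly computed, the following is a minimal sketch. It assumes the widespread convention that a predicted (head entity, relation, tail entity) triple counts as correct only when it exactly matches a gold triple, and that scores are micro-averaged over sentences. The function name `triple_f1` and the sample triples are hypothetical illustrations; the abstract does not state the paper's exact evaluation protocol, so this may differ from what the authors used.

```python
from typing import Iterable, Set, Tuple

# Assumed representation: (head entity, relation, tail entity)
Triple = Tuple[str, str, str]


def triple_f1(
    gold: Iterable[Set[Triple]], pred: Iterable[Set[Triple]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-sentence triple sets."""
    n_correct = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        n_correct += len(g & p)  # exact-match triples only
        n_pred += len(p)
        n_gold += len(g)
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical data, not taken from the paper's corpus.
gold = [
    {("herb A", "treats", "disease X"), ("herb A", "part_of", "formula Y")},
    {("formula Y", "treats", "disease Z")},
]
pred = [
    {("herb A", "treats", "disease X")},
    {("formula Y", "treats", "disease Z"), ("formula Y", "treats", "disease X")},
]
p, r, f = triple_f1(gold, pred)
print(f"P={p:.4f} R={r:.4f} F1={f:.4f}")
```

In this toy case two of three predicted triples are correct and two of three gold triples are recovered, so precision, recall and F1 all equal 2/3.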
Key words
Tibetan medicine /
entity relation /
joint extraction /
multi-feature fusion /
reward-and-punishment mechanism
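Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116100); Science and Technology Department of Tibet Autonomous Region project "Digitization of Ancient Tibetan Medicine Literature and Research and Development of Its Knowledge Mining Technology"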