Journal of Chinese Information Processing, 2024, Vol. 38, Issue 4: 86-98, 107
Information Extraction and Text Mining

MKE: Nested NER Based on Knowledge Embedding and Multi-Head Selection

LI Zheng¹, TU Gang¹, WANG Hansheng²

Abstract

In current research on nested named entity recognition, span-based methods cast the task as a classification problem and, by fine-tuning pretrained models, recognize nested entities fairly well, but they still suffer from two shortcomings: a lack of domain knowledge and an inability to assign multiple labels to one entity. This paper proposes a multi-head model based on knowledge embedding (MKE for short) to address these problems. The model makes two improvements: (1) it introduces domain background knowledge through a knowledge embedding layer that embeds the background knowledge losslessly in the form of entity matrices; (2) it recasts named entity recognition as a multi-head selection process, scoring candidate spans with an attention scoring model, so that nested entity boundaries are identified correctly while entity multi-classification is achieved. Experimental results show that background knowledge embedded as entity matrices effectively improves recognition accuracy, and the model achieves state-of-the-art performance on seven nested and flat named entity recognition datasets.
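
To make the abstract's two ideas concrete, below is a minimal PyTorch sketch of one plausible reading of the model: encoder states are fused with a token-aligned entity (knowledge) matrix, and every (start, end, type) triple is scored by a per-type bilinear attention head with an independent sigmoid, so nested spans and multi-label entities are handled naturally. This is not the authors' released implementation; the module name MKEHead, the parameter knowledge_dim, and the exact bilinear form of the scorer are illustrative assumptions.

import torch
import torch.nn as nn

class MKEHead(nn.Module):
    """Scores every (start, end) token pair for every entity type.

    A sketch of multi-head selection for nested NER: the knowledge
    embedding layer's output is assumed to be a token-aligned matrix
    that is concatenated (losslessly, without pooling) onto the
    encoder states before scoring.
    """

    def __init__(self, hidden_dim: int, num_types: int, knowledge_dim: int):
        super().__init__()
        # Fuse encoder states with the entity-matrix knowledge features.
        self.fuse = nn.Linear(hidden_dim + knowledge_dim, hidden_dim)
        # Separate projections for the span-start and span-end roles.
        self.start_proj = nn.Linear(hidden_dim, hidden_dim)
        self.end_proj = nn.Linear(hidden_dim, hidden_dim)
        # One bilinear attention scorer per entity type ("multi-head").
        self.type_scorer = nn.Parameter(
            torch.randn(num_types, hidden_dim, hidden_dim) * 0.02
        )

    def forward(self, token_states, knowledge_matrix):
        # token_states:     (batch, seq_len, hidden_dim), e.g. BERT outputs
        # knowledge_matrix: (batch, seq_len, knowledge_dim)
        h = torch.tanh(self.fuse(torch.cat([token_states, knowledge_matrix], dim=-1)))
        s = self.start_proj(h)  # every token as a candidate span start
        e = self.end_proj(h)    # every token as a candidate span end
        # scores[b, t, i, j]: attention score for token i starting and
        # token j ending an entity of type t.
        scores = torch.einsum("bih,thk,bjk->btij", s, self.type_scorer, e)
        # Independent sigmoids rather than a softmax: a span may receive
        # several types (entity multi-classification), and overlapping or
        # nested spans are scored independently of each other.
        return torch.sigmoid(scores)

# Toy usage with random tensors standing in for the encoder and the
# knowledge embedding layer; in practice one would also mask pairs
# with j < i, since a span cannot end before it starts.
head = MKEHead(hidden_dim=32, num_types=4, knowledge_dim=16)
probs = head(torch.randn(2, 8, 32), torch.randn(2, 8, 16))
print(probs.shape)               # torch.Size([2, 4, 8, 8])
spans = (probs > 0.5).nonzero()  # confident (batch, type, start, end) tuples

Because each (type, start, end) decision is an independent binary one, a token can open or close several entities at once, which is what nested and multi-label recognition requires; under this reading, the paper's reported gains would come from the quality of the entity-matrix knowledge features and the attention scoring model.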

Key words

nested named entity recognition / knowledge embedding / multi-head selection / attention / entity multi-classification

Cite this article

LI Zheng, TU Gang, WANG Hansheng. MKE: Nested NER Based on Knowledge Embedding and Multi-Head Selection. Journal of Chinese Information Processing, 2024, 38(4): 86-98, 107.

Funding

National Defense Basic Scientific Research Program (JCKY2019204A007)