Enhanced Prototype Network with Label Semantics for Few-Shot Named Entity Recognition

HUANG Weiguang, NING Zuoting, DUAN Junwen, AN Ying

Journal of Chinese Information Processing, 2024, Vol. 38, Issue (10): 95-105.
Information Extraction and Text Mining


Abstract

Few-shot named entity recognition aims to identify and classify entities in text from a limited number of annotated examples. Current two-stage approaches to this task suffer from poor generalization and confusion between class prototypes. To address these problems, this paper proposes a two-stage method that enhances entity representations with label semantics. Specifically, label names carrying semantic information are used to enrich entity representations in both the span detection and entity classification models. In the span detection model, an attention mechanism fuses label semantics into the text representation, mitigating the model's limited generalization ability. Meanwhile, the enhanced entity representations are used to construct class prototypes, giving the prototypes richer features and reducing confusion between them, so that different entity types are better distinguished. Experimental results show that the proposed method makes full use of label semantic information and achieves strong performance on multiple benchmark datasets.
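The two-stage design described in the abstract can be sketched minimally: span detection fuses label-name embeddings into token representations via attention, and entity classification builds label-enhanced class prototypes and assigns each query span to its nearest prototype. The sketch below is illustrative only, not the authors' implementation; the function names, the simple additive fusion, and the use of mean-pooled support spans are all assumptions.

```python
import torch


def label_attention(token_emb: torch.Tensor, label_emb: torch.Tensor) -> torch.Tensor:
    """Fuse label semantics into token representations via scaled dot-product attention.

    token_emb: (seq_len, d) token embeddings; label_emb: (num_labels, d) label-name embeddings.
    Returns label-enhanced token representations of shape (seq_len, d).
    """
    scores = token_emb @ label_emb.T / label_emb.shape[-1] ** 0.5  # (seq_len, num_labels)
    attn = scores.softmax(dim=-1)
    return token_emb + attn @ label_emb  # residual fusion of label semantics


def class_prototypes(span_emb: torch.Tensor, span_labels: torch.Tensor,
                     label_emb: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Build one prototype per class from support-span embeddings,
    enhanced by adding the corresponding label-name embedding."""
    protos = []
    for c in range(num_classes):
        mean_span = span_emb[span_labels == c].mean(dim=0)  # centroid of class-c spans
        protos.append(mean_span + label_emb[c])             # label-enhanced prototype
    return torch.stack(protos)  # (num_classes, d)


def classify(query_emb: torch.Tensor, protos: torch.Tensor) -> torch.Tensor:
    """Assign each query span to the nearest prototype by Euclidean distance."""
    dists = torch.cdist(query_emb, protos)  # (num_query, num_classes)
    return dists.argmin(dim=-1)
```

In a real system the embeddings would come from a pretrained encoder such as BERT, and training would minimize a distance-based classification loss over episodes; the sketch only shows the label-fusion and prototype-construction steps.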


Keywords

few-shot named entity recognition / prototype network / label semantics

Cite this article
HUANG Weiguang, NING Zuoting, DUAN Junwen, AN Ying. Enhanced Prototype Network with Label Semantics for Few-Shot Named Entity Recognition. Journal of Chinese Information Processing. 2024, 38(10): 95-105


Funding

Hunan Provincial Natural Science Foundation Youth Project (2021JJ40783); Open Fund of the Hunan Provincial Key Laboratory of Network Crime Investigation in Colleges and Universities (2020WLFZZC004)