Journal of Chinese Information Processing ›› 2023, Vol. 37 ›› Issue (12): 87-97.
Information Extraction and Text Mining

Chinese Named Entity Recognition Based on Multi-task Label Consistency

  • LYU Shuning, LIU Jian, XU Jin'an, CHEN Yufeng, ZHANG Yujie

Abstract

Entity boundary prediction is essential for Chinese named entity recognition. Most existing multi-task learning methods proposed to improve boundary recognition simply combine the task with Chinese word segmentation; lacking training data annotated with multi-task labels, they cannot learn the label consistency relationships among the tasks. This paper proposes a new Chinese named entity recognition method based on a multi-task label consistency mechanism: it integrates word segmentation and part-of-speech information into the named entity recognition model and jointly trains the three tasks of named entity recognition, word segmentation, and part-of-speech tagging, and it establishes a multi-task learning mode based on the label consistency mechanism, which strengthens boundary information learning, captures label consistency relationships, and yields better multi-task representations. Compared with the baseline model, the method improves the F1 score by 10.28%, 11.17%, and 8.84% in the full-sample, simulated small-sample, and real small-sample experiments, respectively, demonstrating its effectiveness.
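To make the joint-training setup concrete, the following is a minimal sketch of a shared BERT encoder with three token-level classification heads (named entity recognition, word segmentation, part-of-speech tagging) trained with a summed cross-entropy loss, assuming a PyTorch / Hugging Face Transformers environment. The tag-set sizes, the plain linear heads, and the equal loss weighting are illustrative assumptions, not details taken from the paper, and the paper's label consistency mechanism is not reproduced here.

```python
# Minimal multi-task tagging sketch (illustrative only, not the paper's exact model):
# a shared BERT encoder with three token-level heads for NER, Chinese word
# segmentation (CWS), and POS tagging, trained with a summed cross-entropy loss.
import torch
import torch.nn as nn
from transformers import BertModel

class MultiTaskTagger(nn.Module):
    def __init__(self, bert_name="bert-base-chinese",
                 num_ner_labels=13, num_cws_labels=4, num_pos_labels=32):
        super().__init__()
        self.encoder = BertModel.from_pretrained(bert_name)
        hidden = self.encoder.config.hidden_size
        # One linear classification head per task over the shared representation.
        self.ner_head = nn.Linear(hidden, num_ner_labels)
        self.cws_head = nn.Linear(hidden, num_cws_labels)
        self.pos_head = nn.Linear(hidden, num_pos_labels)
        # ignore_index=-100 masks padding / special-token positions in the labels.
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

    def forward(self, input_ids, attention_mask,
                ner_labels=None, cws_labels=None, pos_labels=None):
        # Shared character-level representations from BERT.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        ner_logits = self.ner_head(hidden)
        cws_logits = self.cws_head(hidden)
        pos_logits = self.pos_head(hidden)
        loss = None
        if (ner_labels is not None and cws_labels is not None
                and pos_labels is not None):
            # Joint objective: sum of the three per-task losses (equal weights assumed).
            loss = (self.loss_fn(ner_logits.flatten(0, 1), ner_labels.flatten())
                    + self.loss_fn(cws_logits.flatten(0, 1), cws_labels.flatten())
                    + self.loss_fn(pos_logits.flatten(0, 1), pos_labels.flatten()))
        return loss, ner_logits, cws_logits, pos_logits
```

On top of a skeleton of this kind, the paper additionally applies its label consistency mechanism to couple the label predictions of the three tasks and strengthen boundary learning.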

Key words

Chinese named entity recognition / multi-task learning / label consistency mechanism / BERT model

Cite This Article

LYU Shuning, LIU Jian, XU Jin'an, CHEN Yufeng, ZHANG Yujie. Chinese Named Entity Recognition Based on Multi-task Label Consistency. Journal of Chinese Information Processing. 2023, 37(12): 87-97


Funding

National Key Research and Development Program of China (2019YFB1405200); National Natural Science Foundation of China (61976015, 61976016, 61876198, 61370130)