Combining Iterated Dilated Convolutional Neural Networks and Hierarchical Attention Network for Chinese Named Entity Recognition

CHEN Ru 1,2, LU Xianling 2,3

Journal of Chinese Information Processing. 2020, Vol. 34, Issue (8): 70-77.
Information Extraction and Text Mining


Abstract

Existing named entity recognition (NER) models overlook the role of hierarchical text structure in entity recognition, and recurrent neural networks are computationally inefficient because of their sequential recurrence. To address both problems, this paper constructs the IDC-HSAN model (Iterated Dilated Convolutional Neural Networks and Hierarchical Self-Attention Network). The iterated dilated convolutional neural network (ID-CNN) fully exploits GPU parallelism, greatly reducing the time cost incurred by long short-term memory networks. A hierarchical self-attention mechanism then captures important local features and the key semantic information in the global context. In addition, radical information is incorporated to enrich the character embeddings. Experimental results on datasets from different domains show that IDC-HSAN extracts useful entity information from text and improves recognition over both classical deep network models and attention-based NER models.
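The ID-CNN encoder named in the abstract stacks dilated convolutions whose receptive field grows exponentially with depth, which is what allows wide context without recurrence. A minimal NumPy sketch of one iterated-dilated block, with illustrative function names and shapes that are assumptions rather than the authors' implementation:

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Same-padded 1D dilated convolution over a (seq_len, d_in) sequence.
    w has shape (kernel_size, d_in, d_out); kernel_size must be odd."""
    k, d_in, d_out = w.shape
    pad = dilation * (k - 1) // 2          # keep output length == input length
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((x.shape[0], d_out))
    for t in range(x.shape[0]):
        for i in range(k):
            # tap i sits i * dilation steps along the padded sequence,
            # so larger dilations skip over more tokens between taps
            out[t] += xp[t + i * dilation] @ w[i]
    return np.maximum(out, 0.0)            # ReLU nonlinearity

def idcnn_block(x, weights, dilations=(1, 2, 4)):
    """One iterated-dilated block: convolutions with geometrically growing
    dilation, widening the receptive field exponentially with depth."""
    for w, d in zip(weights, dilations):
        x = dilated_conv1d(x, w, d)
    return x
```

With kernel size 3 and dilations (1, 2, 4), each output position in one block sees 15 input tokens; iterating the block widens context further. Because each position is computed independently, the whole layer parallelizes across the sequence on a GPU, which is the speed advantage over LSTMs that the abstract describes.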

Keywords

attention mechanism / iterated dilated convolutional neural networks / Chinese named entity recognition

Cite this article

CHEN Ru, LU Xianling. Combining Iterated Dilated Convolutional Neural Networks and Hierarchical Attention Network for Chinese Named Entity Recognition. Journal of Chinese Information Processing. 2020, 34(8): 70-77


Funding

Ministry of Education CERNET Next-Generation Internet Technology Innovation Project (NGII20170623)