Combining Iterated Dilated Convolutions Neural Network and Hierarchical Attention Network for Chinese Named Entity Recognition
CHEN Ru1,2, LU Xianling2,3
1. Key Laboratory of Advanced Process Control for Light Industry (MOE), Jiangnan University, Wuxi, Jiangsu 214122, China; 2. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China; 3. Jiangsu Key Construction Laboratory of IoT Application Technology, Wuxi, Jiangsu 214100, China
Abstract: The IDC-HSAN (Iterated Dilated Convolutions Neural Network and Hierarchical Self-Attention Network) model is constructed for Chinese named entity recognition to handle the hierarchical structure of text and the limited parallelism of RNN-based models. The model enables parallel computation on GPUs and significantly reduces the time cost compared with LSTM. A hierarchical self-attention mechanism is applied to capture both local and global semantic information, and radical information is additionally employed to enrich the character embeddings. Experimental results show that the model identifies entities better than classical deep models equipped with attention mechanisms.
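To make the parallelism argument concrete, the following is a minimal sketch of the iterated dilated convolution idea described in the abstract: a small stack of dilated 1-D convolutions is reapplied with shared weights so the receptive field grows exponentially while every position is computed independently. The function names, dilation rates, and toy dimensions are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Dilated 1-D convolution over a (seq_len, dim) sequence.

    x: (T, D) input; w: (K, D, D) kernel with K taps. Zero padding
    keeps the output length equal to T so layers can be stacked.
    All T positions are computed independently, hence GPU-friendly.
    """
    T, D = x.shape
    K = w.shape[0]
    pad = (K // 2) * dilation
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((T, D))
    for k in range(K):  # sum over kernel taps, spaced `dilation` apart
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return np.maximum(out, 0.0)  # ReLU

def idcnn_block(x, weights, rates=(1, 2, 4), iterations=2):
    """One 'iterated' block: the same stack of dilation rates is
    reapplied with shared weights (hypothetical rates/iterations)."""
    for _ in range(iterations):
        for r, w in zip(rates, weights):
            x = dilated_conv1d(x, w, r)
    return x

rng = np.random.default_rng(0)
T, D, K = 10, 8, 3
chars = rng.standard_normal((T, D))                     # toy character embeddings
ws = [rng.standard_normal((K, D, D)) * 0.1 for _ in (1, 2, 4)]
h = idcnn_block(chars, ws)
print(h.shape)  # (10, 8): one contextualised vector per character
```

Unlike an LSTM, which must process the T characters sequentially, every output position here depends only on a fixed window of inputs, so the whole sequence can be computed in parallel.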