YIN Zhangzhi, LI Xinzi, HUANG Degen, LI Jiuyi. Chinese Named Entity Recognition Ensembled with Character[J]. Journal of Chinese Information Processing, 2019, 33(11): 95-100, 106.
Chinese Named Entity Recognition Ensembled with Character
YIN Zhangzhi, LI Xinzi, HUANG Degen, LI Jiuyi
School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
Abstract: Named entity recognition (NER) plays an important role in natural language processing. To obtain better results without manual features, this paper proposes an NER method based on an ensemble of BiLSTM models. First, BiLSTM-CRF models are trained on the data to obtain a character-based model, Char-NER, and a word-based model, Word-NER. The score vectors produced by the two models are then merged and used as input to an SVM model. Experimental results show that, without hand-crafted features, the method achieves F-scores of 94.04%, 92.15%, and 87.05% for person, location, and organization names on the 1998 People's Daily corpus, and 91.73%, 93.20%, and 83.15% on the MSRA corpus.
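The merging step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how word-level scores are aligned with character-level scores, so the alignment rule here (copying each word's score vector onto every character of that word) is an assumption, as are the function names `expand_word_scores` and `merge_scores`, the four-tag set, the segmentation, and all numeric scores, which are invented toy values rather than results from the paper.

```python
from typing import List

# Hypothetical tag set, for illustration only.
TAGS = ["O", "PER", "LOC", "ORG"]


def expand_word_scores(words: List[str],
                       word_scores: List[List[float]]) -> List[List[float]]:
    """Copy each word's score vector onto every character of that word.

    This alignment rule is an assumption, not taken from the paper.
    """
    char_level = []
    for word, scores in zip(words, word_scores):
        for _ in word:
            char_level.append(list(scores))
    return char_level


def merge_scores(char_scores: List[List[float]],
                 words: List[str],
                 word_scores: List[List[float]]) -> List[List[float]]:
    """Concatenate character-model and word-model scores per character."""
    expanded = expand_word_scores(words, word_scores)
    if len(expanded) != len(char_scores):
        raise ValueError("character and word sequences do not align")
    return [c + w for c, w in zip(char_scores, expanded)]


# Toy input: one score per tag in TAGS, values invented for illustration.
sentence = "大连理工大学"
words = ["大连", "理工", "大学"]  # hypothetical segmentation
char_scores = [[0.1, 0.0, 0.7, 0.2] for _ in sentence]
word_scores = [
    [0.1, 0.0, 0.6, 0.3],
    [0.2, 0.0, 0.1, 0.7],
    [0.1, 0.0, 0.1, 0.8],
]

# One merged feature vector per character: 4 char-model scores + 4 word-model scores.
features = merge_scores(char_scores, words, word_scores)
```

Each merged vector could then be paired with its gold tag and passed to an SVM classifier, for example scikit-learn's `sklearn.svm.SVC`. The paper's actual alignment scheme and SVM settings are not given in this abstract.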