Chinese Word Segmentation Based on ACNNC Model

ZHANG Zhonglin, YU Wei, YAN Guanghui, YUAN Chenyu

Journal of Chinese Information Processing, 2022, Vol. 36, Issue (8): 12-19, 28.
Language Analysis and Computation

Abstract

Most existing Chinese word segmentation models are based on recurrent neural networks, which capture the global features of a sequence but tend to overlook its local features. To address this, this paper combines the attention mechanism, convolutional neural networks, and conditional random fields, and proposes the Attention Convolutional Neural Network CRF (ACNNC) model. An embedding layer trains the word vectors, a self-attention layer replaces the recurrent neural network to capture global sequence features, and a convolutional neural network captures local and positional features of the sequence; the fused features are then fed into a conditional random field for decoding. Experiments show that the proposed model achieves better segmentation results on the SIGHAN Bakeoff 2005 test sets, with F1 scores of 96.2%, 96.4%, 96.1%, and 95.8% on PKU, MSR, CITYU, and AS, respectively.
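The pipeline the abstract describes (embedding → self-attention for global features, in parallel a convolution for local features → fusion → CRF decoding) can be sketched with untrained toy weights. This is a minimal illustration, not the authors' implementation: all dimensions, the random weights, the 4-tag (B/M/E/S) scheme, single-head attention, and fusion by concatenation are assumptions made for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative, not from the paper):
# vocab V, embedding dim D, sequence length T, tag count K (B/M/E/S)
V, D, T, K = 100, 8, 5, 4
E = rng.normal(size=(V, D))        # embedding table
x = rng.integers(0, V, size=T)     # a character-index sequence
H = E[x]                           # (T, D) embedded sequence

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Self-attention layer: every position attends to all others (global features)
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
Q, Km, Vm = H @ Wq, H @ Wk, H @ Wv
A = softmax(Q @ Km.T / np.sqrt(D)) @ Vm          # (T, D)

# Convolutional layer: window-3 1-D convolution, same padding (local features)
Wc = rng.normal(size=(3, D, D))
Hp = np.pad(H, ((1, 1), (0, 0)))
C = np.stack([sum(Hp[t + j] @ Wc[j] for j in range(3)) for t in range(T)])

# Fusion layer: concatenate global + local features, project to tag scores
Wo = rng.normal(size=(2 * D, K))
emissions = np.concatenate([A, C], axis=1) @ Wo  # (T, K) CRF emission scores

# CRF decoding: Viterbi search over tag-transition scores
trans = rng.normal(size=(K, K))

def viterbi(em, tr):
    score = em[0].copy()
    back = np.zeros(em.shape, dtype=int)
    for t in range(1, len(em)):
        cand = score[:, None] + tr + em[t]       # score of every tag bigram
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    tags = [int(score.argmax())]
    for t in range(len(em) - 1, 0, -1):          # follow backpointers
        tags.append(int(back[t][tags[-1]]))
    return tags[::-1]

tags = viterbi(emissions, trans)                 # one B/M/E/S tag per character
```

With trained weights, the decoded B/M/E/S tag sequence would be converted back into word boundaries; here the weights are random, so only the shapes and data flow are meaningful.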

Keywords

Chinese word segmentation / deep learning / attention mechanism

Cite this article

ZHANG Zhonglin, YU Wei, YAN Guanghui, YUAN Chenyu. Chinese Word Segmentation Based on ACNNC Model. Journal of Chinese Information Processing, 2022, 36(8): 12-19, 28.

Funding

National Natural Science Foundation of China (61662043, 62062049); Gansu Provincial Philosophy and Social Science Planning Project (20YB056)