Vietnamese Chunk Identification Incorporating Attention Mechanism

WANG Wenhui, BI Yude, LEI Shujie

Journal of Chinese Information Processing, 2019, Vol. 33, Issue 12: 91-100.
Information Processing of Minority and Neighboring Languages


Abstract

For the Vietnamese chunk identification task, building on an earlier statistical survey of the part-of-speech composition patterns within Vietnamese chunks, this paper proposes two ways to integrate an attention mechanism into the Bi-LSTM+CRF model. The first is to apply attention at the input layer, which allows the model to flexibly adjust the relative weights of the word embeddings and the POS feature embeddings. The second is to add multi-head attention on top of the Bi-LSTM, which enables the model to learn a weight matrix over the Bi-LSTM outputs and selectively focus on important information. Experimental results show that attention at the input layer raises the F-score of chunk identification by 3.08%, and multi-head attention on top of the Bi-LSTM raises it by 4.56%, demonstrating the effectiveness of both methods.
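The two mechanisms described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the embedding sizes, the gating formula for the input-layer attention, and the head count are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def input_attention_fuse(word_emb, pos_emb, w):
    """Input-layer attention: a per-token gate that weights the word
    embedding against the POS feature embedding before the Bi-LSTM.
    word_emb, pos_emb: (seq_len, d); w: (2*d,) scoring vector (assumed form)."""
    d = word_emb.shape[1]
    scores = np.stack([word_emb @ w[:d], pos_emb @ w[d:]], axis=-1)  # (seq_len, 2)
    alpha = softmax(scores, axis=-1)          # the two weights sum to 1 per token
    return alpha[:, :1] * word_emb + alpha[:, 1:] * pos_emb          # (seq_len, d)

def multi_head_self_attention(h, n_heads, Wq, Wk, Wv):
    """Multi-head scaled dot-product self-attention over the Bi-LSTM
    output sequence h of shape (seq_len, d_model)."""
    seq_len, d_model = h.shape
    d_head = d_model // n_heads
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    out = np.zeros_like(h)
    for i in range(n_heads):
        s = slice(i * d_head, (i + 1) * d_head)
        attn = softmax(q[:, s] @ k[:, s].T / np.sqrt(d_head), axis=-1)
        out[:, s] = attn @ v[:, s]            # each head attends independently
    return out

# Toy example: 4 tokens, embedding size 8, 2 heads.
rng = np.random.default_rng(0)
seq_len, d = 4, 8
word_emb = rng.standard_normal((seq_len, d))
pos_emb = rng.standard_normal((seq_len, d))
fused = input_attention_fuse(word_emb, pos_emb, rng.standard_normal(2 * d))
attended = multi_head_self_attention(
    fused, 2, *(rng.standard_normal((d, d)) for _ in range(3)))
print(fused.shape, attended.shape)  # (4, 8) (4, 8)
```

In the full model, `attended` would then be fed to a CRF layer that decodes the chunk label sequence; the CRF decoding step is omitted here.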


Key words

Vietnamese / chunk identification / Bi-LSTM+CRF model / attention mechanism

Cite this article

WANG Wenhui, BI Yude, LEI Shujie. Vietnamese Chunk Identification Incorporating Attention Mechanism. Journal of Chinese Information Processing, 2019, 33(12): 91-100.
