Abstract
Manual annotation for building Korean corpora is excessively time-consuming and labor-intensive, and resources for this minority language are difficult to integrate across providers. To address this problem, this paper takes a representation-learning perspective and aims to construct an effective representation of Korean sentence structure that improves downstream natural language processing tasks. Combining deep reinforcement learning with the self-attention mechanism, we propose a hierarchically structured self-attention model, Hierarchically Structured Korean (HS-K). Following the Actor-Critic idea from reinforcement learning, the model takes text classification performance as the reward feedback and recasts the sentence-structure segmentation task as a sequential decision task. Experimental results show that the model identifies important Korean sentence structures close to manual annotation, providing good support for the informatization and intelligent processing of Korean.
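The core training signal described above, an Actor-Critic policy that makes per-token structure-splitting decisions and receives the downstream classifier's performance as a delayed reward, can be illustrated with a short sketch. The following is a minimal PyTorch illustration of that signal flow, not the authors' HS-K implementation: the network sizes, the toy segment-masking step in `Classifier`, and all module names are assumptions made for this example.

```python
# Minimal sketch (assumed, illustrative): an actor makes split/no-split
# decisions over a sentence, a classifier scores the resulting structure,
# and the negative classification loss is fed back as the policy reward,
# with a critic supplying the baseline.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, N_CLASSES = 5000, 64, 64, 4  # illustrative sizes

class Actor(nn.Module):
    """Policy: emits a keep/split log-probability at each position."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.gru = nn.GRU(EMB, HID, batch_first=True)
        self.head = nn.Linear(HID, 2)            # action space: {keep, split}

    def forward(self, tokens):
        h, _ = self.gru(self.emb(tokens))
        return F.log_softmax(self.head(h), dim=-1)   # (B, T, 2)

class Critic(nn.Module):
    """Baseline: estimates the expected reward for a sentence."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.gru = nn.GRU(EMB, HID, batch_first=True)
        self.v = nn.Linear(HID, 1)

    def forward(self, tokens):
        _, h = self.gru(self.emb(tokens))            # h: (1, B, HID)
        return self.v(h[-1]).squeeze(-1)             # (B,)

class Classifier(nn.Module):
    """Downstream classifier whose loss defines the reward signal."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.fc = nn.Linear(EMB, N_CLASSES)

    def forward(self, tokens, splits):
        # Placeholder use of the structure decisions: a real model would
        # re-pool per segment; here we simply mask by the sampled actions.
        e = self.emb(tokens) * splits.unsqueeze(-1).float()
        return self.fc(e.mean(dim=1))

actor, critic, clf = Actor(), Critic(), Classifier()
opt = torch.optim.Adam([*actor.parameters(), *critic.parameters(),
                        *clf.parameters()], lr=1e-3)

tokens = torch.randint(0, VOCAB, (8, 20))    # dummy batch of sentences
labels = torch.randint(0, N_CLASSES, (8,))   # dummy class labels

opt.zero_grad()
logp = actor(tokens)                                     # (8, 20, 2)
actions = torch.distributions.Categorical(logits=logp).sample()
clf_loss = F.cross_entropy(clf(tokens, actions), labels, reduction="none")
reward = -clf_loss.detach()          # classification result as label feedback
baseline = critic(tokens)
advantage = reward - baseline.detach()

# Policy gradient over the whole decision sequence (delayed reward).
act_logp = logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1).sum(dim=1)
actor_loss = -(advantage * act_logp).mean()
critic_loss = F.mse_loss(baseline, reward)
(actor_loss + critic_loss + clf_loss.mean()).backward()
opt.step()
```

Because the split decisions are discrete and the reward only arrives after classification, the policy cannot be trained by ordinary backpropagation; the advantage-weighted log-probability term above is what turns the classifier's feedback into a gradient for the segmentation policy.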
Keywords
Korean natural language processing / deep reinforcement learning / self-attention mechanism / sentence structuring
Funding
National Language Commission "13th Five-Year" Research Planning Project (YB135-76); Yanbian University World-Class Discipline Construction Research Project in Foreign Languages and Literatures (18YLPY13)