Journal of Chinese Information Processing, 2020, Vol. 34, Issue 7: 105-112
Natural Language Processing Applications

Gated Context-Aware Network for Definition Generation

ZHANG Haitong 1,3, KONG Cunliang 2,3,4, YANG Liner 2,3,4, HE Shan 5, DU Yongping 1, YANG Erhong 2,3

Abstract

Traditional lexicography relies mainly on manual compilation, which is inefficient and consumes substantial resources. To reduce the time and economic cost of manual compilation, this paper proposes a definition generation method based on a gated context-aware network, which models the definition generation process with gated recurrent units (GRU) and automatically generates definitions for target words. The model follows the encoder-decoder architecture. The encoder first encodes the context of the target word with a bidirectional GRU, applies different matching strategies for the interaction between the target word and its context, and uses an attention mechanism to fuse contextual information into the target word's vector representation at both a coarse-grained and a fine-grained level, producing an encoding of the target word in its specific context. The decoder then generates a context-dependent definition conditioned on both the contextual and the semantic information of the target word. In addition, supplying the model with character-level features of the target word further improves the quality of the generated definitions. Experiments on the English Oxford dictionary dataset show that the proposed method generates definitions that are easy to read and understand, outperforming previous models by 4.45 in the perplexity of definition modeling and by 2.19 in the BLEU score of the generated definitions.
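To make the encoder described above concrete, below is a minimal PyTorch sketch of one gated context-aware encoding step: a bidirectional GRU encodes the context sentence, the target word attends over the context states (the fine-grained interaction), and a sigmoid gate fuses the attended context vector with the target-word embedding. This is an illustrative reconstruction, not the authors' implementation; all module names, dimensions, and the exact gating formula are assumptions, and the coarse-grained matching channel and character-level features are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedContextAwareEncoder(nn.Module):
    """Fine-grained channel of a gated context-aware encoder (sketch).

    A bidirectional GRU reads the context sentence; the target word
    attends over the context states; a sigmoid gate fuses the attended
    context vector with the target-word embedding.
    """

    def __init__(self, vocab_size, emb_dim=300, hid_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.ctx_gru = nn.GRU(emb_dim, hid_dim,
                              batch_first=True, bidirectional=True)
        # Bilinear attention: score(word, h_t) = h_t . (W w)
        self.attn = nn.Linear(emb_dim, 2 * hid_dim, bias=False)
        # Projects the word embedding into the context-state space.
        self.word_proj = nn.Linear(emb_dim, 2 * hid_dim, bias=False)
        # Gate controlling how much context flows into the fused vector.
        self.gate = nn.Linear(emb_dim + 2 * hid_dim, 2 * hid_dim)

    def forward(self, word_ids, context_ids):
        # word_ids: (B,), context_ids: (B, T)
        w = self.embed(word_ids)                           # (B, E)
        h, _ = self.ctx_gru(self.embed(context_ids))       # (B, T, 2H)

        # Fine-grained interaction: attention of the target word
        # over every context position.
        scores = torch.bmm(h, self.attn(w).unsqueeze(2))   # (B, T, 1)
        alpha = F.softmax(scores.squeeze(2), dim=1)        # (B, T)
        ctx = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)  # (B, 2H)

        # Gated fusion of contextual and lexical information.
        g = torch.sigmoid(self.gate(torch.cat([w, ctx], dim=1)))
        return g * ctx + (1.0 - g) * self.word_proj(w)     # (B, 2H)


# Toy usage: encode a target word in a 6-token context.
enc = GatedContextAwareEncoder(vocab_size=1000)
word = torch.tensor([42])                     # target-word id
context = torch.randint(0, 1000, (1, 6))      # context-token ids
fused = enc(word, context)                    # -> shape (1, 600)
```

In the full model described in the abstract, a coarse-grained matching channel and character-level features of the target word would be fused alongside this fine-grained path, and the resulting vector would initialize the GRU decoder that emits the definition token by token.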

Key words

definition generation / GRU / encoder-decoder / attention mechanism

Cite this article

ZHANG Haitong, KONG Cunliang, YANG Liner, HE Shan, DU Yongping, YANG Erhong. Gated Context-Aware Network for Definition Generation. Journal of Chinese Information Processing. 2020, 34(7): 105-112


Funding

Supported by the Beijing Advanced Innovation Center for Language Resources Project (TYZ19005), the National Key Research and Development Program of China (2018YFC1900804), and the Informatization Projects of the State Language Commission (ZOI135-105, YB135-89).