Hierarchical Global Context Augmented Document-level Neural Machine Translation

CHEN Linqing, LI Junhui, GONG Zhengxian

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 9: 67-75.
Machine Translation

Abstract

How to effectively exploit document-level context remains a major challenge in document-level neural machine translation (NMT). This paper proposes to use a hierarchical global context derived from the entire document to improve document-level NMT. The proposed model captures the dependencies between each word in the current sentence and all sentences of the document, as well as the dependencies between that word and all words of the document, and combines these different levels of dependency into a global context representation that carries hierarchical document information. As a result, every word in the current source sentence obtains its own context integrating both word-level and sentence-level dependencies. To take full advantage of parallel sentence pairs during training, a two-step training strategy is employed: a sentence-level model is first trained, and is then further trained on a document-level corpus to acquire the ability to capture the global context. Experiments on several benchmark datasets show that the proposed model achieves significant improvements in translation quality over several strong baselines, and further show that a context combining hierarchical document information outperforms a purely word-level context. In addition, this paper investigates different ways of integrating the global context into the translation model and their effect on performance, and presents a preliminary study of how the global context is distributed across the document.
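To make the proposed mechanism concrete, below is a minimal PyTorch sketch of the hierarchical global context described in the abstract. This is not the authors' implementation: the module name, the sigmoid-gated fusion, and the default dimensions are illustrative assumptions; only the overall idea (each current-sentence word attends to all document words and to per-sentence summaries, and the two levels are combined) follows the abstract.

```python
# Illustrative sketch only (not the paper's released code). Each word of the
# current sentence attends to (i) every word of the document (word-level
# dependencies) and (ii) one summary vector per document sentence
# (sentence-level dependencies); a sigmoid gate then fuses the two contexts.
import torch
import torch.nn as nn

class HierarchicalGlobalContext(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        # word-level dependencies: current words -> all document words
        self.word_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # sentence-level dependencies: current words -> sentence summaries
        self.sent_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # hypothetical gate mixing the two levels into one context per word
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, cur_words, doc_words, doc_sents):
        # cur_words: (batch, cur_len, d_model)  current-sentence word states
        # doc_words: (batch, doc_len, d_model)  word states of the whole document
        # doc_sents: (batch, n_sents, d_model)  one summary vector per sentence
        word_ctx, _ = self.word_attn(cur_words, doc_words, doc_words)
        sent_ctx, _ = self.sent_attn(cur_words, doc_sents, doc_sents)
        g = torch.sigmoid(self.gate(torch.cat([word_ctx, sent_ctx], dim=-1)))
        # each current-sentence word gets its own hierarchical global context
        return g * word_ctx + (1.0 - g) * sent_ctx
```

The gated fusion above is only one plausible way to combine the two levels; the abstract notes that the paper itself compares several ways of integrating the global context into the translation model. The two-step training strategy can likewise be sketched as two plain training loops, again assuming a hypothetical model interface with an optional doc_context argument:

```python
# Hedged sketch of the two-step strategy: pre-train at sentence level on
# parallel sentence pairs, then continue training on document-level data so
# the context modules learn to capture the global context. The model
# signature below is assumed for illustration.
def two_step_training(model, sent_batches, doc_batches, optimizer, loss_fn):
    # Step 1: train as a plain sentence-level Transformer (no document context).
    for src, tgt in sent_batches:
        optimizer.zero_grad()
        loss = loss_fn(model(src, tgt, doc_context=None), tgt)
        loss.backward()
        optimizer.step()
    # Step 2: second-stage training on document-level data, context enabled.
    for src, tgt, doc in doc_batches:
        optimizer.zero_grad()
        loss = loss_fn(model(src, tgt, doc_context=doc), tgt)
        loss.backward()
        optimizer.step()
```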

Key words

neural machine translation / document-level translation / document-level context

Cite this article

CHEN Linqing, LI Junhui, GONG Zhengxian. Hierarchical Global Context Augmented Document-level Neural Machine Translation. Journal of Chinese Information Processing, 2022, 36(9): 67-75.

Funding

National Natural Science Foundation of China (61876120, 61976148)