Abstract
Neural machine translation (NMT) has recently achieved remarkable success in sentence-level translation. However, it still cannot resolve a wide variety of discourse phenomena, such as lexical cohesion and coreference, which can be alleviated by exploiting context information in document-level translation. In contrast to existing studies that model source-side context, this paper proposes to model target-side context in document-level NMT. Specifically, motivated by deliberation networks, our approach translates the source-side document twice. The first pass performs sentence-level translation; the second pass re-translates each sentence while modeling the target-side context that has just been generated by the first pass over the whole document. Experimental results on the LDC Chinese-to-English and WMT English-to-German document-level translation tasks show that our approach significantly improves translation performance while introducing only a few parameters. Moreover, the proposed approach benefits more as the quality of the first-pass (i.e., sentence-level) translation improves.
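The two-pass scheme the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: `first_pass` and `second_pass` are hypothetical stand-ins for the sentence-level NMT model and the context-aware second-pass decoder, and the toy functions below exist only to make the control flow runnable.

```python
from typing import Callable, List

def deliberation_translate(
    doc: List[str],
    first_pass: Callable[[str], str],
    second_pass: Callable[[str, List[str]], str],
) -> List[str]:
    """Two-pass document translation in the style of deliberation networks.

    1) Translate every source sentence independently (sentence-level NMT).
    2) Re-translate each sentence, conditioning on the full first-pass
       draft of the document as target-side context.
    """
    draft = [first_pass(sent) for sent in doc]         # pass 1: sentence-level
    return [second_pass(sent, draft) for sent in doc]  # pass 2: sees whole draft

# Toy stand-ins; a real system would use Transformer encoder-decoders.
def toy_first_pass(src: str) -> str:
    return src.upper()

def toy_second_pass(src: str, context: List[str]) -> str:
    # A real second pass would attend over `context` to enforce
    # document-level consistency; here we just record its size.
    return toy_first_pass(src) + f" [ctx:{len(context)}]"

print(deliberation_translate(["a b", "c d"], toy_first_pass, toy_second_pass))
# → ['A B [ctx:2]', 'C D [ctx:2]']
```

Note that, unlike source-side context models, every sentence in the second pass can condition on first-pass translations of *future* sentences as well as past ones, since the entire draft is available.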
Key words
neural machine translation /
deliberation networks /
document-level translation
Funding
National Natural Science Foundation of China (61876120, 61976148)