Abstract: Neural machine translation is currently the dominant approach in machine translation, while a translation memory is a tool that helps professional translators avoid re-translating previously translated segments. This paper proposes two methods for integrating a translation memory into neural machine translation via data augmentation: (1) directly concatenating the retrieved translation memory after the source sentence; (2) concatenating the translation memory and distinguishing it with tag embeddings. Experiments on Chinese-English and English-German datasets show that the proposed methods achieve significant improvements.
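The two augmentation schemes can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the separator token `<sep>` and the 0/1 tag values are assumptions, and in the tag-embedding variant the per-token tags would be mapped to learned embeddings and added to the encoder input.

```python
def augment_with_tm(source_tokens, tm_target_tokens, sep="<sep>"):
    """Method (1): append the retrieved TM target sentence after the
    source sentence, separated by a special token (token name assumed)."""
    return source_tokens + [sep] + tm_target_tokens


def augment_with_tags(source_tokens, tm_target_tokens, sep="<sep>"):
    """Method (2): same concatenation, plus a per-token segment tag so a
    tag embedding can tell source tokens (0) apart from TM tokens (1).
    The 0/1 encoding here is an assumption for illustration."""
    tokens = source_tokens + [sep] + tm_target_tokens
    # The separator is tagged as part of the source segment here; either
    # convention works as long as it is applied consistently.
    tags = [0] * (len(source_tokens) + 1) + [1] * len(tm_target_tokens)
    return tokens, tags
```

In the tag-embedding variant, the encoder would look up an embedding for each tag and add it to the corresponding token embedding, analogous to segment embeddings in BERT-style models.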