Abstract: Traditionally, neural machine translation relies on large-scale bilingual parallel corpora. Unsupervised neural machine translation instead avoids this dependence by generating pseudo-parallel data, whose quality plays a decisive role in model training. To ensure the final translation quality, we propose an unsupervised neural machine translation model that uses quality estimation to control the quality of the generated pseudo-parallel data. Specifically, during back-translation, we use quality estimation to score the generated pseudo-parallel data, and then select the sentence pairs with the best quality scores (i.e., the lowest predicted HTER) to train the neural network. Compared with the baseline system, BLEU scores increase by 0.79 and 0.55 on the WMT 2019 German-English and Czech-English monolingual news corpora, respectively.
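The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an external quality estimation model (e.g., one trained with OpenKiwi) has already produced a predicted HTER score for each back-translated sentence pair, where lower HTER indicates higher quality; the function name and threshold value are hypothetical.

```python
# Minimal sketch of QE-based filtering of back-translated pseudo-parallel data.
# Assumes predicted HTER scores come from a separately trained QE model
# (lower HTER = higher estimated quality). Names and threshold are illustrative.

def select_by_hter(pairs, hter_scores, threshold=0.3):
    """Keep pseudo-parallel pairs whose predicted HTER is below the threshold."""
    return [pair for pair, hter in zip(pairs, hter_scores) if hter < threshold]

# Toy example: the second pair is a noisy back-translation and gets discarded.
pairs = [("ein Haus", "a house"), ("der Hund läuft", "the dog the dog")]
scores = [0.10, 0.85]
filtered = select_by_hter(pairs, scores)  # only the first pair survives
```

The surviving pairs would then be added to the training data for the next back-translation iteration, so that each round of self-training sees progressively cleaner pseudo-parallel sentences.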