1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Abstract: Neural machine translation (NMT) achieves good results when sufficient parallel corpora are available, but often performs poorly on low-resource translation tasks. To address Chinese-Vietnamese NMT, for which no large-scale parallel corpus exists, we exploit readily available Chinese and Vietnamese monolingual corpora by mining cross-lingual information at the word level. We propose a Chinese-Vietnamese unsupervised neural machine translation method that induces a bilingual dictionary by minimizing the Earth Mover's Distance (EMD). First, monolingual word embeddings for Chinese and Vietnamese are trained independently, and a Chinese-Vietnamese bilingual dictionary is obtained by minimizing the EMD between them. This dictionary is then used as a seed dictionary to train Chinese-Vietnamese bilingual word embeddings. Finally, a shared-encoder unsupervised machine translation model is applied to build a Chinese-Vietnamese unsupervised NMT system. Experiments show that the method effectively improves the performance of Chinese-Vietnamese unsupervised neural machine translation.
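The dictionary-induction step above can be illustrated with a minimal sketch. All words, vectors, and helper names below are hypothetical toy data, not the paper's actual embeddings or solver: the method minimizes EMD over real monolingual word embeddings to learn a cross-lingual mapping, whereas this sketch assumes the two spaces are already roughly aligned and reduces EMD with uniform word weights and equal vocabulary sizes to a minimum-cost one-to-one matching.

```python
# Toy sketch of EMD-based bilingual dictionary induction.
from itertools import permutations
from math import sqrt

# Hypothetical 3-word vocabularies with hand-made 3-d "embeddings";
# the Vietnamese vectors are small perturbations of the Chinese ones,
# standing in for embeddings already mapped into a shared space.
zh = {"猫": (1.0, 0.1, 0.0), "狗": (0.0, 1.0, 0.1), "水": (0.1, 0.0, 1.0)}
vi = {"mèo": (0.9, 0.2, 0.0), "chó": (0.1, 1.0, 0.0), "nước": (0.0, 0.1, 1.0)}

def cosine_cost(a, b):
    """Transport cost between two word vectors: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# With uniform word weights and equal vocabulary sizes, minimizing EMD
# reduces to a minimum-cost one-to-one matching; for 3 words we can
# simply enumerate all permutations instead of running an EMD solver.
zh_words, vi_words = list(zh), list(vi)
best = min(
    permutations(vi_words),
    key=lambda p: sum(cosine_cost(zh[z], vi[v]) for z, v in zip(zh_words, p)),
)
dictionary = list(zip(zh_words, best))
print(dictionary)  # → [('猫', 'mèo'), ('狗', 'chó'), ('水', 'nước')]
```

In the paper's fully unsupervised setting the embedding spaces are not pre-aligned; the EMD between the two word distributions is instead minimized jointly with the cross-lingual mapping, and the resulting word pairs serve as the seed dictionary for training bilingual embeddings.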