Abstract: Integrating pre-defined bilingual pairs into Neural Machine Translation (NMT) has long been a challenging task with broad practical applications. Because of the word-by-word decoding strategy, explicitly integrating external bilingual pairs into NMT often requires modifying the beam search algorithm or even the model itself. This paper proposes a simple method for incorporating pre-defined bilingual pairs into NMT: (1) preprocessing the training data to add information about the pre-defined bilingual pairs; (2) using partially shared embeddings to help the model distinguish pre-defined bilingual pairs from other text. Experiments and analysis on multiple language pairs show that the method raises the success rate of translating pre-defined bilingual pairs to nearly 99% (versus 73.8% for the Chinese-English baseline).
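As a rough illustration of step (1), the sketch below inlines the prescribed target translation into the source sentence and marks it with tag tokens, in the spirit of code-switching approaches to prespecified translation. The function name, the `<dict>`/`</dict>` tag tokens, and the greedy longest-match replacement scheme are hypothetical choices for illustration; the paper's exact preprocessing format may differ.

```python
# Hypothetical sketch of step (1): replace source-side occurrences of
# pre-defined bilingual pairs with a tagged copy of their prescribed
# target translation. The <dict>/</dict> tags are an assumption, not
# the paper's exact format.

def inline_bilingual_pairs(src_tokens, bilingual_pairs):
    """Replace source phrases found in `bilingual_pairs` with their
    pre-defined target translation, wrapped in tag tokens."""
    out = []
    i = 0
    while i < len(src_tokens):
        matched = False
        # Greedily try the longest source phrase first.
        for length in range(len(src_tokens) - i, 0, -1):
            phrase = " ".join(src_tokens[i:i + length])
            if phrase in bilingual_pairs:
                # Mark the injected target text so the model can tell it
                # apart from ordinary source tokens.
                out += ["<dict>"] + bilingual_pairs[phrase].split() + ["</dict>"]
                i += length
                matched = True
                break
        if not matched:
            out.append(src_tokens[i])
            i += 1
    return out

# Example: the pair ("神经机器翻译" -> "neural machine translation")
# is pre-defined; the source sentence is tokenized Chinese.
pairs = {"神经机器翻译": "neural machine translation"}
src = ["我们", "研究", "神经机器翻译", "模型"]
print(inline_bilingual_pairs(src, pairs))
# ['我们', '研究', '<dict>', 'neural', 'machine', 'translation', '</dict>', '模型']
```

One plausible reading of step (2) is that the tag tokens and the injected target words receive embeddings partially shared with the target-side vocabulary, so the decoder learns to copy the pre-specified translation through rather than re-translate it; this is an interpretation of the abstract, not a confirmed implementation detail.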