Abstract
Neural machine translation (NMT) achieves good results when parallel corpora are plentiful, but often performs poorly on low-resource language pairs. No large-scale Chinese-Vietnamese parallel corpus exists, so for this translation task we explore using only easily obtained Chinese and Vietnamese monolingual corpora, mining word-level cross-lingual information from them and integrating it into an unsupervised translation model to improve translation quality. We propose a Chinese-Vietnamese unsupervised neural machine translation method that incorporates a bilingual dictionary induced by Earth Mover's Distance (EMD) minimization. First, monolingual word embeddings are trained separately for Chinese and Vietnamese, and a Chinese-Vietnamese bilingual dictionary is induced by minimizing the EMD between them. This dictionary is then used as a seed dictionary to train Chinese-Vietnamese bilingual word embeddings. Finally, an unsupervised machine translation model with a shared encoder is used to build the Chinese-Vietnamese unsupervised NMT system. Experiments show that the method effectively improves the performance of Chinese-Vietnamese unsupervised neural machine translation.
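The dictionary-induction step described above can be illustrated with a minimal optimal-transport sketch. This is an illustrative simplification, not the authors' implementation: it treats each vocabulary as a uniform distribution over its word embeddings, solves the EMD transport problem exactly with SciPy's linear-programming solver, and reads a word-level lexicon off the transport plan. Real systems operate over large vocabularies with specialized OT solvers; the function and variable names here are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def emd_transport(src_emb, tgt_emb):
    """Solve the EMD (optimal transport) problem between two uniform
    distributions over word embeddings.

    src_emb: (n, d) source-language word embeddings
    tgt_emb: (m, d) target-language word embeddings
    Returns the (n, m) optimal flow matrix.
    """
    n, m = len(src_emb), len(tgt_emb)
    # Cosine-distance cost between every source/target word pair.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    cost = 1.0 - src @ tgt.T

    # Marginal constraints on the flattened (n*m) flow vector:
    # each source word ships mass 1/n, each target word receives 1/m.
    A_eq, b_eq = [], []
    for i in range(n):                       # row sums
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row)
        b_eq.append(1.0 / n)
    for j in range(m):                       # column sums
        col = np.zeros(n * m)
        col[j::m] = 1.0
        A_eq.append(col)
        b_eq.append(1.0 / m)

    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.x.reshape(n, m)

def induce_lexicon(src_emb, tgt_emb):
    """Pair each source word with the target word receiving the most mass."""
    flow = emd_transport(src_emb, tgt_emb)
    return flow.argmax(axis=1)
```

In the paper's pipeline, the word pairs extracted this way would serve as the seed dictionary for training the bilingual word embeddings used by the shared-encoder translation model.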
Keywords
unsupervised learning /
Earth Mover's Distance /
Chinese-Vietnamese /
neural machine translation
Funding
National Key Research and Development Program of China (2019QY1801); National Natural Science Foundation of China (61732005, 61672271, 61761026, 61762056, 61866020); Yunnan Provincial High-Tech Industry Special Project (201606)