Improving Unsupervised Neural Machine Translation with Quality Estimation

XU Jia, YE Na, ZHANG Guiping, LI Tianyu

Journal of Chinese Information Processing, 2021, Vol. 35, Issue 3: 51-59.
Section: Machine Translation


Abstract

Traditionally, neural machine translation relies on large-scale bilingual parallel corpora. Unsupervised neural machine translation avoids this heavy dependence on parallel data, making it better suited to low-resource languages and domains. During training, unsupervised neural machine translation generates pseudo-parallel data, whose quality plays a decisive role in the final translation quality. This paper therefore proposes an unsupervised neural machine translation model that incorporates quality estimation: during back-translation, the generated pseudo-parallel data are scored with a quality estimation model, and the parallel data with higher (HTER-based) scores are selected to train the neural network. Quality estimation controls the quality of the pseudo-parallel data produced by back-translation and provides the generative adversarial network with richer training samples, allowing it to be trained more thoroughly. Compared with the baseline model, the proposed model improves BLEU by 0.79 and 0.55 on the WMT 2019 German-English and Czech-English monolingual news corpora, respectively.

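The filtering step the abstract describes — scoring back-translated pseudo-parallel pairs with a quality estimation model and keeping only the higher-scoring ones for training — can be sketched as follows. This is a minimal illustration, not the authors' implementation: `translate` and `qe_score` are hypothetical stand-ins for the real back-translation model and the QE model (the paper uses HTER-based scores), and the convention here that a higher score means better estimated quality is an assumption of this sketch.

```python
def back_translate(sentences, translate):
    """Generate pseudo-parallel pairs (source sentence, synthetic translation)
    by running each monolingual sentence through the (stand-in) translator."""
    return [(s, translate(s)) for s in sentences]


def qe_filter(pairs, qe_score, keep_ratio=0.5):
    """Score each pseudo-parallel pair with a (stand-in) QE model and keep
    only the highest-scoring fraction for training.

    qe_score(src, hyp) -> float, where a higher value is assumed to mean
    better estimated translation quality in this sketch.
    """
    scored = sorted(pairs, key=lambda p: qe_score(*p), reverse=True)
    k = max(1, int(len(scored) * keep_ratio))
    return scored[:k]
```

In the pipeline the abstract outlines, the surviving pairs would then be fed back into training of the translation model at each back-translation iteration, so the QE threshold directly controls the quality of the synthetic training data.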

Key words

unsupervised neural machine translation / back-translation / quality estimation


Cite This Article

XU Jia, YE Na, ZHANG Guiping, LI Tianyu. Improving Unsupervised Neural Machine Translation with Quality Estimation. Journal of Chinese Information Processing. 2021, 35(3): 51-59


Funding

Youth Program of Humanities and Social Sciences Research, Ministry of Education (19YJC740107); National Natural Science Foundation of China (U1908216); Key R&D Program of Liaoning Province (2019JHZ/10100020)