ZONG Qinqin, LI Maoxi. Research on Neural Machine Translation Based on Re-decoding [J]. Journal of Chinese Information Processing, 2021, 35(6): 39-46.
Research on Neural Machine Translation Based on Re-decoding
ZONG Qinqin, LI Maoxi
School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Abstract: The Transformer is one of the best-performing machine translation models. Because it generates target tokens one by one from left to right, its decoding lacks the guidance of future contextual information. To alleviate this issue, we propose a neural machine translation model based on re-decoding. The model treats the generated machine translation output as an approximate target-language context and then re-decodes each token of that output in turn. The masked multi-head attention of the Transformer decoder masks only the token at the current position of the generated output, so every re-decoded token can make full use of both its left and right context. Experimental results on several WMT test sets show that re-decoding significantly improves translation quality.
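The key change described in the abstract is to the decoder's self-attention mask: instead of the standard causal (lower-triangular) mask, re-decoding masks only the current position, exposing both the left and right context of the draft translation. The sketch below contrasts the two masks. It is a minimal illustration in PyTorch under our own assumptions; the function names and the True-means-disallowed convention are ours, not taken from the paper.

    import torch

    def causal_mask(seq_len: int) -> torch.Tensor:
        # Standard Transformer decoder mask: position i may attend only to
        # positions <= i (True marks a disallowed position).
        return torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)

    def redecoding_mask(seq_len: int) -> torch.Tensor:
        # Hypothetical sketch of the re-decoding mask: when re-decoding the
        # token at position i, only position i itself is hidden, so the model
        # sees the full left and right context of the draft translation.
        return torch.eye(seq_len, dtype=torch.bool)

    if __name__ == "__main__":
        print(causal_mask(4))      # upper triangle masked
        print(redecoding_mask(4))  # only the diagonal masked

With the causal mask, token i is predicted from tokens 1..i-1 only; with the diagonal mask, token i is re-predicted from every other token in the draft, which is what lets re-decoding exploit future context.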