Abstract: For sentence-level neural machine translation, the problem of incomplete semantic representation is noticeable, since the contextual information surrounding the current sentence is not considered. We extract effective information from each sentence in a document via dependency parsing, and then supplement the source sentences with the extracted information, making their semantic representation more complete. We conduct experiments on the Chinese-English language pair and, to address the scarcity of document-level parallel corpora, propose a training method that exploits large-scale sentence-level parallel data. Compared with the baseline model, our approach achieves a significant improvement of 1.47 BLEU. The experiments show that document-level neural machine translation based on context recovery can effectively alleviate the incomplete semantic representation problem of sentence-level neural machine translation.
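The context-recovery idea sketched in the abstract can be illustrated as follows. This is a minimal, hypothetical sketch, not the paper's actual implementation: it assumes dependency parses are already available as (head, relation, dependent) triples, and that "effective information" means tokens filling core grammatical roles (subject, root, object), which are then prepended to the next source sentence.

```python
# Illustrative sketch of context recovery via dependency parsing.
# Data format and relation labels (Universal Dependencies style) are assumptions.

def extract_key_tokens(parse):
    """Keep dependents that fill core grammatical roles in the sentence."""
    core_relations = {"nsubj", "dobj", "root"}
    return [dep for head, rel, dep in parse if rel in core_relations]

def recover_context(sentences, parses):
    """Prepend key tokens from the preceding sentence to each source sentence."""
    augmented, context = [], []
    for sent, parse in zip(sentences, parses):
        # Supplement the current sentence with context from the previous one.
        augmented.append(context + list(sent))
        context = extract_key_tokens(parse)  # carried over to the next sentence
    return augmented

sentences = [["Mary", "bought", "a", "book"], ["She", "read", "it"]]
parses = [
    [("bought", "nsubj", "Mary"), ("bought", "root", "bought"), ("bought", "dobj", "book")],
    [("read", "nsubj", "She"), ("read", "root", "read"), ("read", "dobj", "it")],
]
print(recover_context(sentences, parses))
```

The augmented sentences would then be fed to a standard sentence-level NMT model in place of the originals, so no architectural change is required.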