Knowledge Augmentation and Incremental Pruning for NMT Domain Adaptation

CHEN Yang, YANG Chunming, ZHANG Hui, WANG Yi, LI Bo

Journal of Chinese Information Processing ›› 2023, Vol. 37 ›› Issue (6): 96-103
Machine Translation

Abstract

Domain-adaptive neural machine translation is one way to cope with the scarcity of in-domain corpora in low-resource translation. To address the overfitting of domain knowledge and the lack of adaptivity in current multi-model ensemble approaches, this paper proposes a multi-domain adaptation method based on knowledge augmentation and incremental pruning (KAIP). The method first applies a knowledge masking strategy to generate an auxiliary corpus for the target domain, which is used for auxiliary task learning and thereby achieves knowledge augmentation. It then applies a model pruning strategy to construct general-domain parameters and, in combination with the auxiliary task learning, trains target-domain parameters, allowing the model to adapt to several different domains without adjusting the model's existing parameters. Experiments on corpora covering multiple language pairs and domains show that the method significantly improves translation quality in both single-domain and multi-domain settings.
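
The abstract describes the knowledge masking step only at a high level, so the Python sketch below is one plausible reading rather than the paper's implementation: random source-side tokens are hidden behind a mask symbol, and the resulting sentence pairs form the auxiliary corpus used for auxiliary task learning. The mask token, the `mask_ratio` parameter, and the function name are all assumptions made for illustration.

```python
import random

MASK = "<mask>"  # assumed mask symbol; the paper's actual token is not given


def make_auxiliary_corpus(parallel_corpus, mask_ratio=0.15, seed=0):
    """Build an auxiliary corpus by hiding a fraction of source tokens.

    Target sentences stay intact, so during auxiliary task learning the
    model must recover the hidden (masked) knowledge from context.
    """
    rng = random.Random(seed)
    auxiliary = []
    for src, tgt in parallel_corpus:
        tokens = src.split()
        masked = [MASK if rng.random() < mask_ratio else tok for tok in tokens]
        auxiliary.append((" ".join(masked), tgt))
    return auxiliary
```

Training would then interleave batches from the original in-domain corpus with batches from this auxiliary corpus, which is one common way to realize the auxiliary-task setup the abstract mentions.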

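The incremental pruning step can likewise be pictured concretely: magnitude pruning selects a subnetwork of general-domain parameters, those weights are frozen, and the pruned positions are reused as trainable slots for each new domain, so adapting to a further domain never overwrites weights that are already trained. The PyTorch sketch below assumes magnitude pruning with a per-layer binary mask; `DomainAdaptedLinear`, `keep_ratio`, and the mask construction are illustrative assumptions, not the published KAIP code.

```python
import torch
import torch.nn.functional as F


def magnitude_mask(weight, keep_ratio=0.7):
    """Return 1 for the largest-magnitude weights (kept as general-domain
    parameters) and 0 for pruned positions (free slots for new domains)."""
    k = int(weight.numel() * keep_ratio)
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k).values
    return (weight.abs() > threshold).float()


class DomainAdaptedLinear(torch.nn.Module):
    """Linear layer with frozen general-domain weights plus trainable
    domain-specific weights that live only in the pruned positions."""

    def __init__(self, linear, keep_ratio=0.7):
        super().__init__()
        self.register_buffer("general_w", linear.weight.data.clone())
        self.register_buffer("mask", magnitude_mask(linear.weight.data, keep_ratio))
        self.register_buffer("bias", None if linear.bias is None
                             else linear.bias.data.clone())
        # Only these increments receive gradients during domain training.
        self.domain_w = torch.nn.Parameter(torch.zeros_like(linear.weight))

    def forward(self, x, use_domain=True):
        w = self.general_w * self.mask          # frozen general subnetwork
        if use_domain:                          # swap domain_w to switch domains
            w = w + self.domain_w * (1.0 - self.mask)
        return F.linear(x, w, self.bias)
```

Under this reading, adding a new domain only means training a fresh `domain_w` against the same frozen `general_w` and `mask`, which matches the abstract's claim of adapting to several domains without adjusting the model's existing parameters.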

Key words

neural machine translation / knowledge augmentation / model pruning / domain adaptation

Cite this article

CHEN Yang, YANG Chunming, ZHANG Hui, WANG Yi, LI Bo. Knowledge Augmentation and Incremental Pruning for NMT Domain Adaptation. Journal of Chinese Information Processing, 2023, 37(6): 96-103.

Funding

Key Research and Development Project of the Science and Technology Department of Sichuan Province (2021YFG0031); Science and Technology Achievement Transformation Project of Sichuan Provincial Research Institutes (22YSZH0021)