Neural Machine Translation Model Incorporating Bilingual Named Entity

HE Chuyi, ZHANG Jiajun

Journal of Chinese Information Processing ›› 2023, Vol. 37 ›› Issue (12): 44-53.
Machine Translation


Abstract

Neural machine translation (NMT) models have achieved strong results on machine translation tasks, but because they depend on large-scale training data they have limited ability to translate rare words such as named entities, producing frequent mistranslations and omissions. To address this issue, this paper proposes a multi-engine fusion method for constructing a bilingual named entity dictionary, and a Transformer NMT architecture with data augmentation based on bilingual named entities. Experiments on several Chinese-English translation test sets show that the proposed model outperforms the vanilla Transformer in both overall translation quality and named entity translation accuracy, by 1.58 BLEU points and 35.3 percentage points, respectively.
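The data-augmentation idea described in the abstract, i.e. using a bilingual named entity dictionary to create additional training pairs, can be sketched as follows. This is a minimal illustration only: the dictionary entries, example sentences, and the one-for-one replacement scheme are assumptions for demonstration, not the paper's exact procedure.

```python
def augment_with_ne_dictionary(src, tgt, ne_dict):
    """Create an extra training pair by replacing each source-side named
    entity found in the bilingual dictionary with its target-language
    translation (a code-switched source), keeping the target unchanged.

    Returns None when no dictionary entry matches, so callers can skip
    sentence pairs that contribute no new signal."""
    aug_src = src
    for zh_ne, en_ne in ne_dict.items():
        if zh_ne in aug_src:
            aug_src = aug_src.replace(zh_ne, en_ne)
    return (aug_src, tgt) if aug_src != src else None

# Toy bilingual named entity dictionary (illustrative entries).
ne_dict = {"张家俊": "Zhang Jiajun", "北京": "Beijing"}

pair = augment_with_ne_dictionary(
    "张家俊 在 北京 工作 。",
    "Zhang Jiajun works in Beijing .",
    ne_dict,
)
print(pair)  # → ('Zhang Jiajun 在 Beijing 工作 。', 'Zhang Jiajun works in Beijing .')
```

The augmented pair would be added alongside the original pair during training, exposing the model to the dictionary translation of each named entity in context.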

Key words

named entity translation / neural machine translation / bilingual named entity dictionary

Cite this article

HE Chuyi, ZHANG Jiajun. Neural Machine Translation Model Incorporating Bilingual Named Entity. Journal of Chinese Information Processing. 2023, 37(12): 44-53


Funding

National Natural Science Foundation of China (6212088)