译文质量估计中基于Transformer的联合神经网络模型

陈聪,李茂西,罗琪

Journal of Chinese Information Processing ›› 2021, Vol. 35 ›› Issue (6): 47-54.
Machine Translation

A Transformer-based Unified Neural Network for Quality Estimation of Machine Translation

  • CHEN Cong, LI Maoxi, LUO Qi

摘要

译文质量估计作为机器翻译中的一项重要任务,在机器翻译的发展和应用中发挥着重要的作用。该文提出了一种简单有效的基于Transformer的联合模型用于译文质量估计。该模型由Transformer瓶颈层和双向长短时记忆网络组成,Transformer瓶颈层参数利用双语平行语料进行初步优化,模型所有参数利用译文质量估计语料进行联合优化和微调。测试时,将待评估的机器译文使用强制学习和特殊遮挡与源语言句子一起输入联合神经网络模型以预测译文的质量。在CWMT18译文质量估计评测任务数据集上的实验结果表明,该模型显著优于在相同规模训练语料下的对比模型,和在超大规模双语语料下的最优对比模型性能相当。

Abstract

As an important task in machine translation, quality estimation plays a significant role in the development and application of machine translation. In this paper, we propose a simple and effective unified model based on the Transformer for quality estimation of machine translation. The model is composed of a Transformer bottleneck layer and a Bi-LSTM network. The parameters of the Transformer bottleneck layer are first pre-optimized on a bilingual parallel corpus, and then all parameters of the model are jointly optimized and fine-tuned on the quality estimation training data. At test time, the translation outputs to be estimated are processed with teacher forcing and a special masking, and then fed into the unified model together with the source sentences to predict translation quality. Experimental results on the datasets from the CWMT18 quality estimation task show that the proposed model is significantly superior to baseline models trained on the same amount of data, and comparable to the best baseline model trained on a very large-scale bilingual corpus.
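The architecture the abstract describes can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: all layer sizes and the mean-pooling regression head are hypothetical placeholders, and the "special masking" of decoder inputs mentioned in the abstract is omitted for brevity.

```python
# Illustrative sketch of the unified QE model from the abstract:
# a Transformer encoder-decoder whose decoder hidden states
# ("bottleneck" features) feed a Bi-LSTM that regresses a
# sentence-level quality score. Sizes are hypothetical.
import torch
import torch.nn as nn

class UnifiedQEModel(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64, nhead=4, lstm_hidden=32):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, d_model)
        self.tgt_emb = nn.Embedding(vocab_size, d_model)
        # Transformer bottleneck: per the abstract, pretrained on
        # bilingual parallel data, then fine-tuned jointly with the
        # layers below on the QE training data.
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=2, num_decoder_layers=2,
            dim_feedforward=128, batch_first=True)
        self.bilstm = nn.LSTM(d_model, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * lstm_hidden, 1)

    def forward(self, src_ids, mt_ids):
        # Teacher forcing: the MT output to be scored is fed as the
        # decoder input, so the bottleneck features at each position
        # reflect how predictable that MT token is given the source.
        feats = self.transformer(self.src_emb(src_ids), self.tgt_emb(mt_ids))
        hidden, _ = self.bilstm(feats)
        # Mean-pool the Bi-LSTM states and regress a score in (0, 1).
        return torch.sigmoid(self.score(hidden.mean(dim=1))).squeeze(-1)

model = UnifiedQEModel()
src = torch.randint(0, 1000, (2, 7))  # batch of 2 source sentences
mt = torch.randint(0, 1000, (2, 9))   # corresponding MT outputs
scores = model(src, mt)
print(scores.shape)  # torch.Size([2])
```

In this two-stage setup the Transformer weights carry bilingual knowledge learned from parallel data, while the Bi-LSTM head is trained only on the much smaller QE-labeled data.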

关键词

机器翻译 / 译文质量估计 / Transformer / 联合训练

Key words

machine translation / quality estimation of machine translation / Transformer / joint training

Cite this article

陈聪,李茂西,罗琪. 译文质量估计中基于Transformer的联合神经网络模型. 中文信息学报. 2021, 35(6): 47-54
CHEN Cong, LI Maoxi, LUO Qi. A Transformer-based Unified Neural Network for Quality Estimation of Machine Translation. Journal of Chinese Information Processing. 2021, 35(6): 47-54

References

[1] Specia L, Shah K, de Souza J G C, et al. QuEst: A translation quality estimation framework[C]//Proceedings of the ACL, 2013: 79-84.
[2] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality[C]//Proceedings of the NIPS, 2013: 3111-3119.
[3] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the NAACL, 2019: 4171-4186.
[4] Shah K, Logacheva V, Paetzold G, et al. SHEF-NN: Translation quality estimation with neural networks[C]//Proceedings of the WMT, 2015: 342-347.
[5] Chen Z, Tan Y, Zhang C, et al. Improving machine translation quality estimation with neural network features[C]//Proceedings of the WMT, 2017: 551-555.
[6] Kim H, Jung H Y, Kwon H, et al. Predictor-estimator: Neural quality estimation based on target word prediction for machine translation[J]. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2017, 17(1): 1-22.
[7] Fan K, Wang J, Li B, et al. "Bilingual expert" can find translation errors[C]//Proceedings of the AAAI, 2019, 33: 6367-6374.
[8] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[9] Ive J, Blain F, Specia L. deepQuest: A framework for neural-based quality estimation[C]//Proceedings of the COLING, 2018: 3146-3157.
[10] Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model[C]//Proceedings of the INTERSPEECH, 2010: 1045-1048.
[11] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[C]//Proceedings of the ICLR, 2015: 1-15.
[12] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the NIPS, 2017: 5998-6008.
[13] Li M, Xiang Q, Chen Z, et al. A unified neural network for quality estimation of machine translation[J]. IEICE Transactions on Information and Systems, 2018, 101(9): 2417-2421.
[14] Wang J, Fan K, Li B, et al. Alibaba submission for WMT18 quality estimation task[C]//Proceedings of the WMT, 2018: 809-815.
[15] Hou Q, Huang S, Ning T, et al. NJU submissions for the WMT19 quality estimation shared task[C]//Proceedings of the WMT, 2019: 95-100.
[16] Wang Q, Li B, Xiao T, et al. Learning deep transformer models for machine translation[C]//Proceedings of the ACL, 2019: 1810-1822.
[17] Wang Z, Liu H, Chen H, et al. NiuTrans submission for CCMT19 quality estimation task[C]//Proceedings of the CCMT, 2019: 82-92.
[18] Yu D, Seltzer M L. Improved bottleneck features using pretrained deep neural networks[C]//Proceedings of the INTERSPEECH, 2011: 237-240.
[19] Ott M, Edunov S, Baevski A, et al. fairseq: A fast, extensible toolkit for sequence modeling[C]//Proceedings of the NAACL, 2019: 48-53.
[20] Li P, Li M, Qiu B, et al. Quality estimation of machine translation incorporating BERT contextualized word embeddings[J]. Journal of Chinese Information Processing, 2020, 34(3): 56-63. (in Chinese)

Funding

National Natural Science Foundation of China (61662031, 61462044)