A Survey on Machine Translation Quality Estimation

DENG Hancheng, XIONG Deyi

Journal of Chinese Information Processing, 2022, Vol. 36, Issue 11: 20-37.
Survey

Abstract

Machine translation quality estimation (QE) is the task of estimating the quality of a machine translation system's output without access to human reference translations. It is of great value to both the research and the application of machine translation. Over the past several years of development, QE has produced a rich body of research. In this survey, we first introduce the background and significance of QE. We then describe in detail the task objectives and evaluation metrics of sentence-level, word-level, and document-level QE. We further divide the development of QE methods into three stages: methods based on feature engineering and machine learning, methods based on deep learning, and methods that incorporate pre-trained models, and we introduce representative work from each stage. Finally, we analyze the current state of research and its shortcomings, and outline directions for the future research and development of QE methods.

Key words

machine translation / translation quality estimation / literature review

Cite this article

DENG Hancheng, XIONG Deyi. A Survey on Machine Translation Quality Estimation. Journal of Chinese Information Processing. 2022, 36(11): 20-37


Funding

National Key Research and Development Program of China (2019QY1802)