Research on ListMLE Approach to Ranking for Automatic Machine Translation Evaluation

LI Maoxi, JIANG Aiwen, WANG Mingwen

Journal of Chinese Information Processing, 2013, Vol. 27, Issue 4: 22-30.
Survey

Abstract

Automatic evaluation of machine translation quality is an important driver of rapid progress in machine translation technology. This paper proposes an automatic translation evaluation method based on the ListMLE approach to learning to rank. On this basis, we introduce features that characterize translation fluency and adequacy to further improve the consistency between automatic evaluation results and human judgments. Experimental results show that, when assessing the quality of the outputs of multiple translation systems on the WMT11 German-English task and the IWSLT08 BTEC CE ASR task, the prediction accuracy of the proposed approach is higher than that of the BLEU metric and of the RankSVM-based evaluation method.
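To make the ranking objective concrete, the following is a minimal sketch of the ListMLE loss, which is the negative log-likelihood of a reference ranking under the Plackett-Luce model. It is an illustration under stated assumptions, not the authors' implementation: the linear scoring function, the two-dimensional feature vectors (standing in for fluency and adequacy features), and the names `listmle_loss` and `linear_scores` are all hypothetical and do not come from the paper.

```python
import math


def listmle_loss(scores, human_ranking):
    """Negative Plackett-Luce log-likelihood of a human ranking.

    scores: one model score per candidate translation (higher = better).
    human_ranking: candidate indices ordered from best to worst.
    """
    ordered = [scores[i] for i in human_ranking]
    loss = 0.0
    for pos in range(len(ordered)):
        tail = ordered[pos:]
        # Log-sum-exp over candidates not yet placed, shifted for stability.
        shift = max(tail)
        log_z = shift + math.log(sum(math.exp(s - shift) for s in tail))
        loss += log_z - ordered[pos]
    return loss


def linear_scores(weights, feature_vectors):
    """Hypothetical linear scoring function over per-translation features."""
    return [sum(w * x for w, x in zip(weights, f)) for f in feature_vectors]


# Assumed toy features per candidate: [fluency, adequacy].
features = [[0.9, 0.8], [0.5, 0.6], [0.2, 0.1]]
weights = [1.0, 1.0]
scores = linear_scores(weights, features)

# Loss when the human ranking agrees with the score order, and when it is reversed.
agree = listmle_loss(scores, [0, 1, 2])
disagree = listmle_loss(scores, [2, 1, 0])
print(agree < disagree)
```

Training would adjust `weights` to minimize this loss summed over all source sentences that have human rankings. The loss is smaller when the scores order the candidates the same way the human judges did, which is the property the example checks.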

Key words

machine translation evaluation / learning to rank / ListMLE approach / automatic evaluation / human evaluation

Cite this article

LI Maoxi, JIANG Aiwen, WANG Mingwen. Research on ListMLE Approach to Ranking for Automatic Machine Translation Evaluation. Journal of Chinese Information Processing. 2013, 27(4): 22-30

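Funding

Supported by the National Natural Science Foundation of China (61203313, 61272212, 61163006) and the Natural Science Foundation of the Jiangxi Provincial Department of Education (GJJ12212)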