Abstract
Evaluation plays an important role in machine translation (MT) research: beyond comparing the output of different systems, it also drives progress in the key technologies of the field. Translation quality has long been assessed by human judges. As MT research has developed, automatic evaluation of translations has become an important research topic. This paper discusses the framework for automatic MT evaluation based on n-gram co-occurrence statistics, describes three automatic evaluation methods built on it (BLEU, NIST, and OpenE), and analyzes their advantages and disadvantages in detail through experiments. OpenE uses a new method, proposed in this paper, for computing the information content of n-gram fragments; it draws on both a local corpus (the set of reference translations) and a global corpus (a collection of target-language sentences). The experimental results indicate that this method is reasonably effective for MT evaluation.
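To make the n-gram co-occurrence idea concrete, here is a minimal Python sketch of its two best-known ingredients: the clipped (modified) n-gram precision used by BLEU, and the information weight that NIST assigns to each n-gram. It follows the commonly published definitions of those two metrics and is only an illustration. The function names are hypothetical, sentences are assumed to be pre-tokenized lists of words, and brevity penalties and score aggregation are left out. It does not reproduce OpenE's weighting, which the abstract does not specify.

```python
from collections import Counter
from math import log2


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n):
    """BLEU-style modified n-gram precision for one candidate sentence."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, keep the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as a reference has it.
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return matched / sum(cand_counts.values())


def nist_info_weights(references, max_n=2):
    """NIST-style weights: log2(count(prefix) / count(n-gram)).

    For a unigram, the prefix count is the total number of reference words.
    """
    counts = Counter()
    total_words = 0
    for ref in references:
        total_words += len(ref)
        for n in range(1, max_n + 1):
            counts.update(ngrams(ref, n))
    weights = {}
    for gram, count in counts.items():
        prefix_count = total_words if len(gram) == 1 else counts[gram[:-1]]
        weights[gram] = log2(prefix_count / count)
    return weights


if __name__ == "__main__":
    candidate = "the the the the the the the".split()
    references = ["the cat is on the mat".split(),
                  "there is a cat on the mat".split()]
    # "the" occurs at most twice in any one reference, so only 2 of 7 match.
    print(f"{clipped_precision(candidate, references, 1):.3f}")  # prints 0.286
```

Clipping keeps a candidate from scoring well by repeating a common word, while the information weights give rarer n-grams more influence than frequent ones. A weighting scheme that draws on a larger global corpus would change only how the counts are gathered.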
Key words
artificial intelligence /
machine translation /
MT evaluation /
information content computation /
n-gram co-occurrence
Funding
National Key Basic Research Program (G19980305011)