Construction of and Reflections on the "Fine-Grained Error Analysis Corpus of English-Chinese Machine Translation"

QIU Bailian, WANG Mingwen, LI Maoxi, CHEN Cong, XU Fan

PDF(1116 KB)
Journal of Chinese Information Processing ›› 2022, Vol. 36 ›› Issue (1): 47-55.
Language Resource Construction


Construction of Fine-Grained Error Analysis Corpus of English-Chinese Machine Translation

  • QIU Bailian1,2, WANG Mingwen1, LI Maoxi1, CHEN Cong1, XU Fan1


Abstract

Machine translation error analysis aims to identify the errors in machine translation output, including their types and distribution, and it plays an important role in the research and application of machine translation. In this paper, human post-editing is combined with error analysis: post-editing operations are annotated with error labels, and automatic annotation and manual annotation are combined to build a Fine-grained Error Analysis Corpus of English-Chinese Machine Translation (ErrAC), in which every annotated sample includes the source sentence, the MT output, a human reference translation, the post-edited translation, the word error rate (WER) and the error type annotations. The annotated error types include addition, omission, lexical error, word order error, untranslated word, named entity translation error, etc. Inter-annotator agreement analysis confirms the validity of the annotation, and the statistics and analysis based on the corpus provide effective guidance for the development of machine translation systems and for post-editing practice.
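The abstract notes that each annotated sample carries a word error rate (WER) between the MT output and its post-edited version. As a minimal illustrative sketch (not the authors' implementation), WER can be computed as the word-level Levenshtein distance normalized by the length of the post-edited sentence; Chinese sentences would first need word segmentation before this token-level comparison applies:

```python
def wer(mt_tokens, pe_tokens):
    """Word error rate: word-level edit distance between the MT output
    and the post-edited translation, normalized by post-edit length."""
    m, n = len(mt_tokens), len(pe_tokens)
    # dp[i][j] = edit distance between mt_tokens[:i] and pe_tokens[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if mt_tokens[i - 1] == pe_tokens[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n] / n if n else 0.0

# Example: one substituted token out of three post-edit tokens
score = wer("the cat sat".split(), "the dog sat".split())
```

The deletions, insertions, and substitutions recovered from such an alignment are exactly the edit operations that fine-grained error labels (addition, omission, lexical error, etc.) refine further.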


Key words

machine translation / error analysis / error annotation / post-editing

Cite this article

QIU Bailian, WANG Mingwen, LI Maoxi, CHEN Cong, XU Fan. Construction of Fine-Grained Error Analysis Corpus of English-Chinese Machine Translation. Journal of Chinese Information Processing. 2022, 36(1): 47-55


Funding

National Natural Science Foundation of China (61876074, 61662031, 61772246); National Social Science Fund of China (19BYY121); Humanities and Social Sciences Fund of the Ministry of Education (21YJC740040)