Abstract
To address the shortage of bilingual parallel data when building statistical machine translation systems, this paper proposes a new pivot-based translation method called the Transfer-Triangulation method. It combines the strengths of the conventional transfer and triangulation methods for pivot-based translation, and improves the phrase table by decoding pivot phrases. The method is evaluated on a German-Chinese translation task with English as the pivot language. Experimental results show that it significantly improves translation performance over baseline pivot-based systems.
Key words
statistical machine translation /
pivot-based statistical machine translation /
pivot language /
quality control factor
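Funding
National Natural Science Foundation of China, Young Scientists Fund (61300097); National Natural Science Foundation of China (61272376); National Natural Science Foundation of China (61432013)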