Abstract: This paper improves HMM-based word alignment by introducing syntactic knowledge. The HMM is combined with English phrase structure tree distance to align Chinese-English words. Experiments show that the improved HMM reduces the word alignment error rate and improves the BLEU score of statistical machine translation.