An Improved HMM Based Word Alignment Method

LIU Ying, JIANG Wei

Journal of Chinese Information Processing, 2014, Vol. 28, Issue (2): 51-55.
Section: Machine Translation


Abstract

This paper improves HMM-based word alignment by introducing syntactic knowledge: the basic hidden Markov model is combined with distances in the English phrase structure tree to align Chinese-English word pairs. Experiments show that, compared with the basic HMM, the improved model reduces the word alignment error rate and raises the BLEU score of a statistical machine translation system, thereby improving translation quality.
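The abstract does not spell out how the phrase structure tree distance is computed. As a minimal sketch, assume it is the number of tree edges on the path between two words of the English parse. The nested-tuple tree representation and the names `leaf_paths` and `tree_distance` are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
# Hypothetical illustration: a phrase structure tree as nested tuples
# (label, child, child, ...); a leaf is a plain string (the word).

def leaf_paths(tree, prefix=()):
    """Return, for each leaf in left-to-right order, its path of child indices from the root."""
    if isinstance(tree, str):
        return [prefix]
    paths = []
    for i, child in enumerate(tree[1:]):
        paths.extend(leaf_paths(child, prefix + (i,)))
    return paths


def tree_distance(paths, i, j):
    """Number of tree edges between the i-th and j-th leaves (assumed definition)."""
    a, b = paths[i], paths[j]
    # Length of the shared prefix = depth of the lowest common ancestor.
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    # Edges up from leaf i to the common ancestor, plus edges down to leaf j.
    return (len(a) - common) + (len(b) - common)


# (S (NP (DT the) (NN cat)) (VP (VBD sat)))
tree = ("S",
        ("NP", ("DT", "the"), ("NN", "cat")),
        ("VP", ("VBD", "sat")))
paths = leaf_paths(tree)

d_the_cat = tree_distance(paths, 0, 1)  # words inside the same NP
d_cat_sat = tree_distance(paths, 1, 2)  # words across the NP/VP boundary
```

Under this assumed definition, adjacent words inside the same phrase are closer than adjacent words that straddle a phrase boundary, which is the kind of syntactic signal a surface jump width in a basic HMM cannot express. How the paper actually folds the distance into the HMM transition probabilities is not stated in the abstract.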

Key words

phrase structure tree distance / hidden Markov model / word alignment / BLEU score

Cite this article

LIU Ying, JIANG Wei. An Improved HMM Based Word Alignment Method. Journal of Chinese Information Processing. 2014, 28(2): 51-55

Funding

Ministry of Education Start-up Project for Returned Overseas Scholars (20101021603)