Abstract: This paper focuses on extracting translation pairs from unaligned Chinese-English bilingual corpora. We first introduce two methods proposed by Dr. Pascale Fung, and then revise the latter one to meet the needs of real texts. The experimental results show that our method is effective and that it can be widely applied in NLP applications such as phrase extraction and bilingual lexicography.
[1] Fung P, Church K W. K-vec: A New Approach for Aligning Parallel Texts. In: Proceedings of the 15th International Conference on Computational Linguistics (COLING’94), Tokyo, Japan, 1994, 1096-1102
[2] Fung P, McKeown K. Aligning Noisy Parallel Corpora Across Language Groups: Word Pair Feature Matching by Dynamic Time Warping. In: Proceedings of the First Conference of the Association for Machine Translation in the Americas (AMTA’94), Columbia, MD, 1994, 81-88
[3] Gale W A, Church K W. A Program for Aligning Sentences in Bilingual Corpora. In: Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL’91), Berkeley, CA, 1991, 177-184
[4] Gale W A, Church K W. Identifying Word Correspondences in Parallel Texts. In: Proceedings of the Fourth DARPA Speech and Natural Language Workshop, Pacific Grove, CA, 1991, 152-157
[5] Tiedemann J. Extraction of Translation Equivalents from Parallel Corpora. 1998, From http://stp.ling.uu.se.