Apply Dialog Act Information in Spoken Language Translation

ZHOU Keyan, ZONG Chengqing

Journal of Chinese Information Processing, 2010, Vol. 24, Issue 6: 57-64.
Review


Abstract

Incorporating semantic and pragmatic information remains one of the difficulties in spoken language translation research. Dialog acts, as features that describe shallow discourse structure, have in recent years been applied in several types of translation systems. This paper first introduces dialog act theory and dialog-act-annotated spoken corpora. Then, taking a phrase-based statistical translation system as the target application, it proposes three ways of applying dialog acts in the translation process. Through automatic dialog act classification, the approach improves the consistency between the training data and the test data, between the development set and the test set, and between the source language and the target language. This improves the performance of the translation system, and the final translation more accurately reflects the dialog intention expressed in the source language. Experiments on Chinese-to-English spoken language translation evaluation data show that adding dialog act information effectively improves the performance of the translation system.
Key words: dialog act; spoken language translation; dialog act classification
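
As a rough illustration of the training-data/test-data consistency idea mentioned in the abstract, the following minimal Python sketch groups a parallel corpus by the dialog act of each source sentence and selects the matching group for a test sentence. It is not the authors' method: the cue-word tagger, the label set (`question`, `thanks`, `statement`), and all function names are hypothetical stand-ins, and English sentences are used in place of Chinese source text for readability.

```python
from collections import defaultdict

# Hypothetical cue words per dialog act (an assumption for illustration only);
# a real system would use a trained dialog act classifier instead.
CUES = {
    "question": ("?", "what", "where", "how", "could you"),
    "thanks": ("thank",),
}


def classify_dialog_act(sentence: str) -> str:
    """Toy rule-based tagger: return the first dialog act whose cue appears."""
    lowered = sentence.lower()
    for act, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return act
    return "statement"  # fallback label when no cue matches


def partition_by_dialog_act(pairs):
    """Group (source, target) sentence pairs by the dialog act of the source."""
    buckets = defaultdict(list)
    for source, target in pairs:
        buckets[classify_dialog_act(source)].append((source, target))
    return dict(buckets)


def select_training_subset(buckets, test_sentence):
    """Return the test sentence's dialog act and the training pairs sharing it."""
    act = classify_dialog_act(test_sentence)
    return act, buckets.get(act, [])


if __name__ == "__main__":
    corpus = [
        ("Where is the station?", "target sentence 1"),
        ("Thank you very much.", "target sentence 2"),
        ("I would like a room.", "target sentence 3"),
    ]
    buckets = partition_by_dialog_act(corpus)
    act, subset = select_training_subset(buckets, "How much is it?")
    print(act, subset)
```

In such a setup, each subset could be used to train or tune a dialog-act-specific model, so that a test sentence is handled with data of the same dialog act; whether this matches any of the paper's three proposed applications is an assumption.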


Cite this article

ZHOU Keyan, ZONG Chengqing. Apply Dialog Act Information in Spoken Language Translation. Journal of Chinese Information Processing. 2010, 24(6): 57-64

Funding

Supported by the National Natural Science Foundation of China (60975053) and the National Key Technology R&D Program of China (2006BAH03B02).