Unsupervised Answer Pattern Acquisition

WU You-zheng, ZHAO Jun, XU Bo

Journal of Chinese Information Processing ›› 2007, Vol. 21 ›› Issue (2) : 69-76.
Review

Abstract

This paper presents an unsupervised learning algorithm that acquires answer patterns from the Web for the answer extraction module of Chinese question answering (QA) systems. The algorithm avoids a drawback of supervised approaches: it does not require users to supply <question, answer> pairs as a training set. Given only two or more question instances for each question type, it learns the answer patterns for that type through Web retrieval, topic segmentation, pattern extraction, vertical clustering, and horizontal clustering. Experimental results show that the proposed unsupervised pattern learning method is effective and that pattern-based answer extraction substantially improves the performance of Chinese QA systems.
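The core idea, that patterns recurring across several question instances of one type can be found without answer labels, can be illustrated with a toy sketch. This is not the paper's algorithm: the function names, the token-window heuristic, the `min_topics` threshold, and the hard-coded English snippets (standing in for Web retrieval results) are all assumptions made for illustration, and the topic segmentation and clustering steps are omitted entirely. Chinese text would additionally need word segmentation before tokenizing.

```python
from collections import Counter


def candidate_patterns(sentence, topic, window=2):
    """Replace the topic term with <Q> and collect token windows around it.

    Hypothetical helper: the window heuristic is an assumption, not the
    pattern extraction procedure described in the paper.
    """
    tokens = sentence.replace(topic, " <Q> ").split()
    patterns = set()
    for i, tok in enumerate(tokens):
        if tok != "<Q>":
            continue
        for left in range(window + 1):
            for right in range(window + 1):
                if left + right == 0:
                    continue  # skip the bare placeholder
                lo, hi = i - left, i + right
                if lo < 0 or hi >= len(tokens):
                    continue  # window runs off the sentence
                patterns.add(" ".join(tokens[lo:hi + 1]))
    return patterns


def shared_patterns(snippets_by_topic, min_topics=2):
    """Keep patterns that occur for at least `min_topics` distinct topics."""
    counts = Counter()
    for topic, snippets in snippets_by_topic.items():
        seen = set()
        for snippet in snippets:
            seen |= candidate_patterns(snippet, topic)
        counts.update(seen)  # each topic votes once per pattern
    return sorted(p for p, c in counts.items() if c >= min_topics)


# Made-up snippets for two instances of a "birth" question type.
example_snippets = {
    "Mozart": [
        "Mozart was born in 1756 in Salzburg",
        "the composer Mozart was born in Salzburg",
    ],
    "Gandhi": ["Gandhi was born in 1869 in Porbandar"],
}

if __name__ == "__main__":
    for pattern in shared_patterns(example_snippets):
        print(pattern)
```

Counting each pattern once per topic, rather than once per snippet, keeps a pattern that is frequent for a single question instance from being mistaken for one that generalizes across the question type.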

Key words

artificial intelligence / natural language processing / Chinese question answering / answer pattern / machine learning

Cite this article

WU You-zheng, ZHAO Jun, XU Bo. Unsupervised Answer Pattern Acquisition. Journal of Chinese Information Processing. 2007, 21(2): 69-76

Funding

Supported by the National Natural Science Foundation of China (60372016) and the Beijing Natural Science Foundation (40052027)