A Feature Model Integrating Basic and Bag-of-Words Binding Features

YANG Sichun1,2, GAO Chao3, QIN Feng2, DAI Xinyu1, CHEN Jiajun1

Journal of Chinese Information Processing, 2012, Vol. 26, Issue 5: 46-53.
Review


Abstract

To reduce the heavy computational cost of feature extraction in question classification, this paper proposes a question feature model that integrates basic features with bag-of-words binding features. Basic features of a question (bag-of-words, part of speech, and word sense) are first extracted together with their corresponding bag-of-words binding features; the two types of features are then fused to obtain a more effective question feature set. Experiments with an SVM classifier on the Chinese question set provided by Harbin Institute of Technology show that the new feature model is simple to implement and computationally inexpensive, compensates for the limited syntactic and semantic expressiveness of basic features or binding features used alone, and further improves classification accuracy.

Key words

question answering system / question classification / feature model / bag-of-words binding
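
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not define how a bag-of-words binding feature is built, so the code below assumes one plausible reading: each word is paired ("bound") with its part-of-speech tag or its sense label. The function names, feature-string prefixes, example question, and sense labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical pre-annotated question: (word, POS tag, sense label) triples.
# The sense labels are made-up placeholders, not entries from a real thesaurus.
question = [
    ("中国", "ns", "place"),   # "China"
    ("的", "u", "func"),       # possessive particle
    ("首都", "n", "place"),    # "capital"
    ("是", "v", "be"),         # "is"
    ("哪里", "r", "query"),    # "where"
]


def basic_features(tokens):
    """Count bag-of-words, POS, and sense features independently."""
    feats = Counter()
    for word, pos, sense in tokens:
        feats[f"W={word}"] += 1
        feats[f"POS={pos}"] += 1
        feats[f"SENSE={sense}"] += 1
    return feats


def binding_features(tokens):
    """Assumed binding scheme: pair each word with its POS tag and sense label."""
    feats = Counter()
    for word, pos, sense in tokens:
        feats[f"W|POS={word}|{pos}"] += 1
        feats[f"W|SENSE={word}|{sense}"] += 1
    return feats


def fused_features(tokens):
    """Fuse the two feature sets; keys are disjoint, so this is a plain union."""
    return basic_features(tokens) + binding_features(tokens)


features = fused_features(question)
for name, count in sorted(features.items()):
    print(name, count)
```

The resulting sparse counts could be passed to any linear classifier, such as the SVM mentioned in the abstract. Both feature families here need only tokenization plus per-word tags, with no syntactic parsing, which is consistent with the low processing cost the abstract emphasizes.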

Cite this article

YANG Sichun, GAO Chao, QIN Feng, DAI Xinyu, CHEN Jiajun. A Feature Model Integrating Basic and Bag-of-Words Binding Features. Journal of Chinese Information Processing. 2012, 26(5): 46-53

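Funding

National Natural Science Foundation of China (61003112, 61170181); Open Project Fund of the State Key Laboratory for Novel Software Technology (KFKT2010B02); Key Project of Provincial Natural Science Research of Anhui Higher Education Institutions (KJ2011A048)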