Research on Question Classification via Bilingual Information

XU Jian, ZHANG Dong, LI Shoushan, WANG Hongling

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 5: 171-177.
Section: Information Retrieval and Question Answering Systems

Abstract

Question classification is a basic task in question answering research. Previous studies train question classification models on monolingual corpora only, and therefore suffer from insufficient training data and from the short length of question texts. To address these problems, we propose a dual-channel LSTM question classification approach that incorporates bilingual corpora. First, we expand both the Chinese corpus and the English corpus with translated text. Second, we represent every sample in both languages by two feature sets: its question text and its translated text. Finally, we propose a dual-channel LSTM classification method that makes full use of these two feature sets to build the question classifier. Experimental results show that the proposed approach effectively improves question classification performance.
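The dual-channel idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two channels are combined, so the choices below are assumptions for illustration only, not the authors' implementation: each channel is a plain LSTM over word vectors, the final hidden states of the two channels are concatenated, and a softmax layer produces class probabilities. The class names `LSTMChannel` and `DualChannelClassifier`, the helper `toy_sequence`, the dimensions, and the untrained random weights are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def toy_sequence(length, dim, rng):
    """Hypothetical stand-in for a sequence of pretrained word vectors."""
    return [[rng.random() for _ in range(dim)] for _ in range(length)]


class LSTMChannel:
    """One channel: reads a sequence of word vectors, returns the last hidden state."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate) acting on [x; h].
        self.W = {g: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            xh = list(x) + h
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], xh), self.b[g])]
                   for g in "ifoc"}
            i_gate = [sigmoid(a) for a in pre["i"]]
            f_gate = [sigmoid(a) for a in pre["f"]]
            o_gate = [sigmoid(a) for a in pre["o"]]
            cand = [math.tanh(a) for a in pre["c"]]
            c = [f * c_old + i * g for f, c_old, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(c_new) for o, c_new in zip(o_gate, c)]
        return h


class DualChannelClassifier:
    """Assumed fusion: concatenate both channels' final states, then softmax."""

    def __init__(self, input_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.question_channel = LSTMChannel(input_dim, hidden_dim, rng)
        self.translation_channel = LSTMChannel(input_dim, hidden_dim, rng)
        self.W_out = rand_matrix(num_classes, 2 * hidden_dim, rng)

    def predict_proba(self, question_vecs, translation_vecs):
        merged = (self.question_channel.encode(question_vecs)
                  + self.translation_channel.encode(translation_vecs))
        logits = matvec(self.W_out, merged)
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]
```

For example, `DualChannelClassifier(input_dim=4, hidden_dim=3, num_classes=6).predict_proba(question, translation)` returns six class probabilities for one sample, where `question` holds the word vectors of the original question and `translation` holds those of its translated text. The weights here are random, so the output is only meaningful after training, which this sketch omits.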

Key words

Q&A system / question classification / LSTM

Cite this article

XU Jian, ZHANG Dong, LI Shoushan, WANG Hongling. Research on Question Classification via Bilingual Information. Journal of Chinese Information Processing. 2017, 31(5): 171-177

Funding

National Natural Science Foundation of China (61672366); National Science Foundation for Young Scientists of China (61402314)