Research on Textual Entailment Recognition Based on Multiple Features

ZHAO Hongyan, LIU Peng, LI Ru, WANG Zhiqiang

Journal of Chinese Information Processing ›› 2014, Vol. 28 ›› Issue (2): 109-115.
Information Extraction and Text Mining

Recognizing Textual Entailment Based on the Multi-features

  • ZHAO Hongyan1, LIU Peng2, LI Ru3,4, WANG Zhiqiang3

Abstract

Textual entailment recognition is an effective way to handle the fact that natural language can express the same meaning in many different forms. Although many recognition models have been proposed in China and abroad, the factors that determine entailment are complex and intertwined, and recognition accuracy remains generally low. This paper treats textual entailment recognition as a binary classification problem: it extracts lexical features, syntactic dependency features, and features from the FrameNet semantic knowledge base, assembles them into a feature matrix, and trains an SVM classifier. Evaluated on the test set of the international RTE3 evaluation, the method reaches 78.1% precision on positive entailment instances, which is higher than the best result in the RTE3 2-way evaluation.
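The pipeline summarized in the abstract (one feature vector per text/hypothesis pair, a feature matrix, a binary classifier, precision on the positive class) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the three overlap features, the hand-written dependency triples and frame labels (which would normally come from a dependency parser and a frame-semantic parser), the toy sentence pairs, and the small hinge-loss trainer that stands in for a full SVM package.

```python
import re

def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def overlap(hyp_items, text_items):
    """Fraction of hypothesis items that also appear on the text side."""
    hyp_items = list(hyp_items)
    if not hyp_items:
        return 0.0
    text_set = set(text_items)
    return sum(1 for item in hyp_items if item in text_set) / len(hyp_items)

def feature_vector(pair):
    """One feature-matrix row: [lexical, dependency, frame] overlap."""
    return [
        overlap(tokenize(pair["hyp"]), tokenize(pair["text"])),
        overlap(pair["hyp_deps"], pair["text_deps"]),
        overlap(pair["hyp_frames"], pair["text_frames"]),
    ]

def score(w, b, x):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=2000, lr=0.01, lam=0.01):
    """Linear SVM fitted by subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score(w, b, xi) < 1  # inside the margin or misclassified
            for j in range(len(w)):
                hinge_grad = -yi * xi[j] if violated else 0.0
                w[j] -= lr * (lam * w[j] + hinge_grad)
            if violated:
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Return +1 (entailment) or -1 (no entailment)."""
    return 1 if score(w, b, x) > 0 else -1

def positive_precision(gold, predicted):
    """Precision on the positive class: TP / (TP + FP)."""
    tp = sum(1 for g, p in zip(gold, predicted) if p == 1 and g == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if p == 1 and g == -1)
    return tp / (tp + fp) if tp + fp else 0.0

# Hypothetical toy pairs; dependency triples and frame labels are hand-written
# placeholders, not the output of any real parser.
pairs = [
    {"text": "The cat ate the fish in the kitchen", "hyp": "The cat ate the fish",
     "text_deps": [("nsubj", "ate", "cat"), ("dobj", "ate", "fish"),
                   ("prep_in", "ate", "kitchen")],
     "hyp_deps": [("nsubj", "ate", "cat"), ("dobj", "ate", "fish")],
     "text_frames": ["Ingestion"], "hyp_frames": ["Ingestion"], "label": 1},
    {"text": "The company bought the startup last year",
     "hyp": "The firm bought the startup",
     "text_deps": [("nsubj", "bought", "company"), ("dobj", "bought", "startup")],
     "hyp_deps": [("nsubj", "bought", "firm"), ("dobj", "bought", "startup")],
     "text_frames": ["Commerce_buy"], "hyp_frames": ["Commerce_buy"], "label": 1},
    {"text": "The company sold the startup", "hyp": "The cat ate the fish",
     "text_deps": [("nsubj", "sold", "company"), ("dobj", "sold", "startup")],
     "hyp_deps": [("nsubj", "ate", "cat"), ("dobj", "ate", "fish")],
     "text_frames": ["Commerce_sell"], "hyp_frames": ["Ingestion"], "label": -1},
    {"text": "A woman is reading a book", "hyp": "The firm bought the startup",
     "text_deps": [("nsubj", "reading", "woman"), ("dobj", "reading", "book")],
     "hyp_deps": [("nsubj", "bought", "firm"), ("dobj", "bought", "startup")],
     "text_frames": [], "hyp_frames": ["Commerce_buy"], "label": -1},
]

X = [feature_vector(p) for p in pairs]   # feature matrix, one row per pair
y = [p["label"] for p in pairs]
w, b = train_linear_svm(X, y)
predicted = [predict(w, b, x) for x in X]
print(X)
print(predicted, positive_precision(y, predicted))
```

In the second pair the lexical and dependency overlaps drop because "firm" replaces "company", while the frame overlap stays at 1.0. This is the kind of case where a semantic feature can compensate for surface variation, which is one plausible motivation for combining the three feature groups.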

Keywords

textual entailment recognition / syntactic dependency relations / FrameNet

Cite this article

ZHAO Hongyan, LIU Peng, LI Ru, WANG Zhiqiang. Recognizing Textual Entailment Based on the Multi-features. Journal of Chinese Information Processing. 2014, 28(2): 109-115

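Funding

State Language Commission 12th Five-Year Plan Research Project (YB125-19); National Natural Science Foundation of China (61373082, 60970053); Shanxi Province Research Project for Returned Overseas Scholars (2013-015); National High Technology Research and Development Program of China (863 Program) (2006AA01Z142)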