Recognizing Textual Entailment Based on Inference Phenomena

REN Han; FENG Wenhe; LIU Maofu; WAN Jing

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 1: 184-191.
Language Analysis and Computation

Abstract

This paper proposes an approach to textual entailment recognition based on language phenomena. The approach adopts a joint classification model for language phenomenon recognition and overall entailment judgment, so that the two highly related tasks are learned together; this avoids the error propagation of pipeline models and improves system accuracy. For language phenomenon recognition, 22 specific features and 20 general features are designed. To improve the generalization ability of the random forest, a feature-selection-based algorithm for generating the random forest is proposed. Experimental results show that the joint classification model based on random forest effectively recognizes both language phenomena and the overall entailment relation.
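The abstract does not spell out the forest-generation algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: features are binary, each tree is a one-level decision stump, features are ranked by Gini gain, and each tree draws its split candidates from the selected pool rather than from all features. The joint label is modeled as a (phenomenon, entailment) pair predicted in one step, which is one simple way to avoid a pipeline. All names, the toy data, and the ranking criterion are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(rows, labels, f):
    """Gini decrease obtained by splitting on binary feature f (values 0/1)."""
    left = [y for x, y in zip(rows, labels) if x[f] == 0]
    right = [y for x, y in zip(rows, labels) if x[f] == 1]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)

def select_features(rows, labels, keep):
    """Assumed selection step: keep the `keep` features with the highest gain."""
    ranked = sorted(range(len(rows[0])),
                    key=lambda f: split_gain(rows, labels, f), reverse=True)
    return ranked[:keep]

def majority(labels, default):
    """Most frequent label, or `default` for an empty partition."""
    return Counter(labels).most_common(1)[0][0] if labels else default

def train_stump(rows, labels, candidates):
    """One-level tree: pick the best candidate feature, store a label per branch."""
    best = max(candidates, key=lambda f: split_gain(rows, labels, f))
    overall = majority(labels, None)
    left = majority([y for x, y in zip(rows, labels) if x[best] == 0], overall)
    right = majority([y for x, y in zip(rows, labels) if x[best] == 1], overall)
    return best, left, right

def train_forest(rows, labels, n_trees=25, keep=2, per_tree=1, seed=0):
    """Bagged stumps whose split candidates come only from the selected pool."""
    rng = random.Random(seed)
    pool = select_features(rows, labels, keep)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        forest.append(train_stump(b_rows, b_labels, rng.sample(pool, per_tree)))
    return forest

def predict(forest, x):
    """Majority vote over the joint labels predicted by the individual trees."""
    votes = Counter(left if x[f] == 0 else right for f, left, right in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: [synonym_match, negation_cue, noise]
rows = [[1, 0, 0], [1, 0, 1], [1, 0, 0], [1, 0, 1],
        [0, 1, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
# Joint label = (language phenomenon, entailment relation)
labels = [("synonymy", "entailment")] * 4 + [("negation", "contradiction")] * 4

forest = train_forest(rows, labels)
phenomenon, relation = predict(forest, [0, 1, 1])
```

In this sketch the noise feature has zero Gini gain on the toy data, so it never enters the candidate pool, and a single vote returns the phenomenon and the entailment relation together. A real system would use full decision trees and the paper's 42 features rather than stumps over three toy indicators.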

Key words

recognizing textual entailment / language phenomena / random forest

Cite this article

REN Han; FENG Wenhe; LIU Maofu; WAN Jing. Recognizing Textual Entailment Based on Inference Phenomena. Journal of Chinese Information Processing. 2017, 31(1): 184-191

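Funding

National Natural Science Foundation of China (61402341); National Social Science Fund of China (11&ZD189); Central China Normal University Fundamental Research Funds for the Central Universities, Special Fund for Educational Science (ccnu16JYKX014); Humanities and Social Sciences Project of the Ministry of Education (13YJC740022); Major Project of Basic Research in Philosophy and Social Sciences of Henan Higher Education Institutions (2015-JCZD-022); 2016 Open Bidding Project of the Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies (LEC2016ZBKT002)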