Chinese Named Entity Relation Extraction Based on Syntactic and Semantic Features

GUO Xiyue, HE Tingting, HU Xiaohua, CHEN Qianjun

Journal of Chinese Information Processing ›› 2014, Vol. 28 ›› Issue (6): 183-189.
Information Extraction and Text Mining

Chinese Named Entity Relation Extraction Based on Syntactic and Semantic Features

  • GUO Xiyue1,3, HE Tingting2, HU Xiaohua2, CHEN Qianjun1,4

Abstract

Feature selection is the central problem in named entity relation extraction. Earlier work has mostly characterized entity relations with lexical features and surface features of the entities themselves, and the performance obtainable with such features has become hard to improve further. Building on these traditional methods, this paper proposes a Chinese named entity relation extraction method based on syntactic and semantic features, adding dependency relations, the core predicate, and semantic role labels to the feature set. An SVM is used as the learning method, and experiments are run on a corpus of real news text. The results show a clear improvement in F1 score.

Key words

syntactic features / semantic features / named entity relation extraction / SVM

Cite this article

GUO Xiyue, HE Tingting, HU Xiaohua, CHEN Qianjun. Chinese Named Entity Relation Extraction Based on Syntactic and Semantic Features. Journal of Chinese Information Processing. 2014, 28(6): 183-189


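Funding

Major Program of the National Social Science Foundation of China (12&2D223); National Key Technology R&D Program of the 12th Five-Year Plan (2012BAK24B01); National Natural Science Foundation of China (61300144); Key Project of the State Language Commission under the 12th Five-Year Plan (ZDI125-1); Program of Introducing Talents of Discipline to Universities, Ministry of Education and State Administration of Foreign Experts Affairs (B07042); Key Project of the Natural Science Foundation of Hubei Province (2011CDA034); Fundamental Research Funds for the Central Universities, Central China Normal University (CCNU13A05014, CCNU13C01001, CCNU13F010).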