A Study of Feature Selection in Chinese Entity Relation Extraction

DONG Jing, SUN Le, FENG Yuan-yong, HUANG Rui-hong

Journal of Chinese Information Processing, 2007, Vol. 21, Issue 4: 80-91.

Chinese Automatic Entity Relation Extraction


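Abstract (translated from the Chinese)

Named entity relation extraction is one of the important research topics in information extraction. Based on our analysis, this paper proposes dividing Chinese entity relations into two types: embedding entity relations and non-embedding entity relations. Because the same syntactic feature performs markedly differently when identifying the two types, we use a different syntactic feature set for each, and propose several new syntactic features suited to their respective characteristics. Experiments within a CRF model framework, using the ACE2007 corpus as experimental data, show that the proposed division and the new features effectively improve performance on the Chinese entity relation extraction task.

Keywords: computer applications; Chinese information processing; entity relation extraction; embedding relations; non-embedding relations; feature selection; ACE evaluation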

Abstract

Entity relation extraction is one of the important research fields in information extraction. This paper presents a novel method that divides entity relations into two categories: embedding relations and non-embedding relations. After some simple experiments, we find that some syntactic features have clearly different effects on the identification of the two kinds of relations. Two different sets of syntactic features are therefore proposed for extracting the two categories. Experiments show that the new method achieves improved performance on the ACE2007 corpus for the Chinese entity relation extraction task.
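The division described in the abstract can be illustrated with a minimal sketch. The abstract does not define the two categories precisely, so the sketch assumes that an embedding relation holds between two mentions when one mention's character span is nested inside the other's. The `Mention` type, the function names, and all feature names are hypothetical and do not come from the paper; the CRF model itself is not shown.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Mention:
    """An entity mention as a half-open character span [start, end)."""
    text: str
    start: int
    end: int


def is_embedding_pair(a: Mention, b: Mention) -> bool:
    """Assumed definition: one mention's span lies entirely inside the other's."""
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return a_in_b or b_in_a


# Hypothetical feature names, used only to show one feature set per category.
EMBEDDING_FEATURES: Tuple[str, ...] = ("inner_head_word", "relative_position_in_outer")
NON_EMBEDDING_FEATURES: Tuple[str, ...] = ("words_between", "mentions_between")


def select_feature_set(a: Mention, b: Mention) -> Tuple[str, ...]:
    """Pick the feature set that matches the relation category of the pair."""
    if is_embedding_pair(a, b):
        return EMBEDDING_FEATURES
    return NON_EMBEDDING_FEATURES


# Illustrative sentence (not taken from the ACE2007 corpus).
sentence = "北京大学的校长访问了上海"
outer = Mention("北京大学", 0, 4)
inner = Mention("北京", 0, 2)
other = Mention("上海", 10, 12)

print(select_feature_set(inner, outer))
print(select_feature_set(outer, other))
```

Routing each candidate pair to its own feature set before training keeps the two categories from sharing features that, according to the abstract, behave very differently across them.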

Key words

computer application / Chinese information processing / automatic entity relation extraction / embedding entity relation / non-embedding entity relation / feature selection / ACE evaluation

Cite this article

DONG Jing, SUN Le, FENG Yuan-yong, HUANG Rui-hong. Chinese Automatic Entity Relation Extraction. Journal of Chinese Information Processing. 2007, 21(4): 80-91
