Research on Chinese Event Extraction

ZHAO Yan-yan, QIN Bing, CHE Wan-xiang, LIU Ting

Journal of Chinese Information Processing ›› 2008, Vol. 22 ›› Issue (1) : 3-8.
Review

Abstract

Event extraction is an important research direction in information extraction. This paper studies the two key stages of Chinese event extraction: event type recognition and event argument recognition. For event type recognition, it presents a method that combines trigger-word expansion with binary classification. For event argument recognition, it uses multi-class classification based on maximum entropy. These methods effectively address two problems in event extraction: the imbalance between positive and negative training instances, and the data sparseness caused by the small training set. With them, the event extraction system achieves good performance.
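The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify features, lexicons, or training details, so the seed triggers, the synonym lexicon, the feature strings, the toy English data, and all function names below are assumptions made for illustration. A small multinomial logistic (maximum entropy) model is reused for both the binary trigger decision and the multi-class argument-role decision.

```python
import math
from collections import defaultdict

# Hypothetical seed triggers and synonym lexicon (toy English data, not from the paper).
SEED_TRIGGERS = {"attack": "Conflict.Attack", "marry": "Life.Marry"}
SYNONYMS = {"attack": ["strike", "raid"], "marry": ["wed"]}

def expand_triggers(seeds, synonyms):
    """Add each seed trigger's synonyms, inheriting the seed's event type."""
    expanded = dict(seeds)
    for word, event_type in seeds.items():
        for syn in synonyms.get(word, []):
            expanded.setdefault(syn, event_type)
    return expanded

def candidate_events(tokens, triggers):
    """Return (position, token, event type) for every token in the trigger list."""
    return [(i, t, triggers[t]) for i, t in enumerate(tokens) if t in triggers]

def predict_proba(weights, labels, feats):
    """Softmax over summed feature weights: the maximum entropy model form."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}

def train_maxent(samples, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    labels = sorted({gold for _, gold in samples})
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in samples:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights, labels

def classify(model, feats):
    """Return the most probable label for a feature list."""
    weights, labels = model
    probs = predict_proba(weights, labels, feats)
    return max(labels, key=probs.get)

# Stage 1: only words in the expanded trigger list become candidates;
# a binary model then decides whether each candidate is a real event.
TRIGGERS = expand_triggers(SEED_TRIGGERS, SYNONYMS)
BINARY_SAMPLES = [  # toy training data (assumed)
    (["trig=strike", "prev=air"], "event"),
    (["trig=strike", "prev=labor"], "not_event"),
    (["trig=attack", "prev=the"], "event"),
    (["trig=raid", "prev=the"], "event"),
]
trigger_model = train_maxent(BINARY_SAMPLES)

# Stage 2: one multi-class model assigns an argument role to each entity.
ROLE_SAMPLES = [  # toy training data (assumed)
    (["etype=Conflict.Attack", "ent=ORG", "pos=before"], "Attacker"),
    (["etype=Conflict.Attack", "ent=GPE", "pos=after"], "Target"),
    (["etype=Conflict.Attack", "ent=TIME", "pos=after"], "None"),
    (["etype=Conflict.Attack", "ent=PER", "pos=before"], "Attacker"),
]
role_model = train_maxent(ROLE_SAMPLES)
```

In this sketch, restricting candidates to the expanded trigger list means the binary classifier is trained on far fewer negative instances than it would be if every word were a candidate, which is one plausible reading of how such a design eases the positive/negative imbalance mentioned in the abstract. Calling `classify(role_model, feats)` on an entity's feature list returns one role label, with `"None"` standing for "not an argument".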

Key words

computer application / Chinese information processing / event extraction / event type recognition / event argument recognition

Cite this article

ZHAO Yan-yan, QIN Bing, CHE Wan-xiang, LIU Ting. Research on Chinese Event Extraction. Journal of Chinese Information Processing. 2008, 22(1): 3-8

Funding

Supported by the National Natural Science Foundation of China (60575042, 60675034) and the National 863 Program of China (2006AA01Z145)