1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China; 2. Province Key Lab of Computer Information Processing Technology of Jiangsu, Suzhou, Jiangsu 215006, China)
Abstract: Semi-supervised and unsupervised event extraction remain challenging. Taking the characteristics of the Chinese language into account, this paper proposes a dual-view bootstrapping approach to extracting event patterns. Starting from a small set of seeds, it applies a cross-filtering method to two views, document relevance and semantic similarity, and extracts new patterns in each iteration. Experimental results show that our system outperforms existing systems.