HU Yi, LU Ru-zhan, LIU Hui. Information Retrieval Oriented Auto-Construction of Conceptual Relations [J]. Journal of Chinese Information Processing (中文信息学报), 2007, 21(5): 46-50.
Information Retrieval Oriented Auto-Construction of Conceptual Relations
HU Yi, LU Ru-zhan, LIU Hui
Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240, China
Abstract: Analyzing the dependencies between concepts is often key to improving the performance of information retrieval systems. In this paper, we explore a bootstrapping method that automatically extracts semantic patterns for identifying the “(geographical) is-part-of”, “(entity) function” and “(motion) object” relations between concepts in context. We develop a system named SPG (Semantic Pattern Getter). Our contributions are: (1) introducing a bi-sequence alignment algorithm from bioinformatics to generate candidate patterns, and (2) defining a new evaluation metric for pattern confidence. For automatic recognition of the three relations, experiments show that the pattern set generated by SPG achieves higher precision and coverage than the one generated by DIPRE.
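To make the candidate-pattern step more concrete, the following is a minimal sketch of how two context sentences might be aligned and generalized into a pattern. The abstract does not say which alignment algorithm or scoring scheme SPG uses, so this sketch assumes a standard Needleman-Wunsch global alignment over word tokens with illustrative scores. The function name `align_to_pattern`, the `WILDCARD` symbol and the example sentences are hypothetical and do not come from the paper.

```python
from typing import List, Sequence

WILDCARD = "*"  # hypothetical placeholder for positions where the two contexts differ


def align_to_pattern(a: Sequence[str], b: Sequence[str],
                     match: int = 2, mismatch: int = -1, gap: int = -1) -> List[str]:
    """Globally align two token sequences and generalize them into one pattern.

    Assumption: Needleman-Wunsch alignment with simple illustrative scores.
    Tokens aligned to an identical token are kept; every mismatch or gap
    becomes a wildcard, and consecutive wildcards are collapsed into one.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pattern: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = a[i - 1] == b[j - 1]
            if same and score[i][j] == score[i - 1][j - 1] + match:
                pattern.append(a[i - 1])  # identical tokens: keep the token
                i, j = i - 1, j - 1
                continue
            if not same and score[i][j] == score[i - 1][j - 1] + mismatch:
                i, j = i - 1, j - 1       # substitution: generalize below
            elif score[i][j] == score[i - 1][j] + gap:
                i -= 1                    # token of a aligned to a gap
            else:
                j -= 1                    # token of b aligned to a gap
        elif i > 0:
            i -= 1
        else:
            j -= 1
        # Any non-matching step contributes at most one wildcard in a row.
        if not pattern or pattern[-1] != WILDCARD:
            pattern.append(WILDCARD)
    pattern.reverse()
    return pattern


s1 = "Paris is part of France".split()
s2 = "Kyoto is a part of Japan".split()
print(align_to_pattern(s1, s2))  # ['*', 'is', '*', 'part', 'of', '*']
```

In a bootstrapping loop, patterns such as this one would then be scored and the reliable ones used to find new concept pairs. The confidence metric itself is not described in the abstract, so it is left out of the sketch.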