Abstract
Traditional search engine evaluation methods require manual annotation of correct answers for a set of queries, which is costly and time-consuming, and whose results depend on the accuracy of the annotation. In this paper, we propose an evaluation metric and an automatic search engine performance evaluation method based on clustering analysis. The method comprises three steps: first, computing the coverage score of an informational query; second, clustering the search results by coverage score; and last, evaluating retrieval performance using intra-cluster cohesion and inter-cluster separation. Experimental results show that the automatic method yields evaluation results consistent with those of traditional assessor-based methods.
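The three steps above can be sketched in code. The coverage scores, the threshold-based two-way clustering, and the particular cohesion/separation formulas below are illustrative assumptions for demonstration, not the paper's exact definitions.

```python
# Sketch of clustering-based retrieval evaluation (assumptions, not the
# paper's exact method): results with precomputed coverage scores are
# split into two clusters, then scored by cohesion and separation.

def cluster_by_coverage(scores, threshold=0.5):
    """Step 2: split results into two clusters by coverage score
    (a simple threshold split stands in for a real clustering step)."""
    high = [s for s in scores if s >= threshold]
    low = [s for s in scores if s < threshold]
    return high, low

def cohesion(cluster):
    """Intra-cluster cohesion: mean absolute deviation from the cluster
    centroid (smaller means a tighter cluster)."""
    if len(cluster) < 2:
        return 0.0
    centroid = sum(cluster) / len(cluster)
    return sum(abs(s - centroid) for s in cluster) / len(cluster)

def separation(a, b):
    """Inter-cluster separation: distance between the two centroids."""
    if not a or not b:
        return 0.0
    return abs(sum(a) / len(a) - sum(b) / len(b))

def evaluate(scores):
    """Step 3: a higher separation-to-cohesion ratio suggests the engine
    distinguishes relevant from non-relevant results more sharply."""
    high, low = cluster_by_coverage(scores)
    intra = cohesion(high) + cohesion(low)
    inter = separation(high, low)
    return inter / (intra + 1e-9)

# Step 1 (computing each result's coverage score for an informational
# query) is assumed to have been done already; scores are hypothetical.
scores = [0.9, 0.85, 0.8, 0.2, 0.15, 0.1]
print(round(evaluate(scores), 2))
```

A result list whose coverage scores form two well-separated groups scores high under this metric, while a uniformly mediocre list scores low, which matches the intuition behind using class separation as a proxy for retrieval quality.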
Key words: information retrieval; performance evaluation; clustering analysis
Funding
Supported by the National Natural Science Foundation of China (60963014), the Natural Science Foundation of Jiangxi Province (2008GZS0052), the Key Science and Technology Research Project of Jiangxi Province (2006-184), and the Science and Technology Project of the Education Department of Jiangxi Province (2007-129).