The Impact of Various Grained Subtopics on Search Result Diversification

HU Sha, DOU Zhicheng, WEN Jirong

Journal of Chinese Information Processing, 2017, Vol. 31, Issue 4: 165-173.
Information Retrieval and Question Answering Systems


Abstract

As the pace of life quickens, users tend to submit short queries to search engines and expect the results they need to appear near the top of the list. Faced with a large number of ambiguous or broad queries, search engines try to identify user intent and to satisfy as many users as possible with a limited number of results. Search result diversification addresses this problem: it re-ranks search results so that the top ranks cover as many user intents as possible. Most intent-aware diversification algorithms use subtopics to diversify results. Focusing on the granularity of those subtopics, this paper generates subtopics of different granularities with traditional methods and compares how they affect the performance of diversification algorithms. Experimental results show that classic diversification algorithms perform better when they use fine-grained subtopics.
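As an illustration of the re-ranking task described in the abstract, the following is a minimal sketch of a greedy, subtopic-based diversifier in the spirit of explicit intent-aware methods such as xQuAD. It is not the paper's implementation: the function name `diversify`, the trade-off parameter `lam`, and all scores, weights, and document IDs below are hypothetical values chosen only to show how subtopic coverage changes the ranking.

```python
from typing import Dict, List


def diversify(
    rel: Dict[str, float],
    subtopic_weight: Dict[str, float],
    coverage: Dict[str, Dict[str, float]],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedy xQuAD-style re-ranking (illustrative sketch, assumed inputs).

    rel[d]             relevance of document d to the query
    subtopic_weight[s] importance of subtopic s for the query
    coverage[s][d]     degree to which document d satisfies subtopic s
    lam                trade-off between relevance (0.0) and diversity (1.0)
    """
    selected: List[str] = []
    # not_covered[s]: how much of subtopic s is still unsatisfied so far.
    not_covered = {s: 1.0 for s in subtopic_weight}
    candidates = set(rel)

    def score(d: str) -> float:
        # Diversity gain: weighted coverage of subtopics that remain unsatisfied.
        gain = sum(
            subtopic_weight[s] * coverage[s].get(d, 0.0) * not_covered[s]
            for s in subtopic_weight
        )
        return (1.0 - lam) * rel[d] + lam * gain

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Discount every subtopic the chosen document already covers.
        for s in subtopic_weight:
            not_covered[s] *= 1.0 - coverage[s].get(best, 0.0)
    return selected


# Hypothetical toy data for the ambiguous query "jaguar".
rel = {"d1": 0.9, "d2": 0.8, "d3": 0.6}
weights = {"car": 0.5, "animal": 0.5}
coverage = {"car": {"d1": 1.0, "d2": 1.0}, "animal": {"d3": 1.0}}

print(diversify(rel, weights, coverage, k=3))           # ['d1', 'd3', 'd2']
print(diversify(rel, weights, coverage, k=3, lam=0.0))  # ['d1', 'd2', 'd3']
```

In this toy example the pure relevance ranking places both car documents first, while the diversified ranking promotes `d3` so that the animal intent is covered within the top two results. Subtopic granularity would enter through the keys of `weights` and `coverage`: splitting "car" into finer subtopics changes which documents count as redundant.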

Keywords

search result diversification / query intent / subtopic

Cite this article

HU Sha, DOU Zhicheng, WEN Jirong. The Impact of Various Grained Subtopics on Search Result Diversification. Journal of Chinese Information Processing. 2017, 31(4): 165-173

Funding

National Key Basic Research Program of China (973 Program) (2014CB340403); National Natural Science Foundation of China (61502501)