Listwise Reranking via Convolutional Re-extracted Features

CAO Junmei, MA Lerong

Journal of Chinese Information Processing, 2020, Vol. 34, Issue (8): 86-93.
Information Retrieval and Question Answering


Abstract

Re-ranking retrieved documents is usually required to further improve performance in many information retrieval tasks. Existing learning-to-rank methods focus mainly on the construction of loss functions and do not consider the relationships among features. In this paper, we apply multi-channel deep convolutional neural networks (CNNs) to listwise learning-to-rank approaches, namely ListCNN, to achieve accurate re-ranking for information retrieval. Among the multiple features extracted from documents, some are locally correlated and redundant; accordingly, we employ CNNs to re-extract features and boost the performance of classical listwise approaches. The ListCNN architecture exploits the local correlation of raw document features and effectively re-extracts representative features. Validated on the public LETOR 4.0 datasets, ListCNN demonstrates superior re-ranking performance compared with existing listwise methods.
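No code accompanies this abstract, but the two ideas it combines can be sketched in isolation: re-extracting locally correlated document features with a 1-D convolution, and training the resulting scores with a listwise (ListNet-style top-1 cross-entropy) loss. The NumPy fragment below is a hypothetical, minimal illustration; the kernel shapes, pooling, and function names are the author of this summary's assumptions, not the paper's actual multi-channel architecture.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def conv1d_features(feats, kernels):
    """Re-extract features with 1-D convolutions slid over each
    document's raw feature vector (valid padding), followed by
    global max pooling, one output channel per kernel."""
    n_docs, n_feats = feats.shape
    out = np.zeros((n_docs, len(kernels)))
    for c, k in enumerate(kernels):
        w = len(k)
        for d in range(n_docs):
            # Slide the kernel over the feature vector; keep the max response.
            resp = [np.dot(feats[d, i:i + w], k) for i in range(n_feats - w + 1)]
            out[d, c] = max(resp)
    return out

def listnet_top1_loss(scores, labels):
    """ListNet-style top-1 cross entropy between the permutation
    probabilities induced by predicted scores and relevance labels."""
    p_true = softmax(labels.astype(float))
    p_pred = softmax(scores)
    return -np.sum(p_true * np.log(p_pred + 1e-12))
```

In this sketch, `conv1d_features` plays the role of the CNN re-extraction stage (adjacent, locally correlated features are combined by each kernel), and `listnet_top1_loss` is the standard listwise objective that such re-extracted features would feed into; scores that agree with the relevance ordering yield a lower loss than scores that invert it.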

Key words

learning to rank / listwise / gradient descent / convolutional neural network

Cite this article

CAO Junmei, MA Lerong. Listwise Reranking via Convolutional Re-extracted Features. Journal of Chinese Information Processing, 2020, 34(8): 86-93.


Funding

National Natural Science Foundation of China (61751217, 61866038); Yan'an University Research Guidance Project (YDY2018-11)
