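Traditional information retrieval research has focused mostly on document-level retrieval. Sentence-level retrieval, however, is very important in settings such as mobile applications and in search scenarios where the information need is more specific. In sentence-level retrieval, we argue that the context of a sentence provides richer semantic information to support matching the sentence against the query. Based on this, this paper proposes a context-aware deep sentence matching model (CDSMM). Specifically, we use a bidirectional recurrent neural network to model the semantic information both within the sentence and in its context, and compute the degree of matching from the semantic information of the sentence and the query. Experiments on the WebAP sentence retrieval dataset show that our model significantly outperforms other methods and achieves the best results reported to date.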
Abstract
Traditional research on information retrieval focuses on document-level retrieval and neglects sentence-level retrieval, which is of great importance in applications such as search on mobile phones. Assuming that the context of a sentence can provide richer evidence for matching, this paper proposes a context-aware deep sentence matching model (CDSMM). Specifically, the model employs a bi-directional LSTM to capture the interior and exterior information of the sentence. Then, a matching matrix is constructed from the sentence representation and the query representation. Finally, the matching score is obtained through a feed-forward neural network. Experimental results on the WebAP dataset show that our model significantly outperforms the state-of-the-art models.
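The last two steps of the pipeline described above (matching matrix, then feed-forward scoring) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the position-wise vectors stand in for bi-directional LSTM outputs that are taken as given, cosine similarity is used as the interaction function, the k strongest interactions are kept as features, the context is handled by a second matching matrix whose features are concatenated with the sentence's, and all function names, weights, and toy values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def matching_matrix(query_states, sentence_states):
    """Entry [i][j] compares query position i with sentence position j."""
    return [[cosine(q, s) for s in sentence_states] for q in query_states]

def top_k(matrix, k):
    """Flatten the matrix and keep its k largest entries, padded with zeros."""
    values = sorted((x for row in matrix for x in row), reverse=True)[:k]
    return values + [0.0] * (k - len(values))

def feed_forward(features, hidden_w, hidden_b, out_w, out_b):
    """One tanh hidden layer followed by a single linear output unit."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b

def score(query_states, sentence_states, context_states, params, k=2):
    """Assumed combination: concatenate top-k features of sentence and context."""
    features = (top_k(matching_matrix(query_states, sentence_states), k)
                + top_k(matching_matrix(query_states, context_states), k))
    return feed_forward(features, *params)

# Toy inputs (hypothetical): each inner list is one position's encoder output.
query = [[1.0, 0.0], [0.0, 1.0]]
sentence = [[1.0, 0.0], [1.0, 1.0]]
unrelated = [[0.0, 0.0]]
context = [[0.0, 2.0]]

# Hand-picked weights: hidden_w, hidden_b, out_w, out_b (not learned values).
params = ([[0.5, 0.5, 0.5, 0.5]], [0.0], [1.0], 0.0)

related_score = score(query, sentence, context, params)
unrelated_score = score(query, unrelated, context, params)
print(related_score > unrelated_score)
```

With positive weights, a sentence whose positions align with the query receives a higher score than one that does not, given the same context. In a trained model the weights would be learned from relevance labels rather than set by hand.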
Key words
information retrieval /
text matching /
recurrent neural network (RNN)
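Funding
National Key Basic Research Program of China ("973" Program) (2014CB340401, 2013329606); Key Research and Development Program of the Ministry of Science and Technology (2016QY02D0405); National Natural Science Foundation of China (61232010, 61472401, 61425016, 61203298); Youth Innovation Promotion Association of the Chinese Academy of Sciences, Outstanding Member Program (20144310, 2016102)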