Research on a Relevance Ranking Algorithm for Query Results in Semi-Structured Chinese Information Retrieval

QU Wei-min, SUN Le, SUN Yu-fang

Journal of Chinese Information Processing, 2004, Vol. 18, Issue 4: 16-23.

Research on Ranking Algorithm in XML Document Retrieval

  • QU Wei-min,SUN Le,SUN Yu-fang

Abstract

This paper studies how to compute the relevance between query results and query conditions for keyword-based search over text-rich XML data. We analyze the shortcomings of applying traditional information retrieval techniques to this problem. We then propose a dynamic, node-based (element-oriented) method for computing keyword weights, together with a ranking function that considers both the frequency distribution and the structural distribution of keywords in a query result. This approach effectively accounts for the influence of the structural information in XML data on relevance computation. Experimental results show that the method achieves good retrieval performance.

Key words

computer application / Chinese information processing / XML / information retrieval / ranking algorithm

Cite this article

QU Wei-min, SUN Le, SUN Yu-fang. Research on Ranking Algorithm in XML Document Retrieval. Journal of Chinese Information Processing. 2004, 18(4): 16-23

Funding

National Natural Science Foundation of China (69983009); National 863 Program of China (2001AA114040)