HONG Yu, ZHANG Yu, LIU Ting, ZHENG Wei, GONG Cheng, LI Sheng. Learning Algorithm of Adaptive Information Filtering Based on Hierarchy Clustering[J]. Journal of Chinese Information Processing, 2007, 21(3): 47-53.
Learning Algorithm of Adaptive Information Filtering Based on Hierarchy Clustering
HONG Yu, ZHANG Yu, LIU Ting, ZHENG Wei, GONG Cheng, LI Sheng
Information Retrieval Lab, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
Abstract: This paper adopts an adaptive learning algorithm based on hierarchical clustering to update the user profile. The algorithm continuously extracts the centroid of a class of optimal information from the system's feedback stream. This effectively shields the learning process from the large amount of feedback noise produced by a distorted threshold and by the sparseness of the initial information. It can also approximately imitate manual feedback, which improves the intelligence of the adaptive learning mechanism.
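The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes that feedback documents are dense term-weight vectors, that clusters are merged bottom-up by the cosine similarity of their centroids, that the "optimal" cluster is the one whose centroid lies closest to the current profile, and that the profile is updated by linear interpolation. The function names and the `min_sim` and `rate` parameters are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerate(vectors, min_sim=0.8):
    """Bottom-up clustering: repeatedly merge the two clusters whose
    centroids are most similar, stopping once the best pair falls
    below min_sim (an assumed stopping rule)."""
    clusters = [[v] for v in vectors]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(centroid(clusters[i]), centroid(clusters[j]))
                if s > best:
                    best, pair = s, (i, j)
        if best < min_sim:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def update_profile(profile, feedback, min_sim=0.8, rate=0.5):
    """Move the profile toward the centroid of the feedback cluster
    closest to it (assumed selection rule); other clusters are
    treated as noise and ignored."""
    if not feedback:
        return list(profile)
    clusters = agglomerate(feedback, min_sim)
    best = max(clusters, key=lambda c: (cosine(profile, centroid(c)), len(c)))
    core = centroid(best)
    return [(1 - rate) * p + rate * c for p, c in zip(profile, core)]


if __name__ == "__main__":
    profile = [1.0, 0.0, 0.0]
    # Two on-topic feedback documents and one off-topic (noisy) one.
    feedback = [[0.9, 0.1, 0.0], [1.0, 0.2, 0.0], [0.0, 0.1, 1.0]]
    print(update_profile(profile, feedback))
```

In this toy run the off-topic vector ends up in its own cluster and does not influence the update, which is the noise-shielding effect the abstract describes. The quadratic pair search is kept for readability and would be too slow for a large feedback stream.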