1. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan, Shanxi 030006, China
Abstract: Multi-label learning addresses the ambiguity that arises when a single sample is associated with multiple concept labels simultaneously, and semi-supervised multi-label learning is a new research direction that has emerged in recent years. To further exploit the information in unlabeled samples, a semi-supervised multi-label learning algorithm based on Tri-training (MKSMLT) is proposed. It first applies the ML-kNN algorithm to obtain more labeled samples, and then employs the Tri-training algorithm, which uses three classifiers, to rank the unlabeled samples. Experimental results show that the proposed algorithm can effectively improve classification performance.
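The general idea of combining a k-nearest-neighbour multi-label learner with three cooperating classifiers can be illustrated with a minimal sketch. It is not the MKSMLT algorithm itself, whose details are not given in the abstract. Several simplifications are assumed: `knn_predict` is a plain per-label majority vote that stands in for ML-kNN; the three classifiers are made to differ by deterministic leave-one-third-out subsets rather than the bootstrap sampling of the original Tri-training; only one round is run; and an unlabeled sample is pseudo-labeled for one classifier whenever the other two agree on its whole label vector. All function names and the toy data are hypothetical.

```python
def knn_predict(train, x, k=3):
    """Per-label majority vote among the k nearest neighbours of x.

    A simplified stand-in for ML-kNN (assumption): no prior or
    posterior estimation, just squared Euclidean distance and voting.
    """
    nearest = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
    )[:k]
    n_labels = len(train[0][1])
    return tuple(
        int(2 * sum(y[j] for _, y in nearest) > len(nearest))
        for j in range(n_labels)
    )


def tri_training_round(labeled, unlabeled, k=3):
    """One simplified Tri-training round over multi-label data.

    Each classifier is a kNN model over its own training subset.
    Subsets are built deterministically (assumption) so the sketch
    is reproducible without random sampling.
    """
    views = [
        [s for idx, s in enumerate(labeled) if idx % 3 != i] for i in range(3)
    ]
    augmented = [list(v) for v in views]
    for x in unlabeled:
        preds = [knn_predict(v, x, k) for v in views]
        for i in range(3):
            j, l = [m for m in range(3) if m != i]
            # If the two other classifiers agree, teach classifier i.
            if preds[j] == preds[l]:
                augmented[i].append((x, preds[j]))
    return augmented


def predict(augmented, x, k=3):
    """Final label vector: per-label majority over the three classifiers."""
    votes = [knn_predict(v, x, k) for v in augmented]
    return tuple(int(sum(v[j] for v in votes) >= 2) for j in range(len(votes[0])))


# Toy data: two well-separated clusters, each carrying one of two labels.
labeled = [
    ((0.0, 0.0), (1, 0)), ((0.2, 0.1), (1, 0)), ((0.1, 0.3), (1, 0)),
    ((5.0, 5.0), (0, 1)), ((5.2, 4.9), (0, 1)), ((4.9, 5.1), (0, 1)),
]
unlabeled = [(0.1, 0.1), (5.1, 5.0)]
augmented = tri_training_round(labeled, unlabeled)
```

After the round, each of the three training sets has grown by the pseudo-labeled samples on which its two peers agreed, and `predict(augmented, x)` combines the three classifiers by voting on each label.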