Abstract
Ontology matching is a key solution to the ontology heterogeneity problem. Taking the nominal concepts of HowNet and CCD as an example, this paper first applies machine learning to discover the initial mapping relationships, covering feature selection, sample set partitioning, and classifier selection. It then takes the overall structure of the ontologies into account and uses a similarity propagation algorithm to adjust the initial mappings globally. Experiments show that the precision of the final 1:1 and 1:n mapping relationships reaches 94% and 87.5%, respectively.
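The global adjustment step can be illustrated with a minimal sketch, assuming a simplified fixpoint propagation over pairs of concepts. This is not the paper's implementation: the function name `propagate_similarity`, the toy concept names, the uniform neighbour weights, and the initial scores (standing in for classifier output) are all hypothetical.

```python
from itertools import product


def propagate_similarity(edges_a, edges_b, initial, iterations=10):
    """Simplified similarity propagation (illustrative assumption only).

    edges_a, edges_b: lists of (child, parent) edges, one list per ontology.
    initial: dict mapping (concept_a, concept_b) to an initial score in [0, 1].
    Returns a dict of adjusted scores, normalised so the maximum is 1.0.
    """
    nodes_a = {n for edge in edges_a for n in edge}
    nodes_b = {n for edge in edges_b for n in edge}
    pairs = list(product(sorted(nodes_a), sorted(nodes_b)))

    # Two pairs are neighbours when the child concepts form one pair and
    # their respective parents form the other pair.
    neighbours = {p: [] for p in pairs}
    for (child_a, parent_a), (child_b, parent_b) in product(edges_a, edges_b):
        neighbours[(child_a, child_b)].append((parent_a, parent_b))
        neighbours[(parent_a, parent_b)].append((child_a, child_b))

    sigma0 = {p: initial.get(p, 0.0) for p in pairs}
    sigma = dict(sigma0)
    for _ in range(iterations):
        # Each pair keeps its initial and current score and receives the
        # current scores of its neighbours (uniform weight 1.0, an assumption).
        updated = {
            p: sigma0[p] + sigma[p] + sum(sigma[q] for q in neighbours[p])
            for p in pairs
        }
        top = max(updated.values()) or 1.0  # avoid dividing by zero
        sigma = {p: value / top for p, value in updated.items()}
    return sigma


# Toy hierarchies: hypothetical concepts, not taken from HowNet or CCD.
edges_a = [("dog_a", "animal_a"), ("cat_a", "animal_a")]
edges_b = [("dog_b", "animal_b"), ("cat_b", "animal_b")]

# Hypothetical initial scores, standing in for the classifier's output.
initial = {
    ("dog_a", "dog_b"): 0.9,
    ("cat_a", "cat_b"): 0.8,
    ("animal_a", "animal_b"): 0.1,
    ("animal_a", "dog_b"): 0.3,
}

scores = propagate_similarity(edges_a, edges_b, initial)
```

In this toy input the pair (`animal_a`, `animal_b`) starts below (`animal_a`, `dog_b`), but it overtakes it after propagation because its child pairs are strongly matched. That is the kind of structure-driven correction a global adjustment step is meant to provide.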
Key words
ontology matching /
machine learning /
stratified cross sampling /
similarity propagation algorithm
Funding
National Basic Research Program of China (2014CB340504); National Natural Science Foundation of China (61375074).