Abstract
To improve the organization and management of web animation materials, this paper proposes an annotation algorithm for web animation material based on the fusion of text and visual features. First, the context information of each animation is extracted automatically and combined with the material name, page topic, URL, and ALT attribute to form a feature set, from which candidate textual keywords are extracted using the WordNet semantic dictionary. The candidate annotation words are then filtered by their correlation with the visual features. Finally, a semantic network is built over the textual keywords and visual features to realize automatic annotation. Experiments show that the proposed algorithm can be effectively applied to the automatic annotation of web animation material.
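The two stages summarized in the abstract (candidate keyword extraction from text features, then filtering by visual correlation) can be illustrated with a minimal sketch. It is not the paper's implementation: plain token-frequency counting stands in for the WordNet-based keyword extraction, and the function names, the stopword list, the `brown_dominant` visual label, the correlation table, and the 0.5 threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "for", "to",
             "with", "swf", "www", "com", "http", "https"}

def tokenize(text):
    """Lowercase, split on non-letter characters, drop stopwords and short tokens."""
    tokens = re.split(r"[^a-z]+", text.lower())
    return [t for t in tokens if len(t) > 2 and t not in STOPWORDS]

def candidate_keywords(fields, top_k=5):
    """Pool tokens from all text fields and return the top_k most frequent ones."""
    counts = Counter()
    for text in fields.values():
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(top_k)]

def filter_by_visual_correlation(candidates, visual_label, correlation, threshold=0.5):
    """Keep candidates whose assumed correlation with the visual label meets the threshold."""
    return [w for w in candidates
            if correlation.get((w, visual_label), 0.0) >= threshold]

# Text feature set for one animation (all values are made-up examples).
fields = {
    "name": "running_horse.swf",
    "page_title": "Horse animation gallery",
    "url": "http://example.com/animals/horse/running_horse.swf",
    "alt": "a horse running on grass",
    "context": "This animation shows a brown horse running across a field.",
}

# Assumed keyword/visual-feature correlations; in practice these would be learned.
correlation = {
    ("horse", "brown_dominant"): 0.9,
    ("running", "brown_dominant"): 0.6,
    ("animation", "brown_dominant"): 0.1,
}

candidates = candidate_keywords(fields)
annotations = filter_by_visual_correlation(candidates, "brown_dominant", correlation)
print(candidates)
print(annotations)
```

In this toy run, generic words such as "animation" survive the text stage because they are frequent, and are dropped only in the visual filtering stage, which is the role the abstract assigns to the visual features.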
Key words
web animation material /
text feature /
visual feature /
semantic annotation
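Funding
Fundamental Research Funds for the Central Universities (DL10CB01); Heilongjiang Province Science Foundation for Returned Overseas Scholars (LC2012C06); Harbin Special Fund for Scientific and Technological Innovation Talents (2012RFLXG022); Northeast Forestry University Graduate Thesis Funding Program.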