A Chinese Document Image Retrieval Method by Keywords

HUANG Xiang-lin, GAO Yun, YANG Li-fang, WANG Peng-peng

Journal of Chinese Information Processing, 2007, Vol. 21, Issue 4: 61-64.
Review


Abstract

This paper proposes a keyword-based retrieval method for Chinese document images that searches for keywords directly on the image features of Chinese characters, without OCR (Optical Character Recognition). The document image is first segmented into individual character images. Stroke features are then extracted from each character image, and the similarity between character images is measured by the Weighted Modified Hausdorff Distance (WMHD) between their feature data. The method is insensitive to character size and has some robustness to font variation. Experiments show good retrieval performance.
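The abstract does not spell out how the WMHD is computed, so the following is only a minimal sketch of the general idea, assuming the standard modified Hausdorff distance (the mean nearest-neighbor distance in each direction, then the larger of the two) with optional per-point weights. The function names, the weighting scheme, and the bounding-box normalization used to cancel character size are illustrative assumptions, not the authors' definitions.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def normalize(points: Sequence[Point]) -> List[Point]:
    """Scale stroke feature points into a unit box so glyph size cancels out.

    Assumption: size invariance is obtained by bounding-box normalization.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / span, (y - min_y) / span) for x, y in points]


def directed_wmhd(a: Sequence[Point], b: Sequence[Point],
                  weights: Optional[Sequence[float]] = None) -> float:
    """Weighted mean, over points in a, of the distance to the nearest point in b."""
    if weights is None:
        weights = [1.0] * len(a)  # uniform weights give the plain modified Hausdorff distance
    total = 0.0
    for (ax, ay), w in zip(a, weights):
        nearest = min(math.hypot(ax - bx, ay - by) for bx, by in b)
        total += w * nearest
    return total / sum(weights)


def wmhd(a: Sequence[Point], b: Sequence[Point],
         wa: Optional[Sequence[float]] = None,
         wb: Optional[Sequence[float]] = None) -> float:
    """Symmetric distance: the larger of the two directed distances."""
    return max(directed_wmhd(a, b, wa), directed_wmhd(b, a, wb))


# Hypothetical feature points for the same glyph at two sizes.
small = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
large = [(0.0, 0.0), (20.0, 0.0), (20.0, 20.0)]
same_glyph_distance = wmhd(normalize(small), normalize(large))
```

In a retrieval setting, each character image of the query keyword would be compared against candidate character images in the document, and a candidate accepted when the distance falls below a threshold chosen experimentally.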

Key words

computer application / Chinese information processing / Chinese document image / retrieval by keywords / Weighted Modified Hausdorff Distance (WMHD)

Cite this article

HUANG Xiang-lin, GAO Yun, YANG Li-fang, WANG Peng-peng. A Chinese Document Image Retrieval Method by Keywords. Journal of Chinese Information Processing. 2007, 21(4): 61-64

Funding

CNGI project funded by the National Development and Reform Commission (CNGI-04-12-2A)