The Method for Chinese Document Layout Analysis Based on Comprehensive Features

Tian Xuedong, Guo Baolan

Journal of Chinese Information Processing, 1999, Vol. 13, Issue 4: 23-29.
Review

The Method for Chinese Document Layout Analysis Based on Comprehensive Features

  • Tian Xuedong, Guo Baolan

Abstract

Based on a comparison of the layout characteristics of Chinese and foreign-language documents, this paper identifies the main difficulties in Chinese layout analysis and summarizes a corresponding set of comprehensive layout features. Using these features, a Chinese layout analysis method is built that is primarily bottom-up while incorporating some methods and results of the top-down approach. Experimental results show that the method can analyze reasonably standard Chinese layouts, with relatively high efficiency and good adaptability.

Key words

Layout analysis / Character recognition / Comprehensive features / Connected area

Cite this article

Tian Xuedong, Guo Baolan. The Method for Chinese Document Layout Analysis Based on Comprehensive Features. Journal of Chinese Information Processing. 1999, 13(4): 23-29
