Abstract: The main problems in Chinese page layout analysis are presented on the basis of the differences between Chinese and English layouts. The characteristic features of Chinese layout are summarized, and a layout analysis method based mainly on a bottom-up approach is built on them. Experimental results show that the method is able to analyse standard Chinese layouts. Compared with existing approaches, it is more efficient and better suited to processing Chinese layouts.