Abstract: This paper presents a component-based method for Chinese document layout analysis. The method is primarily bottom-up, but it also draws on a top-down approach and on the concept of a component. The component concept gives the method a clear structure and reduces the number of passes over the page image. Combining the bottom-up and top-down approaches gives the method high efficiency, precision, and adaptability. The document and its components are organized in a two-dimensional ordered tree structure, which speeds up searching and is convenient for applications and for document description.
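
The abstract does not define the tree structure in detail, so the following is only a minimal sketch of one plausible reading. It assumes each component is a rectangular region whose children are kept in top-to-bottom, then left-to-right order. The class name `Component`, its fields, and the `find` method are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """A rectangular page region (hypothetical fields, not from the paper)."""
    kind: str          # e.g. "page", "column", "text", "image"
    left: int
    top: int
    right: int
    bottom: int
    children: List["Component"] = field(default_factory=list)

    def contains(self, x: int, y: int) -> bool:
        """True if the point (x, y) lies inside this component's box."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def add(self, child: "Component") -> None:
        """Insert a child and keep siblings ordered in both dimensions."""
        self.children.append(child)
        # Assumed ordering: top-to-bottom first, then left-to-right.
        self.children.sort(key=lambda c: (c.top, c.left))

    def find(self, x: int, y: int) -> Optional["Component"]:
        """Return the deepest component containing (x, y), or None."""
        if not self.contains(x, y):
            return None  # prune this whole subtree
        for child in self.children:
            hit = child.find(x, y)
            if hit is not None:
                return hit
        return self


# Illustrative page with made-up coordinates.
page = Component("page", 0, 0, 1000, 1400)
column = Component("column", 50, 100, 480, 1300)
column.add(Component("text", 50, 700, 480, 1300))
column.add(Component("image", 50, 100, 480, 650))
page.add(column)
page.add(Component("title", 50, 20, 950, 80))
```

In this sketch a lookup such as `page.find(100, 200)` descends only into subtrees whose bounding box contains the point. That is one way a tree of components could reduce search work compared with scanning every region.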