ZHAO Ji, LI Jing-jiao, WANG Li-jun, ZHANG Ji-sheng. Research on the Post-processing of Manchu Character Recognition Based on Hidden Markov Model[J]. Journal of Chinese Information Processing, 2006, 20(4): 65-69.
Research on the Post-processing of Manchu Character Recognition Based on Hidden Markov Model
ZHAO Ji1, LI Jing-jiao2, WANG Li-jun1, ZHANG Ji-sheng1
1. School of Computer Science & Engineering, Anshan Science and Technology University 2. School of Information Science & Engineering, Northeastern University
Abstract: This study proposes a post-processing method to improve the performance of Manchu character recognition. An evaluation model based on Bayes' rule is used to estimate the probability of each candidate Manchu word, taking into account both the posterior probability of the candidate and the prior probability of Manchu phrases. A Hidden Markov Model and the Viterbi dynamic programming algorithm are adopted to check the output of the character recognizer and to correct rejected and misrecognized words, which effectively raises the recognition rate for Manchu manuscripts. The results indicate that post-processing performance depends on the language model and on the accuracy of the evaluation model. In addition, higher precision in single character recognition (SCR) yields better error correction in post-processing.
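The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the recognizer returns, for each word position, a set of candidates with confidence scores (standing in for the posterior probability), and that the language model is a simple bigram table (standing in for the prior probability of Manchu phrases). The function name `viterbi_decode`, the `floor` smoothing value, the romanized tokens, and all probabilities are hypothetical.

```python
import math


def viterbi_decode(candidates, bigram, floor=1e-6):
    """Pick the most probable word sequence from per-position candidates.

    candidates: list of dicts, candidate word -> recognizer confidence (> 0).
    bigram: dict, (previous word, word) -> transition probability.
    floor: probability assumed for unseen bigrams (hypothetical smoothing).
    """
    if not candidates:
        return []

    def transition(prev, word):
        return math.log(bigram.get((prev, word), floor))

    # best[w] = (log score of the best path ending in w, that path)
    best = {w: (math.log(p), [w]) for w, p in candidates[0].items()}
    for position in candidates[1:]:
        step = {}
        for word, conf in position.items():
            # Choose the predecessor that maximizes the accumulated score.
            prev = max(best, key=lambda p: best[p][0] + transition(p, word))
            score = best[prev][0] + transition(prev, word) + math.log(conf)
            step[word] = (score, best[prev][1] + [word])
        best = step
    return max(best.values(), key=lambda entry: entry[0])[1]


# Illustrative data only: two word positions with two candidates each.
candidates = [
    {"abka": 0.6, "aba": 0.4},
    {"na": 0.45, "ne": 0.55},
]
bigram = {
    ("abka", "na"): 0.5,
    ("abka", "ne"): 0.01,
    ("aba", "na"): 0.01,
    ("aba", "ne"): 0.05,
}
result = viterbi_decode(candidates, bigram)
print(result)  # ['abka', 'na']
```

Taking the top-ranked candidate at each position independently would give `abka ne`. The bigram term overrides the recognizer's slight preference for `ne` at the second position, which is the kind of correction the post-processing stage is meant to perform.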