Abstract
The recognition of similar characters has a great impact on the accuracy and usability of the whole OCR system. In practical applications, we found that misrecognition between similar Chinese characters is asymmetric, and this paper discusses and analyzes the causes of this asymmetry in detail. Based on the asymmetry, we propose a category-based partial area matching method for similar Chinese character recognition. Similar characters are divided into several elementary categories according to their structural characteristics, and for each category different features are extracted from the corresponding partial area and compared in order to recognize similar characters correctly. Experimental results show that the proposed method is effective and significantly improves the accuracy of similar Chinese character recognition: the recognition accuracy increases by an average of 4.55 percentage points for error-prone similar characters and by an average of 0.38 percentage points for less error-prone ones.
Key words
artificial intelligence /
pattern recognition /
asymmetry /
similar Chinese character recognition /
partial area matching /
category
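Funding
Supported by the Domain Frontier Youth Fund of the Institute of Computing Technology, Chinese Academy of Sciences (20026180-19)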