CAI Jia, WANG Xiangdong, TANG Lizhen, CUI Xiaojuan, LIU Hong, QIAN Yueliang. A Deep Learning Method for Chinese-Braille Conversion Based on Parallel Corpora[J]. Journal of Chinese Information Processing, 2019, 33(4): 60-67.
A Deep Learning Method for Chinese-Braille Conversion Based on Parallel Corpora
CAI Jia (1,2), WANG Xiangdong (1), TANG Lizhen (3), CUI Xiaojuan (1,2), LIU Hong (1), QIAN Yueliang (1)
1. Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. China Braille Press, Beijing 100142, China
Abstract: Chinese-Braille conversion has applications in Braille publishing, education for the blind, and related fields. This paper presents a deep learning approach to automatic Chinese-Braille conversion based on parallel corpora. A bidirectional LSTM model is trained on Chinese text segmented according to Braille word-segmentation rules, and it achieves high accuracy on Braille word segmentation. To support model training, the paper also presents a strategy for automatically building a corpus from Chinese and Braille texts with the same content, aligned at the article, sentence, and word levels. The resulting corpus totals 270 000 sentences, 2.34 million Chinese characters, and 4.48 million Braille symbols. Experimental results show that the proposed method outperforms existing models.
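The abstract does not say how the segmented text is presented to the bidirectional LSTM. A common setup for neural word segmentation, assumed here purely for illustration, is per-character tagging: each character is labeled B (begin), M (middle), E (end), or S (single-character word), and the model learns to predict those labels. The sketch below shows only that data preparation step and its inverse; the tag set, the function names, and the example segmentation are hypothetical and are not taken from the paper.

```python
from typing import List


def words_to_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one B/M/E/S tag per character.

    The BMES scheme is an assumption; the paper's abstract does not
    specify the label set used to train its model.
    """
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty tokens
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild a word sequence from characters and their tags."""
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at this character
            words.append(current)
            current = ""
    if current:  # flush a trailing word left open by the tags
        words.append(current)
    return words


# Hypothetical segmentation, not checked against Braille rules.
segmented = ["我们", "是", "学生"]
tags = words_to_tags(segmented)
restored = tags_to_words("".join(segmented), tags)
```

In such a setup, the word-aligned corpus would supply the `segmented` side, and the tags predicted by a trained model would be passed to `tags_to_words` to obtain the Braille-style segmentation.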