ZHANG Shunchang, SUN Le. The Research on Hierarchical Decoding for Pinyin-to-Character Conversion[J]. Journal of Chinese Information Processing, 2009, 23(6): 79-86.
The Research on Hierarchical Decoding for Pinyin-to-Character Conversion
ZHANG Shunchang1,2, SUN Le1
1. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China;
2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Pinyin-to-Character conversion is an important task in Chinese information processing, with wide applications in tasks such as Chinese speech recognition and Chinese Pinyin input methods. This paper investigates Pinyin-to-Character conversion and the segmentation of the pinyin stream, and proposes a method that uses a language model to improve the pinyin stream segmentation model. Compared with the traditional hierarchical model, the method improves first-character precision by about 3%.
Key words: artificial intelligence; natural language processing; pinyin-to-character conversion; hidden Markov model; Chinese information processing; segmentation ambiguity
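
The segmentation ambiguity named in the key words can be illustrated with a small sketch. The Python code below is not the paper's model; it only shows, under stated assumptions, how a language model score could choose between competing segmentations of an unspaced pinyin stream such as "fangan" ("fang an" versus "fan gan"). The syllable inventory `SYLLABLES`, the bigram table `BIGRAM_P`, the `FLOOR` value, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy syllable inventory; a real system would use the
# full table of valid Mandarin pinyin syllables.
SYLLABLES = {"fang", "fan", "an", "gan"}

# Hypothetical syllable-bigram probabilities ("<s>" marks the start).
# These numbers are invented for the example, not taken from the paper.
BIGRAM_P = {
    ("<s>", "fang"): 0.6,
    ("fang", "an"): 0.5,
    ("<s>", "fan"): 0.4,
    ("fan", "gan"): 0.1,
}
FLOOR = 1e-6  # assumed probability for unseen bigrams


def segmentations(stream):
    """Yield every split of `stream` into syllables from SYLLABLES."""
    if not stream:
        yield []
        return
    for end in range(1, len(stream) + 1):
        head = stream[:end]
        if head in SYLLABLES:
            for rest in segmentations(stream[end:]):
                yield [head] + rest


def score(syllables):
    """Sum of bigram log-probabilities along the syllable sequence."""
    total = 0.0
    prev = "<s>"
    for syllable in syllables:
        total += math.log(BIGRAM_P.get((prev, syllable), FLOOR))
        prev = syllable
    return total


def best_segmentation(stream):
    """Return the highest-scoring segmentation, or None if none exists."""
    candidates = list(segmentations(stream))
    return max(candidates, key=score) if candidates else None


if __name__ == "__main__":
    for candidate in segmentations("fangan"):
        print(candidate, round(score(candidate), 3))
    print("best:", best_segmentation("fangan"))
```

With these toy probabilities the stream "fangan" has two valid segmentations, and the bigram score prefers "fang an". Exhaustive enumeration is used here only for readability; a practical decoder would more likely use dynamic programming over the stream.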