Automatic Mispronunciation Detection for Mandarin Chinese

ZHANG Feng1, HUANG Chao2, DAI Lirong1

Journal of Chinese Information Processing, 2010, Vol. 24, Issue 2: 110-116.
Review

Abstract

Most current automatic mispronunciation detection systems are built on a statistical automatic speech recognition (ASR) framework, in which the acoustic model is the foundation. This paper improves syllable-level mispronunciation detection for Mandarin Chinese in two ways. First, to obtain acoustic models better suited to mispronunciation detection, it introduces speaker adaptive training (SAT) and selective maximum likelihood linear regression (SMLLR). Second, because a single syllable carries very little information and experts apply different rating standards to speakers of different proficiency levels, it adds speaker score normalization in the back end. Experiments on a database of 8,000 syllables pronounced by 40 speakers of varied pronunciation proficiency show that these methods improve system performance: precision rises from 45.8% to 53.6% at 30% recall, and from 64.6% to 79.9% at 10% recall.
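The abstract reports precision at fixed recall levels after speaker score normalization, but it does not say how the normalization is computed. The sketch below is only an illustration under stated assumptions: it uses per-speaker z-normalization (an assumption, not necessarily the paper's method), treats a higher score as "more likely mispronounced", and measures precision at the first ranked cut that reaches the target recall. The function names normalize_by_speaker and precision_at_recall are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_by_speaker(records):
    """Z-normalize scores within each speaker (assumed scheme, not from the paper).

    records: list of (speaker_id, score) pairs.
    Returns the normalized scores in the same order as the input.
    """
    by_speaker = defaultdict(list)
    for speaker, score in records:
        by_speaker[speaker].append(score)

    stats = {}
    for speaker, scores in by_speaker.items():
        sigma = pstdev(scores)
        # Guard against a zero spread (one syllable, or identical scores).
        stats[speaker] = (mean(scores), sigma if sigma > 0 else 1.0)

    return [(score - stats[spk][0]) / stats[spk][1] for spk, score in records]


def precision_at_recall(scores, labels, target_recall):
    """Precision at the smallest top-k cut whose recall reaches target_recall.

    scores: higher means more likely mispronounced (assumption).
    labels: 1 for an expert-marked mispronunciation, 0 otherwise.
    """
    total_errors = sum(labels)
    if total_errors == 0:
        raise ValueError("no mispronunciations in labels")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    for k, (_, label) in enumerate(ranked, start=1):
        hits += label
        if hits / total_errors >= target_recall:
            return hits / k
    return hits / len(ranked)
```

In use, the raw detection scores would first pass through normalize_by_speaker, and the normalized scores would then be evaluated with precision_at_recall at recall levels such as 0.3 and 0.1, the two operating points quoted in the abstract.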

Key words

computer application / Chinese information processing / automatic mispronunciation detection / speaker adaptive training (SAT) / selective maximum likelihood linear regression (SMLLR) / speaker normalization

Cite this article

ZHANG Feng1, HUANG Chao2, DAI Lirong1. Automatic Mispronunciation Detection for Mandarin Chinese. Journal of Chinese Information Processing. 2010, 24(2): 110-116
